High network latency after incoming UDP bursts [Solved]


This is a crosspost from https://unix.stackexchange.com/questions/411996/high-network-latency-after-incoming-udp-bursts

I have an application sending UDP packets over a RAW socket (defining ETH, IP and UDP headers manually), with up to 1 gigabit speed over a gigabit interface (PHY ID 01410dd1). Target system: Linux 4.11 / Cyclone V.

The same application receives UDP packets over a UDP socket with small bursts (hundreds of packets at once, then again hundreds of packets after some delay, and so on).

This all works well for smaller bursts, but for some reason receiving too many UDP packets at once brings the network interface in a strange state, which I hope someone can explain to me.

All network connections to the target suddenly have a very high latency, where it seems that a packet is only sent out upon receiving another packet. Pinging the target shows this:

ping -i 1 <target>
64 bytes from <target>: icmp_seq=143 ttl=64 time=1000 ms

ping -i 3 <target>
64 bytes from <target>: icmp_seq=143 ttl=64 time=3000 ms

And in wireshark:

0.00:00.000 ICMP Echo (ping) request, seq=1/1024, ttl=64
0.00:01.000 ICMP Echo (ping) request, seq=2/1024, ttl=64
0.00:01.001 ICMP Echo (ping) reply, seq=1/1024, ttl=64
0.00:02.000 ICMP Echo (ping) request, seq=3/1024, ttl=64
0.00:02.001 ICMP Echo (ping) reply, seq=2/1024, ttl=64

For a ping timeout of 3s:

0.00:00.000 ICMP Echo (ping) request, seq=1/1024, ttl=64
0.00:03.000 ICMP Echo (ping) request, seq=2/1024, ttl=64
0.00:03.001 ICMP Echo (ping) reply, seq=1/1024, ttl=64
0.00:06.000 ICMP Echo (ping) request, seq=3/1024, ttl=64
0.00:06.001 ICMP Echo (ping) reply, seq=2/1024, ttl=64

Closing the application has no effect, bringing the network interface down and up again (ifconfig eth0 down; ifconfig eth0 up) solves the issues and brings the latency back to normal.

This affects all traffic, so logging in over ssh takes a long time, as does sending a file using scp, where the transfer takes a while to start, but is sent with the same throughput as usual.

top shows nothing, the system is idle and behaves normal besides the high network latency.

ifconfig before high latency:

eth0      Link encap:Ethernet  HWaddr xx:xx:xx:xx:xx:xx
          inet addr:xxx.xxx.xxx.xxx  Bcast:xxx.xxx.xxx.255 Mask:
          RX packets:28450933 errors:0 dropped:0 overruns:0 frame:0
          TX packets:312497021 errors:134 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:1991541985 (1.8 GiB)  TX bytes:3038468710 (2.8 GiB)
          Interrupt:27 Base address:0x6000 

ifconfig during high latency:

eth0      Link encap:Ethernet  HWaddr xx:xx:xx:xx:xx:xx  
          inet addr:xxx.xxx.xxx.xxx  Bcast:xxx.xxx.xxx.255  Mask:
          RX packets:28476996 errors:0 dropped:0 overruns:0 frame:0
          TX packets:312589025 errors:134 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:1993344757 (1.8 GiB)  TX bytes:3333157248 (3.1 GiB)
          Interrupt:27 Base address:0x6000 

ethtool statistics during high latency:

mmc_tx_octetcount_gb: 295080166
mmc_tx_framecount_gb: 92109
mmc_tx_broadcastframe_g: 2
mmc_tx_multicastframe_g: 0
mmc_tx_64_octets_gb: 1459
mmc_tx_65_to_127_octets_gb: 313
mmc_tx_128_to_255_octets_gb: 94
mmc_tx_256_to_511_octets_gb: 31
mmc_tx_512_to_1023_octets_gb: 8807
mmc_tx_1024_to_max_octets_gb: 0
mmc_tx_unicast_gb: 92107
mmc_tx_multicast_gb: 0
mmc_tx_broadcast_gb: 2
mmc_tx_underflow_error: 0
mmc_tx_singlecol_g: 0
mmc_tx_multicol_g: 0
mmc_tx_deferred: 0
mmc_tx_latecol: 0
mmc_tx_exesscol: 0
mmc_tx_carrier_error: 0
mmc_tx_octetcount_g: 295080166
mmc_tx_framecount_g: 92109
mmc_tx_excessdef: 0
mmc_tx_pause_frame: 0
mmc_tx_vlan_frame_g: 0
mmc_rx_framecount_gb: 31891
mmc_rx_octetcount_gb: 2341252
mmc_rx_octetcount_g: 2341252
mmc_rx_broadcastframe_g: 26
mmc_rx_multicastframe_g: 0
mmc_rx_crc_error: 0
mmc_rx_align_error: 0
mmc_rx_run_error: 0
mmc_rx_jabber_error: 0
mmc_rx_undersize_g: 0
mmc_rx_oversize_g: 0
mmc_rx_64_octets_gb: 2243
mmc_rx_65_to_127_octets_gb: 29640
mmc_rx_128_to_255_octets_gb: 2
mmc_rx_256_to_511_octets_gb: 6
mmc_rx_512_to_1023_octets_gb: 0
mmc_rx_1024_to_max_octets_gb: 0
mmc_rx_unicast_g: 31865
mmc_rx_length_error: 0
mmc_rx_autofrangetype: 0
mmc_rx_pause_frames: 0
mmc_rx_fifo_overflow: 0
mmc_rx_vlan_frames_gb: 0
mmc_rx_watchdog_error: 0
mmc_rx_ipc_intr_mask: 1073692671
mmc_rx_ipc_intr: 0
mmc_rx_ipv4_gd: 31861
mmc_rx_ipv4_hderr: 0
mmc_rx_ipv4_nopay: 0
mmc_rx_ipv4_frag: 0
mmc_rx_ipv4_udsbl: 0
mmc_rx_ipv4_gd_octets: 1757432
mmc_rx_ipv4_hderr_octets: 0
mmc_rx_ipv4_nopay_octets: 0
mmc_rx_ipv4_frag_octets: 0
mmc_rx_ipv4_udsbl_octets: 0
mmc_rx_ipv6_gd_octets: 0
mmc_rx_ipv6_hderr_octets: 0
mmc_rx_ipv6_nopay_octets: 0
mmc_rx_ipv6_gd: 0
mmc_rx_ipv6_hderr: 0
mmc_rx_ipv6_nopay: 0
mmc_rx_udp_gd: 31788
mmc_rx_udp_err: 0
mmc_rx_tcp_gd: 0
mmc_rx_tcp_err: 0
mmc_rx_icmp_gd: 73
mmc_rx_icmp_err: 0
mmc_rx_udp_gd_octets: 1114188
mmc_rx_udp_err_octets: 0
mmc_rx_tcp_gd_octets: 0
mmc_rx_tcp_err_octets: 0
mmc_rx_icmp_gd_octets: 6024
mmc_rx_icmp_err_octets: 0
tx_underflow: 0
tx_carrier: 0
tx_losscarrier: 0
vlan_tag: 0
tx_deferred: 0
tx_vlan: 0
tx_jabber: 0
tx_frame_flushed: 0
tx_payload_error: 0
tx_ip_header_error: 0
rx_desc: 0
sa_filter_fail: 0
overflow_error: 0
ipc_csum_error: 0
rx_collision: 0
rx_crc_errors: 0
dribbling_bit: 0
rx_length: 0
rx_mii: 0
rx_multicast: 0
rx_gmac_overflow: 0
rx_watchdog: 0
da_rx_filter_fail: 0
sa_rx_filter_fail: 0
rx_missed_cntr: 0
rx_overflow_cntr: 0
rx_vlan: 0
tx_undeflow_irq: 0
tx_process_stopped_irq: 0
tx_jabber_irq: 0
rx_overflow_irq: 0
rx_buf_unav_irq: 0
rx_process_stopped_irq: 0
rx_watchdog_irq: 0
tx_early_irq: 0
fatal_bus_error_irq: 0
rx_early_irq: 19
threshold: 320
tx_pkt_n: 92109
rx_pkt_n: 26175
normal_irq_n: 3860
rx_normal_irq_n: 2426
napi_poll: 4202
tx_normal_irq_n: 1438
tx_clean: 4494
tx_set_ic_bit: 1439
irq_receive_pmt_irq_n: 0
mmc_tx_irq_n: 0
mmc_rx_irq_n: 0
mmc_rx_csum_offload_irq_n: 0
irq_tx_path_in_lpi_mode_n: 0
irq_tx_path_exit_lpi_mode_n: 0
irq_rx_path_in_lpi_mode_n: 0
irq_rx_path_exit_lpi_mode_n: 0
phy_eee_wakeup_error_n: 0
ip_hdr_err: 0
ip_payload_err: 0
ip_csum_bypassed: 0
ipv4_pkt_rcvd: 26145
ipv6_pkt_rcvd: 0
no_ptp_rx_msg_type_ext: 26145
ptp_rx_msg_type_sync: 0
ptp_rx_msg_type_follow_up: 0
ptp_rx_msg_type_delay_req: 0
ptp_rx_msg_type_delay_resp: 0
ptp_rx_msg_type_pdelay_req: 0
ptp_rx_msg_type_pdelay_resp: 0
ptp_rx_msg_type_pdelay_follow_up: 0
ptp_rx_msg_type_announce: 0
ptp_rx_msg_type_management: 0
ptp_rx_msg_pkt_reserved_type: 0
ptp_frame_type: 0
ptp_ver: 0
timestamp_dropped: 0
av_pkt_rcvd: 0
av_tagged_pkt_rcvd: 0
vlan_tag_priority_val: 0
l3_filter_match: 0
l4_filter_match: 0
l3_l4_filter_no_match: 0
irq_pcs_ane_n: 0
irq_pcs_link_n: 0
irq_rgmii_n: 0
mtl_tx_status_fifo_full: 0
mtl_tx_fifo_not_empty: 0
mmtl_fifo_ctrl: 0
mtl_tx_fifo_read_ctrl_write: 0
mtl_tx_fifo_read_ctrl_wait: 0
mtl_tx_fifo_read_ctrl_read: 0
mtl_tx_fifo_read_ctrl_idle: 0
mac_tx_in_pause: 0
mac_tx_frame_ctrl_xfer: 0
mac_tx_frame_ctrl_idle: 0
mac_tx_frame_ctrl_wait: 0
mac_tx_frame_ctrl_pause: 0
mac_gmii_tx_proto_engine: 0
mtl_rx_fifo_fill_level_full: 0
mtl_rx_fifo_fill_above_thresh: 0
mtl_rx_fifo_fill_below_thresh: 0
mtl_rx_fifo_fill_level_empty: 1
mtl_rx_fifo_read_ctrl_flush: 0
mtl_rx_fifo_read_ctrl_read_data: 0
mtl_rx_fifo_read_ctrl_status: 1
mtl_rx_fifo_read_ctrl_idle: 0
mtl_rx_fifo_ctrl_active: 0
mac_rx_frame_ctrl_fifo: 0
mac_gmii_rx_proto_engine: 0
tx_tso_frames: 0
tx_tso_nfrags: 0

ethtool features:

rx-checksumming: on
tx-checksumming: off
        tx-checksum-ipv4: off [requested on]
        tx-checksum-ip-generic: off [fixed]
        tx-checksum-ipv6: off [requested on]
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
scatter-gather: on
        tx-scatter-gather: on
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
        tx-tcp-segmentation: off [fixed]
        tx-tcp-ecn-segmentation: off [fixed]
        tx-tcp-mangleid-segmentation: off [fixed]
        tx-tcp6-segmentation: off [fixed]
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: off [fixed]
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: on [fixed]
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-gre-csum-segmentation: off [fixed]
tx-ipxip4-segmentation: off [fixed]
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-udp_tnl-csum-segmentation: off [fixed]
tx-gso-partial: off [fixed]
tx-sctp-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off [fixed]

ethtool -g eth0 gives me Cannot get device ring settings: Operation not supported

ss reports a very high number for udpInErrors (which in RFC 1213 is described as The number of received UDP datagrams that could not be delivered for reasons other than the lack of an application at the destination port.), and that same high number for UdpRcvbufErrors (for which I cannot find a description).

Limiting or even dropping all incoming traffic to that port using iptables (both on INPUT as well as PREROUTING chain) doesn’t solve the issue, while configuring the device as a bridge and then limiting bridged traffic using ebtables does solve the issue.

What does this mean? I would assume that there is no hardware-related problem, and that somewhere between OSI layers 2 and 3, this problem is introduced. Where can I find more information or ways to analyze this problem? Where can I find a detailed description of how the Linux kernel handles data between these two layers?

The issue seems to be solved by switching to kernel 4.13, most likely due to changes in the DWMAC driver.