This showed up in our testing in the last couple days. You unplug and replug, it sets up the driver right that time, and life is good, and we achieve 100+Mbit/sec speeds with VERY low latency.
I can duplicate it every time now. I was far too tired after debugging the problem for 2 days straight to figure out where it is, but I think it should be easy to find.
Basically, the debloat package in cerowrt would trigger it - but only under those circumstances. Saw it, maybe, rarely, on 100Mbit on the wan port. It explains a lot - reports of dhcp problems, etc, etc.
In doing far more extensive testing, I was able to not only crash the wan port in a little over 500 packets, but actually get a kernel oops, even without fiddling with these parameters. I will upload the oops later (it was not very revealing regardless).
These routers do have a patch to their mac address creation routine, but I don’t think that is the problem. What I am thinking is that there is a real & SUBTLE problem in the reset routines (and/or the ring buffer) that happens when there is lots of other traffic on the wan.
I will look at this MUCH harder after I get caught up on sleep and back to my own lab.
Although you can ifconfig down and up and get back in business, this is the oops I get:
ADDRCONF (NETDEV_UP): ge00: link is not ready
ar71xx: pll_reg 0xb8050014: 0x11110000
ge00: link up (1000Mbps/Full duplex)
ADDRCONF (NETDEV_CHANGE): ge00: link becomes ready
------------[ cut here ]------------
WARNING: at net/sched/sch_generic.c:256 dev_watchdog+0x16c/0x274()
NETDEV WATCHDOG: ge00 (ag71xx): transmit queue 0 timed out
Modules linked in: gpio_buttons xt_hashlimit ip6t_REJECT ip6t_LOG
ip6t_rt ip6t_hbh ip6t_mh ip6t_ipv6header ip6t_frag ip6t_eui64
ip6t_ah ip6table_raw ip6_queue ip6table_mangle ip6table_filter
ip6_tables nf_conntrack_ipv6 nf_defrag_ipv6 ipt_SET ipt_set
ip_set_setlist ip_set_portmap ip_set_nethash ip_set_macipmap
ip_set_iptreemap ip_set_iptree ip_set_ipportnethash
ip_set_ipportiphash ip_set_ipporthash ip_set_ipmap ip_set_iphash
ip_set nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_conntrack_ftp
xt_HL xt_hl ipt_ECN xt_CLASSIFY xt_time xt_tcpmss xt_statistic
xt_mark xt_length ipt_ecn xt_DSCP xt_dscp xt_string xt_layer7
xt_quota xt_pkttype xt_physdev xt_owner ipt_MASQUERADE iptable_nat
nf_nat xt_recent xt_helper xt_connmark xt_connbytes xt_conntrack
xt_NOTRACK iptable_raw xt_state nf_conntrack_ipv4 nf_defrag_ipv4
nf_conntrack pppoe pppox ipt_REJECT xt_TCPMSS ipt_LOG xt_comment
xt_multiport xt_mac xt_limit iptable_mangle iptable_filter
ip_tables xt_tcpudp x_tables ifb sit tunnel4 tun ppp_async
ppp_generic slhc vfat fat autofs4 ath9k ath9k_common ath9k_hw ath
nls_utf8 nls_iso8859_2 nls_iso8859_15 nls_iso8859_13
nls_iso8859_1 nls_cp437 mac80211 ts_fsm ts_bm ts_kmp crc_ccitt
cfg80211 compat arc4 aes_generic crypto_algapi ipv6 usb_storage
ohci_hcd ehci_hcd sd_mod ext4 jbd2 usbcore scsi_mod nls_base
mbcache crc16 leds_gpio button_hotplug gpio_keys_polled
input_polldev input_core
Call Trace:
[<8026ce24>] dump_stack+0x8/0x34
[<80075238>] warn_slowpath_common+0x78/0xa4
[<800752ec>] warn_slowpath_fmt+0x2c/0x38
[<801ef7d0>] dev_watchdog+0x16c/0x274
[<8007f468>] run_timer_softirq+0x14c/0x1ec
[<8007a9f4>] __do_softirq+0xac/0x15c
[<8007abfc>] do_softirq+0x48/0x68
[<800610e0>] plat_irq_dispatch+0x4c/0x17c
[<8006258c>] ret_from_irq+0x0/0x4
[<80062780>] r4k_wait+0x20/0x40
[<800640fc>] cpu_idle+0x24/0x44
[<802fc8d8>] start_kernel+0x36c/0x38c
---[ end trace d0fa80935a954c41 ]---
ge00: tx timeout
ge00: tx timeout
ge00: tx timeout
ge00: tx timeout
ge00: tx timeout
ge00: tx timeout
ge00: tx timeout
ge00: tx timeout
ge00: tx timeout
ge00: tx timeout
ge00: tx timeout
ge00: tx timeout
ge00: link down
ADDRCONF (NETDEV_UP): ge00: link is not ready
ar71xx: pll_reg 0xb8050014: 0x11110000
ge00: link up (1000Mbps/Full duplex)
ADDRCONF (NETDEV_CHANGE): ge00: link becomes ready
ge00: no IPv6 routers present
ge00: link down
ar71xx: pll_reg 0xb8050014: 0x11110000
ge00: link up (1000Mbps/Full duplex)
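The ifconfig down/up recovery mentioned before the trace could be scripted as a stopgap until the real bug is found. This is only a rough sketch: it assumes the kernel log carries the same `ge00: tx timeout` lines shown above, and the function names (`count_tx_timeouts`, `bounce_if_stuck`) and the threshold of 3 are made up for illustration, not anything shipped in cerowrt.

```shell
#!/bin/sh
# Count "<dev>: tx timeout" lines in kernel log text read from stdin.
# grep -c exits non-zero when it finds nothing, hence the "|| true".
count_tx_timeouts() {
    grep -c "$1: tx timeout" || true
}

# Hypothetical helper: bounce the interface once the kernel log holds
# at least $2 (default 3) tx timeouts for device $1.
bounce_if_stuck() {
    dev="$1"
    threshold="${2:-3}"
    n=$(dmesg | count_tx_timeouts "$dev")
    if [ "$n" -ge "$threshold" ]; then
        ifconfig "$dev" down
        ifconfig "$dev" up
    fi
}

# Example (not run here): bounce_if_stuck ge00 3
```

Note that dmesg keeps old lines around, so calling this in a loop would keep bouncing the port unless the log is cleared between checks. It papers over the problem rather than fixing it.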
http://huchra.bufferbloat.net/~cero1/cerowrt-wndr3700-1.0rc2/
Although it fixes nearly all the other outstanding priority 1 bugs, this one is a showstopper, so I’m cancelling the rc2 release and going on to rc3 this week.
Either this means that GigE/100Mbit is not correctly being detected, or that the switch configuration for the blinkenlights is wrong, or….
case \$devtype in
0) ethtool -G \$DEV tx 4 ;
ip link set \$DEV txqueuelen 8;;
ge00 Link encap:Ethernet HWaddr C4:3D:C7:98:69:15
inet addr:172.30.42.45 Bcast:172.30.42.63 Mask:255.255.255.224
inet6 addr: fe80::c63d:c7ff:fe98:6915/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:4 errors:0 dropped:0 overruns:0 frame:0
TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:8
RX bytes:872 (872.0 B) TX bytes:1146 (1.1 KiB)
Interrupt:5
WARNING: at net/sched/sch_generic.c:256 dev_watchdog+0x16c/0x274()
NETDEV WATCHDOG: ge00 (ag71xx): transmit queue 0 timed out
Modules linked in: gpio_buttons xt_hashlimit ip6t_REJECT ip6t_LOG
ip6t_rt ip6t_hbh ip6t_mh ip6t_ipv6header ip6t_frag ip6t_eui64
ip6t_ah ip6table_raw ip6_queue ip6table_mangle ip6table_filter
ip6_tables nf_conntrack_ipv6 nf_defrag_ipv6 ipt_SET ipt_set
ip_set_setlist ip_set_portmap ip_set_nethash ip_set_macipmap
ip_set_iptreemap ip_set_iptree ip_set_ipportnethash
ip_set_ipportiphash ip_set_ipporthash ip_set_ipmap ip_set_iphash
ip_set nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_conntrack_ftp
xt_HL xt_hl ipt_ECN xt_CLASSIFY xt_time xt_tcpmss xt_statistic
xt_mark xt_length ipt_ecn xt_DSCP xt_dscp xt_string xt_layer7
xt_quota xt_pkttype xt_physdev xt_owner ipt_MASQUERADE iptable_nat
nf_nat xt_recent xt_helper xt_connmark xt_connbytes xt_conntrack
xt_NOTRACK iptable_raw xt_state nf_conntrack_ipv4 nf_defrag_ipv4
nf_conntrack pppoe pppox ipt_REJECT xt_TCPMSS ipt_LOG xt_comment
xt_multiport xt_mac xt_limit iptable_mangle iptable_filter
ip_tables xt_tcpudp x_tables ifb sit tunnel4 tun ppp_async
ppp_generic slhc vfat fat autofs4 ath9k ath9k_common ath9k_hw ath
nls_utf8 nls_iso8859_2 nls_iso8859_15 nls_iso8859_13
nls_iso8859_1 nls_cp437 mac80211 ts_fsm ts_bm ts_kmp crc_ccitt
cfg80211 compat arc4 aes_generic crypto_algapi ipv6 usb_storage
ohci_hcd ehci_hcd sd_mod ext4 jbd2 usbcore scsi_mod nls_base
mbcache crc16 leds_gpio button_hotplug gpio_keys_polled
input_polldev input_core
Call Trace:
[<8026ce24>] dump_stack+0x8/0x34
[<80075238>] warn_slowpath_common+0x78/0xa4
[<800752ec>] warn_slowpath_fmt+0x2c/0x38
[<801ef7d0>] dev_watchdog+0x16c/0x274
[<8007f468>] run_timer_softirq+0x14c/0x1ec
[<8007a9f4>] __do_softirq+0xac/0x15c
[<8007abfc>] do_softirq+0x48/0x68
[<800610e0>] plat_irq_dispatch+0x4c/0x17c
[<8006258c>] ret_from_irq+0x0/0x4
[<80062780>] r4k_wait+0x20/0x40
[<800640fc>] cpu_idle+0x24/0x44
[<802fc8d8>] start_kernel+0x36c/0x38c
---[ end trace 99006a3a445e09e2 ]---
ge00: tx timeout
but changing txqueuelen seems to work.
I note that I rename devices in part because I can never remember which are the lan/wan ports and in part to make firewall rules easier. ge00 is the wan port, which also has the patch to give it a unique mac…
Moving on to attempting this change much earlier in the boot.
All Linux network drivers should do that, actually. The problems we are having with the ar71xx… are just the first step on a long road towards getting there.
I ran into similar problems attempting to debloat the common laptop ‘e1000’ driver, where it worked at 100Mbit, but failed to do TSO offload properly at gigE speeds, with buffers below 64.
I moved resetting ethtool to VERY early in the boot sequence in the S10boot script… and got a working boot. Need to do some load testing…
killall -q hotplug2
# Change device buffers
ethtool -G eth0 tx 4
ethtool -G eth1 tx 4
# change device names
/sbin/fixeth
root@OpenWrt:~# ifconfig se00
se00 Link encap:Ethernet HWaddr C6:3D:C7:98:69:14
inet addr:172.30.42.33 Bcast:172.30.42.63 Mask:255.255.255.224
inet6 addr: fe80::c43d:c7ff:fe98:6914/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1116 errors:0 dropped:7 overruns:12 frame:0
TX packets:1059 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:8
RX bytes:105475 (103.0 KiB) TX bytes:113162 (110.5 KiB)
Interrupt:4
root@OpenWrt:~# ifconfig ge00
ge00 Link encap:Ethernet HWaddr C4:3D:C7:98:69:15
inet addr:172.30.42.45 Bcast:172.30.42.63 Mask:255.255.255.224
inet6 addr: fe80::c63d:c7ff:fe98:6915/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:86 errors:0 dropped:0 overruns:0 frame:0
TX packets:77 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:8
RX bytes:11302 (11.0 KiB) TX bytes:11454 (11.1 KiB)
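The ifconfig output above confirms txqueuelen, but not the TX ring size set with `ethtool -G` in the boot script. One way to check that might be to read it back with `ethtool -g`. A small sketch, assuming the usual `ethtool -g` layout (a "Pre-set maximums:" section followed by a "Current hardware settings:" section); the helper name `current_tx_ring` is made up here.

```shell
#!/bin/sh
# Print the current TX ring size from `ethtool -g <dev>` output on stdin.
# Only the "TX:" line after "Current hardware settings:" is used, so the
# pre-set maximum is skipped.
current_tx_ring() {
    awk '/^Current hardware settings/ { cur = 1; next }
         cur && $1 == "TX:" { print $2; exit }'
}

# Example (not run here): ethtool -g eth0 | current_tx_ring
```

If the driver accepted the early-boot change, this should report the value passed to `ethtool -G`.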
This is a major bug in the design of Linux (and other OSes), and we’ll be discussing this at LPC, I hope.
And yes, the default txqueuelen the driver should return should depend on the link speed; this is a “safe” change. Cutting to 100 packets at 100Mbps, and (maybe) 10 at 10Mbps should have the same effect as we have now without any chance of introducing problems. It’s certainly the short term “hack” until we have more intelligent buffer management across the rings and transmit queue (and queue disciplines).
Which is what I’m doing in cerowrt 1.0.
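To make the speed-dependent txqueuelen idea concrete, here is a rough userspace sketch. It is not the cerowrt code. It assumes the stock Ethernet default of 1000 packets at gigabit, scales linearly below that (100 at 100Mbps, 10 at 10Mbps, as suggested above), and reads the negotiated speed from `/sys/class/net/<dev>/speed`. The function names are made up for illustration.

```shell
#!/bin/sh
# Map a link speed in Mbit/s to a transmit queue length.
# Anything unknown, invalid, or >= 1000 falls back to 1000 packets.
txqueuelen_for_speed() {
    case "$1" in
        ''|*[!0-9]*) echo 1000; return ;;   # empty, "-1", garbage
    esac
    if [ "$1" -ge 1 ] && [ "$1" -lt 1000 ]; then
        echo "$1"
    else
        echo 1000
    fi
}

# Hypothetical helper: apply the mapping to a device. sysfs reports -1
# (or fails to read) when the speed is unknown, which maps to 1000.
apply_txqueuelen() {
    dev="$1"
    speed=$(cat "/sys/class/net/$dev/speed" 2>/dev/null) || speed=""
    ip link set dev "$dev" txqueuelen "$(txqueuelen_for_speed "$speed")"
}

# Example (not run here): apply_txqueuelen ge00
```

Doing this in the driver would be cleaner, since a script like this has to be re-run whenever the link renegotiates.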
but bql eliminates the need, and the default is now 64 anyway.