Bug #360

Patchwork [OpenWrt-Devel] ar71xx TCP/IPsec unaligned instructions

Added by Ketan Kulkarni on Apr 11, 2012. Updated on Dec 22, 2013.
In Progress Normal Dave Täht

Description

It would be good to have this patch:
http://patchwork.midlink.org/patch/1721/
The patch is for IPv4 (I have yet to test it on IPv6).

Tested (without the patch) on 3.3-1.4.
Here is what the “unaligned_instructions” traps show (dmesg.logs):


[ 803.894531] Stack : 00000000 803011fc 86adfaa0 86adfaa0 869ced60 8721bf00 86210900 86210900
[ 803.894531] 86b2f882 86b2f86e 80360000 80231494 80213490 00000001 8721bf00 80360000
[ 803.894531] 80213888 87d48000 8721bf00 c0a80121 00000000 8721bf00 86b2f882 c0a80121
[ 803.894531] 86210900 8721bf00 86b2f882 8023350c 00000000 86adfa90 80213490 80000000
[ 803.894531] c0a80121 0000d859 00000003 80000000 8721bf00 8029be84 00000001 80301614
[ 803.894531] …
[ 803.894531] Call Trace:
[ 803.894531] [<8022abb4>] tcp_rcv_established+0x34/0x690
[ 803.894531] [<80231494>] tcp_v4_do_rcv+0x40/0x244
[ 803.894531] [<8023350c>] tcp_v4_rcv+0x59c/0x944
[ 803.894531] [<802139f8>] ip_local_deliver_finish+0x170/0x29c
[ 803.894531] [<801e8290>] __netif_receive_skb+0x458/0x4c8
..
..

Here is what oprofile shows (oprof_raw.txt):

CPU: MIPS 24K, speed 676 MHz (estimated)
Counted INSTRUCTIONS events (1-0 Instructions completed) with a unit mask of 0x00 (No unit mask) count 100000
samples % app name symbol name
-------------------------------------------------------------------------------
548 7.6977 vmlinux __do_softirq
548 100.000 vmlinux __do_softirq [self]
-------------------------------------------------------------------------------
370 5.1974 libuClibc-0.9.33.so /lib/libuClibc-0.9.33.so
370 100.000 libuClibc-0.9.33.so /lib/libuClibc-0.9.33.so [self]
-------------------------------------------------------------------------------
334 4.6917 vmlinux __copy_user
334 100.000 vmlinux __copy_user [self]
-------------------------------------------------------------------------------
276 3.8769 ip_tables /ip_tables
276 100.000 ip_tables /ip_tables [self]
-------------------------------------------------------------------------------
271 3.8067 vmlinux csum_partial
271 100.000 vmlinux csum_partial [self]
-------------------------------------------------------------------------------
246 3.4555 vmlinux __copy_user_inatomic
246 100.000 vmlinux __copy_user_inatomic [self]
-------------------------------------------------------------------------------
241 3.3853 nf_conntrack /nf_conntrack
241 100.000 nf_conntrack /nf_conntrack [self]
-------------------------------------------------------------------------------
222 3.1184 dropbear /usr/sbin/dropbear
222 100.000 dropbear /usr/sbin/dropbear [self]
-------------------------------------------------------------------------------
216 3.0341 vmlinux tick_nohz_idle_enter
216 100.000 vmlinux tick_nohz_idle_enter [self]
-------------------------------------------------------------------------------
205 2.8796 vmlinux tick_nohz_idle_exit
205 100.000 vmlinux tick_nohz_idle_exit [self]
-------------------------------------------------------------------------------
185 2.5987 vmlinux r4k_wait
185 100.000 vmlinux r4k_wait [self]
-------------------------------------------------------------------------------
154 2.1632 vmlinux finish_task_switch.constprop.62
154 100.000 vmlinux finish_task_switch.constprop.62 [self]
-------------------------------------------------------------------------------
116 1.6294 busybox /bin/busybox
116 100.000 busybox /bin/busybox [self]
-------------------------------------------------------------------------------
93 1.3064 ld-uClibc-0.9.33.so /lib/ld-uClibc-0.9.33.so
93 100.000 ld-uClibc-0.9.33.so /lib/ld-uClibc-0.9.33.so [self]
-------------------------------------------------------------------------------
92 1.2923 ath9k /ath9k
92 100.000 ath9k /ath9k [self]
-------------------------------------------------------------------------------
84 1.1799 vmlinux local_bh_enable
84 100.000 vmlinux local_bh_enable [self]
-------------------------------------------------------------------------------
82 1.1518 vmlinux __wake_up_sync_key
82 100.000 vmlinux __wake_up_sync_key [self]
-------------------------------------------------------------------------------
72 1.0114 vmlinux __bzero
72 100.000 vmlinux __bzero [self]
-------------------------------------------------------------------------------
69 0.9692 vmlinux nf_iterate
69 100.000 vmlinux nf_iterate [self]
-------------------------------------------------------------------------------
63 0.8850 libdns.so.93.1.0 /usr/lib/libdns.so.93.1.0
63 100.000 libdns.so.93.1.0 /usr/lib/libdns.so.93.1.0 [self]
-------------------------------------------------------------------------------
56 0.7866 vmlinux ag71xx_poll
56 100.000 vmlinux ag71xx_poll [self]
-------------------------------------------------------------------------------
56 0.7866 vmlinux tcp_rcv_established
56 100.000 vmlinux tcp_rcv_established [self]
-------------------------------------------------------------------------------
55 0.7726 oprofiled /usr/bin/oprofiled
55 100.000 oprofiled /usr/bin/oprofiled [self]
-------------------------------------------------------------------------------
53 0.7445 mac80211 /mac80211
53 100.000 mac80211 /mac80211 [self]
-------------------------------------------------------------------------------
53 0.7445 vmlinux do_ade
53 100.000 vmlinux do_ade [self]

Attachments

  • dmesg.logs (application/octet-stream; 168.4 kiB) Ketan Kulkarni Apr 11, 2012
  • ip6_mc_input-copy_dest_addr.patch (text/x-patch; 742 bytes) Patch to copy destination address in ipv6_chk_mcast_addr before calling helper functions. Robert Bradley Apr 11, 2012
  • skb_cb_packed.patch (text/x-patch; 676 bytes) Pack tcp_skb_cb and udp_skb_cb Robert Bradley Apr 13, 2012
  • igmp_tcp_alignment.patch (text/x-patch; 1.6 kiB) igmp* packed, tcp_parse_aligned_timestamp handles unaligned timestamps Robert Bradley Apr 15, 2012
  • asm.patch (text/x-patch; 3.9 kiB) Cleaned up asm checksumming patch (for reference) - assembles now, but not trusted or really necessary. Robert Bradley Apr 16, 2012
  • ip6_route_input.patch (text/x-patch; 2.0 kiB) Unaligned access patches to ip6_route_input and __ipv6_addr_type Robert Bradley Apr 18, 2012
  • ipv6_monster_patch.patch (text/x-patch; 5.4 kiB) Monster ipv6 patch replacing virtually every *(__be32 *) instance with __get_unaligned_cpu32() Robert Bradley Apr 18, 2012
  • skb_flow_dissect.patch (text/x-patch; 1.7 kiB) Mark vlan_hdr as packed and patch skb_flow_dissect() for unaligned access Robert Bradley Apr 18, 2012

History

Updated by Dave Täht on Apr 11, 2012.
root@OpenWrt:/sys/kernel/debug/mips# dmesg

[49066.539062] Code: 8e030008 00831826 00431025 <8e44000c> 8e03000c 00831826 00431025 10400006 00000000
[49066.539062] Cpu 0
[49066.539062] $ 0 : 00000000 00000001 00000000 00010004
[49066.539062] $ 4 : 00010006 82b24086 00000000 802ffc84
[49066.539062] $ 8 : fe800000 00000000 02e081ff fe3319ca
[49066.539062] $12 : 00000014 0014002b 00000001 00000001
[49066.539062] $16 : 8315b380 00000000 82b24086 00000000
[49066.539062] $20 : 00000000 80300000 80320000 80300b38
[49066.539062] $24 : 00000000 83186264
[49066.542968] $28 : 802fe000 802ffc90 80315c80 831a32e0
[49066.542968] Hi : 000002cf
[49066.542968] Lo : 04270c00
[49066.542968] epc : 831a32e8 ipv6_chk_mcast_addr+0x3c/0x170 [ipv6]
[49066.542968] Tainted: G O
[49066.542968] ra : 831a32e0 ipv6_chk_mcast_addr+0x34/0x170 [ipv6]
[49066.542968] Status: 1000fc03 KERNEL EXL IE
[49066.542968] Cause : 00800010
[49066.542968] BadVA : 82b2408a
[49066.542968] PrId : 00019374 (MIPS 24Kc)
[49066.542968] Modules linked in: cls_fw sch_htb sch_red sch_sfq cls_flow cls_u32 sch_qfq gpio_buttons ath79_wdt xt_hashlimit ip6t_REJECT ip6t_LOG ip6t_rt ip6t_hbh ip6t_mh ip6t_ipv6header ip6t_frag ip6t_eui64 ip6t_ah ip6table_raw ip6_queue ip6table_mangle ip6table_filter ip6_tables nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_conntrack_ftp xt_policy xt_esp ipt_ah xt_HL xt_hl xt_ecn ipt_ECN xt_CLASSIFY xt_time xt_tcpmss xt_statistic xt_mark xt_length xt_DSCP xt_dscp xt_string xt_layer7 xt_quota xt_pkttype xt_physdev xt_owner ipt_MASQUERADE iptable_nat nf_nat xt_recent xt_helper xt_connmark xt_connbytes pptp xt_conntrack xt_CT xt_NOTRACK iptable_raw xt_state nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack pppoe pppox ipt_REJECT xt_TCPMSS ipt_LOG xt_comment xt_multiport xt_mac xt_limit iptable_mangle iptable_filter ip_tables xt_tcpudp x_tables ip_gre gre ifb sit ipcomp xfrm4_tunnel xfrm4_mode_tunnel xfrm4_mode_transport xfrm4_mode_beet esp4 ah4 tunnel4 tun tcp_ledbat(O) ppp_async ppp_generic slhc xfrm_user xfrm_ipcomp af_key vfat fat autofs4 button_hotplug(O) ath9k(O) ath9k_common(O) ath9k_hw(O) ath(O) nls_utf8 nls_iso8859_2 nls_iso8859_15 nls_iso8859_13 nls_iso8859_1 nls_cp437 mac80211(O) ts_fsm ts_bm ts_kmp crc_ccitt ipv6 input_polldev cfg80211(O) compat(O) input_core chainiv eseqiv crypto_wq sha1_generic krng rng md5 hmac des_generic cbc authenc arc4 aes_generic crypto_blkcipher cryptomgr aead usb_storage ohci_hcd ehci_hcd sd_mod ext4 jbd2 mbcache usbcore usb_common scsi_mod nls_base crc16 zlib_deflate crypto_hash crypto_algapi ledtrig_timer ledtrig_default_on leds_gpio gpio_button_hotplug(O)
[49066.542968] Process swapper (pid: 0, threadinfo=802fe000, task=803020c0, tls=00000000)
[49066.542968] Stack : fe3319ca 0c000000 00000000 80300000 83f93000 83f93000 8035d470 82b24086
[49066.542968] 82b2406e 83186338 00000002 00000000 00000000 831858a0 80000000 800ee148
[49066.542968] 80300b00 80300b20 000086dd 83d44000 00000000 801e8320 83d44000 80096fc4
[49066.542968] 00000000 00000000 80300b20 80091b2c 4f859d0a 1669a157 83d4438c 83d44380
[49066.542968] 83f93000 83d44000 00000001 83d44000 0000007c 000007b0 83d44238 801d06ec
[49066.542968] …
[49066.542968] Call Trace:
[49066.542968] [<831a32e8>] ipv6_chk_mcast_addr+0x3c/0x170 [ipv6]
[49066.542968] [<83186338>] ip6_mc_input+0xd4/0x4dc [ipv6]
[49066.542968] [<801e8320>] __netif_receive_skb+0x458/0x4c8
[49066.542968] [<801d06ec>] ag71xx_poll+0x3b0/0x66c
[49066.542968] [<801e8740>] net_rx_action+0x88/0x1c8
[49066.542968] [<80076e14>] __do_softirq+0xa0/0x154
[49066.542968] [<80077020>] do_softirq+0x48/0x68
[49066.542968] [<80077254>] irq_exit+0x4c/0xb0
[49066.542968] [<80062d4c>] ret_from_irq+0x0/0x4
[49066.542968] [<80062f40>] r4k_wait+0x20/0x40
[49066.542968] [<80064a1c>] cpu_idle+0x38/0x70
[49066.542968] [<8031a8c8>] start_kernel+0x37c/0x39c
[49066.542968]
[49066.542968]
[49066.542968] Code: 00000000 08c68ccb 8e10000c <8e430004> 8e440000 00621826 8e020000 00821026 00621025
[49066.542968] Cpu 0
[49066.542968] $ 0 : 00000000 00000001 00000000 00000000
[49066.542968] $ 4 : 00010006 82b24086 00000000 802ffc84
[49066.542968] $ 8 : fe800000 00000000 02e081ff fe3319ca
[49066.542968] $12 : 00000014 0014002b 00000001 00000001
[49066.542968] $16 : 8315b380 00000000 82b24086 00000000
[49066.542968] $20 : 00000000 80300000 80320000 80300b38
[49066.542968] $24 : 00000000 83186264
[49066.542968] $28 : 802fe000 802ffc90 80315c80 831a32e0
[49066.542968] Hi : 000002cf
[49066.542968] Lo : 04270c00
[49066.542968] epc : 831a32ec ipv6_chk_mcast_addr+0x40/0x170 [ipv6]
[49066.542968] Tainted: G O
[49066.542968] ra : 831a32e0 ipv6_chk_mcast_addr+0x34/0x170 [ipv6]
[49066.542968] Status: 1000fc03 KERNEL EXL IE
[49066.542968] Cause : 00800010
[49066.542968] BadVA : 82b24086
[49066.542968] PrId : 00019374 (MIPS 24Kc)
[49066.542968] Modules linked in: cls_fw sch_htb sch_red sch_sfq cls_flow cls_u32 sch_qfq gpio_buttons ath79_wdt xt_hashlimit ip6t_REJECT ip6t_LOG ip6t_rt ip6t_hbh ip6t_mh ip6t_ipv6header ip6t_frag ip6t_eui64 ip6t_ah ip6table_raw ip6_queue ip6table_mangle ip6table_filter ip6_tables nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_conntrack_ftp xt_policy xt_esp ipt_ah xt_HL xt_hl xt_ecn ipt_ECN xt_CLASSIFY xt_time xt_tcpmss xt_statistic xt_mark xt_length xt_DSCP xt_dscp xt_string xt_layer7 xt_quota xt_pkttype xt_physdev xt_owner ipt_MASQUERADE iptable_nat nf_nat xt_recent xt_helper xt_connmark xt_connbytes pptp xt_conntrack xt_CT xt_NOTRACK iptable_raw xt_state nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack pppoe pppox ipt_REJECT xt_TCPMSS ipt_LOG xt_comment xt_multiport xt_mac xt_limit iptable_mangle iptable_filter ip_tables xt_tcpudp x_tables ip_gre gre ifb sit ipcomp xfrm4_tunnel xfrm4_mode_tunnel xfrm4_mode_transport xfrm4_mode_beet esp4 ah4 tunnel4 tun tcp_ledbat(O) ppp_async ppp_generic slhc xfrm_user xfrm_ipcomp af_key vfat fat autofs4 button_hotplug(O) ath9k(O) ath9k_common(O) ath9k_hw(O) ath(O) nls_utf8 nls_iso8859_2 nls_iso8859_15 nls_iso8859_13 nls_iso8859_1 nls_cp437 mac80211(O) ts_fsm ts_bm ts_kmp crc_ccitt ipv6 input_polldev cfg80211(O) compat(O) input_core chainiv eseqiv crypto_wq sha1_generic krng rng md5 hmac des_generic cbc authenc arc4 aes_generic crypto_blkcipher cryptomgr aead usb_storage ohci_hcd ehci_hcd sd_mod ext4 jbd2 mbcache usbcore usb_common scsi_mod nls_base crc16 zlib_deflate crypto_hash crypto_algapi ledtrig_timer ledtrig_default_on leds_gpio gpio_button_hotplug(O)
[49066.542968] Process swapper (pid: 0, threadinfo=802fe000, task=803020c0, tls=00000000)
[49066.542968] Stack : fe3319ca 0c000000 00000000 80300000 83f93000 83f93000 8035d470 82b24086
[49066.542968] 82b2406e 83186338 00000002 00000000 00000000 831858a0 80000000 800ee148
[49066.542968] 80300b00 80300b20 000086dd 83d44000 00000000 801e8320 83d44000 80096fc4
[49066.542968] 00000000 00000000 80300b20 80091b2c 4f859d0a 1669a157 83d4438c 83d44380
[49066.542968] 83f93000 83d44000 00000001 83d44000 0000007c 000007b0 83d44238 801d06ec
[49066.542968] …
[49066.542968] Call Trace:
[49066.542968] [<831a32ec>] ipv6_chk_mcast_addr+0x40/0x170 [ipv6]
[49066.542968] [<83186338>] ip6_mc_input+0xd4/0x4dc [ipv6]
[49066.542968] [<801e8320>] __netif_receive_skb+0x458/0x4c8
[49066.542968] [<801d06ec>] ag71xx_poll+0x3b0/0x66c
[49066.542968] [<801e8740>] net_rx_action+0x88/0x1c8
[49066.542968] [<80076e14>] __do_softirq+0xa0/0x154
[49066.542968] [<80077020>] do_softirq+0x48/0x68
[49066.542968] [<80077254>] irq_exit+0x4c/0xb0
[49066.542968] [<80062d4c>] ret_from_irq+0x0/0x4
[49066.542968] [<80062f40>] r4k_wait+0x20/0x40
[49066.542968] [<80064a1c>] cpu_idle+0x38/0x70
[49066.542968] [<8031a8c8>] start_kernel+0x37c/0x39c
[49066.542968]
[49066.542968]
[49066.542968] Code: 08c68ccb 8e10000c 8e430004 <8e440000> 00621826 8e020000 00821026 00621025 8e440008
[49066.542968] Cpu 0
[49066.542968] $ 0 : 00000000 00000001 00000000 00000000
[49066.542968] $ 4 : ff020000 82b24086 00000000 802ffc84
[49066.542968] $ 8 : fe800000 00000000 02e081ff fe3319ca
[49066.542968] $12 : 00000014 0014002b 00000001 00000001
[49066.542968] $16 : 8315b380 00000000 82b24086 00000000
[49066.542968] $20 : 00000000 80300000 80320000 80300b38
[49066.542968] $24 : 00000000 83186264
[49066.542968] $28 : 802fe000 802ffc90 80315c80 831a32e0
[49066.542968] Hi : 000002cf
[49066.542968] Lo : 04270c00
[49066.542968] epc : 831a3300 ipv6_chk_mcast_addr+0x54/0x170 [ipv6]
[49066.542968] Tainted: G O
[49066.542968] ra : 831a32e0 ipv6_chk_mcast_addr+0x34/0x170 [ipv6]
[49066.542968] Status: 1000fc03 KERNEL EXL IE
[49066.542968] Cause : 00800010
[49066.542968] BadVA : 82b2408e
[49066.542968] PrId : 00019374 (MIPS 24Kc)
[49066.542968] Modules linked in: cls_fw sch_htb sch_red sch_sfq cls_flow cls_u32 sch_qfq gpio_buttons ath79_wdt xt_hashlimit ip6t_REJECT ip6t_LOG ip6t_rt ip6t_hbh ip6t_mh ip6t_ipv6header ip6t_frag ip6t_eui64 ip6t_ah ip6table_raw ip6_queue ip6table_mangle ip6table_filter ip6_tables nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_conntrack_ftp xt_policy xt_esp ipt_ah xt_HL xt_hl xt_ecn ipt_ECN xt_CLASSIFY xt_time xt_tcpmss xt_statistic xt_mark xt_length xt_DSCP xt_dscp xt_string xt_layer7 xt_quota xt_pkttype xt_physdev xt_owner ipt_MASQUERADE iptable_nat nf_nat xt_recent xt_helper xt_connmark xt_connbytes pptp xt_conntrack xt_CT xt_NOTRACK iptable_raw xt_state nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack pppoe pppox ipt_REJECT xt_TCPMSS ipt_LOG xt_comment xt_multiport xt_mac xt_limit iptable_mangle iptable_filter ip_tables xt_tcpudp x_tables ip_gre gre ifb sit ipcomp xfrm4_tunnel xfrm4_mode_tunnel xfrm4_mode_transport xfrm4_mode_beet esp4 ah4 tunnel4 tun tcp_ledbat(O) ppp_async ppp_generic slhc xfrm_user xfrm_ipcomp af_key vfat fat autofs4 button_hotplug(O) ath9k(O) ath9k_common(O) ath9k_hw(O) ath(O) nls_utf8 nls_iso8859_2 nls_iso8859_15 nls_iso8859_13 nls_iso8859_1 nls_cp437 mac80211(O) ts_fsm ts_bm ts_kmp crc_ccitt ipv6 input_polldev cfg80211(O) compat(O) input_core chainiv eseqiv crypto_wq sha1_generic krng rng md5 hmac des_generic cbc authenc arc4 aes_generic crypto_blkcipher cryptomgr aead usb_storage ohci_hcd ehci_hcd sd_mod ext4 jbd2 mbcache usbcore usb_common scsi_mod nls_base crc16 zlib_deflate crypto_hash crypto_algapi ledtrig_timer ledtrig_default_on leds_gpio gpio_button_hotplug(O)
[49066.542968] Process swapper (pid: 0, threadinfo=802fe000, task=803020c0, tls=00000000)
[49066.542968] Stack : fe3319ca 0c000000 00000000 80300000 83f93000 83f93000 8035d470 82b24086
[49066.542968] 82b2406e 83186338 00000002 00000000 00000000 831858a0 80000000 800ee148
[49066.542968] 80300b00 80300b20 000086dd 83d44000 00000000 801e8320 83d44000 80096fc4
[49066.542968] 00000000 00000000 80300b20 80091b2c 4f859d0a 1669a157 83d4438c 83d44380
[49066.542968] 83f93000 83d44000 00000001 83d44000 0000007c 000007b0 83d44238 801d06ec
[49066.542968] …
[49066.542968] Call Trace:
[49066.542968] [<831a3300>] ipv6_chk_mcast_addr+0x54/0x170 [ipv6]
[49066.542968] [<83186338>] ip6_mc_input+0xd4/0x4dc [ipv6]
[49066.542968] [<801e8320>] __netif_receive_skb+0x458/0x4c8
[49066.542968] [<801d06ec>] ag71xx_poll+0x3b0/0x66c
[49066.542968] [<801e8740>] net_rx_action+0x88/0x1c8
[49066.542968] [<80076e14>] __do_softirq+0xa0/0x154
[49066.542968] [<80077020>] do_softirq+0x48/0x68
[49066.542968] [<80077254>] irq_exit+0x4c/0xb0
[49066.542968] [<80062d4c>] ret_from_irq+0x0/0x4
[49066.542968] [<80062f40>] r4k_wait+0x20/0x40
[49066.542968] [<80064a1c>] cpu_idle+0x38/0x70
[49066.542968] [<8031a8c8>] start_kernel+0x37c/0x39c
[49066.542968]
[49066.542968]
[49066.542968] Code: 8e020000 00821026 00621025 <8e440008> 8e030008 00831826 00431025 8e44000c 8e03000c
[49066.542968] Cpu 0
[49066.542968] $ 0 : 00000000 00000001 00000000 00000000
[49066.546875] $ 4 : 00000000 82b24086 00000000 802ffc84
[49066.546875] $ 8 : fe800000 00000000 02e081ff fe3319ca
[49066.546875] $12 : 00000014 0014002b 00000001 00000001
[49066.546875] $16 : 8315b380 00000000 82b24086 00000000
[49066.546875] $20 : 00000000 80300000 80320000 80300b38
[49066.546875] $24 : 00000000 83186264
[49066.546875] $28 : 802fe000 802ffc90 80315c80 831a32e0
[49066.546875] Hi : 000002cf
[49066.546875] Lo : 04270c00
[49066.546875] epc : 831a3310 ipv6_chk_mcast_addr+0x64/0x170 [ipv6]
[49066.546875] Tainted: G O
[49066.546875] ra : 831a32e0 ipv6_chk_mcast_addr+0x34/0x170 [ipv6]
[49066.546875] Status: 1000fc03 KERNEL EXL IE
[49066.546875] Cause : 00800010
[49066.546875] BadVA : 82b24092
[49066.546875] PrId : 00019374 (MIPS 24Kc)
[49066.546875] Modules linked in: cls_fw sch_htb sch_red sch_sfq cls_flow cls_u32 sch_qfq gpio_buttons ath79_wdt xt_hashlimit ip6t_REJECT ip6t_LOG ip6t_rt ip6t_hbh ip6t_mh ip6t_ipv6header ip6t_frag ip6t_eui64 ip6t_ah ip6table_raw ip6_queue ip6table_mangle ip6table_filter ip6_tables nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_conntrack_ftp xt_policy xt_esp ipt_ah xt_HL xt_hl xt_ecn ipt_ECN xt_CLASSIFY xt_time xt_tcpmss xt_statistic xt_mark xt_length xt_DSCP xt_dscp xt_string xt_layer7 xt_quota xt_pkttype xt_physdev xt_owner ipt_MASQUERADE iptable_nat nf_nat xt_recent xt_helper xt_connmark xt_connbytes pptp xt_conntrack xt_CT xt_NOTRACK iptable_raw xt_state nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack pppoe pppox ipt_REJECT xt_TCPMSS ipt_LOG xt_comment xt_multiport xt_mac xt_limit iptable_mangle iptable_filter ip_tables xt_tcpudp x_tables ip_gre gre ifb sit ipcomp xfrm4_tunnel xfrm4_mode_tunnel xfrm4_mode_transport xfrm4_mode_beet esp4 ah4 tunnel4 tun tcp_ledbat(O) ppp_async ppp_generic slhc xfrm_user xfrm_ipcomp af_key vfat fat autofs4 button_hotplug(O) ath9k(O) ath9k_common(O) ath9k_hw(O) ath(O) nls_utf8 nls_iso8859_2 nls_iso8859_15 nls_iso8859_13 nls_iso8859_1 nls_cp437 mac80211(O) ts_fsm ts_bm ts_kmp crc_ccitt ipv6 input_polldev cfg80211(O) compat(O) input_core chainiv eseqiv crypto_wq sha1_generic krng rng md5 hmac des_generic cbc authenc arc4 aes_generic crypto_blkcipher cryptomgr aead usb_storage ohci_hcd ehci_hcd sd_mod ext4 jbd2 mbcache usbcore usb_common scsi_mod nls_base crc16 zlib_deflate crypto_hash crypto_algapi ledtrig_timer ledtrig_default_on leds_gpio gpio_button_hotplug(O)
[49066.546875] Process swapper (pid: 0, threadinfo=802fe000, task=803020c0, tls=00000000)
[49066.546875] Stack : fe3319ca 0c000000 00000000 80300000 83f93000 83f93000 8035d470 82b24086
[49066.546875] 82b2406e 83186338 00000002 00000000 00000000 831858a0 80000000 800ee148
[49066.546875] 80300b00 80300b20 000086dd 83d44000 00000000 801e8320 83d44000 80096fc4
[49066.546875] 00000000 00000000 80300b20 80091b2c 4f859d0a 1669a157 83d4438c 83d44380
[49066.546875] 83f93000 83d44000 00000001 83d44000 0000007c 000007b0 83d44238 801d06ec
[49066.546875] …
[49066.546875] Call Trace:
[49066.546875] [<831a3310>] ipv6_chk_mcast_addr+0x64/0x170 [ipv6]
[49066.546875] [<83186338>] ip6_mc_input+0xd4/0x4dc [ipv6]
[49066.546875] [<801e8320>] __netif_receive_skb+0x458/0x4c8
[49066.546875] [<801d06ec>] ag71xx_poll+0x3b0/0x66c
[49066.546875] [<801e8740>] net_rx_action+0x88/0x1c8
[49066.546875] [<80076e14>] __do_softirq+0xa0/0x154
[49066.546875] [<80077020>] do_softirq+0x48/0x68
[49066.546875] [<80077254>] irq_exit+0x4c/0xb0
[49066.546875] [<80062d4c>] ret_from_irq+0x0/0x4
[49066.546875] [<80062f40>] r4k_wait+0x20/0x40
[49066.546875] [<80064a1c>] cpu_idle+0x38/0x70
[49066.546875] [<8031a8c8>] start_kernel+0x37c/0x39c
[49066.546875]
[49066.546875]
[49066.546875] Code: 8e030008 00831826 00431025 <8e44000c> 8e03000c 00831826 00431025 10400006 00000000

Updated by Dave Täht on Apr 11, 2012.
I had great difficulty in tweaking the kernel to not hang completely when looking at unaligned traps. Ketkulka found a way to at least survive light traffic.

echo 1 1 1 1 > /proc/sys/kernel/printk
echo 2 > /sys/kernel/debug/mips/unaligned_action
logread -f

If your box stays up, you can then run moderate amounts of traffic through it.

Updated by Robert Bradley on Apr 11, 2012.
If it’s what I think it is, it’s down to the fact that ipv6_chk_mcast_addr is passed a pointer into the header structure, which is probably unaligned. The attached patch is ugly and untested (I’m on x86/x64 which doesn’t help!), but should get around this issue through copying the address before calling ip6_mc_input.
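
A minimal user-space sketch of the "copy the address first" idea, assuming a 16-byte address made of four 32-bit words. The names `addr128`, `addr128_equal` and `addr128_equal_from_packet` are hypothetical stand-ins, not the kernel's `struct in6_addr` API or the attached patch:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct in6_addr: four naturally aligned words. */
struct addr128 {
    uint32_t w[4];
};

/* Word-wise comparison; only safe when both pointers are 4-byte aligned. */
static int addr128_equal(const struct addr128 *a, const struct addr128 *b)
{
    return ((a->w[0] ^ b->w[0]) | (a->w[1] ^ b->w[1]) |
            (a->w[2] ^ b->w[2]) | (a->w[3] ^ b->w[3])) == 0;
}

/*
 * The field inside the packet may sit at any byte offset, so copy it into
 * an aligned local with memcpy before doing word-sized accesses on it.
 */
static int addr128_equal_from_packet(const uint8_t *pkt_field,
                                     const struct addr128 *b)
{
    struct addr128 tmp;

    memcpy(&tmp, pkt_field, sizeof(tmp));
    return addr128_equal(&tmp, b);
}
```

The copy costs 16 bytes per call, but it keeps every word-sized load on an aligned address, so no trap is taken.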

I’ve also included a patch to mark in6_addr as packed, which might help with other functions. I’d suggest adding __packed to the ipv6hdr struct too, but 902-unaligned_access_hacks.patch does that already.

Updated by Dave Täht on Apr 11, 2012.
Heh. I got so far as to build a patch, brick two routers, make a newbie mistake, twice, and then succeeded in eliminating the multicast error another way.

Now I get a different one, which I think is solved by one of the other patches in flight… It’s really awesome to finally get some insight into this.

Apr 12 00:08:55 OpenWrt kern.alert kernel: [ 602.039062] [<8022acc0>] tcp_rcv_established+0xa8/0x690
Apr 12 00:08:55 OpenWrt kern.alert kernel: [ 602.039062] [<80231524>] tcp_v4_do_rcv+0x40/0x244
Apr 12 00:08:55 OpenWrt kern.alert kernel: [ 602.039062] [<8023359c>] tcp_v4_rcv+0x59c/0x944
Apr 12 00:08:55 OpenWrt kern.alert kernel: [ 602.039062] [<80213a88>] ip_local_deliver_finish+0x170/0x29c
Apr 12 00:08:55 OpenWrt kern.alert kernel: [ 602.039062] [<801e8320>] __netif_receive_skb+0x458/0x4c8
Apr 12 00:08:55 OpenWrt kern.alert kernel: [ 602.039062] [<801d0708>] ag71xx_poll+0x3cc/0x66c
Apr 12 00:08:55 OpenWrt kern.alert kernel: [ 602.039062] [<801e8740>] net_rx_action+0x88/0x1c8
Apr 12 00:08:55 OpenWrt kern.alert kernel: [ 602.039062] [<80076e14>] __do_softirq+0xa0/0x154
Apr 12 00:08:55 OpenWrt kern.alert kernel: [ 602.039062] [<80077020>] do_softirq+0x48/0x68
Apr 12 00:08:55 OpenWrt kern.alert kernel: [ 602.039062] [<80077254>] irq_exit+0x4c/0xb0
Apr 12 00:08:55 OpenWrt kern.alert kernel: [ 602.039062] [<80062d4c>] ret_from_irq+0x0/0x4
Apr 12 00:08:55 OpenWrt kern.alert kernel: [ 602.039062] [<80262910>] maybe_add_creds+0x2c/0x8c
Apr 12 00:08:55 OpenWrt kern.alert kernel: [ 602.039062] [<8026407c>] unix_dgram_sendmsg+0x3a4/0x52c
Apr 12 00:08:55 OpenWrt kern.alert kernel: [ 602.039062] [<801d877c>] sock_sendmsg+0x84/0xa4
Apr 12 00:08:55 OpenWrt kern.alert kernel: [ 602.039062] [<801d9f10>] sys_sendto+0xcc/0x10c
Apr 12 00:08:55 OpenWrt kern.alert kernel: [ 602.039062] [<801d9f64>] sys_send+0x14/0x20
Apr 12 00:08:55 OpenWrt kern.alert kernel: [ 602.039062] [<80069e44>] stack_done+0x20/0x40
Apr 12 00:08:55 OpenWrt kern.alert kernel: [ 602.039062]

Updated by Dave Täht on Apr 11, 2012.
I note in doing the original patch I only did the right hand side as that seemed to always come from the skb, and the left from an aligned structure.

Remaining bad boys are:

__udp6_lib_rcv

tcp_validate_incoming
tcp_rcv_established (which calls the above, so I think the original patch under discussion is at least part of the right thing)

In addition to the printk trick

logread -f | grep -A 3 "Call Trace" | grep -v "Call Trace"

is very helpful. I’m going to load it up with some different kinds of traffic, tunnelling, fw rules, etc, to see if anything else jumps out of the woodwork. If you (and Ketkulka) want to take the patch burden overnight, go fer it (it takes hours to build kernels for cerowrt)

Updated by Dave Täht on Apr 11, 2012.
Can be triggered with a

rdisc6 eth0

Apr 12 00:31:19 OpenWrt kern.alert kernel: [ 1946.695312] [<83097dd0>] ndisc_rcv+0x404/0xf20 [ipv6]
Apr 12 00:31:19 OpenWrt kern.alert kernel: [ 1946.695312] [<8309eba4>] icmpv6_err_convert+0x934/0xba4 [ipv6]

I’m not sure when this happens

[<83095818>] ipv6_setsockopt+0x114/0x44c [ipv6]

I confess I’m looking for more stuff in the hot path than this…

netperf -H 172.30.42.1 -t TCP_MAERTS really triggers this one:

Apr 12 00:34:56 OpenWrt kern.alert kernel: [ 2163.253906] [<8022acc0>] tcp_rcv_established+0xa8/0x690
Apr 12 00:34:56 OpenWrt kern.alert kernel: [ 2163.253906] [<80231524>] tcp_v4_do_rcv+0x40/0x244
Apr 12 00:34:56 OpenWrt kern.alert kernel: [ 2163.253906] [<801db3b0>] release_sock+0xa8/0x10c

Lick that and netperf should get better…

Updated by Dave Täht on Apr 11, 2012.
Apr 12 00:40:01 OpenWrt kern.alert kernel: [ 2468.203125] [<801e6e94>] skb_flow_dissect+0x384/0x400
Apr 12 00:40:01 OpenWrt kern.alert kernel: [ 2468.203125] [<82a273f8>] 0x82a273f8

With some serious traffic through the wired and wireless interfaces through the default aqms…

Originally I didn’t want to declare in6_addr as packed as per your patch, because using 32 bit fetches is GOOD… well… let’s see what else breaks.
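
To illustrate the trade-off with packing, here is a small sketch assuming GCC or Clang (the `packed` attribute is a compiler extension) and C11 `_Alignof`. The struct names are hypothetical and only mimic the layout of a 16-byte address:

```c
#include <stdint.h>

/* Natural layout: the compiler may assume 4-byte alignment and use
 * single 32-bit loads. */
struct addr_natural {
    uint32_t w[4];
};

/* Packed layout: same size, but the assumed alignment drops to 1, so on
 * strict-alignment CPUs the compiler must use slower alignment-safe
 * load sequences for every field access. */
struct addr_packed {
    uint32_t w[4];
} __attribute__((packed));

/* Alignment the compiler assumes for each variant. */
static unsigned natural_alignment(void) { return (unsigned)_Alignof(struct addr_natural); }
static unsigned packed_alignment(void)  { return (unsigned)_Alignof(struct addr_packed); }
```

So packing avoids the trap on unaligned addresses at the price of giving up the plain 32-bit fetch everywhere the struct is touched, aligned or not.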

Updated by Dave Täht on Apr 11, 2012.
Oh man, netperf over ipv6 hurts it bad. That explains a LOT

Apr 12 00:49:58 OpenWrt kern.alert kernel: [ 3065.675781] [<8309bb14>] __udp6_lib_rcv+0x3fc/0x7e4 [ipv6]

A mere ssh works, but it locks up the box for multiple seconds if I hit it with netperf over ipv6.

An easy way to test ipv6 locally is just to do a

laptop# ip -6 addr add 2001:db8:1::264 dev whatever
cero # ip -6 addr add 2001:db8:1::164 dev whatever

and then you can ssh, web, etc between your two machines

Updated by Dave Täht on Apr 11, 2012.
Looks like this function has 4 traps in it…

Apr 12 00:56:44 OpenWrt kern.alert kernel: [ 3471.902343] [<8309bad4>] __udp6_lib_rcv+0x3bc/0x7e4 [ipv6]
Apr 12 00:56:44 OpenWrt kern.alert kernel: [ 3471.902343] [<83085bc8>] ip6_rcv_finish+0x2a8/0x4a8 [ipv6]
Apr 12 00:56:44 OpenWrt kern.alert kernel: [ 3471.902343] Cause : 00800010

Apr 12 00:56:44 OpenWrt kern.alert kernel: [ 3471.906250] [<8309bae4>] __udp6_lib_rcv+0x3cc/0x7e4 [ipv6]
Apr 12 00:56:44 OpenWrt kern.alert kernel: [ 3471.906250] [<83085bc8>] ip6_rcv_finish+0x2a8/0x4a8 [ipv6]
Apr 12 00:56:44 OpenWrt kern.alert kernel: [ 3471.906250]

Apr 12 00:56:44 OpenWrt kern.alert kernel: [ 3471.906250] [<8309bb04>] __udp6_lib_rcv+0x3ec/0x7e4 [ipv6]
Apr 12 00:56:44 OpenWrt kern.alert kernel: [ 3471.906250] [<83085bc8>] ip6_rcv_finish+0x2a8/0x4a8 [ipv6]
Apr 12 00:56:44 OpenWrt kern.alert kernel: [ 3471.906250]

Apr 12 00:56:44 OpenWrt kern.alert kernel: [ 3471.906250] [<8309bb14>] __udp6_lib_rcv+0x3fc/0x7e4 [ipv6]
Apr 12 00:56:44 OpenWrt kern.alert kernel: [ 3471.906250] [<83085bc8>] ip6_rcv_finish+0x2a8/0x4a8 [ipv6]
Apr 12 00:56:44 OpenWrt kern.alert kernel: [ 3471.906250]

Updated by Dave Täht on Apr 11, 2012.
Some more problem children… I knew we had issues here but had no insight

this is with a ssh connection via ipv6 just catting /etc/passwd

Apr 12 01:45:03 OpenWrt kern.alert kernel: [ 6370.410156] [<80265630>] __inet6_lookup_established+0x3c/0x5f4
Apr 12 01:45:03 OpenWrt kern.alert kernel: [ 6370.410156] [<830a7cf4>] ipv6_frag_exit+0x2b08/0x35c4 [ipv6]
Apr 12 01:45:10 OpenWrt kern.alert kernel: [ 6377.667968] Cause : 00800010

Apr 12 01:45:10 OpenWrt kern.alert kernel: [ 6377.667968] [<80225e5c>] tcp_validate_incoming+0x5c/0x370
Apr 12 01:45:10 OpenWrt kern.alert kernel: [ 6377.667968] [<8022b298>] tcp_rcv_established+0x680/0x690
Apr 12 01:45:10 OpenWrt kern.alert kernel: [ 6377.667968] [<830a75f4>] ipv6_frag_exit+0x2408/0x35c4 [ipv6]

update…

I was wrong on inet6_addr_equal, the unaligned check needs to be on both the left and right hand side. It is called in the very ugly INET6_MATCH macro..

Updated by Dave Täht on Apr 11, 2012.
So I’ve folded in the first patch we started with, but hopefully went one better by changing the macro rather than the code. I fear that will break something else but, we’ll see. I’ve also updated the first patch with the easy stuff I could find.

However, finding what is wrong with udp6_lib_rcv - which is in the hotpath - more or less eludes me. Is it something as simple as the skb being unaligned?

I see elsewhere that perhaps the assembly for csum_ipv6_magic might be called unaligned but didn’t check further.

So anyway, doing another build with just the fixes so far, I hope this nails a few of the problems… IF it works, I’ll put the patches and kernel out around 1AM my time.

Updated by Dave Täht on Apr 11, 2012.
After I get this build, I plan to disassemble the resulting vmlinux and see if I can spot exactly where the offending code sequences are.

Assuming it boots and doesn’t die, I’ll put it up for further testing pleasure. I’ve noted problems elsewhere with vpn tunnels which possibly might extend to regular (eg - 6in4, 6to4) tunnels, which I am sure I’ll be too tired to test myself.

I really don’t want to stay up much past midnight.

I’ve attached the second patch under test. This replaces the patch we started with entirely, and does a couple other new things.

Updated by Dave Täht on Apr 11, 2012.
net/ipv4/tcp.c: In function ‘tcp_gro_receive’:
net/ipv4/tcp.c:2837:21: error: lvalue required as left operand of assignment
CC net/netfilter/nf_log.o
make[7]: ***** [net/ipv4/tcp.o] Error 1
make[6]: ***** [net/ipv4] Error 2
make[6]: ***** Waiting for unfinished jobs….

Should have tried compiling on x86, first. It’s late, I have not enough braincells to continue myself

Updated by Dave Täht on Apr 12, 2012.
@robert: Well, I’m trying the in_6 patch. It certainly is a rather big hammer to try.
Updated by Robert Bradley on Apr 12, 2012.
Yes, the in6_addr patch did seem a little extreme, but ought to get every function that touches it with little effort. It also should require less maintenance than 0002-More-ipv6-unaligned-access-hacks.patch. On the other hand, either of these is a lot better than the original fix!

https://dev.openwrt.org/browser/trunk/target/linux/ar71xx/files/drivers/net/ag71xx/ag71xx_main.c?rev=20506

That actually copied the entire packet within the ethernet driver so that it was aligned. That was later backed out in revision 21166 since copying/moving the entire packet was slower than the exceptions.

Updated by Dave Täht on Apr 12, 2012.
yea, that was terrible

I made some progress with the combination of patches so far, but I think some disassembly is now required. tcp_validate_incoming is on the call path.

This is a test against ipv4

Apr 12 20:52:00 OpenWrt kern.alert kernel: [ 606.476562] [<8309be9c>] __udp6_lib_rcv+0x3dc/0x7e4 [ipv6]
Apr 12 20:52:00 OpenWrt kern.alert kernel: [ 606.476562] [<83085c68>] ip6_rcv_finish+0x2a8/0x4a8 [ipv6]
Apr 12 20:52:00 OpenWrt kern.alert kernel: [ 606.476562]

Apr 12 20:52:00 OpenWrt kern.alert kernel: [ 606.476562] [<8309beac>] __udp6_lib_rcv+0x3ec/0x7e4 [ipv6]
Apr 12 20:52:00 OpenWrt kern.alert kernel: [ 606.476562] [<83085c68>] ip6_rcv_finish+0x2a8/0x4a8 [ipv6]
Apr 12 20:52:00 OpenWrt kern.alert kernel: [ 606.476562]

Apr 12 20:52:00 OpenWrt kern.alert kernel: [ 606.476562] [<8309bebc>] __udp6_lib_rcv+0x3fc/0x7e4 [ipv6]
Apr 12 20:52:00 OpenWrt kern.alert kernel: [ 606.476562] [<83085c68>] ip6_rcv_finish+0x2a8/0x4a8 [ipv6]
Apr 12 20:52:00 OpenWrt kern.alert kernel: [ 606.476562]

Apr 12 20:52:01 OpenWrt kern.alert kernel: [ 607.121093] [<80225e60>] tcp_validate_incoming+0x5c/0x370
Apr 12 20:52:01 OpenWrt kern.alert kernel: [ 607.121093] [<8022b2a0>] tcp_rcv_established+0x684/0x694
Apr 12 20:52:01 OpenWrt kern.alert kernel: [ 607.121093] [<80231534>] tcp_v4_do_rcv+0x40/0x244

Apr 12 20:52:01 OpenWrt kern.alert kernel: [ 607.121093] [<80225e80>] tcp_validate_incoming+0x7c/0x370
Apr 12 20:52:01 OpenWrt kern.alert kernel: [ 607.121093] [<8022b2a0>] tcp_rcv_established+0x684/0x694
Apr 12 20:52:01 OpenWrt kern.alert kernel: [ 607.121093] [<80231534>] tcp_v4_do_rcv+0x40/0x244

Apr 12 20:52:01 OpenWrt kern.alert kernel: [ 607.121093] [<80225e88>] tcp_validate_incoming+0x84/0x370
Apr 12 20:52:01 OpenWrt kern.alert kernel: [ 607.121093] [<8022b2a0>] tcp_rcv_established+0x684/0x694
Apr 12 20:52:01 OpenWrt kern.alert kernel: [ 607.121093] [<80231534>] tcp_v4_do_rcv+0x40/0x244

Updated by Dave Täht on Apr 12, 2012.
I have pushed out 3.3.1-9 which contains some patch sets making ipv6 only have 4x the traps ipv4 does. In either case, it’s a LOT of traps on the ethernet side.

(wireless is much better)

In a 60 second 93Mbit test of netperf directly to the router…

proto traps
ipv4: 1529909
ipv6: 6080546

So throwing 100,000 unaligned traps/sec is rather bad.
25,000 is also bad.
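
The per-second figures follow from dividing the trap counts above by the 60 second test length. A trivial sketch of that arithmetic (the helper name is hypothetical):

```c
#include <stdint.h>

/* Average trap rate over a test run, rounded down to whole traps. */
static uint64_t traps_per_second(uint64_t traps, uint64_t seconds)
{
    return traps / seconds;
}
```

With the numbers from the table, `traps_per_second(1529909, 60)` gives 25498 for ipv4 and `traps_per_second(6080546, 60)` gives 101342 for ipv6, which is where the roughly 4x ratio comes from.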

I haven’t time to pursue this further this week. I have put up a copy of 3.3.1-9
(because it IS better) http://huchra.bufferbloat.net/~cero1/3.3/3.3.1-9/

and there is also a copy of vmlinux in there for your profiling pleasure, but perhaps examining the assembly will lend a clue.

Updated by Robert Bradley on Apr 13, 2012.
I haven’t managed to disassemble vmlinux here, but staring at the functions in question, the only idea I could come up with was to try blindly packing tcp_skb_cb and udp_skb_cb. I cannot see why this could possibly work, mind you. The only hope is that MIPS GCC is somehow misaligning some of the fields, which should not be the case!
Updated by Dave Täht on Apr 13, 2012.
That scares me.
Updated by Dave Täht on Apr 13, 2012.
I don’t think packing that is a good idea

#define TCPCB_TAGBITS           0x07    /* All tag bits                 */
        __u8            ip_dsfield;     /* IPv4 tos or IPv6 dsfield     */
        /* 1 byte hole */
#define TCPCB_EVER_RETRANS      0x80    /* Ever retransmitted frame     */
#define TCPCB_RETRANS           (TCPCB_SACKED_RETRANS|TCPCB_EVER_RETRANS)

        __u32           ack_seq;        /* Sequence number ACK'd        */
Updated by Robert Bradley on Apr 13, 2012.
If it helps, I can’t see how that could possibly be true either. I’m just assuming that given we’re limited to __udp_lib_rcv() and any inline functions, the only options are:

  • Fields in udphdr (uh) - already marked as packed
  • saddr/daddr, which should be covered by the in6_addr patch
  • Fields in skb
  • Fields within skb->cb
  • The sock structure

I cannot seriously imagine that skb or sock are unaligned, and so we’re stuck with anything inside skb->cb - which should still be aligned! Either way, if this turns out not to be a dead end, it will scare me too…

(As for the one byte hole in tcp_skb_cb, we could add an unused __u8 as padding, but I still don’t like it.)

Updated by Dave Täht on Apr 13, 2012.
BTW, see bug #352.

ipv6 with the current patch set, is at 185Mbit/sec, up from 110.

Not bad. :) I think we can improve both ipv4 and ipv6 a lot more.

I think, however, that some disassembly and kernel symbols are now required.

I honestly don’t know how to do that in openwrt, and even if I could I suspect we’d run out of memory.

Updated by Dave Täht on Apr 14, 2012.
from skbuff.h… need to look into this

* Since an ethernet header is 14 bytes network drivers often end up with
* the IP header at an unaligned offset. The IP header can be aligned by
* shifting the start of the packet by 2 bytes. Drivers should do this
* with:
*
* skb_reserve(skb, NET_IP_ALIGN);
*
* The downside to this alignment of the IP header is that the DMA is now
* unaligned. On some architectures the cost of an unaligned DMA is high
* and this cost outweighs the gains made by aligning the IP header.
*
* Since this trade off varies between architectures, we allow NET_IP_ALIGN
* to be overridden.
*/
#ifndef NET_IP_ALIGN
#define NET_IP_ALIGN 2
#endif
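
The 2-byte figure in that comment is simple modular arithmetic. A sketch, assuming the receive buffer itself starts on a 4-byte boundary and the frame carries a plain 14-byte ethernet header (no VLAN tag); the macro and function names are hypothetical:

```c
#include <stddef.h>

/* Length of a plain ethernet header in bytes (assumption: no VLAN tag). */
#define SKETCH_ETH_HLEN 14

/*
 * Byte misalignment of the IP header relative to a 4-byte boundary when
 * `reserve` bytes are skipped at the start of an aligned buffer.
 */
static size_t ip_header_misalignment(size_t reserve)
{
    return (reserve + SKETCH_ETH_HLEN) % 4;
}
```

With no reserve the IP header lands 2 bytes off a word boundary; reserving 2 bytes moves it to offset 16, which is aligned, but the DMA target address is then itself 2 bytes off.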

Updated by Robert Bradley on Apr 14, 2012.
Apparently this was looked at two years ago by the OpenWRT developers (https://dev.openwrt.org/changeset/20892/trunk/target/linux/ar71xx/files/drivers/net/ag71xx/ag71xx_main.c). The revision log there claims that ar71xx can’t do misaligned DMA, and if that’s true, NET_IP_ALIGN isn’t going to help. In other words, we’re stuck having to track down unaligned accesses to packet data (which should not be affecting the stuff stored in skb or skb->cb).
Updated by Dave Täht on Apr 14, 2012.
The attached patch… doesn’t appear to have done any good performance-wise.

Still see the 4 traps out of the udp6 routine, and that appears to be our baddest boy. The tcp stuff is an artifact of running the test on the router, and relatively unimportant compared to that.

My three thoughts are 1) could still be saddr or daddr being accessed unaligned
2) I stuck up a version compiled for debugging with this patch and the previous ones.

http://huchra.bufferbloat.net/~cero1/3.3/3.3.2-1/

I plan to rip it out in the next version, but it does rule out some stuff, I think.

3) figuring out how to get to the symbol table for this stuff would help. Regular gdb didn’t find the relevant symbols, perhaps running the cross gdb would work

Updated by Robert Bradley on Apr 14, 2012.
I think the answer may be me being slightly unobservant. The __udp6_lib_rcv function calls udp6_csum_init, an inline function that calls csum_ipv6_magic. And for MIPS, this is implemented in asm, which ignores my packing. This probably hits the IPv4 side too, except there you have shorter addresses (our 100k/s v6 traps vs. 25k/s IPv4?).

I believe that we need to do one of the following to /arch/mips/include/asm/checksum.h:

  • Remove #define _HAVE_ARCH_IPV6_CSUM and the csum_ipv6_magic function from this file entirely. This falls back to the generic C code in /include/net/ip6_checksum.h.
  • Insert a new variant of csum_ipv6_magic that can handle unaligned addresses. I’ve included a possible version below (not in patch format yet). If the pointers are aligned, we keep using the faster version in the kernel. If not, we call a new variant by me (which may be risky) which calculates the offset, loads in a pair of words at a time and shifts them appropriately.
static __inline__ __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
                                          const struct in6_addr *daddr,
                                          __u32 len, unsigned short proto,
                                          __wsum sum)
{
        if ((saddr & 3) || (daddr & 3)) {
            /* Unaligned to word boundaries... */
            return csum_ipv6_unaligned_magic(saddr, daddr, len, proto, sum);
        }
        __asm__(
        "      .set push            # csum_ipv6_magic\n"
        "      .set noreorder      \n"
        "      .set noat            \n"
        "      addu %0, %5        # proto (long in network byte order)\n"
        "      sltu $1, %0, %5    \n"
        "      addu %0, $1        \n"

        "      addu %0, %6        # csum\n"
        "      sltu $1, %0, %6    \n"
        "      lw     %1, 0(%2)    # four words source address\n"
        "      addu %0, $1        \n"
        "      addu %0, %1        \n"
        "      sltu $1, %0, %1    \n"

        "      lw     %1, 4(%2)    \n"
        "      addu %0, $1        \n"
        "      addu %0, %1        \n"
        "      sltu $1, %0, %1    \n"

        "      lw     %1, 8(%2)    \n"
        "      addu %0, $1        \n"
        "      addu %0, %1        \n"
        "      sltu $1, %0, %1    \n"

        "      lw     %1, 12(%2)      \n"
        "      addu %0, $1        \n"
        "      addu %0, %1        \n"
        "      sltu $1, %0, %1    \n"

        "      lw     %1, 0(%3)    \n"
        "      addu %0, $1        \n"
        "      addu %0, %1        \n"
        "      sltu $1, %0, %1    \n"

        "      lw     %1, 4(%3)    \n"
        "      addu %0, $1        \n"
        "      addu %0, %1        \n"
        "      sltu $1, %0, %1    \n"

        "      lw     %1, 8(%3)    \n"
        "      addu %0, $1        \n"
        "      addu %0, %1        \n"
        "      sltu $1, %0, %1    \n"

        "      lw     %1, 12(%3)      \n"
        "      addu %0, $1        \n"
        "      addu %0, %1        \n"
        "      sltu $1, %0, %1    \n"

        "      addu %0, $1        # Add final carry\n"
        "      .set pop"
        : "=r" (sum), "=r" (proto)
        : "r" (saddr), "r" (daddr),
          "0" (htonl(len)), "1" (htonl(proto)), "r" (sum));

        return csum_fold(sum);
}

static __inline__ __sum16 csum_ipv6_unaligned_magic(struct in6_addr *saddr,
                                          struct in6_addr *daddr,
                                          __u32 len, unsigned short proto,
                                          __wsum sum)
{
        __asm__(
        "      .set push            # csum_ipv6_magic\n"
        "      .set noreorder      \n"
        "      .set noat            \n"
        "      addu %0, %5        # proto (long in network byte order)\n"
        "      sltu $1, %0, %5    \n"
        "      addu %0, $1        \n"

        "      addu %0, %6        # csum\n"
        "      sltu $1, %0, %6    \n"
        "      andi %8, %2, 3      # calc offset for saddr\n"
        "      sub   %2, %2, %8   # align pointer\n"
        "      sll   %8, %8, 8     # calc shift left amount\n"
        "      lui   %9, 32       # calc shift right amount\n"
        "      srl   %9, 16       \n"
        "      subi %9, %9, %8    \n"
        "      lw     %1, 0(%2)    # four words source address\n"
        "      lw     %10, 4(%2)      # and next word\n"
        "      sllv %1, %1, %8    # shift word1 left\n"
        "      srlv %10, %10, %9    # shift word2 right\n"
        "      or     %1, %1, %10    # or together\n"
        "      addu %0, $1        \n"
        "      addu %0, %1        \n"
        "      sltu $1, %0, %1    \n"

        "      lw     %1, 4(%2)    \n"
        "      lw     %10, 8(%2)      \n"
        "      sllv %1, %1, %8    \n"
        "      srlv %11, %10, %9    \n"
        "      or     %1, %1, %11    \n"
        "      addu %0, $1        \n"
        "      addu %0, %1        \n"
        "      sltu $1, %0, %1    \n"

        "      lw     %1, 8(%2)    \n"
        "      lw     %10, 12(%2)    \n"
        "      sllv %1, %1, %8    \n"
        "      srlv %11, %10, %9    \n"
        "      or     %1, %1, %11    \n"
        "      addu %0, $1        \n"
        "      addu %0, %1        \n"
        "      sltu $1, %0, %1    \n"

        "      lw     %1, 12(%2)      \n"
        "      lw     %10, 16(%2)    \n"
        "      sllv %1, %1, %8    \n"
        "      srlv %11, %10, %9    \n"
        "      or     %1, %1, %11    \n"
        "      addu %0, $1        \n"
        "      addu %0, %1        \n"
        "      sltu $1, %0, %1    \n"

        "      srl   %8, %8, 8     # Undo damage to saddr\n"
        "      addu  %2, %2, %8   \n"

        "      andi %8, %3, 3      # calc offset for daddr\n"
        "      sub   %3, %3, %8   # align pointer\n"
        "      sll   %8, %8, 8     # calc shift left amount\n"
        "      lui   %9, 32       # calc shift right amount\n"
        "      srl   %9, 16       \n"
        "      subi %9, %9, %8    \n"
        "      lw     %1, 0(%3)    # four words source address\n"
        "      lw     %10, 4(%3)      # and next word\n"
        "      sllv %1, %1, %8    # shift word1 left\n"
        "      srlv %10, %10, %9    # shift word2 right\n"
        "      or     %1, %1, %10    # or together\n"
        "      addu %0, $1        \n"
        "      addu %0, %1        \n"
        "      sltu $1, %0, %1    \n"

        "      lw     %1, 4(%3)    \n"
        "      lw     %10, 8(%3)      \n"
        "      sllv %1, %1, %8    \n"
        "      srlv %10, %10, %9    \n"
        "      or     %1, %1, %10    \n"
        "      addu %0, $1        \n"
        "      addu %0, %1        \n"
        "      sltu $1, %0, %1    \n"

        "      lw     %1, 8(%3)    \n"
        "      lw     %10, 12(%3)     \n"
        "      sllv %1, %1, %8    \n"
        "      srlv %10, %10, %9    \n"
        "      or     %1, %1, %10    \n"
        "      addu %0, $1        \n"
        "      addu %0, %1        \n"
        "      sltu $1, %0, %1    \n"

        "      lw     %1, 12(%3)       \n"
        "      lw     %10, 16(%3)     \n"
        "      sllv %1, %1, %8    \n"
        "      srlv %10, %10, %9    \n"
        "      or     %1, %1, %10    \n"
        "      addu %0, $1        \n"
        "      addu %0, %1        \n"
        "      sltu $1, %0, %1    \n"

        "      srl   %8, %8, 8     # Undo damage to daddr\n"
        "      addu  %3, %3, %8   \n"

        "      addu %0, $1        # Add final carry\n"
        "      .set pop"
        : "=r" (sum), "=r" (proto)
        : "r" (saddr), "r" (daddr),
          "0" (htonl(len)), "1" (htonl(proto)), "r" (sum));

        return csum_fold(sum);
}
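
A note on the alignment test in `csum_ipv6_magic` above: C does not allow `&` on a pointer operand, so the pointer has to be converted to an integer type first, and a helper called from an inline function has to be declared before its first use. A minimal user-space sketch of the check (kernel code would more likely cast to `unsigned long`; the helper name is hypothetical):

```c
#include <stdint.h>

/* Returns nonzero when p sits on a 4-byte (word) boundary. */
static int is_word_aligned(const void *p)
{
    return ((uintptr_t)p & 3) == 0;
}
```

The dispatch would then read something like `if (!is_word_aligned(saddr) || !is_word_aligned(daddr))`, with a prototype for the unaligned variant placed ahead of it.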
Updated by Robert Bradley on Apr 15, 2012.
I have created two patches corresponding to the two options in comment 28. Needless to say, these cannot both be applied at the same time! One other thing to note is that the asm option is big-endian specific, which isn’t an issue for ar71xx, but is worth remembering if this ends up being submitted somewhere else.
Updated by Dave Täht on Apr 15, 2012.
Jeebus, robert, what do you do when not hacking mips assembly?

I’m um, er, reluctant to just toss a rewritten assembly routine in that has not been tested, but what the heck - I’ll do a build with it tonight and see what happens. Otherwise, falling back to the C routine ought to ‘just work’. I may just fire off two builds to save on test time.

But:

Yes! the checksum routine looks like a very good candidate for this bug. Historically it’s always been a major bottleneck in the first place.
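
For reference, a rough C sketch of what the checksum's inner steps do, written so that the address may sit at any alignment. This is an illustration only, not the kernel's generic `csum_ipv6_magic`; all three function names are hypothetical:

```c
#include <stdint.h>
#include <string.h>

/* Add with end-around carry: the C equivalent of an addu/sltu/addu triple. */
static uint32_t csum_add32(uint32_t sum, uint32_t v)
{
    sum += v;
    return sum + (sum < v);   /* (sum < v) is 1 exactly when the add wrapped */
}

/* Fold a 32-bit ones' complement sum down to 16 bits and invert it. */
static uint16_t csum_fold32(uint32_t sum)
{
    sum = (sum & 0xffff) + (sum >> 16);
    sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Accumulate the four words of a 16-byte address from any byte offset. */
static uint32_t csum_addr128(uint32_t sum, const void *addr)
{
    const uint8_t *p = addr;
    int i;

    for (i = 0; i < 4; i++) {
        uint32_t w;

        /* memcpy lets the compiler pick an alignment-safe load sequence. */
        memcpy(&w, p + 4 * i, sizeof(w));
        sum = csum_add32(sum, w);
    }
    return sum;
}
```

The result is the same whether or not the address is aligned; the only difference is how the loads are generated.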

Updated by Dave Täht on Apr 15, 2012.
Do you have to tell gcc you are scribbling on regs %8,%9,%10 somehow? Or are they guaranteed to be scratch by the api?
Updated by Robert Bradley on Apr 15, 2012.
According to ftp://ftp.linux-mips.org/pub/linux/mips/doc/ABI2/MIPS-N32-ABI-Handbook.pdf, they’re used for function arguments (which we don’t have in this case), and are caller-saved in any case. If you’re worried about it, I can move these to the temporary registers %12-%15, or just stick to the generic C version.

I think the saddr section of that code also uses %11, but that can be safely replaced by %10. My original thoughts were to try and avoid loading the same word twice, which means you need to store the original word somewhere. In the end, I used the simpler approach of simply reloading it.
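
The "load each word once and keep the previous one" variant can be sketched in C as follows. It works on 32-bit values in big-endian order (as on ar71xx) and assumes the caller has already rounded the pointer down to a word boundary and computed the byte offset; the function name is hypothetical and this is not the code from the patch:

```c
#include <stdint.h>

/*
 * Rebuild four 32-bit words that start `off` bytes (0..3) into a run of
 * aligned big-endian words. Reads aligned[0..4] when off != 0 and
 * aligned[0..3] when off == 0. Each aligned word is loaded once; the
 * previous word is carried over in `prev`.
 */
static void realign_be32(const uint32_t *aligned, unsigned off, uint32_t out[4])
{
    unsigned lshift = off * 8;      /* shift amounts are in bits, not bytes */
    uint32_t prev;
    int i;

    if (lshift == 0) {              /* already aligned: plain copy */
        for (i = 0; i < 4; i++)
            out[i] = aligned[i];
        return;
    }

    prev = aligned[0];
    for (i = 0; i < 4; i++) {
        uint32_t next = aligned[i + 1];

        out[i] = (prev << lshift) | (next >> (32 - lshift));
        prev = next;                /* reuse instead of reloading */
    }
}
```

The special case for an offset of zero matters because shifting a 32-bit value right by 32 is undefined in C.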

Updated by Robert Bradley on Apr 15, 2012.
… and the patches for those (replacing the old patch for unaligned calculations via MIPS asm).
Updated by Dave Täht on Apr 15, 2012.
just so you know that patch had one char of trailing whitespace on it on every line. There are ways to avoid that in most editors… I’m trying to finish up a patch to sfq to let me switch two behaviors at runtime and also doing something else… I think what I’ll do is fire off a build with your asm patch (sans whitespace), cross fingers and see what pops out in 3 hours.
Updated by Robert Bradley on Apr 15, 2012.
Hang on … GCC does have a way to mark clobbered registers (http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html#ss5.3). In case it matters, I have an updated patch using registers 12-14 and marking those as clobbered.

Sorry for not seeing that sooner!

Updated by Robert Bradley on Apr 15, 2012.
By the way, thanks for spotting the trailing whitespace, too. (Too late for the previous comment, though.)
Updated by Dave Täht on Apr 15, 2012.
Wasn’t me. Git griped about the whitespace. As would an upstream maintainer.

as an emacs user I’ve had pure hell in making kernel code look right to those folks, myself….

I had fixed that on your previous ‘temponly’ patch and started a build, so, what the heck, I’ll try that one first, then replace it with this successor…

… after I catch up on the ABI a bit. It would be good to know if gcc was smart enough nowadays to figure this out.

Updated by Dave Täht on Apr 15, 2012.
OR: I could just see if that build blew up. You want to fix it and supply a new patch?

/home/cero1/src/cerowrt-3.3.0/build_dir/linux-ar71xx_generic/linux-3.3.2/arch/mips/include/asm/checksum.h: In function ‘csum_ipv6_magic’:
/home/cero1/src/cerowrt-3.3.0/build_dir/linux-ar71xx_generic/linux-3.3.2/arch/mips/include/asm/checksum.h:206:13: error: invalid operands to binary & (have ‘const struct in6_addr *‘ and ‘int’)
/home/cero1/src/cerowrt-3.3.0/build_dir/linux-ar71xx_generic/linux-3.3.2/arch/mips/include/asm/checksum.h:206:28: error: invalid operands to binary & (have ‘const struct in6_addr *‘ and ‘int’)
/home/cero1/src/cerowrt-3.3.0/build_dir/linux-ar71xx_generic/linux-3.3.2/arch/mips/include/asm/checksum.h:208:3: error: implicit declaration of function ‘csum_ipv6_unaligned_magic’ [-Werror=implicit-function-declaration]
/home/cero1/src/cerowrt-3.3.0/build_dir/linux-ar71xx_generic/linux-3.3.2/arch/mips/include/asm/checksum.h: At top level:
/home/cero1/src/cerowrt-3.3.0/build_dir/linux-ar71xx_generic/linux-3.3.2/arch/mips/include/asm/checksum.h:269:27: error: conflicting types for ‘csum_ipv6_unaligned_magic’
/home/cero1/src/cerowrt-3.3.0/build_dir/linux-ar71xx_generic/linux-3.3.2/arch/mips/include/asm/checksum.h:208:10: note: previous implicit declaration of ‘csum_ipv6_unaligned_magic’ was here
In file included from include/net/checksum.h:26:0,
from include/linux/skbuff.h:28,
from include/linux/netlink.h:155,
from include/linux/rtnetlink.h:5,
from crypto/algapi.c:19:
/home/cero1/src/cerowrt-3.3.0/build_dir/linux-ar71xx_generic/linux-3.3.2/arch/mips/include/asm/checksum.h: In function ‘csum_ipv6_magic’:
/home/cero1/src/cerowrt-3.3.0/build_dir/linux-ar71xx_generic/linux-3.3.2/arch/mips/include/asm/checksum.h:206:13: error: invalid operands to binary & (have ‘const struct in6_addr *‘ and ‘int’)
/home/cero1/src/cerowrt-3.3.0/build_dir/linux-ar71xx_generic/linux-3.3.2/arch/mips/include/asm/checksum.h:206:28: error: invalid operands to binary & (have ‘const struct in6_addr *‘ and ‘int’)
/home/cero1/src/cerowrt-3.3.0/build_dir/linux-ar71xx_generic/linux-3.3.2/arch/mips/include/asm/checksum.h:208:3: error: implicit declaration of function ‘csum_ipv6_unaligned_magic’ [-Werror=implicit-function-declaration]
/home/cero1/src/cerowrt-3.3.0/build_dir/linux-ar71xx_generic/linux-3.3.2/arch/mips/include/asm/checksum.h: At top level:
/home/cero1/src/cerowrt-3.3.0/build_dir/linux-ar71xx_generic/linux-3.3.2/arch/mips/include/asm/checksum.h:269:27: error: conflicting types for ‘csum_ipv6_unaligned_magic’
/home/cero1/src/cerowrt-3.3.0/build_dir/linux-ar71xx_generic/linux-3.3.2/arch/mips/include/asm/checksum.h:208:10: note: previous implicit declaration of ‘csum_ipv6_unaligned_magic’ was here
cc1: some warnings being treated as errors

make[6]: *** [crypto/crypto_wq.o] Error 1
make[6]: *** Waiting for unfinished jobs....
cc1: some warnings being treated as errors

make[6]: *** [crypto/scatterwalk.o] Error 1
CC [M] fs/btrfs/root-tree.o
CC [M] fs/btrfs/dir-item.o
LD [M] fs/autofs4/autofs4.o
CC [M] fs/btrfs/file-item.o
CC [M] drivers/block/loop.o
cc1: some warnings being treated as errors

make[6]: *** [crypto/algapi.o] Error 1
make[5]: *** [crypto] Error 2
make[5]: *** Waiting for unfinished jobs....
CC [M] fs/btrfs/inode-item.o
CC [M] fs/btrfs/inode-map.o
CC [M] fs/btrfs/disk-io.o
CC [M] fs/btrfs/transaction.o
CC [M] fs/btrfs/inode.o
CC [M] fs/btrfs/file.o
CC [M] fs/btrfs/tree-defrag.o
CC [M] drivers/input/input.o
CC [M] drivers/input/input-compat.o
CC [M] drivers/leds/leds-gpio.o
CC [M] drivers/input/input-mt.o
CC [M] drivers/leds/leds-wndr3700-usb.o
CC [M] drivers/leds/ledtrig-timer.o
CC [M] drivers/leds/ledtrig-default-on.o
CC [M] drivers/leds/ledtrig-netdev.o
CC [M] drivers/input/ff-core.o
CC [M] drivers/input/input-polldev.o
CC [M] fs/btrfs/extent_map.o
In file included from include/net/checksum.h:26:0,
from include/linux/skbuff.h:28,
from include/linux/if_ether.h:134,
from include/linux/netdevice.h:29,
from drivers/leds/ledtrig-netdev.c:25:
/home/cero1/src/cerowrt-3.3.0/build_dir/linux-ar71xx_generic/linux-3.3.2/arch/mips/include/asm/checksum.h: In function ‘csum_ipv6_magic’:
/home/cero1/src/cerowrt-3.3.0/build_dir/linux-ar71xx_generic/linux-3.3.2/arch/mips/include/asm/checksum.h:206:13: error: invalid operands to binary & (have ‘const struct in6_addr *‘ and ‘int’)
/home/cero1/src/cerowrt-3.3.0/build_dir/linux-ar71xx_generic/linux-3.3.2/arch/mips/include/asm/checksum.h:206:28: error: invalid operands to binary & (have ‘const struct in6_addr *‘ and ‘int’)
/home/cero1/src/cerowrt-3.3.0/build_dir/linux-ar71xx_generic/linux-3.3.2/arch/mips/include/asm/checksum.h:208:3: error: implicit declaration of function ‘csum_ipv6_unaligned_magic’ [-Werror=implicit-function-declaration]
/home/cero1/src/cerowrt-3.3.0/build_dir/linux-ar71xx_generic/linux-3.3.2/arch/mips/include/asm/checksum.h: At top level:
/home/cero1/src/cerowrt-3.3.0/build_dir/linux-ar71xx_generic/linux-3.3.2/arch/mips/include/asm/checksum.h:269:27: error: conflicting types for ‘csum_ipv6_unaligned_magic’
/home/cero1/src/cerowrt-3.3.0/build_dir/linux-ar71xx_generic/linux-3.3.2/arch/mips/include/asm/checksum.h:208:10: note: previous implicit declaration of ‘csum_ipv6_unaligned_magic’ was here
CC [M] drivers/input/keyboard/gpio_keys_polled.o
cc1: some warnings being treated as errors

Updated by Dave Täht on Apr 15, 2012.
Just needed to declare the function ahead of the call. I can patch your patch…

or just go to lunch…

/asm/checksum.h:208:3: error: implicit declaration of function ‘csum_ipv6_unaligned_magic’ [-Werror=implicit-function-declaration]
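
For reference, the pair of errors in the log ("implicit declaration" at line 208, then "conflicting types" at line 269) is what GCC reports when a function is called before its prototype has been seen. A minimal sketch of the fix follows, assuming nothing about the real header beyond the error text; `csum_demo` and `csum_unaligned_demo` are hypothetical stand-ins, not the functions in checksum.h.

```c
#include <assert.h>
#include <stdint.h>

/* Prototype first, so the caller below sees the real return and argument types. */
static uint32_t csum_unaligned_demo(const uint8_t *addr);

/* Caller defined before the callee's body, as in the failing header. */
static uint32_t csum_demo(const uint8_t *addr)
{
    assert(addr != 0);
    return csum_unaligned_demo(addr);
}

/* Callee body; without the prototype above, GCC would have assumed it returns int. */
static uint32_t csum_unaligned_demo(const uint8_t *addr)
{
    return (uint32_t)addr[0] + addr[1];
}
```

Moving the callee above the caller would work equally well; the prototype is just the smaller change to an existing patch.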

Updated by Dave Täht on Apr 15, 2012.
Going back to the ipv4 traps, the csum routine over there is already patched by felix’s original patch, so I suspect the 25k/sec traps on ipv4 are coming from someplace that’s too hard to see right now, with all the other traps.
Updated by Dave Täht on Apr 15, 2012.
Never mind, I’m fixing it. Too excited at the prospect of seeing another 40Mbit/sec out of this puppy. Wet paint….
Updated by Robert Bradley on Apr 15, 2012.
You’ll need to cast saddr and daddr to __u32 (or int) in the alignment test, too.
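
For reference, the "invalid operands to binary &" errors come from masking a pointer directly. A sketch of an alignment test with the cast in place, written for userspace so it uses `uintptr_t` where the kernel patch would use `__u32` or `unsigned long`; `demo_in6_addr`, `is_word_aligned` and `both_aligned` are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct in6_addr: 16 bytes of address. */
struct demo_in6_addr {
    uint8_t s6_addr[16];
};

/* '&' is not defined on pointers, so convert to an integer type first. */
static int is_word_aligned(const void *p)
{
    return ((uintptr_t)p & 3u) == 0;
}

/* Both addresses must be 4-byte aligned before a word-at-a-time routine is safe. */
static int both_aligned(const struct demo_in6_addr *saddr,
                        const struct demo_in6_addr *daddr)
{
    assert(saddr != 0 && daddr != 0);
    return is_word_aligned(saddr) && is_word_aligned(daddr);
}
```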
Updated by Dave Täht on Apr 15, 2012.
OK, I’m not fixing it. :(

fatal: corrupt patch at line 136

Can I coax you to give it another go?

Updated by Robert Bradley on Apr 15, 2012.
Sure - I was in the process of doing that and removing the trailing whitespace at the time.
Updated by Dave Täht on Apr 15, 2012.
const appeared to be needed.

and you missed a whitespace at line 136. :)

Updated by Dave Täht on Apr 15, 2012.
Updated by Dave Täht on Apr 15, 2012.
While I’d LIKE the assembly to work very much, I’m going to go back to the C patch for now, just to prove we have proof of concept.

/home/cero1/src/cerowrt-3.3.0/build_dir/linux-ar71xx_generic/linux-3.3.2/arch/mips/include/asm/checksum.h:281:2: error: invalid ‘asm’: operand number out of range
{standard input}:810: Error: Illegal operands `lw ,4($3)’
{standard input}:811: Error: Illegal operands `sllv $6,$6,’
{standard input}:812: Error: Illegal operands `srlv ,,’
{standard input}:813: Error: absolute expression required `or $6,$6,’
{standard input}:818: Error: Illegal operands `lw ,8($3)’
{standard input}:819: Error: Illegal operands `sllv $6,$6,’
{standard input}:820: Error: Illegal operands `srlv ,,’
{standard input}:821: Error: absolute expression required `or $6,$6,’
{standard input}:826: Error: Illegal operands `lw ,12($3)’
{standard input}:827: Error: Illegal operands `sllv $6,$6,’
{standard input}:828: Error: Illegal operands `srlv ,,’
{standard input}:829: Error: absolute expression required `or $6,$6,’
{standard input}:834: Error: Illegal operands `lw ,16($3)’
{standard input}:835: Error: Illegal operands `sllv $6,$6,’
{standard input}:836: Error: Illegal operands `srlv ,,’
{standard input}:837: Error: absolute expression required `or $6,$6,’
{standard input}:841: Error: Illegal operands `srl ,,8’
{standard input}:842: Error: absolute expression required `addu $3,$3,’
{standard input}:844: Error: Unrecognized opcode `set pop’

Nice way to spend a sunday, tho…

Updated by Dave Täht on Apr 15, 2012.
C patch just worked. Building now. Going to lunch. THX, we’ll get there on the asm. (I used to really love programming in asm)
Updated by Dave Täht on Apr 15, 2012.
and the first pass kernel built. rest is building… Seriously going to lunch now.

I note that I’m not doing a from-scratch-clean-build this time, so it’ll be done before I get back.

Updated by Dave Täht on Apr 15, 2012.
back. build crashed (elsewhere)… restarting.

My best guess at the problem with tcp_validate_incoming is that it’s blowing up on accessing the tcphdr (th->rst and th->syn), and the only way for that to happen is that that 16 bit value is on the 3rd byte of a word.

Naturally, tcphdr is a bitfield. I suppose a union here would clean it up,
I’m pretty sure get_unaligned_whatever will mess it up…

But I think the majority of the remaining traps are in this routine.

Updated by Dave Täht on Apr 15, 2012.
So maybe this for tcp stuff?

Might actually be usable throughout as written. Never can wrap my head around endian issues….

Updated by Dave Täht on Apr 15, 2012.
ok, ok, I missed a semicolon. But don’t I have to worry about the native endianness of the local architecture if I do 8 bit values where 16 bit ones were?
Updated by Robert Bradley on Apr 15, 2012.
I’d have thought so too, but it actually looks right to me. I’m actually a bit surprised the kernel developers didn’t need to move both the 4 bit fields to the end of the struct, but the compiler must be smart enough to cope with that (otherwise it wouldn’t be working now).
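
For reference on the endianness worry: a single-byte read has no byte order and no alignment requirement, so testing flags through one byte behaves the same on big- and little-endian hosts. In the TCP header the flag bits sit in byte 13 (SYN is 0x02, RST is 0x04). The sketch below is a userspace illustration of that idea only; the `DEMO_`/`demo_` names are hypothetical and this is not the patch discussed above.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_TCP_FLAGS_OFFSET 13   /* byte 13 of the TCP header holds the flags */
#define DEMO_TCP_FLAG_SYN     0x02
#define DEMO_TCP_FLAG_RST     0x04

/* Test one flag bit with a byte load, which cannot trap on misalignment. */
static int demo_tcp_flag_set(const uint8_t *tcp_header, uint8_t flag)
{
    assert(tcp_header != 0);
    return (tcp_header[DEMO_TCP_FLAGS_OFFSET] & flag) != 0;
}
```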
Updated by Dave Täht on Apr 15, 2012.
Have to move this to the wired lan, but I can confirm that the new build
now available at:

http://huchra.bufferbloat.net/~cero1/3.3/3.3.2-3/

does indeed boot. (it has a bunch of other new stuff in it too, notably new firewall rules).

It does not have the tcp patch as yet.

It FEELs better. Even with logging on, it took a long time to throw these

Apr 15 22:41:33 OpenWrt kern.alert kernel: [  820.781250] [<8024560c>] igmp_rcv+0x114/0x68c
Apr 15 22:41:33 OpenWrt kern.alert kernel: [  820.781250] [<802141d4>] ip_local_deliver_finish+0x174/0x2a0
Apr 15 22:41:33 OpenWrt kern.alert kernel: [  820.781250] [<801e8990>] __netif_receive_skb+0x458/0x4c8
--
Apr 15 22:41:33 OpenWrt kern.alert kernel: [  820.781250] [<80245810>] igmp_rcv+0x318/0x68c
Apr 15 22:41:33 OpenWrt kern.alert kernel: [  820.781250] [<802141d4>] ip_local_deliver_finish+0x174/0x2a0
Apr 15 22:41:33 OpenWrt kern.alert kernel: [  820.781250] [<801e8990>] __netif_receive_skb+0x458/0x4c8
--
Apr 15 22:41:34 OpenWrt kern.alert kernel: [  821.574218] [<8308fe64>] ip6_route_input+0x70/0xd8 [ipv6]
Apr 15 22:41:34 OpenWrt kern.alert kernel: [  821.574218] [<83086310>] ipv6_rcv+0x41c/0x4c4 [ipv6]
Apr 15 22:41:34 OpenWrt kern.alert kernel: [  821.574218] [<801e8990>] __netif_receive_skb+0x458/0x4c8

Apr 15 22:42:39 OpenWrt kern.alert kernel: [  886.886718] [<83095b08>] ipv6_setsockopt+0x118/0x444 [ipv6]
Apr 15 22:42:39 OpenWrt kern.alert kernel: [  886.886718] 
Apr 15 22:42:39 OpenWrt kern.alert kernel: [  886.886718] 
--
Apr 15 22:42:39 OpenWrt kern.alert kernel: [  886.886718] [<83095b10>] ipv6_setsockopt+0x120/0x444 [ipv6]
Apr 15 22:42:39 OpenWrt kern.alert kernel: [  886.886718] 
Apr 15 22:42:39 OpenWrt kern.alert kernel: [  886.886718] 

AND IT DOESN’T CHOKE on ipv6!

Apr 15 22:55:23 OpenWrt kern.alert kernel: [  231.015625] [<83246310>] ipv6_rcv+0x41c/0x4c4 [ipv6]
Apr 15 22:55:23 OpenWrt kern.alert kernel: [  231.015625] [<801e8990>] __netif_receive_skb+0x458/0x4c8
--
Apr 15 22:55:23 OpenWrt kern.alert kernel: [  231.015625] [<8022b4b4>] tcp_rcv_established+0x84/0x694
Apr 15 22:55:23 OpenWrt kern.alert kernel: [  231.015625] [<83267fe8>] ipv6_frag_exit+0x2538/0x3768 [ipv6]
Apr 15 22:55:23 OpenWrt kern.alert kernel: [  231.015625] 
--
Apr 15 22:55:23 OpenWrt kern.alert kernel: [  231.015625] [<8022b4d4>] tcp_rcv_established+0xa4/0x694
Apr 15 22:55:23 OpenWrt kern.alert kernel: [  231.015625] [<83267fe8>] ipv6_frag_exit+0x2538/0x3768 [ipv6]
Apr 15 22:55:23 OpenWrt kern.alert kernel: [  231.015625] 
--
Apr 15 22:55:23 OpenWrt kern.alert kernel: [  231.015625] [<8022b4dc>] tcp_rcv_established+0xac/0x694
Apr 15 22:55:23 OpenWrt kern.alert kernel: [  231.015625] [<83267fe8>] ipv6_frag_exit+0x2538/0x3768 [ipv6]
Apr 15 22:55:23 OpenWrt kern.alert kernel: [  231.015625] 
Apr 15 22:55:56 OpenWrt kern.alert kernel: [  264.242187] 
--
Apr 15 22:55:56 OpenWrt kern.alert kernel: [  264.253906] [<8022b4d4>] tcp_rcv_established+0xa4/0x694
Apr 15 22:55:56 OpenWrt kern.alert kernel: [  264.253906] [<83267fe8>] ipv6_frag_exit+0x2538/0x3768 [ipv6]
Apr 15 22:55:56 OpenWrt kern.alert kernel: [  264.253906] 
--
Apr 15 22:55:56 OpenWrt kern.alert kernel: [  264.253906] [<8022b4dc>] tcp_rcv_established+0xac/0x694
 265.199218]         0000fb04 821a7700 83177780 000004fc 00000000 00010000 0090d410 80221878
Apr 15 22:55:57 OpenWrt kern.alert kernel: [  265.199218]         ...
Apr 15 22:55:57 OpenWrt kern.alert kernel: [  265.199218] [<8022b4d4>] tcp_rcv_established+0xa4/0x694
Apr 15 22:55:57 OpenWrt kern.alert kernel: [  265.199218] [<83267fe8>] ipv6_frag_exit+0x2538/0x3768 [ipv6]
Apr 15 22:55:57 OpenWrt kern.alert kernel: [  265.199218] 
--
Apr 15 22:55:57 OpenWrt kern.alert kernel: [  265.207031] [<8022b4d4>] tcp_rcv_established+0xa4/0x694
Apr 15 22:55:57 OpenWrt kern.alert kernel: [  265.207031]         65726600 6e657470 821a7700 83177780 821a7700 80288c18 821a7700 8322b180
Apr 15 22:55:57 OpenWrt kern.alert kernel: [  265.207031]         00000000 80360000 802e15e8 801dba10 00000000 00000000 00010000 8022e930
--
Apr 15 22:55:57 OpenWrt kern.alert kernel: [  265.207031] [<8022b4dc>] tcp_rcv_established+0xac/0x694
Apr 15 22:55:57 OpenWrt kern.alert kernel: [  265.207031] [<83267fe8>] ipv6_frag_exit+0x2538/0x3768 [ipv6]
Apr 15 22:55:57 OpenWrt kern.alert kernel: [  265.207031] 
--
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

 87380  65536  65536    10.17       2.78   

The above throughput, with traps on, is over double what we started with.

root@OpenWrt:~# netperf -l 60 -6 -H huchra.bufferbloat.net
MIGRATED TCP STREAM TEST from :: (::) port 0 AF_INET6 to lists.bufferbloat.net () port 0 AF_INET6 : demo
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

 87380  65536  65536    60.00     140.29   
root@OpenWrt:~# netperf -l 60 -6 -H huchra.bufferbloat.net -t TCP_MAERTS
MIGRATED TCP MAERTS TEST from :: (::) port 0 AF_INET6 to lists.bufferbloat.net () port 0 AF_INET6 : demo
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

 87380  16384  16384    60.01     181.54

Can’t draw any conclusions from these two numbers, I have different firewall rules now… but it’s definitely better. And I guess I should do that on a fresh boot.

Updated by Dave Täht on Apr 15, 2012.
Proof, tho, in the ratios.

This is on a machine that can only do 100Mbit, so the results will differ…

root@cruithne:~/git/deBloat/test# netperf -l 60 -6 -H 2001:4f8:fff8:900::1
MIGRATED TCP STREAM TEST from ::0 (::) port 0 AF_INET6 to 2001:4f8:fff8:900::1 () port 0 AF_INET6
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  65536  65536    60.16      92.70
root@cruithne:~/git/deBloat/test# bc
bc 1.06.95
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty’.
11117911-9183546
1934365
./60
32239 For ipv6

root@cruithne:~/git/deBloat/test# netperf -l 60 -4 -H 172.29.9.1
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.29.9.1 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  65536  65536    60.14      94.12
root@cruithne:~/git/deBloat/test# bc
bc 1.06.95
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty’.
12587173-11164137
1423036
./60
23717
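
For reference, the two bc sessions above compute a trap rate: the difference between two readings of the unaligned_instructions counter divided by the 60-second netperf run. A small sketch of the same arithmetic, with a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Integer traps per second between two counter readings, truncated like bc's default scale. */
static uint64_t traps_per_second(uint64_t before, uint64_t after, uint64_t seconds)
{
    assert(seconds > 0 && after >= before);
    return (after - before) / seconds;
}
```

With the readings pasted above, the ipv6 run gives (11117911 - 9183546) / 60 = 32239 traps/sec and the ipv4 run gives (12587173 - 11164137) / 60 = 23717 traps/sec.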

Updated by Dave Täht on Apr 15, 2012.
And further proof it’s no longer in the checksum routine.

tcp_rcv_established (which affects both ipv6 and ipv4) is the bad boy now.

root@cruithne:~/git/deBloat/test# netperf -l 60 -6 -H 2001:4f8:fff8:900::1 -t TCP_MAERTS
MIGRATED TCP MAERTS TEST from ::0 (::) port 0 AF_INET6 to 2001:4f8:fff8:900::1 () port 0 AF_INET6
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  65536  65536    60.01      91.80
root@cruithne:~/git/deBloat/test# bc
bc 1.06.95
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty’.
13544979-12630492
914487
./60
15241

Updated by Robert Bradley on Apr 15, 2012.
Yes, definitely the right track, and we may even hit that 1000 traps/s target soon enough.
Updated by Robert Bradley on Apr 15, 2012.
Final patch for this evening:

  • Packed igmp headers (eliminating the trap in igmp_rcv)
  • Changed tcp_parse_aligned_timestamp so that it uses the __get_unaligned_cpu32() macros
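
For reference, the timestamp fast path expects the option block to start with NOP, NOP, TIMESTAMP (kind 8), length 10, i.e. the 32-bit word 0x0101080A, followed by TSval and TSecr. The sketch below shows an unaligned-safe version of that check in userspace. It assembles each word from bytes rather than calling `__get_unaligned_cpu32()`, and the `demo_` names are hypothetical; it is an illustration, not the attached patch.

```c
#include <assert.h>
#include <stdint.h>

/* Read a big-endian 32-bit value one byte at a time: safe at any alignment. */
static uint32_t demo_load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 1 and fills tsval/tsecr if opt starts with NOP, NOP, TIMESTAMP, 10. */
static int demo_parse_aligned_timestamp(const uint8_t *opt,
                                        uint32_t *tsval, uint32_t *tsecr)
{
    assert(opt != 0 && tsval != 0 && tsecr != 0);
    if (demo_load_be32(opt) != 0x0101080Au)
        return 0;
    *tsval = demo_load_be32(opt + 4);
    *tsecr = demo_load_be32(opt + 8);
    return 1;
}
```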
Updated by Robert Bradley on Apr 16, 2012.
Updated by Dave Täht on Apr 16, 2012.
Going through the router, rather than to it, with the last set of patches (not the new stuff added last night), I get 220+ Mbit/sec through the router on both ipv4 and ipv6!
Updated by Dave Täht on Apr 16, 2012.
The tcp timestamp check looks like a far more viable candidate for the problems here than the 8 bit alignment thing for the tcp headers. I look forward to trying these wednesday or so.

(I have a few more but I don’t remember what they are right now)

I’ve accumulated a few more patches that look important, as well. And I need to make sch_sfq switchable via a module param between the faster, possibly unstable, HoQ behavior, to the older Tail of Q behavior (ToQ). But I look forward to a day where something like a ping flood won’t do bad things to this puppy, and perhaps another 20-30Mbit/sec of performance can be had.

I do like the asm patch in that I’m really not fond of making everything use unaligned accesses in the network stack, there are plenty of cases (probably well over 80%) where that isn’t needed. Route lookups come to mind.

However, tracking all those down is hard, and killing 75000 traps/sec certainly increased performance by a factor of 2, which is hard to argue with!

Updated by Dave Täht on Apr 17, 2012.
in turning tcp timestamps off via sysctl I believe you found the real source of the tcp rcv problem.
With them off, with logging on, ipv6…

m@cruithne:~$ netperf -6 -H 2001:4f8:fff8:800::1
MIGRATED TCP STREAM TEST from ::0 (::) port 0 AF_INET6 to 2001:4f8:fff8:800::1 () port 0 AF_INET6
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

 87380  65536  65536    10.31       6.43  

(this is like over double what it was before)

And even more significantly…

m@cruithne:~$ netperf -H 172.29.9.1
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.29.9.1 () port 0 AF_INET
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

 87380  65536  65536    10.06      94.80   

Yes, I’m getting the actual possible throughput on this link, under ipv4, even with logging of the traps turned on. Nice job. I’ll apply the patch.

As for why ipv6 is still throwing so many traps under this scenario, it appears to be

Apr 18 03:53:58 OpenWrt kern.alert kernel: [172007.242187] [<801e7508>] skb_flow_dissect+0x388/0x400
Apr 18 03:53:58 OpenWrt kern.alert kernel: [172007.242187] [<82afd3f8>] 0x82afd3f8
Apr 18 03:53:58 OpenWrt kern.alert kernel: [172007.242187]

Apr 18 03:55:10 OpenWrt kern.alert kernel: [172078.832031] [<82b08d3c>] 0x82b08d3c
Apr 18 03:55:10 OpenWrt kern.alert kernel: [172078.832031]

Apr 18 03:55:10 OpenWrt kern.alert kernel: [172078.832031]

Apr 18 03:55:20 OpenWrt kern.alert kernel: [172088.753906] [<831cfe64>] ip6_route_input+0x70/0xd8 [ipv6]
Apr 18 03:55:20 OpenWrt kern.alert kernel: [172088.753906] [<831c6310>] ipv6_rcv+0x41c/0x4c4 [ipv6]
Apr 18 03:55:20 OpenWrt kern.alert kernel: [172088.753906] [<801e8990>] __netif_receive_skb+0x458/0x4c8

Updated by Robert Bradley on Apr 18, 2012.
Dave, is that really only 6.4 Mb/s on IPv6?

I managed to find a pointer cast and dereference in ip6_route_input, which I assume is part of the problem. I also noticed it called __ipv6_addr_type, which should be unaligned-safe now that in6_addr is packed. Either way, this new patch should hopefully fix both of these issues.

Updated by Robert Bradley on Apr 18, 2012.
Since I had some free time, I decided to plough through the entire /net/ipv6 directory with grep looking for *(__be32 *) pointer dereferencing. The hope is that this will catch a lot of the remaining unaligned references left in the IPv6 stack, at the possible expense of a few false positives. I also packed the icmpv6 and ipv6 option header structs, which hopefully should improve the ping flood resistance a bit.
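
For reference, a `*(__be32 *)ptr` dereference tells the compiler the address is 4-byte aligned, which is what traps on this CPU when it is not. Wrapping the value in a packed struct removes that assumption so GCC emits loads that are safe at any address; as far as I can tell, the kernel's packed-struct unaligned helpers use the same trick. The sketch below is a userspace illustration with hypothetical `demo_` names; the `may_alias` attribute is added here only because userspace builds normally enable strict aliasing.

```c
#include <assert.h>
#include <stdint.h>

/* GCC/Clang extension: 'packed' drops the alignment assumption to 1 byte. */
struct demo_una_u32 {
    uint32_t x;
} __attribute__((packed, may_alias));

/* Native-endian 32-bit load that is valid at any address. */
static uint32_t demo_get_unaligned_cpu32(const void *p)
{
    const struct demo_una_u32 *ptr = (const struct demo_una_u32 *)p;

    assert(p != 0);
    return ptr->x;
}
```

The cost is that every access through the packed type is compiled as if it might be unaligned, which is the trade-off raised earlier about not wanting unaligned accesses everywhere in the stack.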
Updated by Dave Täht on Apr 18, 2012.
A little duplication of work here, I’d done the route stuff and skb dissect fixes and then fell asleep while the build ran, without an update to the bug log here.

However, I did NOT break out the atomic weapons you did on your second patch. :)

I would prefer to test stuff a little incrementally, the build I started last night had your most recent asm patch in it, in particular, and as soon as I wake up a little more, I’ll install the build, then verify functionality (in fact, I’m having some route lookup issues in ra, that bother me), then move forward with the rest.

Updated by Dave Täht on Apr 18, 2012.
yes, ipv6 tcp endpoints, with logging on, run at 6Mbit/sec, ipv4 tcp endpoints, with logging on, 94+ (The laptop is only 100Mbit), with timestamps off.

explains a lot about the issues I was having with developing a load measuring test that would ‘just work’ on the router.

Updated by Dave Täht on Apr 18, 2012.
[ 580.136718] ICMPv6 checksum failed [2001:04f8:fff8:0800:0227:13ff:fe64:8967 > ff02:0000:0000:0000:0000:0001:ff00:0001]

I’m willing to pursue doing this routine in assembly but I think it would be best to move to testing it in userspace. Going back to C for it for now.

Updated by Dave Täht on Apr 18, 2012.
btw, in the process of so thoroughly breaking ipv6, I got some insight as to where all the traps came from. Perversely, thank you for breaking that. :)

I think with the tcp timestamp fix in place, we can regard ipv4 as baked now.

root@OpenWrt:/sys/kernel/debug/mips# netperf -l60 -4 -H huchra.bufferbloat.net
MIGRATED TCP STREAM TEST from 0.0.0.0 () port 0 AF_INET to lists.bufferbloat.net () port 0 AF_INET : demo
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

 87380  65536  65536    60.00     179.97   
root@OpenWrt:/sys/kernel/debug/mips# cat unaligned_instructions 
173
root@OpenWrt:/sys/kernel/debug/mips# 

tunneling still needs to get tested, and that particular test was with the aqm code off…

Apr 18 15:24:25 OpenWrt kern.alert kernel: [ 1073.734375] [<82bad404>] 0x82bad404
Apr 18 15:24:25 OpenWrt kern.alert kernel: [ 1073.734375] 
--
Apr 18 15:24:36 OpenWrt kern.alert kernel: [ 1084.785156] [<801e7510>] skb_flow_dissect+0x390/0x410
Apr 18 15:24:36 OpenWrt kern.alert kernel: [ 1084.785156] [<82bad404>] 0x82bad404
Apr 18 15:24:36 OpenWrt kern.alert kernel: [ 1084.785156] 
--
Apr 18 15:24:41 OpenWrt kern.alert kernel: [ 1089.253906] [<801e7510>] skb_flow_dissect+0x390/0x410
Apr 18 15:24:41 OpenWrt kern.alert kernel: [ 1089.253906] [<82bad404>] 0x82bad404
Apr 18 15:24:41 OpenWrt kern.alert kernel: [ 1089.253906] 
--
Apr 18 15:37:11 OpenWrt kern.alert kernel: [ 1839.308593] [<801e7510>] skb_flow_dissect+0x390/0x410
Apr 18 15:37:11 OpenWrt kern.alert kernel: [ 1839.308593] [<82bad404>] 0x82bad404
Apr 18 15:37:11 OpenWrt kern.alert kernel: [ 1839.308593] 
--
Apr 18 15:37:24 OpenWrt kern.alert kernel: [ 1852.539062] [<801e7510>] skb_flow_dissect+0x390/0x410
Apr 18 15:37:24 OpenWrt kern.alert kernel: [ 1852.539062] [<82bad404>] 0x82bad404
Apr 18 15:37:24 OpenWrt kern.alert kernel: [ 1852.539062] 

Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

 87380  65536  65536    60.00     179.19   
root@OpenWrt:/sys/kernel/debug/mips# cat unaligned_instructions 
187

That’s awfully rare to throw that, and so I suspect that’s leaking over from some bit of the still partially working ipv6 code.

I’m actually really tempted to leave it broken and test tunnels today. As well as a few other things like wireless, that we haven’t looked at lately. I have high hopes that wireless (with the timestamp fix) just improved a bit

Updated by Dave Täht on Apr 18, 2012.
ah. simple idea. fall back to the c version of the checksum routine when unaligned? Probably do the same thing to the ipv4 routine.

As a similar exercise, moving it to userspace to see which wins, looks straightforward.

I mean, the shift thing is clever but gcc has come a long way since that routine was written…
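
For reference, the fallback idea can be tried in userspace along these lines: check the buffer alignment, take a word-at-a-time path when it is aligned, and drop to a byte-wise C loop otherwise. This is a sketch under assumptions, not the MIPS routine: `demo_csum_aligned` merely stands in for the optimized code, it computes the plain RFC 1071 Internet checksum rather than the IPv6 pseudo-header sum, and all `demo_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator down to 16 bits with end-around carry. */
static uint16_t demo_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Portable path: builds each 16-bit word from bytes, so any alignment works. */
static uint16_t demo_csum_bytewise(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;

    for (; len > 1; p += 2, len -= 2)
        sum += ((uint32_t)p[0] << 8) | p[1];
    if (len)
        sum += (uint32_t)p[0] << 8;
    return (uint16_t)~demo_fold(sum);
}

/* Stand-in for the optimized routine: 16-bit loads, valid only when p is aligned. */
static uint16_t demo_csum_aligned(const uint8_t *p, size_t len)
{
    const uint16_t *w = (const uint16_t *)(const void *)p;
    const uint16_t probe = 1;
    uint32_t sum = 0;
    uint16_t s;

    for (; len > 1; len -= 2)
        sum += *w++;
    s = demo_fold(sum);
    if (*(const uint8_t *)&probe)   /* little-endian host: swap to network order */
        s = (uint16_t)((s << 8) | (s >> 8));
    if (len)                        /* odd trailing byte is the high half of a word */
        s = demo_fold((uint32_t)s + ((uint32_t)*(const uint8_t *)w << 8));
    return (uint16_t)~s;
}

/* Dispatcher: take the fast path only when the buffer is 2-byte aligned. */
static uint16_t demo_csum(const uint8_t *p, size_t len)
{
    assert(p != 0 && len < 65536);
    if (((uintptr_t)p & 1u) == 0)
        return demo_csum_aligned(p, len);
    return demo_csum_bytewise(p, len);
}
```

Running both paths over the same bytes at aligned and misaligned offsets and comparing results (and timings) is one way to see which wins before touching the kernel.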

I’m going to take a harder look at the two tunneling issues I have in the issue tracker, during this actually convenient time while ipv6 is toast…

Updated by Robert Bradley on Apr 18, 2012.
I was wanting to apologise for managing to waste so much of your time on the asm version, actually! Speaking of which, I was overshifting - “sll $12, $12, 0x08” should be “sll $12, $12, 0x03”, and “srl $12, $12, 0x08” becomes “srl $12, $12, 0x03”. There’s also a question of whether a bitshift right of 32 does what I think it should (clear the register), which is needed for aligned addresses.
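
For reference, the shift arithmetic can be checked in C. The byte offset within a word is converted to a bit count by shifting left by 3 (multiplying by 8), and the aligned case needs its own branch: shifting a 32-bit value by 32 is undefined in C, and as far as I know MIPS32 variable shifts use only the low five bits of the count, so a count of 32 would behave like 0 rather than clear the register. The sketch works on word values that are already in big-endian order; `demo_merge_words` is a hypothetical name and this is not the asm from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * hi and lo are two consecutive aligned words, already held as big-endian
 * values; byte_off (0..3) is how far into hi the wanted word starts.
 */
static uint32_t demo_merge_words(uint32_t hi, uint32_t lo, unsigned byte_off)
{
    unsigned bits = byte_off << 3;  /* bytes to bits: shift by 3, not 8 */

    assert(byte_off < 4);
    if (bits == 0)                  /* aligned: avoid the undefined 32-bit shift */
        return hi;
    return (hi << bits) | (lo >> (32 - bits));
}
```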

Personally, I’d stick to using the C version and assuming GCC now does the right thing. We know that works well enough, and there’s bigger problems to worry about elsewhere.

Updated by Dave Täht on Apr 18, 2012.
Hey, I enjoy assembly and remember well when it was necessary. And it still is, and checksums like this are problems…

But what I think is best is a unalignment check on both ipv4 and v6, prefacing the known to work mips one. This should be better on ipv6 and ipv4 than what we started with, especially on wireless.

I’m reworking that patch set now and reviewing your mondo patch.

In checking the tunneling code I found all sorts of problems like this, and I do hope that ends up improving ipsec, ipip, and so on by a lot. I’d only got 20Mbit/sec out of ipsec when I tested it last year. Now I know why…

Updated by Dave Täht on Apr 18, 2012.
oh, and the asm patch is turning out USEFUL. In breaking ipv6 completely, I can gain insight into the remaining ipv4 problems. Thank you. (perversely)
Updated by Dave Täht on Apr 18, 2012.
OK, I have a cleaned-up, rollup patch of everything that we’ve needed to touch to date.

I think.

I still need to look harder at the vtun code, but for sure, ipsec was going to suck.

Updated by Dave Täht on Apr 18, 2012.
root@OpenWrt:/sys/kernel/debug/mips# cat unaligned_instructions 
912353
root@OpenWrt:/sys/kernel/debug/mips# netperf -6 -H huchra.bufferbloat.net -t TCP_MAERTS
MIGRATED TCP MAERTS TEST from :: (::) port 0 AF_INET6 to lists.bufferbloat.net () port 0 AF_INET6 : demo

Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

 87380  16384  16384    10.01     200.20   
root@OpenWrt:/sys/kernel/debug/mips# 
root@OpenWrt:/sys/kernel/debug/mips# cat unaligned_instructions 
912357


root@OpenWrt:/sys/kernel/debug/mips# cat unaligned_instructions 
912361
root@OpenWrt:/sys/kernel/debug/mips# netperf -4 -H huchra.bufferbloat.net -t TCP_MAERTS


Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

 87380  16384  16384    10.01     225.00   
root@OpenWrt:/sys/kernel/debug/mips# cat unaligned_instructions 
912365

However, on forwarding traffic, we’re getting eaten alive, but I think I have a grip on that.

Apr 18 22:54:26 OpenWrt kern.alert kernel: [ 1075.269531] [<801e750c>] skb_flow_dissect+0x38c/0x410
Apr 18 22:54:26 OpenWrt kern.alert kernel: [ 1075.269531] [<82b9d404>] 0x82b9d404
Apr 18 22:54:26 OpenWrt kern.alert kernel: [ 1075.269531] 
--
Apr 18 22:54:27 OpenWrt kern.alert kernel: [ 1077.480468] [<801e750c>] skb_flow_dissect+0x38c/0x410
Apr 18 22:54:27 OpenWrt kern.alert kernel: [ 1077.480468] [<82b9d404>] 0x82b9d404
Apr 18 22:54:27 OpenWrt kern.alert kernel: [ 1077.480468] 
--
Apr 18 22:54:29 OpenWrt kern.alert kernel: [ 1078.566406] [<801e750c>] skb_flow_dissect+0x38c/0x410
Apr 18 22:54:29 OpenWrt kern.alert kernel: [ 1078.566406] [<82b9d404>] 0x82b9d404
Apr 18 22:54:29 OpenWrt kern.alert kernel: [ 1078.566406] 
Updated by Robert Bradley on Apr 18, 2012.
You’ve probably beaten me to this, but here’s my efforts with skb_flow_dissect.
Updated by Dave Täht on Apr 18, 2012.
No, I got so far as .7 traps a sec on ipv6 and 7000/sec on ipv4, and fell asleep.

I decided before trying again to check for output from you.

problem is, none of those candidates seem plausible. flow->ports should be aligned already….. GRE is not hit, I at least am not doing vlans, but yea, looking at that makes sense….

Updated by Dave Täht on Apr 20, 2012.
I found it. more news as it happens.

http://huchra.bufferbloat.net/~cero1/3.3/3.3.2-7/

I’m a little concerned in that I’m seeing nat timeouts… or something related to the sfq patch… we just tinkled over a whole lot of code…

Updated by Dave Täht on Apr 20, 2012.
Nope. Throwing 10,000 traps/sec at 240Mbit, when the aqms are enabled.
on ipv4. Temporarily not able to test ipv6…

nearly zero when aqms disabled.

Updated by Robert Bradley on Apr 20, 2012.
I was going to suggest looking at include/net/inet_ecn.h and include/net/dsfield.h, but those seem reasonably OK for IPv4 (but not IPv6). That’s assuming that sfqred and the functions it calls are the cause of this, as opposed to something deeper in the scheduling code.
Updated by Dave Täht on Apr 21, 2012.
about 1 trap per second now (or rather 4 or more in a row every time babel does a route update)

both ipv6 and ipv4 do about 250Mbit through the router without aqm and nearly no firewall rules. The router never gets ‘chunky’ in its feel, and although I care about the last set of traps, we’ll get them eventually.

With default aqm/fw rules, about 224. (I note that usually, you are sending less than 20Mbit out the second ethernet port) I have not benchmarked wireless with this version, it was doing well when last I checked.

It does concern me that occasionally I get a complete tcp or nat reset from #371, but that is in part the product of my busy lab (10+ machines) spewing RA and babel routing info everywhere. Or so I think.

Now I can finally think I have a ‘fluid model’ to play with and can have onboard measurements and calculations that make sense, or so I hope. THANK YOU FOR THE HELP.

http://huchra.bufferbloat.net/~cero1/3.3/3.3.2-8/

I did at one point see 288Mbit through the router, but that may have been an illusion.

Anyway, gotta see how gone #371 is and test some wireless next…

Updated by Dave Täht on Apr 21, 2012.
incidentally, to make poking into this easier, I started building debuggable kernels and copying them (and the contents of the patched linux source tree) to a flash stick that I could then call up with gdb on the router.

That makes it possible to look at the symbol tables with the +offset.

Maybe the cross dev gdb works, too, never checked.

Regrettably this doesn’t work terribly well with kernel modules. If you compile them as part of the build, you can do that too…

But doing this mostly via mark 1 eyeball was constructive, as I at least, learned a lot about how the ipv6 and ipv4 hot (and cold) paths actually worked.

Updated by Dave Täht on Apr 21, 2012.
I’m going to treat the route update bug of #371 as a separate bug.

This bug was too epic in scope. Please do not delete any patches, they are useful to have around as blind alleys.

Updated by David Taht on Dec 22, 2013.
[bug #360] returns in 3.10.24

[167552.300781] …
[167552.300781] Call Trace:
[167552.300781] [<80251264>] tcp_rcv_established+0xcc/0x650
[167552.300781] [<80259798>] tcp_v4_do_rcv+0x88/0x290
[167552.300781] [<801ff048>] release_sock+0xe8/0x16c
[167552.300781] [<80248b30>] tcp_sendmsg+0xb0c/0xc70
[167552.300781] [<801fb4dc>] sock_sendmsg+0x78/0xa8
[167552.300781] [<801fd38c>] SyS_sendto+0xcc/0x10c
[167552.300781] [<801fd3e0>] SyS_send+0x14/0x20
[167552.300781] [<80062544>] stack_done+0x20/0x40

Looks like there was some churn around this routine that got messed up and/or
removed from the patch

http://www.bufferbloat.net/issues/360

Finding and fixing this one was pretty epic, and certainly traps like
these are hell on benchmarking the router.

Updated by Dave Täht on Dec 22, 2013.
Going through the full suite of tests (thc, etc) and ipv6 would be sane to re-do, especially before
doing any further benchmarking

This is a static export of the original bufferbloat.net issue database. As such, no further commenting is possible; the information is solely here for archival purposes.