Issue
With "Large Receive Offload" enabled on the HV's bridged appliance interface, guest VM's networking may work incorrectly.
Normally, these issues should not appear, because LRO is disabled automatically for Ethernet devices attached to a bridge. However, in kernel versions 2.6.32-515 and higher (2.6.32-573 currently ships with CentOS 6.7), if bonding is configured on the appliance interface, LRO is disabled on the bond interface itself but not on the member NICs.
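The mismatch can be observed with ethtool. The output below is illustrative, assuming a bond0 appliance interface with member NIC eth0:
# ethtool -k bond0 |grep large-receive-offload
large-receive-offload: off
# ethtool -k eth0 |grep large-receive-offload
large-receive-offload: on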
The dmesg output of the affected HV is flooded with errors like the following:
WARNING: at net/core/dev.c:1915 skb_warn_bad_offload+0x99/0xb0() (Tainted: G W -- ------------ )
: caps=(0x4000, 0x0) len=1514 data_len=1460 ip_summed=1
Modules linked in: fuse nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 xt_state cls_fw sch_htb iptable_mangle ip6table_mangle sch_tbf xt_physdev xt_MARK be2iscsi iscsi_boot_sysfs bnx2i(U) cnic(U) uio cxgb4i(U) cxgb3i(U) ib_iser(U) iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi 8021q garp xfs exportfs ipt_REJECT ip6table_filter ip6_tables ebtable_nat dm_cache_mq dm_cache dm_bio_prison dm_persistent_data libcrc32c ext2 mbcache ebtable_broute ebtables bonding arptable_filter arp_tables xt_NOTRACK nf_conntrack iptable_raw iptable_filter ip_tables nbd(U) rdma_ucm(U) ib_ucm(U) rdma_cm(U) iw_cm(U) configfs ib_ipoib(U) ib_cm(U) ib_uverbs(U) ib_umad(U) mlx5_ib(U) mlx5_core(U) mlx4_en(U) mlx4_ib(U) ib_sa(U) ib_mad(U) ib_core(U) ib_addr(U) mlx4_core(U) mlx_compat(U) vhost_net macvtap macvlan tun kvm_intel kvm dm_snapshot dm_bufio dm_mirror_sync(U) dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache auth_rpcgss nfs_acl sunrpc bridge ipv6 stp llc joydev sg sd_mod crc_t10dif iTCO_wdt iTCO_vendor_support dcdbas wmi power_meter acpi_ipmi ipmi_si ipmi_msghandler igb(U) i2c_algo_bit i2c_core megaraid_sas(U) ixgbe(U) dca ptp pps_core sb_edac edac_core lpc_ich mfd_core shpchp cramfs [last unloaded: scsi_wait_scan]
Pid: 0, comm: swapper Tainted: G W -- ------------ 2.6.32-573.7.1.el6.x86_64 #1
Call Trace:
<IRQ> [<ffffffff81077461>] ? warn_slowpath_common+0x91/0xe0
[<ffffffff81077566>] ? warn_slowpath_fmt+0x46/0x60
[<ffffffff81296a55>] ? __ratelimit+0xd5/0x120
[<ffffffff8146b2e9>] ? skb_warn_bad_offload+0x99/0xb0
[<ffffffff8146f9b1>] ? __skb_gso_segment+0x71/0xc0
[<ffffffff8146fa13>] ? skb_gso_segment+0x13/0x20
[<ffffffff8146fabb>] ? dev_hard_start_xmit+0x9b/0x490
[<ffffffffa022b567>] ? ipv4_confirm+0x87/0x1d0 [nf_conntrack_ipv4]
[<ffffffff81014a19>] ? read_tsc+0x9/0x10
[<ffffffff8148d17a>] ? sch_direct_xmit+0x15a/0x1c0
[<ffffffff8148d24b>] ? __qdisc_run+0x6b/0xe0
[<ffffffff81470128>] ? dev_queue_xmit+0x1f8/0x320
[<ffffffffa01a2818>] ? br_dev_queue_push_xmit+0x88/0xc0 [bridge]
[<ffffffffa01a875b>] ? br_nf_dev_queue_xmit+0x2b/0xa0 [bridge]
[<ffffffffa01a9608>] ? br_nf_post_routing+0x248/0x2c0 [bridge]
[<ffffffff8149ab39>] ? nf_iterate+0x69/0xb0
[<ffffffffa01a2790>] ? br_dev_queue_push_xmit+0x0/0xc0 [bridge]
[<ffffffff8149acf6>] ? nf_hook_slow+0x76/0x120
[<ffffffffa01a2790>] ? br_dev_queue_push_xmit+0x0/0xc0 [bridge]
[<ffffffffa01a2850>] ? br_forward_finish+0x0/0x60 [bridge]
[<ffffffffa01a2893>] ? br_forward_finish+0x43/0x60 [bridge]
[<ffffffffa01a8cf8>] ? br_nf_forward_finish+0x168/0x180 [bridge]
[<ffffffffa01a9238>] ? br_nf_forward_ip+0x238/0x3c0 [bridge]
[<ffffffff8149ab39>] ? nf_iterate+0x69/0xb0
[<ffffffffa01a2850>] ? br_forward_finish+0x0/0x60 [bridge]
[<ffffffff8149acf6>] ? nf_hook_slow+0x76/0x120
[<ffffffffa01a2850>] ? br_forward_finish+0x0/0x60 [bridge]
[<ffffffffa01a292e>] ? __br_forward+0x7e/0xd0 [bridge]
[<ffffffff8149acf6>] ? nf_hook_slow+0x76/0x120
[<ffffffffa01a29dd>] ? br_forward+0x5d/0x70 [bridge]
[<ffffffffa01a387e>] ? br_handle_frame_finish+0x17e/0x330 [bridge]
[<ffffffffa01a9a88>] ? br_nf_pre_routing_finish+0x238/0x350 [bridge]
[<ffffffffa01a9fb8>] ? br_nf_pre_routing+0x418/0x7e0 [bridge]
[<ffffffff8149ab39>] ? nf_iterate+0x69/0xb0
[<ffffffffa01a3700>] ? br_handle_frame_finish+0x0/0x330 [bridge]
[<ffffffff8149acf6>] ? nf_hook_slow+0x76/0x120
[<ffffffffa01a3700>] ? br_handle_frame_finish+0x0/0x330 [bridge]
[<ffffffffa01a3bd8>] ? br_handle_frame+0x1a8/0x270 [bridge]
[<ffffffffa01a3a30>] ? br_handle_frame+0x0/0x270 [bridge]
[<ffffffff8146aaf7>] ? __netif_receive_skb+0x1c7/0x570
[<ffffffff8146e3d8>] ? netif_receive_skb+0x58/0x60
[<ffffffff8146e4e0>] ? napi_skb_finish+0x50/0x70
[<ffffffff8146eaf9>] ? napi_gro_receive_gr+0x39/0x50
[<ffffffff81512aeb>] ? vlan_gro_receive+0x1b/0x30
[<ffffffffa005e014>] ? ixgbe_receive_skb+0x64/0x90 [ixgbe]
[<ffffffffa00611df>] ? ixgbe_clean_rx_irq+0x4df/0x1060 [ixgbe]
[<ffffffff810ad4bd>] ? ktime_get+0x6d/0x100
[<ffffffffa00620bb>] ? ixgbe_poll+0x27b/0x690 [ixgbe]
[<ffffffff81014a19>] ? read_tsc+0x9/0x10
[<ffffffff810ad4bd>] ? ktime_get+0x6d/0x100
[<ffffffff81470443>] ? net_rx_action+0x103/0x2f0
[<ffffffff8107ffa1>] ? __do_softirq+0xc1/0x1e0
[<ffffffff810ed920>] ? handle_IRQ_event+0x60/0x170
[<ffffffff8100c38c>] ? call_softirq+0x1c/0x30
[<ffffffff8100fbd5>] ? do_softirq+0x65/0xa0
[<ffffffff8107fe55>] ? irq_exit+0x85/0x90
[<ffffffff815426d5>] ? do_IRQ+0x75/0xf0
[<ffffffff8100ba53>] ? ret_from_intr+0x0/0x11
<EOI> [<ffffffff812f109e>] ? intel_idle+0xfe/0x1b0
[<ffffffff812f1081>] ? intel_idle+0xe1/0x1b0
[<ffffffff8143376a>] ? cpuidle_idle_call+0x7a/0xe0
[<ffffffff81009fe6>] ? cpu_idle+0xb6/0x110
[<ffffffff81531b22>] ? start_secondary+0x2c0/0x316
---[ end trace f69c8d8acec4a713 ]---
Environment
CentOS 6.7
Kernel versions 2.6.32-515 and above
Resolution
The issue is caused by the "Large Receive Offload" feature, which is enabled by default on the bond member NICs.
It can be checked and resolved as follows:
1. Find out the bond appliance interface identifier; the "ifconfig" command can be used for this. In this guide, let's assume that it is "bond0".
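Alternatively, the configured bonds can be listed directly (the output below is illustrative):
# ls /proc/net/bonding/
bond0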
2. Locate the member NICs' identifiers:
# grep Interface /proc/net/bonding/bond0
Slave Interface: eth0
Slave Interface: eth1
3. So, we have two physical NICs in our bond: "eth0" and "eth1". Now, we should check whether they have LRO enabled:
# ethtool -k eth0 |grep large-receive-offload
large-receive-offload: on
# ethtool -k eth1 |grep large-receive-offload
large-receive-offload: on
As we can see, LRO is indeed enabled on both NICs.
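All bond members can also be checked in one pass with a loop like the following (a sketch, assuming the "bond0" name from step 1):
# for nic in $(awk '/Slave Interface/ {print $3}' /proc/net/bonding/bond0); do ethtool -k "$nic" |grep large-receive-offload; done
large-receive-offload: on
large-receive-offload: on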
4. For Cloudboot hypervisors. To disable LRO, place the commands below in the HV's custom config (remember that the NIC identifiers may differ in your environment):
ethtool -K eth0 lro off
ethtool -K eth1 lro off
If Integrated Storage is used, make sure no vdisks are in a degraded state prior to the reboot. Also, be aware that NIC identifiers can change after the HV restarts, so the best option is to map them to the corresponding MAC addresses. Some examples can be found at https://onapp.zendesk.com/entries/23245878-Custom-Config-Examples
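As a minimal sketch, a MAC-based variant of the custom config could look like the following. The MAC addresses and the "eth*" naming pattern are placeholders; substitute the permanent addresses of your bond members:
for nic in /sys/class/net/eth*; do
    dev=$(basename "$nic")
    # Match on the permanent (burned-in) MAC, since bond slaves
    # usually inherit the bond's active MAC address.
    case "$(ethtool -P "$dev" | awk '{print $3}')" in
        00:11:22:33:44:55|00:11:22:33:44:66)
            ethtool -K "$dev" lro off
            ;;
    esac
done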
5. For Static hypervisors. Add the following to the member NICs' configuration files under /etc/sysconfig/network-scripts/ (in our example, the "ifcfg-eth0" and "ifcfg-eth1" files), using the matching device name in each:
ETHTOOL_OPTS="-K eth0 lro off"    # in ifcfg-eth0
ETHTOOL_OPTS="-K eth1 lro off"    # in ifcfg-eth1
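For reference, a minimal sketch of "ifcfg-eth0" with the option in place (all other values are illustrative; keep your existing settings and only add the ETHTOOL_OPTS line):
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=none
MASTER=bond0
SLAVE=yes
ETHTOOL_OPTS="-K eth0 lro off"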
6. To apply the new configuration, we recommend restarting the HV. Please don't forget to migrate critical VMs first.
7. Check that LRO is disabled after the restart:
# ethtool -k eth0 |grep large-receive-offload
large-receive-offload: off
# ethtool -k eth1 |grep large-receive-offload
large-receive-offload: off
Additional Info
The related bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1270892