Issue
After creating an LACP bond (mode 4) using 2 or more NICs (max 4), all traffic appears to go through 1 interface instead of being spread across the bonded interfaces.
The result is slow (or lower than expected) network performance on the network where the bond is in place. We usually see this when 4 LACP-bonded NICs are used for the OnApp Storage Network.
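To confirm that the bond is actually running in 802.3ad (mode 4) and that all NICs are enslaved, the kernel's bonding status can be inspected. This is only a sketch; the bond name onappstorebond is the one used in the resolution example below, so replace it with your own:
# Check the bonding mode and the enslaved interfaces
grep -E "Bonding Mode|Slave Interface" /proc/net/bonding/onappstorebond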
Troubleshooting
Running iperf shows that the maximum throughput is limited to the bandwidth of a single NIC.
# Executed iperf -c 10.200.4.254 -N -P 4 -M 9230 on the client side
[root@test ~]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 4] local 10.200.1.254 port 5001 connected with 10.200.4.254 port 37718
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 1.15 GBytes 989 Mbits/sec
[ 5] local 10.200.1.254 port 5001 connected with 10.200.4.254 port 37721
[ 4] local 10.200.1.254 port 5001 connected with 10.200.4.254 port 37724
[ 6] local 10.200.1.254 port 5001 connected with 10.200.4.254 port 37722
[ 7] local 10.200.1.254 port 5001 connected with 10.200.4.254 port 37723
[ 6] 0.0-10.0 sec 382 MBytes 319 Mbits/sec
[ 5] 0.0-10.1 sec 368 MBytes 307 Mbits/sec
[ 4] 0.0-10.1 sec 295 MBytes 246 Mbits/sec
[ 7] 0.0-10.1 sec 137 MBytes 114 Mbits/sec
[SUM] 0.0-10.1 sec 1.16 GBytes 987 Mbits/sec
[ 8] local 10.200.1.254 port 5001 connected with 10.200.4.254 port 37730
[ 8] 0.0-10.0 sec 1.15 GBytes 989 Mbits/sec
[ 4] local 10.200.1.254 port 5001 connected with 10.200.4.254 port 37735
[ 4] 0.0-10.0 sec 1.15 GBytes 984 Mbits/sec
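To confirm that all parallel streams really leave through a single slave, the per-interface counters can be watched while iperf is running. The interface names below are only examples; use the actual slaves of your bond:
# Watch per-slave traffic counters; with the problematic hash policy only one
# slave's RX/TX byte counters should be increasing during the iperf run
watch -n 1 'ip -s link show eth2; ip -s link show eth3'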
Cause
As per https://www.kernel.org/doc/Documentation/networking/bonding.txt, mode 4 utilizes all slaves in the active aggregator. Slave selection for outgoing traffic is done according to the transmit hash policy:
802.3ad or 4
IEEE 802.3ad Dynamic link aggregation. Creates aggregation groups that share the same speed and duplex settings. Utilizes all slaves in the active aggregator according to the 802.3ad specification.
Slave selection for outgoing traffic is done according to the transmit hash policy, which may be changed from the default simple XOR policy via the xmit_hash_policy option, documented below. Note that not all transmit policies may be 802.3ad compliant, particularly in regards to the packet mis-ordering requirements of section 43.2.4 of the 802.3ad standard. Differing peer implementations will have varying tolerances for noncompliance.
Prerequisites:
1. Ethtool support in the base drivers for retrieving the speed and duplex of each slave.
2. A switch that supports IEEE 802.3ad Dynamic link aggregation. Most switches will require some type of configuration to enable 802.3ad mode.
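For a single client/server pair, the default policies hash on values that never change between those two hosts (MAC addresses for layer2, MAC plus IP addresses for layer2+3), so every TCP stream maps to the same slave. A simplified sketch of the idea, with example values (the exact formulas are documented in bonding.txt):
# Simplified layer2-style hash: XOR of the MAC addresses modulo the slave count.
# For a fixed client/server pair the inputs never change, so the result is
# always the same slave, no matter how many parallel streams are opened.
SRC_MAC_BYTE=0x1b    # example: last byte of the source MAC
DST_MAC_BYTE=0x2c    # example: last byte of the destination MAC
SLAVE_COUNT=4
echo $(( (SRC_MAC_BYTE ^ DST_MAC_BYTE) % SLAVE_COUNT ))    # same index every time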
Depending on your switch configuration, you will have to:
1. Check your switch configuration to make sure the EtherChannel/LAG is configured correctly.
2. Check different xmit_hash_policy settings (see the commands below).
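The policy currently in use and the 802.3ad aggregator state can be read directly from the kernel (bond name as used elsewhere in this article):
# Show which transmit hash policy the bond currently uses
cat /sys/class/net/onappstorebond/bonding/xmit_hash_policy
# Verify that every slave has joined the same active aggregator
grep "Aggregator ID" /proc/net/bonding/onappstorebond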
Resolution
As an example, we changed the xmit_hash_policy here from layer2+3 to layer3+4:
[root@test ~]# ifdown onappstorebond
[root@test ~]# echo "layer3+4" > /sys/class/net/onappstorebond/bonding/xmit_hash_policy
[root@test ~]# ifup onappstorebond
[root@test ~]# iperf -s
[root@test1 ~]# iperf -c 10.200.1.254 -P 4 -M 9000
WARNING: attempt to set TCP maximum segment size to 9000, but got 536
WARNING: attempt to set TCP maximum segment size to 9000, but got 536
WARNING: attempt to set TCP maximum segment size to 9000, but got 536
WARNING: attempt to set TCP maximum segment size to 9000, but got 536
------------------------------------------------------------
Client connecting to 10.200.1.254, TCP port 5001
------------------------------------------------------------
[ 4] local 10.200.4.254 port 38032 connected with 10.200.1.254 port 5001
[ 6] local 10.200.4.254 port 38034 connected with 10.200.1.254 port 5001
[ 5] local 10.200.4.254 port 38033 connected with 10.200.1.254 port 5001
[ 3] local 10.200.4.254 port 38031 connected with 10.200.1.254 port 5001
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 1.15 GBytes 989 Mbits/sec
[ 6] 0.0-10.0 sec 1.15 GBytes 991 Mbits/sec
[ 5] 0.0-10.0 sec 1.15 GBytes 984 Mbits/sec
[ 3] 0.0-10.0 sec 1.15 GBytes 986 Mbits/sec
[SUM] 0.0-10.0 sec 4.60 GBytes 3.95 Gbits/sec
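Note that writing to /sys does not persist across reboots. On RHEL/CentOS-style systems the same policy can also be set in the bond's ifcfg file via BONDING_OPTS; the snippet below is only an example, so keep any bonding options you already use and adjust the path to your distribution:
# /etc/sysconfig/network-scripts/ifcfg-onappstorebond (example, RHEL/CentOS style)
DEVICE=onappstorebond
BONDING_OPTS="mode=4 miimon=100 xmit_hash_policy=layer3+4"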
Note that layer3+4 is not 100% 802.3ad compliant. Please check whether your application handles out-of-order packet delivery correctly.
Comments
I'm kind of surprised that OnApp would choose a bonding method that isn't supported by most if not all Juniper EX/Cisco Catalyst switches.
Ken, which method are you referring to? Possibly their default balance-rr, I think? Mode 4/LACP/802.3ad is pretty well supported on Juniper EX/Cisco switches as far as I have seen.
balance-rr isn't supported without using cheaper Netgear switches, from what I've seen. Even if you do get it working, you're at risk of high load on your switch. At least that has been the case with Juniper EX. 802.3ad is supported, but OnApp will tell you that it's not recommended. With that said, I've determined that it's not really OnApp's choice, since 802.3ad has compliance limitations.
I was only intending to use it temporarily, so I just moved on. Is there a fully tested and proven instance running 802.3ad on an EX/Catalyst? And without implications such as high load at the switch?
Ken, I'm not sure what else you are running on your switch, but even without OnApp I have done LACP/802.3ad on Cisco 3550/3560G and Juniper EX2200, 3200, and 3300 switches and never had any problems with high CPU or instability. We had no problems using 802.3ad for our storage network on a test OnApp setup that has been running for 6-8 months.
I do find it odd that OnApp invests heavily in their integrated storage but chooses to go down the path of not publishing any recommended ways to set up your network to use it effectively. They tell you in their guides that you need more than 1Gbps and that you should bond, but it's not supported. I understand why, but it doesn't make it any easier for those trying to set it up.
Correction: the run-up in CPU load could occur when using balance-rr due to flapping.
As for 802.3ad, we configured it as we configure all of our LAGs with LACP, since we use it on our iSCSI SANs, which work flawlessly. Our position was that when OnApp tells us it's "not recommended" and it's related to disk I/O, it's not something we wanted to take chances with, and we didn't have a long-term test environment.
So it's either balance-rr with a switch you're not likely to want to use in production, or... use a LAG protocol that they put caution tape on. Even this article, which is useful, has a big question mark at the end, and you're the first person (I've not looked that hard) who has said with confidence that it's working.