Hello experts,
We have a new RHEL 5 Linux machine that is to replace our production machine. We configured bonding (mode 0, round robin) on the machine, with the bond as the interface and eth0 and eth2 as its slaves. After this configuration we see heavy packet loss when pinging from and to the machine. Moreover, when we disconnect the eth0 network connection we lose communication with the machine, whereas disconnecting eth2 while eth0 is connected changes nothing and we can still reach the machine.
We also saw that there is a bug with round robin and bonding in RHEL 5 (http://kbase.redhat.com/f
Attached are the configuration files we use and the process we followed to configure the bonding. We would appreciate any help.
Thanks
This question has been solved and verified by the asker.
According to this document, we don't have to do anything on the switch:
Switch Configuration
For this section, "switch" refers to whatever system the bonded devices are directly connected to (i.e., where the other end of the cable plugs into). This may be an actual dedicated switch device, or it may be another regular system (e.g., another computer running Linux).
The active-backup, balance-tlb and balance-alb modes do not require any specific configuration of the switch.
Please note that you're using mode=1, which is active-backup mode in the Linux bonding module.
The active-backup mode (mode=1) will treat ARP lookups in a way that may confuse many switches.
If you have the option, I would recommend that you set up LACP/802.3ad on the switch and use mode=4 as a module parameter instead of mode=1.
Just my two cents.
Dell now tells me there is a bug with bonding in RHEL 5.3 that only 5.4 will fix. They tried getting me the latest drivers and even firmware, but to no avail.
Is there anyone out there who successfully uses bonding in RHEL? Which version, and which bonding mode? Did you verify that the bonding works (by disconnecting cables, etc.)?
I will try some of these ideas, but my switches are production switches, so I can't do this at just any time.
Everyone,
latest updates.
We got it to work using the latest driver patches from Dell technical support, and we verified that when one cable is disconnected the other interface picks up, so client software is almost never aware of it. The configuration file is very much standard; we are using active-standby mode.
HOWEVER, we found that when we use the machines for a prolonged period the following problems occur, and they all disappear once bonding is disabled:
- When running a continuous ping from any client, there is some packet loss; we never get 0% lost packets.
- When opening SSH sessions, from time to time the session gets disconnected, even if keep-alive is defined in the SSH client.
- When writing to or reading from folders mounted over NFS (the NFS server is a NetApp device), there is random delay, and the percentage of disconnected SSH sessions is higher.
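To put a number on the packet loss described in the first symptom, the summary line that ping prints at the end of a run can be parsed. This is only a rough sketch: the function name ping_loss is made up for illustration, and it assumes the usual Linux (iputils) summary format "N packets transmitted, M received, X% packet loss".

```shell
#!/bin/sh
# ping_loss (hypothetical helper): read ping output on stdin and print
# the packet-loss percentage taken from the summary line.
# Assumes a summary line containing "X% packet loss", as iputils ping prints.
ping_loss() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Example use (192.0.2.10 is a placeholder address, not from this thread):
#   ping -c 100 192.0.2.10 | ping_loss
```

Running this with bonding enabled and again with bonding disabled gives two figures that can be compared directly.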
OK, finally solved!
The real issue was that, even though we configured the bonding to use active-standby mode (1), the bonding was still operating in round-robin mode. You can see that by typing
cat /proc/net/bonding/bond0 OR cat /sys/class/net/bond0/bondi
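For anyone who wants to script this check, here is a minimal sketch that pulls the "Bonding Mode:" line out of /proc/net/bonding/bond0. The function names bond_mode and is_active_backup are made up for illustration; the only thing taken from the system is the "Bonding Mode:" line that the bonding driver writes to that file.

```shell
#!/bin/sh
# bond_mode (hypothetical helper): print the mode the kernel reports.
# The status file defaults to /proc/net/bonding/bond0 and contains a line like
#   Bonding Mode: fault-tolerance (active-backup)
# or
#   Bonding Mode: load balancing (round-robin)
bond_mode() {
    status_file="${1:-/proc/net/bonding/bond0}"
    sed -n 's/^Bonding Mode: //p' "$status_file"
}

# is_active_backup (hypothetical helper): succeed only if the reported
# mode is active-backup, whatever the configuration files say.
is_active_backup() {
    case "$(bond_mode "$1")" in
        *active-backup*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Something like `is_active_backup || echo "bond0 is NOT in active-backup mode"` after bringing the bond up would have caught the mismatch between the configured and the running mode.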
To force the bonding to active/standby, shut down the bonding, then type this:
echo 1 > /sys/class/net/bond0/bondi
Then run the "normal" commands to start bonding. Works like a dream!
To verify the bonding mechanism, I constantly monitored the MAC addresses on the switches and verified that at any given moment only one port was broadcasting the MAC address of the bond. When I forced that port to shut down, the port on the other switch started broadcasting the same MAC address.
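The same failover can also be watched from the host side. In active-backup mode, /proc/net/bonding/bond0 includes a "Currently Active Slave:" line; the sketch below just extracts it. The function name active_slave is an assumption for illustration, not part of any tool.

```shell
#!/bin/sh
# active_slave (hypothetical helper): print the interface that currently
# carries traffic for an active-backup bond, e.g. "eth0".
# Reads /proc/net/bonding/bond0 unless another status file is given.
active_slave() {
    status_file="${1:-/proc/net/bonding/bond0}"
    sed -n 's/^Currently Active Slave: //p' "$status_file"
}

# Example: print the active slave once per second while pulling a cable.
#   while sleep 1; do active_slave; done
```

If the printed interface switches from one slave to the other when the cable or switch port goes down, the host agrees with what the switches show.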
by: mrworta, posted on 2009-07-26 at 06:53:41, ID: 24945780
Your config looks good in general, but what about the switch you're using? Did you configure it for 802.3ad?