rfr1tz

asked on
Multiple NIC cards, single IP address for redundancy

I want to make my Solaris server's network connections redundant. I have a redundant network, but the redundancy ends at the server's single NIC card.

So I am looking for a solution here. Conceptually, I need 2 NIC cards with the same IP address. When one fails, the other NIC should take over. Other solutions may be acceptable, but no manually changing wires, rebooting, etc. It's got to be automated. Another possible solution would be a single card with 2 ports.
PsiCop
A single NIC card with multiple ports (for example, the Quad Fast Ethernet, or QFE, board, which has 4 ports) still has the board itself as a single point of failure. If you're really looking for redundancy/fault tolerance, you should go with two separate NICs.

What you want is a Failover configuration. I know how to do this in NetWare, but I honestly don't know if Solaris supports Failover functionality (I'd find it difficult to believe that it didn't).
SOLUTION
ocon827679
ASKER CERTIFIED SOLUTION
Wow, did you not get laid last night or what!  Don't mean to step on your toes here blabs, but a simple solution to this question is round-robin DNS.  I'm not saying that a good failover strategy isn't worth investigating, I'm only offering a simple solution. (Please note the operative term - SIMPLE SOLUTION)

Now let's look at how round-robin could work.  You need to get to the server and a DNS request is sent.  The DNS request is resolved to NIC1 in the server.  All work that you are doing goes through NIC1.  I need to get to the server and send a DNS request; this time NIC2 is resolved due to round-robin.  All my work will be accomplished through NIC2.  Rock and Roll, everyone is happy.  But just as Nirvana only exists on LPs (oops, CDs nowadays), poop occurs.
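
To make that concrete, here's a rough sketch of the round-robin half in a BIND-style zone file.  The name, TTL and addresses are made up for illustration; the point is simply two A records for one name, with a shortish TTL so clients re-resolve now and then:

; two A records for the same name - the name server rotates the order it returns them
myserver    300    IN    A    192.168.1.11    ; reaches the server via NIC1
myserver    300    IN    A    192.168.1.12    ; reaches the server via NIC2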

NIC1 fails for some unknown reason.  You were working through NIC1 and your task is now dead.  Sorry dude, but as I said before poop occurs.  I'm still running, all fat and happy, think that I'll have a beer and toast your lack of open mindedness.  (Sorry man, didn't mean to flame you there, but you deserved it!)  

So you start your job over again.  Your cache says to use NIC1, but you can't get there.  So another DNS request is sent and voila, NIC2 is resolved and operational.  Rock and Roll, we're both fat and happy again.  Meet me in the bar and I'll buy you one!

By the way, in the future I'd prefer to be referred to as Donald Duck!  :-)
ocon,

I can see that you're obviously a Windoze admin now trying to learn a real OS.

You said "So you start your job over again.  Your cache says to use NIC1, but you can't get there.  So another DNS request is sent and voila, NIC2 is resolved and operational."

WHY would another request get sent? If the workstation looks in its cache and the data it got has not yet expired, then WHY would it send another request to the DNS server? And even if it did, there is NO guarantee it would get NIC2. Do you have ANY clue as to how DNS works? Real DNS, not the Redmond garbage.

Keep up your attitude and personal attacks ("Wow, did you not get laid last night or what!") and I'll be calling you a former EE user, not Donald.
SOLUTION

SOLUTION
rfr1tz

ASKER
First off, this is a brainstorming session. OK, the DNS idea isn't good, but I've thought about a solution that uses DNS before too. Sometimes, the original idea isn't so good, but it can be modified by someone else to get a really good solution.

I've got a call in to a vendor. I don't see how any automatic switch-over can work since, as PsiCop noted, DNS caching makes the client keep hammering away at the dead NIC. I'll tell you what the vendor says.

How long is the time-out for a DNS entry in a host anyway?

And if the DNS sees two IP's for the same name, is the normal operation to do round-robin? (Just curious, PsiCop).

This is the generic problem: If you've got a server on network #1 and the backup server on network #2, it's not easy to shift over to the backup server. Maybe this is the question I should have asked.

http://wwws.sun.com/software/solaris/ds/ds-netmultipath/

Sun Solaris IPMP (Shipping with all installs since 2000) will do exactly what you want...

-Ryan
Solaris does indeed ship with MPXIO, which is an integrated multipathing application.  It runs the mpathd daemon, and allows you to team your interfaces for failover.  You could also use a third party application, such as Veritas Dynamic Multi Pathing (VxDMP).

As for the hardware, you can use a Sun QFE (Quad Fast Ethernet) or ZNYX's multi-ported adapters (www.znyx.com) and their software for failover.
I was not aware you could use MPXIO or VXDMP to multipath network interfaces... The link I sent is specifically for IPMP...

-Ryan
SOLUTION

SOLUTION
Ah, IP Multipathing. I've never had a Solaris machine with multiple NICs - I didn't know that's what it's called in Solaris.

Since the Asker did not specify a Solaris version, I note that IPMP is not available before Solaris v2.8. And v2.8 only includes partial support. It is not fully supported until v2.9.
It has been fully supported in 8 in all releases since 2000... I think 8 was released in 2001... Pretty nice config, works great... I think it was limited in 7... they switched from trunking to IPMP just after 6...
-Ryan
According to Sun, it's only partially supported in v2.8.

I'm referencing the Sun Solaris Family Comparison Chart at http://wwws.sun.com/software/solaris/fcc/fcc_pfv.html

What you want is Sun Trunking 1.3. $995. It allows you to trunk 2 or more supported NICs together. It combines the NICs into a trunk and represents them as a single MAC address. Note: Your switch also has to support this. Any datacenter-size switch surely does.

I currently have 2 X1150A (copper gigabit) cards in this same config. Not only does it provide seamless failover, it also does IP load balancing. You can actually watch the packet distribution with the 'nettr' command.

If this is not an option for you, you can use what I call a "Poor Man's Failover": cron a script or have a daemon monitor your NIC(s). If one fails, plumb your backup interface (or attach a logical IP to an existing interface) and bring up the IP on that NIC. It may be 30 seconds slower than an expensive Veritas solution, but it's free!
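
To give a feel for it, here's a rough sketch of that poor-man's approach, meant to run from cron every minute or so.  The interface names, service IP and gateway are placeholders, and a real version would want locking and better sanity checks:

#!/bin/sh
# Poor Man's Failover sketch - hme0 carries the service IP, hme1 is the cold standby.
PRIMARY=hme0
BACKUP=hme1
SERVICE_IP=192.168.1.10
GATEWAY=192.168.1.1

# If the backup is already plumbed, we have presumably failed over already.
ifconfig $BACKUP >/dev/null 2>&1 && exit 0

# If the gateway still answers (5 second timeout), assume the primary path is healthy.
ping $GATEWAY 5 >/dev/null 2>&1 && exit 0

# Otherwise pull the service IP off the primary and bring it up on the backup NIC.
ifconfig $PRIMARY down
ifconfig $PRIMARY unplumb
ifconfig $BACKUP plumb $SERVICE_IP netmask + broadcast + up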

Sun Trunking is the way to go if you have the supported hardware. It was very easy to install and configure.

Last suggestion would be to simply give your Sun box two or more IP addresses on 2 or more NICs (touch /etc/notrouter) and go the DNS route, along the lines of the sketch below. (I never really like to rely on DNS or anything else that is out of my immediate control)
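
For example (interface names and hostnames are made up; each name also needs its own entry in /etc/hosts and in DNS):

echo "myhost-a" > /etc/hostname.hme0
echo "myhost-b" > /etc/hostname.hme1
touch /etc/notrouter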

Hope this helps, and I hope I didn't repeat what someone said above; I am tired and quickly browsed the other replies.

-CC
rfr1tz

ASKER
I have talked to the vendors and they say that the "redundant network" problem can be solved using the VRRP protocol (or the HSRP protocol if you love Cisco proprietary protocols).
The following example will set up 2 hme NICs with failover working for either card.
Notes: Some Solaris 8 & 9 releases require patching. Vanilla Solaris 9 requires patching.
       IPMP requires a default gateway to be set. (A remote host to ping to ensure things are working)
       Clients must connect to the logical interfaces, not the physical.
       Local MAC addresses must be used.   (# eeprom local-mac-address?=true)
       Load balancing is performed on a connection basis, not per packet.
       When the failover occurs you will see the logical address of the failed card or link appear on the working card.
       Can be made permanent using /etc/hostname.<interface> files.
       Applications should not bind to the physical IP addresses.


ifconfig hme0 plumb 192.168.10.215 group test deprecated -failover netmask + broadcast + up
            Configure IPMP failover on hme0 (Physical Test interface)
ifconfig hme0 addif 192.168.10.216 netmask + broadcast + up
            Add a logical interface to hme0
ifconfig hme1 plumb 192.168.10.217 group test deprecated -failover netmask + broadcast + up
            Configure IPMP failover on hme1 (The other Physical Test interface)
ifconfig hme1 addif 192.168.10.218 netmask + broadcast + up
            Add a logical interface to hme1



RLopez
RLopez has posted the correct solution to this; let me just clarify it a bit, as it's not a simple topic:

You must have two physical NICs, each with its own IP address; these are called the test IPs.

You need an application IP; this is the only IP that should be addressed by users of the application/DNS.

You set up the test IPs like normal in your /etc/hostname.<interface> files as such:

~ <106>$  cat /etc/hostname.ce2
myhost netmask + broadcast + group myhost-pub up
addif myhost-ce2 netmask + broadcast + deprecated -failover up

~ <107>$  cat /etc/hostname.ce5
myhost-ce5 netmask + broadcast + deprecated -failover group myhost-pub up

myhost, myhost-ce2, and myhost-ce5 all exist in the /etc/hosts file with unique IP addresses.
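
For illustration only (these addresses are made up), the matching /etc/hosts entries would look something like:

192.168.10.10   myhost        # application/data IP - the name to publish in DNS
192.168.10.11   myhost-ce2    # test IP on ce2
192.168.10.12   myhost-ce5    # test IP on ce5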

The two test interfaces myhost-ce2 and myhost-ce5 are marked as deprecated and -failover; the deprecated flag means put me at the bottom of the list of possible source interfaces (basically don't use me unless I'm the only local interface) and the -failover flag means don't fail me over.

The "group myhost-pub" makes this interface part of an IPMP (Internet Protocol MultiPathing) group named myhost-pub, standing for public network.

How does it work:
In most configurations it will end up using the default router as its "ping partner". If for some reason the test interface can't get an ICMP echo reply from the default router, it marks the interface as DOWN.  If it fails on the next try (every 10 seconds by default), it will move every address (except those marked with -failover) off of the interface with the failed test IP to the next physical interface in the IPMP group.  By default failing back is turned off, so the application IP will remain on the secondary NIC.

If you'd like to modify the timeout, I recommend looking at /etc/default/mpathd and making sure that the timeout matches your network.  If you have your Ciscos in a high-availability configuration, you should probably change FAILURE_DETECTION_TIME=10000 to FAILURE_DETECTION_TIME=30000 so that it's only looking for failures every 30 seconds.  This will give the Ciscos a full minute to correct the network problems before IPMP moves the network to the next interface (this is mostly significant with a cluster).
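
For reference, here's a sketch of what /etc/default/mpathd might look like with that change applied (the comments are mine; check the ones shipped in your copy):

# Failure detection time in milliseconds (10000 as shipped; 30000 per the suggestion above)
FAILURE_DETECTION_TIME=30000
# Whether addresses move back to the original NIC once it recovers
FAILBACK=no
# Only monitor interfaces that have been placed in an IPMP group
TRACK_INTERFACES_ONLY_WITH_GROUPS=yes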

I hope this helps...

BTW, if you don't have /etc/defaultrouter defined then IPMP will first try doing a broadcast for ALL ROUTERS, and it will pick one of the routers that responds as its ping partner, and if that doesn't work it will then do an ALL WORKSTATIONS broadcast, and then try that.

You can also configure this even with only one NIC in your machine so that you can at least monitor the uptime of the network; it will not bring down an interface if it has nowhere to move it to....

You really should look at Sun Trunking, it is a much cleaner and more robust approach to what you are doing. As a second choice I would go with an IP load balancer. Both methods are much more robust and much simpler to implement. Sun Trunking allows round-robin to up to 4 interfaces on the fly. If a NIC fails, you don't even get a hiccup. At $995 it is a steal for what it delivers.

-rhugga
Re Anacreo's IPMP commands:
cat /etc/hostname.ce2
myhost netmask + broadcast + group myhost-pub up <-"THIS WILL MOVE ON FAILURE, NOT KOOL!!"
addif myhost-ce2 netmask + broadcast + deprecated -failover up <-"THIS IS NOT RECOMMENDED"

This would be the better way to configure ce2
cat /etc/hostname.ce2
myhost-ce2 netmask + broadcast + group myhost-pub deprecated -failover up
addif myhost netmask + broadcast + up

ce5 below is fine
~ <107>$  cat /etc/hostname.ce5
myhost-ce5 netmask + broadcast + deprecated -failover group myhost-pub up  <-"THIS IS FINE"



If you want a reliable IPMP config, DON'T make the virtual interface the "TEST" address.
The "-failover" option prevents the address from moving and makes the interface the "TEST" address.
I have seen IPs stacking up on the working interface, with duplicate IPs, if the physical address is made to move.
Some of Sun's docs are wrong and some are correct.
The official word re IPMP is "The Test address can be on the physical or virtual interface. Putting the test address on the physical interface is preferred."
The host's name should be on the virtual address, not the physical.
This is the name to advertise to other hosts.
DO NOT advertise the physical address, as when the NIC fails clients will lose the connection.
They will connect on this virtual address and if failure occurs the address will float over onto the alternate NIC.
So, on the revised config above, what was ce2:1 now moves to the ce5 NIC and becomes interface ce5:1.
In my previous post, hme0:1 floats over on failure and becomes hme1:2.

I hope this will settle this topic once and for all.
Well I agree with you 100%...  And this is what I originally had done but Sun Professional Services overruled me...

Although the way I described will look kind of funky (the interfaces' virtual numbers will change in a failover, and one node will end up with a NIC that has no non-aliased interface and will never get it back), this is how a multitude of Sun engineers have configured our servers.  I believe this may have something to do with our Sun Cluster requirements, so I decided to post the way Sun PS sets it up...

I agree with you though and not knowing what Sun PS may or may not know I'd much rather do it the way you suggest.

To rhugga: Sun Trunking may be a steal at $995.00, but IPMP is a bigger steal at $0.00!  And IPMP is VERY simple and VERY effective once you "catch" on to what it's actually doing...  Pinging its "ping partner" and moving anything not marked -failover to other NICs in its group if it can't be reached after two attempts.  Free, Simple and Well worth using...
Hi,

One thing that does not shine out from these discussions is how ipmpd achieves load balancing.  If we consider inbound datagram distribution first: when ipmpd receives an IP datagram from an application whose transport layer has not forced a source address to be used, ipmpd selects a source address from the pool of (non-deprecated) interfaces associated with that group.  Since the other host involved in the communication is unaware of this sleight of hand, it responds to the NIC associated with that source IP, and this is how incoming traffic is distributed across the range of available interfaces.  Note that it is expected behaviour to see IP datagrams being issued from one interface whilst having a source address associated with another interface in that group.

For outbound datagram distribution, ipmpd checks the IRE cache for an entry to see which outbound interface to use.  If no entry is found, then one must be created and ipmpd employs a round-robin algorithm to determine which interface of the group to use.  This interface is used for the duration of the IRE cache interval.  Note distribution is achieved across a range of destination addresses - that is, if only one destination address is involved in the communication then only one routing record will exist and consequently only one interface will be used.
I like the answers above using IPMP, but I think 1 downside point should be mentioned - IPMP "costs", at a minimum, 2 additional IP addresses! If you have a large network, with many subnets, and you are trying to install 10 to 20 servers on 1 subnet, this can be a heavy hit on the number of IPs available.

Does anyone have a solution using IPMP, or something else, where we can use 1 IP with 2 NICs? I'm thinking of something, maybe Sun Trunking, where the IP & MAC are "taken over" by the secondary NIC.
Sun Trunking is the solution you want. The IP and MAC aren't really taken over by the other. It is a 'trunk'. (Think of it in EE terms) The switch needs to support port aggregation and you must have local-mac-address? set to true on the host side. For $700 a pop it more than pays for itself. I've seen far too many times where a site threw a larger, more expensive system at a problem when the original system had the bandwidth to handle more network I/O. Just invest in another NIC and Sun Trunking. For $2000 you just increased throughput significantly. (Without inflating software & support costs in the process. Think of the price change for vxvm/vxfs when you move up a tier.)

However, you are only concerned with redundancy. With Sun Trunking you get that plus the added benefit of performance. Also, it is not failing over as it would with IPMP; Sun Trunking does round-robin load balancing. In the event one NIC fails, it simply uses the other NIC(s). Much more transparent and more robust a solution imho.

I use it now in a backup infrastructure that handles 1.5 petabytes of backups a week using lowly Sun E6500s and V1280s.

-cc
My internal customer does not need a bigger "pipe", they just want to eliminate the SPF, and from what I understand, trunked NICs must all go to the same network switch.

With 2 NICs & 2 switches (on the same subnet), IPMP pushes the SPF farther out, and even adds a small amount of throughput.

Just my 2 cents...thanks

Yea, if they are talking about redundant switches that is a different issue. Generally I am willing to accept the switch itself as a SPF, since the cost of going redundant at that level skyrockets. Plus you can usually put your ports on separate blades in the switch, and then you basically have a SPF that is the switch chassis, which has a very low chance of failure. (The blades do as well, but the switch chassis is even lower.)

This customer must be looking at serious lost revenue to get that devoted to eliminating SPFs. Most sites will go redundant at the distribution layer, but in the data center I work in, we have over 100 Cisco 650x class switches, and going redundant with every switch would be much more costly than the revenue lost to downtime. (Plus the power consumption of these is quite large, so you'd be doubling the power and cooling for those, as well as support costs, as well as consuming valuable floor space.) Then you're talking about doubling your patch panels, cabling, etc.. ugh.  It might be more feasible to keep a spare switch on site and have a process that quickly loads 'X' config onto the switch, and accept maybe 1 hour of downtime.