Solved

Routing configuration with two NICs

Posted on 2013-11-29
Last Modified: 2013-12-17
I made some mistakes with my subnet layout a couple of years ago but was largely able to make it all work. Now I'm trying to correct some things and change others, and I'm a bit stuck.

All servers that I'm referring to in my question are Linux and have one Ethernet interface and one Infiniband interface. My main subnet is 10.0.0.0/16 (which is too big, it should be a /20), and my Infiniband subnet is 10.0.2.0/24 (which is actually just part of 10.0.0.0/16, and that's really how it's all set up, using static IPs). Now I'm trying to move a cluster of 24 servers to a new VLAN at 10.100.0.0/24, which uses a router interface with the address 10.100.0.1, and I'm stuck keeping the Infiniband network where it is for the time being.
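
The relationship between these subnets can be checked with a minimal sketch using Python's standard `ipaddress` module. The variable names are only illustrative; the prefixes are the ones given above.

```python
import ipaddress

main = ipaddress.ip_network("10.0.0.0/16")           # current main subnet
proposed_main = ipaddress.ip_network("10.0.0.0/20")  # the size it "should be"
ib = ipaddress.ip_network("10.0.2.0/24")             # Infiniband subnet
new_vlan = ipaddress.ip_network("10.100.0.0/24")     # new cluster VLAN

# The Infiniband subnet sits entirely inside the main subnet.
print(ib.subnet_of(main))           # True
# The new VLAN does not overlap the main subnet.
print(new_vlan.overlaps(main))      # False
# Shrinking the main subnet to a /20 reduces its size...
print(main.num_addresses, proposed_main.num_addresses)  # 65536 4096
# ...but 10.0.0.0/20 spans 10.0.0.0 to 10.0.15.255, so it still contains 10.0.2.0/24.
print(ib.subnet_of(proposed_main))  # True
```

Note that shrinking the main subnet to a /20 alone would not separate it from the Infiniband range.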

Servers used:
# | eth dns name | eth ip       | ib dns name | ib ip
--+--------------+--------------+-------------+------------
1 | master       | 10.0.0.200   | master.ib   | 10.0.2.100
2 | node01.hpc   | 10.100.0.101 | node01.ib   | 10.0.2.101
3 | node02.hpc   | 10.100.0.102 | node02.ib   | 10.0.2.102

Obviously I'm trying to transition master to the new VLAN to match the others, but there are other factors keeping that from happening right away. I don't think it really matters at this point anyway, because the bigger issue is that the routing on the Infiniband subnet isn't correct.

On master "ip r" shows
10.0.0.0/16 dev eth0  scope link 
10.0.0.0/16 dev ib0  scope link 
default via 10.0.0.1 dev eth0

On node01.hpc & node02.hpc "ip r" shows
10.100.0.0/24 dev em1  scope link 
default via 10.100.0.1 dev em1  proto static

I get the following results when I do ping between the servers
from       | to         | ping
-----------+------------+----------
master     | node01.hpc | succeeds
master     | node01.ib  | fails
master     | node02.hpc | succeeds
master     | node02.ib  | fails
-----------+------------+----------
node01.hpc | master     | succeeds
node01.hpc | master.ib  | succeeds
node01.hpc | node02.hpc | succeeds
node01.hpc | node02.ib  | fails
-----------+------------+----------
node02.hpc | master     | succeeds
node02.hpc | master.ib  | succeeds
node02.hpc | node01.hpc | succeeds
node02.hpc | node01.ib  | fails

I figured at this point that the failures on the Infiniband addresses were due to a missing route for that interface, so I changed the routes on node01.hpc and node02.hpc to show
10.0.2.0/24 dev ib0  scope link 
10.100.0.0/24 dev em1  scope link 
default via 10.100.0.1 dev em1  proto static

and now I get
from       | to         | ping
-----------+------------+----------
master     | node01.hpc | succeeds
master     | node01.ib  | fails
master     | node02.hpc | succeeds
master     | node02.ib  | fails
-----------+------------+----------
node01.hpc | master     | succeeds
node01.hpc | master.ib  | fails
node01.hpc | node02.hpc | succeeds
node01.hpc | node02.ib  | succeeds
-----------+------------+----------
node02.hpc | master     | succeeds
node02.hpc | master.ib  | fails
node02.hpc | node01.hpc | succeeds
node02.hpc | node01.ib  | succeeds

This all makes me think that if the route (10.0.0.0/16 dev ib0) on master differs from that of node0{1,2}.hpc (10.0.2.0/24 dev ib0) it won't work as I need it to. Is my assumption correct?
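
One plausible reading of the second ping table is an asymmetric path: the nodes send to 10.0.2.x over ib0, but master looks up the return address and picks eth0. The sketch below is a toy model of route selection (longest prefix wins, earlier entry wins on a tie), not the kernel's actual implementation; the `lookup` helper is hypothetical and the tables are copied from the `ip r` output above.

```python
import ipaddress

def lookup(table, dst):
    """Toy route lookup: longest prefix wins; on a tie the earlier entry wins."""
    addr = ipaddress.ip_address(dst)
    matches = [(ipaddress.ip_network(prefix), dev)
               for prefix, dev in table
               if addr in ipaddress.ip_network(prefix)]
    # max() returns the first maximal element, so table order breaks ties.
    return max(matches, key=lambda m: m[0].prefixlen)[1] if matches else None

# Route tables as (prefix, device); "default" is written as 0.0.0.0/0.
master = [("10.0.0.0/16", "eth0"), ("10.0.0.0/16", "ib0"), ("0.0.0.0/0", "eth0")]
node01 = [("10.0.2.0/24", "ib0"), ("10.100.0.0/24", "em1"), ("0.0.0.0/0", "em1")]

request_dev = lookup(node01, "10.0.2.100")  # node01 -> master.ib
reply_dev = lookup(master, "10.0.2.101")    # master -> node01.ib
print(request_dev, reply_dev)               # ib0 eth0

# Node to node, both ends agree on ib0, which matches the pings that succeed.
print(lookup(node01, "10.0.2.102"))         # ib0
```

Under this model the request and the reply leave on different fabrics, which is consistent with the node-to-master.ib pings failing while node-to-node Infiniband pings succeed.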

The main problem for me is that I have other servers with the same routes as master, and I have to transition them in blocks because they're all production servers. Ideally I'm way out in left field and there's an easy solution.

If you made it this far thanks for reading and if I could I'd award a heck of a lot more than 500 points for a fix.
Question by:coanda

Expert Comment

by:giltjr
Double check how your NICs are configured. You stated that the route table on "master" shows:

10.0.0.0/16 dev eth0  scope link
10.0.0.0/16 dev ib0  scope link
default via 10.0.0.1 dev eth0

This shows that ib0 has a /16 subnet mask. If that is true, then Linux believes it has two interfaces on the same subnet and will only use the first interface, in this case eth0.

You need to make sure that ib0 has a /24 instead of a /16.
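
As a rough illustration of why the mask matters, the sketch below derives the connected network from an address and prefix length using Python's `ipaddress` module. The ib0 address with a /16 prefix is an assumption inferred from the route table above, not taken from the actual interface configuration.

```python
import ipaddress

eth0 = ipaddress.ip_interface("10.0.0.200/16")
ib0_as_configured = ipaddress.ip_interface("10.0.2.100/16")  # assumed from "ip r"
ib0_with_24 = ipaddress.ip_interface("10.0.2.100/24")        # suggested mask

print(eth0.network)               # 10.0.0.0/16
print(ib0_as_configured.network)  # 10.0.0.0/16
print(ib0_with_24.network)        # 10.0.2.0/24

# With a /16 on both, the two interfaces claim the identical connected network.
print(ib0_as_configured.network == eth0.network)  # True
```

With a /24 on ib0 the connected route becomes 10.0.2.0/24, which is more specific than 10.0.0.0/16 and no longer identical to the eth0 route.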

Author Comment

by:coanda
The Infiniband interface is on 10.0.2.0/24. The only way I could make it work at all was by putting it on 10.0.0.0/16, which I realize is the wrong way to set up multiple adapters. It didn't end up mattering, though, because it just meant that IPoIB traffic ignored the interface, and in this case the Infiniband adapters are really only meant to use RDMA, which functions more like UDP.

Accepted Solution

by:
giltjr earned 500 total points
If you want traffic to flow over the ib interfaces, the simplest fix would be to move them to a subnet outside 10.0.0.0/16.

With two interfaces in the same subnet, the OS will use the first interface it sees, which will normally be the Ethernet interface.
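
A candidate Infiniband subnet can be checked against the ranges already in use before any addresses are changed. This is a minimal sketch; the `conflicts` helper and the candidate 10.200.2.0/24 are hypothetical examples, not values from this thread.

```python
import ipaddress

# Subnets already in use, taken from the question.
in_use = [ipaddress.ip_network("10.0.0.0/16"),
          ipaddress.ip_network("10.100.0.0/24")]

def conflicts(candidate, existing):
    """Return the existing subnets that overlap the candidate prefix."""
    cand = ipaddress.ip_network(candidate)
    return [str(net) for net in existing if cand.overlaps(net)]

print(conflicts("10.0.2.0/24", in_use))    # ['10.0.0.0/16']
print(conflicts("10.200.2.0/24", in_use))  # []  (hypothetical candidate)
```

An empty list means the candidate does not collide with either the main subnet or the new VLAN.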

Author Closing Comment

by:coanda
Completely forgot that I had this question open. What you put in your comment is exactly what I ended up doing.