Win2k3 Routing gives wrong source IP when sending packets to a narrow netmask range with multiple overlapping netmasks

I have trouble with Windows 2003 Server (and perhaps XP/others, but probably not win7/server 2008) where if I have two IPs on my network card set up as follows:

10.0.0.8/8 (netmask 255.0.0.0)
10.222.0.8/24 (netmask 255.255.255.0)

Depending on the order of the entries in the IP list for that card, the operating system may use the incorrect IP as the source address.

For example, when I then attempt to connect to 10.222.0.5, the computer sends the packets with a destination IP address of 10.222.0.5, as it should, but a source IP address of 10.0.0.8. That source is, of course, outside the netmask range of the target 10.222.0.5, so the receiving host drops the packets: its IP/netmask of 10.222.0.5/255.255.255.0 excludes any IP in the 10.0.x range.
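To make the mismatch concrete, here is a small Python sketch (not part of the original setup; the helper name on_link is made up for illustration) that checks whether a source address falls inside the subnet the target host is configured with:

```python
import ipaddress

# Addresses taken from the example above.
target = ipaddress.ip_interface("10.222.0.5/255.255.255.0")
wrong_source = ipaddress.ip_address("10.0.0.8")
right_source = ipaddress.ip_address("10.222.0.8")


def on_link(source, iface):
    """Return True if `source` lies inside the subnet configured on `iface`."""
    return source in iface.network


print(on_link(wrong_source, target))  # is 10.0.0.8 inside 10.222.0.0/24?
print(on_link(right_source, target))  # is 10.222.0.8 inside 10.222.0.0/24?
```

Only the second address is inside the target's /24, which is why 10.222.0.8 is the source I expect the OS to pick.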

The network OS layer SHOULD make sure that when sending packets to a certain IP address that they are "from" an IP that is within the netmask of that IP address -- that's one of the purposes of having a routing table in the first place!  

Is there a solution for this? I've had trouble with it for years, always assuming that Microsoft would fix it in their "next" update - but they haven't. Maybe it's user error?

By way of background, routing tables should be evaluated in order of tightest netmask first:
10.0.0.8/255.255.255.255 -> Direct to self
10.222.0.8/255.255.255.255 -> Direct to self
10.222.0.0/255.255.255.0 -> Out an interface with an ip/mask within this range, with a source IP of 10.222.0.8
10.0.0.0/255.0.0.0 -> Out an interface with an ip/mask within this range, with a source IP of 10.0.0.8
0.0.0.0/0.0.0.0 via 10.0.0.1 -> Out an interface with an IP/mask of default gateway, source IP 10.0.0.8

It just makes no sense to send out a packet with a source address outside of the network you're sending it to. But this seems to be the default behavior of win2k3 if the wider netmask IP is listed first when you go to add IPs. Nor does it make sense to evaluate the routing table in any order other than tightest netmask first.

So, in short - How can I make win2k3 always send packets with a source IP that is within the netmask range to which it is sending, even when I have overlapping ip/netmasks on the same interface?

Thanks very much!


PS:
by "netmask range" I mean the range of IPs covered by a certain IP and netmask. For example, an IP/netmask of 10.0.0.0/255.255.0.0 covers from 10.0.0.0 to 10.0.255.255.
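As a quick illustration of that definition, this Python sketch computes the first and last address covered by an IP and netmask (the function name netmask_range is just an illustrative choice):

```python
import ipaddress


def netmask_range(address, netmask):
    """Return (first, last) address covered by the given IP and netmask."""
    # strict=False lets us pass a host address such as 10.222.0.8.
    net = ipaddress.ip_network(f"{address}/{netmask}", strict=False)
    return net.network_address, net.broadcast_address


first, last = netmask_range("10.0.0.0", "255.255.0.0")
print(first, last)
```

For 10.0.0.0/255.255.0.0 this gives 10.0.0.0 through 10.0.255.255, matching the range described above.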
rubearAsked:
 
BLipmanCommented:
I have a couple of other suggestions:

First, regarding your "hack": it really is an interesting solution, and I wouldn't discount it.  Here is a way to do what you need programmatically:
Make a batch file that executes at startup. It can first delete all IPs except for maybe your primary one, then re-add them in the order you need for proper IP handling:

netsh interface ip add address "Local Area Connection" 10.0.0.2 255.0.0.0
netsh interface ip add address "Local Area Connection" 10.0.0.3 255.0.0.0
netsh interface ip add address "Local Area Connection" 10.0.0.4 255.0.0.0
netsh interface ip add address "Local Area Connection" 10.0.0.5 255.0.0.0
netsh interface ip add address "Local Area Connection" 10.0.0.6 255.0.0.0

That will make your work much less manual (assuming your routing table is rebuilt poorly--the default order--upon reboot).  

I was hoping RRAS would enable the use of an actual routing algorithm in lieu of the broken method older versions of windows provided but, honestly, the overhead and whatnot of enabling all of that crap just to put some intelligence into a broken system may not be a good fix even if it helped.  

The thing I would try personally is using the route add command to put alternate routes in place but with slightly lower metrics.  That should force the table to select your preferred routes and exit interface IPs instead of the 'wrong' ones:

syntax:
route -p add [network address] mask [subnet mask] [gateway] metric [value]

The persistent switch should populate your routing table with persistent, static route entries.  If your table does the matching and sees that they are both the same number of bits in the mask, it will (should) look at the lower metric you assigned and use the right path/IP!
Persistent routes should...well...persist, even after rebooting.  Of course there is the option to upgrade to server 2008 but what fun is that when you might be able to fix something MS has failed to do in the last 7 years with W2K3!

Good luck!
 
BLipmanCommented:
Would you post a route print?  Your metrics should be able to override your "tightest first" assumption.  I believe lower metrics change the priority.  Why are you dual homing a Windows server anyway?  
 
rubearAuthor Commented:
> Would you post a route print?
Sure thing -- see below. Also included ipconfig's output.

> Your metrics should be able to override your "tightest first" assumption.
I don't want to override my "tightest first" assumption. Tightest first is the way it is supposed to be because that's the way it needs to be in order to work right. That's the whole point of the routing table.

> I believe lower metrics change the priority.
One of us is confused. It sure could be me - it's happened before.   But -- aren't metrics used to determine which is the better path when you have two paths to the same place, while netmasks are used to decide down which of two different routes to send a packet - routes that go to different places?  Imagine the router saying to itself "Hmm. I'm in Oregon, and this packet is for Texas, but it's quicker to send it to California, so I'll send it there..." ha ha.
Seriously, I don't see how metric could ever override netmasks.

>Why are you dual homing a Windows server anyway?
For the same reason I like to have lots of windows open - it makes me feel cool.
Always cracks me up when I ask a technical "How to" question and someone replies "Why do you want to do that anyway...!"

Since I have only one network card, one default gateway and one route to the net, is it really dual homing?

Anyway, in answer to your question of why I want to have two IPs in overlapping but different sized netmasks on one "windows server" anyway -- basically I like to take advantage of IPv4's sublime ability to allow multiple IP subnets in the same ethernet network. It's an amazing capability and it really works well in some situations.
So naturally, I would want a "windows server 2003" computer to be able to have a "presence" on more than one of these local sub-nets so that it could interact with the different computers/servers/routers which are on various different subnets.

Thus, the main IP is 10.0.0.8/8, but one of the subnets it needs to communicate with is much narrower (10.222.0.x/24) so of course this box needs an IP inside the 10.222.0.x range in order to communicate with things in that range. Furthermore, this box needs to use as its source IP 10.222.0.8 when communicating with things inside the 10.222.0.x range - but that's my problem:

My box (said Windows Server 2003), when sending packets to 10.222.0.5, uses a source IP of 10.0.0.8, completely disregarding my netmasks, and completely disregarding the fact that the routing table says that traffic in the 10.222.0.x range must be, umm, in the 10.222.0.x range.

All real network operating systems work correctly. I think windows 7 and Windows Server 2008 also handle this correctly -- at least my very brief tests showed that they did.


Anyway, below is a copy of my route print and ipconfig.

At the very bottom, I've pasted what a Linux/Unix routing table looks like so you can see how it is sorted by mask tightness - tightest at top, and finally default gateway (widest) at bottom.

Basically, how it's supposed to work is when the routing code in the network OS has a packet that it needs to deliver, it compares the packet's destination IP address to each entry in the routing table. It's supposed to start with the tightest listed mask, progressing to the wider and wider masks until it finds one that matches.  When it finds a route that accepts packets with that destination (based on netmask) then it sends it out that route. If none of the specific routes accept a packet with that destination, then comes the default gateway - last - with a mask of 0.0.0.0/0 - which means "matches any destination IP" - then the packet is sent out the default gateway route.
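The lookup described above can be sketched in a few lines of Python. This is only a model of the tightest-mask-first rule, not any OS's actual code; the ROUTES list pairs each destination network with the source IP I would expect for it, and both the list layout and the function name longest_match are made up for illustration:

```python
import ipaddress

# (destination network, source IP to use) pairs, deliberately left unsorted.
ROUTES = [
    ("0.0.0.0/0", "10.0.0.8"),
    ("10.0.0.0/8", "10.0.0.8"),
    ("10.222.0.0/24", "10.222.0.8"),
]


def longest_match(destination, routes):
    """Return the (network, source) pair with the tightest matching mask."""
    dest = ipaddress.ip_address(destination)
    candidates = [
        (ipaddress.ip_network(net), src)
        for net, src in routes
        if dest in ipaddress.ip_network(net)
    ]
    # The mask with the most bits set to 1 wins, regardless of table order.
    return max(candidates, key=lambda c: c[0].prefixlen)


network, source = longest_match("10.222.0.5", ROUTES)
print(network, source)
```

Because the choice is made on mask length rather than position, the order of entries in the table should not matter.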

The problem I'm having with win 2k3 is that instead of starting at the tightest mask, it seems to start at the top of the table -- and the table is not sorted by the masks -- so if a wide netmask is found first, it will use that instead of the narrow one -- thus sending the packet with an invalid source address.
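Here is a Python model of what I suspect is happening. To be clear, this is my guess at the behavior, not anything taken from Windows itself, and first_match is a made-up name: the addresses are scanned in the order they were entered and the first one whose subnet contains the destination is used as the source.

```python
import ipaddress


def first_match(destination, addresses):
    """Return the first configured IP whose subnet contains `destination`."""
    dest = ipaddress.ip_address(destination)
    for addr in addresses:
        iface = ipaddress.ip_interface(addr)
        if dest in iface.network:
            return str(iface.ip)
    return None


# Same two addresses, entered in two different orders.
wide_first = ["10.0.0.8/8", "10.222.0.8/24"]
tight_first = ["10.222.0.8/24", "10.0.0.8/8"]

print(first_match("10.222.0.5", wide_first))
print(first_match("10.222.0.5", tight_first))
```

Under this model the wide-first ordering picks 10.0.0.8 and the tight-first ordering picks 10.222.0.8, which is consistent with what I see on the wire.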

Thanks very much.

~~~~~~~~~~~~~~~~

C:\Documents and Settings\Administrator>ipconfig

Windows IP Configuration


Ethernet adapter Local Area Connection:

   Connection-specific DNS Suffix  . :
   IP Address. . . . . . . . . . . . : 192.168.1.8
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   IP Address. . . . . . . . . . . . : 10.253.1.8
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   IP Address. . . . . . . . . . . . : 10.222.0.8
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   IP Address. . . . . . . . . . . . : 10.128.0.94
   Subnet Mask . . . . . . . . . . . : 255.255.0.0
   IP Address. . . . . . . . . . . . : 10.0.100.108
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   IP Address. . . . . . . . . . . . : 10.0.0.8
   Subnet Mask . . . . . . . . . . . : 255.0.0.0
   IP Address. . . . . . . . . . . . : 192.168.0.8
   Subnet Mask . . . . . . . . . . . : 255.255.0.0
   Default Gateway . . . . . . . . . : 10.0.0.5

C:\Documents and Settings\Administrator>route print

IPv4 Route Table
===========================================================================
Interface List
0x1 ........................... MS TCP Loopback interface
0x2 ...00 11 2f 3c 0a 2d ...... Marvell Yukon 88E8053 PCI-E Gigabit Ethernet Controller - Deterministic Network Enhancer Miniport
===========================================================================
===========================================================================
Active Routes:
Network Destination        Netmask          Gateway       Interface  Metric
          0.0.0.0          0.0.0.0         10.0.0.5      192.168.0.8      1
         10.0.0.0        255.0.0.0         10.0.0.8      192.168.0.8     20
         10.0.0.8  255.255.255.255        127.0.0.1        127.0.0.1     20
       10.0.100.0    255.255.255.0     10.0.100.108      192.168.0.8     20
     10.0.100.108  255.255.255.255        127.0.0.1        127.0.0.1     20
       10.128.0.0      255.255.0.0      10.128.0.94      192.168.0.8     20
      10.128.0.94  255.255.255.255        127.0.0.1        127.0.0.1     20
       10.222.0.0    255.255.255.0         10.0.0.8      192.168.0.8     20
       10.222.0.8  255.255.255.255        127.0.0.1        127.0.0.1     20
       10.253.1.0    255.255.255.0       10.253.1.8      192.168.0.8     20
       10.253.1.8  255.255.255.255        127.0.0.1        127.0.0.1     20
   10.255.255.255  255.255.255.255         10.0.0.8      192.168.0.8     20
        127.0.0.0        255.0.0.0        127.0.0.1        127.0.0.1      1
      192.168.0.0      255.255.0.0      192.168.0.8      192.168.0.8     20
      192.168.0.8  255.255.255.255        127.0.0.1        127.0.0.1     20
    192.168.0.255  255.255.255.255      192.168.0.8      192.168.0.8     20
      192.168.1.0    255.255.255.0      192.168.0.8      192.168.0.8     20
      192.168.1.8  255.255.255.255        127.0.0.1        127.0.0.1     20
    192.168.1.255  255.255.255.255      192.168.0.8      192.168.0.8     20
        224.0.0.0        240.0.0.0      192.168.0.8      192.168.0.8     20
  255.255.255.255  255.255.255.255      192.168.0.8      192.168.0.8      1
Default Gateway:          10.0.0.5
===========================================================================
Persistent Routes:
  None

C:\Documents and Settings\Administrator>

~~~~~~~~~~~~

Linux/Unix routing table:

root@dascomputer:~# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
72.83.94.112  0.0.0.0         255.255.255.240 U     0      0        0 eth1
72.83.94.0   0.0.0.0         255.255.255.128 U     0      0        0 eth1
10.128.0.0      0.0.0.0         255.255.255.0   U     0      0        0 eth0
192.168.0.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
192.168.0.0     0.0.0.0         255.255.0.0     U     0      0        0 eth0
10.0.0.0        0.0.0.0         255.0.0.0       U     0      0        0 eth0
0.0.0.0         72.83.94.1  0.0.0.0         UG    0      0        0 eth1
root@dascomputer:~#
(Public IPs/names changed for security, of course.)
 
rubearAuthor Commented:
I figured out a dirty hack solution:

First, let me say that I don't know if it will work for everyone, or if it will persist after a reboot. I also don't know if the technique could be made to work with multiple network cards.

Yes, I agree, this is a pain. A real pain. And it's a hack. (Of course anyone using Server 2k3 deserves a hack.)
One reason it's a pain is because it disconnects all of the putty sessions you have to all your linux boxes..!

But here's what I did:

First, deleted all the IPs from the network card.

Second, add them back in, one by one, ordered tightest netmask first to widest last.
Here's the IPs and netmasks I put in:

10.222.0.8/24 255.255.255.0
192.168.0.8/16 255.255.0.0
10.128.0.94/12 255.240.0.0
10.0.0.8/8 255.0.0.0

So this way when windows evaluates the route in the order of entry, it tries tightest netmask first, next tightest second, and so on and so forth, finally trying the widest netmask last.
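For anyone who wants to avoid sorting by hand, this Python sketch orders a list of addresses tightest netmask first and prints the netsh commands to re-add them in that order. It is a sketch only: the connection name "Local Area Connection" and the function name netsh_commands are assumptions, and it only prints the commands rather than running them.

```python
import ipaddress

# The addresses listed above, deliberately entered widest first.
ADDRESSES = [
    "10.0.0.8/255.0.0.0",
    "10.128.0.94/255.240.0.0",
    "192.168.0.8/255.255.0.0",
    "10.222.0.8/255.255.255.0",
]


def netsh_commands(addresses, connection="Local Area Connection"):
    """Return netsh 'add address' commands, tightest netmask first."""
    ifaces = [ipaddress.ip_interface(a) for a in addresses]
    # A longer prefix means a tighter netmask, so sort descending.
    ifaces.sort(key=lambda i: i.network.prefixlen, reverse=True)
    return [
        f'netsh interface ip add address "{connection}" {i.ip} {i.netmask}'
        for i in ifaces
    ]


for command in netsh_commands(ADDRESSES):
    print(command)
```

The output could be pasted into a batch file after the existing addresses have been removed.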

If someone comes up with the "real" proper solution, that'd be great - otherwise this will work for me and I'll pick this as the accepted solution.


Thanks.
 
BLipmanCommented:
First, I do believe Windows servers have some routing issues; for one, they certainly don't work like a Cisco router does.  I don't remember exactly what the issues are but I have heard of this in the past.  I am actually very familiar with how a router works but what you are doing is kind of strange.  You have all of these subnets on one broadcast domain?  Anyway, I am sure this would work better with VLANs and multiple gateways but you probably are running into an inherent bug in Windows.  The TCP/IP stack was redesigned from the ground up with Server 2008, Vista, and 7 so it doesn't surprise me that it works better there.  You could toy around with enabling routing and remote access; maybe that will make improvements.  

I think the table is goofy though, in the example you cited, this is the table line that should match:
 10.222.0.0    255.255.255.0         10.0.0.8      192.168.0.8     20
your gateway is 10.0.0.8 but it is telling it to leave on a 192.168 address.  Maybe add a persistent route more like this:
10.222.0.0  255.255.255.0  10.222.0.8  10.222.0.8  10

You have a few overlapping addresses but the interfaces are set to addresses on other subnets.  I think playing around with some persistent, static routes might improve things.  
 
rubearAuthor Commented:
Thanks, BLipman.

Glad to hear that you're very familiar with how routers work. You must know a lot more than I do about it. I still can't figure out how the metric could override routing decisions between two non-redundant routes, or how remote desktop would affect the routing decisions.

I see nothing strange about the way I'm doing things. It just happens that the way I'm doing it is the best way considering the circumstances and it works well (except for this MS bug). We do use vlans for some stuff, but vlans would vastly complicate things in my specific situation here - and all for no reason (except as a work around for broken MS code? but that sounds silly.) The way I'm doing it is a perfectly acceptable and proper way to do it and would work perfectly if the win2k3 networking code simply complied with the core internet RFCs. I'm certainly not asking of win2k3 anything that wasn't completely understood and expected by the internet designers.

The real solution (rather than switching to vlans or whatever) would be to upgrade to win server 2k8 -- but again, that's a lot of work in my case and again, ms ought to fix it. I mean every month or whatever they push updates for all sorts of stuff -- why not a quick fix to bring the core networking code up to mid-80s internet standards? Oh well.

As to overlapping netmasks, this is almost always the case on internet-playing machines -- because the default gateway netmask overlaps all other netmasks.

The whole concept of routing is (and I'm not talking about NATting)  for a machine to have routes defined with netmasks. The whole netmask idea is designed so you can direct a router (or any IP-enabled device) to send all traffic within a certain range out one port for one part of the world/country/city/building, but then  you can create smaller subnets within that range and send traffic to ips in those ranges to different places. It's an amazingly powerful and yet very simple concept.

As to my route print looking goofy - yup, windows route prints always look goofy to me. Never sorted, seems to be lots of strange extra entries, who knows what goes on inside there.


As to enabling routing and remote access -- you mean "remote access" as in "Remote Desktop Server?!" Already have that, doesn't help, nor do I see why it should LOL! The RDP client and server run a layer or two (depending on how you slice it) above the routing decisions layer!  As to enabling routing, please explain. Maybe that's the one thing I haven't tried yet. How do I enable routing?

Note that this win Server 2k3 does not act as a router/gateway/trafficpasser/relay for any other host. It is one end-point for all traffic in and out of its ethernet card.

But maybe if I enable routing it will have some options about manipulating the routing table to a better degree.
How do I enable routing?

Otherwise, my little hack (of putting in IPs in the order of tightest-netmask first ) seems to be working great.

Thanks!

 
BLipmanCommented:
I am not talking about RDP when I am telling you about Routing and Remote Access.  It is a feature in Windows that allows your system to act more like a router.  It is probably the last thing I would try though.  My suggestion to put in persistent routes with a lower metric may fix your problem.  As to the narrow mask vs. metric thing, I agree that on a Cisco router or a non-Windows machine the most specific mask is always applied, but this is apparently not the case with Server 2003, so you may be able to rig it with a metric.  That was my point.  

Here is info on "Routing and Remote Access" (not RDP):
http://technet.microsoft.com/en-us/library/cc786023(WS.10).aspx

Perhaps making your server work more like a true router can help you with the bug in Windows.  I don't even know if the routing functionality can work with a single NIC though.  Were this an actual router, you could make the port a trunk port and have virtual router interfaces (router on a stick) but I suspect Windows Server's crude router role won't be able to handle it.  
 
rubearAuthor Commented:
Thanks, BLipman, for the URL there.
I did a bunch more reading, and it looks like windows Server 2k3 is just broken. The page:
http://technet.microsoft.com/en-us/library/cc778287%28WS.10%29.aspx says:
------------- Quote --------------
With classless routing, routing to all destinations is done on a longest-match basis. This means that, for a specific destination that matches multiple {Network ID, Subnet Mask} pairs in the routing table, the router forwarding the IP packet uses the match with the longest mask (the mask with the most number of bits set to 1) to forward the packet.
~~~
Metric indicates the relative cost of routes so that the best route among possible multiple closest matching routes to the same destination can be selected. If there are multiple routes to the same destination with different metrics, the route with the lowest metric is selected.
----------- End Quote ------------
(They use the expression "Longest match" to mean "tightest netmask.")
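My reading of that quote, sketched in Python: the longest mask is chosen first, and the metric only breaks ties between routes with the same mask length. This is just my interpretation of the quoted text, and the route table, the "via" labels and the function name select_route below are made up for illustration.

```python
import ipaddress

# Hypothetical table: one wide route with a very low metric and two
# competing routes to the same /24 with different metrics.
ROUTES = [
    {"net": "10.0.0.0/8", "metric": 1, "via": "A"},
    {"net": "10.222.0.0/24", "metric": 20, "via": "B"},
    {"net": "10.222.0.0/24", "metric": 10, "via": "C"},
]


def select_route(destination, routes):
    """Pick the longest matching mask, then the lowest metric among ties."""
    dest = ipaddress.ip_address(destination)
    matches = [r for r in routes if dest in ipaddress.ip_network(r["net"])]
    return min(
        matches,
        key=lambda r: (-ipaddress.ip_network(r["net"]).prefixlen, r["metric"]),
    )


print(select_route("10.222.0.5", ROUTES)["via"])
```

Even though route A has the lowest metric of all, it loses to the /24 routes; the metric only decides between B and C.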

So Microsoft thinks that 2k3 works like it should, and like I think it should.

Just for clarification, my 2k3 box is "Windows Server 2003" -- that's the version, not the role. It does not act as a server (except sometimes we RDP to it or use it as a print server.. once in a while..) but it doesn't act as a router or carrier of traffic for any other machines. Mostly it just connects to various other machines on our network for various monitoring purposes.

It's very unlikely that changing the metric would cause a packet to be sent to the wrong route in violation of the routing table, but if you can explain to me how exactly to change the metric for a certain route I will be glad to try it and see if it works.  If it works, it'll still be a dirty hack, but no less elegant than my dirty hack!

As far as "remote access" -- I apologize for assuming you meant RDP. I just did a quick google search and saw some hits for RDP. I now see that "Remote Access" really means:
~~~~~~~~
Remote Access. By using RRAS, you can deploy VPN connections to provide end users with remote access to your organization's network. You can also create a site-to-site VPN connection between two servers at different locations.
(http://technet.microsoft.com/en-us/library/cc754634%28WS.10%29.aspx)
(Except that's an article for win serv 2k8)
~~~~~~  


I did enable RRAS (Routing and Remote Access) and played around a little bit. I even removed and re-added my IPs to my card, ordering them widest-netmask first (which again caused loss of communications to the narrow netmask route). Then I used RRAS to add a static route to the 10.222.0.0/24 network with a low metric of 2 (much less than the 20 of all other competing routes), but the packets still went out with the invalid source IP of 10.0.0.8 (which is the IP on this box for the 10.0.0.0/8 route).

So, I guess it's just a bad bug that has been in the code for a long time. But I guess adding the IPs in the order of tightest netmask first really isn't that bad of a work-around as long as you're not always adding and removing tight netmask IPs (since they have to go in first).

Thanks again for all your effort.

Question has a verified solution.
