Windows 7 appears to ignore default gateway in dual gateway setup
Posted on 2013-11-20
We have a network in which several workstations need to access two distinct networks (SuperNet and the Internet).
The first network comes in on Router A, then goes on to a switch. The second comes in on Router B and goes out to the same 24-port unmanaged (open) switch. From there, we have one network cable going to each of several workstations.
To let each workstation communicate with both networks using the same IP address, we've set up both routers (with completely different WAN IPs) to have a LAN IP in the same local subnet. Let's say 188.8.131.52/24 (slightly obscured). This is technically a Canadian government-owned range of SuperNet IP addresses, so when the workstations go out Router A to the SuperNet, they are seen as public IP addresses. When they go out to the Internet on Router B, they are seen as internal IP addresses. As an example, Router A would present a gateway LAN IP to its clients of 184.108.40.206, and our ISP router (Router B) would be in the same range, but closer to the other end, so 220.127.116.11.
In order to avoid IP conflicts on the SuperNet, each workstation has a series of routes, something along the lines of:
route -p add 0.0.0.0 mask 0.0.0.0 18.104.22.168 metric 1 if 11
(default all traffic to ISP connected router)
This is followed by a series of routes along the lines of:
route -p add 142.149.#.# 22.214.171.124 (SuperNet gateway for specific IPs).
There are not too many of these, and they don't change, so we set the static routes, and go.
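For anyone following along, here is a rough sketch of the selection we expect from that routing table: the most specific matching route wins, with the metric as a tie-breaker. It is only an illustration in Python, not what Windows actually runs. All addresses are placeholders from the documentation ranges (our real ones are obscured above), the name pick_gateway is made up for this example, and it assumes the SuperNet entries behave as single-host (/32) routes.

```python
import ipaddress

# Hypothetical routing table: (destination network, gateway, metric).
# Every address here is a placeholder, not one of our real addresses.
ROUTES = [
    ("0.0.0.0/0", "192.0.2.254", 1),        # default route via the ISP router
    ("198.51.100.10/32", "192.0.2.1", 1),   # host route via the SuperNet router
    ("198.51.100.20/32", "192.0.2.1", 1),   # another SuperNet host route
]


def pick_gateway(destination, routes=ROUTES):
    """Return the gateway of the best matching route, or None if none match."""
    dest = ipaddress.ip_address(destination)
    candidates = []
    for net, gateway, metric in routes:
        network = ipaddress.ip_network(net)
        if dest in network:
            candidates.append((network.prefixlen, -metric, gateway))
    if not candidates:
        return None
    # Longest prefix wins; among equal prefixes, the lowest metric wins.
    return max(candidates)[2]
```

With this table, pick_gateway("198.51.100.10") returns the SuperNet router and any other destination, such as pick_gateway("203.0.113.7"), falls through to the default route. That is the behaviour we are counting on.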
Obviously this is non-standard, but we've done it a number of times to avoid having two network cards in each machine, and to avoid running several additional cables. This has worked consistently in the past on Windows XP machines, and on a few Windows 7 machines (after tweaking ArpRetryCount in the Windows 7 registry).
What is happening now, suddenly, on one of our networks, is that the new Windows 7 workstations periodically ignore the default route. Immediately after startup, most of the machines appear to have neither Internet nor SuperNet access. If you wait long enough, say 10 minutes, it eventually "works itself out" somehow. While it is not working, if I ping anything on either the Internet or the SuperNet, both respond. This led me to suspect DNS issues, but I can ping hosts on both the Internet and the SuperNet by name with no problems. Viewing them in a browser and connecting with telnet on port 80 both fail. This led me to suspect a firewall blocking the port, but with all firewalls disabled, all additional hardware removed, and the firewalls on all routers turned off, we still get the same periodic problem.
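Since ping works while the browser and telnet fail, the useful check is a plain TCP connection rather than ICMP. A minimal sketch of that check in Python is below. The function name tcp_port_open is just something made up for this post, and it does the same job as the telnet test, nothing more.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling tcp_port_open("google.ca", 80) right after startup and again ten minutes later would give a scriptable version of the symptom described above.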
A tracert google.ca run on a Windows 7 machine during its non-working period revealed that the first hop attempted appears to go to Router A (SuperNet). This is what I mean in the title when I say it appears to be ignoring the default route. Google definitely does not have an IP within the ranges specified in our other routes, and the primary route is to our ISP router (126.96.36.199 in this example). Additionally, if you view the adapter's IPv4 settings, the default gateway is listed as 188.8.131.52, but sure enough, every time you restart the Windows 7 box and run tracert google.ca, it goes to that 184.108.40.206 SuperNet gateway first instead. Wait 10 minutes, and it goes to the proper one.
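To log when the first hop flips from one gateway to the other, something like the following Python sketch could pull the hop 1 address out of captured tracert output. It assumes the trace was run with -d so each hop line ends in a bare IPv4 address, and first_hop is a name made up for this example.

```python
import re

# A hop line from "tracert -d": hop number, three timings, then the address.
HOP_LINE = re.compile(r"^\s*(\d+)\s+.*?(\d{1,3}(?:\.\d{1,3}){3})\s*$")


def first_hop(tracert_output):
    """Return the IPv4 address reported for hop 1, or None if it is missing."""
    for line in tracert_output.splitlines():
        match = HOP_LINE.match(line)
        if match and match.group(1) == "1":
            return match.group(2)
    # Hop 1 timed out, or the output was not in the assumed format.
    return None
```

Comparing the returned address against the two gateway addresses, once a minute after boot, would show exactly when the machine switches to the proper route.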
We've replaced pretty much every piece of hardware in the office, including cables, switches and routers.
Final note: There is one remaining Windows XP machine in the office, with all the same settings, and it has worked flawlessly the entire time, ruling out all other issues we could possibly come up with.