BobByers

W2K3 loses internet connection

I'm trying to replace a W2K email server with a W2K3 email server because the first server is reporting that its disk is failing.  My problem is that the W2K3 server will only run for a couple of hours before refusing to receive transmissions FROM the internet.  At that point I cannot telnet to the server on port 25, 110, or 3389.  I can, from the computer, connect to internet web sites.  The computer is a dual homed machine that belongs to a local domain.  The W2K3 server is set up to be as identical as possible to the W2K server.

In process, here's what I'm doing.  I rename the old mail server and then shut it down.  I rename the new mail server to the original name of the old mail server and shut it down.  I unplug the W2K box from the Linksys router that connects our system to our DSL supplier.  I connect the W2K3 box to the same port on the router that the W2K server was connected to (the two boxes have the same hardwired IP addresses).  I restart the W2K3 computer.  WAN computers can see the new box and can telnet to the SMTP port.  WAN computers can also connect to the box for admin access via RDP.  This all works for about an hour or two.  At that point, everything fails.  Outside systems can no longer telnet in on port 25 or 3389.  The mail server software is still operational; one can connect to it on the local LAN.  And, it seems to be able to send mail to the outside world.  However, the outside world cannot connect to it.  I had to shut down the new mail server and restart the old one.  After a renaming operation, all is well ... but the disk is still reporting that failure is imminent.  The router seems to work with the W2K server, but has problems with the W2K3 server.

The server is W2K3 SP1.   I do have a Terminal Server that will only work if I remove the SP1 upgrade.  I'd do that on the mail server, but SP1 was built into the install.  

HELP
CSTN

I would start by looking at the router. You might want to check your port forwarding settings to verify that they are properly pointing to the 2K3 server, although they most likely are as you can, sometimes, reach it.
Other things to look at are firmware updates for the router:
http://www.linksys.com/support

Failing that, check the router log (you may need to enable logging, it is usually off by default) and see what it says at the time of the drop-off.
BobByers

ASKER

Sounds right, BUT the router continues to work just fine with the W2K server.  I'm actually beginning to think that Linksys routers don't work well with W2K3 servers ... at least those with W2K3 SP1.  I have a terminal server that I cannot get to work with a linksys router and SP1,  rolling back to W2K3 no service pack works just fine.  I'd try that on the mail server, but the damn software came with SP1 - can't be undone.  

Have tried another Linksys router with this box with similar results.  Does seem to be related to the router, but WHY does W2K work without a hassle?  Am beginning to think that the simple solution is to overwrite W2K3 SP1 with W2K.  I'm willing to bet real money that it will work.

BTW, this is why this problem is worth 500 points.  The problem is not logical.

Have you looked at the firmware update? Is it possible to pull the NIC from the 2K server and use it in the 2K3 server, or to use another NIC in general?
For sure the Port Forwarding is setup on the Linksys to point to the Correct IP of the Win2k3 Server?

So 25,3389 Etc point to new 2k3 Server IP?

2003 is sensitive to the MTU of Linksys and other routers.  Your DSL should have a max MTU of 1492 in the Linksys.  On a standard Linksys this setting is on the Advanced page or, depending on firmware version, the Firewall page.  Instead of Auto or 1500 for the MTU, set it to 1492.

Also, did you enable the internal firewall on the Win2k3 server?  If so, you must open the ports on the firewall for such services.  By default the firewall is not active on Win2k3.  It is a good idea to use it on a 2k3 server behind a Linksys to add another layer of security; just remember to open the ports.

For Terminal Services Admin Mode?  Simply go to My Computer - Properties - Remote and Enable Remote Connections.  Make sure you specify yourself as a user since Admin by Default is only allowed.  You also need to make sure you are a Domain Member otherwise if you are a local user only, you need to add in manually and point the user "search" to the local Machine.  Also make sure you are port forwarding 3389 to the IP of the 2k3 Machine.

D>

One Last thing - you did shut off the Linksys and any local switches after the change over right?  This is to clear the ARP cache and any confusion between same name - IP and different MAC address.  Linksys - only method is to leave it off for 5 minutes or so - managed switches - you get the option to force clear arp.
You Also mentioned Domain - so you are using DNS.  Make sure the DNS settings reflect the new servername and IP address.  You should also clear cache on DNS and on 2k3 server - ipconfig /flushdns.
D>
All good suggestions. BUT

1.  Firewall is disabled
2.  Am trying to substitute one box for another.  IP address and name of W2K3 computer is the same as IP address and name of W2K computer.  After W2K3 computer "fails", I rename the boxes once again and the W2K box works fine.  I agree that the MTU may be the problem, but don't understand why W2K is OK and W2K3 isn't.  Maybe MS is screwing with us again... what a surprise.  
3.  Also, W2K3 box works for about an hour or so.  Which supports the MTU thesis.  Have tried switching boxes, using a different (brand new out of the box) linksys router and got the same results.  
4.  If port forwarding wasn't correct, the system wouldn't work for an hour.
5.  These are dual homed machines.  Identically configured.  I have to believe that the problem is with the W2K3 Server.  But, I don't understand just how nor how to overcome the problem.

6.  Not an easy problem
dlongan
Since you mention the systems are "dual homed"  I am assuming two nics, are they both going to the same switch(teaming) or are you using the w2k3 server as a firewall/router?

If you are just using them as (teaming) try disabling one and see what happens.
Also you can test the MTU theory by issuing the following command during a failure

ping "your wan ip address" -l 1492

This sends a ping with a 1492-byte payload, which shows how the path handles large packets.  You can try pinging other hosts on the net, but you need to understand that lots of sites disable ICMP.  You can also try changing the MTU size (1500 max).
SOLUTION
Dushan Silva
This solution is only available to members.
There are two NICs.  One connects to the Linksys Router that connects to the Internet.  The other connects to the local network.  The network configuration is identical to the W2K server that works correctly.   Each nic has its own ip address.  The one connected to the Router is assigned by me.  The other is assigned by the DCs DHCP server, and is a reserved address.  Each NIC has a gateway address.  The W2K Server seemed to be OK with that, so I've left it that way on the W2K3 server.

There are no firewalls active ... only the router with acts as a NAT firewall.  

There are no problems accessing the mail server from the local network.  The problem is accessing from the WAN side, and the stoppage occurs after a couple of hours of good operation.  


couple of things,

when the issue is present, can you telnet to port 25 locally from a client to the ip on the external side of the w2k3 server?
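For anyone repeating this test without a telnet client, here is a minimal Python sketch of the same check. It only reports whether a TCP connection can be opened, which is all the telnet test tells you anyway. The function names are made up for illustration, and the target address in the usage comment is a placeholder, not an address from this thread.

```python
import socket

def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

def check_ports(host, ports, timeout=3.0):
    """Map each port to True (connected) or False (refused or timed out)."""
    return {port: port_is_open(host, port, timeout) for port in ports}

# Usage (placeholder address, substitute the NIC 1 or router WAN address):
#   for port, ok in check_ports("192.0.2.10", [25, 110, 3389]).items():
#       print(port, "open" if ok else "unreachable")
```

Running it once from the 192.168.0.x side and once from outside the router, before and during an outage, would show which path stops accepting connections.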

also can you provide more info regarding how each nic IP is setup.  for example:

nic 1 ip - 192.168.0.2 mask 255.255.255.0 gw 192.168.0.1 dns x.x.x.x

have you created any static routes?
I'm not sure about how to set the MTU size.  The router is currently set for MTU Auto.  If I ping from my home, the standard ping ... ping "my office" takes about 20+ ms.  The largest successful ping is ping "my office" -l 1464 and this takes about 90+ ms.  Any larger buffer size times out.  Again, the current MTU setting seems to work for the W2K box, but not for the new W2K3 box.
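A note on the ping numbers: the -l option sets the ICMP payload size, not the size of the whole packet. The short Python sketch below does the arithmetic, assuming a plain IPv4 header with no options (20 bytes) plus the 8-byte ICMP echo header. The function names are illustrative only.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options (assumption)
ICMP_HEADER_BYTES = 8   # ICMP echo request header

def packet_size_for_payload(payload_bytes):
    """Total IP packet size produced by 'ping -l <payload_bytes>'."""
    return payload_bytes + IPV4_HEADER_BYTES + ICMP_HEADER_BYTES

def largest_unfragmented_payload(mtu_bytes):
    """Largest 'ping -l' value that fits in one packet for a given MTU."""
    return mtu_bytes - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES

print(packet_size_for_payload(1464))       # 1492
print(largest_unfragmented_payload(1492))  # 1464
print(largest_unfragmented_payload(1500))  # 1472
print(packet_size_for_payload(1492))       # 1520
```

So a 1464-byte ceiling is what one would expect on a path with a 1492-byte MTU, provided larger packets are being dropped rather than fragmented somewhere along the way. That last part is an assumption; the thread does not confirm it.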

I'm assuming that I should be pinging the office router from somewhere on the Internet.  Right now I'm not having any failures since I'm running the 2K server until I understand what's causing the failure on the W2K3 system.  My boss gets really excited when email doesn't work.



dlongan

When the issue is present, you cannot telnet to port 25 through the Linksys router from a client computer or from a remote computer on the internet.  You can telnet to 25 using the local address.

NIC 1 connects to the Linksys.  192.168.1.1 mask 255.255.255.0.  gw and dns are 192.168.1.1 (this is a fixed ip system on the wan side, and the dns settings are those of the provider).  The DNS does work well since I can disconnect NIC 2 and have no problem connecting to internet web sites.  BTW, this is true even when one can no longer initiate a transaction from the internet to the mail server.

NIC 2 connects to the Domain lan.  192.168.0.8 mask 255.255.255.0.  gw and dns are 192.168.0.1.  
I meant to try to telnet to the ip of nic 1 using.

If you set the MTU size for a network interface manually, this setting overrides the default MTU for the network interface. The MTU size is the maximum packet size in bytes that the transport will transmit over the underlying network.

This method affects packets sent to all destinations and may significantly affect the performance, depending on the MTU size that you set.

To set the MTU size for the network interface, follow these steps:
1. Click Start, click Run, type regedit, and then click OK.
2. Locate the following key in the registry:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\<ID for network interface>
3. On the Edit menu, point to New, and then click DWORD Value.
4. Type MTU, and then press ENTER.
5. On the Edit menu, click Modify.
6. In the Value data box, type the value of the MTU size, and then click OK.  
7. Quit Registry Editor, and then restart the computer.
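One thing to watch in step 6, as far as I know: the DWORD edit dialog in Registry Editor opens with the Hexadecimal base selected, so typing 1492 without switching to Decimal stores a very different number. The Python sketch below just shows the conversion; the function names are illustrative.

```python
def dword_hex(value):
    """Hex digits to type when the dialog base is Hexadecimal."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("a DWORD holds values from 0 to 4294967295")
    return format(value, "x")

def misread_as_hex(decimal_digits):
    """Value actually stored if decimal digits are typed in Hexadecimal mode."""
    return int(str(decimal_digits), 16)

print(dword_hex(1492))       # 5d4
print(dword_hex(1500))       # 5dc
print(misread_as_hex(1492))  # 5266
```

Either select Decimal and type 1492, or leave the base on Hexadecimal and type 5d4.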
Well, have tried everything that all of you have suggested.  Here's where I am.  I have three Linksys routers.  All of them work fine with a W2K Server, and a W2K3 Server (no service pack).  None of them work with the W2K3 Server with SP1.  

I bypassed the Linksys router and set up the W2K3 SP1 Server to accept the internet input directly (changed the IP address).  Everything worked fine.  So, I think there must be a problem with the W2K3 SP1 server and the Linksys Routers.  I've got several options

1.  Try a different brand of router.  Sounds dumb, but it's cheap, quick, and easy.

2.  Dump W2K3 and revert to our licensed copy of W2K

3.  Use a software firewall.

Any other ideas?
Hey Bob,

Good job in narrowing down the options.

I just would like to confirm if you are able to telnet port 25 to NIC1 (wan side of server - 192.168.1.1).  From the 192.168.0.x lan and also from the 192.168.1.x lan.  When you telnet try both the ip address and FQDN of the server.

I have several clients setup with w2k3 sp1 and exchange 2003 behind linksys routers without any issues.  The only difference is I don't use NAT on the server.  Since the router provides this, I don't add this into the picture.  Do you need to have this?

I'll be doing some more research and will get back with you.
Even though the server is able to access the internet and local lan during the outage, I would confirm how the routing table is before and during the outage.

I would also recommend that you DO NOT define a default gateway on the LAN NIC - only on the wan side and should be set to the LAN side of your linksys.

To view the route table - drop down to the command prompt and type "route print" without the quotes.

If you can please post the route table entries.
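To show why a second default gateway matters, here is a simplified Python model of route selection: most specific matching route first, then lowest metric. It is a sketch under stated assumptions, not a description of what Windows does internally. The metric values are invented, the client address 203.0.113.7 is a documentation placeholder, and the real table on the server should come from route print.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class Route:
    destination: str  # network in CIDR form; "0.0.0.0/0" is the default route
    gateway: str      # next hop, or "on-link" for a directly attached network
    interface: str
    metric: int       # lower wins among equally specific routes

def pick_route(table: List[Route], dest_ip: str) -> Optional[Route]:
    """Longest prefix match, ties broken by lowest metric."""
    addr = ipaddress.ip_address(dest_ip)
    matches = [r for r in table if addr in ipaddress.ip_network(r.destination)]
    if not matches:
        return None
    return min(
        matches,
        key=lambda r: (-ipaddress.ip_network(r.destination).prefixlen, r.metric),
    )

connected = [
    Route("192.168.1.0/24", "on-link", "NIC1", 10),
    Route("192.168.0.0/24", "on-link", "NIC2", 10),
]
# Hypothetical metrics: the LAN-side default happens to be preferred.
two_gateways = connected + [
    Route("0.0.0.0/0", "192.168.1.1", "NIC1", 20),
    Route("0.0.0.0/0", "192.168.0.1", "NIC2", 10),
]
one_gateway = connected + [Route("0.0.0.0/0", "192.168.1.1", "NIC1", 20)]

internet_client = "203.0.113.7"  # placeholder outside address
print(pick_route(two_gateways, internet_client).interface)  # NIC2
print(pick_route(one_gateway, internet_client).interface)   # NIC1
print(pick_route(one_gateway, "192.168.0.50").interface)    # NIC2
```

In this model, with two default routes the reply to an outside client can leave through NIC2 and never pass back through the Linksys, so the connection fails from the outside even though outbound traffic still works. With only the WAN-side gateway defined, replies go out NIC1 and the local subnet is still reachable through its on-link route.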
Thanks dlongan

I don't have an easy way to get to the 192.168.1.1 connection, but will try it with my laptop this morning.  The way I have this set up, each server with an internet connection has its own Linksys router.  The only purpose for the router is to provide the NAT translation "firewall" for each server.  There is no NAT operation on the server itself - only on the router.

I've never had a problem with the Linksys routers and connectivity when I had all W2K servers.  Did not have a problem with W2K3 on the terminal server until we loaded SP1.  As soon as I loaded SP1 we started having terminal server connectivity issues.  Read all of the articles on the problem on the MS support site and applied all of the fixes, and it still didn't work.  Rolled it back and haven't had a problem since.  I've never been happy with this solution since there are security fixes in SP1 that I think we need, but it is tough to experiment with a production system.

Bob,

Sorry -  I re-read your post regarding NAT and you did say the linksys.  Anyway do you have routing and remote access setup on the w2k3 server?

I found the following regarding what 2003 SP1 changes in TCP/IP:

http://technet2.microsoft.com/WindowsServer/en/Library/46d89e98-096b-4a66-8099-feee33d91e0e1033.mspx

One change "Automatic determination of the interface-related and default route metrics" kinda sounds like something.
I don't have routing and remote access setup.

I've always been able to access the web from this box and have now discovered that I can get reliable outside in connection through the router IF I UNPLUG THE LOCAL AREA NETWORK CABLE.  As soon as I plug it back in it loses the outside in connection.  

HOWEVER, if I set the router's nic settings to be the internet address of the router (i'm not saying this well) .. i.e. I set the internet nic to 71.x.x.x  I can connect to both the internet and the local lan and I can access the computer via the internet ... I can RDP to the system, and I can telnet to port 25.  

The local network address is 192.168.0.x.  The router is set to 192.168.1.1 with the nic address at 192.168.1.3.   Have tried setting to 10.0.0.1 and 10.0.0.3 respectively.  That didn't help either.  Tried the DLink router (which is a little nicer to set up) and it didn't help.  Will read the technet article now.
Bob,

Which model linksys router are you using?
At the moment I'm using an EBR2310.  I had been using BEFSR41 Linksys.  Last Linksys that I tried was a BEFVP41 v2.  
do you have them setup as a "gateway" or "router"
I didn't know that you could set them to be one or the other.  I'd guess as a gateway.  How do you set it up as a router?
At least on the Linksys you can access this via the web interface, and it is located under:

Advanced - Dynamic Routing

You most likely want it set to "gateway"; this way it will provide NAT.  The reason I asked is that your post about giving your server a public IP and being able to access the net made it sound as if your router was not configured for NAT.  I am somewhat confused by your findings regarding the network cable/ip address changes.
Found it.  The Linksys routers are all set to Gateway.  There is no corresponding setting on the DLink

Let me take another shot at explaining.  Works better if you can draw pictures.


    -----> 71.x.x.116 ---> Router <--- 10.0.0.1 -------> 10.0.0.3 Nic1     Nic2 192.168.0.8 <-----
           255.255.255.248             255.255.255.0     255.255.255.0     255.255.0.0
        gw 71.x.x.118                                 gw 10.0.0.1       gw 192.168.0.1

     
Wherever I had 10.0.0.x, I had also tried 192.168.1.x

If I take the router out of the system and just configure Nic1 with the 71.xxx address, the system works properly.  I can access the computer from the internet with the local area network and the internet connected.  If I reset everything as shown above I can only access the computer from the internet if the cable on Nic2 is disconnected
Bob,

I know what you're trying to accomplish, but if you take out the "multihome" part of the equation I really think things will function correctly.  It's the "multihome" functionality that's causing the issue.  If it does not give you any benefit, why continue trying to make it work?
I do need to access this server from the internet and from the local network.  We have outsiders who send mail to our system and we have local users who do not have internet access, but need to send mail within the company.  Also, we have Asset Management software that again is accessed from the internet and by local users who do not have internet access.  I think this could be accomplished in other ways, but that would require a lot of changes to the existing system.  

This problem was a big surprise to me - and a bit of a black eye as far as my boss goes.  I hadn't expected the problem since the server that I'm replacing is configured exactly this way, as is our terminal server.  I had expected to have about a 30 minute changeover last Monday morning.  

And, the multihome feature seems to work if I remove the router.  I guess I've got some decisions to make.
The picture is more clear,

I setup servers all the time behind routers (linksys and cisco) configured using NAT and using a single nic configured with non-routeable ip addresses.  Then depending on costs (clients choice) I would setup RRAS on the server or use a vpn based router to allow remote access via VPN tunnels.

Unless you are trying to create a DMZ (public ip addresses protected by a firewall), there really isn't any need for a multihomed config.

I do need both public and private access to this computer.  Public access because we have internet usage of the mail server, and private use of the mail server for the users who are not allowed to have internet access.  But, I only need ports 80, 25, 110, and 32000.  I would also like port 3389 since that gives me access to the system in the event the Terminal Server is acting up.

Are you suggesting that I can connect the Gateway router like this

            71.x.x.x --------->| router  gw 192.168.0.254|< ----------->--local network
                                                  port forwarding to mail server only
                                                  only allow external access for mailserver
I do have a wingate gateway router at 192.168.0.1.  How would I force the outgoing mail to use this router instead of the wingate router?
ASKER CERTIFIED SOLUTION
This solution is only available to members.
I think I've got it.  It is now set up just as before EXCEPT I have changed the DHCP assignment for the Local Area network to a fixed IP assignment.  I removed the local area gateway assignment and left it blank.  AND, I set the local area network nic so that its DNS Server is the Domain controller and the alternate is the D-Link router.  I can now access the computer from the internet and from the local network.  Interestingly enough, I can no longer get a reliable RDP connection using the local network addresses.

Am cheating a bit on the domain because I have a reserved dhcp address for this server and when I set it to fixed, I retained the address.

I am not understanding the following:

"DHCP assignment for the Local Area network to a fixed IP assignment"
"local area gateway assignment and left it blank"

When I set up a DHCP server I usually divide my IP addresses up.  For example, 192.168.1.100 through 192.168.1.200 are defined as the "address range for distribution".  This provides 101 IP addresses for dynamic assignment.

Now you could add an "Exclusion list" but you do not have to.

Now any IP addresses outside of the "address range for distribution" are what are considered "Static".  You can assign them to any device that you don't set up using DHCP.
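
To make the split concrete, here is a small Python sketch using the example pool of 192.168.1.100 through 192.168.1.200. Counting inclusively, that pool holds 101 addresses. The function names and the two test addresses are illustrative, not taken from anyone's actual DHCP scope.

```python
import ipaddress

def pool_size(first, last):
    """Number of addresses in an inclusive range."""
    lo = int(ipaddress.ip_address(first))
    hi = int(ipaddress.ip_address(last))
    if hi < lo:
        raise ValueError("last address must not be below first address")
    return hi - lo + 1

def in_pool(address, first, last):
    """True if address falls inside the inclusive DHCP range."""
    ip = ipaddress.ip_address
    return ip(first) <= ip(address) <= ip(last)

POOL = ("192.168.1.100", "192.168.1.200")  # example range from the post

print(pool_size(*POOL))                 # 101
print(in_pool("192.168.1.150", *POOL))  # True
print(in_pool("192.168.1.8", *POOL))    # False
```

An address for which in_pool returns False is one you can hand out as a fixed address without needing an exclusion or a reservation.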