• Status: Solved

SBS 2003 / Server 2003 intermittently refuses all outside connections

Hi there,

I'm having trouble with a Small Business Server 2003 machine intermittently refusing all outside connections.  The problem is sporadic and I haven't found anything in the logs that points to a culprit.  Everything will be running just fine and then suddenly all outside RDP sessions are killed and the server will refuse connections on ports 25 and 443.  This brings mail/webmail to a screeching halt.

It appeared to be related to DNS (i.e., the server could not do reverse lookups on the IPs of incoming connections), so I made some changes based on some research.  The problem appeared to be fixed, but it keeps coming back.
Changes made so far:
PIX Firewall
1. Changed maximum DNS packet size from 512 to 1518

SBS 2003 Server
in HKLM\System\CurrentControlSet\Services\DNS\Parameters
1. Added value: EnableEDNSProbes, REG_DWORD = 0x00000000
2. Added value: EDNSCacheTimeout, REG_DWORD = 0x00057e40
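
For readers who want to sanity-check the two DWORD values above, here is a minimal Python sketch that decodes them. It assumes EDNSCacheTimeout is expressed in seconds and that 0 means "disabled" for EnableEDNSProbes; the helper name `describe` is made up for this illustration and is not part of any Microsoft tool.

```python
# Decode the two registry DWORDs quoted in the post.
# Assumption: EDNSCacheTimeout is a duration in seconds.
EDNS_VALUES = {
    "EnableEDNSProbes": 0x00000000,
    "EDNSCacheTimeout": 0x00057E40,
}


def describe(name: str, value: int) -> str:
    """Return a human-readable reading of one DWORD value (hypothetical helper)."""
    if name == "EnableEDNSProbes":
        return "probes disabled" if value == 0 else "probes enabled"
    if name == "EDNSCacheTimeout":
        hours = value / 3600
        return f"{value} seconds (about {hours:g} hours)"
    return str(value)


if __name__ == "__main__":
    for name, value in EDNS_VALUES.items():
        print(f"{name} = 0x{value:08x} -> {describe(name, value)}")
```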

I've also verified that no other processes are using any ports needed by IPSec.

The problem still happens and lasts for varying amounts of time.  I have 3 other environments identical to this one (SBS 2003 behind a PIX firewall) that aren't having this issue.

Any help would be greatly appreciated.

vhcg (Author) Commented:
Wanted to add that all incoming connections from inside hosts are always accepted.  It's only stuff from the outside that has trouble.

If I am reading correctly, you mean connections from outside of your network to your SBS 2003?
Are you using Remote Web Workplace for the RDP connections?
Have you tried to Telnet to port 25 after the failure occurs?

Hi there,

is that the only traffic that gets dropped on its way to the server via the pix? If for example you run an ftp server on the server would that connection drop along with the others?

vhcg (Author) Commented:

When "the problem" flares up, all active connections from the outside are terminated: usually my RDP connection (I manage the server remotely) and HTTPS connections (users on OWA).

My quick and dirty test is to telnet to port 25 from the outside.  When things are working, the server responds.  When things are not working, there is no response, and no response either on any of the other ports that I'm allowing through (443, 3389).

Shreedhar Ette Commented:
- Run the SBS 2003 Best Practices Analyzer tool and fix the errors reported.

- Check the System event log of the server for error events from the source Srv.

Also check if RRAS isn't doing some stupid things...
We had some issues on an SBS 2003 yesterday, and oddly enough I could access the network from the SBS server, but the clients couldn't access the server at all (neither ping nor telnet!).

Restarting the server, didn't help...

Removing RRAS, and restarting the server again - did resolve the issue (we didn't need RRAS anyway).
Rob Williams Commented:
Not suggesting it is not the SBS, but I have seen this happen with bad routers and bad modems. Have you tried rebooting one or the other when the problem exists to see if connections are quickly restored?
vhcg (Author) Commented:
Well, the problem still persists.

Based on all of the feedback (thank you), I did the following:

1. Ran the SBS Best Practices Analyzer -- no major problems found.
2. Stopped the RRAS server (but haven't removed it)
3. Scoured through the event logs -- nothing found

The problem is definitely worse during business hours when the server is busier.

Doing my quick test of telnet to port 25, I'm seeing two different behaviors:

1. The connection is refused right away
2. The connection times out
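
The difference between "refused right away" and "times out" can be captured automatically instead of by watching a telnet window. The following is a rough Python sketch, not something from the thread: the function name `probe` and the host `mail.example.com` are placeholders. Loosely speaking, a refusal means something actively answered with a reset, while a timeout means the packets went unanswered.

```python
import socket


def probe(host: str, port: int, timeout: float = 5.0) -> str:
    """Try one TCP connection and report 'open', 'refused', 'timeout', or 'error: ...'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The far end (or a device in front of it) replied with a reset.
        return "refused"
    except socket.timeout:
        # No reply at all within the timeout.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. name resolution failure or no route to host.
        return f"error: {exc}"


if __name__ == "__main__":
    # Placeholder host name; substitute the public name or IP of the server.
    for port in (25, 443, 3389):
        print(port, probe("mail.example.com", port))
```

Run from an outside machine on a schedule, the logged results would show when each failure mode occurs and for how long.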

I'm fairly certain that it is not the PIX firewall since there are several servers behind it and I am able to access all of them without any trouble.


vhcg (Author) Commented:

DNS appears to be part of the picture.  The problem is happening right now and I can't do external lookups (by name or by IP) using nslookup on the SBS server.  I've stopped and started the DNS Server service, but the lookups still time out.  As forwarders, I'm using the 2 servers supplied to me by my ISP.  I know they are working because when I configure another server (which has a different public IP) on the network to use the ISP DNS servers, the lookups work just fine.

This is maddening...

Rob Williams Commented:
I doubt DNS is the cause; more likely it is a side effect. PCs would use the SBS for DNS, and external DNS domains would be resolved using forwarders. If the Internet were not available, DNS could not resolve the names, as it could not reach the forwarders.

Based on your updated information about the other servers, it seems apparent there is a disconnect between the SBS and the router.

It is also unlikely to be a software issue, since it is intermittent.

I assume connections between the PCs and server are maintained, just Internet is lost?
One NIC or 2 on the SBS? If 2, it could be a bad NIC or driver.
Also, if 2 NICs, can you change the patch cable for the WAN NIC and the switch port? Patch cables are the #1 point of network failures, and you can have bad switch ports locking up.
If you set either the NIC or switch port to a fixed speed and/or duplex and leave the other as auto, you can have the port freeze. Both have to be set the same; generally auto is best.

It is also possible you have a virus on the server which is consuming all bandwidth for a period of time.
vhcg (Author) Commented:

As an (awful) workaround, the PCs are currently set up using the SBS server as the primary DNS server and the ISP's DNS servers as the secondary and tertiary servers.  I know this is not optimal and not recommended, but every time DNS would flake out on the SBS machine, the whole company would lose their Internet access and there were some very angry/upset people.

There are 2 NICs in the SBS, but one of them is currently disabled.  I will give the network settings suggestion a try.

Also, regarding the virus.  I've thought about that, too.  The server is protected via Trend Micro, but I've seen stuff slip in (on the workstations) that Trend Micro was not able to catch.  Can anyone recommend a site where I can do a free scan of the server as a sanity check?  Over the weekend, things will be quiet and I should be able to work on the server without interruption.

Thanks again for all of the input.  I appreciate it very much.

Rob Williams Commented:
>>"As an (awful) workaround"
That is an understatement :-) The #1 mistake made with DNS in a Windows domain is to have a router or ISP added as a second or third DNS server. DNS WILL FAIL. DNS in Windows does not work as one would expect. You would think a DNS request would be resolved by the first listed server and, if that fails, move to the second, and so forth. However, it does not work that way. As described by another expert here, it is more of a shotgun affair with the first one to respond being accepted. Thus if an ISP responds first, which is often the case, your DNS resolve will fail.
This results in all sorts of network-related issues and performance issues, many of which seem totally unrelated to DNS.
I have never understood why Internet access is so important when the SBS is offline, and with it file, printer, and e-mail access. However, that is an end user issue, not yours and mine :-)

Regardless this doesn't sound like DNS. Even with the mis-configuration the external connections should still remain connected.

As for the "network settings", as mentioned best to have NIC and switch (if a managed switch) set to automatic.

For the virus you might want to first try  netstat -an   from a command line and see if there are any "established" connections you cannot explain. As for on-line scans I have used HouseCall, but it is by Trend Micro so it probably uses the same database  http://housecall.trendmicro.com/?WT.seg_2=2009HP_HouseCall
You could download free Malwarebytes. It works well  www.malwarebytes.com.
If you suspect a rootkit you could try Gmer, but it tends to require more user intervention/scrutinizing.
Posting a question in the Anti-virus zone might provide better suggestions for that: http://www.experts-exchange.com/Internet/Anti-Virus/
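
Since `netstat -an` output on a busy server can be long, a small filter helps pick out the established connections. This is only a sketch in Python: the function name `established` is invented here, and the sample text uses made-up documentation addresses rather than anything from the server in question. It assumes the usual four-column TCP layout (Proto, Local Address, Foreign Address, State).

```python
def established(netstat_text: str) -> list[tuple[str, str]]:
    """Return (local, remote) pairs for TCP lines whose state is ESTABLISHED."""
    pairs = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected layout: Proto, Local Address, Foreign Address, State
        if len(fields) == 4 and fields[0] == "TCP" and fields[3] == "ESTABLISHED":
            pairs.append((fields[1], fields[2]))
    return pairs


# Made-up sample in the shape of `netstat -an` output (addresses are placeholders).
SAMPLE = """\
  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:25             0.0.0.0:0              LISTENING
  TCP    192.0.2.10:3389        198.51.100.7:50211     ESTABLISHED
  TCP    192.0.2.10:9585        203.0.113.80:80        ESTABLISHED
  TCP    192.0.2.10:9589        203.0.113.80:80        TIME_WAIT
"""

if __name__ == "__main__":
    for local, remote in established(SAMPLE):
        print(f"{local} <-> {remote}")
```

To run it against live data, the text could come from redirecting the command to a file (`netstat -an > out.txt`) and reading that file in.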
vhcg (Author) Commented:

Thank you for the continued suggestions.

I ran 'netstat' as recommended and see lots of http connections that I don't quite understand.  http is not open to the Internet on this server.  The only ports open to the Internet are ports 25 (to all), 443 (to all), and 3389 (to me and a few other select trusted sites).

  TCP    mailsrv:9578     ESTABLISHED
  TCP    mailsrv:9579           nuq04s01-in-f165.1e100.net:http  TIME_WAIT
  TCP    mailsrv:9580       ESTABLISHED
  TCP    mailsrv:9582       ESTABLISHED
  TCP    mailsrv:9583     ESTABLISHED
  TCP    mailsrv:9585           nuq04s01-in-f156.1e100.net:http  ESTABLISHED
  TCP    mailsrv:9589           nuq04s01-in-f165.1e100.net:http  TIME_WAIT
  TCP    mailsrv:9591           scaler01-cts.netline.com:http  TIME_WAIT
  TCP    mailsrv:9597           nuq04s01-in-f148.1e100.net:http  TIME_WAIT
  TCP    mailsrv:9599       TIME_WAIT
  TCP    mailsrv:9600       ESTABLISHED
  TCP    mailsrv:9601       ESTABLISHED
  TCP    mailsrv:9602       ESTABLISHED
  TCP    mailsrv:9603       ESTABLISHED
  TCP    mailsrv:9604       ESTABLISHED
  TCP    mailsrv:9605       ESTABLISHED

I don't see the same behavior at other sites that are set up the same way.

Rob Williams Commented:
These are outgoing connections which are allowed by default.
HTTP would imply, but does not necessarily mean, that they are connections to a web site or service.
Might this be the case? is in Fremont, California, and in Wichita, but no idea what they are.
They could also be a service that updates, like a DDNS service, or even Windows updates, but I am not sure what ports those use.
As for ____1e100.net see http://www.pcmech.com/article/the-mysterious-1e100-net/
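
One rough way to tell outgoing from incoming connections in such a listing: if the local port is one the server listens on (25, 443, 3389 here), the connection was most likely opened from outside; otherwise the server itself opened it. A minimal Python sketch of that rule of thumb follows. The names `port_of` and `direction` are invented for illustration, and the endpoints are assumed to be in numeric `host:port` form (as `netstat -an` prints them), not service names like `http`.

```python
def port_of(endpoint: str) -> int:
    """Extract the numeric port from a 'host:port' string."""
    return int(endpoint.rsplit(":", 1)[1])


def direction(local: str, listening_ports: set[int]) -> str:
    """Heuristic: a local port the server listens on suggests an inbound connection."""
    return "inbound" if port_of(local) in listening_ports else "outbound"


if __name__ == "__main__":
    listening = {25, 443, 3389}  # ports the thread says are open to the Internet
    # Placeholder addresses, not taken from the server.
    for local in ("192.0.2.10:443", "192.0.2.10:9585"):
        print(local, direction(local, listening))
```

It is only a heuristic: a service that both listens on and connects from the same port would be misclassified.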
vhcg (Author) Commented:

Ok, I had the server moved.  There was quite a bit of trouble with it today.

Old Config:
Internet <-> PIX 506e <-> Cisco SW #1 <-> Cisco SW #2 <-> SBS Server

New Config:
Internet <-> PIX 506e <-> Netgear "dumb" L2 switch <-> SBS Server

I wanted to eliminate any possibility of "funny stuff" happening on the Cisco Switches (a pair of 3560Gs).

The complete absence of anything in the Event Viewer is baffling and makes me think that whatever is happening is external to the server.  If services were randomly shutting off and connections were randomly being refused, I would expect to see at least *something* in the Event Viewer.  There has been absolutely nothing beyond routine Informational messages.


Rob Williams Commented:
Can you confirm the following:
-Server has 1 NIC. Second is disabled, not just disconnected or not configured.
-When the problem exists, the server and clients cannot connect to the Internet, but local clients can still access file shares on the SBS
-When the problem exists, external connections such as incoming e-mail and RDP sessions are dropped
-When the problem exists, can the SBS or a client machine access a web page using the IP (bypassing DNS), such as Google?
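
The last check in the list (reaching a site by IP versus by name) separates a DNS failure from a routing failure. Here is a small Python sketch of that decision, offered as an illustration only: the function names are invented, and the host name and IP in the example are placeholders to be replaced with a known-good web server.

```python
import socket


def can_resolve(name: str) -> bool:
    """True if the local resolver returns any address for the name."""
    try:
        socket.getaddrinfo(name, 80)
        return True
    except socket.gaierror:
        return False


def can_connect(ip: str, port: int = 80, timeout: float = 5.0) -> bool:
    """True if a TCP connection to ip:port succeeds, with no DNS involved."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def diagnose(resolves: bool, connects_by_ip: bool) -> str:
    """Combine the two observations into a rough verdict."""
    if connects_by_ip and resolves:
        return "internet and DNS both working"
    if connects_by_ip:
        return "routing works, DNS is failing"
    if resolves:
        return "DNS answers (possibly cached) but no route out"
    return "no route out; DNS failure is likely a side effect"


if __name__ == "__main__":
    # Placeholders: use a real site name and its known IP address.
    print(diagnose(can_resolve("www.example.com"), can_connect("192.0.2.1")))
```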

This actually sounds like the server has two NICs, configured with two gateways, and one not connected. Windows will switch to the gateway with the lower metric if the first is lost for even a split second, but it does not switch back.
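
To illustrate why a stray second default gateway matters, here is a toy Python model of the "lowest metric wins" rule for default routes. It is a simplification under stated assumptions: the `Route` type, the addresses, and the metrics are all made up, and the sketch does not model how Windows detects a dead gateway or decides when to switch.

```python
from dataclasses import dataclass


@dataclass
class Route:
    gateway: str
    interface: str
    metric: int


def pick_default(routes: list[Route]) -> Route:
    """Among candidate default routes (0.0.0.0/0), prefer the lowest metric."""
    return min(routes, key=lambda r: r.metric)


if __name__ == "__main__":
    # Hypothetical table: a working gateway plus a leftover entry on an unused NIC.
    routes = [
        Route(gateway="192.0.2.1", interface="LAN", metric=20),
        Route(gateway="192.0.2.254", interface="unused NIC", metric=10),
    ]
    chosen = pick_default(routes)
    print(f"default traffic would go via {chosen.gateway} ({chosen.interface})")
```

In this made-up table the leftover entry wins, so outbound traffic would head for a gateway that is not actually reachable, while local traffic carries on normally.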
vhcg (Author) Commented:


All of what you listed is confirmed.  When the problem is present, the client machines can talk to the SBS server and it can talk back, but the SBS server cannot do anything beyond communication on the local network.  It's like the default gateway is suddenly gone/forgotten.

However, unless there is something hidden somewhere, I do not see another default gateway.

I've attached some debugging commands from the command prompt.

The SBS server has IP address  There is a Server 2003 machine on the network with IP address  The PIX is the default gateway with IP address

The server is currently inaccessible, but when it does come back up, I can put WireShark on it and do some packet captures.


C:\WINDOWS\system32\drivers\etc>ipconfig /all

Windows IP Configuration

   Host Name . . . . . . . . . . . . : xxxxxx
   Primary Dns Suffix  . . . . . . . : abcdefg.local
   Node Type . . . . . . . . . . . . : Unknown
   IP Routing Enabled. . . . . . . . : No
   WINS Proxy Enabled. . . . . . . . : Yes
   DNS Suffix Search List. . . . . . : abcdefg.local

Ethernet adapter Server Local Area Connection:

   Connection-specific DNS Suffix  . :
   Description . . . . . . . . . . . : Broadcom NetXtreme Gigabit Ethernet
   Physical Address. . . . . . . . . : 00-C0-9F-BA-C5-4E
   DHCP Enabled. . . . . . . . . . . : No
   IP Address. . . . . . . . . . . . :
   Subnet Mask . . . . . . . . . . . :
   Default Gateway . . . . . . . . . :
   DNS Servers . . . . . . . . . . . :
   Primary WINS Server . . . . . . . :

C:\WINDOWS\system32\drivers\etc>netstat -r

IPv4 Route Table
Interface List
0x1 ........................... MS TCP Loopback interface
0x10003 ...00 c0 9f ba c5 4e ...... Broadcom NetXtreme Gigabit Ethernet
Active Routes:
Network Destination        Netmask          Gateway       Interface  Metric
      1      1     20     20     20     20      1
Default Gateway:
Persistent Routes:


Pinging with 32 bytes of data:

Request timed out.

Ping statistics for
    Packets: Sent = 1, Received = 0, Lost = 1 (100% loss),

Tracing route to over a maximum of 30 hops

  1     *        *        *     Request timed out.
  2     *        *        *     Request timed out.
  3     *        *        *     Request timed out.
  4     *        *        *     Request timed out.
  5     *        *        *     Request timed out.
  6     *        *        *     Request timed out.
  7     *        *        *     Request timed out.
  8     *        *        *     Request timed out.
  9     *        *        *     Request timed out.
 10     *        *        *     Request timed out.
 11     *        *     ^C


Pinging with 32 bytes of data:

Reply from bytes=32 time<1ms TTL=128
Reply from bytes=32 time<1ms TTL=128
Reply from bytes=32 time<1ms TTL=128
Reply from bytes=32 time<1ms TTL=128

Ping statistics for
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 0ms, Average = 0ms


Pinging with 32 bytes of data:

Reply from bytes=32 time<1ms TTL=255
Reply from bytes=32 time<1ms TTL=255
Reply from bytes=32 time<1ms TTL=255
Reply from bytes=32 time=1ms TTL=255

Ping statistics for
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 1ms, Average = 0ms


Rob Williams Commented:
That certainly all looks OK.
Wireshark may tell you something but it takes a while to filter out what you are looking for, especially when you don't know what you are looking for :-)

If I were to stick on the default gateway theory..
Try opening the registry editor and go to:
Look through the settings in each of the subfolders and see if there is a gateway listed that is wrong. Ignore any that are or ASCII. I don't know that I would recommend deleting at this point, but if present, that subfolder may be a remnant of an old/ghost NIC.

vhcg (Author) Commented:

RobWill:  The registry entries check out ok.

All was fine last night, but trouble has started again this morning.  I checked the firewall, did packet captures, checked the DSL modem, etc, etc.

Rebooting the server has temporarily fixed it.  I don't know how long this will last since the server was rebooted at 8am this morning to apply a couple of patches (IE8, Latest 'Malicious Software Removal' tool).  

It sure feels like an intermittent loss of the default gateway, but the system shows otherwise.  I've disabled 'Routing and Remote Access' to remove it from the picture.  No dice there.  Could IPSec be silently 'intervening' at times ?

When the system is not able to get to the outside (or let the outside in), is there a particular thing I can try while I have Wireshark running?  So far, capturing pings, traceroutes, etc., just shows that a packet leaves the server and does not get an answer.


Rob Williams Commented:
You mention RRAS. Do you use the VPN feature of RRAS? If so I wonder if it could be using a static address pool that conflicts somehow, or I have seen DNS "confused" by an incorrect VPN configuration while a user is connected.

I don't know what you would look for with wireshark while the connection is lost. Usually you take a capture and then start filtering out the traffic you can confirm is acceptable and look at what is left. But it may not show anything.

Another completely different thought is the possibility of an outside denial of service attack. The most common occurs when you allow/reply to ICMP (ping requests) from the Internet. This is off by default with the PIX, but might that be enabled?
vhcg (Author) Commented:

I think I got it.

I've been checking the configs of all of the network devices and when I looked at the status of the DSL modem (which also has a built-in 4-port switch), I saw *two* ports active.  Only one (the PIX) should be active.

The modem has a label on it with the IP address, netmask, and default gateway entries.

Someone jacked into one of the ports and set himself up with a static IP...the SAME IP as the public IP of the SBS Server.

I thought I had this licked numerous times before, but I'm fairly certain that I've got it for sure this time.  

I'll report back in a day.

Thanks to all who responded (especially you, RobWill for hanging in there).

Rob Williams Commented:
>>"Someone jacked into one of the ports and setup himself up with a static IP...the SAME IP as the public IP of the SBS Server."
That would definitely create havoc. It can lock up the modem if not the router.

I have seen this at universities with students trying to use protected networks. They get an IP off an allowed machine and clone it. Better yet they often clone the MAC. Switches love that :-)

Thanks for updating. Hopefully you are on to something.
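
A duplicate IP like this shows up as the same address being claimed by more than one MAC. Given a list of (IP, MAC) sightings, for example collected from ARP tables or a packet capture, the conflict can be flagged with a few lines of Python. This is a sketch only: the function name `find_conflicts` and all addresses below are made up for illustration.

```python
from collections import defaultdict


def find_conflicts(observations):
    """Map each IP seen with more than one MAC to the sorted list of those MACs."""
    seen = defaultdict(set)
    for ip, mac in observations:
        # Normalise case so the same MAC is not counted twice.
        seen[ip].add(mac.lower())
    return {ip: sorted(macs) for ip, macs in seen.items() if len(macs) > 1}


if __name__ == "__main__":
    # Made-up sightings: two devices answering for the same public IP.
    sightings = [
        ("203.0.113.5", "00-11-22-33-44-55"),
        ("203.0.113.6", "00-11-22-33-44-66"),
        ("203.0.113.5", "AA-BB-CC-DD-EE-FF"),
    ]
    for ip, macs in find_conflicts(sightings).items():
        print(f"{ip} is claimed by {len(macs)} MACs: {', '.join(macs)}")
```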
Question has a verified solution.
