How do I debug (and fix) an intermittent communications problem?

 I have a server running Windows Server 2003 R2.  This server is also running Exchange 2007 SP1.  The server runs fine for about 16 hours, then loses some of its communications abilities.

  It can still communicate with other servers on our network.  I can ping its default gateway and our VPN / firewall.  I can run a tracert to and it works.

  Trying to use a web browser from this system fails as does any attempt to send email out.  

  When the system is in this state, simply rebooting it fixes the problem until it happens again.

  I am not a guru at getting under the hood in Windows and diagnosing this kind of thing by looking at log entries.  I believe that the issue is with Windows itself since other forms of communication are affected.

  What I would like is a set of tests I can run in order to determine what is causing the blockage, then what I need to do to avoid it happening in the future.

So if you use a web browser / send emails, those fail..but you can still ping equipment on your network?

Is there anything at all in the event log?..either in application or system?
developmentguruPresidentAuthor Commented:
I cannot ping, it times out.  I can tracert it.

The application event log had the following error

Microsoft Exchange couldn't find a certificate that contains the domain name in the personal store on the local computer. Therefore, it is unable to support the STARTTLS SMTP verb for the connector Polydeck using with a FQDN parameter of If the connector's FQDN is not specified, the computer's FQDN is used. Verify the connector configuration and the installed certificates to make sure that there is a certificate with a domain name for that FQDN. If this certificate exists, run Enable-ExchangeCertificate -Services SMTP to make sure that the Microsoft Exchange Transport service has access to the certificate key.

For more information, see Help and Support Center at

As far as the system the only warnings or errors are related to printers.
themightydude Commented:
You can tracert, but ping times out. Does it resolve to an IP?

Network setup is:

Internet --> Firewall / VPN --> Switch --> Servers / computers?

What do you use for DNS servers?
developmentguruPresidentAuthor Commented:
TraceRt does resolve to an IP address.

You are correct on the network setup.

We have two internal DNS servers both windows 2003 server R2.

We have had the error I posted since I posted it and there was no associated shut down of communications.
How long has this been happening with not being able to use a web browser to get out?

Any recent changes / upgrades?

New DNS server entries etc etc?

Does ping resolve to an IP address?
developmentguruPresidentAuthor Commented:
--How long has this been happening with not being able to use a web browser to get out?--

We just found out about it in the last couple of days.  We have had instances where the server has acted this way over the last month or so, just not this frequently.

--Any recent changes / upgrades?--
  We did make a change to the server to activate a second NIC to tie it to our SAN.  We then moved some of our files from the server hard drives to the SAN.

--Does ping resolve to an IP address?--
Ping, from the server while it is in this state, times out.

We did a little digging and found out that our security software (Panda Security) had somehow been tied to the SAN's IP address.  I could see the constant activity being viewed as an attack and the security software shutting down communications.  We have since removed it from that IP address and the server has not shut down yet.  If it is still running, continuously, this time tomorrow then it is likely solved.

One thing you can still do to earn the points is to give me some tests to run (other than what I have mentioned).  Tests that would allow me to see if SMTP can get out, or any other protocols you can think of.  Tests that will show error results would be best.
Ahh, sounds like that might have caused some of it. Was it enabled on the other NIC as well?

A simple Exchange SMTP test you can try is to log in to the server, then telnet to a mail server from that server.

for example:  telnet 25

You should get some sort of welcome banner from the mail server you telnet into.

To make sure SMTP is working fine on your server...telnet to your mail server from any machine inside or outside of your network using the same method.

You should get some header with "Microsoft ESMTP MAIL service, Version xxxx ready at : xxxx"

all of that should be followed by 250 - xxx

If you get that back, then you know the SMTP service is working alright.

Have you tried turning on diagnostic logging for SMTP in your exchange server?

Also, you might try downloading the microsoft exchange can help to point out any potential or current problems as well.
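
If you would rather script the telnet banner test than type it by hand, here is a minimal sketch in Python. It only opens a TCP connection to port 25 and reads the first line the mail server sends. The function names and the 10 second timeout are my own choices, not anything Exchange provides, and a timeout or refused connection is reported as None rather than diagnosed further.

```python
import socket


def read_smtp_banner(host, port=25, timeout=10.0):
    """Connect to host:port and return the first line the server sends.

    Returns None if the connection fails or times out, which matches the
    symptom described in this thread when the server is in its bad state.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            data = conn.recv(1024)
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return None
    if not data:
        return None
    return data.decode("ascii", errors="replace").splitlines()[0]


def banner_is_ready(banner):
    """True when the banner starts with reply code 220 (service ready)."""
    return banner is not None and banner.startswith("220")
```

Call it as read_smtp_banner("mail.example.com"), where mail.example.com is a placeholder for whichever mail server you want to test against. Run it once from the problem server and once from a workstation so you can compare the two results.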

developmentguruPresidentAuthor Commented:
One NIC goes to our network.  The other NIC goes to the switches (fabric) that only goes to our SAN.  The only one with the security software running now is the one that goes to our network (as it should).

Can you give me an example of an external site to try the SMTP telnet with?  This would only be used to verify that the communication is being passed.

Do you have a simple test you use to check FTP?  Sorry if I sound like a newb on all of this, for some things I am.
Are you wanting to see if FTP is running on the server, or do you want to see if you can get out via FTP from the server to another one?

If you're just checking to see if you can get out, just open this up from your server.

If you can get to that then you can get out via FTP from your server.
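
A scripted version of the same "can I get out" check, as a rough sketch: it tries to open a TCP connection to the FTP, SMTP and HTTP ports on a host and tells you which ones connect and why the others failed. The port table and function names are assumptions of mine. Note that for FTP this only proves the control connection on port 21 opens, not that a file transfer would work.

```python
import socket

# Well-known TCP ports for the services discussed in this thread.
DEFAULT_PORTS = {"ftp": 21, "smtp": 25, "http": 80}


def probe_tcp(host, port, timeout=5.0):
    """Return (True, "") if a TCP connection opens, else (False, reason)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, ""
    except OSError as exc:
        # The exception type separates a refusal from a timeout.
        return False, f"{type(exc).__name__}: {exc}"


def probe_services(host, ports=DEFAULT_PORTS, timeout=5.0):
    """Probe each named port on host and return {name: (ok, reason)}."""
    return {name: probe_tcp(host, port, timeout) for name, port in ports.items()}
```

If every port times out against an outside host but the same call works against an inside host, that points at routing or a firewall rather than at any one service.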

In regards to an external site to try SMTP with:

telnet 25

then type in  ehlo

you should get something similar to:

220 Sending unsolicited commercial or bulk e-mail to Microsoft's computer network is prohibited. Other restrictions are found at http:// Violations will result in use of equipment located in California and other states. Thu, 9 Jul 2009 14:03:36 -0700
ehlo ( Hello []
250-SIZE 29696000
250 OK
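
To read a reply like the one above: in SMTP, every line of a multi-line reply except the last has a hyphen right after the three-digit code ("250-SIZE ..."), and the last line has a space ("250 OK"). A small sketch of a parser for that, in case you capture the telnet output and want to check it automatically. The function names are mine and this is not a full SMTP client.

```python
def parse_smtp_reply(lines):
    """Parse one SMTP reply into (code, [text lines]).

    Lines such as "250-SIZE 29696000" continue the reply; a line such as
    "250 OK" (space after the code) ends it.
    """
    code = None
    texts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if len(line) < 3 or not line[:3].isdigit():
            raise ValueError(f"not an SMTP reply line: {line!r}")
        line_code = int(line[:3])
        if code is None:
            code = line_code
        elif line_code != code:
            raise ValueError("reply code changed inside one reply")
        separator = line[3:4]
        texts.append(line[4:])
        if separator != "-":  # anything but a hyphen is treated as the final line
            break
    if code is None:
        raise ValueError("empty reply")
    return code, texts


def is_success(code):
    """2xx codes mean the server accepted the command."""
    return 200 <= code < 300
```

A 220 after connecting and a 250 after ehlo both count as success here, which is what you want to see from the problem server when outbound SMTP is healthy.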
developmentguruPresidentAuthor Commented:
Thanks for some of the testing tools, here is the latest.  The last time the server got into this state I tried some of the tests.  Web requests would not go out on the server but worked well from any other system we tested.  I could ping internal addresses but not external (from the server in question).  Pinging external (or any of the other tests) worked fine from a Windows XP system on the network.  I was able to send myself an email from hotmail and receive it in house, but SMTP from the server would not function going out.  Tracert timed out.  FTP tests worked from other systems, not the server.  I could do the telnet SMTP test to our exchange server in house and it worked.  Hopefully this info gives you a place to start...
developmentguruPresidentAuthor Commented:
I was also able to use Outlook Anywhere web access to get into emails.  It would allow me to send internally and queue anything I tried to send external.
hmmmm...this is very strange.

It's going to be something specific with that server then since I assume all your other workstations and what not use the same firewall and DNS servers as the server.

To sum up this problem: from that server you can talk to anything inside your network, but if you try to talk to a computer outside of your network from that server, you get nothing.

When it does this again...disable the security software on the network facing NIC..just for a few minutes.

I assume this server has a static ip correct?

You might also do a route print from the server before the problem happens, and then again when the problem is occurring.
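
To make the before/after comparison easier, you could save each route print to a text file (for example route print > before.txt) and diff the two. A minimal sketch using Python's standard difflib follows. The function name and the file names are just illustrations.

```python
import difflib


def diff_route_tables(before, after):
    """Return unified-diff lines between two saved `route print` outputs."""

    def normalise(text):
        # Collapse runs of whitespace and drop blank lines so that changes
        # in column alignment alone are not reported as differences.
        return [" ".join(line.split()) for line in text.splitlines() if line.strip()]

    return list(
        difflib.unified_diff(
            normalise(before),
            normalise(after),
            fromfile="before",
            tofile="after",
            lineterm="",
        )
    )
```

An empty result means the routing table did not change between the two captures. Lines starting with + or - are routes that appeared or disappeared, and those are the ones worth looking at.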

Also, if none of the above helps when this happens, try disabling then re-enabling the NIC instead of rebooting the server. For the hell of it, you might try resetting the TCP/IP stack...

 netsh int ip reset c:\resetlog.txt

Is there anything in the firewall logs about blocking outbound traffic from that server IP?
developmentguruPresidentAuthor Commented:
Thanks for all of your advice, I will add this to my knowledgebase as you have given me some new tricks to try.  I had someone from Panda Security remote in and look around.  What we found is this:  I was right to suspect the other NIC but wrong as to why.  The second NIC (that runs directly to the SAN switches) had the default gateway set up. For whatever mysterious MS reason this worked well for several weeks.  Just recently MS decided to try rerouting the network traffic through the SAN!  We removed the default gateway from Local Connection 2 and all went back to normal.  I will flag the posts you put on here that I found useful as the solution (it has worth to me in future similar situations).  I wrote this to be sure the fix was included for anyone trying to find it in the future.

Do not put a default gateway on a NIC unless you want traffic rerouted through it!  This is, I am sure, obvious to everyone who has been in networking any period of time.  It is not obvious to a programmer like myself.
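
For anyone who wants to check for this condition, here is a rough sketch that scans saved route print output for default routes (destination 0.0.0.0 with netmask 0.0.0.0) and flags the case where they point at more than one gateway. It assumes the five-column IPv4 layout that route print shows on Windows (destination, netmask, gateway, interface, metric). The function names are mine.

```python
import re

# Matches an IPv4 default route row from `route print`:
# destination 0.0.0.0, netmask 0.0.0.0, then gateway, interface, metric.
_DEFAULT_ROUTE = re.compile(
    r"^\s*0\.0\.0\.0\s+0\.0\.0\.0\s+(\S+)\s+(\S+)\s+(\d+)\s*$"
)


def find_default_routes(route_print_output):
    """Return a list of (gateway, interface, metric) for each default route."""
    routes = []
    for line in route_print_output.splitlines():
        match = _DEFAULT_ROUTE.match(line)
        if match:
            gateway, interface, metric = match.groups()
            routes.append((gateway, interface, int(metric)))
    return routes


def has_conflicting_gateways(routes):
    """True when the default routes point at more than one distinct gateway."""
    return len({gateway for gateway, _, _ in routes}) > 1
```

Windows keeps one routing table for the whole machine, so a default gateway entered on any NIC adds another 0.0.0.0 route that competes with the real one. If has_conflicting_gateways returns True on a server with a SAN-only NIC, the gateway on that NIC is the first thing to remove.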
Glad you got it figured out.

That actually is new information to me as well...I would have assumed different default gateways on 2 or more nics would not have affected anything. Especially since one is on one network, and the other on a different network.
