Best Way to Determine the Source of Intermittent Network Problems

Posted on 2013-06-01
Medium Priority
Last Modified: 2013-06-04
Hi All,

I have two windows servers in a remote data center, and there are issues connecting to them.

The Servers have DNS, Websites, MySQL and Email Services on them.

Access to websites is NOT a problem. This type of connection works perfectly.

MySQL seems to work flawlessly too.

DNS... I am not sure. It seems OK, but I would like some repeatable (not one-off) tests I can perform.

RDC. Occasionally I am disconnected - even if only briefly. Sometimes, I try to connect, and it fails instantly, then connects instantly next attempt.

Email... This is a MAJOR issue. SMTP and POP are not working every time, but also do not fail every time:

Desktop > Office Server > Problem Server = Works fine every time.
Desktop > Problem Server > Other Server = Works fine every time.
Desktop > ISP > Problem Server = Email goes missing..!
Desktop > Other Server > Problem Server = Delays, but only sometimes.

As this seems to have affected the two servers at the same time, I suspect this is a network-related issue. One server is 2003 SP1, the other is 2008 R2 SP1.

I have done a tracert, which shows no problems, but then this is an intermittent problem.

Does anyone have any tests I can run to see if I can find the problem?

Question by:G_H

Assisted Solution

IanNoble earned 2000 total points
ID: 39212817
If you have access to the firewalls and routers in between, check their logs for anything blocking email-related traffic (straight blocks or oversize rules), and check on the routers that NAT is working as you expect.

On Cisco, in enable mode, this would be

show log | inc smtp
show log | inc pop
show ip nat translations | inc IP ADDRESS OF EMAIL SERVER

Author Comment

ID: 39213155
I do not have access to any of the "network" hardware - these are dedicated servers in a large (well known) data center.

I am looking for tests I can run from various servers outside that data center which can show connection problems.

A great example would be:
ping -n 1800 ServerName
... This however works fine, and shows only 1 loss...

Is there another equivalent I can use? - Specifically on Mail Ports?


Accepted Solution

IanNoble earned 2000 total points
ID: 39213162
http://mxtoolbox.com/diagnostic.aspx is a good place to start.

If using ping, adding -l 2000 will help check that the routers are handling packet fragmentation correctly e.g.

ping -l 2000 servername

(Lower case L)

You can also use services such as www.site24x7.com that will routinely run email tests (from a basic check that SMTP is responding, to sending emails and logging in to retrieve them, to monitoring Exchange itself, depending on your requirements and how much you are willing to pay).
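
For a scripted equivalent of the repeated ping that targets the mail ports rather than ICMP, something along these lines could be run from several machines outside the data center. This is only a minimal sketch in Python: the function names `probe_port` and `run_probes` are made up for illustration, and ports 25 and 110 are assumed to be the SMTP and POP ports in use. It only checks that a TCP connection can be opened, not that the mail service behind it behaves correctly.

```python
import socket
import time


def probe_port(host, port, timeout=5.0):
    """Try one TCP connection; return (ok, elapsed_seconds_or_error_text)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError as exc:  # covers refused, unreachable and timed-out attempts
        return False, str(exc)


def run_probes(host, ports, attempts, interval=1.0, probe=probe_port):
    """Probe every port `attempts` times and return the failure count per port."""
    failures = {port: 0 for port in ports}
    for i in range(attempts):
        for port in ports:
            ok, detail = probe(host, port)
            if not ok:
                failures[port] += 1
                # Timestamp each failure so it can be matched against other logs.
                print(f"{time.strftime('%H:%M:%S')} port {port} FAILED: {detail}")
        if i < attempts - 1:
            time.sleep(interval)
    return failures
```

Calling `run_probes("ServerName", [25, 110], attempts=1800)` would roughly mirror `ping -n 1800 ServerName`, with one connection attempt per port per second. Comparing the failure counts from different source networks may help show whether the problem depends on the route taken.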


Expert Comment

ID: 39213912
Do you connect over a VPN connection?  If so this may be your issue.  It is possible the VPN tunnel comes down when it reaches the idle tunnel threshold and needs to be reestablished by creating 'interesting traffic' over the VPN.  Depending on the robustness of your app, this could possibly be the issue.

Also, when you are disconnected from RDP, is it after a certain amount of Idle time, or right during the session while you are interacting with the remote server?

Also, what does the mail queue on the server look like when emails go missing? Do you see them queuing up?

Author Comment

ID: 39214147

No - I do not use a VPN.

Remote Desktop will disconnect after random amounts of time, if only rarely. RDP will also fail (instantly fail) to connect but only once in 50 tries. It will then connect fine on a second attempt.

The Mail Queues are almost as expected. Little or nothing there. The only exception to this is that one external domain has about 12 items in the outbound SMTP queue.

Overall, the whole Mail system (in and out) is about 33% down on volume. Some mail is getting through, some is not.


MX Toolbox is where I started. I am getting random results. I will anonymise some images and post them next...

ping <servername> = 0% loss
ping -l 2000 <servername> = 100% loss

I am going to have a hunt around to see what the difference is, and why that should be, but in case I do not find an answer (and to help those who follow), why should this be?

I will also review the link you provided, and report back.

Thanks all,


Author Comment

ID: 39214156
These images are captures from MX Toolbox.

PLEASE NOTE: some of these show that the Server is an Open Relay... PLEASE IGNORE THIS. I added test@example.com as an email address, so that the connection would work.

[Image: MX Toolbox 1, 2013-05-31 03:54:45]
[Image: MX Toolbox 2, 2013-05-31 03:54:46]
[Image: MX Toolbox 3, 2013-05-31 04:04:54]
[Image: MX Toolbox 4, 2013-05-31 04:05:05]
[Image: MX Toolbox 5, 2013-05-31 04:11:03]

Author Comment

ID: 39214183

Site24x7 reports that the Server is down on its first test. I cannot see how to get at the report or log of where the failure happened.

Below is part of a DNS test from Site24x7 on the main domain name. Is this anything to worry about, and what does it mean?

[Image: Site 24x7 DNS Report]

GH

Assisted Solution

IanNoble earned 2000 total points
ID: 39214251
http://www.dnsstuff.com/tools has every test you could think of.

The main benefit of site24x7 is you can schedule repeat tests.

Ping -l 2000 not working means that fragmented packets are not getting across the network between your client and the server. A 2000-byte payload is larger than the usual 1500-byte Ethernet MTU, so it has to be split into fragments, and something along the path is not passing them. It's something you could take to your network support team as something that isn't working 100% of the time. However, in such a scenario you would typically see ping and some network traffic get through, but certain other types of traffic would not.

It could be the MTU settings on the router interfaces, or they are blocking certain types of ICMP traffic in the firewall.
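
To narrow down where the size limit sits, one option is to search for the largest ping payload that still gets a reply with the Don't Fragment flag set. The Python sketch below assumes it runs on a Windows client, where `ping -f` sets Don't Fragment and `-l` sets the payload size. The names `ping_df` and `largest_payload` are hypothetical, and treating "TTL=" in the output as the sign of a reply is a heuristic, not something guaranteed by ping.

```python
import subprocess


def ping_df(host, size):
    """Send one Windows ping with Don't Fragment set; True if a reply came back."""
    # -n 1: one echo request, -f: set Don't Fragment, -l: payload size in bytes
    result = subprocess.run(
        ["ping", "-n", "1", "-f", "-l", str(size), host],
        capture_output=True,
        text=True,
    )
    # Assumption: a genuine echo reply line contains "TTL=".
    return result.returncode == 0 and "TTL=" in result.stdout


def largest_payload(host, low=0, high=2000, probe=ping_df):
    """Binary-search the largest payload in [low, high] that `probe` accepts.

    Returns None if even the smallest size fails.
    """
    if not probe(host, low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the search always makes progress
        if probe(host, mid):
            low = mid
        else:
            high = mid - 1
    return low
```

On a path with a plain 1500-byte MTU, the expected answer would be 1472 (1500 minus 20 bytes of IP header and 8 bytes of ICMP header). A noticeably smaller result, or no reply at any size, would point towards MTU settings or ICMP filtering somewhere along the route. Because the original problem is intermittent, a single lost reply can mislead the search, so it is worth repeating the run a few times.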

Author Closing Comment

ID: 39218264
The tests described here, and especially the website www.site24x7.com, helped show that the traffic was being filtered.

The server had been under attack from external sources. Because of this, special measures had been put in place in the data center.

It would have been easier if the Data Center said something other than "Non"...

Thanks for the help,

