Solved

Best way to Determine source of intermittent Network Problems

Posted on 2013-06-01
Last Modified: 2013-06-04
Hi All,

I have two Windows servers in a remote data center, and there are issues connecting to them.

The Servers have DNS, Websites, MySQL and Email Services on them.

Access to websites is NOT a problem. This type of connection works perfectly.

MySQL seems to work flawlessly too.

DNS... I am not sure, it seems OK, would like some (non-single) tests I can perform.

RDC. Occasionally I am disconnected - even if only briefly. Sometimes, I try to connect, and it fails instantly, then connects instantly next attempt.

Email... This is a MAJOR issue. SMTP and POP are not working every time, but also do not fail every time:

Desktop > Office Server > Problem Server = Works fine every time.
Desktop > Problem Server > Other Server = Works fine every time.
Desktop > ISP > Problem Server = Email goes missing..!
Desktop > Other Server > Problem Server = Delays, but only sometimes.

As this seems to have affected the two servers at the same time, I suspect this is a network-related issue. One server is 2003 SP1; the other is 2008 R2 SP1.

I have done tracert, which shows no problems, but then this is an intermittent problem

Does anyone have any tests I can run to see if I can find the problem?

GH
Question by:G_H
9 Comments
 
LVL 2

Assisted Solution

by:IanNoble
IanNoble earned 500 total points
ID: 39212817
If you have access to the firewalls and routers in between, check their logs for anything blocking email-related traffic (straight blocks or oversize rules), and check on the routers that NAT is working as you expect.

On Cisco, in enable mode, this would be

show log | inc IP ADDRESS OF EMAIL SERVER
show log | inc smtp
show log | inc pop
show ip nat translations | inc IP ADDRESS OF EMAIL SERVER
 
LVL 11

Author Comment

by:G_H
ID: 39213155
I do not have access to any of the "network" hardware - these are dedicated servers in a large (well known) data center.

I am looking for tests I can run from various servers outside that data center which can show connection problems.

A great example would be:
ping -n 1800 ServerName
... This however works fine, and shows only 1 loss...

Is there another equivalent I can use? - Specifically on Mail Ports?

GH
 
LVL 2

Accepted Solution

by:IanNoble
IanNoble earned 500 total points
ID: 39213162
http://mxtoolbox.com/diagnostic.aspx is a good place to start.

If using ping, adding -l 2000 will help check that the routers are handling packet fragmentation correctly e.g.

ping -l 2000 servername

(Lower case L)

You can also use services such as www.site24x7.com that routinely run email tests (from checking that basic SMTP is responding, to sending emails and logging in to retrieve them, to monitoring Exchange itself, depending on your requirements and how much you are willing to pay).
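For a do-it-yourself version of the "basic SMTP is responding" check, something like the following Python sketch could be run from several machines outside the data center. It is only an illustration: the host name mail.example.com is a placeholder, the function names are made up for this example, and it assumes the mail services greet the client on connect (as SMTP and POP3 normally do on ports 25 and 110).

```python
import socket
import time


def probe_port(host, port, timeout=5.0):
    """Open one TCP connection and wait for the server greeting.

    Returns (ok, detail, seconds). detail is the greeting on success
    or a short error description on failure.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            banner = sock.recv(512).decode("ascii", errors="replace").strip()
        if not banner:
            return False, "connected but no greeting", time.monotonic() - start
        return True, banner, time.monotonic() - start
    except OSError as exc:  # refused, timed out, reset, unreachable...
        return False, f"{type(exc).__name__}: {exc}", time.monotonic() - start


def run_probes(host, port, attempts=10, interval=1.0, timeout=5.0):
    """Repeat probe_port so that intermittent failures have a chance to show."""
    results = []
    for i in range(attempts):
        results.append(probe_port(host, port, timeout))
        if interval and i < attempts - 1:
            time.sleep(interval)
    return results


def summarize(results):
    """Return (failures, total) for a list of probe results."""
    failures = sum(1 for ok, _, _ in results if not ok)
    return failures, len(results)


if __name__ == "__main__":
    HOST = "mail.example.com"  # placeholder: replace with the problem server
    for port in (25, 110):     # SMTP and POP3
        results = run_probes(HOST, port, attempts=20, interval=5.0)
        failures, total = summarize(results)
        print(f"port {port}: {failures}/{total} attempts failed")
        for ok, detail, secs in results:
            if not ok:
                print(f"  failed after {secs:.2f}s: {detail}")
```

Running it from more than one outside location and comparing the failure counts may help show whether the problem depends on where the connection comes from.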
 
LVL 2

Expert Comment

by:NE_Tech_Dude
ID: 39213912
Do you connect over a VPN connection?  If so this may be your issue.  It is possible the VPN tunnel comes down when it reaches the idle tunnel threshold and needs to be reestablished by creating 'interesting traffic' over the VPN.  Depending on the robustness of your app, this could possibly be the issue.

Also, when you are disconnected from RDP, is it after a certain amount of Idle time, or right during the session while you are interacting with the remote server?

Also, what does the mail queue on the server look like when you have emails go missing, do you see them queuing up?
 
LVL 11

Author Comment

by:G_H
ID: 39214147
@NE_Tech_Dude

No - I do not use a VPN.

Remote Desktop will disconnect after random amounts of time, if only rarely. RDP will also fail (instantly fail) to connect but only once in 50 tries. It will then connect fine on a second attempt.

The Mail Queues are almost as expected. Little or nothing there. The only exception to this is that one external domain has about 12 items in the outbound SMTP queue.

Overall, the whole Mail system (in and out) is about 33% down on volume. Some mail is getting through, some is not.

@IanNoble

MX Toolbox is where I started. I am getting random results. I will anonymise some images and post them next...

ping <servername> = 0% loss
ping -l 2000 <servername> = 100% loss

I am going to have a hunt around to see what the difference is, but in case I do not find an answer (and to help those who follow), why should this be?

I will also review the link you provided, and report back.

Thanks all,

GH
 
LVL 11

Author Comment

by:G_H
ID: 39214156
These Images are captures from MXToolBox.

PLEASE NOTE: some of these show that the Server is an Open Relay... PLEASE IGNORE THIS. I added test@example.com as an email address, so that the connection would work.

2013-05-31 03:54:45: MX Toolbox 1
2013-05-31 03:54:46: MX Toolbox 2
2013-05-31 04:04:54: MX Toolbox 3
2013-05-31 04:05:05: MX Toolbox 4
2013-05-31 04:11:03: MX Toolbox 5
 
LVL 11

Author Comment

by:G_H
ID: 39214183
@IanNoble

Site24x7 reports that the Server is down on its first test. I cannot see how to get at the report or log of where the failure happened.

Below is part of a DNS test from the Site24x7 on the main domain name. Is this anything to worry about / what does this mean..?

Site 24x7 DNS Report

GH
 
LVL 2

Assisted Solution

by:IanNoble
IanNoble earned 500 total points
ID: 39214251
http://www.dnsstuff.com/tools has every test you could think of.

The main benefit of site24x7 is you can schedule repeat tests.
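If you want a repeatable DNS check of your own, a rough Python sketch is below. It just resolves the same name many times and counts failures and differing answers. Note the assumptions: the function name and the host name in the usage section are placeholders, and the lookups go through whatever resolver the test machine uses (which may cache answers), so this exercises the whole lookup path rather than querying the problem server's DNS service directly.

```python
import socket
import time


def check_resolution(name, attempts=20, interval=0.5,
                     resolve=socket.getaddrinfo):
    """Resolve `name` repeatedly.

    Returns (failures, answers) where answers maps each distinct set of
    returned addresses to the number of times it was seen.
    `resolve` is injectable so the logic can be exercised offline.
    """
    failures = 0
    answers = {}
    for i in range(attempts):
        try:
            infos = resolve(name, None)
            addrs = frozenset(info[4][0] for info in infos)
            answers[addrs] = answers.get(addrs, 0) + 1
        except OSError:  # socket.gaierror is a subclass of OSError
            failures += 1
        if interval and i < attempts - 1:
            time.sleep(interval)
    return failures, answers


if __name__ == "__main__":
    NAME = "www.example.com"  # placeholder: a name hosted on the problem DNS
    failures, answers = check_resolution(NAME)
    print(f"{failures} lookups failed")
    for addrs, count in answers.items():
        print(f"{count} x {sorted(addrs)}")
```

More than one distinct answer set, or any failures at all, would be worth comparing against the results from the online tools.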

Ping -l 2000 not working means the network between your client and the server is not handling fragmented packets correctly. It's something you could take to your network support team as something that isn't working 100% of the time. In such a scenario you would typically see ping and some network traffic get through, but certain other types of traffic would not.

It could be the MTU settings on the router interfaces, or they are blocking certain types of ICMP traffic in the firewall.
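To see why a default ping passes while ping -l 2000 fails, it helps to work out the packet sizes. The Python sketch below does the arithmetic, assuming a standard Ethernet MTU of 1500 bytes, a 20-byte IPv4 header without options, and an 8-byte ICMP header. The real path MTU between client and server is not known from this thread, and the function name is just for illustration.

```python
def icmp_fragments(payload_len, mtu=1500, ip_header=20, icmp_header=8):
    """Return the on-wire IPv4 packet sizes for one ICMP echo request.

    payload_len is the value passed to ping -l. Fragment data must be a
    multiple of 8 bytes for every fragment except the last.
    """
    total = payload_len + icmp_header       # bytes carried inside IP
    per_fragment = (mtu - ip_header) // 8 * 8
    sizes = []
    remaining = total
    while remaining > 0:
        chunk = min(per_fragment, remaining)
        sizes.append(chunk + ip_header)     # each fragment has its own IP header
        remaining -= chunk
    return sizes


if __name__ == "__main__":
    print(icmp_fragments(32))    # default Windows ping size: [60]
    print(icmp_fragments(1472))  # largest unfragmented payload: [1500]
    print(icmp_fragments(2000))  # ping -l 2000: [1500, 548]
```

Under those assumptions the default 32-byte ping fits in a single small packet, while -l 2000 has to be split into two fragments, so anything on the path that drops fragments (or large ICMP) gives 0% loss in one test and 100% in the other. On Windows, ping -f -l 1472 servername sets the Don't Fragment flag and can be used to test the largest size that should pass unfragmented at a 1500-byte MTU.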
 
LVL 11

Author Closing Comment

by:G_H
ID: 39218264
The tests described here, and especially the website www.site24x7.com, helped show that the traffic was being filtered.

The server had been under attack from external sources. Because of this, special measures had been put in place in the data center.

It would have been easier if the Data Center said something other than "Non"...

Thanks for the help,

GH