• Status: Solved
  • Priority: Medium
  • Security: Public

Best way to Determine source of intermittent Network Problems

Hi All,

I have two windows servers in a remote data center, and there are issues connecting to them.

The Servers have DNS, Websites, MySQL and Email Services on them.

Access to websites is NOT a problem. This type of connection works perfectly.

MySQL seems to work flawlessly too.

DNS... I am not sure. It seems OK, but I would like some repeatable (not one-off) tests I can perform.

RDC. Occasionally I am disconnected, even if only briefly. Sometimes I try to connect and it fails instantly, then connects instantly on the next attempt.

Email... This is a MAJOR issue. SMTP and POP are not working every time, but also do not fail every time:

Desktop > Office Server > Problem Server = Works fine every time.
Desktop > Problem Server > Other Server = Works fine every time.
Desktop > ISP > Problem Server = Email goes missing..!
Desktop > Other Server > Problem Server = Delays, but only sometimes.

As this seems to have affected the two servers at the same time, I suspect this is a network-related issue. One server is 2003 SP1, the other is 2008 R2 SP1.

I have done tracert, which shows no problems, but then this is an intermittent problem.

Does anyone have any tests I can run to see if I can find the problem?

GH
Asked by: G_H
 
IanNoble Commented:
If you have access to the firewalls and routers in between, check their logs for anything blocking email-related traffic (straight blocks or oversize rules), and check on the routers that NAT is working as you expect.

On Cisco, in enable mode, this would be:

show log | inc <IP address of email server>
show log | inc smtp
show log | inc pop
show ip nat translations | inc <IP address of email server>
 
G_H (Author) Commented:
I do not have access to any of the "network" hardware - these are dedicated servers in a large (well known) data center.

I am looking for tests I can run from various servers outside that data center which can show connection problems.

A great example would be:
ping -n 1800 ServerName
... This however works fine, and shows only 1 loss...

Is there another equivalent I can use? - Specifically on Mail Ports?

GH
 
IanNoble Commented:
http://mxtoolbox.com/diagnostic.aspx is a good place to start.

If using ping, adding -l 2000 will help check that the routers are handling packet fragmentation correctly e.g.

ping -l 2000 servername

(Lower case L)

You can also use services such as www.site24x7.com that will routinely run email tests, from checking that basic SMTP is responding, to sending emails and logging in to retrieve them, to monitoring Exchange itself, depending on your requirements and how much you are willing to pay.
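
For a mail-port equivalent of the long-running ping test, something along these lines could be left running from a machine outside the data center. This is only a rough sketch: the function names are made up for illustration, ports 25 and 110 are assumed to be where SMTP and POP listen, and the host name passed in would be your own server.

```python
import socket
import time
from datetime import datetime


def probe_port(host, port, timeout=5.0):
    """Try one TCP connection; return (ok, seconds_taken, detail)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            try:
                # SMTP and POP3 servers normally send a greeting line first.
                banner = conn.recv(256).decode("ascii", "replace").strip()
            except socket.timeout:
                banner = "(connected, no banner before timeout)"
            return True, time.monotonic() - start, banner
    except OSError as exc:
        # Refused, reset, unreachable, or timed out while connecting.
        return False, time.monotonic() - start, str(exc)


def run_probes(host, ports=(25, 110), attempts=1800, interval=1.0):
    """Probe each port once per interval and return the number of failures."""
    failures = 0
    for _ in range(attempts):
        for port in ports:
            ok, elapsed, detail = probe_port(host, port)
            if not ok:
                failures += 1
            status = "OK  " if ok else "FAIL"
            print(f"{datetime.now():%H:%M:%S} port {port} {status} {elapsed:.2f}s {detail}")
        time.sleep(interval)
    return failures
```

Calling run_probes("mail.example.com") (a placeholder name) prints one line per attempt, so intermittent failures show up with a timestamp. Adding 3389 to the ports tuple would cover the RDP drops in the same run.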
 
NE_Tech_Dude Commented:
Do you connect over a VPN connection?  If so this may be your issue.  It is possible the VPN tunnel comes down when it reaches the idle tunnel threshold and needs to be reestablished by creating 'interesting traffic' over the VPN.  Depending on the robustness of your app, this could possibly be the issue.

Also, when you are disconnected from RDP, is it after a certain amount of Idle time, or right during the session while you are interacting with the remote server?

Also, what does the mail queue on the server look like when you have emails go missing, do you see them queuing up?
 
G_H (Author) Commented:
@NE_Tech_Dude

No - I do not use a VPN.

Remote Desktop will disconnect after random amounts of time, if only rarely. RDP will also fail (instantly fail) to connect but only once in 50 tries. It will then connect fine on a second attempt.

The Mail Queues are almost as expected. Little or nothing there. The only exception to this is that one external domain has about 12 items in the outbound SMTP queue.

Overall, the whole Mail system (in and out) is about 33% down on volume. Some mail is getting through, some is not.

@IanNoble

MX Toolbox is where I started. I am getting random results. I will anonymise some images and post them next...

ping <servername> = 0% loss
ping -l 2000 <servername> = 100% loss

I am going to have a hunt around to see what the difference is, but in case I do not find an answer (and to help those who follow), why should this be?

I will also review the link you provided, and report back.

Thanks all,

GH
 
G_H (Author) Commented:
These images are captures from MXToolBox.

PLEASE NOTE: some of these show that the server is an open relay... PLEASE IGNORE THIS. I added test@example.com as an email address, so that the connection would work.

[Image: MX Toolbox 1, 2013-05-31 03:54:45]
[Image: MX Toolbox 2, 2013-05-31 03:54:46]
[Image: MX Toolbox 3, 2013-05-31 04:04:54]
[Image: MX Toolbox 4, 2013-05-31 04:05:05]
[Image: MX Toolbox 5, 2013-05-31 04:11:03]
 
G_H (Author) Commented:
@IanNoble

Site24x7 reports that the server is down on its first test. I cannot see how to get at the report or log of where the failure happened.

Below is part of a DNS test from the Site24x7 on the main domain name. Is this anything to worry about / what does this mean..?

[Image: Site 24x7 DNS Report]

GH
 
IanNoble Commented:
http://www.dnsstuff.com/tools has every test you could think of.

The main benefit of site24x7 is you can schedule repeat tests.
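
In the same spirit, a repeat DNS test can be run locally by querying the server's DNS service directly over UDP port 53, rather than through a caching resolver. The following is a minimal sketch that hand-builds an A-record query; the function names are invented for this example, and the server address and domain name you pass in are your own.

```python
import socket
import struct
import time


def build_query(name, query_id):
    """Build a minimal DNS query packet asking for an A record."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def query_once(server, name, query_id, timeout=3.0):
    """Send one query over UDP; return (ok, seconds_taken, detail)."""
    start = time.monotonic()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name, query_id), (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError as exc:  # includes timeouts
            return False, time.monotonic() - start, str(exc)
    elapsed = time.monotonic() - start
    if len(reply) < 12:
        return False, elapsed, "short reply"
    reply_id, flags, _, answers = struct.unpack("!HHHH", reply[:8])
    rcode = flags & 0x000F
    ok = reply_id == query_id and rcode == 0 and answers > 0
    return ok, elapsed, f"rcode={rcode} answers={answers}"


def run_dns_checks(server, name, attempts=100, interval=1.0):
    """Repeat the query and return how many attempts failed."""
    failures = 0
    for i in range(attempts):
        ok, elapsed, detail = query_once(server, name, i & 0xFFFF)
        if not ok:
            failures += 1
        print(f"attempt {i + 1}: {'OK' if ok else 'FAIL'} {elapsed:.2f}s {detail}")
        time.sleep(interval)
    return failures
```

A run such as run_dns_checks("192.0.2.10", "example.com") (both placeholders) counts timeouts and error replies over many attempts, which a single lookup would not reveal.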

Ping -l 2000 not working means the network from your client to the server is not handling fragmented packets correctly. It's something you could take to your network support team as something that isn't working 100% of the time. In such a scenario you would typically see ping and some network traffic get through, but certain other types of traffic would not.

It could be the MTU settings on the router interfaces, or they are blocking certain types of ICMP traffic in the firewall.
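
To see why the default ping and ping -l 2000 can behave differently, here is the arithmetic as a small sketch. It assumes a typical 1500-byte Ethernet MTU, a 20-byte IPv4 header with no options, and an 8-byte ICMP header; the real path MTU may be lower, and the function name is just for illustration.

```python
def fragment_sizes(icmp_payload, mtu=1500, ip_header=20, icmp_header=8):
    """Return the IP payload size of each fragment for one echo request."""
    total = icmp_payload + icmp_header         # bytes the IP layer must carry
    per_fragment = (mtu - ip_header) // 8 * 8  # non-final fragments are multiples of 8
    sizes = []
    while total > per_fragment:
        sizes.append(per_fragment)
        total -= per_fragment
    sizes.append(total)
    return sizes


print(fragment_sizes(32))    # default Windows ping payload: one packet
print(fragment_sizes(2000))  # ping -l 2000: two fragments
```

Under those assumptions the default ping fits in a single packet, while -l 2000 has to be split in two, so anything on the path that drops fragments (or large ICMP) breaks only the second case. On Windows, ping -f -l 1472 servername sets Don't Fragment and should be the largest payload that still fits a 1500-byte MTU.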
 
G_H (Author) Commented:
The tests described here, and especially the website www.site24x7.com, helped show that the traffic was being filtered.

The server had been under attack from external sources. Because of this, special measures had been put in place in the data center.

It would have been easier if the Data Center said something other than "Non"...

Thanks for the help,

GH