• Status: Solved

LAN Slowing Down - Traffic not high

Recently, and more so yesterday and today, LAN performance has been slowing down.  Users have been complaining about our ERP being slow to respond, and beginning yesterday the overall network has been dogging it.  Accessing files is slow, and opening IE and loading the home page (or any other) is uncharacteristically slow.

I run The Dude and don't see any high traffic situations and the interface graphs on my firewall don't show any high bandwidth usage.

Other causes?  DNS? WINS? ??

Environment: W2k3 AD, 8 servers, 20 computers, 25 thin clients on MS Terminal Services
benhar
Asked:
 
Bill Bach (President) Commented:
Could be anything from a bad or failing cable to electromagnetic interference to a failing switch or NIC to DNS problems.  

To troubleshoot, first try a PING by NAME of the server from a workstation, and see how long it takes to get the pinging started.  Alternatively, try to PING a web server like www.novell.com.  If it takes a VERY long time to get started, then it could indicate a DNS problem.  Check your DNS server list on the workstations and verify that all of the servers in the list are actual DNS servers.  
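One way to verify that every address in the DNS server list really answers DNS queries is to send a single query straight to each server and time the answer. The following is a minimal Python sketch, not something from the original thread: the function names `build_dns_query` and `probe_dns_server`, the transaction ID, and the 2-second timeout are illustrative assumptions.

```python
import socket
import struct
import time


def build_dns_query(hostname, txid=0x1234):
    """Build a minimal DNS query packet asking for the A record of hostname."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # Question name: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = IN (Internet).
    return header + qname + struct.pack("!HH", 1, 1)


def probe_dns_server(server, hostname, timeout=2.0):
    """Return seconds until server answers on UDP 53, or None if it does not."""
    query = build_dns_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.perf_counter()
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:
            # Covers timeouts and "port unreachable" style errors alike.
            return None
        elapsed = time.perf_counter() - start
    # A genuine reply echoes the transaction ID in its first two bytes.
    return elapsed if reply[:2] == query[:2] else None
```

Calling `probe_dns_server("192.0.2.53", "www.novell.com")` (the address is a documentation placeholder; substitute each entry from the workstation's DNS list) returns the response time in seconds, or `None` when nothing answers within the timeout.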

Then, try FPING (www.kwakkelflap.com) and do a rapid-fire test like this:
    FPING Server -s 50000 -t 0 -n 1000
You should get very rapid responses at all times.  If you see failures or slow responses, then it could indicate a cabling, NIC, or switch problem.

Beyond that, the next best step is to get Wireshark installed (www.wireshark.org) and get a network trace of your "slow" ERP application running on a client.  See what the real response times are, and see if there are packet retransmissions or other such problems.  
 
benhar (Author) Commented:
Pinging servers resulted in the first reply being 14ms and the rest <1ms.

Pinging the web: Google at first averaged 46ms; after FPING-ing the servers, pinging Google resulted in 3 failures, with the fourth reply at 48ms.  Pinging Novell, MSN, even Playboy (tried it because it's not a site browsed to) resulted in failures.  The IPs were retrieved, but the pings failed, even after increasing the timeout interval.

Using FPING resulted in a few successes, then continued failures.  Then my AV displayed a DoS alert, so I disabled it.  Once I did that, FPING to my servers went through perfectly.

I have Wireshark, but am unsure how to configure or read it right to get the answers I'm looking for.

DNS might be to blame... where to start checking that?
 
Bill Bach (President) Commented:
The point of the initial PINGs was to test the response time of DNS for name resolution -- thus the "see how long it takes to get started" comment.  Did you stopwatch the amount of time it took to do the name resolution before the pinging started?  Was it noticeably slow, or did it seem quick?  The same thing can be done in a browser window by seeing how long it takes for the data on a page to start coming back.  I'd test Playboy again, but that's just me.  ;-)
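The stopwatch step can also be scripted. Below is a minimal Python sketch that times one name lookup through the operating system's resolver; the function name `time_name_lookup` and its `resolver` parameter are illustrative assumptions, and a name that was looked up recently may come back quickly from a local cache, so it is best tried with names that have not been resolved before.

```python
import socket
import time


def time_name_lookup(hostname, resolver=socket.gethostbyname):
    """Return (seconds, address) for one lookup; address is None on failure."""
    start = time.perf_counter()
    try:
        address = resolver(hostname)
    except OSError:
        # socket.gaierror (name could not be resolved) is a subclass of OSError.
        address = None
    return time.perf_counter() - start, address
```

For example, `time_name_lookup("www.novell.com")` returns the elapsed seconds together with the resolved address; a lookup that takes several seconds, or that returns `None`, points at name resolution rather than at the ping itself.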

Interesting that the AV solution flagged the DoS on FPING.  I wonder if it is examining ALL traffic, which could be responsible for the slowdown, too?  Try disabling the firewall and AV and see if the problems subside.

To run Wireshark, set up a default capture of ALL traffic, then run your PING tests. Stop the capture & review the packets.  You should see your DNS requests going out, and the replies coming back.  Are they handled in a single request?  Or does the first request fail and it has to try the secondary DNS server?  Post your resulting capture and we'll see if we can see anything...

benhar (Author) Commented:
Pinging internally it started pretty much right away.  When I would ping to the web, depending on the site, it took about 5 seconds or so for it to even come back with the "Pinging www.##.com [111.11.11.11]..." Then it might fail or work.

Google would ping right away, MSN took a bit to get the IP then fail, same with Playboy.  Then I tried an industry site our sales guys use and at first it failed completely and couldn't resolve the name, then when I tried again it took 5-7 seconds to get the IP then it was 100% successful.

Disabling the AV makes no difference.  It's been installed and configured the same for months so I'm pretty sure it's no factor.  I don't suspect the firewall is an issue because I don't see a lot of traffic hitting it (right now, 33kbps in/out on the inside, 16kbps in/out on the outside interfaces).

For Wireshark, do I need to span the ports on my switch, or am I just focusing on the traffic from the source computer?
 
benhar (Author) Commented:
Plus, how long is a sufficient capture? 1 minute, 5 minutes, etc.?
 
Bill Bach (President) Commented:
Sounds like definitely a DNS issue.   If that's the case, then we might not see anything on the pure client side, but let's try it:

1) Open a command prompt window and type the command: PING x (where x is a location you haven't pinged before), but don't press Enter yet.
2) Start Wireshark capturing all traffic.
3) Hit Enter in the command window.
4) As soon as it starts responding, click Stop in Wireshark and save the trace.   Let's see what we get...
 
benhar (Author) Commented:
Okay, after allowing pinging through my firewall, attached are two captures.  

The first was to www.whitehouse.gov (96.7.226.135).  When I first tried, it couldn't resolve the name.  I didn't save that capture.  When I tried again, it was successful.  I stopped the capture after the first reply.

The second capture was to www.kleenex.com (205.203.72.226).  This one I left run until it was done pinging but none were successful. (I had a packet of kleenex on my desk ;) ).
Capture1
Capture2
 
Bill Bach (President) Commented:
Can you try "renaming"  the PCAP file to .TXT and see if THAT will post?  Analyzing a text file capture is horribly painful.
 
benhar (Author) Commented:
Attempting...

BTW - my primary DNS is 172.16.200.12 and secondary is 172.16.200.11
Capture1.txt
Capture2.txt
 
Bill Bach (President) Commented:
Never mind -- the first trace was enough.  Open it in your own Wireshark and add a display filter (at the top of the screen) of "udp.port == 53" (without the quotes).

What you will see is this:

Time | Request
1.1563s | DNS Lookup to 172.16.200.12
2.1554s | DNS Lookup to 172.16.200.11
3.1553s | DNS Lookup to 172.16.200.11
5.1554s | DNS Lookup to 172.16.200.96
5.1557s | DNS Lookup to 172.16.200.11
7.7464s | DNS Reply from 172.16.200.96

Note that it takes almost 8 seconds to resolve the DNS name here.  I am guessing, but I would assume from this that the DNS servers at 172.16.200.12 and 172.16.200.11 are either running VERY slowly, they have a firewall blocking the requests, or the service is simply not running there.  Start by checking to see if these boxes are running DNS or not.
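The arithmetic behind that observation can be checked directly from the timestamps in the table. This is a small Python sketch; the names `DNS_EVENT_TIMES`, `total_delay`, and `retry_gaps` are illustrative, and the values are simply the six capture times listed above.

```python
# Capture timestamps (seconds) of the six DNS packets listed in the table,
# from the first lookup sent to the final reply received.
DNS_EVENT_TIMES = [1.1563, 2.1554, 3.1553, 5.1554, 5.1557, 7.7464]


def total_delay(times):
    """Seconds between the first DNS packet and the last one."""
    return round(times[-1] - times[0], 4)


def retry_gaps(times):
    """Seconds between each pair of consecutive DNS packets."""
    return [round(later - earlier, 4) for earlier, later in zip(times, times[1:])]


print("first lookup to final reply:", total_delay(DNS_EVENT_TIMES), "seconds")
print("gaps between packets:", retry_gaps(DNS_EVENT_TIMES))
```

The reply arrives almost 8 seconds into the capture, but measured from the first lookup the wait is about 6.6 seconds, with the client retrying roughly every one to two seconds in between.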

Try modifying your DNS entries to include 172.16.200.96 as the FIRST DNS server, and possibly add a few others, to see if that helps.

 
Bill Bach (President) Commented:
Another thing I hate about reading text trace files -- I misinterpreted the .96 address -- this was a reply, not a request.  See the trace dump file attached.

You'll note that eventually, the DNS reply does come back from both servers, but it takes a LONG time.  6.5 seconds is WAY too long.  

Now that we know this, the next step is to look at the DNS server, either .11 or .12.  What SHOULD happen is that the DNS server will get the request, determine that it doesn't know the answer, and forward the request upstream.  Something is wrong with THAT part of it.

We'll need to get Wireshark on THERE, and then see what we can see from there.  It could be that the DNS server list on THAT server is outdated and includes an old DNS server that is no longer responding.  

trace.bmp
 
benhar (Author) Commented:
*Highlight...Delete*  I was just responding that you mis-interpreted the capture.  The .96 machine is my box.

I will get Wireshark on the .12 server (primary DNS) first and see what happens.
 
benhar (Author) Commented:
Capture attached and filtered for "udp.port==53"

Pinged www.boeing.com.

Renamed .pcap file to txt.
DNScapture.txt
 
benhar (Author) Commented:
BTW - .3 is my web content filter on my firewall.
 
Bill Bach (President) Commented:
Did you look at this trace?  I particularly like the first two lines:

The DNS server at 200.11 is forwarding the request to the DNS server at 200.12.
The DNS server at 200.12 is forwarding the request to the DNS server at 200.11.

Sounds like a DNS loop.  It's not until they both fail that it looks like 200.12 finally sends the request to the upstream DNS server (192.42.93.30) and gets a rapid reply.

Review the two DNS servers and ensure that they don't point to each other when a failure occurs.  You can have ONE pointing to the other, but don't create a loop.
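A forwarder loop like this can be spotted on paper by writing down which server forwards to which and walking the chain. The sketch below does that in Python; the function name `find_forwarding_loops` and the dictionary layout are illustrative assumptions, and the forwarder lists have to be copied by hand from each DNS server's configuration.

```python
def find_forwarding_loops(forwarders):
    """Return each forwarding loop once, as a list of servers in hop order.

    forwarders maps a DNS server to the list of servers it forwards to.
    Loops are de-duplicated by the set of servers involved.
    """
    loops = []
    seen = set()

    def walk(server, path):
        for target in forwarders.get(server, []):
            if target in path:
                # The chain has come back to a server it already visited.
                cycle = path[path.index(target):]
                key = frozenset(cycle)
                if key not in seen:
                    seen.add(key)
                    loops.append(cycle)
            else:
                walk(target, path + [target])

    for start in forwarders:
        walk(start, [start])
    return loops


# The situation seen in the trace: each server forwards to the other.
looped = {
    "172.16.200.11": ["172.16.200.12"],
    "172.16.200.12": ["172.16.200.11"],
}
print(find_forwarding_loops(looped))
```

With only one server pointing at the other, for example `{"172.16.200.11": ["172.16.200.12"], "172.16.200.12": []}`, the function returns an empty list, which matches the "ONE pointing to the other" arrangement described above.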
 
benhar (Author) Commented:
"Who's on first?"

I thought that was rather strange.  I was fiddling with DNS the other day and maybe did that unintentionally.  

I would change that in the Forwarders, right?  Keep the secondary IP in the forwarder for the primary, but remove the primary IP from the secondary's forwarder?

Long way around to find a stupid mistake!
 
Bill Bach (President) Commented:
Yes, that would likely fix it, but it is still not the best idea.

For more of a "load balancing" solution, don't forward from one to the other at all -- allow both of them to send the request to the ISP and cache the results.  Otherwise, you're likely to have a failure of the primary DNS server result in a delay that is twice as long -- the secondary server won't be queried until the primary fails, and the secondary will try the primary again, and then have to wait for that request to fail, too.
 
benhar (Author) Commented:
So have no forwarders then?  I did remove the one and results were observed immediately, but will remove them both if that's the best practice.

Any further suggestions for optimum performance?
 
Bill Bach (President) Commented:
In both cases, forward to an outside DNS server.  Even better, forward to TWO possible servers, and swap them -- so that .11 sends to DNS1, then DNS2, and .12 sends to DNS2, then DNS1.  This will hopefully allow good response even after multiple failures.
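The swapped ordering can be written out mechanically. The following Python sketch builds a forwarder list for each internal server by rotating the same set of external servers; the function name `staggered_forwarders` and the labels `DNS1` and `DNS2` are placeholders, since the real addresses would come from the ISP.

```python
def staggered_forwarders(internal_servers, external_servers):
    """Give every internal server the same forwarders, rotated by one per server."""
    if not external_servers:
        raise ValueError("at least one external DNS server is required")
    plan = {}
    for index, server in enumerate(internal_servers):
        shift = index % len(external_servers)
        # Rotate so each internal server tries a different external server first.
        plan[server] = external_servers[shift:] + external_servers[:shift]
    return plan


plan = staggered_forwarders(
    ["172.16.200.11", "172.16.200.12"],
    ["DNS1", "DNS2"],  # placeholders for the two ISP-provided servers
)
for server, order in plan.items():
    print(server, "forwards to", order)
```

Here .11 gets DNS1 then DNS2 and .12 gets DNS2 then DNS1, and neither internal server appears in the other's forwarder list, so the loop cannot come back.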
 
benhar (Author) Commented:
Where do I get a list of external DNS servers?  From my ISP?
 
Bill Bach (President) Commented:
Yes, your ISP would usually give you two that are reasonably close to your link.
 
benhar (Author) Commented:
Thanks a whole bunch for the help and quick responses!  