hachemp

Tracking Down Network Timeout Issues - Need some good tools/analyzers

Can anyone recommend some good tools or even just a methodology for tracking down intermittent timeouts/delays on a network?  I have a client who complains of timeouts and slowness when loading web pages.  They'll wait a moment, refresh the page, and then it will load just fine.  It seems to happen across all web pages and from most if not all of their machines.  I've tested their bandwidth and they're getting their contracted rate on download and upload speeds, which should be more than enough for their size.

They also complain about losing connection to some internal network shares, so I believe the issue is probably internal.  Their switches are not optimally cabled, and they have a larger-than-I-would-recommend broadcast domain (one flat network using a /23 subnet mask), so I'm thinking these issues could be caused by broadcast storms, exacerbated by the less-than-optimal cabling (switch to switch to switch instead of hub and spoke).

I'm having a hard time replicating the issue when I go in to test, due to its intermittent nature, but I have seen it firsthand.  I'm wondering if anyone can recommend some good software (preferably free) that I can use to analyze their network to see if they're getting an inordinate amount of broadcast traffic, or that could give me any other clues on how to track this issue down.  I've used Wireshark before and could probably capture with it and then filter just the broadcast traffic, but I'm not sure how to tell how much is too much.  Any advice on any of this is highly appreciated.
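For a sense of scale, here is a quick sketch of how many hosts can share a single /23 broadcast domain. The 10.1.0.0 prefix is an assumption for illustration only; the mask length is the only part taken from the description above.

```python
import ipaddress

# Hypothetical addressing: only the /23 mask comes from the description.
net = ipaddress.ip_network("10.1.0.0/23")

# Subtract the network and broadcast addresses to get usable hosts.
usable_hosts = net.num_addresses - 2

print(net.netmask, usable_hosts)
```

Every one of those hosts receives every broadcast frame sent on the segment, which is why the size of the flat network matters here.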
SOLUTION
Rob Williams
hachemp

ASKER

Thanks for the response.  I have not checked any of the physical cabling yet, but I have a Fluke so I may give that a shot.  In my experience, if there's a problem in the wiring it's either all or nothing (not all of the time, but mostly), and I've conducted multiple ping tests through all of the major switches and am getting consistent 1ms ping times all around.  I'm sure it could still be cabling regardless, but it just doesn't seem likely to me given the circumstances.

I had already checked DNS...their DHCP server is handing out two internal DNS servers.  However, those servers were set up with forwarders to external DNS servers from a different ISP than what they have currently.  I didn't like that, so I pulled those out and am just using root hints on both now...still waiting to see if that made any difference.
Which Fluke do you have?

The DNS changes sound good. They may help with browsing, but not with internal file access.
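To see whether name resolution is still part of the intermittent delay after the change, something like the following could be left running on a client machine. It is only a sketch: `time_lookups`, the `slow_after` threshold of one second, and the status labels are names and values made up for this example, and the operating system's resolver cache may hide slow lookups for names that were resolved recently.

```python
import socket
import time

def time_lookups(names, resolve=socket.getaddrinfo, slow_after=1.0):
    """Time one lookup per name; return (name, seconds, status) tuples."""
    results = []
    for name in names:
        start = time.monotonic()
        try:
            resolve(name, 80)
            status = "ok"
        except OSError:  # socket.gaierror is a subclass of OSError
            status = "failed"
        elapsed = time.monotonic() - start
        # Anything that succeeds but takes longer than the
        # (assumed) threshold is reported as slow.
        if status == "ok" and elapsed > slow_after:
            status = "slow"
        results.append((name, elapsed, status))
    return results
```

Calling it every few seconds with a handful of host names and logging the results with a timestamp gives something to line up against the times users report the slowness.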
What type of switches?  Are they managed?

Anything in the logs?

Can you set up a syslogd server and have the switches forward their logs to it?

If Cisco, do they have portfast enabled everywhere that does not need to carry multiple tagged VLANs?
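If no syslog server is handy, a throwaway UDP listener is enough to collect switch logs for a few days. The sketch below assumes the switches send classic syslog over UDP port 514 with a leading `<PRI>` value; `decode_pri`, `serve`, and the `switches.log` file name are made up for this example, and binding to port 514 usually needs administrator rights.

```python
import re
import socket

SEVERITIES = ["emerg", "alert", "crit", "err",
              "warning", "notice", "info", "debug"]
PRI_RE = re.compile(r"^<(\d{1,3})>")

def decode_pri(message):
    """Split a leading <PRI> into (facility, severity); None if absent."""
    m = PRI_RE.match(message)
    if not m:
        return None
    pri = int(m.group(1))
    # PRI = facility * 8 + severity
    return pri // 8, SEVERITIES[pri % 8]

def serve(host="0.0.0.0", port=514, logfile="switches.log"):
    """Append each received datagram to logfile with sender and severity."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    with open(logfile, "a", encoding="utf-8") as out:
        while True:
            data, (sender, _) = sock.recvfrom(4096)
            text = data.decode("utf-8", errors="replace").rstrip()
            out.write(f"{sender} {decode_pri(text)} {text}\n")
            out.flush()
```

Calling `serve()` on a PC and pointing the switches at that PC's address is all it takes; a real syslog daemon is the better long-term choice.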
To add to my previous comment/question about your Fluke meter: the meter needs to be a Fluke cable certification meter such as a DSP or DTX series.  Other network tools such as the Fluke NetTool, CableIQ, MicroScanner, or even the LAN tools cannot do a cable certification (i.e. a full test).  Generally only quality cable installers and large campuses own these.
The easiest way to accomplish this level of diagnostics would be to configure a SPAN port on the most central switch and set up Wireshark to capture to a local laptop or PC. Keep in mind that this will be a lot of information, so be sure you have a drive with sufficient storage space. You can then use the expert composite analysis to see any issues in the capture.
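Once a capture exists, the broadcast load can be put into numbers. The sketch below assumes the capture has been exported to text with one frame per line holding the epoch timestamp and the destination MAC, for example with `tshark -r capture.pcap -T fields -e frame.time_epoch -e eth.dst`; the file name and the helper names `broadcast_per_second` and `summarize` are made up for this example.

```python
from collections import Counter

BROADCAST = "ff:ff:ff:ff:ff:ff"

def broadcast_per_second(lines):
    """Count broadcast frames per whole second from 'epoch dst_mac' lines."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed rows
        epoch, dst = parts
        if dst.lower() == BROADCAST:
            counts[int(float(epoch))] += 1
    return counts

def summarize(counts):
    """Return (peak, average) frames per second over seconds with broadcasts."""
    if not counts:
        return 0, 0.0
    return max(counts.values()), sum(counts.values()) / len(counts)
```

There is no single threshold that marks "too much". The peak is more useful when compared against a capture taken while the network is behaving, and against the times users report the slowdowns.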
hachemp

ASKER

Thanks for all of the comments guys.  Yeah RobWill unfortunately my Fluke is not that cool (NetTool Series II)...it will do wiremapping but not cable certification.

They have a few Enterasys switches (B5G124-48P2) and a Dell PowerConnect 5212 as their central switch.  Unfortunately they have absolutely no idea what the passwords are to any of them.  I have quite a bit of experience with the Dells but none with Enterasys.  At some point I'm probably going to have to reset them all to default so I can get into the management...not ideal but I don't seem to have much choice.

Fritz, thanks for the tip on Wireshark.  I have used it a few times but have never used the expert analysis.  I can't set up a span port without being able to get into the switches but I can at least run WS from another computer and see what kind of traffic is on the network.  I'll post back once I have done that.  I'll make sure to spread the points around as best I can for all who are contributing...and thanks again.
ASKER CERTIFIED SOLUTION
hachemp

ASKER

Great stuff, Rick, and thanks much for the commands.  Do you happen to know if that button on the back will clear the config, or just reset the password?
That just resets the password.
hachemp

ASKER

Excellent, thank you sir.
hachemp

ASKER

Hello, I was finally able to go onsite and touch those Enterasys switches.  They had a B5G124-24P2 that I hit the password reset button on, and it reset to 'admin' and a blank password...I had no problems logging in.  However, they have two B5G124-48P2 switches connected with a stacking cable.  On one of them, connecting to the console, all I get is the following prompt:

(Unit 1)>

I don't seem to be able to do anything at all from there.  Pressing and holding the password reset button does nothing to the CLI or otherwise.  I'm guessing this is the slave switch in the stack?

When I connected the console cable to the other one, I did get prompted for username and password.  I tried the default, no go of course, so I reached back and pushed the password reset button.  I get this on the CLI:

<161>Feb 14 18:58:51      10.1.1.240-2 USER_MGR(1): 217 % Password Reset button has been pressed

All good, right?  So I tried again to log in with admin and no password and get this:

<165>Feb 14 18:59:29      10.1.1.240-2 USER_MGR(1): 218 % User:admin failed login from console

I tried holding this button in probably 20 different times, ranging from tapping it to holding it for probably 60 seconds.  Each time I get the password reset button message on the CLI, and each time I was unable to log in with what should be the default credentials.  I scoured the internet for any other default credentials for these switches and found none.  The only thing I can figure out is that someone must have disabled the admin user.  

So my question is...is there any way for me to get into this switch?  Any way to reenable the admin user or otherwise gain access?  Thanks in advance!
Is it configured to use some type of remote authentication, like RADIUS or TACACS+?

If so you might have to disconnect the network connection so it can't get to the server and then it may fail over to local authentication.

At least that is how it works on Cisco when you do remote authentication.
hachemp

ASKER

Thanks, that would make perfect sense.  I'm about 90% certain they don't have RADIUS or TACACS set up anywhere, but next time I'm there I'll certainly give that a shot.
I don't think I have ever had an issue using the password reset but I don't use stacking cables on mine. Might try breaking the stack for a minute, if you can, and see if it lets you reset either one. There is a login lockout if you use the wrong password 3 times. That gets unlocked again in 15 minutes if it happens.
hachemp

ASKER

Thanks for the tips...I'm gonna be onsite hopefully this week and I'll give that a shot.
hachemp

ASKER

Sorry, haven't been able to get back out there so closing the ticket and awarding points.  I appreciate the info.