We look after a number of small business networks, and from time to time we get reports of network connectivity issues. Recently we have had a network where internet access randomly stops and starts on a handful of PCs that are not patched into the same switch. We have found, however, that removing one switch in particular seems to fix the issue, so we think that's the cause. It does make me think, though, that we could use a very easy-to-use network diagnosis tool, be it hardware or software, to help with this sort of thing. As we are not advanced network engineers, we'd like something that is reasonably easy to use and whose findings are easy to interpret. Any recommendations? Thanks
I would suggest looking into "Wireshark", which is a free network diagnostic tool for monitoring traffic on your LAN. When the network stops, you can see what is trying to communicate and determine whether it's an outside source or an internal device. It shows a lot of detail, down to ports and services, if need be.
Also, does removing that one switch from the network correct the problem completely? Can you tell whether it's still trying to communicate or whether it locks up when this happens? It could be creating broadcast storms or loops, or it may be faulty altogether. Is there any special configuration on this one switch? Is it a plain layer 2 switch (typical), or a managed/layer 3 switch (e.g. with VLAN assignments)?
Gavin Reid
ASKER
Hi James,
Unfortunately I've tried to use Wireshark in the past and it completely baffles me; I just don't know how to interpret the results. I was kind of looking for something a little more basic that (for instance) scans the network, works out its basic structure (switches, access points, PCs, etc.) and then identifies possible trouble areas. 2 out of the 5 switches are managed (but not actually configured in any way) and the rest are just smart switches, all a mix of TP-Link and HP, nothing complex.
You might try tshark (the command-line version of Wireshark). Its output is easy to interpret.
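For example, to test the broadcast-storm theory without reading a raw capture, tshark can print just the source and destination MAC address of each frame, and a short script can count who is broadcasting. This is only a minimal sketch: it assumes tshark is installed, that the capture interface is called eth0 (a placeholder), and the function names are made up for illustration.

```python
import subprocess
from collections import Counter

BROADCAST = "ff:ff:ff:ff:ff:ff"

def count_broadcasts(lines):
    """Count broadcast frames per source MAC from 'eth.src<TAB>eth.dst' lines."""
    counts = Counter()
    for line in lines:
        parts = line.strip().split("\t")
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        src, dst = parts
        if dst.lower() == BROADCAST:
            counts[src.lower()] += 1
    return counts

def capture(interface="eth0", seconds=30):
    """Run tshark for a fixed time and return its output lines.

    The interface name is an assumption; capturing usually needs
    root or membership in a capture-enabled group.
    """
    cmd = ["tshark", "-i", interface, "-a", f"duration:{seconds}",
           "-T", "fields", "-e", "eth.src", "-e", "eth.dst"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()
```

Calling count_broadcasts(capture()).most_common(5) lists the five busiest broadcasters. A single source far ahead of the rest is worth a closer look.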
And, this type of problem (intermittent) is difficult to diagnose.
To catch these sorts of problems what I do is track a few local machine stats on each machine.
1) Local Packet Loss - using netstat -i and looking at RX-ERR + RX-DRP + TX-ERR + TX-DRP, sending notifications if percentages get high.
2) Router Config - using ethtool to ensure basic settings like speed + duplex are correct, sending notifications of any change.
3) Path Packet Loss - run mtr in line/report mode between different local + other machines looking for oddities, like sustained high packet loss, sending notifications if percentages get high.
These simple tests surface all manner of problems.
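As a rough illustration of test 1, here is a minimal sketch that parses netstat -i output and works out an error/drop percentage per interface. It assumes the Linux net-tools column layout (Iface, RX-OK, RX-ERR, and so on); the function names and the choice of denominator (good packets plus bad ones) are my own assumptions, not a standard.

```python
import subprocess

COUNTERS = ("RX-OK", "RX-ERR", "RX-DRP", "TX-OK", "TX-ERR", "TX-DRP")

def parse_netstat_i(text):
    """Return {interface: {counter: value}} from 'netstat -i' output."""
    rows = [ln.split() for ln in text.splitlines() if ln.strip()]
    header = None
    stats = {}
    for cols in rows:
        if cols[0] == "Iface":
            header = cols  # column names vary by version, so map by name
            continue
        if header is None or len(cols) != len(header):
            continue  # title line, or a row without usable statistics
        row = dict(zip(header, cols))
        try:
            stats[row["Iface"]] = {name: int(row[name]) for name in COUNTERS}
        except (KeyError, ValueError):
            continue
    return stats

def loss_percent(counters):
    """Errors plus drops as a percentage of all packets counted."""
    bad = (counters["RX-ERR"] + counters["RX-DRP"]
           + counters["TX-ERR"] + counters["TX-DRP"])
    total = counters["RX-OK"] + counters["TX-OK"] + bad
    return 100.0 * bad / total if total else 0.0

def read_netstat():
    """Run netstat -i on the local machine and parse the result."""
    out = subprocess.run(["netstat", "-i"], capture_output=True,
                         text=True, check=True)
    return parse_netstat_i(out.stdout)
```

The counters are cumulative since the interface came up, so for alerting it is more useful to compare two successive readings than to look at the raw totals.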
If you create a few simple CRON jobs to do this type of test every 5 minutes (saving data, for later graphing) you can turn up all manner of interesting data.
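A minimal sketch of the "save data for later" part: each run appends one timestamped row to a CSV file, and a helper finds the first sample above a threshold. The file layout, function names and the crontab path in the comment are all hypothetical.

```python
import csv
from datetime import datetime, timezone

# Hypothetical crontab entry to run a script like this every 5 minutes:
#   */5 * * * * /usr/bin/python3 /opt/netmon/sample.py

def append_sample(path, interface, loss_pct, now=None):
    """Append 'timestamp,interface,loss%' to the log and return the timestamp."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([stamp, interface, f"{loss_pct:.3f}"])
    return stamp

def first_bad_sample(path, threshold_pct):
    """Return (timestamp, interface) of the first row above the threshold, or None."""
    with open(path, newline="") as f:
        for stamp, interface, loss in csv.reader(f):
            if float(loss) > threshold_pct:
                return stamp, interface
    return None
```

With 5-minute samples, first_bad_sample gives the start of a problem to within one sampling interval, and the same CSV can be fed to any graphing tool.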
For example, some tech at one hosting company reset one of my router ports from 1G to 10M one day. I could then open a ticket + give the exact time (within 5 minutes) the problem started + the hosting company then reviewed its tech logs + found the problem + fixed it quickly.
If I hadn't noticed the problem + been able to point to the exact time the problem started, it would likely have been difficult to find + fix.
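A speed change like that can be caught by comparing ethtool output between runs. The following is a minimal sketch, assuming the usual "Speed: ..." and "Duplex: ..." lines in ethtool's output; the interface name eth0 and the function names are placeholders of mine.

```python
import subprocess

WATCHED = ("Speed", "Duplex")

def parse_ethtool(text):
    """Pick the watched 'Key: value' settings out of ethtool output."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key in WATCHED:
            settings[key] = value.strip()
    return settings

def changed_settings(previous, current):
    """Return {setting: (old, new)} for every watched setting that differs."""
    return {key: (previous.get(key), current.get(key))
            for key in WATCHED
            if previous.get(key) != current.get(key)}

def read_ethtool(interface="eth0"):
    """Run ethtool on one interface (name is an assumption) and parse it."""
    out = subprocess.run(["ethtool", interface], capture_output=True,
                         text=True, check=True)
    return parse_ethtool(out.stdout)
```

Storing the last result and sending a notification whenever changed_settings returns anything would flag a 1G to 10M drop on the next run.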
Sometimes simple tests work better than analyzing packets.
James Bunch
Absolutely, and if you have questions about Wireshark feel free to message me and I will try to help if possible. I am still learning a lot of it myself but have recently been looking into some of the "Self-Help" videos and things and may be able to offer a little assistance along the way. I'm sure we can narrow it down enough to make it worthwhile! =)
Best,
James
Gavin Reid
ASKER
Thanks all very much for your advice, much appreciated.