Network going down intermittently

Hi,

I'm having problems with our office network at the moment in that client PCs intermittently lose connectivity to the network.  I've added an image to give a rough idea of how things are set up here.

What is happening is this:

- Some computers intermittently lose the ability to connect to servers (shared folders, email server, database server, network apps, etc.)

- Some but not all computers intermittently lose the ability to connect to our router (ping response "192.168.1.1: Destination host unreachable").  Router address is 192.168.0.254.  This is particularly inconvenient as we have 25 sites, all of which connect to our servers for email and a gift card system which is hosted here in head office (hence customers are inconvenienced also).

- Sometimes when users are able to ping the router, they are still unable to browse to anything except locally hosted webpages (unable to see Google but can see the intranet), yet they can still ping 'www.google.ie', get responses in decent time, and browse to the router's login page.  We use OpenDNS for filtered internet access and our ISP's DNS for unfiltered access; however, this problem is not restricted to one setup or the other.

NOTE: We do also, however, have our own local DNS server which is primarily used for Domain control, Active Directory, etc.  Nearly all computers in the office are currently set up with this as primary DNS.

Up to now, we did not use managed switches, however I had one in stock and purchased the second last week.  In an attempt to locate the issue I installed both to see if there was an issue with a particular network card and I have replaced two network cards that I found issues with.  I have rotated the connections so that I have now monitored all port connections at least for one day.  I have also rotated the 3rd switch (unmanaged) to make sure the switch itself was not causing the problem.  I now have two brand new switches and one used but working.

It has been suggested by a friend that there may have been a loop which I can now confidently dismiss as I have checked every port and small switch (used as wireless access points) on the network.

The other suggestion was that there may be a broadcast storm.  I have tried a few things to dismiss this idea within the limits of what I can try during working hours but so far with no luck.  Is there a good way to go about this?

Or is a broadcast storm out of the equation altogether?

Any other suggestions?

Thanks in advance...
-network.jpg
kierankenny Asked:

MightySW Commented:
Hi, first a question...  Do you have your port speed on the 1841 configured as default?  

What does a show int on the FE interface show?  Are you at liberty to post that here?

Also, you should ensure that the speed on the uplink ports JUST between the two 3Com devices is set to whatever the switch ports are capable of.  So rather than autonegotiate, set them to 100 / full duplex.  The same goes for the router.

Sounds to me like you are getting a bunch of CRC errors and possibly collisions on your ports.

As far as broadcasts go, you can use a dummy switch or hub, or set up one of the ports on one of the 3Com switches as a monitor port and run Wireshark on it to monitor the broadcasts between the switches.  If you want to reduce the number of broadcasts each host sees, then you should think about splitting your subnet from a default class B or A into separate VLANs based on roles.  This will create more, smaller broadcast domains, so each client receives fewer broadcasts and can communicate more effectively.
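If you export the raw frames from a capture, a few lines of Python can tally which source MACs send the most broadcasts. This is only a rough sketch: the function names and the sample frames are made up for illustration, and it assumes untagged Ethernet II frames where bytes 0-5 are the destination MAC and bytes 6-11 the source MAC.

```python
from collections import Counter

BROADCAST_MAC = b"\xff" * 6  # Ethernet broadcast destination ff:ff:ff:ff:ff:ff


def format_mac(raw: bytes) -> str:
    """Render six raw bytes as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)


def count_broadcast_sources(frames):
    """Count broadcast frames per source MAC.

    `frames` is an iterable of raw Ethernet frames (bytes objects).
    """
    counts = Counter()
    for frame in frames:
        if len(frame) < 14:  # shorter than an Ethernet header, skip it
            continue
        if frame[0:6] == BROADCAST_MAC:
            counts[format_mac(frame[6:12])] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical frames: one noisy host, one quiet host, some unicast traffic.
    noisy = bytes.fromhex("ffffffffffff" "001122334455" "0806") + b"\x00" * 28
    quiet = bytes.fromhex("ffffffffffff" "66778899aabb" "0806") + b"\x00" * 28
    unicast = bytes.fromhex("66778899aabb" "001122334455" "0800") + b"\x00" * 28
    sample = [noisy] * 50 + [quiet] * 2 + [unicast] * 10
    for mac, n in count_broadcast_sources(sample).most_common():
        print(mac, n)
```

A source that dominates the tally is a good candidate to unplug first and see whether the problem goes away.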

3Com switches are NOTORIOUS for going bad and sending out extraneous floods.  The term broadcast storm seems to refer to 'too many broadcasts' within the VLANs.  If your Netgear does not by default support some kind of native VLAN tagging then it will just broadcast to find the nearest ARP and keep a VERY LIMITED ARP table.

This is a pretty long check list, but just start with the port speeds.  More than likely that Netgear switch is causing some problems with broadcasts.  

In the future, you should think about splitting up into at least 2 VLANs (minus the default / Native VLAN) and creating sub-interfaces on the 1841 to route between the two VLANs.
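To get a feel for the addressing before touching the router, here is a small sketch using Python's standard `ipaddress` module. The 192.168.0.0/23 block, the VLAN IDs 10 and 20, and the choice of the last usable address as the gateway are all assumptions for illustration, not your actual layout.

```python
import ipaddress


def plan_vlan_subnets(network: str, vlan_ids, new_prefix: int):
    """Assign one equal-sized subnet of `network` to each VLAN ID, in order."""
    net = ipaddress.ip_network(network)
    subnets = list(net.subnets(new_prefix=new_prefix))
    vlan_ids = list(vlan_ids)
    if len(vlan_ids) > len(subnets):
        raise ValueError("not enough subnets for the requested VLANs")
    plan = {}
    for vlan_id, subnet in zip(vlan_ids, subnets):
        plan[vlan_id] = {
            "subnet": subnet,
            # Assumed convention: the router sub-interface takes the last usable address.
            "gateway": subnet.broadcast_address - 1,
            # Network and broadcast addresses are not usable by hosts.
            "usable_hosts": subnet.num_addresses - 2,
        }
    return plan


if __name__ == "__main__":
    # Hypothetical example: one /23 split into two /24s, one per VLAN.
    for vlan_id, info in plan_vlan_subnets("192.168.0.0/23", [10, 20], 24).items():
        print(vlan_id, info["subnet"], info["gateway"], info["usable_hosts"])
```

Each VLAN's gateway address would then go on the matching sub-interface on the 1841.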

HTH
eeRoot Commented:
Is the router at 192.168.1.1 or 192.168.0.254?  The error "Destination host unreachable" implies a routing problem, but these generally don't come and go intermittently.  You may have some bad or loose cabling somewhere that is causing intermittent network disconnections.  Now that you've replaced the unmanaged network devices with manageable switches, I'd recommend you set up a syslog server to try and capture some error messages.  There's a free one available called "Kiwi syslog server" -

http://www.solarwinds.com/products/freetools/

Once you have the server up and running, you add the IP of the syslog server to the configs of the router and switches; they will then start sending errors and informational messages to the server and you can centrally monitor them.  Hopefully, one of the network devices will log some errors the next time the network goes down that you can use for troubleshooting.
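Once messages are arriving, it helps to filter out the noise. Each syslog message starts with a `<PRI>` value, where PRI = facility * 8 + severity and lower severity numbers are more urgent. The sketch below decodes that value and keeps only warnings and worse; the function names and the sample messages are hypothetical, not real output from your devices.

```python
import re

SEVERITIES = [
    "emergency", "alert", "critical", "error",
    "warning", "notice", "informational", "debug",
]
PRI_RE = re.compile(r"^<(\d{1,3})>")


def decode_pri(message: str):
    """Return (facility, severity) from a leading <PRI> field, or None if absent/invalid."""
    match = PRI_RE.match(message)
    if not match:
        return None
    pri = int(match.group(1))
    if pri > 191:  # highest valid value is facility 23 * 8 + severity 7
        return None
    return pri // 8, pri % 8


def is_worth_a_look(message: str, max_severity: int = 4) -> bool:
    """True when the message is at `max_severity` (warning by default) or more urgent."""
    decoded = decode_pri(message)
    return decoded is not None and decoded[1] <= max_severity


if __name__ == "__main__":
    # Hypothetical messages in the style a router or switch might send.
    samples = [
        "<187>%LINK-3-UPDOWN: Interface FastEthernet0/1, changed state to down",
        "<189>%SYS-5-CONFIG_I: Configured from console",
    ]
    for line in samples:
        if is_worth_a_look(line):
            facility, severity = decode_pri(line)
            print(SEVERITIES[severity], line)
```

Link up/down messages clustered around the times users lose connectivity would point at a specific port or cable.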

There is a freeware tool called PRTG ( http://www.paessler.com/prtg ) that can do much more than just a syslog server, but getting it set up can be a big project.  If you don't have any other monitoring tools to help you out, give it a try.

PS. Log into the router, type "show log", and post the results.
Rick_O_Shay Commented:
I would recommend you connect the top 3com in the picture directly to the Netgear at the bottom if you can reach it with a cable to eliminate a possible choke point on the middle 3com.

Also disable any unused ports so you can be sure no one is plugging something in somewhere and causing a loop.
kierankenny (Author) Commented:
Thanks for that.  I tried the port speed on the router first, to no avail.  I then set the port speeds on the switch to 1000M/Full Duplex and, while this is a worthwhile procedure, it made no difference.  So I downloaded Wireshark and it identified two small routers which were creating broadcast storms: one was a Netopia (192.168.1.254) and the other was a Zyxel (192.168.1.1).  I'm not sure what this Zyxel was doing, but it appears to be the device that interfered with client PCs accessing the Cisco router.  I removed both routers and replaced them with 4-port hubs and all is peaceful again!
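For anyone chasing something similar, the same check can be scripted against a list of source addresses exported from a capture. This is just a sketch: the function name is made up, and it assumes the office LAN is 192.168.0.0/24 (the router is at 192.168.0.254, but I haven't stated the mask here, so adjust to suit).

```python
import ipaddress


def find_foreign_addresses(observed, expected_network: str):
    """Return the distinct observed addresses that fall outside `expected_network`."""
    net = ipaddress.ip_network(expected_network)
    foreign = set()
    for addr in observed:
        ip = ipaddress.ip_address(addr)
        if ip not in net:
            foreign.add(ip)
    # Sort by IP version first so mixed IPv4/IPv6 input does not raise on comparison.
    return sorted(foreign, key=lambda ip: (ip.version, int(ip)))


if __name__ == "__main__":
    # Source addresses as they might appear in a capture export (illustrative list).
    seen = ["192.168.0.10", "192.168.1.1", "192.168.0.254", "192.168.1.254", "192.168.1.1"]
    for ip in find_foreign_addresses(seen, "192.168.0.0/24"):  # assumed LAN range
        print("unexpected source:", ip)
```

Anything flagged is worth tracing back to a physical port on the managed switches.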

Thanks a million for your help and thanks to everyone for your prompt replies.