Intermittent Internet-Could it be the Router?

I need some input!!

Problem :
Intermittent loss of internet access.

Observations :
With known ping-able sites, I sometimes get all 4 replies, sometimes destination host unknown, sometimes 4 request timed out, and sometimes a couple of replies followed by timeouts.  The results are the same whether I use names or IP addresses.

Traceroute reports the router (x.x.x.10) as the first hop, and every hop after that is "Request timed out".

After a 'rest', everything just comes back.

(sub-observation : makes it kinda tough to test when many locations are now rejecting pings and tracerts - is there a list of KNOWN good ping-able locations?)

Environment :
1 w2k server sp4, dns, active directory
~40 clients - mixed w2k prof., win 95/98/xp
HP procurve switches
Adtran TA850 router, pulling 4 channels for fractional T1 for internet access
Server IP x.x.x.1
Router IP x.x.x.10

Thought there was a dns issue, so reconfigured dns on server - actually did a fresh install of w2k (which also corrected some other issues).
Thought it might be the switch/hub, 'cause the fan was blown and it was running really hot - replaced.
Thought there might be problems with clients - reconfigured to make sure there weren't any conflicting settings.
Thought it might be cable from router to switch - replaced.
Thought it might be cable from server to switch - replaced.
Thought it might be my drinking problem - drank more.
Thought there might be a problem with the T1 provider's circuit - requested a test, all tested ok.
I can ping my public IP address from home (different provider) and never have a problem with the ping results (although the tracert does time out, but I suspect that is because some router on the path is denying tracert requests?)
None of the above altered the situation for the better or worse - no change.

Tried to recall what could have possibly changed.  About a month ago, we took some kind of hit where we lost the T1 - voice and data, as well as the battery backup that the router, phone system, etc. is plugged into.  Router lost the configuration.  Provider reconfigured the router, and we had voice and data again.  I didn't notice the severity of the intermittent internet access - not really sure if it was an issue, or if it is indeed worse now.

All of the above things I tested/replaced/rebuilt COULD have been the culprit, so I suppose the router COULD also be the culprit.  Does anyone feel, with any certainty, based on the above information, that it is the router?  Or am I just grasping at straws, thinking it's the router only because it is the last, and as yet only unreplaced, piece in the connectivity 'channel'?  Is it possible (again with any certainty) that the router could have sustained more damage than was originally thought?  What about the firmware?

If not the router, what else could it possibly be?  I'm at my wits end - help.

Thanks
Asked by raprealm

lrmoore Commented:
Sounds like you have a Welchia/MSBlast infection on the inside network.
Rebooting the router clears the cache and translations, until it gets overwhelmed again.

If you can, block outbound icmp and you will at least stabilize the network connections.

If the router is doing nat, look at the nat translations. If you see hundreds of entries from one host, you know that host is infected.
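One way to act on the NAT suggestion is to save the translation table as text and count entries per inside host. This is only a rough sketch: the line format it parses (`inside_ip:port outside_ip:port`, one entry per line) is made up for illustration and the real router output will look different, so the pattern would need adjusting. The names `count_translations` and `suspects` and the threshold of 100 are likewise just assumptions.

```python
import re
from collections import Counter

# Hypothetical line format: "<inside_ip>[:port] <outside_ip>[:port]".
# Real router output will differ; adjust this pattern to match it.
LINE_RE = re.compile(
    r"^\s*(\d+\.\d+\.\d+\.\d+)(?::\d+)?\s+\d+\.\d+\.\d+\.\d+(?::\d+)?\s*$"
)

def count_translations(table_text):
    """Return a Counter of NAT entries keyed by inside host address."""
    counts = Counter()
    for line in table_text.splitlines():
        match = LINE_RE.match(line)
        if match:  # silently skip headers and anything unparseable
            counts[match.group(1)] += 1
    return counts

def suspects(counts, threshold=100):
    """Hosts whose entry count meets the threshold, busiest first."""
    return [(host, n) for host, n in counts.most_common() if n >= threshold]
```

A host that shows up with hundreds of entries while the rest have a handful is the one to pull off the network and check first.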

raprealm (Author) Commented:
further questions -

All clients are running norton av, are up to date and reporting clean.
With the clean install of w2k on the server, it doesn't seem likely that the server is infected, does it?

I don't actually reboot the router - the internet connections (and the ability to ping successfully etc.) just come back on their own.

The router IS doing nat.

If I block outgoing icmp, I won't be able to ping - is that correct?

lrmoore Commented:
Just being up to date on AV won't cut it. You need to check for MSBLAST
You won't be able to ping, but everything else will work. This is only temporary to see if this really is the issue. 1 or 2 infected systems send out thousands of icmp packets looking for more hosts to spread to. This overruns the router memory trying to create all those nat translations ...
If you have a hub, you can use a sniffer (http://www.ethereal.com) to see what is really going on in the network.
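If the sniffer capture can be exported as simple (source, destination, protocol) records, a few lines of script will pick out a machine that is pinging lots of different addresses. The sketch below assumes that tuple layout and an arbitrary cutoff of 50 distinct targets; neither comes from the sniffer itself, and `icmp_sweepers` is just a name chosen here.

```python
from collections import defaultdict

def icmp_sweepers(packets, min_targets=50):
    """Find sources that sent ICMP to many distinct destinations.

    packets: iterable of (src_ip, dst_ip, protocol) tuples (assumed layout).
    Returns {src_ip: distinct_destination_count} for sources at or above
    min_targets, which is an arbitrary cutoff for this sketch.
    """
    targets = defaultdict(set)
    for src, dst, proto in packets:
        if proto.lower() == "icmp":
            targets[src].add(dst)
    return {
        src: len(dsts)
        for src, dsts in targets.items()
        if len(dsts) >= min_targets
    }
```

A normal workstation pings a handful of addresses at most, so anything this flags is worth a closer look.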

raprealm (Author) Commented:
interesting ...........

Hasn't this virus been around awhile?  Doesn't NAV pick it up?
At night, most all (if not all) machines are shut off - still the same problem at the server, going out to the net.
I will check out ethereal.com
If this ends up being a virus, I think I'll puke!

Gnart Commented:
I had a similar situation a while back.  It drove me nuts and I blamed it on the ISP's connection.  It turned out that my router was failing (hardware problem - red face).
Getting 4 out of 5 pings back is fairly normal on a busy network.  Try pinging from the router itself to see whether you get intermittent failures to a known site - if you do, the router most likely needs to be replaced or repaired.  You should also check for a virus by monitoring traffic on your router, and monitor dropped and retransmitted packets as signs of a troublesome router.

cheers

td_miles Commented:
Just a comment on pinging things. When you are testing, if you simply want to test your link, ping the router at the other end of the link (assuming that it responds to ICMP echo requests). That rules out packet loss somewhere else on the Internet.

If you must try pinging hosts on the Internet, then try a larger timeout (in Windows, use the "-w" option) to make sure the pings are actually getting lost, not just taking longer than one second (the default timeout).
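To keep a record of how often the link drops, the ping with a longer timeout can be wrapped in a small script and run on a schedule. This is a sketch only: it assumes the English-language Windows ping output ("Reply from ...", "Request timed out."), and the function names and the 5000 ms timeout are choices made here, not anything from the thread.

```python
import subprocess

def build_ping_command(host, count=4, timeout_ms=5000):
    """Windows ping: -n is the echo count, -w the per-reply timeout in ms."""
    return ["ping", "-n", str(count), "-w", str(timeout_ms), host]

def summarize(output):
    """Count replies and timeouts in English Windows-style ping output."""
    replies = timeouts = 0
    for line in output.splitlines():
        line = line.strip()
        # "Reply from x: Destination host unreachable." is not a real reply.
        if line.startswith("Reply from") and "unreachable" not in line.lower():
            replies += 1
        elif line.startswith("Request timed out"):
            timeouts += 1
    return {"replies": replies, "timeouts": timeouts}

def ping_once(host, count=4, timeout_ms=5000):
    """Run ping and summarize it; needs the Windows ping on the PATH."""
    result = subprocess.run(
        build_ping_command(host, count, timeout_ms),
        capture_output=True,
        text=True,
    )
    return summarize(result.stdout)
```

Calling `ping_once` against the far-end router every minute or so and logging the result with a timestamp would show whether the outages follow a pattern, such as time of day.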