Avatar of bruno71
bruno71

asked on

Only Some Computers Lose Internet Access

I've got an Active Directory network with about 50 computers.  We started having issues last week and they came back today.  Only some of the computers are losing Internet access.  I've rebooted all the hardware I can.  Right now I've got two computers I'm working on...one is online and the other one isn't.  However, local network connectivity seems to be fine on all the computers.  The computer that can't get online can still get to mapped network drives.  But the Internet access is sporadic...sometimes works, sometimes doesn't.

It may be something related to different network switches, but I haven't confirmed that yet. (Although I rebooted all of them).

It also seems to be affecting something funny with some of our internal websites.  Again, not confirmed...but could it be something with DNS?

It's all really weird and inconsistent...the worst kind of problems.  Any ideas?

Thanks.
~bruno71
Avatar of jschristian44
jschristian44

Maybe your server is running out of memory or something and is switching some clients off so it can distribute it better.  That's just a theory I made up.  But it sounds like the computer that is losing access might have a faulty wireless card, or it sits in a spot where it can't reach the network for some reason.  Try physically swapping the working computer with the one that is losing access.  If the working one then starts losing access in that spot, the problem is just the physical location.  That would be my first guess.  I had this problem on my own local area network, where one computer worked at a certain spot and the other one wouldn't work there.  It might be your router drivers too.  So that's a bunch of things you can check.  I am not the best when it comes to networks, but I experienced something similar to yours, and I think the signal from the router just doesn't reach that spot well enough.
I have seen this with some firewall devices.  Is there a user limit on your hardware firewall?  It could be that you are maxing out your license.  I've seen this on WatchGuard and Check Point.
Avatar of bruno71

ASKER

Thanks for the suggestions...

It's not a user limit on the firewall.  I don't think there is a limit, plus we don't have that many people here (compared to a busy day).

If it happens again, I'll try physically switching the network cables.

In the meantime, here is what I found.  I have 2 domain controllers.  One is the "primary" server with all the primary AD roles and is also a DNS & DHCP server.  The second server is also a DNS server.  The DNS service on the second server was stopped.  I started it again, and things have seemed good since then.  All the clients are configured from the DHCP server (the primary one), with the primary server listed as the first DNS server.  The secondary server is listed second.  Could this have caused the sporadic problems?  Even if the primary default server was working fine?
Avatar of bruno71

ASKER

On second thought...it could be something related to the firewall.  It just happened again, and the DNS service was still running.

However, I had lost connection to our firewall/gateway.  It's a Cisco ASA 5510.  Any tips on what/where to look for errors?  I will continue to monitor.
Avatar of bruno71

ASKER

Still continuing to have issues.  Here's another weird part...

I opened a command prompt and started a continual ping to our firewall (ping 192.168.1.10 -t).  I would watch the ping succeed with a response or fail with a time out as I switched network cables around, changed switch ports, etc.  One workstation is succeeding right now...the other is failing.  I pulled the network cable from the failing workstation and plugged it right back in - it had a few successful pings then went back to time outs.  I pulled it again and plugged it right back in - now it's getting successful responses!

I would think that means a hardware issue (switches failing?), but I'm not sure of anything right now.
Avatar of bruno71

ASKER

Still testing...here's the latest theory:

I've replaced our 2 regular switches in the server room.  I haven't replaced our 2 PoE switches because I don't have any spares for those.  But I've unplugged them independently and it didn't seem to change anything.

The problem seems to be isolated to two IP addresses -
192.168.1.9 - the secondary address on our web server NIC...(the primary address on the NIC 192.168.1.6 works fine)
192.168.1.10 - the address of our Cisco ASA 5510, firewall and gateway to the Internet

I'm running continual pings on them with varying responses.  However, I think whatever is happening affects Win7 and WinXP differently.  Attached is a screenshot from a WinXP machine...3 of the 4 addresses I ping are successful.  The ..1.9 fails.  Same results on another WinXP box.  I have 2 Win7 boxes where all 4 are successful.

And then I occasionally get something like the second screenshot...1 successful ping in the middle of a bunch of failures.  And it happened on both WinXP boxes at exactly the same time.  The Win7 boxes were unaffected.

Still desperate for any kind of help...thanks.

User generated image
 User generated image
Avatar of bruno71

ASKER

...Moments after my last post, I started getting successful replies from ..1.9 on one XP box - all four are now successful.  The other XP box only had 1 successful reply and then went back to time outs and it's now failing with ..1.10.

The Win7 boxes are unchanged.

**(sigh)**
ASKER CERTIFIED SOLUTION
Avatar of netjgrnaut
netjgrnaut
...also, start ping tests *from* .9 *to* some of the test XP clients - both working and non-working.

Of course, the success of this will depend on your XP client FW configs...
Avatar of bruno71

ASKER

!!! The first glimmer of hope I've had in a while !!!

I checked the arp and the MAC address for ..1.9 was different between failing and successful workstations.  I cleared it and restarted the pings.  Now several IP addresses were failing and they were all pulling the same "bad" MAC addresses.  Stop/Restart the pings and it started pulling the correct MAC addresses.

How do I find out what that MAC address belongs to?

You have a duplicate IP address on the network.  There is more than on computer with .9 set.

First, check the server (that's supposed to be .9) and note the MAC address.  You can do this (among other ways) with "ipconfig /all" at a cmd prompt.

Finding the rogue .9 is more difficult, depending on your switch infrastructure.  Are any/all of your switches managed?  Managed switches will have a MAC-to-port table.  Where that table is varies widely based on the switch.

Odds are good that the rogue .9 is on the same switch as the test clients that are showing the wrong MAC address.  So I'd start with the switch forwarding tables there.

Worst case (if all your switches are managed), you start anywhere and search for the MAC address of the rogue .9.  Each lookup will lead you to an uplink (switch-to-switch) port - and hence another switch - until you find the switch and station port of the rogue.
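
The hop-by-hop search described above can be sketched in Python. This is only an illustration, assuming you have already copied each switch's MAC table into a dictionary by hand. The switch names, port numbers, the last three octets of the MAC, and the `trace_mac` helper are all hypothetical and do not come from any switch vendor's tooling.

```python
def trace_mac(mac, start_switch, mac_tables, uplinks, max_hops=16):
    """Follow a MAC from switch to switch until it sits on a non-uplink port.

    mac_tables: {switch: {mac: port}} copied from each switch's forwarding table
                (MAC keys are expected in lowercase).
    uplinks:    {(switch, port): neighbor_switch} for switch-to-switch links.
    Returns (switch, port) of the station port, or None if the trail goes cold.
    """
    mac = mac.lower()
    switch = start_switch
    visited = set()
    for _ in range(max_hops):
        if switch in visited:
            return None  # the recorded uplinks loop back on themselves
        visited.add(switch)
        port = mac_tables.get(switch, {}).get(mac)
        if port is None:
            return None  # this switch has not learned the MAC
        neighbor = uplinks.get((switch, port))
        if neighbor is None:
            return switch, port  # not an uplink, so this is the station port
        switch = neighbor
    return None


# Hypothetical data: the MAC is learned on the core switch's uplink to "poe-1".
mac_tables = {
    "core": {"78:d6:f0:aa:bb:cc": 24},
    "poe-1": {"78:d6:f0:aa:bb:cc": 7},
}
uplinks = {("core", 24): "poe-1"}

print(trace_mac("78:D6:F0:AA:BB:CC", "core", mac_tables, uplinks))
```

Starting from any switch works the same way, since each lookup either ends at a station port or points you at the next switch to check.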

If you don't have managed switches, this gets *much* more difficult.  Your best bet then is to segment (break) the network into pieces apart from the *real* .9.  When you find the segment that can still *ping* .9, you've found the segment where the rogue resides.  Then you have to go station by station to find the bad IP config.

Glad to help!
"More than one* computer with .9 set."  Fat fingers... :-/
Avatar of bruno71

ASKER

...But other IP addresses are pulling the same bad MAC address.  It's not just the ..1.9.  Sometimes the same MAC address is listed for ..1.10 and I've seen it for other IP addresses too.
Really?  So you've identified the rogue MAC address (regardless of IP)?  Is it a single address, or are there multiple rogues and multiple affected IPs?

Let's level-set...
Do you have the MAC of the *real* .9?
Do you have the MAC of the *real* .10?
Let's focus on those two.
Is there only one *wrong* MAC address for both .9 and .10?  For *each* .9 and .10?

A single bad/rogue MAC would be tracked down using the methods I described above - regardless of how many different IP addresses it's clobbering.  The MAC can only exist in one place (let's ignore the reality of MAC spoofing for now - that shouldn't be an issue here).

If you've got a single MAC clobbering/duplicating multiple IP addresses, it almost sounds like some device misconfigured for proxy ARP.  But let's find the MAC on the network first, disconnect it second, and do a failure analysis last - once you have restored normal operations to your environment.
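
To spot a single MAC answering for several IPs, something like the following Python sketch could be run against the text of `arp -a` copied from an affected Windows client. It assumes the usual Windows column layout (Internet Address, Physical Address, Type). The sample table, the last three octets of the rogue MAC, and the `macs_claiming_multiple_ips` helper are made up for illustration. Keep in mind that a NIC with a secondary address (like .6 and .9 on the web server) legitimately shows one MAC for two IPs.

```python
import re
from collections import defaultdict

# Matches one entry line of Windows `arp -a` output, e.g.
#   192.168.1.9           78-d6-f0-aa-bb-cc     dynamic
ARP_LINE = re.compile(
    r"^\s*(\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"([0-9A-Fa-f]{2}(?:-[0-9A-Fa-f]{2}){5})\s+(\w+)\s*$"
)


def macs_claiming_multiple_ips(arp_output):
    """Return {mac: sorted IP strings} for MACs behind more than one dynamic entry."""
    ips_by_mac = defaultdict(set)
    for line in arp_output.splitlines():
        match = ARP_LINE.match(line)
        if not match:
            continue  # interface banner, column header, or blank line
        ip, mac, entry_type = match.groups()
        if entry_type.lower() != "dynamic":
            continue  # skip static entries such as broadcast and multicast
        ips_by_mac[mac.lower()].add(ip)
    return {mac: sorted(ips) for mac, ips in ips_by_mac.items() if len(ips) > 1}


# Hypothetical table as a failing client might show it.
SAMPLE = """\
Interface: 192.168.1.101 --- 0xb
  Internet Address      Physical Address      Type
  192.168.1.6           00-11-22-33-44-55     dynamic
  192.168.1.9           78-d6-f0-aa-bb-cc     dynamic
  192.168.1.10          78-d6-f0-aa-bb-cc     dynamic
  192.168.1.255         ff-ff-ff-ff-ff-ff     static
"""

print(macs_claiming_multiple_ips(SAMPLE))
```

Comparing the result from a working client with the result from a failing one should make the rogue MAC stand out.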
Avatar of bruno71

ASKER

It seems there is only 1 rogue MAC address.  Is there some way to find out what the real IP address of that MAC address is?  I looked in our DHCP leases, but the IP address associated with it was no longer valid.
There's no practical way of telling the "right" IP from the "wrong" IP.  Wrong is defined as "an IP already allocated to another host."  You need to use the MAC to track down the device causing the problems.

I'm guessing from all this that you don't have managed switches?  If you *do* have managed switches, but you're not sure where to look for the MAC address tables, post the make/model here and I'll help you figure it out.  Much, much better than trying to find a MAC on an unmanaged network.

You can use this website to look up the vendor associated with the rogue MAC address.  That's the manufacturer of the NIC, which *might* point you in the right direction (particularly if your system hardware configurations are well documented somewhere).

The good news is, you're only looking for a single device.

Hope that helps!
Avatar of bruno71

ASKER

I've got some 3Com 2226-PWR Plus switches and some Dell PowerConnect 2724.

I tried to look up the vendor associated with the MAC address, but it didn't find anything.  The MAC address starts with 78-d6-f0.
Avatar of bruno71

ASKER

Getting closer...  I may have found it...  Stay tuned...
While you're at it, make sure .9 and .10 and whatever else are excluded from your DHCP scope. If they're not excluded from the provided range, a device that doesn't do duplicate address detection could be picking them up.

Just a thought.

I'll dig into your switch docs when I get back to my desk.
Here's your MAC vendor...

http://hwaddress.com/mac/78D6F0-000000.html


Company: Samsung Electro Mechanics
Prefix: 78:D6:F0
Address space: 78:D6:F0:00:00:00 - 78:D6:F0:FF:FF:FF
Address:
Metan Dong 314, Youngtong Gu
Suwon Kyung-gi Do. 443-743
Korea, Republic Of
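
For reference, the vendor prefix is just the first three octets of the MAC. A small Python sketch of pulling that prefix out of an address in any common notation, so it can be pasted into a lookup site, might look like this. The `oui_prefix` and `oui_range` helper names are hypothetical, and the sketch does not query any vendor database itself.

```python
import re


def oui_prefix(mac):
    """Return the first three octets of a MAC address as 'XX:XX:XX'."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac).upper()
    if len(digits) < 6:
        raise ValueError(f"need at least three octets, got {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 6, 2))


def oui_range(mac):
    """Return the first and last address of the vendor block the MAC falls in."""
    prefix = oui_prefix(mac)
    return f"{prefix}:00:00:00", f"{prefix}:FF:FF:FF"


# The prefix the asker reported, in Windows notation.
print(oui_prefix("78-d6-f0"))
print(oui_range("78-d6-f0"))
```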

Your 3Com 2226-PWR Plus is manageable.  How to configure management... and here's the manual.  That should help you find the MAC address table.

Your Dell PowerConnect 2724 is also managed.  Here's the manual for those.

That should be everything you need to know to track down the rogue MAC... if you haven't found it already, that is.

Good luck!

Avatar of bruno71

ASKER

I might be crazy...but I may have narrowed it down to a user's Android phone.  It's a SAMSUNG Charge.  His wi-fi MAC address matches the address I was seeing in the ARP table.  He's out of the office until Monday and everything is working fine now.  I'll test on Monday and let you know.
Check your DHCP scope to ensure those static IP addresses (.9, .10, whatever) are excluded.  The phone would make sense in the scenario you're seeing - and matches the MAC vendor.  Doubtful that the phone would do (or honor) duplicate IP address detection during DHCP.  You can enable that on the server as well, but it's a poor strategy to protect static IPs.  I'm guessing that you've got the whole /24 scoped into DHCP, and no exclusions.  Or something similar.  It fits...
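
The scope check itself is simple to sanity-test. Here is a minimal Python sketch, assuming you know the first and last address the DHCP server hands out. Both scope ranges in the example are placeholders rather than the real configuration, and `statics_inside_scope` is a made-up helper name.

```python
import ipaddress


def statics_inside_scope(static_ips, scope_start, scope_end):
    """Return the static IPs that fall inside the DHCP range (inclusive)."""
    start = ipaddress.ip_address(scope_start)
    end = ipaddress.ip_address(scope_end)
    return [ip for ip in static_ips if start <= ipaddress.ip_address(ip) <= end]


# Static addresses mentioned in this thread.
statics = ["192.168.1.6", "192.168.1.9", "192.168.1.10"]

# Placeholder scope covering the whole /24: every static collides.
print(statics_inside_scope(statics, "192.168.1.1", "192.168.1.254"))

# Placeholder scope that starts above the statics: nothing collides.
print(statics_inside_scope(statics, "192.168.1.80", "192.168.1.200"))
```

An empty list means the statics sit outside the range, so DHCP itself should not be handing them out.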
Avatar of bruno71

ASKER

Actually, our DHCP scope starts at ..1.80.  Everything before that is manually set up.  I'm guessing the phone is just malfunctioning.  We'll find out Monday.
Everything still working right while the phone is out of the office?

Have a great day!