Solved

Only Some Computers Lose Internet Access

Posted on 2011-09-28
Last Modified: 2012-08-13
I've got an Active Directory network with about 50 computers.  We started having issues last week, and they came back today.  Only some of the computers are losing Internet access.  I've rebooted all the hardware I can.  But I've got two computers I'm working on...one is online and the other one isn't.  However, all the local network connectivity seems to be fine on all the computers.  The computer that can't get online can still get to mapped network drives.  But the Internet access is sporadic...sometimes works, sometimes doesn't.

It may be something related to different network switches, but I haven't confirmed that yet. (Although I rebooted all of them).

It also seems to be doing something funny to some of our internal websites.  Again, not confirmed...but could it be something with DNS?

It's all really weird and inconsistent...the worst kind of problems.  Any ideas?

Thanks.
~bruno71
Question by:bruno71
24 Comments
 
LVL 4

Expert Comment

by:jschristian44
ID: 36717835
Maybe your server is running out of memory or something and switches some off so it can distribute it better.  That's just a theory I made up.  But it sounds like the computer that is losing access might have a faulty wireless card, or is in a spot the network cannot reach for some reason.  Try physically swapping the working computer with the one that is losing access.  If the working one then starts losing access, the problem is just the physical spot.  That would be my first guess.  I had this problem on my local area network, where one computer worked at a certain spot and the other one wouldn't work there.  It might be your router drivers too.  So that's a bunch of things you can check.  I am not the best when it comes to networks, but I experienced something similar to yours, and I think the signal just doesn't reach that spot well enough because of where the computer and the router are.
 
LVL 4

Expert Comment

by:waynej1979
ID: 36718128
I have seen this with some firewall devices.  Is there a user limit on your hardware firewall?  It could be that you are maxing out your license.  I've seen this on WatchGuard and Check Point.
 

Author Comment

by:bruno71
ID: 36718313
Thanks for the suggestions...

It's not a user limit on the firewall.  I don't think there is a limit, plus we don't have that many people here (compared to a busy day).

If it happens again, I'll try physically switching the network cables.

In the meantime, here is what I found.  I have 2 domain controllers.  One is the "primary" server with all the primary AD roles and is also a DNS & DHCP server.  The second server is also a DNS server.  The DNS service on the second server was stopped.  I started it again, and things have seemed good since then.  All the clients are configured from the DHCP server (the primary one), with the primary server listed as the first DNS server.  The secondary server is listed second.  Could this have caused the sporadic problems?  Even if the primary default server was working fine?
 

Author Comment

by:bruno71
ID: 36718465
On second thought...it could be something related to the firewall.  It just happened again, and the DNS service was still running.

However, I had lost connection to our firewall/gateway.  It's a Cisco ASA 5510.  Any tips on what/where to look for errors?  I will continue to monitor.
 

Author Comment

by:bruno71
ID: 36719363
Still continuing to have issues.  Here's another weird part...

I opened a command prompt and started a continual ping to our firewall (ping 192.168.1.10 -t).  I would watch the ping succeed with a response or fail with a time out as I switched network cables around, changed switch ports, etc.  One workstation is succeeding right now...the other is failing.  I pulled the network cable from the failing workstation and plugged it right back in - it had a few successful pings then went back to time outs.  I pulled it again and plugged it right back in - now it's getting successful responses!

I would think that means a hardware issue (switches failing?), but I'm not sure of anything right now.
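
To make runs of successes and time outs easier to compare between workstations, the saved output of the continual ping could be summarized with a short script.  This is only a sketch in Python: the function names are made up for illustration, the sample lines assume the usual wording of Windows ping output ("Reply from ..." and "Request timed out."), and the TTL value in the sample is invented.

```python
from itertools import groupby


def classify_ping_line(line):
    """Return 'reply', 'timeout', or None for one line of Windows ping output."""
    text = line.strip()
    if text.startswith("Reply from") and "TTL=" in text:
        return "reply"
    if text.startswith("Request timed out"):
        return "timeout"
    return None  # header, blank line, statistics, or any other message


def summarize_runs(lines):
    """Collapse consecutive identical results into (status, count) runs."""
    statuses = [s for s in map(classify_ping_line, lines) if s is not None]
    return [(status, len(list(group))) for status, group in groupby(statuses)]


# Invented sample resembling the behaviour described above.
sample = [
    "Pinging 192.168.1.10 with 32 bytes of data:",
    "Request timed out.",
    "Request timed out.",
    "Reply from 192.168.1.10: bytes=32 time<1ms TTL=255",
    "Request timed out.",
]
print(summarize_runs(sample))
```

Comparing these run summaries from two workstations side by side makes it easier to see whether they fail at the same moments.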
 

Author Comment

by:bruno71
ID: 36815091
Still testing...here's the latest theory:

I've replaced our 2 regular switches in the server room.  I haven't replaced our 2 PoE switches because I don't have any spares for those.  But I've unplugged them independently and it didn't seem to change anything.

The problem seems to be isolated to two IP addresses -
192.168.1.9 - the secondary address on our web server NIC...(the primary address on the NIC 192.168.1.6 works fine)
192.168.1.10 - the address of our Cisco ASA 5510, firewall and gateway to the Internet

I'm running continual pings on them with varying responses.  However, I think whatever is happening affects Win7 and WinXP differently.  Attached is a screenshot from a WinXP machine...3 of the 4 addresses I ping are successful.  The ..1.9 fails.  Same results on another WinXP box.  I have 2 Win7 boxes where all 4 are successful.

And then I occasionally get something like the second screenshot...1 successful ping in the middle of a bunch of failures.  And it happened on both WinXP boxes at exactly the same time.  The Win7 boxes were unaffected.

Still desperate for any kind of help...thanks.

[Screenshot: WinXP box]
[Screenshot: one success on XP]
 

Author Comment

by:bruno71
ID: 36815159
...Moments after my last post, I started getting successful replies from ..1.9 on one XP box - all four are now successful.  The other XP box only had 1 successful reply and then went back to time outs and it's now failing with ..1.10.

The Win7 boxes are unchanged.

**(sigh)**
 
LVL 6

Accepted Solution

by:
netjgrnaut earned 500 total points
ID: 36815796
All the ping tests are internal, and running to/from the same subnet, yes?

Have you checked for IP address duplication?  Do an arp -a on all the test systems, then compare the MAC address of .9 with the actual server MAC.  Check .10 with your ASA while you're at it.
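
As a rough sketch of that comparison, the output of arp -a from each test system could be parsed and checked against the MAC addresses recorded from the real devices.  The Python below is illustrative only: the function names are hypothetical, every MAC address in the sample is made up, and the parser assumes the usual three-column layout that Windows prints (Internet Address, Physical Address, Type).

```python
import re

# One table row, e.g. "  192.168.1.9   00-00-5e-00-53-99   dynamic"
ARP_ROW = re.compile(
    r"^\s*(\d+\.\d+\.\d+\.\d+)\s+([0-9A-Fa-f]{2}(?:-[0-9A-Fa-f]{2}){5})\s+\w+"
)


def parse_arp_table(text):
    """Map each IP address in 'arp -a' output to its lower-cased MAC address."""
    table = {}
    for line in text.splitlines():
        match = ARP_ROW.match(line)
        if match:
            table[match.group(1)] = match.group(2).lower()
    return table


def find_mismatches(table, expected):
    """Return {ip: (expected_mac, seen_mac)} where the client sees the wrong MAC."""
    mismatches = {}
    for ip, mac in expected.items():
        seen = table.get(ip)
        if seen is not None and seen != mac.lower():
            mismatches[ip] = (mac.lower(), seen)
    return mismatches


# Made-up capture from one test client; all MACs are placeholders.
sample = """
Interface: 192.168.1.101 --- 0xb
  Internet Address      Physical Address      Type
  192.168.1.6           00-00-5e-00-53-06     dynamic
  192.168.1.9           00-00-5e-00-53-99     dynamic
  192.168.1.10          00-00-5e-00-53-10     dynamic
"""

# MACs noted from the real server and firewall (placeholders here).
expected = {
    "192.168.1.9": "00-00-5E-00-53-09",
    "192.168.1.10": "00-00-5E-00-53-10",
}

print(find_mismatches(parse_arp_table(sample), expected))
```

Any IP that shows up in the result is being answered by a different MAC than the one you recorded from the real device.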

Does the XP client with the persistent failures have better luck with any other IP ping targets?

If possible (not popular during production hours), disconnect .9 from the network, do an "arp -d *" on your test XP clients, and start pinging again.  Repeat the process with .10 disconnected from the network.

If you can narrow it down to limited XP clients having problems, check *their* IP configs.  Are they using DHCP?

There are (obviously) a lot of variables here - but you're on the right track with DNS/DHCP/IP issues.  (IMHO)

Good luck.  I'll certainly help more if I can...
 
LVL 6

Expert Comment

by:netjgrnaut
ID: 36815809
...also, start ping tests *from* .9 *to* some of the test XP clients - both working and non-working.

Of course, the success of this will depend on your XP client FW configs...
 

Author Comment

by:bruno71
ID: 36815984
!!! The first glimmer of hope I've had in a while !!!

I checked the arp and the MAC address for ..1.9 was different between failing and successful workstations.  I cleared it and restarted the pings.  Now several IP addresses were failing and they were all pulling the same "bad" MAC addresses.  Stop/Restart the pings and it started pulling the correct MAC addresses.

How do I find out what that MAC address belongs to?

 
LVL 6

Expert Comment

by:netjgrnaut
ID: 36816101
You have a duplicate IP address on the network.  There is more than on computer with .9 set.

First, check the server (that's supposed to be .9) and note the MAC address.  You can do this (among other ways) with "ipconfig /all" at a cmd prompt.

Finding the rogue .9 is more difficult, depending on your switch infrastructure.  Are any/all of your switches managed?  Managed switches will have a MAC-to-port table.  Where that table is varies widely based on the switch.

Odds are good that the rogue .9 is on the same switch as the test clients that are showing the wrong MAC address.  So I'd start with the switch forwarding tables there.

Worst case (if all your switches are managed), you start anywhere and search for the MAC address of the rogue .9.  Each lookup will lead you to an uplink (switch-to-switch) port - and hence another switch - until you find the switch and station port of the rogue.
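
The hop-by-hop search can be sketched as follows, assuming you have already copied each switch's MAC-to-port table and noted which ports are uplinks.  Everything here is hypothetical (the switch names, port numbers, MAC address, and the dictionary layout); real switches expose these tables through their own management interfaces.

```python
def trace_mac(mac, start_switch, mac_tables, uplinks):
    """Follow a MAC address from switch to switch until it lands on a station port.

    mac_tables: {switch_name: {mac: port}} copied from each managed switch.
    uplinks:    {(switch_name, port): neighbour_switch} for switch-to-switch ports.
    Returns (path, location); location is (switch, port) or None if the trail ends.
    """
    path = []
    visited = set()
    switch = start_switch
    while switch not in visited:
        visited.add(switch)
        port = mac_tables.get(switch, {}).get(mac)
        if port is None:
            return path, None  # this switch has not learned the MAC
        path.append((switch, port))
        neighbour = uplinks.get((switch, port))
        if neighbour is None:
            return path, (switch, port)  # not an uplink: the device is on this port
        switch = neighbour
    return path, None  # the uplink map loops back on itself


# Hypothetical two-switch layout.
rogue = "00-00-5e-00-53-99"
mac_tables = {"sw-a": {rogue: 24}, "sw-b": {rogue: 7}}
uplinks = {("sw-a", 24): "sw-b", ("sw-b", 1): "sw-a"}

print(trace_mac(rogue, "sw-a", mac_tables, uplinks))
```

In this made-up layout the search starts on sw-a, follows uplink port 24 to sw-b, and stops at port 7 because that port is not an uplink.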

If you don't have managed switches, this gets *much* more difficult.  Your best bet then is to segment (break) the network into pieces apart from the *real* .9.  When you find the segment that can still *ping* .9, you've found the segment where the rogue resides.  Then you have to go station by station to find the bad IP config.

Glad to help!
 
LVL 6

Expert Comment

by:netjgrnaut
ID: 36816109
"More than one* computer with .9 set."  Fat fingers... :-/
 

Author Comment

by:bruno71
ID: 36816321
...But other IP addresses are pulling the same bad MAC address.  It's not just the ..1.9.  Sometimes the same MAC address is listed for ..1.10 and I've seen it for other IP addresses too.
 
LVL 6

Expert Comment

by:netjgrnaut
ID: 36816557
Really?  So you've identified the rogue MAC address (regardless of IP)?  Is it a single address, or are there multiple rogues and multiple affected IPs?

Let's level-set...
Do you have the MAC of the *real* .9?
Do you have the MAC of the *real* .10?
Let's focus on those two.
Is there only one *wrong* MAC address for both .9 and .10?  For *each* .9 and .10?

A single bad/rogue MAC would be tracked down using the methods I described above - regardless of how many different IP addresses it's clobbering.  The MAC can only exist in one place (let's ignore the reality of MAC spoofing for now - that shouldn't be an issue here).

If you've got a single MAC clobbering/duplicating multiple IP addresses, it almost sounds like some device misconfigured for proxy ARP.  But let's find the MAC on the network first, disconnect it second, and do a failure analysis last - once you have restored normal operations to your environment.
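
One way to answer the "single rogue or several" question is to pool the ARP entries collected from the test clients and group them by MAC.  The sketch below is in Python with a hypothetical function name and made-up MAC addresses.  A MAC that appears for several IPs is only a candidate, since a device doing proxy ARP on purpose would look the same.

```python
import ipaddress
from collections import defaultdict


def macs_claiming_multiple_ips(arp_entries):
    """Group (ip, mac) pairs by MAC and keep the MACs seen for more than one IP."""
    ips_by_mac = defaultdict(set)
    for ip, mac in arp_entries:
        ips_by_mac[mac.lower()].add(ip)
    return {
        mac: sorted(ips, key=ipaddress.ip_address)  # numeric order, so .9 sorts before .10
        for mac, ips in ips_by_mac.items()
        if len(ips) > 1
    }


# Entries pooled from several clients; every MAC here is a placeholder.
entries = [
    ("192.168.1.6", "00-00-5e-00-53-06"),
    ("192.168.1.9", "00-00-5E-00-53-99"),   # wrong MAC seen on one client
    ("192.168.1.9", "00-00-5e-00-53-09"),   # correct MAC seen on another
    ("192.168.1.10", "00-00-5e-00-53-99"),  # same wrong MAC again
]

print(macs_claiming_multiple_ips(entries))
```

With this invented data, only the placeholder MAC ending in 99 is reported, for both .9 and .10.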
 

Author Comment

by:bruno71
ID: 36816596
It seems there is only 1 rogue MAC address.  Is there some way to find out what the real IP address of that MAC address is?  I looked in our DHCP leases, but the IP address associated with it was no longer valid.
 
LVL 6

Expert Comment

by:netjgrnaut
ID: 36816755
There's no practical way of telling the "right" IP from the "wrong" IP.  Wrong is defined as "an IP already allocated to another host."  You need to use the MAC to track down the device causing the problems.

I'm guessing from all this that you don't have managed switches?  If you *do* have managed switches, but you're not sure where to look for the MAC address tables, post the make/model here and I'll help you figure it out.  Much much better than trying to find a MAC on an unmanaged network.

You can use this website to look up the vendor associated with the rogue MAC address.  That's the manufacturer of the NIC, which *might* point you in the right direction (particularly if your system hardware configurations are well documented somewhere).

The good news is, you're only looking for a single device.

Hope that helps!
 

Author Comment

by:bruno71
ID: 36817088
I've got some 3Com 2226-PWR Plus switches and some Dell PowerConnect 2724.

I tried to look up the vendor associated with the MAC address, but it didn't find anything.  The MAC address starts with 78-d6-f0
 

Author Comment

by:bruno71
ID: 36817306
Getting closer...  I may have found it...  Stay tuned...
 
LVL 6

Expert Comment

by:netjgrnaut
ID: 36817349
While you're at it, make sure .9 and .10 and whatever else are excluded from your DHCP scope. If they're not excluded from the provided range, a device that doesn't do duplicate address detection could be picking them up.

Just a thought.

I'll dig into your switch docs when I get back to my desk.
 
LVL 6

Expert Comment

by:netjgrnaut
ID: 36817644
Here's your MAC vendor...

http://hwaddress.com/mac/78D6F0-000000.html


Company: Samsung Electro Mechanics
Prefix: 78:D6:F0
Address space: 78:D6:F0:00:00:00 - 78:D6:F0:FF:FF:FF
Address:
Metan Dong 314, Youngtong Gu
Suwon Kyung-gi Do. 443-743
Korea, Republic Of
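
For reference, the lookup works on the first three bytes of the MAC (the vendor prefix).  Here is a minimal Python sketch of that step, with hypothetical function names and a lookup table that holds only the one prefix quoted above.  A real lookup would need the full vendor registry.

```python
import string


def vendor_prefix(mac):
    """Return the first three bytes of a MAC as 'XX:XX:XX', whatever separator it uses."""
    cleaned = mac.replace("-", "").replace(":", "").replace(".", "")
    if len(cleaned) != 12 or not all(ch in string.hexdigits for ch in cleaned):
        raise ValueError(f"not a MAC address: {mac!r}")
    prefix = cleaned[:6].upper()
    return ":".join(prefix[i:i + 2] for i in (0, 2, 4))


# Only the prefix from this thread; anything else comes back as 'unknown'.
KNOWN_PREFIXES = {"78:D6:F0": "Samsung Electro Mechanics"}


def vendor_for(mac, prefixes=KNOWN_PREFIXES):
    """Look up the vendor for a MAC address in a prefix table."""
    return prefixes.get(vendor_prefix(mac), "unknown")


# The last three bytes below are invented for the example.
print(vendor_for("78-d6-f0-12-34-56"))
```
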

Your 3Com 2226-PWR Plus is manageable.  How to configure management... and here's the manual.  That should help you find the MAC address table.

Your Dell PowerConnect 2724 is also managed.  Here's the manual for those.

That should be everything you need to know to track down the rogue MAC... if you haven't found it already, that is.

Good luck!

 

Author Comment

by:bruno71
ID: 36817709
I might be crazy...but I may have narrowed it down to a user's Android phone.  It's a SAMSUNG Charge.  His wi-fi MAC address matches the address I was seeing in the ARP table.  He's out of the office until Monday and everything is working fine now.  I'll test on Monday and let you know.
 
LVL 6

Expert Comment

by:netjgrnaut
ID: 36817828
Check your DHCP scope to ensure those static IP addresses (.9, .10, whatever) are excluded.  The phone would make sense in the scenario you're seeing - and matches the MAC vendor.  Doubtful that the phone would do (or honor) duplicate IP address detection during DHCP.  You can enable that on the server as well, but it's a poor strategy to protect static IPs.  I'm guessing that you've got the whole /24 scoped into DHCP, and no exclusions.  Or something similar.  It fits...
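
A quick way to sanity-check this is to list which static addresses fall inside the range DHCP hands out.  The Python sketch below uses a hypothetical function name, and both scope ranges are invented for illustration.  Only the three static addresses come from this thread.

```python
import ipaddress


def statics_inside_scope(scope_start, scope_end, static_ips):
    """Return the static addresses that DHCP could also hand out."""
    start = ipaddress.ip_address(scope_start)
    end = ipaddress.ip_address(scope_end)
    return [ip for ip in static_ips if start <= ipaddress.ip_address(ip) <= end]


statics = ["192.168.1.6", "192.168.1.9", "192.168.1.10"]

# Hypothetical scope covering the whole /24: every static address is at risk.
print(statics_inside_scope("192.168.1.1", "192.168.1.254", statics))

# Hypothetical scope that starts above the statics: nothing overlaps.
print(statics_inside_scope("192.168.1.80", "192.168.1.254", statics))
```

An empty list means the scope itself cannot hand out those addresses, so a conflict would have to come from somewhere else, such as a device with a manually set or stale address.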
 

Author Comment

by:bruno71
ID: 36891191
Actually, our DHCP scope starts at ..1.80.  Everything before that is manually set up.  I'm guessing the phone is just malfunctioning.  We'll find out Monday.
 
LVL 6

Expert Comment

by:netjgrnaut
ID: 36891339
Everything still working right while the phone is out of the office?

Have a great day!