Need help troubleshooting network problems (Windows, Exchange, Internet...)

I have a feeling I'm missing something simple and going to kick myself when someone posts the obvious answer! But I'm not much of a networking guy and inherited the responsibility due to layoffs. So any help would be appreciated.

On Friday morning, we started experiencing some strange network behavior. Some, but not all, users were complaining that they could not connect to Exchange and/or network shares and/or the internet. The users always reboot about 3 times before they contact me, and that usually didn't help. However, while troubleshooting one user's PC, I managed to regain access simply by disabling and enabling his network adapter. (I was finding that the PCs had no IP address when things went south, and were getting a local 192 address.)

About mid-day, the company iPhones started giving a connection error to Exchange, with a message that said something about an untrusted connection or certificate. (I'm sorry, but after you click the message, it doesn't come back so I can't quote it exactly.)

I also noticed that while my PC never lost connection to anything (I had Exchange, shares and internet), I was not receiving any emails from outside the company. Test emails from my personal account never came through, and test emails to my personal account never went out.

Yesterday, I received a delivery notification from my personal account because my test email never went through. The notification included:

Your message has been enqueued and undeliverable for 1 day
to the following recipients:

  Recipient address: xxx@xxx.com
  Reason: unable to deliver this message after 1 day

Delivery attempt history for your mail:

Sat, 27 Jun 2015 23:07:13 +0000 (GMT)
TCP active open: Failed connect() to TCP port 25 of 173.xx.xx.xx (No formatted text for errno = 110)


The IP mentioned in the notification is our static public address for mail.
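
For reference, errno 110 on Linux is ETIMEDOUT: the sending server's connection attempt to port 25 got no answer at all, rather than being refused. A minimal Python sketch of the same check, meant to be run from a machine outside the office; the function name and the `mail.example.com` host mentioned below are placeholders, not values from this network:

```python
import socket


def can_connect(host: str, port: int = 25, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts (ETIMEDOUT), refused connections, and
        # unreachable-network errors alike.
        return False
```

Calling `can_connect("mail.example.com")` from an outside connection would return False for as long as inbound SMTP is not reaching the mail server, whatever the underlying reason.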

If it helps, our basic network: Internet -> Comcast modem -> Cisco ASA5510 -> HP switch/network. Our DC provides DHCP.

Hopefully I've included enough information. I'm hoping someone can point out the obvious that I'm missing. Thanks.
Asked by: Eric Jack, IT Manager

John, Business Consultant (Owner), Commented:
Have you tried restarting the server?
After the restart, are both DHCP and DNS running properly?

Do you have more devices than your DHCP block allows?

nader alkahtani, Network Engineer, Commented:
Check the DHCP pool; it may be full.
Check for viruses.

Eric Jack, IT Manager, Author Commented:
> Have you tried restarting the Server?
> After restart, are both DHCP and DNS running properly?
>
> Do you have more devices than your DHCP block allows?

I rebooted the DC Friday morning with no difference. I just double-checked the services now and everything seems to be running as it should. The Address Pool is from 10.16.68.1-255 and leases run from 1-71 with gaps. So the pool should have plenty of spares.
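
As a quick sanity check on that claim, the arithmetic can be sketched in Python. This assumes the pool really is the inclusive range 10.16.68.1-255 quoted above and treats 71 as an upper bound on active leases, since the leases run from 1-71 with gaps; `pool_size` is just an illustrative helper:

```python
import ipaddress


def pool_size(first: str, last: str) -> int:
    """Number of addresses in the inclusive range first..last."""
    return int(ipaddress.IPv4Address(last)) - int(ipaddress.IPv4Address(first)) + 1


size = pool_size("10.16.68.1", "10.16.68.255")  # pool as quoted above
max_leases = 71                                 # upper bound on leases in use
free_at_least = size - max_leases
print(size, free_at_least)  # prints "255 184"
```

So even in the worst case there are well over a hundred free addresses, which makes an exhausted pool an unlikely explanation.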

> Check the DHCP pool; it may be full.
> Check for viruses.

Nothing reported by Symantec Endpoint Protection.

I'm thinking about rebooting and power-cycling the whole network just for kicks since I'm here in the building alone today.

Rob Williams Commented:
Could you post the results of ipconfig /all from a problematic PC?

It sounds like you may have a rogue DHCP server on the network if addresses should be 10.x.x.x and PCs are getting 192.x.x.x.

John, Business Consultant (Owner), Commented:
It is possible that the network gear got overloaded. That happens. Turn off the main modem and main router. Turn off switches.

Now turn on the modem and wait for two minutes. Turn on the main router and wait for two minutes. You should have good connectivity and internet at the router.

Now turn on a switch that the server is connected to and test.

Now turn on the rest of the gear and test.

Eric Jack, IT Manager, Author Commented:
Well, the office next to me has been one of the trouble offices. He had no shares/Exchange/internet until I disabled/enabled his network adapter and then it all worked again. I just went and did that again and now he's got no access again! So here's a shot of his ipconfig /all (sorry for the photo, kind of hard copying it when he's off the net.)

However, some thoughts: I thought the 192 address is what the adapter gave itself when it doesn't get an IP from the DHCP. And... even if there is a rogue DHCP somewhere, how would that affect incoming/outgoing email to the outside when the Exchange server is on a static IP?

Anyway, here's the ipconfig:

[Photo: ipconfig /all output from the affected PC]

Rob Williams Commented:
When there is no DHCP server you get a 169.254.x.x address.

The DHCP server is listed as 192.168.0.1 which is the default for most home routers.  Bet someone installed something like a Linksys router so they could have wireless.
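
The distinction can be sketched in Python with the standard `ipaddress` module. This is only an illustration: the pool bounds are the ones Eric quoted, the sample addresses are made up, and `classify` is a hypothetical helper name:

```python
import ipaddress

POOL_FIRST = ipaddress.IPv4Address("10.16.68.1")   # DHCP pool start quoted in the thread
POOL_LAST = ipaddress.IPv4Address("10.16.68.255")  # DHCP pool end quoted in the thread


def classify(address: str) -> str:
    """Label an IPv4 address as 'apipa', 'expected', or 'foreign'."""
    ip = ipaddress.IPv4Address(address)
    if ip.is_link_local:
        # 169.254.0.0/16: the PC assigned this itself because no DHCP server answered.
        return "apipa"
    if POOL_FIRST <= ip <= POOL_LAST:
        return "expected"
    # Anything else was configured or leased from somewhere other than the known pool.
    return "foreign"


for sample in ("169.254.12.34", "10.16.68.40", "192.168.0.7"):
    print(sample, classify(sample))
```

A 192.168.x.x address lands in the "foreign" bucket, which is why it points to a second DHCP server rather than a missing one.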

I have a blog article that helps to locate:
http://blog.lan-tech.ca/2012/04/23/rogue-dhcp-servers/
You can find the MAC address of the unit, and then at least you know the brand you are looking for.

You could also start a continuous ping (ping -t 192.168.0.1) and start unplugging cables until the replies stop.

John, Business Consultant (Owner), Commented:
The problem computer has a 192.168.0.7 address. The DHCP server is 192.168.0.1. So it looks reasonable. .7 is low; is that your DHCP range?

I would scan the problem computer thoroughly for viruses and keep it disconnected.

With the problem computer disconnected, do you have continuing issues?

Rob Williams Commented:
Hi John. Eric mentioned, "The Address Pool is from 10.16.68.1-255", so it sounds like a rogue DHCP server.

Eric, disabling the network adapter, rebooting and such refreshes the DHCP lease. Then there is a race as to which DHCP server responds first. When it works, it means yours won out.

With the wrong addressing your Exchange server cannot be reached.

John, Business Consultant (Owner), Commented:
I missed the addressing earlier. So that, or the problem computer has viruses. Disconnecting it is a step in the troubleshooting. If it is disconnected, a rogue DHCP server should still show up.

Eric Jack, IT Manager, Author Commented:
Well, something to keep in mind, it's more than one "problem computer". It seems everyone has a chance to lose their connection to the network.

So let's assume I have a rogue DHCP server/router/or something serving up addresses. How is that keeping Exchange from being reached from OUTSIDE the building? Ohhhhhh... wait. I made a change to my Comcast cable modem Friday morning: I put in a 1-to-1 NAT setting for an outside static IP to point to the Exchange server for webmail, because I had notes from the previous person saying that setting should have been in place and it wasn't (we changed modems a few months ago). I just went into the modem and disabled that change... and suddenly all the emails are working into and out of the building again!

But, would that cause the other issues? Or is it a classic "Eric F-Up" where I tend to make one huge problem comprised of lots of little errors? I'm going to take a moment to review the replies here and assess where I stand now with the network. (At least email works! Whew!)

John, Business Consultant (Owner), Commented:
> I made a change to my Comcast cable modem Friday morning: I put in a 1-to-1 NAT setting for an outside static IP

That is new information. I think I would put those settings in a commercial router (Netscreen or like) and put the policies in the router. We have done this at clients (we have Juniper Netscreens for them all) and do not have this issue.

Rob Williams Commented:
Sorry, I thought Exchange wasn't accessible internally. Though the NAT change would definitely affect the flow of e-mail, it should not affect client PCs accessing things like the Exchange server or file shares.

Are you sure the local network uses 10.x.x.x and not 192.x.x.x?

Does nslookup ServerName return the correct address?

Could the DHCP pool to which you referred, 10.16.68.1-255, be for VPN use?

John, Business Consultant (Owner), Commented:
Also, I do not know exactly what you did, but it would seem to me the change may have restricted things to one device and also reset your modem to a different IP range.

Rob Williams Commented:
Good point John.  The modem/router itself is probably the rogue DHCP server, if DHCP was not disabled on it.

nader alkahtani, Network Engineer, Commented:

Eric Jack, IT Manager, Author Commented:
Okay, recap!

It's starting to look like a collection of small, unrelated problems cropped up at the same time which made one great big problem. Looks like the outside email issue was solved by reverting the Comcast SMC business modem back to the way it was on Thursday. Once I did that, outside emails (inbound and outbound) started working immediately!

Oddly, once I did that, my GFI Mail Essentials updates suddenly started working again too! I'm guessing the updates were being hampered by the same static NAT. I don't understand exactly what happened without digging into it further.

I'll address the NAT settings on the modem at a different time in a different thread. When I'm not trying to put out other fires instead of just adding fuel to the existing fire!

Now...

The wonky network issue still exists.

> When there is no DHCP server you get a 169.254.x.x address.
>
> The DHCP server is listed as 192.168.0.1 which is the default for most home routers.  Bet someone installed something like a Linksys router so they could have wireless.
Yes, this is looking more and more like a rogue DHCP server. I went back to the PC in the office next door and disabled/enabled the network adapter. Got a 192 IP. Did it again and got a 10.16 IP. Again and got a 192 once more. So it seems like there are two DHCP servers racing to hand out addresses! When a PC gets a 192 IP, they have no access to the domain or internet.

> That is new information. I think I would put those settings in a commercial router (Netscreen or like) and put the policies in the router. We have done this at clients (we have Juniper Netscreens for them all) and do not have this issue.
It's a Comcast SMC small business router. As I said, I'm going to tackle this later. The important part is I reverted the settings to the way they were Thursday before we had any problems.

> Sorry, I thought Exchange wasn't accessible internally.  Though that would definitely affect the flow of e-mail, it should not affect client PCs accessing things like the Exchange server or file shares.
>
> Are you sure the local network uses 10.x.x.x and not 192.x.x.x?
>
> Does nslookup ServerName return the correct address?
>
> Could the DHCP pool to which you referred, 10.16.68.1-255, be for VPN use?
Well Exchange, shares and internet are not accessible internally. Sometimes. But the issue seems to be whenever a PC gets a 192 address instead of a 10.16 address.

Yes, I'm looking at DHCP on my DC now, and the address pool is 10.16.68.1-255. That's what my notes say, and that's what range a PC gets when it works fine. Doing an nslookup swik-s-dc01 gives me the correct 10.16.64.59 address which is the DC's static IP. The pool I referred to is for the PCs on the domain.

> Also, I do not know exactly what you did, but it would seem to me the change may have restricted things to one device and also reset your modem to a different IP range.
Referring to the change I made on the modem? All I did was add a 1-to-1 NAT for 173.X.X.X (public) to 10.16.64.62 (internal IP for Exchange server.)


So..., I need to find out where the heck those 192 IPs are coming from, because that seems to be what's causing PCs to lose their access to Exchange/shares/internet!

Eric Jack, IT Manager, Author Commented:
> Good point John.  The modem/router itself is probably the rogue DHCP server, if DHCP was not disabled on it.
Yes, the modem does serve DHCP to the guest network, but the IP range is 10.1.10.X. So I don't think the issue is from the modem. It's been configured that way for years with no issues.

John, Business Consultant (Owner), Commented:
Have you disconnected the "problem" PC? In fact, I would disconnect most computers and focus on the Comcast modem/router, one switch, and the server connected to this switch to see if that much is stable.

Eric Jack, IT Manager, Author Commented:
> Have you disconnected the "problem" PC? In fact, I would disconnect most computers and focus on the Comcast modem/router, one switch, and the server connected to this switch to see if that much is stable.
Like I said, there is not one "problem" PC. My own PC can be (and has been) a problem PC, though it is working fine on the domain at this moment. I'm hoping to narrow down the culprit without having to disconnect everything from the network. There are a lot of connections. Anyway, I don't think it's the modem, because the modem is back to the configuration it had before Friday and has worked fine. Its DHCP does not issue 192 addresses.

> Download Microsoft Rogue DHCP Server detection:
I'm not getting this to work right. When I click the Detect Rogue Servers button, I get a message that says "Interface 10.16... is used by DHCP Client for DHCP operation and cannot be used by Rogue detection tool. Configure the static IPv4 address for this interface, stop DHCP client and restart the application." This happens even after I set the IP on my PC to a static IP outside the DHCP range.

Eric Jack, IT Manager, Author Commented:
Finding the rogue DHCP server:

So based on the ipconfig /all from a PC with the 192 IP, the DHCP server is 192.168.0.1. I can ping that IP from the affected PC. But I can't ping it from my PC which is on the domain with a correct IP. Shouldn't that IP be pingable either way? Since when a PC requests an IP, it can get an address from either one?
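
One way to make that check repeatable is to pull the DHCP Server line out of saved `ipconfig /all` output and compare it with the server you expect. A rough Python sketch, assuming the DC at 10.16.64.59 is the only legitimate DHCP server; the sample text imitates the usual Windows layout with illustrative values, and `field` and `check_lease` are hypothetical helper names:

```python
import re
from typing import Optional

# Assumption: the DC (static IP given in this thread) is the only real DHCP server.
EXPECTED_DHCP_SERVER = "10.16.64.59"

# Hypothetical excerpt in the layout Windows prints; values are illustrative.
SAMPLE = """\
   IPv4 Address. . . . . . . . . . . : 192.168.0.7(Preferred)
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   DHCP Server . . . . . . . . . . . : 192.168.0.1
"""


def field(text: str, label: str) -> Optional[str]:
    """Return the first IPv4 value found on a line that starts with label."""
    pattern = rf"^\s*{re.escape(label)}[ .]*:\s*(\d+\.\d+\.\d+\.\d+)"
    match = re.search(pattern, text, flags=re.MULTILINE)
    return match.group(1) if match else None


def check_lease(text: str, expected: str = EXPECTED_DHCP_SERVER) -> str:
    """Report whether the lease came from the expected DHCP server."""
    server = field(text, "DHCP Server")
    if server is None:
        return "no DHCP server listed"
    return "ok" if server == expected else f"unexpected DHCP server {server}"


print(check_lease(SAMPLE))  # prints "unexpected DHCP server 192.168.0.1"
```

Feeding it the output from a PC that just renewed its lease shows at a glance which server won that round.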

Rob Williams Commented:
The Microsoft Rogue DHCP Server detection tool is just going to tell you it is 192.168.0.1.

Ping the unknown router/gateway:
ping 192.168.0.1
Now run arp -a
Locate the MAC address of 192.168.0.1
Next, go to the following site and enter the MAC address:
http://standards.ieee.org/develop/regauth/oui/public.html

This will tell you who the manufacturer is.  It may only tell you it is Intel, but may tell you something like Linksys, or your new modem/router.
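
The ARP step can also be scripted. The sketch below, in Python, picks the MAC for one IP out of saved `arp -a` output and returns the first three octets (the OUI), which is the part the IEEE lookup uses to identify the manufacturer. The sample output and its MAC address are invented for illustration, and `mac_for` and `oui` are hypothetical helper names:

```python
import re
from typing import Optional

# Hypothetical `arp -a` excerpt in the Windows layout; the MAC is made up.
SAMPLE = """\
Interface: 192.168.0.7 --- 0xb
  Internet Address      Physical Address      Type
  192.168.0.1           00-11-22-aa-bb-cc     dynamic
  192.168.0.255         ff-ff-ff-ff-ff-ff     static
"""


def mac_for(arp_output: str, ip: str) -> Optional[str]:
    """Return the MAC listed for exactly this IP, or None if it is absent."""
    for line in arp_output.splitlines():
        parts = line.split()
        # Compare the whole first column so 192.168.0.1 does not match 192.168.0.10.
        if len(parts) >= 2 and parts[0] == ip:
            return parts[1].lower()
    return None


def oui(mac: str) -> str:
    """Return the first three octets of a MAC, e.g. '00-11-22'."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac)
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return "-".join(digits[i:i + 2] for i in (0, 2, 4)).upper()


mac = mac_for(SAMPLE, "192.168.0.1")
print(mac, oui(mac))  # prints "00-11-22-aa-bb-cc 00-11-22"
```

The ARP table only holds entries for devices the PC has talked to recently, which is why the ping comes first.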

I don't believe in coincidences.  I still suspect the modem/router

Rob Williams Commented:
>"Shouldn't that IP be pingable either way? Since when a PC requests an IP, it can get an address from either one? "
No you cannot ping a device on another subnet unless you have routing set up between them.
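
The on-link test a host makes before sending a packet can be sketched in Python. The prefix lengths here are assumptions, since the thread never states the subnet masks: /24 for the rogue 192.168.0.x network and /20 for the office LAN (chosen only because it covers both 10.16.64.x and 10.16.68.x). `on_link` is a hypothetical helper name:

```python
import ipaddress


def on_link(source_interface: str, destination: str) -> bool:
    """True if destination falls inside the source's own subnet.

    source_interface uses 'address/prefix' notation, e.g. '192.168.0.7/24'.
    """
    network = ipaddress.ip_interface(source_interface).network
    return ipaddress.ip_address(destination) in network


# PC with a rogue lease (assumed /24) can reach the rogue router directly.
print(on_link("192.168.0.7/24", "192.168.0.1"))  # True
# PC with a proper lease (assumed /20) sees the DC as local...
print(on_link("10.16.68.40/20", "10.16.64.59"))  # True
# ...but 192.168.0.1 is outside its subnet.
print(on_link("10.16.68.40/20", "192.168.0.1"))  # False
```

When the result is False, the PC hands the packet to its default gateway instead, and the ping only succeeds if that gateway has a route to the other network.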

Eric Jack, IT Manager, Author Commented:
Rob, yeah... I just got the MAC from the PC on the "rogue" subnet using the blog link you left earlier.

Seems it's from Delta Networks. No idea...
My switch is managed, but I'm not exactly sure how to use it to track down that MAC. It's an HP ProCurve 4208. Logging into it to see if I can figure it out.

John, Business Consultant (Owner), Commented:
@Eric Jack - I would not get locked on to one cause.

Rogue DHCP server: Yes, could be. But who would do that in your organization?
Bad modem: Yes, it was good, but they fail, as Rob has noted.
Viruses: Can cause strange issues and should be checked on all machines.

Eric Jack, IT Manager, Author Commented:
John, oh... trust me... we've had people here do some silly things on the network before. In their defense, the products we design and manufacture have computers, servers and network interfaces in them, and we write our own software to run them. So our Engineering department has hardware and software developers who don't always follow the rules (by choice or by accident) and sometimes plug servers or other network devices into the LAN instead of the "walled garden" they are supposed to play in.

I run Symantec Endpoint Protection on all the clients and servers here. I have more faith in that than something having been plugged into my network. So rather than going on a witch hunt for a virus-laden PC(s), I'm focusing my energy on finding 192.168.0.1 and I'm going to pull the plug on it to see if that corrects the problems.

Rob Williams Commented:
I think on a ProCurve the command is   show mac    to see which MAC is attached to which port.

Delta Networks, or Dell? The PC is a Dell. Were you checking the MAC of the PC (192.168.0.7) or of 192.168.0.1?

What about the DNS names in your ipconfig, are they correct?
SWIK-D-FSTEFANI
tx.local
chi.ameritech.net

The latter seem odd for internal use.

Eric Jack, IT Manager, Author Commented:
From the PC that's got the 192.168.0.7 IP, I did arp -a |find "192.168.0.1" and wrote down the MAC it provided for that IP address.

I also tried an nslookup 192.168.0.1 and nbtstat -a 192.168.0.1 hoping to glean a name of something I can find, but no luck. Get unknown.

And of course I can't figure out how to find the MAC on the HP switch. I can't find a way to list the ports with the MAC addresses, and the diagnostics tab has a Link Test where you enter a MAC, but it keeps failing. I can't even get the Link Test to find my own MAC address, which makes me think I'm using the wrong format or something.

John, Business Consultant (Owner), Commented:
If you try a trial version of Comm View (tamosoft.net) it can show you packets for an IP address and in the packets are MAC addresses. Not exactly what you need but it might help.

Eric Jack, IT Manager, Author Commented:
I just found the command to list MACs from a telnet session into the switch rather than the web-based Java gui. I found the MAC... and it's on the fiber trunk to another switch. So off I go on hunting this thing down!

Telnet to switch, log in, SHOW MAC-ADDRESS
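
Part of the trouble in matching addresses by hand is notation: Windows prints MACs as 00-11-22-aa-bb-cc, while the switch may group the digits differently (for example 001122-aabbcc). A small Python sketch that normalizes both forms and looks up the port in saved SHOW MAC-ADDRESS output; the table layout and every value in it are assumptions for illustration, not captured from the switch, and `normalize_mac` and `port_for` are hypothetical names:

```python
import re
from typing import Optional


def normalize_mac(mac: str) -> str:
    """Strip separators and return the 12 hex digits in lower case."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return digits


# Hypothetical table; the real column layout may differ.
SAMPLE = """\
  MAC Address   Port  VLAN
  ------------- ----- ----
  001122-aabbcc B24   1
  0019bb-123456 A3    1
"""


def port_for(table: str, mac: str) -> Optional[str]:
    """Return the port column for the row whose MAC matches, if any."""
    wanted = normalize_mac(mac)
    for line in table.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            candidate = normalize_mac(parts[0])
        except ValueError:
            continue  # header or separator row
        if candidate == wanted:
            return parts[1]
    return None


print(port_for(SAMPLE, "00-11-22-AA-BB-CC"))  # prints "B24"
```

If the port returned is an uplink to another switch, the same lookup is repeated on that switch until it lands on an edge port.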

Eric Jack, IT Manager, Author Commented:
I think I licked it!

Using the telnet sessions into my HP switches, I traced the MAC for 192.168.0.1 back to a port. And sure enough, it was one of Engineering's PCs in the product development test area. The port had a non-IT switch hanging off it with a domain PC and some other unknown systems! Unplugged that switch from my domain and things started working normally again.

Tested the PC in the office next door. Renewed the IP multiple times and it got a DC-issued DHCP address in the 10.16.68.X range every time! And an oddity with one of my wireless APs cleared up too. Thank you engineering! (Sarcasm implied.)

I'll review the thread shortly to accept the solution(s). Looks like maybe I can finally start my vacation which was supposed to begin at noon on Friday. At least I won't need to come in tomorrow morning.

Rob Williams Commented:
Sounds like progress!

Eric Jack, IT Manager, Author Commented:
So the primary problem was a rogue DHCP server put onto the network by someone in Engineering. I tried to select the responses that directly related to this as the solution. Thank you everyone for helping me muddle through this.

Rob Williams Commented:
Thanks Eric. Glad you were able to solve it... and go on vacation.
Though my suggestion of a rogue DHCP server was ultimately the issue, John's suggestions were valid in troubleshooting this sort of problem.  I am happy to share the points.

Cheers!
--Rob