DHCP Client needs to be constantly restarted to restore server connectivity

Hi there,

In another thread, we've isolated our SBS 2003 server's connection problems to the DHCP Client needing to be restarted.  I have it set to restart every hour, though sometimes the connection fails before that.  The server loses its outside connection, and all workstations lose internet as a result.

DHCP is definitely off on the router.  

Background: our server went out for a new power unit, I turned on DHCP on the router in the meantime, disabled it when the server came back, and we've had this problem ever since. Failing hardware?  Failing NIC?  Something else?  I'm at a loss as to what to try next...  

Any help would be appreciated!
KentenAsked:
 
RobertPartenCommented:
Kenten, you say there are two network adapters on that machine. Please ensure that the other adapter is disabled, and PLEASE ensure there are NO static IPs assigned to that interface.
 
zippybungle2003Commented:
Any errors in Event Viewer?
 
KentenAuthor Commented:
Nothing related to the times when it goes down.  There's a recurring "KDC" error daily about there being multiple accounts with the name MSSQLSvs...  But that's it.
 
RobertPartenCommented:
When you check on this, is the DHCP Server service in the stopped state in services.msc?

Ensure it is set to Automatic/Started and see what happens. Check ALL devices on the network for a DHCP server. It is possible that you could have a bad NIC, but also check to see if you are having issues at another point in the network. Kind of hard to examine when there isn't much known about the network. But whatever device you were using when SBS went down I would recommend double checking to ensure DHCP is turned off.
 
KentenAuthor Commented:
Router: DHCP unchecked, so it's off.  Tested it by connecting a laptop to it, laptop failed to acquire a network address.

DHCP Client service is set to automatic, Recovery set to restart the service on 1st, 2nd, and subsequent failures.  "Restart service after" is set to 1 minute.

Dependencies: AFD, and TCP/IP Protocol Driver.  Might one of these be failing?  I'm guessing no, since restarting DHCP Client is all that is needed to restore service.

I can tell you anything you want to know about the network, it's pretty straightforward.  8 Windows Vista workstations connected to an SBS 2003 server, one Linksys switch connecting it all and DSL internet coming through a Linksys router.  Some workstation wouldn't have randomly started doing DHCP, would it?
 
RobertPartenCommented:
What about the server....check services.msc on the SERVER side and see what that is set to.
 
KentenAuthor Commented:
Thanks for your reply!

Those ARE the settings on the server, sorry if I was unclear.

DHCP Client service is set to automatic, Recovery set to restart the service on 1st, 2nd, and subsequent failures.  "Restart service after" is set to 1 minute.  Log on is set to the NT AUTHORITY\NetworkService account.

Or am I misunderstanding you?
 
KentenAuthor Commented:
Oh, unless you're talking about the DHCP Server service.  Yes, it's "Started", startup is automatic, and it's set to restart the service on failure.
 
KentenAuthor Commented:
Interestingly, while internet / Outlook / Outlook Web Access, etc. all go down every 10 minutes to 1 hour, necessitating the DHCP Client restart, I'm connected remotely and never lose my VPN or MSTSC connection.
 
RobertPartenCommented:
Are there any other SBS servers in the network at this time? I have seen this issue when two SBS servers are on the same subnet. When you restart this server are all the services in services.msc showing stopped? Event viewer shows nothing?

From the server, can you ping to an outside address? Can you plug in a new client machine and get a DHCP address? Is the SBS server set to a static IP address that is OUTSIDE the scope of the DHCP server address lease pool?
 
KentenAuthor Commented:
Just the one SBS server.  Nothing in the event viewer that I can see.

SBS server is set to a static address of 192.168.2.100, the router is .2.1, while the address range is .2.1 to .2.254 with the server/router IPs set to "excluded from distribution".
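
For anyone double-checking a similar addressing plan, here is a minimal Python sketch using the standard `ipaddress` module. The function name `check_static_hosts` and its parameters are made up for illustration, and the /24 subnet is inferred from the 255.255.255.0 style of this network rather than stated in this post. It only checks that each static host is inside the subnet and, if it falls inside the DHCP scope, that it is excluded from distribution.

```python
import ipaddress


def check_static_hosts(subnet, scope_start, scope_end, excluded, static_hosts):
    """Return a list of problems found with statically addressed hosts.

    A static host is flagged if it lies outside the subnet, or if it sits
    inside the DHCP scope without being excluded from distribution.
    """
    net = ipaddress.ip_network(subnet)
    start = ipaddress.ip_address(scope_start)
    end = ipaddress.ip_address(scope_end)
    excluded_set = {ipaddress.ip_address(a) for a in excluded}

    problems = []
    for name, addr in static_hosts.items():
        ip = ipaddress.ip_address(addr)
        if ip not in net:
            problems.append(f"{name} ({ip}) is outside {net}")
        elif start <= ip <= end and ip not in excluded_set:
            problems.append(f"{name} ({ip}) is inside the DHCP scope and not excluded")
    return problems


# Addresses from the post: scope .2.1 to .2.254, server and router excluded.
# The /24 prefix is an assumption.
static_hosts = {"server": "192.168.2.100", "router": "192.168.2.1"}
problems = check_static_hosts(
    "192.168.2.0/24",
    "192.168.2.1",
    "192.168.2.254",
    excluded=["192.168.2.100", "192.168.2.1"],
    static_hosts=static_hosts,
)
print(problems or "no addressing conflicts found")
```

With the values above the list of problems comes back empty, which matches the setup described here: the static addresses overlap the scope but are excluded.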

When restarting, the services all start up as they're set to do, and connection is good right away.

I can't ping anything outside the network from the server when it's down.

 
RobertPartenCommented:
Ok, so the network is down from the server. Is the adapter disabled? Can you ping the localhost (and also the IP of the server itself) to see if you get a response?

Have you tried pinging the gateway device to see if you can at least ping inside the subnet? Disable any and all power-saving configuration on the adapter under the network properties adapter settings.
 
KentenAuthor Commented:
Power saving on the NIC was checked: it was disabled.
When I ping the server IP while it's down, I get a response.  Pinging the router also gets a response.
 
RobertPartenCommented:
You are pinging the router from the server when it is down, right?

What type of router is between your ISP and you? Because if you are able to ping your router from the server but cannot ping 4.2.2.2 and get a response, then you may have an issue with your ISP/router. Is there a switch between the router and your server? If so, what kind?
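
The layered ping test being walked through in this thread (loopback, own IP, gateway, outside IP, outside name) can be sketched in Python roughly as follows. This is only an illustration: `can_ping`, `classify` and `run_checks` are hypothetical helper names, the exit code of the system `ping` is a rough signal at best (Windows can return success for some "unreachable" replies), and the non-Windows flags assume a Linux-style `ping`.

```python
import platform
import subprocess


def can_ping(host, timeout_s=2):
    """Send one echo request with the system ping; True if it exits with 0."""
    if platform.system() == "Windows":
        cmd = ["ping", "-n", "1", "-w", str(timeout_s * 1000), host]
    else:
        # Linux-style flags; other systems may differ.
        cmd = ["ping", "-c", "1", "-W", str(timeout_s), host]
    done = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return done.returncode == 0


def classify(results):
    """Name the first layer that fails, checking from the host outwards.

    `results` maps 'loopback', 'own_ip', 'gateway', 'external_ip' and
    'external_name' to True (reply received) or False (no reply).
    """
    if not results["loopback"]:
        return "local TCP/IP stack"
    if not results["own_ip"]:
        return "local adapter or its IP configuration"
    if not results["gateway"]:
        return "LAN path to the router (cable, switch, NIC)"
    if not results["external_ip"]:
        return "beyond the gateway: routing on this host, the router, or the ISP"
    if not results["external_name"]:
        return "DNS resolution"
    return "no failure detected"


def run_checks(targets, ping=can_ping):
    """Ping every target in `targets` (name -> host) and classify the outcome."""
    return classify({name: ping(host) for name, host in targets.items()})


# Hypothetical readings: everything local answers, nothing outside does.
example = {
    "loopback": True,
    "own_ip": True,
    "gateway": True,
    "external_ip": False,
    "external_name": False,
}
print(classify(example))
```

Running the same checks from a second machine on the LAN helps separate "this host" from "the router or ISP" when the outcome is "beyond the gateway".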
 
KentenAuthor Commented:
It's down again now.

All from the server:
I can ping other workstations, and the router, and the server IP.
But no response from google.com
Hardware failure from 4.2.2.2
 
KentenAuthor Commented:
Oh, and the switch is a Linksys SR2016 "16-port 10/100/1000 Gigabit Switch"
 
KentenAuthor Commented:
I never even considered the switch.  Is that something that could suddenly have decided to run DHCP?  Is it something I can log into and configure?
 
RobertPartenCommented:
This is a sticky situation in that it could be your router that is failing; however, you could also have an issue with the server hardware itself.

Can you do a snapshot of the ping stats for the local IP of the router and 4.2.2.2 so I can see them? Without starting the server back up, can you go to a workstation and ping the router IP and then 4.2.2.2?

If you get a hardware failure from the PC and the server, I am going to point to an issue further upstream and not at the server level. I assume your SBS box doesn't run the ISA firewall right (aka you aren't using SBS to be a router).
 
KentenAuthor Commented:
I'll do that. I'm connected remotely right now, so I'll have to go in tomorrow and ping from the other workstation.  

In the meantime, is there anything else I could investigate?  Does needing to restart DHCP Client hint at what might be failing, hardware or software?  Should I configure the server's 2nd unused NIC the same as the current one, and switch over to that and see if there's any difference?

Or should I contact the ISP when it's down, and see if they can reach into the router and see if it's active?  Maybe they would have some insight?  (I heard that when we switched to our current ISP, our internet was down for a week as they had to customize all kinds of things in the router...)
 
RobertPartenCommented:
I have no idea about "customization" in a router as routing is just...routing no matter what. If possible, switch NICs on the server and see what happens; however, I would be very interested in knowing what happens from a client machine. Let us know when you get back in.
 
ChiefITCommented:
System Event logs for event 5719?

Event logs for DNS errors?

Do not use two NICs on an SBS domain.

Cisco router with that switch?

Also, please go to the server's command prompt and provide:
DCdiag /test:DNS
It looks like you might have a bad forwarder.

Also, you can try DCdiag /v for config errors, and Netdiag /v for issues in the network config...

Please make sure DHCP is DISABLED on the router. Otherwise it will try to provide DNS as well. This will cause DNS issues with the server.
 
RobertPartenCommented:
@Chief, he can't even ping 4.2.2.2 from the server itself (read the comment he posted about the error message he got back from pinging 4.2.2.2), so it sort of rules out DNS as the culprit.
 
KentenAuthor Commented:
Hi again!  So here's the results of the workstation ping.  Interestingly, the workstation can ping 4.2.2.2, while the server has that hardware error.

[Attachment: ping from server]
[Attachment: ping from workstation]
 
KentenAuthor Commented:
Just in case there's anything useful here, here's the NIC settings:

(unchecked) Network load balancing
(checked) File and Printer Sharing for Microsoft Networks
      Optimized: Maximize data throughput for network applications

(checked) Enable IEEE 802.1x authentication for this network
(checked) Authenticate as computer when computer information is available


IP address: 192.168.2.100
Subnet mask: 255.255.255.0
Default gateway: 192.168.2.1

Preferred DNS server: 192.168.2.100

Advanced TCP/IP Settings
 Default gateways: Metric: Automatic
 (checked) Automatic metric

 DNS server address: 192.168.2.100
 (selected) Append primary and connection specific DNS suffixes
 (checked) Append parent suffixes of the primary DNS suffix

 (checked) Register this connection's addresses in DNS

WINS
 WINS addresses: 192.168.2.100
 (checked) Enable LMHOSTS lookup
NetBIOS setting
 (checked) Enable NetBIOS over TCP/IP

Options
 TCP/IP filtering (not enabled)
 
KentenAuthor Commented:
Maybe useless, but:

Looking at DHCP properties, the DNS settings are set to
   (checked) enable DNS dynamic updates according to the settings below:
      (selected) Dynamically update DNS A and PTR records only if requested by the DHCP clients

I'm wondering why restarting DHCP Client restores the connection.  Either DHCP Client is frozen and needs to be restarted, or else the restart itself is affecting something else, maybe something like the above DNS update?
 
KentenAuthor Commented:
While I'm looking at DHCP settings, here's the Scope properties.  I was hoping to see something weird, such as leases being only for 10 min or something.  No such luck:

Lease duration for DHCP clients
  limited to: 8 days
DNS : (not enabled) Enable DNS dynamic updates according to the settings below
Advanced:  Assign IP addresses dynamically to clients of (selected) DHCP only
 
RobertPartenCommented:
kenten: Please give me the output of the following commands:

route print

netsh interface ip show config

ipconfig /all

from the server.
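
If it helps to capture the same three outputs both while connectivity is up and while it is down, a small script can bundle them into one report per state. The following Python sketch is only one way to do that: `run_command` and `collect` are hypothetical names, the commands are the three listed above, and it assumes a Python 3.7+ interpreter is available on the machine being examined.

```python
import datetime
import subprocess

# The three commands requested above, as argument lists.
COMMANDS = [
    ["route", "print"],
    ["netsh", "interface", "ip", "show", "config"],
    ["ipconfig", "/all"],
]


def run_command(cmd):
    """Run one command and return its text output (stderr merged in)."""
    done = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return done.stdout


def collect(label, runner=run_command):
    """Build one report string; `label` is a free-form tag such as 'up' or 'down'."""
    stamp = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    parts = [f"=== state: {label}, collected {stamp} ==="]
    for cmd in COMMANDS:
        parts.append(f"--- {' '.join(cmd)} ---")
        parts.append(runner(cmd))
    return "\n".join(parts)
```

Calling `collect("down")` while the connection is lost and `collect("up")` after it is restored, then saving each result to a text file, gives two reports that can be compared side by side. The `runner` parameter exists only so the function can be exercised without actually running the commands.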
 
KentenAuthor Commented:
Hi Robert, do you want the output from when connectivity is up or down or both?

Yes, the 2nd NIC is disabled, and while it had a static IP, I removed it just in case.  I also enabled then re-disabled it, in case it never fully disabled the first time, or some such nonsense.  :)
 
ChiefITCommented:
Someone set up an ACL that blocks the MAC or IP address of the server from going out on the network.

You can ping the gateway, you just can't get out the gateway..

One thing you might do is go to the command prompt and clear your ARP cache. ARP is short for Address Resolution Protocol, and that cache may be pointing to the second (disabled) NIC for going out the network when routing.

To view the cache entries, type any one of the following commands:
arp -a
arp -g

To delete the entries, type the following command:
arp -d IP address

To flush the ARP cache, type the following command:
netsh interface ip delete arpcache

But, don't forget to look at the router to make sure there is not an ACL that blocks that IP or MAC of your server from going to the outside world...
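
To see which local interface the ARP cache associates with the gateway, the text printed by `arp -a` can be parsed. The Python sketch below is a rough illustration: the sample text is made up (including the MAC addresses and the interface index) and only mimics the layout that Windows `arp -a` prints, and `parse_arp` and `interfaces_for` are hypothetical helper names.

```python
import re

# Made-up sample in the layout of Windows `arp -a`; all values are invented.
SAMPLE = """
Interface: 192.168.2.100 --- 0x10003
  Internet Address      Physical Address      Type
  192.168.2.1           00-11-22-33-44-55     dynamic
  192.168.2.21          00-11-22-33-44-66     dynamic
"""

IFACE = re.compile(r"^Interface:\s+(\d+\.\d+\.\d+\.\d+)")
ENTRY = re.compile(
    r"^\s*(\d+\.\d+\.\d+\.\d+)\s+([0-9A-Fa-f]{2}(?:-[0-9A-Fa-f]{2}){5})\s+(\w+)"
)


def parse_arp(text):
    """Return {interface_ip: {neighbour_ip: mac}} from `arp -a` style text."""
    table = {}
    current = None
    for line in text.splitlines():
        m = IFACE.match(line)
        if m:
            current = table.setdefault(m.group(1), {})
            continue
        m = ENTRY.match(line)
        if m and current is not None:
            current[m.group(1)] = m.group(2).lower()
    return table


def interfaces_for(table, neighbour_ip):
    """List the local interfaces whose cache holds an entry for neighbour_ip."""
    return [iface for iface, entries in table.items() if neighbour_ip in entries]


table = parse_arp(SAMPLE)
print(interfaces_for(table, "192.168.2.1"))
```

If the gateway shows up under an interface address that belongs to a NIC that is supposed to be disabled, that is a hint the adapter is not really disabled.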
 
RobertPartenCommented:
Kenten, let us know if what you did fixed it. I would like to see the output from the server both up and down. Also, ChiefIT's recommendation of flushing the ARP cache is a good one I forgot. Do you manage the router or does your ISP? As per ChiefIT, I would see if there are any ACLs set up in that router...otherwise I don't know too many ISPs that configure ACLs that block internal computers from being able to access the Internet.

Let me ask you, when the server is up and running, are you able to ping 4.2.2.2?
 
RobertPartenCommented:
Any luck so far?
 
KentenAuthor Commented:
Hi Experts!

Sorry for the delay, I was taken off of this before I could resolve it, and handed it back a week later.  :)  But I resolved it, finally.  Thank you!

So a week before all of this, I followed a different expert's advice and explicitly went and disabled that 2nd NIC.  When I went to look at it again to see if it had a static IP, as Robert suggested, there the NIC was, enabled.  I disabled it again, removed the static IP, and all is working now.

Now, I absolutely can't, for the life of me, believe that I somehow DIDN'T disable it, or disabled it wrongly somehow. I had never disabled a NIC before, and clearly remember going through the steps for the first time.  Yet here it was, enabled.

Either it woke itself up in one of those dozens of server restarts, or else somehow I'm an idiot?  Anyway, I'll keep an eye on it to see if it ever wakes up again...  And I greatly appreciate all of your help!

(P.S. I did flush the ARP cache manually, sometime early in all of this.)
Question has a verified solution.
