meirionwyllt

Testing DHCP failover - clients unable to contact the 'standby' DHCP server

Our main DHCP server is on Windows Server 2012 R2. We have a failover partner on Windows Server 2019. We have set up DHCP failover, which appears to be set up OK, because all the scopes are fully in sync, as are the leases.

I tried to test the failover, so I shut down the Active server. The Standby server then reported Connection Interrupted. At this point, on one of the test clients, I ran ipconfig /all to check whether the client was at least pointing towards the Standby server, which it was. The DHCP Server logs in Event Viewer also had events saying: "This DHCP server ###### has transitioned to a PARTNER DOWN state for the failover relationship #### and the MCLT period of 3600 seconds has expired. The server has taken over the free IP address pool of the partner server ##### for all scopes which are part of the failover relationship." So, all good, I thought.

I then did an ipconfig /renew on this client, but I got...

An error occurred while renewing the interface Ethernet : unable to contact your DHCP server. Request timed out.

Thinking that it could potentially be a port problem, I then tried to telnet to the standby server on various ports (67, 68, 69), all of which failed to connect (should the DHCP server have allowed this telnet request to connect??)
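For reference, telnet opens TCP connections, whereas DHCP runs over UDP (server port 67, client port 68), so a failed telnet to those ports says nothing about whether the DHCP service is listening. A minimal Python sketch of a UDP-based probe is below, assuming you only want to put a DHCPDISCOVER on the wire and then watch a packet capture or the server's DHCP log for it. The function names, the MAC address and the server IP in the usage note are hypothetical, and whether (and where) the server answers a probe like this depends on the server.

```python
import socket
import struct

DHCP_MAGIC_COOKIE = b"\x63\x82\x53\x63"  # marks the start of DHCP options


def build_dhcp_discover(mac: bytes, xid: int) -> bytes:
    """Build a minimal DHCPDISCOVER payload using the RFC 2131 field layout."""
    if len(mac) != 6:
        raise ValueError("mac must be exactly 6 bytes")
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,            # op: BOOTREQUEST
        1,            # htype: Ethernet
        6,            # hlen: MAC address length
        0,            # hops
        xid,          # transaction ID chosen by the client
        0,            # secs
        0x8000,       # flags: ask the server to broadcast its reply
        b"\x00" * 4,  # ciaddr: client has no address yet
        b"\x00" * 4,  # yiaddr
        b"\x00" * 4,  # siaddr
        b"\x00" * 4,  # giaddr: zero means "not relayed"
        mac,          # chaddr: struct pads this to 16 bytes with NULs
        b"",          # sname: unused, padded to 64 bytes
        b"",          # file: unused, padded to 128 bytes
    )
    # Option 53 (message type), length 1, value 1 = DHCPDISCOVER, then End (255).
    options = DHCP_MAGIC_COOKIE + b"\x35\x01\x01" + b"\xff"
    return header + options


def send_probe(server_ip: str, payload: bytes) -> int:
    """Send the payload to UDP port 67 on server_ip and return the bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (server_ip, 67))
```

A possible use, with placeholder values: `send_probe("192.0.2.10", build_dhcp_discover(bytes.fromhex("020000000001"), 0x12345678))`. This sketch only sends; it does not listen for a DHCPOFFER, because receiving on port 68 normally conflicts with the operating system's own DHCP client.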

In case it would make any difference, I clicked the 'Partner down' button on the Standby server in the Failover settings. It didn't make a difference. In fact, it was a bit annoying that I did that because after the Active server came back up it was in 'Recover wait' mode until the Maximum Client Lead Time elapsed, so for that time I had no functioning DHCP server at all!! It's back to 'Normal' now so that's fine, and it's giving out leases.

Anyway, I'm not sure what to try next. I've powered the Active server back up for now, and I can test again at a later point.

Any ideas what might be wrong here? I'm not sure if the problem is due to the failover not working, or due to a general DHCP server problem with the Standby server. Are there any other tests I can run to find out, before actually trying the failover test again?

Thanks.
Windows Server 2019, Windows OS, Windows Server 2012, DHCP

ASKER CERTIFIED SOLUTION
Peter Hutchison
THIS SOLUTION IS ONLY AVAILABLE TO MEMBERS.
meirionwyllt

ASKER

Ah. Didn't realise that this was needed. It probably hasn't been done, because it's a new server that I've built, and the Network team might not know about it.

Would that need to be done only on the switch that's connected to the new server, or would it need to be done on all the switches for the clients as well?

Thanks
EXPERT CERTIFIED SOLUTION
kevinhsieh
THIS SOLUTION IS ONLY AVAILABLE TO MEMBERS.
meirionwyllt

ASKER

Excellent. Thanks. I'll get our network team involved on Monday and report back.
SOLUTION
Hello There

THIS SOLUTION IS ONLY AVAILABLE TO MEMBERS.
meirionwyllt

ASKER

I asked Networks to check the core switch. Correct, the 'DHCP Standby' server's IP address wasn't added as an iphelper address, and this needs to be done.

However, they said that this wouldn't have caused the issue that I was seeing, because both DHCP servers and the test clients were all on the same VLAN, so iphelper would not have been needed. So they think there is another problem at play here. Any ideas?

There was a different IP address listed as an iphelper address for the above VLAN, though: the IP address of a machine in a different network that is no longer even a DHCP server. Could this old and incorrect entry be confusing things?

Thanks
SOLUTION
THIS SOLUTION IS ONLY AVAILABLE TO MEMBERS.
SOLUTION
kevinhsieh
THIS SOLUTION IS ONLY AVAILABLE TO MEMBERS.
meirionwyllt

ASKER

Hi, thanks for the recent messages.

Just to clarify my initial testing: I had three clients. On #1 I had done an ipconfig /release before the Active server was shut down (and I had checked the leases on both servers to make sure that it had been removed), and #2 and #3 had active leases. The idea with #1 and #2 was to do ipconfig /renew when the Standby server was in the Connection Interrupted state, and for #3 I would do ipconfig /renew when the Standby server was in the 'Partner down' state. But the ipconfig /renew failed in all three scenarios.

In hindsight, what I should have done is try it on a client on a different VLAN, but I was working within a small timeframe.

Yes, the DHCP server has been Authorized with AD.

As for the Firewall, I disabled the profiles for Domain, Private and Public when I was trying to troubleshoot, but the problem remained.

meirionwyllt

ASKER

Hello, OK I have an update on this. I got Networks to remove the obsolete iphelper address from the core switch.

I then re-tried the previous test in exactly the same way, with the 3 clients testing different scenarios. (Although this time I just stopped the DHCP Server service on the Active server, rather than switching the server off, as suggested above.)

For client1 (where I had done the ipconfig /release before stopping the Active server), this was able to successfully get a lease from the standby server. So this does suggest that the other iphelper address was confusing things.

The other two (client2 and client3) still failed to renew.

Am I right in thinking that...

a) client2 and 3 would continue to work as normal (in terms of network connectivity) because they already have a lease when the Active server goes down?
b) if these leases were to expire and the Active server still wasn't back up, then these clients would behave the same way as client1 (i.e. no lease) and therefore be able to get one from the standby server?
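As background for (a) and (b): under RFC 2131 a client keeps using its address until the lease expires. By default it starts renewing at T1 (50% of the lease) by unicasting to the server that issued the lease, switches at T2 (87.5% of the lease) to broadcasting a request that any DHCP server may answer, and starts again from scratch once the lease expires. The sketch below works through those default timers; the function names and the 8-day lease are illustrative assumptions, and a server can override T1 and T2 through DHCP options 58 and 59, so the real values on a given network may differ.

```python
from datetime import datetime, timedelta


def lease_timers(obtained: datetime, lease_seconds: int):
    """Return (T1, T2, expiry) using the RFC 2131 default fractions."""
    t1 = obtained + timedelta(seconds=lease_seconds * 0.5)    # start of RENEWING
    t2 = obtained + timedelta(seconds=lease_seconds * 0.875)  # start of REBINDING
    expiry = obtained + timedelta(seconds=lease_seconds)
    return t1, t2, expiry


def client_phase(now: datetime, obtained: datetime, lease_seconds: int) -> str:
    """Name the client state at time `now` for a lease obtained at `obtained`."""
    t1, t2, expiry = lease_timers(obtained, lease_seconds)
    if now < t1:
        return "BOUND"      # address in use, no renewal traffic yet
    if now < t2:
        return "RENEWING"   # unicast request to the server that issued the lease
    if now < expiry:
        return "REBINDING"  # broadcast request, any DHCP server may answer
    return "EXPIRED"        # address given up, client starts over with a DISCOVER


# Hypothetical example: an 8-day lease obtained at midnight on 1 January 2024.
obtained = datetime(2024, 1, 1)
t1, t2, expiry = lease_timers(obtained, 8 * 24 * 3600)
print(t1, t2, expiry, sep="\n")
```

With these assumed values, the client would not broadcast for a new server until day 7 of the lease, which is one reason a renewal attempt made shortly after the issuing server stops can fail even though the partner server is healthy.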

Thanks.
SOLUTION
kevinhsieh
THIS SOLUTION IS ONLY AVAILABLE TO MEMBERS.
meirionwyllt

ASKER

OK great, in that case I can close this question. Difficult to know how to do the points because I feel I've learnt a bit from everyone, so I'll spread the points out.

Thanks everyone for your help.