camowen

Windows 2003 Multihomed Domain Controller generates 1054 Userenv error

Windows 2003 Standard server, Application log: Userenv, event 1054, "Windows cannot obtain the domain controller name for your computer network. (An unexpected network error occurred.) Group Policy processing aborted." My troubleshooting steps are listed below. The error was repeating every 5 - 10 minutes and now repeats every 15 minutes, so it is still not fixed.

This server is multihomed - one NIC for the user network, one for server backup. There are 6 AD sites, 5 associated with the primary data center (3 DCs), 1 site associated with a secondary data center (1 DC). No errors in file replication between DCs, no Directory Service errors. Only one DC, added just this week, is generating Userenv errors; the other 3 DCs do not show this error. This server is running Exchange 2003 with SP2. Another server is also running Exchange 2003 on Windows 2003 Standard and is also multihomed (this is a hardware migration project - the original server will be retired), and there are no errors on the original Exchange server. The new server is running Windows 2003 R2 Standard Edition, SP2, with all critical updates.

There is a great deal of troubleshooting information available on this error. I have corrected this error on many implementations of Windows. There are 5 tests that generally get to the bottom of this, all well documented on the Internet:
1. Check the IP configuration, and make sure the only DNS servers listed are ones that contain the AD entries.
2. Run netdiag and dcdiag and look for errors.
3. Run gpupdate and check whether another 1054 is logged.
4. Check DNS: access \\mydomain.com\sysvol\mydomain.com.
5. Check that you can access AD with tools such as dsa.msc (AD Users and Computers).

All of these tests passed, and I found no problems. Well, there was one additional troubleshooting step that I found helpful: %systemroot%\system32\config\netlogon.dns listed entries for both the user and backup networks, and that was not correct. I believe this was corrected by clearing the option "Register this connection's addresses in DNS" (nic properties; TCP/IP properties, advanced, DNS tab). I had to reboot to regenerate the file.
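The netlogon.dns check described above can be sketched in Python. This is a minimal sketch, not a definitive tool: it assumes each record in the file has the layout `<name> <ttl> IN <type> <data>`, and the function name, sample records, and the 192.168.1.0/24 user network are hypothetical values chosen for illustration.

```python
import ipaddress

def find_unexpected_a_records(netlogon_dns_text, allowed_network):
    """Return (name, address) pairs for A records outside allowed_network."""
    network = ipaddress.ip_network(allowed_network)
    unexpected = []
    for line in netlogon_dns_text.splitlines():
        fields = line.split()
        # Assumed record layout: <name> <ttl> IN <type> <data...>
        if len(fields) < 5 or fields[2].upper() != "IN" or fields[3].upper() != "A":
            continue  # skip SRV, CNAME, blank, or malformed lines
        try:
            address = ipaddress.ip_address(fields[4])
        except ValueError:
            continue  # data field is not an IP address
        if address not in network:
            unexpected.append((fields[0], str(address)))
    return unexpected

# Hypothetical sample: one A record on the user network, one on the backup network.
sample = """\
mydomain.com. 600 IN A 192.168.1.10
mydomain.com. 600 IN A 10.10.10.10
_ldap._tcp.mydomain.com. 600 IN SRV 0 100 389 dc4.mydomain.com.
"""

print(find_unexpected_a_records(sample, "192.168.1.0/24"))
```

On a real server the text would be read from %systemroot%\system32\config\netlogon.dns instead of the sample string. Any pair reported is an address the DC is publishing from a network other than the one users should reach.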

Even gpupdate ran with no problems and did not log an error, and gpresult gave the expected results. But the Userenv 1054 errors continue to appear, about every 10 minutes, sometimes every 5 minutes.

I thought I had the fix with this registry edit. I think it clearly made a difference, and it is reproducible: with the fix in place there are no errors in the first hour after boot; with the fix removed I get lots of errors:

HKLM\SYSTEM\CurrentControlSet\Services\DNS\Parameters: add a string (REG_SZ) value PublishAddresses and set it to the network address of the server to be published (if more than one address, separate with spaces).
HKLM\SYSTEM\CurrentControlSet\Services\Netlogon\Parameters: add a DWORD value RegisterDnsARecords and set it to 1.
Stop and restart the DNS and Netlogon services. I would think that clearing the option "Register this connection's addresses in DNS" would have the same effect, but as I say, the results were reproducible.

I now see a single DNS entry for the server, rDNS is correct, netdiag and dcdiag run without errors, netlogon.dns is correct, nslookup correctly resolves the server name, gpupdate runs without errors, and I get only a few Userenv 1054 errors.

But I still don't have a full answer, as the error continues to appear: one hour after boot; 45 minutes later, then 18 minutes later.
Hypercat (Deb)

Sounds like you've done a very thorough job already on this problem.  If you haven't rebooted your server since doing the regedits, try a simple ipconfig /flushdns and then ipconfig /registerdns.  Maybe it's got something in the DNS cache that is causing the errors.  Is this server a DNS server?  If so, is it pointing to itself for primary DNS?
ASKER CERTIFIED SOLUTION
Netman66

camowen

ASKER

Thank you for your response. The server is a DNS server, and is pointing to itself, listed first. Listed second is another W2K3 DC DNS server. Since this server is not actively hosting user data I am free to reboot, and have done so frequently as I troubleshoot. I may be lucky and find that something has been cached within AD (DNS is AD integrated), and maybe more time will clear the problem. It has been an hour since the last boot and still no errors, but that has been the nature of this thing before - 1+ hour latency, and then recurrence.
If you want to test it without waiting, you can force a group policy update and see if the error occurs. From a command prompt, type "gpupdate /force" (without the quotes of course).
camowen

ASKER

netman66: good suggestion! The backup network NIC was listed first - now corrected. Now I must wait for results...
camowen

ASKER

hypercat: gpupdate /force never caused the error to occur. One of the frustrations of tracking this particular problem.
Ah I see - I think the NIC suggestion was a very good one too and may just be the answer.
Binding order determines which NIC the services bind to.  With the NICs in the wrong order, the services use the wrong NIC to handle requests.

I'm concerned that, since the other issues were corrected while the NICs were in the wrong order, a little more tweaking may be necessary once this is fixed.

Keep us posted.
camowen

ASKER

I think the NIC binding order is likely the key. I can see that it will be some time to prove that out, as I had to change binding order on two other DCs. Not sure why they never showed the error. And I suspect that servers will need to be rebooted in the end - something that will have to be scheduled. But the error rate is greatly reduced. So I will close out, and thank you for your help!
No Problem.  Let us know if there are other issues now.

Thanks,
NM
camowen

ASKER

Well, there is a sequel. Once I am confident of what I say here, I would be glad to write this up in a format suitable for the knowledge base. I think the key words would be "Active Directory" "domain controller" multi-homed userenv 1054

I discovered that all of the troubleshooting described above was for naught! And I did quite a bit of additional work on this, as well. Finally my mind went back to my NT4 days, when the rule was simply that you cannot have a multi-homed domain controller. So, I disabled the second network card (part of the tape backup network), set its properties to DHCP, set DNS to another DNS server (DNS is AD integrated), and demoted the server to a member server. And I gave plenty of time for changes to replicate to all sites and DCs. All instances of the 1054 error went away. With the second network card still disabled, I promoted the server back to a DC, and again gave plenty of time for the change to replicate to all sites. Only then did I enable the second network card. Well, I am ahead of the story at this point. But two other DCs were built in this way, with no problems, and so I expect the same of this server now.

Back to the userenv errors. Very frustrating! I could reboot and observe no errors for hours, even a day. And then an occasional error - maybe one now, one in an hour, then 15 minutes, then another hour. But always precisely on a five minute boundary, to the second. This says to me directory synchronization. And after a while the error occurs every 5 minutes. There is a utility, dfsutil, that shows which server is the DFS server. This should always be a server in the same site. "dfsutil /pktinfo" will list all DCs and indicate which is currently assigned as the DFS server. This became a tangent for me, as it consistently showed a server in the same site as the ACTIVE TARGETSET, but then a server in a different site as the TARGETSET. I suspect this is wrong, and a bug within AD. (This was a known bug in Windows 2003 SP1, for which a hotfix is available and a registry edit is required, and it can result in very slow boot times for XP machines.) But it is also how two other DCs in the same site are set, and they have no userenv errors. So I think my work down this path did not address the real problem.
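The observation that the errors always land on a five minute boundary can be checked mechanically once the event times are copied out of the Application log. The following is a minimal Python sketch; the function names and the sample timestamps are hypothetical and only mimic the pattern described above (one error, another an hour later, then 15 minutes, then another hour).

```python
from datetime import datetime

def on_five_minute_boundary(timestamp, tolerance_seconds=0):
    """True if timestamp is within tolerance_seconds of a multiple of 5 minutes."""
    # Seconds elapsed since the previous 5-minute mark (0..299).
    offset = (timestamp.minute % 5) * 60 + timestamp.second
    # Distance to the nearest mark, whether just passed or just ahead.
    return min(offset, 300 - offset) <= tolerance_seconds

def gaps_in_minutes(timestamps):
    """Minutes between consecutive events, after sorting by time."""
    ordered = sorted(timestamps)
    return [(later - earlier).total_seconds() / 60
            for earlier, later in zip(ordered, ordered[1:])]

# Hypothetical event times, not taken from the actual server log.
events = [
    datetime(2024, 1, 1, 9, 0, 0),
    datetime(2024, 1, 1, 10, 0, 0),
    datetime(2024, 1, 1, 10, 15, 0),
    datetime(2024, 1, 1, 11, 15, 0),
]

print(all(on_five_minute_boundary(t) for t in events))
print(gaps_in_minutes(events))
```

If every event passes the boundary test with zero tolerance, a timer-driven process is a more likely trigger than user or network activity, which fits the suspicion of a periodic directory operation.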