Windows 2003 SBS DC problems after adding 2012 as DC

I will be migrating from a 2000/2003 domain to two 2012 DCs over the next 12 months. I have demoted the 2000 DC, upgraded the forest to 2003, and added the two 2012 DCs. The plan is to have one 2012 server as the production server and the other as a backup DC that also holds backup files.

The problem I am having is when I test the 2003 DC by itself. I bring down the two 2012 DCs, then restart the 2003 DC. It comes up, but has severe problems. Exchange doesn't work, and after a while the domain stops functioning as well (cannot log on). Please note that the 2003 server is the production server, and the 2000 was a backup DC. When I instead shut down everything, bring up a 2012 DC, and then bring up my 2003 server, everything works fine.

The first thing that looks wrong in the system event log is an SPNEGO 40960 error from LsaSrv shortly after IPL:

The Security System detected an authentication error for the server LDAP/SVR02. The failure code from authentication protocol Kerberos was "There are currently no logon servers available to service the logon request."

There are many 40960 errors, and the other errors seem to indicate a problem with logon servers. They seem to slowly bring the ship down.

What can I do so that the 2003 DC can run without the 2012 DCs? The 2012 DCs are supposed to be backup DCs.

Lee W, MVP, Technology and Business Process Advisor, commented:
Did you make the 2012 server a Global Catalog server? Did you run DCDIAG /C /E /V prior to promoting the 2012 server(s) to DCs? Did you run DCDIAG /C /E /V after you promoted them? Exchange requires a Global Catalog to work (and it will take time to find another one if the one it was using isn't available; it's not instantaneous). In a native mode domain, Global Catalogs process logons.
Gareth Gudger, Solution Architect, commented:
Which server has the FSMO roles? When you are doing this testing, are you shutting down the DC that holds all the FSMO roles? If so, that is not good. Move the FSMO roles to the 2012 server and retest.
MikeBroderick, Author, commented:
I did not do anything in the promotion dialog to make it happen, but the 2012 servers are GC servers. Note: the 2003 server is also a GC server. No, I did not run DCDIAG. I will try it tonight.

All FSMO roles are held by the 2003 server. My test is to shut down the 2012 servers then restart the 2003 server to see if it will run without the 2012 servers. Are you sure you want me to move the roles to a 2012 server?

Thank you for your help.
Gareth Gudger, Solution Architect, commented:
I guess I misunderstood what you were trying to accomplish. For logon services and Exchange Services to work, the server holding the FSMO roles needs to be running.
MikeBroderick, Author, commented:
Sorry I wasn't clear. Eventually I will move away from the 2003 Server but for now it is the production server. I was treating the 2012 servers as throw-away because nothing was on them. I'd leave them down all night or for a few days, and started noticing problems if I rebooted the 2003 server.

I ran the DCDIAG command. It passed all tests except 2:

    Starting test: Services
         * Checking Service: Dnscache
         * Checking Service: NtFrs
         * Checking Service: IsmServ
            IsmServ Service is stopped on [SVR02]
         * Checking Service: kdc
         * Checking Service: SamSs
         * Checking Service: LanmanServer
         * Checking Service: LanmanWorkstation
         * Checking Service: RpcSs
         * Checking Service: w32time
         * Checking Service: NETLOGON
         ......................... SVR02 failed test Services

The other failure was due to warnings/errors in the event viewer in the last 24 hours. Since replication messages were there, I discounted this section. I can send you the whole file if you want to see it.
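
A full DCDIAG /C /E /V report is long, so it can help to pull out only the failures. The following is a minimal Python sketch, assuming the report has been saved as text and that failure lines look like the excerpt above; the function names are hypothetical and the patterns are based only on the lines quoted in this thread.

```python
import re

def failed_dcdiag_tests(report: str) -> list[tuple[str, str]]:
    """Return (server, test) pairs for each '<server> failed test <name>' line."""
    pattern = re.compile(r"\.+\s*(\S+) failed test (\S+)")
    return [(m.group(1), m.group(2)) for m in pattern.finditer(report)]

def stopped_services(report: str) -> list[tuple[str, str]]:
    """Return (service, server) pairs for each 'Service is stopped' line."""
    pattern = re.compile(r"(\S+) Service is stopped on \[(\S+?)\]")
    return [(m.group(1), m.group(2)) for m in pattern.finditer(report)]

# Sample text copied from the Services test excerpt in this thread.
SAMPLE = """\
    Starting test: Services
         * Checking Service: NtFrs
         * Checking Service: IsmServ
            IsmServ Service is stopped on [SVR02]
         * Checking Service: kdc
         ......................... SVR02 failed test Services
"""

print(failed_dcdiag_tests(SAMPLE))
print(stopped_services(SAMPLE))
```

For the excerpt above this reports the Services test on SVR02 and the stopped IsmServ service, which is the same conclusion reached by reading the output by eye.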

Brad Held commented:
If you are shutting down the 2012 servers, make sure the 2003 server points to itself for DNS first, then to the 2012 servers. I would also make sure that the 2012 servers point to the 2003 server first, then to the other 2012 server. I say this because Netlogon errors usually come down to being unable to find a DC in DNS.

Is the 2003 a global catalog?

Another thing I would like to clarify is that there is really no such thing as a backup domain controller. All domain controllers are active: they process logons, accept changes to objects and attributes, process Group Policy, etc. The PDC emulator is normally the authoritative time source, manages trusts, handles urgent replication of passwords, etc.

I assume Exchange points to the 2003 server first for DNS, then to a 2012 server.

Is this a 2003 SBS server? If so, set the Intersite Messaging service (IsmServ) to Automatic and start the service, then rerun dcdiag. That service is used for SMTP replication, which I am sure you're not doing, but that should fix the dcdiag failure.
Brad Held commented:
One other side note: it is generally not a good idea to turn off DCs for days on end. That is a recipe for disaster. Think of it this way: users are changing passwords and logging in, new groups are being created, computers are changing their passwords, and group policies are being edited. If the 2003 server goes down and has to be rebuilt, there is a good chance all of those changes will be lost. It's generally better to just let the additional DCs run; it will cut down on the replication needed when they come up and will protect you in case of a disaster.
MikeBroderick, Author, commented:
When you say points to itself first, where do you mean? On the network adaptor, each server points to the 2003 server first. When I open the DNS console, I see the 2012 DCs listed before the 2003 server. Should I change them here? If so, how? Will the changes propagate to the other DCs?

Yes, I know that technically there are no backup DCs. I was referring to a backup in case one goes down.

Later I'm going to shut down the 2012 machines and retry the dcdiag routine. I'll let you know if I see anything.
Brad Held commented:
No, under the TCP/IPv4 properties of the network adapter. The primary should be the 2003 server, the secondary should be one of the 2012 servers, and optionally the third would be the other 2012 server.
Under normal conditions, the DNS server listed first is where the DNS records are registered, and they are then propagated to the other servers. Secondaries are used when the primary is unavailable, which is why the secondaries should never point to an ISP DNS server.

Changes are replicated via Active Directory replication, following the replication topology that the KCC calculates. Replication requires DNS only in that all SRV records and host records must be registered. Generally speaking, DNS should be Active Directory integrated, which means the AD zones are stored in an application partition, either DomainDnsZones or ForestDnsZones depending on the scope of replication. If all servers go to the same server first, then your FSMO role holder is updated when each of your DCs registers its SRV records, making it the single point of truth.

As you get ready to retire the 2003 server, change the DCs to point their primary DNS to the server you want to be the FSMO role holder.
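
The DNS client ordering described above can be checked mechanically. This is a minimal Python sketch, assuming you have already collected each DC's configured DNS server list by hand; the server names, the addresses, and the `check_dns_order` function are all hypothetical and only illustrate the two rules (same server first everywhere, no non-DC resolvers such as an ISP).

```python
def check_dns_order(configured, preferred_first, dc_addresses):
    """Flag DCs whose DNS client settings break the ordering advice.

    configured:      {server name: [DNS server addresses in NIC order]}
    preferred_first: address every DC should list first
    dc_addresses:    set of addresses that belong to domain controllers
    """
    problems = []
    for server, dns_list in configured.items():
        if not dns_list or dns_list[0] != preferred_first:
            problems.append(f"{server}: first DNS server should be {preferred_first}")
        for addr in dns_list:
            if addr not in dc_addresses:
                problems.append(f"{server}: {addr} is not one of the listed domain controllers")
    return problems

# Hypothetical addresses: .10 is the 2003 server, .11 and .12 are the 2012 servers.
DCS = {"192.168.1.10", "192.168.1.11", "192.168.1.12"}
CONFIGURED = {
    "DC2003": ["192.168.1.10", "192.168.1.11", "192.168.1.12"],
    "DC2012A": ["192.168.1.11", "8.8.8.8"],  # wrong primary, plus an outside resolver
}

for line in check_dns_order(CONFIGURED, "192.168.1.10", DCS):
    print(line)
```

When the 2003 server is retired, the same check can be rerun with `preferred_first` set to the address of the new FSMO role holder.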
MikeBroderick, Author, commented:
I ran dcdiag again, with the 2012 servers down. It looked OK, ignoring the errors on the replication tests.

I think I found the problem. The following error was in the Directory Service section of the event log. I didn't read it closely before because its title said replication, and I missed its significance:

Source: NTDS Replication  Category: Replication  ID: 2092

This server is the owner of the following FSMO role, but does not consider it valid. For the partition which contains the FSMO, this server has not replicated successfully with any of its partners since this server has been restarted. Replication errors are preventing validation of this role.
Operations which require contacting a FSMO operation master will fail until this condition is corrected.
FSMO Role: CN=Schema,CN=Configuration,DC=BroderickData,DC=local

In other words, it's working as designed. I've had a 2 DC setup for over 10 years and never noticed this.
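
Event 2092 names the role only by the distinguished name of the object that holds it. The following is a minimal Python sketch that maps such a DN to the familiar role name, assuming the standard locations of the five FSMO role objects; the function name is hypothetical, and the naive comma split does not handle escaped commas inside a DN.

```python
def fsmo_role_from_dn(dn: str) -> str:
    """Map the 'FSMO Role:' DN from an event 2092 message to a role name."""
    parts = [p.strip().upper() for p in dn.split(",")]
    if parts[:2] == ["CN=SCHEMA", "CN=CONFIGURATION"]:
        return "Schema Master"
    if parts[:2] == ["CN=PARTITIONS", "CN=CONFIGURATION"]:
        return "Domain Naming Master"
    if parts[:2] == ["CN=RID MANAGER$", "CN=SYSTEM"]:
        return "RID Master"
    if parts[0] == "CN=INFRASTRUCTURE":
        return "Infrastructure Master"
    if all(p.startswith("DC=") for p in parts):
        # The PDC emulator role is held on the domain root object itself.
        return "PDC Emulator"
    return "Unknown"

# DN copied from the event text quoted in this thread.
print(fsmo_role_from_dn("CN=Schema,CN=Configuration,DC=BroderickData,DC=local"))
```

For the DN in the event above this yields the Schema Master role; a server holding all five roles would normally log one such event per role.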
Gareth Gudger, Solution Architect, commented:
For logon services and Exchange Services to work, the server holding the FSMO roles needs to be running.

Back to my original comment. :)

Glad you found the problem.
MikeBroderick, Author, commented:
Thank you guys for the help. It was a group effort.