odewulf
asked on
checking which DC is acting as the PDC after a dcpromo
We have 2 DCs: 1 SBS 2003 and 1 Server 2003, on different sites and subnets (the SBS is in SF and the Server 2003 in DC). The Server 2003 had a bad hard drive and its AD got corrupted, so replication stopped working. We decided to demote the Server 2003 and then re-promote it. Here is what I did:
1. check the fsmo roles: they are all done by the SBS server (good)
2. run dcpromo on server 2003: everything went fine and I restarted the server 2003
3. check SBS server:
- run metadata clean up
- there is nothing in the DNS, AD, site and domain about the server 2003
4. try the replication and everything is working fine.
5. the event logs are fine and the fsmo roles are held by the SBS, so everything looked great until we tried logging in in DC
6. when we tried logging in to a workstation in DC, it took a really long time to apply the computer settings
7. since I hadn't restarted the SBS yet, I decided to restart both the SBS and the server 2003
8. the SBS came back up first, and when I tried to log in I got an error saying that the domain doesn't exist or is unavailable. so it looks like the master DC was the server 2003.
Once the server 2003 came back up I was able to log in to the SBS, and logins on the workstations in DC and in SF were fine as well.
9. I checked who is acting as the PDC and it is the SBS server. In DC the logon server for the workstations is the server 2003 and the group policies are applied from the SBS server, so everything looks fine. But I am not sure whether this will happen again: I believe the SBS should always be the master, meaning that even if the server 2003 is down people should still be able to log on to the domain. Is there a way for me to check that?
or was it just because I needed to restart the SBS server once before all the changes were applied?
thanks a lot for your advice
Gaetan
Do a dcdiag and let me know what it reports.
ASKER
thanks for your fast reply.
let me do a dcdiag and post the result.
the SBS server has the 5 FSMO roles.
right now only the server 2003 is the global catalogue server. do I need to restart the servers after doing that? what does that change?
thanks again for your help
No restart required. What that does is allow both servers to authenticate users.
ASKER
sorry right now only the SBS has global catalogue checked. why do I need both to be global catalogue?
thanks
ASKER
ok so I am going to check it for both servers
here is the dcdiag result
here is the one for the SBS server:
Domain Controller Diagnosis
Performing initial setup:
Done gathering initial info.
Doing initial required tests
Testing server: Default-First-Site-Name\SBS
Starting test: Connectivity
......................... SBS passed test Connectivity
Doing primary tests
Testing server: Default-First-Site-Name\SBS
Starting test: Replications
......................... SBS passed test Replications
Starting test: NCSecDesc
......................... SBS passed test NCSecDesc
Starting test: NetLogons
......................... SBS passed test NetLogons
Starting test: Advertising
......................... SBS passed test Advertising
Starting test: KnowsOfRoleHolders
......................... SBS passed test KnowsOfRoleHolders
Starting test: RidManager
......................... SBS passed test RidManager
Starting test: MachineAccount
......................... SBS passed test MachineAccount
Starting test: Services
IsmServ Service is stopped on [SBS]
......................... SBS failed test Services
Starting test: ObjectsReplicated
......................... SBS passed test ObjectsReplicated
Starting test: frssysvol
......................... SBS passed test frssysvol
Starting test: frsevent
There are warning or error events within the last 24 hours after the
SYSVOL has been shared. Failing SYSVOL replication problems may cause
Group Policy problems.
......................... SBS failed test frsevent
Starting test: kccevent
......................... SBS passed test kccevent
Starting test: systemlog
......................... SBS passed test systemlog
Starting test: VerifyReferences
......................... SBS passed test VerifyReferences
Running partition tests on : ForestDnsZones
Starting test: CrossRefValidation
......................... ForestDnsZones passed test CrossRefValidation
Starting test: CheckSDRefDom
......................... ForestDnsZones passed test CheckSDRefDom
Running partition tests on : DomainDnsZones
Starting test: CrossRefValidation
......................... DomainDnsZones passed test CrossRefValidation
Starting test: CheckSDRefDom
......................... DomainDnsZones passed test CheckSDRefDom
Running partition tests on : Schema
Starting test: CrossRefValidation
......................... Schema passed test CrossRefValidation
Starting test: CheckSDRefDom
......................... Schema passed test CheckSDRefDom
Running partition tests on : Configuration
Starting test: CrossRefValidation
......................... Configuration passed test CrossRefValidation
Starting test: CheckSDRefDom
......................... Configuration passed test CheckSDRefDom
Running partition tests on : hce
Starting test: CrossRefValidation
......................... hce passed test CrossRefValidation
Starting test: CheckSDRefDom
......................... hce passed test CheckSDRefDom
Running enterprise tests on : hce.local
Starting test: Intersite
......................... hce.local passed test Intersite
Starting test: FsmoCheck
......................... hce.local passed test FsmoCheck
ASKER
and here is the one for the Server 2003
Domain Controller Diagnosis
Performing initial setup:
Done gathering initial info.
Doing initial required tests
Testing server: Default-First-Site-Name\DC-FS
Starting test: Connectivity
......................... DC-FS passed test Connectivity
Doing primary tests
Testing server: Default-First-Site-Name\DC-FS
Starting test: Replications
......................... DC-FS passed test Replications
Starting test: NCSecDesc
......................... DC-FS passed test NCSecDesc
Starting test: NetLogons
......................... DC-FS passed test NetLogons
Starting test: Advertising
......................... DC-FS passed test Advertising
Starting test: KnowsOfRoleHolders
......................... DC-FS passed test KnowsOfRoleHolders
Starting test: RidManager
......................... DC-FS passed test RidManager
Starting test: MachineAccount
......................... DC-FS passed test MachineAccount
Starting test: Services
......................... DC-FS passed test Services
Starting test: ObjectsReplicated
......................... DC-FS passed test ObjectsReplicated
Starting test: frssysvol
......................... DC-FS passed test frssysvol
Starting test: frsevent
......................... DC-FS passed test frsevent
Starting test: kccevent
......................... DC-FS passed test kccevent
Starting test: systemlog
......................... DC-FS passed test systemlog
Starting test: VerifyReferences
......................... DC-FS passed test VerifyReferences
Running partition tests on : ForestDnsZones
Starting test: CrossRefValidation
......................... ForestDnsZones passed test CrossRefValidation
Starting test: CheckSDRefDom
......................... ForestDnsZones passed test CheckSDRefDom
Running partition tests on : DomainDnsZones
Starting test: CrossRefValidation
......................... DomainDnsZones passed test CrossRefValidation
Starting test: CheckSDRefDom
......................... DomainDnsZones passed test CheckSDRefDom
Running partition tests on : Schema
Starting test: CrossRefValidation
......................... Schema passed test CrossRefValidation
Starting test: CheckSDRefDom
......................... Schema passed test CheckSDRefDom
Running partition tests on : Configuration
Starting test: CrossRefValidation
......................... Configuration passed test CrossRefValidation
Starting test: CheckSDRefDom
......................... Configuration passed test CheckSDRefDom
Running partition tests on : hce
Starting test: CrossRefValidation
......................... hce passed test CrossRefValidation
Starting test: CheckSDRefDom
......................... hce passed test CheckSDRefDom
Running enterprise tests on : hce.local
Starting test: Intersite
......................... hce.local passed test Intersite
Starting test: FsmoCheck
......................... hce.local passed test FsmoCheck
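With two long reports like these it is easy to miss the few lines that matter. A minimal Python sketch for pulling out only the failed tests, assuming the dcdiag output was saved as plain text and that result lines keep the dotted "passed test / failed test" shape shown above (the function name `failed_tests` is made up for this illustration, it is not part of dcdiag):

```python
import re

# Matches dcdiag summary lines such as
# "......................... SBS failed test Services"
RESULT_LINE = re.compile(r"^\.+\s+(\S+)\s+(passed|failed)\s+test\s+(\S+)\s*$")


def failed_tests(dcdiag_output: str) -> list:
    """Return (target, test_name) pairs for every 'failed test' line."""
    failures = []
    for line in dcdiag_output.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match and match.group(2) == "failed":
            failures.append((match.group(1), match.group(3)))
    return failures


if __name__ == "__main__":
    # Excerpt copied from the SBS report above.
    sample = """\
Starting test: Services
IsmServ Service is stopped on [SBS]
......................... SBS failed test Services
Starting test: frssysvol
......................... SBS passed test frssysvol
Starting test: frsevent
......................... SBS failed test frsevent
"""
    for target, test in failed_tests(sample):
        print(f"{target}: {test}")
```

Run against the two reports in this thread, this would flag only Services and frsevent on the SBS; the DC-FS report has no failed lines.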
You may as well have both, it offers authentication redundancy.
Is the server in DC assigned to the subnet in DC in Sites&Services?
The PDC emulator can be offline and users can still log into any DC. The primary DNS setting for users in DC should of course be their own server, with SF as the secondary. Depending on your bandwidth, the logon from the alternate site will take a little longer. Did you give the rebuilt server in DC the same name or a new one?
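Since clients find a DC through DNS, one way to check what each site can see is to query the SRV records that the DCs register under `_msdcs`. A small Python sketch that only builds the record names and prints matching `nslookup` commands; it assumes a single-domain forest (so the global catalog record lives under the same domain name), and the helper name `dc_locator_srv_names` is hypothetical:

```python
def dc_locator_srv_names(domain: str, site: str = "") -> dict:
    """Build the DNS SRV record names that Windows clients query to find DCs."""
    names = {
        # Any domain controller for the domain
        "any_dc": f"_ldap._tcp.dc._msdcs.{domain}",
        # The DC currently holding the PDC emulator role
        "pdc_emulator": f"_ldap._tcp.pdc._msdcs.{domain}",
        # Global catalog servers (registered under the forest root domain;
        # assumed here to be the same as `domain`)
        "global_catalog": f"_ldap._tcp.gc._msdcs.{domain}",
    }
    if site:
        # DCs registered for one specific AD site
        names["site_dc"] = f"_ldap._tcp.{site}._sites.dc._msdcs.{domain}"
    return names


if __name__ == "__main__":
    # Domain and site names taken from the dcdiag output in this thread.
    for role, name in dc_locator_srv_names("hce.local", "Default-First-Site-Name").items():
        print(f"{role}: nslookup -type=SRV {name}")
```

Running the printed commands against each site's DNS server should show whether both DCs are registered, and which one answers for the `pdc` record.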
ASKER
in fact they are both assigned to the DC subnet...not sure why that is.
I have been going through the logs and here are a few things I discovered:
NTDS replication error 1411: AD failed to construct a mutual authentication service principal name for the following DC:
DC: 126c....
the call was denied. communication with this domain controller might be affected.
I can't find that resource in the DNS. the 2 DC have different resource numbers.
NTDS replication warning 2092
for the SBS server: this server is the owner of the following FSMO role but does not consider it valid. for the partition which contains the fsmo, this server has not replicated successfully with any of its partners since this server has been restarted. replication errors are preventing validation of this role.
if I run the replication from Sites and Services it says successful as well
not sure what is going on here as everything seems to be working fine for the users
I have plenty of other errors but I think those 2 are the most intriguing
ASKER
brent4257,
that is the way it is: own DNS as primary and SF DNS for secondary
I used the same name when I did the rebuild.
we have an MPLS link between the sites so it should be fine. note that once we restarted both servers last night the logon is fine and the users are able to authenticate. I am just wondering why I could not log in to the SBS server while the DC server was down.
and of course now I am going through all those weird errors :-/
thanks for your help
Here are a couple of links:
http://www.microsoft.com/technet/support/ee/transform.aspx?ProdName=Windows%20Operating%20System&ProdVer=5.2&EvtID=1411&EvtSrc=Active%20Directory&LCID=1033
http://support.microsoft.com/default.aspx?scid=kb;en-us;232538
One contact stated to make sure the timezones were set correctly which is basic stuff I'm sure you checked but thought I'd add.
Run netdiag and see what you get.
B-rad
Haven't found a Q article on this but here is the basic problem. You will receive this error when you try to promote a machine and it is pointing to a DC that is not replicating correctly. If you go to the command prompt and type "set" you can see what your logon server is. This is the server that logged you onto the domain. This server probably is the one that is not replicating. Here is how it would happen in most scenarios:
You build a new W2K box and join it to the domain, at which point it contacts a DC and the object for the new computer account is created on that DC. If this DC is having replication problems, the object will not replicate out to the other DCs that hold the FSMO roles. So when you try to promote the new server to a DC, it checks with the RID master, which has no idea of that object. Hence it errors out.
Resolution:
Make sure that the server in "set" as logon server can communicate with all other DCs especially the FSMO role holders, once the object replicates throughout the forest you should be able to promote it.
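The "set" check above reads the `LOGONSERVER` environment variable, so it can also be scripted. A minimal Python sketch, assuming it runs inside a Windows domain logon session (the function name `logon_server` is made up for this example). Keep in mind that the value is fixed at logon time, so it shows who authenticated you then, not necessarily who is answering now:

```python
import os


def logon_server(env=None) -> str:
    """Return the authenticating DC name from LOGONSERVER, without leading backslashes."""
    env = os.environ if env is None else env
    # Windows stores the value as \\SERVERNAME; strip the UNC-style prefix.
    return env.get("LOGONSERVER", "").lstrip("\\")


if __name__ == "__main__":
    name = logon_server()
    print(name or "LOGONSERVER is not set (not a Windows domain logon session?)")
```

Comparing the result on a workstation in each site is a quick way to confirm that DC users land on DC-FS and SF users on the SBS.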
ASKER
Brent4257
thanks for the links. I will see if that DNS error about the "missing" DC goes away.
ok so it might just be that I am too impatient. everything is working fine for the users in SF and DC. the users in DC use the DC server as the logon server and in SF they use the SBS.
the replication is working as well as I was able to create an user in sbs AD and it was in the DC AD, and deleting it from the DC AD, deletes it from the SBS AD.
I will then wait until tomorrow night to restart the servers and see what happens. My main concern now is that if the DC server goes down or the internet goes down in DC, people won't be able to authenticate anymore. I find that really strange, since the logon server in SF is the SBS, so I am not sure why the SBS server can't log on when the DC server is down. anyway, I will update you on friday morning
thanks again for your help
Gaetan
ASKER
Thanks for your help. I guess that I was just too impatient, as one day later I was able to restart the SBS and the other DC without issue. I will keep in mind that replication can take up to 24 hours to clean up old records
thanks
Gaetan