Replication Issue on Additional Domain Controller Windows 2008 R2

We have four domain controllers (ABCDCQ1, ABCDCQ2, ABCDCQ3 and ABCDCQ4) in a single-domain environment. ABCDCQ1 holds all five FSMO roles. We are getting a replication error on ABCDCQ2, which has the additional domain controller role only. The output of repadmin /showrepl, run on ABCDCQ2, is attached. Replication fails for the DomainDnsZones partition only; all other partitions replicate successfully.

Please help us resolve the replication issue on ABCDCQ2.
Replication-issue-on-Additional-.doc
ShailendraJadhav Asked:

ShailendraJadhav (Author) Commented:
Could you please look into this at the earliest? We want to provide a solution to our client as soon as possible.

footech Commented:
Have you already looked at Microsoft's guidance for this?  They provide troubleshooting steps and resolutions.
http://support.microsoft.com/kb/2645996
You could just demote and repromote, but it's better to find out the actual cause if possible to help prevent it from happening again.

Will Szymkowski (Senior Solution Architect) Commented:
Take a look at the Directory Service event logs on the DC in question. Also run dcdiag /c (comprehensive mode, which runs the non-default tests as well) on that DC and review the output.

You may also want to reference the link below for more details on troubleshooting AD replication issues:
http://support.microsoft.com/kb/2645996

Hope this helps

stu29 Commented:
I am with Spec01: start with the basics. Run DCDIAG /TEST:DNS to see whether anything is misconfigured in DNS. If it is, try dcdiag /fix.

Life1430 Commented:
Please refer to the URL below; it should have enough information to guide you towards a resolution.

http://support.microsoft.com/kb/2645996/en-gb

footech Commented:
Oh my gosh, how many people are going to provide the same link that I did in the first reply? :)

BTW, although running DCDIAG is always a good step when suspecting a replication problem, in this case it should only echo the problem as shown by repadmin /showrepl.  A problem with DNS wouldn't manifest itself as only a single AD partition failing to replicate and showing corruption.

stu29 Commented:
ShailendraJadhav ... I think we need some more info to help you.  

Did you check the DS logs?

Did you try to up the logging level? (http://technet.microsoft.com/en-us/library/cc961809.aspx)

Life1430 Commented:
@footech, apologies :-) I hadn't compared my reply against the prior postings.

Sandeshdubey (Senior Server Engineer) Commented:
It seems that the AD database is corrupted, which is why replication is failing. This could be due to errors on the drive, outdated drivers or firmware, or corruption of the AD database itself. For possible causes and resolutions see: http://technet.microsoft.com/en-us/library/replication-error-8451-the-replication-operation-encountered-a-database-error(v=ws.10).aspx

http://social.technet.microsoft.com/Forums/windowsserver/en-US/d05d1174-3193-416f-a1b3-4dd61919f763/repadmin-syncall-error-the-replication-operation-encountered-a-database-error

Check the integrity of the AD database; if an error is reported, defragment the AD database: http://technet.microsoft.com/en-us/library/cc816754(v=ws.10).aspx http://support.microsoft.com/kb/232122

Run chkdsk in read-only mode to check for drive errors. If errors are reported, run chkdsk /f to fix them. Exclude the NTDS, SYSVOL and NTFRS folders from antivirus scanning.

Alternatively, if the issue cannot be fixed, you can forcefully demote the DC, perform a metadata cleanup, and then promote the server back as a DC.

Reference links
Forceful removal of DC: http://support.microsoft.com/kb/332199
Metadata cleanup: http://www.petri.co.il/delete_failed_dcs_from_ad.htm

Hope this helps

ShailendraJadhav (Author) Commented:
Hello All,
Thank you very much for your immediate response.
1. We performed a file integrity check, which ended with the database error “Operation terminated with error -1206( JET_errDatabaseCorrupted, Non database file or corrupted db )”.

2. We performed a defragmentation, which ended with the error “Operation terminated with error -1605( JET_errKeyDuplicate, Illegal duplicate key )”.

3. We performed Semantic Database Analysis with fixup, which reported errors. We then ran file maintenance: recover, which reported that database recovery was successful.
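
For anyone collecting these results over several attempts, the error lines quoted in steps 1 and 2 can be picked apart with a small script. This is only a sketch in Python: the names JetError and parse_jet_error are made up for illustration, and the pattern assumes the message layout matches the two lines quoted above.

```python
import re
from typing import NamedTuple, Optional


class JetError(NamedTuple):
    code: int          # numeric error, e.g. -1206
    name: str          # symbolic name, e.g. JET_errDatabaseCorrupted
    description: str   # free-text description from the message


# Assumed layout, taken from the two messages quoted in this thread:
#   Operation terminated with error -1206( JET_errDatabaseCorrupted, ... )
JET_ERROR_RE = re.compile(
    r"Operation terminated with error\s+(-?\d+)\s*\(\s*(JET_\w+)\s*,\s*(.*?)\s*\)"
)


def parse_jet_error(line: str) -> Optional[JetError]:
    """Return the parsed error, or None if the line holds no such message."""
    match = JET_ERROR_RE.search(line)
    if match is None:
        return None
    return JetError(int(match.group(1)), match.group(2), match.group(3))


messages = [
    "Operation terminated with error -1206( JET_errDatabaseCorrupted, "
    "Non database file or corrupted db )",
    "Operation terminated with error -1605( JET_errKeyDuplicate, "
    "Illegal duplicate key )",
]
for message in messages:
    print(parse_jet_error(message))
```

Keeping the numeric code and the symbolic name together makes it easier to search the Microsoft articles linked earlier in the thread for the same failure.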

However, when we run repadmin /showrepl, it still shows replication failing for the DomainDnsZones partition. The result for the DomainDnsZones partition only is shown below.

DC=DomainDnsZones,DC=AEESINC,DC=COM
    CLT\ABCDCQ3 via RPC
        DSA object GUID: 7c1e8bc2-8dcf-4ea6-80a3-d5bf6311dd7f
        Last attempt @ 2013-09-10 06:13:41 failed, result 8451 (0x2103):
            The replication operation encountered a database error.
        3260 consecutive failure(s).
        Last success @ 2013-08-07 09:14:51.
    CLT\ABCDCQ4 via RPC
        DSA object GUID: b9ce7848-b161-4882-a797-a0f9a03c2c6b
        Last attempt @ 2013-09-10 06:13:41 failed, result 8451 (0x2103):
            The replication operation encountered a database error.
        3259 consecutive failure(s).
        Last success @ 2013-08-07 09:14:51.
    NAS2\ABCDCQ1 via RPC
        DSA object GUID: 6b81deee-fa46-4f14-ae09-70f4f148be80
        Last attempt @ 2013-09-10 06:15:14 failed, result 8451 (0x2103):
            The replication operation encountered a database error.
        43370 consecutive failure(s).
        Last success @ 2013-08-07 09:09:54.
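
When the full repadmin /showrepl output is long, a short script can list just the failing inbound partners. The following is a rough Python sketch, not an official tool: the class and function names are invented for illustration, and the patterns assume the text layout shown in the output above.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InboundNeighbor:
    partition: str                      # naming context the entry belongs to
    source: str                         # site and server as printed by repadmin
    error_code: Optional[int] = None    # result of the last failed attempt
    consecutive_failures: int = 0
    last_success: Optional[str] = None


# Patterns assumed from the output pasted in this thread.
SOURCE_RE = re.compile(r"^(\S+\\\S+) via RPC$")
FAILED_RE = re.compile(r"failed, result (\d+)")
COUNT_RE = re.compile(r"^(\d+) consecutive failure\(s\)\.$")
SUCCESS_RE = re.compile(r"^Last success @ (.+?)\.$")


def parse_showrepl(text: str) -> List[InboundNeighbor]:
    """Collect one record per inbound neighbor found in the text."""
    neighbors: List[InboundNeighbor] = []
    partition: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(("DC=", "CN=")):
            partition = line            # start of a new naming context
            continue
        source = SOURCE_RE.match(line)
        if source and partition is not None:
            neighbors.append(InboundNeighbor(partition, source.group(1)))
            continue
        if not neighbors:
            continue
        current = neighbors[-1]
        failed = FAILED_RE.search(line)
        count = COUNT_RE.match(line)
        success = SUCCESS_RE.match(line)
        if failed:
            current.error_code = int(failed.group(1))
        elif count:
            current.consecutive_failures = int(count.group(1))
        elif success:
            current.last_success = success.group(1)
    return neighbors


SAMPLE = r"""
DC=DomainDnsZones,DC=AEESINC,DC=COM
    CLT\ABCDCQ3 via RPC
        DSA object GUID: 7c1e8bc2-8dcf-4ea6-80a3-d5bf6311dd7f
        Last attempt @ 2013-09-10 06:13:41 failed, result 8451 (0x2103):
            The replication operation encountered a database error.
        3260 consecutive failure(s).
        Last success @ 2013-08-07 09:14:51.
    CLT\ABCDCQ4 via RPC
        DSA object GUID: b9ce7848-b161-4882-a797-a0f9a03c2c6b
        Last attempt @ 2013-09-10 06:13:41 failed, result 8451 (0x2103):
            The replication operation encountered a database error.
        3259 consecutive failure(s).
        Last success @ 2013-08-07 09:14:51.
    NAS2\ABCDCQ1 via RPC
        DSA object GUID: 6b81deee-fa46-4f14-ae09-70f4f148be80
        Last attempt @ 2013-09-10 06:15:14 failed, result 8451 (0x2103):
            The replication operation encountered a database error.
        43370 consecutive failure(s).
        Last success @ 2013-08-07 09:09:54.
"""

for n in parse_showrepl(SAMPLE):
    if n.error_code is not None:
        print(f"{n.source}: error {n.error_code}, "
              f"{n.consecutive_failures} failures, last success {n.last_success}")
```

All three partners report the same result code and last succeeded on the same day, which is consistent with the problem being local to ABCDCQ2 rather than with any one source DC.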

Please let me know if you have any suggestions.

Thanks.

footech Commented:
After your step 3 did you perform an offline defrag?
Sandeshdubey posted a link for the procedure, but here it is again.
http://support.microsoft.com/kb/232122

If after the reboot the event logs are still reporting errors and replication still isn't working, I would proceed with the demote/promote.

compdigit44 Commented:
Is your client running a Windows 2003 AD domain?
http://support.microsoft.com/kb/832851

Also, please review the following link: http://eniackb.blogspot.com/2009/06/active-directory-database.html

compdigit44 Commented:
Have you checked the problem DC for hardware errors?

ShailendraJadhav (Author) Commented:
Thank you all for your kind support.

1. We already performed an offline defragmentation after step 3 and rebooted the server, but replication is still not working.

2. We are using a Windows 2008 R2 AD domain.

3. We have already run a hardware check and found no hardware issues on our Dell server.

Thanks.

stu29 Commented:
At this point I would have to agree with footech: I would demote the box. See http://www.smallbusinesstech.net/more-complicated-instructions/windows/adding-and-removing-windows-server-2008-r2-domain-controllers

Sandeshdubey (Senior Server Engineer) Commented:
The log indicates that both the defrag and the integrity check failed. As you have multiple DCs, the best way to deal with this, as others have suggested, is to demote the faulty DC and promote the server back as a DC.

You cannot demote the faulty DC gracefully, so you need to do a forceful removal. Run dcpromo /forceremoval on it, then run a metadata cleanup on another (healthy) DC to remove the faulty DC's entries from the AD database and DNS. If the faulty DC is an FSMO role holder, you also need to seize the FSMO roles on another DC.

Once that is done, you can promote the server back as an additional DC. Also configure the authoritative time server role on the PDC role holder.
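
The order of operations above, including the conditional FSMO step, can be written down as a small checklist generator. This is just an illustrative Python sketch: recovery_plan and its step wording are my own, and it only prints a plan; it does not run anything against AD.

```python
from typing import Dict, List

FSMO_ROLES = (
    "Schema master",
    "Domain naming master",
    "RID master",
    "PDC emulator",
    "Infrastructure master",
)


def recovery_plan(faulty_dc: str, role_holders: Dict[str, str]) -> List[str]:
    """Return the ordered steps for force-removing and re-promoting a DC."""
    steps = [
        f"On {faulty_dc}: run dcpromo /forceremoval",
        f"On a healthy DC: run metadata cleanup to remove {faulty_dc} "
        "from AD and DNS",
    ]
    # Seizing is only needed for roles the faulty DC actually held.
    held = [role for role in FSMO_ROLES if role_holders.get(role) == faulty_dc]
    if held:
        steps.append("On a healthy DC: seize FSMO roles: " + ", ".join(held))
    steps.append(f"Promote {faulty_dc} back as an additional DC")
    steps.append(
        "Configure the authoritative time server on the PDC emulator role holder"
    )
    return steps


# Situation from this thread: ABCDCQ1 holds all five roles, ABCDCQ2 is faulty.
holders = {role: "ABCDCQ1" for role in FSMO_ROLES}
for number, step in enumerate(recovery_plan("ABCDCQ2", holders), start=1):
    print(f"{number}. {step}")
```

For the environment described in the question, the seize step drops out because ABCDCQ2 holds no FSMO roles.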

Reference links
Forceful removal of DC: http://support.microsoft.com/kb/332199
Metadata cleanup: http://www.petri.co.il/delete_failed_dcs_from_ad.htm
Seize FSMO roles: http://www.petri.co.il/seizing_fsmo_roles.htm
Authoritative time server: http://support.microsoft.com/kb/816042
Configuring the time service on the PDC Emulator FSMO role holder

Hope this helps

compdigit44 Commented:
This may be a little off topic, but this posting made me think of the following...

If you have a multi-master domain, as the user has posted, and you start to have NTDS issues on one server, would this automatically affect all servers, since the NTDS DB is replicated to all of them? Would a corrupt DB on one server therefore corrupt all of them?

With this in mind, I am a bit confused as to how demoting the problem server would correct a corrupt NTDS DB issue on one server in a multi-master environment.

footech Commented:
@compdigit44 - I'm not really in a position to speak authoritatively on this, but here's my take.  Although I believe it's possible, typically this won't be the case (I'm wondering myself about percentages).  When corruption is detected you will usually see replication stopped which will prevent the spread.  I've heard of people having to manually stop replication to/from a particular server because of corruption, but I've never witnessed that situation myself.

ShailendraJadhav (Author) Commented:
Hi,

When we run chkdsk on the affected server, it reports: "Windows found problems with the file system, run CHKDSK with the /F (fix) option to correct these".

Could this be the reason for the database corruption?

Thanks.

ShailendraJadhav (Author) Commented:
Demoting and re-promoting resolved the issue. Thanks to all for your assistance.

footech Commented:
If you haven't already, run the chkdsk.  If it's a physical server with only a single drive (not RAID) run it with the /r switch.  I wouldn't think this would be related since if the file itself was corrupt (and not just something with its data) then I would expect more than just the DomainDNSZones replication being stopped, but I suppose it's possible and the disk errors could cause other problems.

Sandeshdubey (Senior Server Engineer) Commented:
If there are errors on the disk, you need to fix them. Please take a backup of the server and then proceed with chkdsk /f.

compdigit44 Commented:
I would also run a full system diagnostic on this server at this point. Vendors like IBM, HP, Dell, etc. provide their own bootable diagnostic tools to scan your server.

Just a suggestion

compdigit44 Commented:
How did you make out with this?

ShailendraJadhav (Author) Commented:
Demoted it, then performed a metadata cleanup, removed the DNS entries, and removed the entries from Sites and Services. Then re-promoted it.

compdigit44 Commented:
Are you still having replication issues after the demote/re-promote?

Sandeshdubey (Senior Server Engineer) Commented:
After the DC is promoted, be sure to check the health of the new DC with dcdiag /q and repadmin /replsum, and post the log if any error is reported.
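
If you want to capture both checks in one go for posting, something like the following Python sketch would do. It is only an illustration: the function names are my own, it assumes dcdiag and repadmin are on the PATH of the DC, and it assumes that empty output from dcdiag /q means no errors were reported. The repadmin /replsum output is not interpreted, only collected for reading.

```python
import subprocess
from typing import Callable, List, Tuple

Runner = Callable[[List[str]], str]


def run_command(args: List[str]) -> str:
    """Run a command and return stdout and stderr combined as text."""
    done = subprocess.run(args, capture_output=True, text=True)
    return done.stdout + done.stderr


def post_promotion_check(run: Runner = run_command) -> Tuple[bool, str]:
    """Run both checks; return (dcdiag printed nothing, text worth posting)."""
    # Assumption: dcdiag /q prints only errors, so empty output means none found.
    dcdiag_output = run(["dcdiag", "/q"]).strip()
    # repadmin /replsum is returned as-is for a human to read.
    replsum_output = run(["repadmin", "/replsum"]).strip()
    clean = dcdiag_output == ""
    report = "\n".join([
        "== dcdiag /q ==",
        dcdiag_output or "(no errors reported)",
        "== repadmin /replsum ==",
        replsum_output,
    ])
    return clean, report


# Usage on the re-promoted DC (not executed here, since it needs the AD tools):
#   clean, report = post_promotion_check()
#   print(report)
```

The runner is passed in as a parameter so the logic can be tried with canned output on a machine that does not have the AD tools installed.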