Unable to replicate Exchange database

Hello Exchange experts/gurus/saviors of my stress!

I have been racking my brain on how to fix this issue where one database "SA-DB01" will not replicate to the DR server. The copy queue length keeps growing.

Event ID errors are:

Event ID 4138 MSExchange Repl

Event ID 4374 MSExchange Repl

Event ID 4113 MSExchange Repl

Event ID 1009 MSExchangeFastSearch

Here are my attempts, step by step:

SA-DB01 will not replicate to the DR server. The copy queue keeps growing and the passive copy is in a Failed and Suspended state.

Attempted fix 1: Ran Update-MailboxDatabaseCopy "SA-DB01\SA-EXDR-P01" -DeleteExistingFiles -BeginSeed -SourceServer SA-EX-P01

The copy queue length seems to increase while the seeding is taking place. The seed runs fine for 2-4 hours, then fails on me with the error below.

“The seeding operation failed. Error: An error occurred while performing the seed operation. Error: An error occurred while communicating with server 'SA-EX-P01'. Error: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host.”

The database copy then goes back to Failed and Suspended. I try to resume it and get the following error:

“The Microsoft Exchange Replication service encountered an error while inspecting the logs and database for SA-DB01\SA-EXDR-P01 on startup. Error: File check failed : Database file 'F:\SA-DB01\SA-DB01.edb' was not found.”

Sure enough, the edb wasn’t there.

Attempted fix 2: I removed the old DR database copy and added a new one, selecting SA-EX-P01 as the seeding source. Sure enough, after 4 hours it failed and went back to Failed and Suspended. I hit resume and got the same error as above.

I ran a database health test against SA-DB01 on SA-EX-P01 (production) and everything checked out fine.

Attempted fix 3: Restarted the MS Exchange Fast Search and MS Exchange Replication services. Tried to reseed again but got the same error as above.

Attempted fix 4: I rebooted my DR and P01 servers, thinking this would work, and ran another reseed. Sure enough, after 4 hours the copy failed on me and I got the same error as above.

Workaround/Testing: I created a new database and added a copy on the DR server, and all worked fine. I moved my mailbox to the new database with no problems, and it replicates to the DR server.

Workaround for queue length: Enabled Windows backup and it decreased the queue length from 40k to 10k.

It seems that just this one database can't replicate to the DR server. All the other ones replicate with no issues. I would hate to think that it is a database problem. If it is, I would have to move all mailboxes from SA-DB01 to SA-DB05. At this point, I don't know what else to try. Can anyone help, or does anyone have experience with this problem?
Asked by: Kenny Placido (Sr System Administrator)
 
timgreen7077 (Exchange Engineer) commented:
Yes, you have to dismount the DB to run eseutil. Did you verify that space isn't an issue on the DB?
Also, you mentioned that it fails after 4 hours. Is that at a certain time of day, or are there any similarities between the failures? Also, were there any changes in your org?
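
For anyone scripting this check: once the database is dismounted, the header dump from eseutil /mh includes a "State:" field that reads either "Clean Shutdown" or "Dirty Shutdown". Below is a minimal Python sketch that pulls that field out of a saved dump. The function name and the sample text are made up for illustration (the sample is not captured from SA-DB01), so treat it as a starting point rather than a finished tool.

```python
import re
from typing import Optional


def shutdown_state(header_dump: str) -> Optional[str]:
    """Return the value of the 'State:' field from an eseutil /mh dump, or None if absent."""
    match = re.search(r"^\s*State:\s*(.+?)\s*$", header_dump, flags=re.MULTILINE)
    return match.group(1) if match else None


# Illustrative excerpt only; a real header dump contains many more fields.
sample = """
DATABASE HEADER:
            State: Dirty Shutdown
"""

print(shutdown_state(sample))
```

A "Dirty Shutdown" result means the database still needs log replay (or repair) before it is consistent.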
 
timgreen7077 (Exchange Engineer) commented:
Are the Prod and DR servers in the same site or in different sites? If they are in different sites, I would attempt to seed from a healthy copy in the same site.
 
Kenny Placido (Sr System Administrator, author) commented:
They are in different sites.
 
timgreen7077 (Exchange Engineer) commented:
Can you attempt the seed from a healthy copy in the same site?
 
timgreen7077 (Exchange Engineer) commented:
Also, attempt the seed after hours, when mail flow traffic is slower.
 
Kenny Placido (Sr System Administrator, author) commented:
I did the reseed on a weekend when no one would be working and got the same results. I have to do an integrity check on SA-DB01, but I believe I have to dismount the database first to run eseutil /mh, correct?

Also, I am running Exchange 2016 CU7.
 
Kenny Placido (Sr System Administrator, author) commented:
I am seeding from a healthy copy, but there is only 1 copy. I have 2 other DBs, SA-DB02 and SA-DB05, and they seed over with no problems.
 
Kenny Placido (Sr System Administrator, author) commented:
Space isn't an issue; I have 1.37 TB of free space. I have run the seed at different times of day and still get the same issue. Two changes were made: we moved our main DC from virtual to physical (all information such as IP, DNS, and FSMO roles remained the same), and we did a firmware upgrade on our 10G switch.
 
timgreen7077 (Exchange Engineer) commented:
Did this happen right after your environment changes?
I'm wondering if an issue with the network or storage occurs while the reseed is running. It's strange that the reseed runs for hours and then just fails for no apparent reason. I would check whether there are any network drops between the sites and any storage issues in the DR site, such as excessive reads and writes. Do you have fast storage?
 
Michelangelo (Consultant) commented:
If the other DBs are replicating just fine, I suspect an issue with that particular DB. How to proceed depends on how your Exchange is set up: moving all mailboxes out of that DB to another one would probably solve the issue.
 
Jeff Glover (Sr. Systems Administrator) commented:
I assume the physical path to the DB on the DR server is identical to the physical path to the DB on the Main server.
 
Kenny Placido (Sr System Administrator, author) commented:
I'm just going to move the mailboxes to another database. I think this database is corrupted. Thanks for the help, everyone.
 
Kenny Placido (Sr System Administrator, author) commented:
Update:

I found out that the problem was actually 2 problems. The database wasn't replicating because it was dirty and had to be repaired. It would have taken a long time to repair it and then move the mailboxes. The other reason behind the failed replication was the connection between the primary Exchange server and the DR Exchange server. I ran iperf on both servers and was transferring data at a rate of only 3 Mbit/s. It turned out to be a bad switch port. I did a shut/no shut on the port and that fixed it. The transfer rate is now 300 Mbit/s.
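
To put those two iperf numbers in perspective, here is a rough back-of-the-envelope calculation in Python. It is only a sketch: it assumes each queued log is a 1 MB transaction log file (the Exchange log file size), takes the 40k copy queue mentioned earlier in the thread, and ignores compression, protocol overhead, and new logs being generated in the meantime. The function name is made up for illustration.

```python
LOG_FILE_MB = 1  # Exchange transaction log files are 1 MB each


def drain_hours(queue_length: int, link_mbit_per_s: float) -> float:
    """Hours to ship queue_length log files over the link.

    Simplification: no compression, no protocol overhead, no new logs arriving.
    """
    total_mbit = queue_length * LOG_FILE_MB * 8  # megabytes -> megabits
    return total_mbit / link_mbit_per_s / 3600


# 3 Mbit/s (bad port) versus 300 Mbit/s (after the shut/no shut)
for rate in (3, 300):
    print(f"{rate} Mbit/s: {drain_hours(40_000, rate):.1f} h")
```

Under those assumptions, a 40k queue needs roughly 29.6 hours at 3 Mbit/s but only about 18 minutes at 300 Mbit/s, which fits with the queue growing faster than it could drain over the bad port.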
 
timgreen7077 (Exchange Engineer) commented:
Good find. I'm willing to bet it was that network connection more than anything. Thanks for the update :)
Question has a verified solution.