Mailbox database shows Disconnected and Resynchronizing after activating the database in DR (Exchange Server 2013 DAG member)

Dear All,

I have a strange issue. We have three Exchange 2013 mailbox servers, two in PR and one in DR. To test database health, I activated one of the mailbox databases on the DR server to check that everything works. I was able to activate the database in DR and connect to a mailbox in that database, but when I try to activate the database back in the primary site I cannot do so, and I get the error below.

The seeding operation failed. Error: An error occurred while performing the seed operation. Error: A timeout occurred while communicating with server 'DRMBX'. Error: "A connection could not be completed within 15 seconds." [Database: DR-Drill-Testing, Server: PRMBX.lamprell.com]

I thought this might be happening because I activated the database when its status was not healthy, so I created a new database with no mailboxes. When I tried the same thing, activating it in DR and then again in PR, I still got the same error.

I can also see event ID 1111 logged, which says:

Automatic Reseed Manager failed to execute repair workflow 'FailedSuspendedCopyAutoReseed' for database 'DR-Drill-Testing'. Error: The Automatic Reseed Manager encountered an error: The automatic repair operation for database copy 'DR-Drill-Testing\PRMBX' will not be run because one of the prerequisite checks failed. Error: There are no Exchange volumes mounted under root path 'C:\ExchangeVolumes', which are required for the Automatic Reseed component.

Please help me with this.
Sysguys asked:

Will Szymkowski (Senior Solution Architect) commented:
Have you tried removing the database copy completely (database and logs) and then doing a complete reseed of the database? That should resolve your issue.

Will.
Sysguys (Author) commented:
Thanks for the reply. I tried that and it does not work. In fact, I created a new mailbox database without any mailboxes, and even then the issue is the same.
Bembi (CEO) commented:
Just a quick shot...
How are the servers connected (speed)?
Have you checked DNS, i.e. whether you can resolve both servers from both sides by all names (FQDN, NetBIOS name, IP)?
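
As a rough illustration of that name check, here is a minimal Python sketch that could be run from each site. The helper name `resolve_all` is hypothetical, and the example names in the note below are only the ones that appear in the error message; it tests name resolution only, not whether the resolved address is reachable.

```python
import socket

def resolve_all(names):
    """Map each name to a sorted list of resolved addresses, or None if it does not resolve."""
    results = {}
    for name in names:
        try:
            infos = socket.getaddrinfo(name, None)
            # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
            results[name] = sorted({info[4][0] for info in infos})
        except socket.gaierror:
            results[name] = None
    return results
```

For example, `resolve_all(["DRMBX", "PRMBX.lamprell.com"])` run on both the PR and DR servers should return the same addresses on each side; a `None` entry means that name does not resolve from that server.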
Sysguys (Author) commented:
Hi Bembi,

We are connected to the DR site through an MPLS link. I confirmed with nslookup that name resolution works fine from both ends.

We have a lot of application replication between PR and DR. I can see that all the logs get replicated to DR properly, but the link is a little slow. There is always a lag, but replication completes every day.
Bembi (CEO) commented:
I interpret DR and PR as two different sites/locations, right?
And they are connected, as you said, by an MPLS line?
Again the question: what is the bandwidth of this connection, and what is the latency?

Can all servers connect to a local AD domain controller with a global catalog?
Can all the servers resolve and connect to the witness server?

At least from the error message, it looks like a connection issue somewhere.
Sysguys (Author) commented:
Yes, DR (disaster recovery site) and PR (primary site) are two different sites in different subnets, connected through an MPLS link.

We also have a local DC and GC in DR. As we have an odd number of servers (two mailbox servers in PR and one in DR) and we do manual datacenter switchovers, do I still need a file share witness for this?
We have 11 mailbox databases; only one is activated in DR, and all the rest are in PR. The databases active in PR are replicating to DR without any issues; only the database active in DR is not replicating to PR.
Bembi (CEO) commented:
Again the question: how fast is the MPLS link, and what is the latency?

Yes, with an odd number of members you don't need a witness.

The reason I'm asking about bandwidth is that we are talking about databases. They are transaction based, which makes it less critical, but you still move a lot of data around. For a test database that is fine, but for a large number of mailboxes it may become a critical point. A very common setup is that replication between the DAG servers gets its own backbone and does not use the client network. The DAG configuration itself is even designed this way.

If the seeding fails, it can also have something to do with backup. Because the transaction logs are replicated, it can happen that backup software runs a backup that forces the instance to delete the transaction logs. That means the transaction logs are gone on one node while another node still expects them. If this is the case, check your backup strategy.

To get the copies back, you can shut the databases down, check them with ESEUTIL /MH for a clean shutdown state, and if that is the case, you can delete the logs and restart the databases.

For Exchange 2013 on Windows Server 2012, you should also check this article to verify the correct procedure.
https://technet.microsoft.com/en-us/library/ff367878(v=exchg.150).aspx
Sysguys (Author) commented:
Bembi, thanks for the reply.

I have some more information. We still have our Exchange 2010 servers in coexistence mode. Although we have migrated all the mailboxes to 2013, the servers have not been decommissioned yet.
As a test, I created a mailbox database on Exchange 2010 in PR and then tried activating it on the DR 2010 server. It works smoothly without any issues. We have a 100 Mbps link, so this should confirm that the network is not the issue.
But there is one difference: the 2010 server in DR is a physical server, and the 2013 server in DR is a VM on Hyper-V.
So I suspect it may have something to do with Hyper-V.
At the same time, I have 1.2 TB of databases in PR that replicate fine to DR; only from DR to PR am I not able to do so.
Bembi (CEO) commented:
So, if 2010 works, you may have a look at its configuration, just to compare whether something is different.
But in general, 2010 should not really be involved, because the contact point should be 2013.

Physical or virtual should not be the issue either.

What we see first is the connection timeout. This may point (in general) to a network configuration problem, name resolution, or even AD connectivity (the configuration is stored there). But it can also be connected with the second error.

What we see in the second error:
There are no Exchange volumes mounted under root path 'C:\ExchangeVolumes', which are required for the Automatic Reseed component.
The first error states: while communicating with server 'DRMBX'
So the DR cannot communicate with itself....
Sysguys (Author) commented:
I mean that my 2010 DAG and 2013 DAG are different. Just for testing, I created a database in the 2010 DAG to check for the same issue; it works fine in 2010.
Sysguys (Author) commented:
My Exchange volumes are placed on the E: drive.
Bembi (CEO) commented:
You should check
Get-DatabaseAvailabilityGroup DAG1 | Format-List *auto*
to see the configuration of your DAG and whether you find 'C:\ExchangeVolumes' there. It is the default for AutoReseed, so if you don't have a reseed volume, Exchange cannot perform an automatic reseed.

What happens if you create a new DB in DR first and then replicate it to PR? Does this work, and is it stable?

I guess you know these ones, right?
https://technet.microsoft.com/en-us/library/dd298065(v=exchg.150).aspx
https://technet.microsoft.com/en-us/library/dd351172(v=exchg.150).aspx
Sysguys (Author) commented:
Hi Bembi,

Thanks for trying to resolve the issue.
I found one thing: on the DR server there is only one network adapter, while I can see multiple network adapters on the primary servers. Is this the correct setup?
Bembi (CEO) commented:
It depends...
DAG uses its own virtual network for replication. The idea is to separate the user network from the replication network. The user network is the one on which clients connect to Exchange, while the replication network is intended as a high-speed backbone over which the servers communicate with each other. In datacenters or small geo-clusters (two connected datacenters at two locations, e.g. two buildings) you sometimes have such a setup with two networks.

So if you have only one physical network between the datacenters (here, one MPLS line), the DAG network is more of a logical network that runs over the same line as the physical network.

For DAG replication, the DAG network is what matters. You can even have a dedicated replication network between the PR servers while the DR server has no such dedicated network. At the least, you may have different IP addresses or a VLAN on the replication side. This depends a bit on your network architecture.

You have to make sure that the DR server is able to communicate with the DAG network as it is set up in the DAG. That means that from the DR side you should not only be able to resolve all names, you should also be able to reach the DAG replication IPs. So if the DAG network uses dedicated IP addresses or VLANs, you may be missing a route from the DR site...

At least this would partly explain why you can replicate in one direction while you get a communication error in the other direction.
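
To go one step beyond name resolution, a TCP probe against each replication IP shows whether a connection can actually be completed. The following is only a sketch: the function name `tcp_reachable` is hypothetical, and port 64327 is assumed because it is the default DAG replication port; if the port has been changed on your DAG, pass the configured value instead.

```python
import socket

def tcp_reachable(host, port=64327, timeout=5.0):
    """Return True if a TCP connection to host:port completes within `timeout` seconds.

    64327 is the default DAG replication port (an assumption here; use the
    port actually configured on the DAG if it was changed).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable routes and timeouts.
        return False
```

Running this from the DR server against every replication IP of the PR servers, and from PR against the DR address, gives a quick picture of which direction fails. A `False` result only says that the connection did not complete; it does not tell you whether routing, a missing VLAN or a firewall is the cause.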
Sysguys (Author) commented:
This is what I get when I run the DAG network command. HAMDAG is my 2010 DAG and LAMDAG is my 2013 DAG.

[PS] C:\Windows\system32>Get-DatabaseAvailabilityGroupNetwork

Identity                                ReplicationEnabled                      Subnets
--------                                ------------------                      -------
HAMDAG\MAPI                             False                                   {{10.20.1.0/24,Up}}
HAMDAG\Replication                      True                                    {{10.20.80.0/24,Up}, {10.50.1.0/24,Up}}
HAMDAG\Storage                          False                                   {{10.20.50.0/24,Up}}
LAMDAG\MapiDagNetwork                   True                                    {{10.20.36.0/25,Up}, {10.50.1.0/24,Up}}
LAMDAG\ReplicationDagNetwork01          True                                    {{10.20.38.0/26,Up}}
LAMDAG\ReplicationDagNetwork02          True                                    {{10.20.37.0/26,Up}}
LAMDAG\ReplicationDagNetwork03          True                                    {{192.168.20.0/27,Up}}


[PS] C:\Windows\system32>
Sysguys (Author) commented:
Please find more information below. I hope this helps to understand what is going on.

EX2010PR1 and EX2010PR2 are the Exchange 2010 servers in the primary site, and EX2010DR is the Exchange 2010 server in DR.

EX2013PR1 and EX2013PR2 are the Exchange 2013 servers in the primary site, and EX2013DR is the Exchange 2013 server in DR. This is the server that has the issue.

[PS] C:\Windows\system32>Get-DatabaseAvailabilityGroupNetwork | fl


RunspaceId         : 9d65d0b1-7188-4800-834c-29bba0ac86ac
Name               : MAPI
Description        :
Subnets            : {{10.20.1.0/24,Up}}
Interfaces         : {{EX2010PR1,Up,10.20.1.108}, {EX2010PR2,Up,10.20.1.188}}
MapiAccessEnabled  : True
ReplicationEnabled : False
IgnoreNetwork      : False
Identity           : HAMDAG\MAPI
IsValid            : True
ObjectState        : New

RunspaceId         : 9d65d0b1-7188-4800-834c-29bba0ac86ac
Name               : Replication
Description        :
Subnets            : {{10.20.80.0/24,Up}, {10.50.1.0/24,Up}}
Interfaces         : {{EX2010DR,Up,10.50.1.79}, {EX2010PR1,Up,10.20.80.21}, {EX2010PR2,Up,10.20.80.20}}
MapiAccessEnabled  : True
ReplicationEnabled : True
IgnoreNetwork      : False
Identity           : HAMDAG\Replication
IsValid            : True
ObjectState        : New

RunspaceId         : 9d65d0b1-7188-4800-834c-29bba0ac86ac
Name               : Storage
Description        :
Subnets            : {{10.20.50.0/24,Up}}
Interfaces         : {{EX2010PR2,Up,10.20.50.56}}
MapiAccessEnabled  : False
ReplicationEnabled : False
IgnoreNetwork      : False
Identity           : HAMDAG\Storage
IsValid            : True
ObjectState        : New

RunspaceId         : 9d65d0b1-7188-4800-834c-29bba0ac86ac
Name               : MapiDagNetwork
Description        :
Subnets            : {{10.20.36.0/25,Up}, {10.50.1.0/24,Up}}
Interfaces         : {{EX2013DR,Up,10.50.1.82}, {EX2013PR1,Up,10.20.36.30}, {EX2013PR2,Up,10.20.36.31}}
MapiAccessEnabled  : True
ReplicationEnabled : True
IgnoreNetwork      : False
Identity           : LAMDAG\MapiDagNetwork
IsValid            : True
ObjectState        : New

RunspaceId         : 9d65d0b1-7188-4800-834c-29bba0ac86ac
Name               : ReplicationDagNetwork01
Description        :
Subnets            : {{10.20.38.0/26,Up}}
Interfaces         : {{EX2013PR1,Up,10.20.38.30}, {EX2013PR2,Up,10.20.38.31}}
MapiAccessEnabled  : False
ReplicationEnabled : True
IgnoreNetwork      : False
Identity           : LAMDAG\ReplicationDagNetwork01
IsValid            : True
ObjectState        : New

RunspaceId         : 9d65d0b1-7188-4800-834c-29bba0ac86ac
Name               : ReplicationDagNetwork02
Description        :
Subnets            : {{10.20.37.0/26,Up}}
Interfaces         : {{EX2013PR1,Up,10.20.37.30}, {EX2013PR2,Up,10.20.37.31}}
MapiAccessEnabled  : False
ReplicationEnabled : True
IgnoreNetwork      : False
Identity           : LAMDAG\ReplicationDagNetwork02
IsValid            : True
ObjectState        : New
Bembi (CEO) commented:
OK, what I see here is...
You have two replication networks between PR1 and PR2. Why?
Is replication network 3 not used?

I see...
{{EX2013DR,Up,10.50.1.82}, {EX2013PR1,Up,10.20.36.30}, {EX2013PR2,Up,10.20.36.31}}

On 2010 the replication network is used for MAPI too.
On 2013 the MAPI network is used for replication too.

On 2010 you have a dedicated MAPI network, while the 2013 DR server is on the MAPI network.
And 2010 MAPI and 2013 MAPI are on the same subnet.

The major difference is that on the 2010 side you use
a dedicated MAPI network for PR1 and PR2, and
a dedicated replication network, which handles PR1, PR2 and DR replication as well as DR MAPI.
So you put MAPI onto a replication network.

On the 2013 side you use
a MAPI network that carries PR1, PR2 and DR MAPI as well as PR1, PR2 and DR replication, and
two dedicated replication networks for PR1 and PR2.
So you put replication onto a MAPI network.

Or, from the 2010 DR server's perspective, the replication endpoints are in the replication network.
From the 2013 DR server's perspective, the replication endpoints are in the MAPI network.

You may check from the DR network:
ping 10.20.36.30
ping 10.20.36.31

Also try the other replication networks:
ping 10.20.37.30
ping 10.20.38.30
ping 10.20.37.31
ping 10.20.38.31

A ping from PR to 10.50.1.79 should work because of 2010, so 10.50.1.82 should work as well.
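
To make the comparison easier to follow, here is a small Python sketch that tabulates the LAMDAG interfaces from the Format-List output posted above. The helper names are hypothetical. ReplicationDagNetwork03 is left out because its interfaces were not included in that output.

```python
import ipaddress
from collections import defaultdict

# Interfaces per LAMDAG network, copied from the
# Get-DatabaseAvailabilityGroupNetwork | fl output posted in this thread.
LAMDAG_INTERFACES = {
    "MapiDagNetwork": {
        "EX2013DR": "10.50.1.82",
        "EX2013PR1": "10.20.36.30",
        "EX2013PR2": "10.20.36.31",
    },
    "ReplicationDagNetwork01": {"EX2013PR1": "10.20.38.30", "EX2013PR2": "10.20.38.31"},
    "ReplicationDagNetwork02": {"EX2013PR1": "10.20.37.30", "EX2013PR2": "10.20.37.31"},
}

def networks_by_server(interfaces):
    """Return {server: [DAG networks on which that server has an interface]}."""
    members = defaultdict(list)
    for network, nics in interfaces.items():
        for server in nics:
            members[server].append(network)
    return dict(members)

def ping_targets(interfaces, source):
    """Return every address owned by a server other than `source`, in numeric order."""
    targets = {
        ip
        for nics in interfaces.values()
        for server, ip in nics.items()
        if server != source
    }
    return sorted(targets, key=ipaddress.ip_address)
```

`networks_by_server(LAMDAG_INTERFACES)` lists EX2013DR only under MapiDagNetwork, while both PR servers appear on all three networks, and `ping_targets(LAMDAG_INTERFACES, "EX2013DR")` yields the same six addresses as the ping list above.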

Sysguys (Author) commented:
Bembi has a deep understanding of Exchange Server. I really appreciate his technical skill in pinpointing the exact issue.
Sysguys (Author) commented:
Hey Bembi,

As you said, 10.20.38.30 and 10.20.38.31 were not pingable from DR, or even within the primary network. I asked the network engineer, and he said the VLAN had not been created. As soon as he created the VLAN for the 38 subnet, everything started working fine.
Bembi (CEO) commented:
Thank you, happy that it's working now ;)