Sysguys asked:

Mailbox database shows Disconnected and Resynchronizing after activating the database in DR (DAG member, Exchange Server 2013)

Dear All,

I have a strange issue. We have three Exchange 2013 mailbox servers, two in PR and one in DR. To test database health, I activated one of the mailbox databases on the DR server to check that everything works. I was able to activate the database in DR successfully and also connect to a mailbox in that database, but when I tried to activate the database back in the primary site, I could not do so and got the error below.

The seeding operation failed. Error: An error occurred while performing the seed operation. Error: A timeout occurred while communicating with server 'DRMBX'. Error: "A connection could not be completed within 15 seconds." [Database: DR-Drill-Testing, Server: PRMBX.lamprell.com]

I thought this might be happening because I activated the database when its status was not healthy, so I created a new database with no mailboxes. When I tried the same thing, activating it in DR and then again in PR, I still got the same error.

I can see event ID 1111 logged, which says:

Automatic Reseed Manager failed to execute repair workflow 'FailedSuspendedCopyAutoReseed' for database 'DR-Drill-Testing'. Error: The Automatic Reseed Manager encountered an error: The automatic repair operation for database copy 'DR-Drill-Testing\PRMBX' will not be run because one of the prerequisite checks failed. Error: There are no Exchange volumes mounted under root path 'C:\ExchangeVolumes', which are required for the Automatic Reseed component.

Please help me with this.
Will Szymkowski

Have you tried removing the database copy completely (database and logs) and then doing a complete reseed of the database? That should resolve your issue.

Will.
Sysguys

ASKER

Thanks for the reply. I tried that, but it does not work. In fact, I created a new mailbox database without any mailboxes, and the issue is still the same.
Bembi

Just a quick shot...
How are the servers connected (speed)?
Have you checked DNS, i.e. whether you can resolve both servers from both sides by all names (FQDN, NetBIOS name, IP)?
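
As a rough illustration of the DNS check suggested above, here is a minimal Python sketch that resolves a list of names and reverse-resolves addresses. The function names are made up for this example, and it only tests what the local resolver returns; it says nothing about how Exchange itself looks names up.

```python
import socket


def resolve(name):
    """Return the sorted addresses a name resolves to, or [] if resolution fails."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


def reverse(address):
    """Return the host name registered for an address, or None if there is none."""
    try:
        return socket.gethostbyaddr(address)[0]
    except OSError:
        return None


def check_names(names):
    """Resolve every name (short name, FQDN or IP) and return {name: [addresses]}."""
    return {name: resolve(name) for name in names}
```

Run on each server in turn, something like `check_names(["DRMBX", "PRMBX.lamprell.com"])` would show at a glance which name forms fail (an empty list) from which side.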
Sysguys

ASKER

Hi Bembi,

We are connected to the DR site through an MPLS link. I confirmed with nslookup that name resolution works fine from both ends.

We have a lot of application replication between PR and DR. I can see all the logs get replicated to DR properly, but the link is a little slow; there is always a lag, but replication completes every day.
Bembi

I interpret DR and PR as two different sites / locations, right?
And they are connected, as you said, by an MPLS line?
Again the question: what is the bandwidth and the latency of this connection?

Can all servers connect to a local AD domain controller with a global catalog?
Can all the servers resolve and connect to the witness server?

At least from the error message, it looks like a connection issue somewhere.
Sysguys

ASKER

Yes, DR (disaster recovery site) and PR (primary site) are two different sites in different subnets, connected through an MPLS link.

We have a local DC and GC in DR as well. We have an odd number of servers, that is, two mailbox servers in PR and one in DR, and we do manual datacenter switchovers. Do I still need a file share witness for this?
We have 11 mailbox databases; only one is activated in DR, and all the rest are in PR. The databases active in PR are replicating to DR without any issues; only the database from DR to PR is not replicating.
Bembi

Again the question: how fast is the MPLS link, and what is its latency?

Yes, with an odd number of members, the witness is not needed for quorum.
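
The arithmetic behind that answer can be sketched in a few lines of Python. This is a simplified majority-vote model, not Exchange or cluster code: it assumes every member has one vote and that the witness contributes a vote only when the member count is even. The function names are invented for the example.

```python
def total_votes(members):
    """Votes in the model: one per member, plus the witness when the count is even."""
    return members if members % 2 == 1 else members + 1


def votes_needed(votes):
    """A majority is more than half of all votes."""
    return votes // 2 + 1


def has_quorum(members, members_up, witness_up=True):
    """True if the reachable voters still form a majority."""
    votes = members_up
    if members % 2 == 0 and witness_up:
        votes += 1
    return votes >= votes_needed(total_votes(members))
```

For the three-member DAG in this thread, `has_quorum(3, 2)` is true and `has_quorum(3, 1)` is false, so under this model the single DR member cannot hold quorum on its own if the primary site is lost. That is consistent with the manual datacenter switchover the asker describes.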

I'm asking about the bandwidth because we are talking about databases. They are transaction-based, which makes it less critical, but you still move a lot of data around. For a test database that is fine; for a large number of mailboxes it may become a critical point. A very common setup is that replication between the DAG servers has its own backbone and does not use the client network. The DAG configuration itself is designed this way.

If seeding fails, it can also have something to do with backup. As the transaction logs are replicated, it may happen that backup software runs a backup that forces the instance to delete the transaction logs. That means the transaction logs are gone on one node while another node still expects them. If this is the case, check your backup strategy.

To get the databases back, you can shut them down, check them with ESEUTIL /MH for a clean shutdown, and if that is the case, delete the logs and restart the databases.

For Ex2013 on W2012, you should also check this article to verify the correct procedure.
https://technet.microsoft.com/en-us/library/ff367878(v=exchg.150).aspx
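
To illustrate the point about missing transaction logs, here is a small Python sketch that looks for gaps in a log sequence from a list of file names. It assumes the usual naming of a three-character prefix such as E00 followed by eight hexadecimal digits; the function name is invented for this example, and a gap found this way is only a hint, not a diagnosis.

```python
import re

# Assumed naming: prefix "E" plus two characters, then eight hex digits, e.g. E000000000A.log.
# The current log (E00.log) and checkpoint files do not match and are ignored.
LOG_NAME = re.compile(r"^E[0-9A-F]{2}([0-9A-F]{8})\.log$", re.IGNORECASE)


def missing_generations(file_names):
    """Return the generation numbers missing between the lowest and highest log seen."""
    present = set()
    for name in file_names:
        match = LOG_NAME.match(name)
        if match:
            present.add(int(match.group(1), 16))
    if not present:
        return []
    return [g for g in range(min(present), max(present) + 1) if g not in present]
```

Feeding it the output of a directory listing from the log folder of each copy would show whether one node is missing generations that another node still holds.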
Sysguys

ASKER

Bembi, thanks for the reply.

I have some more info. We still have our Exchange 2010 in coexistence mode; although we have migrated all the mailboxes to 2013, the 2010 servers are not decommissioned yet.
For testing, I created a mailbox database on Exchange 2010 in PR and then tried activating it on the DR 2010 server. It works smoothly without any issues. We have a 100 Mbps link, so this confirms that the network is not the issue.
But there is one difference: the 2010 server in DR is a physical server, and the 2013 server in DR is a VM on Hyper-V.
So I am wondering whether it has something to do with Hyper-V.
At the same time, I have a 1.2 TB database in PR which is replicating fine to DR; only from DR am I unable to do so.
Bembi

So, if 2010 works, you may want to look at its configuration and compare whether something is different.
But in general, 2010 should not really be involved, because the contact point should be 2013.

Physical or virtual should not be the issue either.

What we see first is the connection timeout. This may point (in general) to a network configuration problem, name resolution, or even AD connectivity (the configuration is stored there). But it can also be connected with the second error.

What we see in the second error:
There are no Exchange volumes mounted under root path 'C:\ExchangeVolumes', which are required for the Automatic Reseed component.
The first error states: while communicating with server 'DRMBX'
So the DR cannot communicate with itself...
Sysguys

ASKER

I mean my 2010 DAG and my 2013 DAG are different. Just for testing, I created a database in the 2010 DAG to check for the same issue; it works fine in 2010.
Sysguys

ASKER

My Exchange volumes are placed on the E:\ drive.
Bembi

You should check
Get-DatabaseAvailabilityGroup DAG1 | Format-List *auto*
to see the configuration of your DAG and whether you find 'C:\ExchangeVolumes' there. It is the default for AutoReseed, so if you don't have a reseed volume, Exchange cannot perform an automatic reseed.

What happens if you create a new DB in DR first and then replicate it to PR? Does this work, and is it stable?

I guess you know these ones, right?
https://technet.microsoft.com/en-us/library/dd298065(v=exchg.150).aspx
https://technet.microsoft.com/en-us/library/dd351172(v=exchg.150).aspx
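
The AutoReseed message in event 1111 complains that no volumes are mounted under the root path. As a rough way to see what is actually there, this Python sketch lists the sub-folders of a root folder that are volume mount points. It is not the check Exchange performs, only an approximation, and the function name is invented for this example.

```python
import os


def mounted_volumes(root):
    """Return the sub-folders of root that are volume mount points, sorted by path."""
    if not os.path.isdir(root):
        return []
    found = []
    for entry in os.scandir(root):
        # os.path.ismount also recognises volumes mounted into a folder on Windows.
        if os.path.isdir(entry.path) and os.path.ismount(entry.path):
            found.append(entry.path)
    return sorted(found)
```

On the affected server, `mounted_volumes(r"C:\ExchangeVolumes")` returning an empty list would match the event text. With the databases on E:\ as in this thread, that result is expected and only means AutoReseed has nothing to work with.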
Sysguys

ASKER

Hi Bembi,

Thanks for trying to resolve the issue.
I found one thing: the DR server has only one network adapter, while the primary servers have multiple network adapters. Is this the correct setup?
Bembi

It depends...
A DAG uses its own virtual network for replication. The idea is to separate the user network from the replication network. The user network is the one on which clients connect to Exchange, while the replication network is intended as a high-speed backbone on which the servers communicate with each other. In datacenters or small geo-clusters (two connected datacenters at two locations, e.g. two buildings) you sometimes have such a setup with two networks.

So if you have only one physical network between the datacenters (here, one MPLS line), the DAG network is more of a logical network that runs over the same line as the physical network.

For DAG replication, the DAG network is what matters. You can even have a dedicated replication network between the PR servers while the DR server has no such dedicated network. At the least, you may have different IP addresses or a VLAN on the replication side. This depends a bit on your network architecture.

You have to make sure that the DR server is able to communicate with the DAG network as it is set up in the DAG. That means that from the DR side you should not only be able to resolve all names, you should also be able to communicate with the DAG replication IPs. So if the DAG network uses dedicated IP addresses or VLANs, you may be missing a route from the DR site...

At least this would partly explain why you can replicate in one direction while you get a communication error in the other direction.
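
One way to test the reachability described above is a plain TCP connection attempt with the same 15-second limit that appears in the seeding error. The Python sketch below assumes the DAG still uses the default replication port, TCP 64327; if the port was changed on the DAG, pass the configured value instead. The function name is invented for this example.

```python
import socket

# Default DAG replication port; an assumption that it was never changed on this DAG.
DAG_REPLICATION_PORT = 64327


def can_connect(host, port=DAG_REPLICATION_PORT, timeout=15.0):
    """Return True if a TCP connection to host:port completes within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable networks alike.
        return False
```

Calling `can_connect("<replication IP>")` from the DR server for each replication address of the PR servers, and then the same in the opposite direction, shows whether the path is open in both directions or only one.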
Sysguys

ASKER

This is what I get when I run the DAG network command. HAMDAG is my 2010 DAG and LAMDAG is my 2013 DAG.

[PS] C:\Windows\system32>Get-DatabaseAvailabilityGroupNetwork

Identity                                ReplicationEnabled                      Subnets
--------                                ------------------                      -------
HAMDAG\MAPI                             False                                   {{10.20.1.0/24,Up}}
HAMDAG\Replication                      True                                    {{10.20.80.0/24,Up}, {10.50.1.0/24,Up}}
HAMDAG\Storage                          False                                   {{10.20.50.0/24,Up}}
LAMDAG\MapiDagNetwork                   True                                    {{10.20.36.0/25,Up}, {10.50.1.0/24,Up}}
LAMDAG\ReplicationDagNetwork01          True                                    {{10.20.38.0/26,Up}}
LAMDAG\ReplicationDagNetwork02          True                                    {{10.20.37.0/26,Up}}
LAMDAG\ReplicationDagNetwork03          True                                    {{192.168.20.0/27,Up}}


[PS] C:\Windows\system32>
Sysguys

ASKER

Please find more info below; I hope this helps to understand what is going on.

EX2010PR1 and EX2010PR2 are the Exchange 2010 servers in the primary site, and EX2010DR is the Exchange 2010 server in DR.

EX2013PR1 and EX2013PR2 are the Exchange 2013 servers in the primary site, and EX2013DR is the Exchange 2013 server in DR. This is the server that has the issue.

[PS] C:\Windows\system32>Get-DatabaseAvailabilityGroupNetwork | fl


RunspaceId         : 9d65d0b1-7188-4800-834c-29bba0ac86ac
Name               : MAPI
Description        :
Subnets            : {{10.20.1.0/24,Up}}
Interfaces         : {{EX2010PR1,Up,10.20.1.108}, {EX2010PR2,Up,10.20.1.188}}
MapiAccessEnabled  : True
ReplicationEnabled : False
IgnoreNetwork      : False
Identity           : HAMDAG\MAPI
IsValid            : True
ObjectState        : New

RunspaceId         : 9d65d0b1-7188-4800-834c-29bba0ac86ac
Name               : Replication
Description        :
Subnets            : {{10.20.80.0/24,Up}, {10.50.1.0/24,Up}}
Interfaces         : {{EX2010DR,Up,10.50.1.79}, {EX2010PR1,Up,10.20.80.21}, {EX2010PR2,Up,10.20.80.20}}
MapiAccessEnabled  : True
ReplicationEnabled : True
IgnoreNetwork      : False
Identity           : HAMDAG\Replication
IsValid            : True
ObjectState        : New

RunspaceId         : 9d65d0b1-7188-4800-834c-29bba0ac86ac
Name               : Storage
Description        :
Subnets            : {{10.20.50.0/24,Up}}
Interfaces         : {{EX2010PR2,Up,10.20.50.56}}
MapiAccessEnabled  : False
ReplicationEnabled : False
IgnoreNetwork      : False
Identity           : HAMDAG\Storage
IsValid            : True
ObjectState        : New

RunspaceId         : 9d65d0b1-7188-4800-834c-29bba0ac86ac
Name               : MapiDagNetwork
Description        :
Subnets            : {{10.20.36.0/25,Up}, {10.50.1.0/24,Up}}
Interfaces         : {{EX2013DR,Up,10.50.1.82}, {EX2013PR1,Up,10.20.36.30}, {EX2013PR2,Up,10.20.36.31}}
MapiAccessEnabled  : True
ReplicationEnabled : True
IgnoreNetwork      : False
Identity           : LAMDAG\MapiDagNetwork
IsValid            : True
ObjectState        : New

RunspaceId         : 9d65d0b1-7188-4800-834c-29bba0ac86ac
Name               : ReplicationDagNetwork01
Description        :
Subnets            : {{10.20.38.0/26,Up}}
Interfaces         : {{EX2013PR1,Up,10.20.38.30}, {EX2013PR2,Up,10.20.38.31}}
MapiAccessEnabled  : False
ReplicationEnabled : True
IgnoreNetwork      : False
Identity           : LAMDAG\ReplicationDagNetwork01
IsValid            : True
ObjectState        : New

RunspaceId         : 9d65d0b1-7188-4800-834c-29bba0ac86ac
Name               : ReplicationDagNetwork02
Description        :
Subnets            : {{10.20.37.0/26,Up}}
Interfaces         : {{EX2013PR1,Up,10.20.37.30}, {EX2013PR2,Up,10.20.37.31}}
MapiAccessEnabled  : False
ReplicationEnabled : True
IgnoreNetwork      : False
Identity           : LAMDAG\ReplicationDagNetwork02
IsValid            : True
ObjectState        : New
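
The listing above can be read mechanically. In this Python sketch the LAMDAG entries are transcribed by hand from the Format-List output (ReplicationDagNetwork03 is left out because its details are not shown), and the helper names are invented for the example. It reports which DAG networks have no interface on a given server, and therefore which addresses that server can only reach if routing or a VLAN for them exists.

```python
import ipaddress

# Transcribed from the LAMDAG part of the output above.
LAMDAG_NETWORKS = {
    "MapiDagNetwork": {
        "subnets": ["10.20.36.0/25", "10.50.1.0/24"],
        "interfaces": {"EX2013DR": "10.50.1.82", "EX2013PR1": "10.20.36.30", "EX2013PR2": "10.20.36.31"},
    },
    "ReplicationDagNetwork01": {
        "subnets": ["10.20.38.0/26"],
        "interfaces": {"EX2013PR1": "10.20.38.30", "EX2013PR2": "10.20.38.31"},
    },
    "ReplicationDagNetwork02": {
        "subnets": ["10.20.37.0/26"],
        "interfaces": {"EX2013PR1": "10.20.37.30", "EX2013PR2": "10.20.37.31"},
    },
}


def networks_without(server, networks):
    """Names of the DAG networks on which the server has no interface."""
    return sorted(name for name, net in networks.items() if server not in net["interfaces"])


def routed_only_addresses(server, networks):
    """Interface addresses on networks the server is not part of, in numeric order."""
    addresses = []
    for name in networks_without(server, networks):
        addresses.extend(networks[name]["interfaces"].values())
    return sorted(addresses, key=ipaddress.ip_address)


def stray_interfaces(networks):
    """Interfaces whose address lies outside every subnet of their own network."""
    stray = []
    for name, net in networks.items():
        subnets = [ipaddress.ip_network(s) for s in net["subnets"]]
        for server, address in net["interfaces"].items():
            if not any(ipaddress.ip_address(address) in subnet for subnet in subnets):
                stray.append((name, server, address))
    return stray
```

For EX2013DR this gives ReplicationDagNetwork01 and ReplicationDagNetwork02, with 10.20.37.30, 10.20.37.31, 10.20.38.30 and 10.20.38.31 as the addresses the DR server would need a working route to.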
ASKER CERTIFIED SOLUTION
Bembi
Sysguys

ASKER

Bembi has a deep understanding of Exchange Server. I really appreciate his technical skill in pinpointing the exact issue.
Sysguys

ASKER

Hey Bembi,

As you said, 10.20.38.30 and 10.20.38.31 could not be pinged from DR, or even within the primary network. I asked the network guy, and he said the VLAN had not been created. As soon as he created the VLAN for the 38 subnet, everything started working fine.
Bembi

Thank you, happy that it's working now ;)