  • Status: Solved
  • Priority: Medium
  • Security: Public

Exchange 2010 Stretched DAG Copy Fails

We've got Exchange 2010 SP3 in a 2-node active/active DAG stretched across 2 locations (350 miles apart, connected by a 100 Mbps link). Every time I add a database copy, on any of our 20 databases, it fails with either "a log file missing on the active copy" or a "source-side 0xfffffae7" error.

If I run Update-MailboxDatabaseCopy -DeleteExistingFiles it will work, but only directly after a backup has run...if I try to seed a copy the next day, it fails.
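For reference, the reseed described above can be sketched roughly as follows. This is only an illustration: the copy identity `LAFAYETTE\HOMBX1` is borrowed from the replication health output later in this thread, and the suspend step and the status check are assumptions about a typical sequence, not something the poster said they ran.

```powershell
# Run from the Exchange Management Shell.
# Copy identity is <DatabaseName>\<ServerHostingTheCopy>; this value is taken
# from the Test-ReplicationHealth output in the thread and is illustrative only.
$copy = 'LAFAYETTE\HOMBX1'

# Assumption: suspend the failed copy first so the seed starts from a known state.
Suspend-MailboxDatabaseCopy -Identity $copy -Confirm:$false

# Reseed, removing whatever database and log files already exist at the target.
Update-MailboxDatabaseCopy -Identity $copy -DeleteExistingFiles

# Check the result; a working copy should report Healthy with short queues.
Get-MailboxDatabaseCopyStatus -Identity $copy |
    Format-List Name, Status, CopyQueueLength, ReplayQueueLength, ContentIndexState
```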

Any clues as to why my window of opportunity is right after a backup?
Thanks!
Asked by: exchangeedg
 
Amit (IT Architect) commented:
Can you run the commands below and post the results?

Test-ServiceHealth
Test-ReplicationHealth
 
exchangeedg (Author) commented:
Test-ServiceHealth:

Role                    : Mailbox Server Role
RequiredServicesRunning : True
ServicesRunning         : {IISAdmin, MSExchangeADTopology, MSExchangeIS, MSExchangeMailboxAssistants, MSExchangeMailSubmission, MSExchangeRepl, MSExchangeRPC, MSExchangeSA, MSExchangeSearch, MSExchangeServiceHost, MSExchangeThrottling, MSExchangeTransportLogSearch, W3Svc, WinRM}
ServicesNotRunning      : {}

Test-ReplicationHealth:

Server  Check                 Result  Error
------  -----                 ------  -----
HOMBX1  ClusterService        Passed
HOMBX1  ReplayService         Passed
HOMBX1  ActiveManager         Passed
HOMBX1  TasksRpcListener      Passed
HOMBX1  TcpListener           Passed
HOMBX1  ServerLocatorService  Passed
HOMBX1  DagMembersUp          Passed
HOMBX1  ClusterNetwork        Passed
HOMBX1  QuorumGroup           Passed
HOMBX1  FileShareQuorum       Passed
HOMBX1  DBCopySuspended       Passed
HOMBX1  DBCopy                Failed  Error : Continuous Replication for database 'LAFAYETTE\HOMBX1' is in a 'Failed' state on machine 'HOMBX1'. The specific message is: The required log file 30798 for LAFAYETTE\HOMBX1 is missing on the active copy. If you removed the log file, please replace it. If the log file is lost, the database copy will need to be reseeded using Update-MailboxDatabaseCopy.
HOMBX1  DBInitializing        Passed
HOMBX1  DBDisconnected        Passed
HOMBX1  DBLogCopyKeepingUp    Passed
HOMBX1  DBLogReplayKeepingUp  Passed
 
Amit (IT Architect) commented:
Have you excluded the Exchange binaries, folders, and logs from AV scanning? If not, please do so first; AV products are known to delete log files. Also check the AV logs to see whether any log files were deleted.
 
exchangeedg (Author) commented:
We don't have A/V on any of the mail servers.
 
Amit (IT Architect) commented:
Are you sure you don't have any antivirus? If so, you are running at risk of infection, and it is possible that your server is already infected. Install AV as soon as possible and scan your servers.
 
exchangeedg (Author) commented:
We have the Barracuda A/V Agent running rather than a file-level scanner, since there's little reason for file-level scanning on Exchange. Anyhow, I highly doubt an infection would cause the DAG copy issues we are having.

Note: I have tried with the 'cuda agent disabled as well.
 
Amit (IT Architect) commented:
OK, run the Cluster Validation Wizard from the cluster snap-in and check whether any errors show up.
 
exchangeedg (Author) commented:
This is the only warning I see of note:

-Nodes are not consistently configured with IPv4 and/or IPv6 addresses on network adapters that are usable by the cluster.

-Node HOMBX1.edg.net is configured with IP addresses from protocol IPv4 only.

-Node NOMBX1.edg.net is configured with IP addresses from protocol IPv4 and IPv6.
 
Amit (IT Architect) commented:
The error indicates that a log file is missing, so a reseed is the only option to fix it. Next, are you backing up the passive DB copy?
 
Amit (IT Architect) commented:
Also check if VSS writers are stable.
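A quick way to check the writers, as a rough sketch: `vssadmin list writers` is a standard Windows command, while the filtering below is just one assumed way to trim its output. Run it in an elevated prompt on each mailbox server.

```powershell
# List every VSS writer with its state and last error.
# A stable writer normally shows "State: [1] Stable" and "Last error: No error".
vssadmin list writers |
    Select-String -Pattern 'Writer name', 'State', 'Last error'
```

Any writer that is not stable right after a backup run would be worth raising with the backup vendor.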
 
exchangeedg (Author) commented:
Reseed only works directly after the backup has run.
We are backing up the passive copy, using Commvault, with no errors.
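To line up seeding attempts against backup runs, the backup timestamps Exchange records can be listed next to the copy status. This is a minimal sketch; the server name HOMBX1 comes from the output earlier in the thread, and the chosen columns are only a suggestion.

```powershell
# Last backup times as recorded on each database (-Status is required,
# otherwise the backup fields come back empty).
Get-MailboxDatabase -Status |
    Sort-Object Name |
    Format-Table Name, LastFullBackup, LastIncrementalBackup -AutoSize

# Copy state and queue lengths for every copy hosted on one DAG member.
Get-MailboxDatabaseCopyStatus -Server HOMBX1 |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength, LatestFullBackupTime -AutoSize
```

Comparing these times with when a seed succeeded or failed may show whether the working window really tracks the backup job.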
 
Amit (IT Architect) commented:
In that case, I would disable the backup for a day and see whether the same issue happens again. If it doesn't, focus on the backup software; it would be best to open a case with the backup vendor.
 
exchangeedg (Author) commented:
Alright, I'll try that, thanks.
 
exchangeedg (Author) commented:
The copy on that database from yesterday completed, but all the others I've tried today failed with the same error.
I've opened a case with MS PSS...we're based in New Orleans and kinda need to get the DAG healthy before the storms start rolling in.
 
exchangeedg (Author) commented:
PSS had me dismount each DB, move the logs to a different folder, then remount. After that they all seeded correctly.
Not sure why all the DBs had log file corruption, but it's fixed now.
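For anyone landing here later, the dismount / move logs / remount sequence might look roughly like the sketch below. It is not the exact procedure PSS used, which isn't recorded in this thread. Assumptions: it runs locally on the server holding the active copy, the database is named LAFAYETTE (taken from the earlier output), `D:\LogHold\LAFAYETTE` is a hypothetical holding folder, and the checkpoint file is moved along with the logs (the thread only mentions "the logs"). The logs should only be moved if the database header reports a clean shutdown.

```powershell
# Sketch only: run on the server hosting the ACTIVE copy, one database at a time.
$db      = Get-MailboxDatabase -Identity 'LAFAYETTE'    # database name from this thread
$edb     = [string]$db.EdbFilePath
$logDir  = [string]$db.LogFolderPath
$prefix  = $db.LogFilePrefix                            # e.g. E00
$holding = 'D:\LogHold\LAFAYETTE'                       # hypothetical holding folder

Dismount-Database -Identity $db.Name -Confirm:$false

# Confirm the header says "State: Clean Shutdown" before touching any logs.
# If it says Dirty Shutdown, stop here: the logs are still required.
eseutil /mh $edb | Select-String 'State:'

# Move (do not delete) the log stream so it can be restored if needed.
# Moving the .chk file too is an assumption, not something stated in the thread.
New-Item -ItemType Directory -Path $holding -Force | Out-Null
Move-Item -Path (Join-Path $logDir "${prefix}*.log"), (Join-Path $logDir "${prefix}*.chk") -Destination $holding

# Remounting starts a fresh log stream; passive copies then need a reseed.
Mount-Database -Identity $db.Name
```

Keeping the old logs in a holding folder rather than deleting them leaves a way back if the remount does not go as expected.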
 
exchangeedg (Author) commented:
Had to contact MS support to rectify the situation.