Solved

SCR not replaying logs -- ReplayLogQueueLength keeps growing

Posted on 2008-10-23
Last Modified: 2012-06-27
Having an issue setting up SCR between a NY server and a UK server. Logs are sent from NY to UK just fine, but they don't get replayed into the database. I set this up about a week ago and the number of transaction logs keeps growing and growing. Not sure which is the chicken and which is the egg, but when I do a get-storagegroupcopystatus -standbymachine, the LatestFullBackupTime field is blank (even though the backups are run regularly and logs are flushed on the source as expected). I've tried several things including:

* disabling and re-enabling SCR
* changing the replaylagtime from the default to 30 seconds
* restarting the replication service
* installing the latest rollup (SP1 Rollup 4) on both machines
* rebooting each machine

Nothing seems to have any effect. The Event Log doesn't seem to be giving me any information either. Below is the output of the get-storagegroupcopystatus and test-replicationhealth commands. Any help would be greatly appreciated:

get-storagegroupcopystatus -identity "svr01exch\first storage group" -standbymachine svr02exch

Identity                         : SVR01EXCH\First Storage Group
StorageGroupName                 : First Storage Group
SummaryCopyStatus                : Healthy
NotSupported                     : False
NotConfigured                    : False
Disabled                         : False
ServiceDown                      : False
Failed                           : False
Initializing                     : False
Resynchronizing                  : False
Seeding                          : False
Suspend                          : False
CCRTargetNode                    :
FailedMessage                    :
SuspendComment                   :
CopyQueueLength                  : 3
ReplayQueueLength                : 2753
LatestAvailableLogTime           : 10/23/2008 6:38:33 PM
LastCopyNotificationedLogTime    : 10/23/2008 6:38:33 PM
LastCopiedLogTime                : 10/23/2008 6:37:21 PM
LastInspectedLogTime             : 10/23/2008 6:37:21 PM
LastReplayedLogTime              : 10/18/2008 3:55:34 PM
LastLogGenerated                 : 693557
LastLogCopyNotified              : 693555
LastLogCopied                    : 693554
LastLogInspected                 : 693554
LastLogReplayed                  : 690801
LatestFullBackupTime             :
LatestIncrementalBackupTime      :
LatestDifferentialBackupTime     :
LatestCopyBackupTime             :
SnapshotBackup                   :
SnapshotLatestFullBackup         :
SnapshotLatestIncrementalBackup  :
SnapshotLatestDifferentialBackup :
SnapshotLatestCopyBackup         :
OutstandingDumpsterRequests      : {}
DumpsterServersNotAvailable      :
DumpsterStatistics               :
IsValid                          : True
ObjectState                      : Unchanged
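
As a quick sanity check on the output above, the two queue lengths can be reproduced from the log generation counters. This is a minimal Python sketch; the subtraction rule is only an observation that happens to fit these numbers, not a documented definition, and the helper names are made up for illustration.

```python
# Counters copied from the Get-StorageGroupCopyStatus output above.
status = {
    "LastLogGenerated": 693557,
    "LastLogCopied": 693554,
    "LastLogInspected": 693554,
    "LastLogReplayed": 690801,
}


def copy_backlog(s):
    """Logs generated on the source but not yet copied to the target."""
    return s["LastLogGenerated"] - s["LastLogCopied"]


def replay_backlog(s):
    """Logs inspected on the target but not yet replayed into the copy."""
    return s["LastLogInspected"] - s["LastLogReplayed"]


print(copy_backlog(status))    # 3, same as CopyQueueLength
print(replay_backlog(status))  # 2753, same as ReplayQueueLength
```

Read this way, copying is keeping up (3 logs behind) while replay stopped at generation 690801, which fits the symptom described in the question.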

test-replicationhealth
Server          Check                      Result     Error
------          -----                      ------     -----
SVR02EXCH       ReplayService              Passed
SVR02EXCH       SGCopySuspended            Passed
SVR02EXCH       SGCopyFailed               Passed
SVR02EXCH       SGInitializing             Passed
SVR02EXCH       SGCopyQueueLength          Passed
SVR02EXCH       SGReplayQueueLength        Passed
SVR02EXCH       SGStandbyReplayLag         *FAILED*   Standby Continuous Replic
                                                      ation for storage group '
                                                      SVR01EXCH\First Storage G
                                                      roup' on computer 'SVR02E
                                                      XCH' has configured repla
                                                      y lag time of 00:00:30 an
                                                      d a replay queue length o
                                                      f 2748. However, log repl
                                                      ay is actually behind by
                                                      a period of 733337.18:23:
                                                      20.4279554 which is great
                                                      er by more than the Failu
                                                      re window of 01:00:00. Ei
                                                      ther log replay is slow o
                                                      r is not making progress.
                                                       Please investigate furth
                                                      er.
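
The reported lag of 733337.18:23:20 is roughly 2,008 years, far more than a week of unreplayed logs could explain. One possible reading, offered here as an assumption and not confirmed anywhere in this thread, is that the check measured the lag against an unset timestamp (midnight on 1 January of year 1). The Python sketch below only tests whether the numbers fit that reading.

```python
from datetime import datetime, timedelta

# Lag reported by Test-ReplicationHealth in d.hh:mm:ss form (fraction dropped).
reported_lag = timedelta(days=733337, hours=18, minutes=23, seconds=20)

# Assumption: the lag was measured from a "zero" timestamp, 0001-01-01 00:00:00.
zero_date = datetime(1, 1, 1)

# If the assumption holds, this should land on the moment the test was run.
implied_now = zero_date + reported_lag
print(implied_now)  # 2008-10-23 18:23:20
```

The result falls on the day the question was posted, so the huge figure may say more about a missing timestamp than about the real replay delay.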

Question by:dinkum305
24 Comments
 
LVL 33

Expert Comment

by:Exchange_Geek
ID: 22790230
Waiting for your reply on the other forum
 

Author Comment

by:dinkum305
ID: 22790249
Regarding the size of the database: This database is 80GB. I have another 80GB database I want to replicate as well after I get past this issue.

Thanks
 
LVL 33

Expert Comment

by:Exchange_Geek
ID: 22790350
OK, to get past this issue, set the ReplayLagTime parameter to 15-30 minutes and the TruncationLagTime parameter to 60 minutes (this matches the failure window).

If it does not work - you may want to suspend and resume storage group copy. Restarting the Replication service might not be a good idea at the moment.
 
LVL 32

Expert Comment

by:gupnit
ID: 22790493
Hi,
Well, I would go ahead and seed the target. A half-updated DB on the target is not a good idea, and keeping it like that for long is even worse. http://technet.microsoft.com/en-us/library/bb738131(EXCHG.80).aspx
Else try this.....
  • Disable-StorageGroupCopy -StandbyMachine YOURSERVER
  • When you are prompted to delete files on the target, ignore that and do not delete them.
  • Enable SCR again .... Enable-StorageGroupCopy
  • See if that initiates the replay
Cheers
Nitin
 

Author Comment

by:dinkum305
ID: 22790748
Thanks both,

I just entered the following commands:

Disable-StorageGroupCopy -identity "svr01exch\first storage group" -StandbyMachine svr02exch

Enable-StorageGroupCopy -identity "svr01exch\first storage group" -StandbyMachine svr02exch -ReplayLagTime 0.0:15:0 -TruncationLagTime 0.1:0:0

Will see if that helps. Is there anything I should do to prime it to replay (such as running a backup on the source) or should it start playing the logs into the target db on its own?

Thanks again
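
For anyone unsure how to read the lag values in the command above, they are written as days.hours:minutes:seconds. The Python sketch below is a hypothetical helper (not part of any Exchange tooling) that handles only this exact form and shows what the two values work out to.

```python
from datetime import timedelta


def parse_lag(text):
    """Parse a d.hh:mm:ss lag value such as "0.0:15:0" into a timedelta."""
    days, _, clock = text.partition(".")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return timedelta(days=int(days), hours=hours, minutes=minutes, seconds=seconds)


replay_lag = parse_lag("0.0:15:0")      # value passed to -ReplayLagTime
truncation_lag = parse_lag("0.1:0:0")   # value passed to -TruncationLagTime

print(replay_lag)      # 0:15:00
print(truncation_lag)  # 1:00:00
```

So the command asks for a 15-minute replay lag and a 1-hour truncation lag, in line with the values suggested earlier in the thread.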
 
LVL 33

Expert Comment

by:Exchange_Geek
ID: 22790843
Personally I would recommend re-seeding; however, I wanted to keep that as the fallback in case my plan A fails. Now that I have mentioned it: if plan A fails, simply re-seed, but keep the same lag-time settings when you work with the blank DB on the target.

There isn't anything that needs to be done on source server. Relax.
 

Author Comment

by:dinkum305
ID: 22790898
get-storagegroupcopystatus now yields:

Identity                         : CIF01EXCH\First Storage Group
StorageGroupName                 : First Storage Group
SummaryCopyStatus                : Failed
[snip]
FailedMessage                    : Log file action LogCopy failed for storage g
                                   roup CIF01EXCH\First Storage Group. Reason:
                                   The handle is invalid.
[snip]
CopyQueueLength                  : 0
ReplayQueueLength                : 0
[snip]
LastLogGenerated                 : 693709
LastLogCopyNotified              : 0
LastLogCopied                    : 0
LastLogInspected                 : 0
LastLogReplayed                  : 0
[snip]
IsValid                          : True
ObjectState                      : Unchanged

Any other ideas or am I forced to reseed? (Not an easy task for 80 GB over a T1.)
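
To put a rough number on the reseed, here is a back-of-the-envelope Python sketch. It assumes the nominal T1 rate of 1.544 Mbit/s and decimal gigabytes; the `efficiency` parameter and the 0.7 value are purely illustrative guesses for protocol overhead and competing traffic, not measurements from this environment.

```python
def transfer_days(size_gb, link_mbps, efficiency=1.0):
    """Estimate days needed to push size_gb over a link of link_mbps."""
    bits = size_gb * 1e9 * 8                       # decimal GB to bits
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / 86400                         # seconds per day


# Best case: the whole T1 is available and there is no overhead.
print(round(transfer_days(80, 1.544), 1))       # 4.8

# Illustrative case: only 70% of the line rate is usable (assumed figure).
print(round(transfer_days(80, 1.544, 0.7), 1))  # 6.9
```

Even the best case is close to five days, so seeding over the wire is a multi-day job whichever way it is counted.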
 
LVL 33

Expert Comment

by:Exchange_Geek
ID: 22791210
If it's possible, please dismount the stores on the source and copy all the log files from the source storage group to a different directory. This ensures we do not have any log files except our sweet little 80 GB database file.

Now remount your stores and run the same command again.

If a dismount and remount is not possible, try removing the logs except for the last 20-odd files.

Then run the same command again.

Otherwise, if it doesn't work, a re-seed is the next step I see.
 

Author Comment

by:dinkum305
ID: 22791558
Can't dismount, but I was able to copy all the log files except for e00.log. I put them in the log directory on the target, but I'm getting a different error now. I think it's telling me which log has the invalid handle:

Event Type:      Error
Event Source:      ESE
Event Category:      General
Event ID:      481
Date:            10/23/2008
Time:            11:45:44 PM
User:            N/A
Computer:      SVR02EXCH
Description:
Microsoft.Exchange.Cluster.ReplayService (5320) Log Verifier e00 63835064: An attempt to read from the file "\\SVR01EXCH\1fd53a2e-3679-4740-94bb-c8af43ab2327$\E00000A95C5.log" at offset 593920 (0x0000000000091000) for 65536 (0x00010000) bytes failed after 1 seconds with system error 6 (0x00000006): "The handle is invalid. ".  The read operation will fail with error -1022 (0xfffffc02).  If this error persists then the file may be damaged and may need to be restored from a previous backup.

For more information, click http://www.microsoft.com/contentredirect.asp.

I re-copied that one log from the source and now I'm back at the "handle is invalid" point I was at before. I'm a bit concerned that I have corruption on the source database.
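
The numbers in the event above can be decoded with a few lines of Python. This sketch assumes the log file name is the storage group prefix (E00 here) followed by the log generation in hexadecimal; both helper functions are illustrative and not part of any Exchange or ESE tooling.

```python
def log_generation(filename, prefix="E00"):
    """Return the generation number encoded in a log file name (assumed hex)."""
    stem = filename.rsplit(".", 1)[0]   # drop the ".log" extension
    return int(stem[len(prefix):], 16)  # remaining digits read as hexadecimal


def to_signed32(value):
    """Interpret an unsigned 32-bit value as a signed (two's complement) integer."""
    return value - (1 << 32) if value & 0x80000000 else value


print(log_generation("E00000A95C5.log"))  # 693701
print(to_signed32(0xFFFFFC02))            # -1022
```

Generation 693701 sits just below the LastLogGenerated value of 693709 shown in the earlier status output, so the failing read is on one of the newest logs on the source, and 0xfffffc02 is simply -1022 written as an unsigned 32-bit value.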

 
LVL 33

Accepted Solution

by:
Exchange_Geek earned 500 total points
ID: 22791640
Error -1022 normally shows up when the database cannot be read properly. There could be various reasons: disk, third-party applications, high RPC load or other perf issues. Plenty.

I guess let us not waste time: do a reseed after removing all the database and log files from the target. A fresh reseed makes sense. I know it sounds a bit frustrating, but that's how E2K7 is: if it works it's great, if not it's nasty.
 

Author Comment

by:dinkum305
ID: 22791864
OK, well thanks for your assistance on this. Probably looking at a week at least to reseed, but I feel a little better about it knowing that it can't be helped.

I'm actually in the process of seeding the second database; hopefully I'll be ready to set up SCR for that this weekend. Maybe I'll have better luck.
 
LVL 33

Expert Comment

by:Exchange_Geek
ID: 22791882
Wish you all the very best.
 

Author Comment

by:dinkum305
ID: 22844299
Well, I set up SCR for the other store and it seems to be behaving properly. Hopefully I will be done seeding this weekend and I can retry with the original store.
 
LVL 33

Expert Comment

by:Exchange_Geek
ID: 22844440
How do you qualify strange? Could you pen down some words? That would be helpful.
 
LVL 32

Expert Comment

by:gupnit
ID: 22844497
:-) !
 

Author Comment

by:dinkum305
ID: 22844543
Not sure what you mean by that comment.
 
LVL 33

Expert Comment

by:Exchange_Geek
ID: 22844671
"Well, I setup SCR for the other store and it seems to be behaving properly"

I am not even sure what this means; I just wanted to ask you: what's wrong?
 
LVL 32

Expert Comment

by:gupnit
ID: 22844712
Well, chill it should be fine :-)
 

Author Comment

by:dinkum305
ID: 22844728
Was just giving an update to the thread in case it would help any lurkers.

The original issue was that the transaction logs weren't replaying on the target. You advised me to re-seed, which I'm doing. But, in the meantime, I've got a second database seeded ... I set up SCR and I'm not seeing the original issue. I'm hopeful that when the other database is done seeding this weekend, I'll be all set.
 
LVL 33

Expert Comment

by:Exchange_Geek
ID: 22844794
Ah OK, the second store is seeded and there are no errors so far. I thought the second one had also gone to the dogs......life saved...I hope your weekend doesn't get spoiled and the first store behaves as well as the second.
 
LVL 32

Expert Comment

by:gupnit
ID: 22844814
You should be.....and yes thanks for the update....it would 100% help others too
Cheers
Nitin
 

Expert Comment

by:Alex_Mann
ID: 22948997
I had a similar problem. I solved it by restarting the Microsoft Exchange Replication Service. Good luck!
 

Author Comment

by:dinkum305
ID: 22949112
After reseeding, everything looks good. A bit of a pain, but all's well that ends well.
 
LVL 33

Expert Comment

by:Exchange_Geek
ID: 22949139
Wow that is such a good end to such a long story.

Congrats.