Exchange 2007 SP2 Upgrade

Hi,

I was running Exchange 2007 SP1 Rollup 6 on Windows Server 2008. Yesterday I updated all four of my Exchange servers to SP2. The HT and CAS servers went fine, but I had problems upgrading the CCR cluster.
After upgrading the passive node to SP2, I stopped and moved the CMS. The upgradecms command failed with error 0x8007139A (screenshot attached). I changed the IIS and SA timeouts, but that did not help, and neither did a reboot. I was not able to stop or move the CMS.
I then upgraded the other (active) node to 2007 SP2 as well. All mail flow is working fine, and I was able to back up the cluster as well.
After checking the cluster, I found that one of the resources (a Recovery Storage Group) was offline and failing. I went ahead and deleted the RSG, and the CCR came online (screenshot attached). UpgradeCMS was still erroring out.

The problem is that all the databases (four, in four storage groups) have been suspended since yesterday, and I think the logs are not being replicated to the other node (judging by the size of the logbase file).

Is there a way to bring these storage groups (copy status) back to a healthy condition? Should I be worried that the CMS is still not upgraded?

Thanks

Suren

 

Asked by: jmls
 
jmls (Author) commented:
I had to reseed the database. I ran an automatic reseed using Update-StorageGroupCopy with the -DeleteExistingFiles option, and that worked.
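
For readers following along, here is a minimal sketch of the reseed described above, as it might be run from the Exchange Management Shell on the passive node. The clustered mailbox server name (CMS1) and the storage group name are hypothetical placeholders, not values from this thread.

```powershell
# Assumed names: replace "CMS1\First Storage Group" with your own CMS and storage group.
$sg = "CMS1\First Storage Group"

# Replication has to be suspended before the passive copy can be reseeded.
Suspend-StorageGroupCopy -Identity $sg -SuspendComment "Reseeding after SP2 upgrade"

# Delete the existing passive database and log files, then seed a fresh copy from the active node.
Update-StorageGroupCopy -Identity $sg -DeleteExistingFiles

# Check the resulting copy status.
Get-StorageGroupCopyStatus -Identity $sg
```

Unless -ManualResume is specified, Update-StorageGroupCopy should resume replication on its own once seeding completes. Because the whole database is copied across the network, large databases are best seeded outside business hours.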
 
jmls (Author) commented:
Here are the screenshots
CMSError.JPG
CMSError2.JPG
 
Busbar (Solutions Architect) commented:
Have you made sure that the Cluster service is running on both nodes?

 
jmls (Author) commented:
The Cluster service is running (startup type Automatic) on both nodes.
 
tusharnextgen commented:
1. Check in the registry or in the Exchange Management Console whether the CMS shows Exchange 2007 SP1 or SP2 (if it shows SP1, the CMS upgrade was not successful).
2. Restart the Remote Registry service on the passive node. Try failing over after restarting the Remote Registry service.
3. Check the replication health status of the CCR.
4. Take a full backup on the active node if the databases are mounted and you are sure that they have complete data.
5. Now try to update the passive database copy of the CCR.
6. upgradecms will only run when your CMS is in a stopped state.
7. Try running ExBPA as well.
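
Steps 1 and 6 above can be sketched in the Exchange Management Shell roughly as follows. The CMS name CMS1 is a hypothetical placeholder, and Setup.com is assumed to be run from the folder that holds the SP2 installation files.

```powershell
# Step 1: list the version each Exchange server reports, including the CMS.
# A version starting with 8.2 corresponds to SP2; 8.1 corresponds to SP1.
Get-ExchangeServer | Format-List Name, AdminDisplayVersion

# Step 6: the CMS has to be stopped before it can be upgraded.
# "CMS1" is an assumed name; substitute your clustered mailbox server.
Stop-ClusteredMailboxServer -Identity "CMS1" -StopReason "Upgrading CMS to SP2"

# Run from the SP2 installation folder on the node that owns the CMS.
.\Setup.com /upgradecms
```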
 
jmls (Author) commented:
Thanks for the reply.

1. I have checked the CMS. It says version 8.2 (which means SP2).
2. I will restart the Remote Registry service tonight (after hours). What do you mean by trying to fail over?
3. I will run Test-ReplicationHealth after that.
4, 5. Depending on the results of 2 and 3, I will do these tomorrow.
6. UpgradeCMS is not needed.
7. I will do this if the above steps do not work.

 
maumen commented:
Check the steps provided in the following link:
http://technet.microsoft.com/en-us/library/bb676320(EXCHG.80).aspx
 
jmls (Author) commented:
I did follow the above-mentioned document. Step 11 errored out, so I went ahead and finished all the remaining steps. Right now the CMS is also upgraded to SP2.
 
jmls (Author) commented:
I noticed that under the CMS properties (Exchange Management Console), the Active and Quorum owner roles are both assigned to one node while the other node has no role. I am sure that before the upgrade each node had one role. I think that could be the reason for the logs not being copied over or for the suspended databases. Any thoughts?
 
jmls (Author) commented:
I tried a few things.
1. Restarting the Remote Registry service on the passive node did not help.
2. When I ran Test-ReplicationHealth, the SGCopySuspended check failed on the passive server (it used to be the active server, but the CMS was moved during the CMS upgrade). All other checks passed.
3. When I tried to move the CMS to the passive node (as it was in the past), it errored out, reporting that continuous replication is in a failed, seeding or suspended state on the storage group.
4. Rebooting the passive node did not help either.
5. The storage groups are still in a suspended state and the CMS is online.
6. The Symantec backup is running fine, but surprisingly the log file size is increasing on the CMS.
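
The health check and the attempted move in items 2 and 3 would look roughly like this in the Exchange Management Shell. The names CMS1 and NODE2 are hypothetical, and the move is only expected to succeed once the storage group copies are no longer failed, seeding or suspended.

```powershell
# Run locally on a cluster node; each check (including SGCopySuspended) is listed with its result.
Test-ReplicationHealth | Format-List

# Scheduled handoff of the CMS to the other node.
# "CMS1" and "NODE2" are assumed names for the clustered mailbox server and the target node.
Move-ClusteredMailboxServer -Identity "CMS1" -TargetMachine "NODE2" -MoveComment "Returning CMS to its original node"
```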
 
tusharnextgen commented:
Try updating the passive copy of the database to make it healthy:
http://technet.microsoft.com/en-us/library/aa998853(EXCHG.80).aspx

You can also go with the database reseeding option.
How to Seed a Cluster Continuous Replication Copy
http://technet.microsoft.com/en-us/library/bb124706(EXCHG.80).aspx

Before going for the options above, please check
How to View the Status of a Clustered Mailbox Server
http://technet.microsoft.com/en-us/library/bb123923.aspx

and make sure both nodes are listed for the following attributes:
InUseReplicationHostNames
OperationalReplicationHostNames
OperationalMachines
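
As a rough sketch, the attributes listed above can be read from the Exchange Management Shell like this (CMS1 is a hypothetical clustered mailbox server name):

```powershell
# "CMS1" is an assumed name; substitute your clustered mailbox server.
# Both node names should appear in each of the three host/machine lists.
Get-ClusteredMailboxServerStatus -Identity "CMS1" |
    Format-List Identity, State, OperationalMachines, InUseReplicationHostNames, OperationalReplicationHostNames
```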
 
jmls (Author) commented:
1. I have checked the CMS status, and both nodes are listed for the attributes specified above.
2. I will try the reseed option tonight with one of the databases as a test.

Thanks
 
jmls (Author) commented:
Will reseeding the database help, given that the storage group is in a suspended state? My main concern is to bring the storage group from the Suspended copy status to Healthy. Is there a way to change that directly, without the database being copied over to the passive server? I want to avoid a database copy.
 
maumen commented:
Have you tried using Resume-StorageGroupCopy?
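
A minimal sketch of that suggestion, assuming a hypothetical storage group identity of "CMS1\First Storage Group". The one-minute pause is an arbitrary choice, there only so that a copy which drops back to Suspended shortly after resuming is noticed.

```powershell
# Assumed identity: replace with your own CMS and storage group name.
$sg = "CMS1\First Storage Group"

# Ask replication to resume log copying and replay for this storage group.
Resume-StorageGroupCopy -Identity $sg

# Wait briefly, then check whether the copy stayed Healthy.
Start-Sleep -Seconds 60
Get-StorageGroupCopyStatus -Identity $sg |
    Format-Table Name, SummaryCopyStatus, CopyQueueLength, ReplayQueueLength
```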
 
jmls (Author) commented:
I tried that through the Exchange Management Console. It shows Healthy for a moment and then goes back to the suspended state.
 
maumen commented:
This link might help; it provides information on diagnosing CCR using the Get-StorageGroupCopyStatus cmdlet:
http://technet.microsoft.com/en-us/library/aa996020(EXCHG.80).aspx
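
For a more detailed view than the one-word summary status, something along these lines may help. The server name CMS1 is hypothetical, and the property list is a selection assumed to be relevant here rather than an exhaustive one.

```powershell
# "CMS1" is an assumed clustered mailbox server name.
# SuspendComment and FailedMessage may hint at why a copy keeps returning to Suspended.
Get-StorageGroupCopyStatus -Server "CMS1" |
    Format-List Name, SummaryCopyStatus, SuspendComment, FailedMessage, CopyQueueLength, ReplayQueueLength, LastInspectedLogTime
```

If these fields are empty, the Application event log on the passive node is another place to look for replication errors.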
 
jmls (Author) commented:
My Get-StorageGroupCopyStatus says Suspended, and there is no information about that at the URL above. Surprisingly, the Symantec backup is working fine from the active node, but the logs are not being flushed.
 
maumen commented:
Try the following cmdlet. Restore-StorageGroupCopy activates the passive copy so that it can be mounted:
http://technet.microsoft.com/en-us/library/aa996024(EXCHG.80).aspx
You might also try Update-StorageGroupCopy to resynchronize replication to an invalid database.
 

 
tusharnextgen commented:
jmls:

The backup is not deleting log files because those log files have not yet been successfully replayed into the passive copy of the database.
The option you have is as follows.
On the active node, suspend the storage group copy, then:
1. Dismount the database outside business hours. Move all the log files, including the .log, .chk and .jrs files, to a different location (we can delete them after the database has been successfully mounted and backed up).
2. Move all existing files at the passive location (the database location on the passive node), i.e. the .edb database file and the .log, .chk and .jrs files, to a different folder (we will delete them once we are sure the issue is resolved). Then copy the database file to the passive location.
3. Mount the database on the active node.
4. Resume the storage group copy.
Please reply if you have difficulty understanding my comment.
The steps for manually copying the database to the passive node are also in the article I provided:
http://technet.microsoft.com/en-us/library/bb124706(EXCHG.80).aspx
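
The cmdlet side of the manual procedure above might be sketched as follows. The storage group and database identities are hypothetical placeholders, and the file moves are left as comments because the actual paths depend on the environment.

```powershell
# Assumed identities: replace with your own CMS, storage group and database names.
$sg = "CMS1\First Storage Group"
$db = "CMS1\First Storage Group\Mailbox Database"

# Suspend replication for the storage group before touching any files.
Suspend-StorageGroupCopy -Identity $sg -SuspendComment "Manual reseed"

# Step 1: dismount the database, then move the .log, .chk and .jrs files aside on the active node.
Dismount-Database -Identity $db

# Step 2 (manual file work, paths not shown):
#   - move the old .edb, .log, .chk and .jrs files on the passive node to a holding folder
#   - copy the dismounted .edb file from the active node to the passive database location

# Step 3: mount the database on the active node again.
Mount-Database -Identity $db

# Step 4: resume replication and check the status.
Resume-StorageGroupCopy -Identity $sg
Get-StorageGroupCopyStatus -Identity $sg
```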
 
jmls (Author) commented:
Thanks for the detailed instructions.

As I have one database per storage group, I will test these steps tonight on two storage groups (the Primary and Secondary storage groups). Depending on the results, I will schedule the other databases, as they are very critical and larger in size.

What is the difference between Resume-StorageGroupCopy and Restore-StorageGroupCopy?

I hope resuming the storage group copy will bring the storage group back to a healthy condition.
 
tusharnextgen commented:
Resume will restart the replication of log files that was suspended.
But log files will only be replicated and replayed if both databases are identical apart from a difference of some log files, and all required log files are present and in sequence.
In short, you have to use Resume-StorageGroupCopy after completing the steps.
 
jmls (Author) commented:
I tried the steps tusharnextgen described above. The copy stayed Healthy momentarily and then went back to the suspended state. The same thing was happening before as well (when running Resume-StorageGroupCopy directly, without copying over the data).

It seems I need to contact MS about this. I hate CMS problems.
 
tusharnextgen commented:
I guess the MS engineer will follow the same steps that we followed.
But I am curious about the solution if it is something other than this.

Please let me know once it is solved.
Thank you very much, and
best of luck.
 
tusharnextgen commented:
I have no objection.

The solution used by MS is also suggested in this article:
http://technet.microsoft.com/en-us/library/bb124706(EXCHG.80).aspx

The following link has all the switches for Update-StorageGroupCopy:

http://technet.microsoft.com/en-us/library/aa998853(EXCHG.80).aspx

Thank you for such a good question; it made me recall all the CCR cases I have solved.
Question has a verified solution.
