Solved

Exchange 2010 failover clustering: troubleshooting event IDs 1207 and 1135

Posted on 2014-02-05
Last Modified: 2014-02-24
Greetings,

I have a two-server DAG. One of the servers is logging two Event ID 1207 entries every 15 minutes. They state:

Cluster with name "ClusterName" could not be brought online. The computer object associated with the resource could not be updated in domain "domain.local" for the following reason: unable to update password for computer account.

The text for the associated error code is:  RPC server unavailable

The cluster identity may lack permissions required to update the object.

and

"Cluster network name resource '%1' cannot be brought online. The computer object associated with the resource could not be updated in domain '%2' for the following reason: %3.

The text for the associated error code is: %4

The cluster identity '%5' may lack permissions required to update the object. Please work with your domain administrator to ensure that the cluster identity can update computer objects in the domain."


The other server in the DAG had one entry for Event ID 1135 stating:

"Cluster node "EXCH10-2" was removed from the active failover cluster membership. Cluster service may have stopped...."

When I go to the DAG computer object in AD, I only see the exchange server with event ID 1135 listed in the permissions.

Also, if I do an nslookup for 192.168.35.20 (the DAG IP), it fails. If I do an nslookup for DAG, it returns the previous IP. I am not sure whether that is relevant to troubleshooting the 1207 error.
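The two lookups described above can be repeated side by side so the forward result is compared with the expected address. This is only a rough sketch: the name `DAG` and the IP come from the post, and the variable names are made up for illustration.

```powershell
# Values taken from the post; replace with your own DAG name and IP.
$dagName    = 'DAG'
$expectedIp = '192.168.35.20'

# Forward lookup through the local resolver (this uses the client cache,
# so consider running "ipconfig /flushdns" first).
$resolved = [System.Net.Dns]::GetHostAddresses($dagName) |
    ForEach-Object { $_.IPAddressToString }

if ($resolved -contains $expectedIp) {
    Write-Host "$dagName resolves to the expected address $expectedIp"
} else {
    Write-Warning "$dagName resolves to '$($resolved -join ', ')', expected $expectedIp"
}

# Reverse lookup straight against the DNS server; this fails when the
# reverse zone has no PTR record for the address.
nslookup $expectedIp
```

A stale A record for the cluster name would be consistent with the "could not be updated" wording in event 1207, but that is an inference, not something the event text confirms.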

I am not sure if I should add the 1207 exchange server to the AD DAG computer object permissions or what. Not really sure where to start. I've looked at several resources online but am confused as to which AD objects I should look at and potentially edit, if any. This started 2 days ago.

Thanks a lot!
Question by:rpliner
18 Comments
 
LVL 7

Author Comment

by:rpliner
ID: 39836511
I restarted the cluster service on the exch10-2 node. It only has one public database. It was dismounted for a few minutes, then went through recovery steps to re-mount it, which it did successfully. I am now seeing event 1207 on the other DAG node. Initially, I only saw this error on the exch10-2 node.

Further, I am now seeing these event IDs

1564 (FSW failed to arbitrate for the file share \\FSW.domain.local\DAG.domain.local. please ensure it exists)

1069 (cluster resource FSW \\FSW.domain.local\DAG.domain.local in clustered service or application 'cluster group' failed)

1573 (node exch10-2 failed to form a cluster. this is because the witness was not accessible. please ensure the witness is online and available)

The DAG share appears when I access the FSW server through a UNC path, although I am unable to open it (likely because I am logged on as administrator).

To note, I have not received event ID 1207 in the last 30 minutes. It has skipped the last two entries. Errors 1564 and 1069 appeared 3 minutes before the expected 1207.
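For events 1564, 1069 and 1573, it may help to look at the witness from the cluster's point of view instead of trying to open the share interactively. A minimal sketch, assuming the FailoverClusters module is available on the DAG node (Windows Server 2008 R2 or later) and using the server name from the post:

```powershell
# Run on one of the DAG members in an elevated PowerShell session.
Import-Module FailoverClusters

# List the shares published by the witness server without opening them.
# The interactive account may be denied access to the share itself even
# when the cluster can use it.
net view \\FSW.domain.local

# Which nodes does the cluster currently consider up?
Get-ClusterNode | Format-Table Name, State -AutoSize

# What quorum model and witness resource is the cluster using?
Get-ClusterQuorum | Format-List Cluster, QuorumResource, QuorumType

# State of every cluster resource, including the file share witness
# and the cluster network name.
Get-ClusterResource | Format-Table Name, State, OwnerGroup, ResourceType -AutoSize
```

A witness resource shown as Failed or Offline here would match event 1069; a node shown as Down would match event 1135.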

I think DAG is broken and not even trying.

Thanks for any help
 
LVL 17

Accepted Solution

by:
Spartan_1337 earned 167 total points
ID: 39836524
Run this command to verify your FSW.

Get-DatabaseAvailabilityGroup -Identity <DAG Name> -Status | fl

Verify "Witness Directory" is the correct location of your FSW.
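To narrow the output down to the witness-related fields, something like the following could be used. `DAG01` is a placeholder for the DAG name and `EXCH10-2` is the node name mentioned in the question.

```powershell
# Run from the Exchange Management Shell. 'DAG01' is a placeholder.
Get-DatabaseAvailabilityGroup -Identity 'DAG01' -Status |
    Format-List Name, Servers, OperationalServers, WitnessServer, WitnessDirectory, WitnessShareInUse

# Optional: run the built-in replication and cluster health checks
# against one DAG member.
Test-ReplicationHealth -Identity 'EXCH10-2'
```

With `-Status`, `OperationalServers` lists the members that are currently active in the cluster, which can be compared against `Servers`.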
 
LVL 7

Author Comment

by:rpliner
ID: 39836525
Event ID 1207 is showing up every 15 minutes on the first node now. It had not done this until I restarted the cluster service on the second node.
 
LVL 17

Expert Comment

by:Spartan_1337
ID: 39836530
Or just open EMC and go to Organization Configuration > Mailbox > Database Availability Groups.
You should see the DAG name, member servers and witness directory.
 
LVL 7

Author Comment

by:rpliner
ID: 39836557
Thanks for replying Spartan_1337.

Everything is correct in the output of that command. The witness directory is correct.
 
LVL 7

Author Comment

by:rpliner
ID: 39836569
EMC shows both nodes and the correct FSW and DAG share as well. There is still no 1207 on the second node. The events are still showing up on the first node every 15 minutes.

Thanks
 
LVL 7

Author Comment

by:rpliner
ID: 39836654
I found this earlier:

http://technet.microsoft.com/en-us/library/dd353973(v=ws.10).aspx

Everything checked out. Both nodes show up.

I wonder if moving the FSW share will do anything to correct this. I am looking for a list of the correct permissions in AD, and which objects to apply them to.
 
LVL 7

Author Comment

by:rpliner
ID: 39836959
So I rebooted the FSW server and I am now receiving error 1207 on the second node again, but no longer on the first node.

The DAG in EMC shows the network as up. All databases show mounted / healthy.

Not sure where to go from here.
 
LVL 38

Assisted Solution

by:Philip Elder
Philip Elder earned 167 total points
ID: 39837074
Have you seen this TechNet blog post? http://bit.ly/1lBiMeo
Exchange 2007 SP1 CCR - Windows 2008 Clusters - File Share Witness (FSW) failures.

A number of points addressed in this blog post also cover the errors you originally listed.

Philip
 
LVL 7

Author Comment

by:rpliner
ID: 39837141
I had not run across this. Thanks Philip. I will look over it soon.
 
LVL 7

Author Comment

by:rpliner
ID: 39840415
I checked the DAG DNS entry. The check box for 'delete this record when it becomes stale' was selected and the record time stamp is dated 2/3/2014 9:00 PM. Scavenging is not enabled on the forward lookup zone though, so I am not sure if it matters. Also, there was no reverse lookup zone entry for the DAG. I couldn't ping by IP earlier.

I also noticed that the DAG object properties in AD>Computers>DAG 'right-click'>Properties>Object tab show the object was created 7/11/2013 and modified 2/3/2014 at 8:41 PM.
 
LVL 12

Assisted Solution

by:SreRaj
SreRaj earned 166 total points
ID: 39841217
Hi,

The computer objects of the cluster nodes should have Full Control permission over the cluster resource computer object in order to update it. Please verify that Full Control is set for them.

If you have a DAG with the name DAG01, then there will be a computer object in AD with the name DAG01, and the computer accounts of the Exchange servers (such as Exch10-2) should have Full Control over this object in order to update it. There is also a limit on the number of computer objects that a single account can create in a domain. If this limit has been reached for an account, it could have trouble with other objects.

Please refer to the 'Items to review in Active Directory' section in the following article.

http://technet.microsoft.com/en-us/library/cc773451(v=ws.10).aspx
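One way to check those permissions from a shell is to read the ACL of the DAG computer object. A minimal sketch, assuming the Active Directory module for Windows PowerShell is installed; the distinguished name below is a placeholder built from the example names in this thread and must be replaced with the real one.

```powershell
# Loading the module also creates the AD: drive used by Get-Acl below.
Import-Module ActiveDirectory

# Placeholder DN of the DAG computer object (the cluster name object).
$dagDn = 'CN=DAG01,CN=Computers,DC=domain,DC=local'

# List the access entries held by the Exchange node computer accounts.
# The '*EXCH10*' pattern is an assumption about how the nodes are named.
(Get-Acl -Path "AD:\$dagDn").Access |
    Where-Object { $_.IdentityReference.Value -like '*EXCH10*' } |
    Format-Table IdentityReference, ActiveDirectoryRights, AccessControlType -AutoSize
```

If a node is missing from the output, or does not show `GenericAll` (Full Control), that would match the situation described in the question, where only one node was listed on the object.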
 
LVL 7

Author Comment

by:rpliner
ID: 39841706
Thanks SreRaj. I noticed that Exch10-2 was not listed at all on the DAG computer object in AD, while the other node was. I'm going to add Exch10-2 and duplicate the permissions.

I had seen that link but it only mentioned checking the quota and that the DAG computer object should have full permissions. There was another article that mentioned, as you did, that the nodes should have certain permissions as well. That said, how do I check the quota? I only see an option to change the quota in ADSIedit, not view the current usage.

Thanks a lot!
 
LVL 7

Author Comment

by:rpliner
ID: 39843582
SreRaj, I added exch10-2 and matched the permissions of the first node. I am still getting the same 1207 errors as before. Thx
 
LVL 12

Expert Comment

by:SreRaj
ID: 39849988
When you edit the ms-DS-MachineAccountQuota attribute, you can see the value currently set for it. By default it is 10.

Please try modifying this value and also try restarting the node exch10-2.
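The configured quota, and a rough view of how much of it each account has used, can also be read without ADSI Edit. This is a sketch under a few assumptions: the Active Directory module is installed, and computer objects created under the quota carry the creator's SID in `mS-DS-CreatorSID` (objects created by domain administrators normally do not).

```powershell
Import-Module ActiveDirectory

# Read the configured quota from the domain root object.
$domainDn = (Get-ADDomain).DistinguishedName
Get-ADObject -Identity $domainDn -Properties 'ms-DS-MachineAccountQuota' |
    Select-Object -ExpandProperty 'ms-DS-MachineAccountQuota'

# Count computer objects per creator SID to estimate current usage.
Get-ADComputer -Filter * -Properties 'mS-DS-CreatorSID' |
    Where-Object { $_.'mS-DS-CreatorSID' } |
    Group-Object { $_.'mS-DS-CreatorSID'.Value } |
    Sort-Object Count -Descending |
    Select-Object Count, Name
```

A creator whose count equals the quota value has reached the limit. The `Name` column holds the SID string, which can be translated to an account name separately.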
 
LVL 7

Author Comment

by:rpliner
ID: 39850476
I will check that, SreRaj. I will reboot the node when possible. If the reboot does not clear the error, I am going to call Microsoft and have support check it out. Will update.

Thanks
 
LVL 7

Author Comment

by:rpliner
ID: 39883225
I have not been able to reboot the server yet. However, all of a sudden, on the 20th the errors stopped on the second node and have not reappeared on either node. DAG info in EMC looks good. Again, I did nothing (no reboot, no disabling / re-enabling NIC, etc.).
 
LVL 7

Author Closing Comment

by:rpliner
ID: 39883231
None of these fixed the issue directly, but I awarded points for the replies and troubleshooting tips.
