hqpsystems
asked on
2003 Cluster fails after 4 successful days running under a new Cluster Service account
Our corporate policy is to change 'system' passwords when someone leaves. I was asked to change the password on the Domain Administrator account for my 2003 domain, which was also being used as the Cluster Service account for my two Windows 2003 cluster nodes. I decided to create a separate account, clusadmin, to use for the cluster, and gave it the correct rights as per MS Knowledge Base article 269229.
I stopped the Cluster Service on both nodes, changed the service account and brought up the cluster without issue. Later, I changed the domain administrator password. Everything went well for four days (about 100 hours) and then all the cluster resources started failing.
The error for each clustered resource was '9016 DNS signature failed to verify.'
The only way I managed to get round the problem was to change the Cluster Service account back to the domain administrator account (using the new password), restart the first node, and all came up fine. The second node was then brought up successfully.
What could cause this behaviour, after 4 successful days? I gave the clusadmin account all the rights I believe it should have had. Does some backend process run after 100 hours or something that could cause this? Thanks.
ASKER CERTIFIED SOLUTION
The c:\windows\cluster\cluster.log file will probably say what went wrong when the cluster failed. If the "network name" resources are what failed, unchecking "require dns registration to succeed" will work around that temporarily, but you will still need to fix the underlying problem eventually.
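If the log is large, a small script can narrow it down. The following is only a rough sketch: the search terms are assumptions taken from the error text in the question, the function name is made up for illustration, and it makes no assumption about the log's line format beyond it being plain text.

```python
from pathlib import Path

# Path quoted in the answer above; adjust if Windows is installed elsewhere.
CLUSTER_LOG = Path(r"c:\windows\cluster\cluster.log")

# Search terms are assumptions based on the error text in the question.
KEYWORDS = ("9016", "dns", "network name")


def find_matches(lines, keywords=KEYWORDS):
    """Return (line_number, text) pairs for lines containing any keyword."""
    lowered = [k.lower() for k in keywords]
    hits = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        # Case-insensitive substring match against every keyword.
        if any(k in text.lower() for k in lowered):
            hits.append((number, text))
    return hits


if __name__ == "__main__":
    if CLUSTER_LOG.exists():
        # errors="replace" avoids crashing on unexpected bytes in the log.
        with CLUSTER_LOG.open(encoding="utf-8", errors="replace") as handle:
            for number, text in find_matches(handle):
                print(f"{number}: {text}")
    else:
        print(f"Log not found at {CLUSTER_LOG}")
```

Run it on the node that failed first and look at the entries around the time the resources went offline; the lines just before the first match are usually the useful ones.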