I have two virtual Exchange 2010 SP1 CAS servers in an NLB configuration. Each server uses a single NIC for both mail traffic and NLB communication. This is a configuration supported by Microsoft, and I have set up the NLB cluster in multicast mode, which is what Microsoft recommends for this configuration to work (http://technet.microsoft.com/en-us/library/cc776178(WS.10).aspx).
The NLB cluster was created successfully and the two nodes converged without errors. I also tested failover at that point and it worked. Some time later, I noticed the following. When I connect to the cluster with NLB Manager from a non-cluster machine (any other server in my environment), I can see the cluster and both of its nodes properly. When I connect with NLB Manager from either of the two cluster nodes, only the local node is visible in the cluster, and the following error (event ID 4, Kerberos) is logged in the event log of the local CAS server:
The Kerberos client received a KRB_AP_ERR_MODIFIED error from the server <local-CAS-server>$. The target name used was RPCSS/<other-CAS-server>. This indicates that the target server failed to decrypt the ticket provided by the client. This can occur when the target server principal name (SPN) is registered on an account other than the account the target service is using. Please ensure that the target SPN is registered on, and only registered on, the account used by the server. This error can also happen when the target service is using a different password for the target service account than what the Kerberos Key Distribution Center (KDC) has for the target service account. Please ensure that the service on the server and the KDC are both updated to use the current password. If the server name is not fully qualified, and the target domain (DOMAIN.LOCAL) is different from the client domain (DOMAIN.LOCAL), check if there are identically named server accounts in these two domains, or use the fully-qualified name to identify the server.
I have run setspn -L <other-CAS-server> and in the list of SPNs I cannot see the RPCSS/<other-CAS-server> SPN.
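The check above can also be scripted. The following is only a minimal sketch in Python: it parses text in the general layout that `setspn -L` prints (a header line followed by indented SPNs) and looks for a given service class on a given host. The sample output, the host name `CAS2`, the domain `domain.local`, and the helper names `parse_spns` and `find_spn` are all hypothetical placeholders, not values from my environment. It also looks for `HOST/...` entries because, as far as I understand, Windows normally treats RPCSS as one of the service classes covered by the built-in HOST alias, so an explicit RPCSS entry may legitimately be absent.

```python
# Hypothetical sample in the general layout of `setspn -L <computer>` output.
SAMPLE = """Registered ServicePrincipalNames for CN=CAS2,CN=Computers,DC=domain,DC=local:
\tHOST/CAS2
\tHOST/CAS2.domain.local
\tRestrictedKrbHost/CAS2
"""


def parse_spns(setspn_output):
    """Return the SPNs from setspn-style output (the indented lines)."""
    spns = []
    for line in setspn_output.splitlines():
        # SPN lines are indented and contain a "/"; the header line is not indented.
        if line[:1] in (" ", "\t") and "/" in line:
            spns.append(line.strip())
    return spns


def find_spn(spns, service_class, host):
    """Return SPNs matching service_class/host (short name or FQDN), ignoring case."""
    wanted_class = service_class.lower()
    wanted_host = host.lower()
    matches = []
    for spn in spns:
        parts = spn.split("/")
        spn_class = parts[0].lower()
        # Drop an optional ":port" suffix from the host part.
        spn_host = parts[1].split(":")[0].lower()
        same_host = spn_host == wanted_host or spn_host.startswith(wanted_host + ".")
        if spn_class == wanted_class and same_host:
            matches.append(spn)
    return matches


spns = parse_spns(SAMPLE)
print("Explicit RPCSS entries:", find_spn(spns, "RPCSS", "CAS2"))
print("HOST entries:", find_spn(spns, "HOST", "CAS2"))
```

To use it against a real server, the text returned by `setspn -L <other-CAS-server>` would be passed to `parse_spns` in place of `SAMPLE`.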
The same error also occurs when I try to view the OAB of the second CAS server from the EMC on the first CAS server.
Any suggestions or assistance as to how to troubleshoot this further would be highly appreciated.