asked on

Exchange cluster fail

this morning i noticed bunch of cluster errors on almost all exchange servers that had happened about 3 am last night. so kept digging and finally i pulled the cluster log files from all exchange servers via
Get-ClusterLog -TimeSpan 600 -Destination C:\temp\ClusterLog

and i noticed bellow errors that shows entire cluster went down.

00016538.00016480::2018/03/02-05:44:05.417 INFO [CHM] Received notification for two consecutive missed HBs to the remote endpoint 172.19.17.202:~3343~ from 172.19.16.149:~3343~
00016538.00016480::2018/03/02-11:46:05.102 INFO [IM] Marking Route from 172.19.16.149:~3343~ to 172.17.166.38:~3343~ as down
00016538.00016480::2018/03/02-11:46:05.103 INFO [IM] got event: Remote endpoint 172.17.166.39:~3343~ unreachable from 172.19.16.149:~3343~
00016538.00016480::2018/03/02-11:46:05.103 INFO [NDP] Checking to see if all routes for route (virtual) local 169.254.1.95:~0~ to remote 169.254.6.31:~0~ are down

now the interesting part is that the network team investigataed everything and they confirmed there was no network connectivity of any kind around that time between any of the exchange severs.

my question is that if there was no network connectivity what else might have caused all this? everything has come back online and normal on its own. but i wana know the reason for this.

ASKER CERTIFIED SOLUTION

timgreen7077

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

Riaz Alexander Ansary

ASKER

Ended up that Our new backup system Comvault took snapshots of each server and that caused the blip. not sure why backup team had the snapshot setup for exchange servers. they are excluded now. thanks for the response

timgreen7077

Good deal

timgreen7077

Assisted author in troubleshooting the issue.