Organisations are arranged so that day-to-day business flows smoothly, with every application and system dependent on the others. That is exactly why “Exchange Server downtime” is such a serious event when it happens.
In a nutshell, to the business it means losses, and it leaves business-critical applications unavailable or not working. In such a critical, panic-filled situation, the only choice you are left with as the company’s resource is to bring everything back online.
So how do you plan to minimise downtime for an Exchange Server? Well, there are a number of ways. To list some of them, you could plan for high availability, meaning you have multiple Exchange Mailbox Servers in a DAG (Database Availability Group), either in one data centre or spanning multiple data centres.
Secondly, you could look at mail spooling, which means mail won’t bounce but will be spooled until the Exchange Server is back online again.
You could also look at continuity. This means you have a plugin from a third-party vendor running in Outlook, so when the connection drops your users enter continuity mode and mail still flows until you restore your failed Exchange Server.
Let’s dive into a few more options to ensure you do not have an Exchange Server down.
- Drive space
- Test failovers
- Ensure transactional logs are on different volumes
- Perform backups/Test your backups
- Ensure you have more than 1 domain controller in your environment
- No snapshots
- Monitoring
- Backup Power
- Mix Server Roles
Drive space
Exchange writes a lot of information all the time. Admins often do not spec an Exchange Server adequately and run out of space on a disk, whether it is the C:\ drive or a mount point. The Exchange calculator will advise what size to build the drives, but you still need to maintain them. CAS servers write a lot of IIS log files, and these can be pretty large depending on how big your environment is. If Exchange reaches a threshold, in this scenario low disk space, it will cause issues and downtime.
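To make the drive space point concrete, here is a minimal sketch of a free-space check in Python. It is an illustration only, not an Exchange feature: the 15% threshold and the function names are assumptions made for this example, and you would pick a threshold that suits your own environment.

```python
import shutil
from typing import Callable, Iterable, List, Tuple

# Hypothetical threshold chosen for this sketch: flag a volume when
# less than 15% of it is free. Tune this for your own environment.
MIN_FREE_FRACTION = 0.15


def free_fraction(total_bytes: int, free_bytes: int) -> float:
    """Return the fraction of a volume that is still free (0.0 to 1.0)."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes


def low_space_volumes(
    paths: Iterable[str],
    min_free: float = MIN_FREE_FRACTION,
    usage: Callable = shutil.disk_usage,
) -> List[Tuple[str, float]]:
    """Return (path, free_fraction) for every path below the threshold.

    `usage` defaults to shutil.disk_usage, which reports total, used and
    free bytes for the volume that contains the given path.
    """
    flagged = []
    for path in paths:
        stats = usage(path)
        fraction = free_fraction(stats.total, stats.free)
        if fraction < min_free:
            flagged.append((path, fraction))
    return flagged
```

You could run something like `low_space_volumes(["C:\\", "E:\\ExchangeDB"])` on a schedule, where both paths are placeholders for your own drives and mount points, and raise an alert whenever the returned list is not empty.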
Test failovers
It is all well and good that you as an Exchange Admin built this DAG, but if you do not perform failovers you will not know whether there is a problem. Say, for example, you have a 2-node DAG: Server 1 goes down and you expect Server 2 to mount the databases, but in reality it doesn’t. That is a problem for the business, because its high availability setup is not working. There could be many causes, but as an Exchange Admin you need to test this weekly to ensure that you have a working environment. One way to test is to fail over one server, put that server in maintenance mode so you can patch it, and check whether you remain connected to Exchange.
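For the "check whether you remain connected" part of a failover test, a rough sketch like the one below can record whether the client-facing endpoint stays reachable while you fail over. It assumes clients connect over HTTPS on port 443, and the function names are made up for this example. Keep in mind that a successful TCP connection only shows the port is answering. It does not prove that the databases mounted on the other node, so treat it as one signal among several.

```python
import socket
import time
from typing import Callable, List


def can_connect(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def poll_connectivity(
    host: str,
    port: int = 443,
    attempts: int = 10,
    interval: float = 5.0,
    check: Callable[[str, int], bool] = can_connect,
    sleep: Callable[[float], None] = time.sleep,
) -> List[bool]:
    """Probe host:port `attempts` times and return one result per probe."""
    results = []
    for attempt in range(attempts):
        results.append(check(host, port))
        if attempt < attempts - 1:
            sleep(interval)  # wait between probes, but not after the last one
    return results
```

Start something like `poll_connectivity("mail.example.com", attempts=12, interval=5.0)` just before you trigger the failover, where the host name is a placeholder for your own client access endpoint. Any `False` in the returned list marks a probe where the endpoint did not answer.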
Ensure transactional logs are on different volumes
This is self-explanatory, but it ensures that if you lose the boot volume, your log files are still intact and you can recover your Exchange database from the log files.
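As a quick sanity check on the point above, the sketch below flags any log path that sits on the same drive letter as the boot volume. The paths and the function name are hypothetical, and the check only compares drive letters. A folder under C:\ that is really a mount point for another volume would be flagged incorrectly, so verify mount points by hand.

```python
import ntpath
from typing import Iterable, List


def paths_on_boot_volume(paths: Iterable[str], boot_drive: str = "C:") -> List[str]:
    """Return the paths whose drive letter matches the boot drive.

    ntpath is used so that Windows-style paths are parsed the same way
    regardless of the operating system the script runs on.
    """
    boot = boot_drive.rstrip("\\/").lower()
    return [p for p in paths if ntpath.splitdrive(p)[0].lower() == boot]
```

For example, passing `[r"C:\ExchangeLogs\DB01", r"L:\Logs\DB02"]` returns only the first path, which tells you the DB01 logs would be lost together with the boot volume.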
Perform backups/Test your backups
Backups are a critical part of an Exchange Admin’s job. You probably have a team that manages backups, and you need to ensure that the backups run daily. This ensures log files are truncated, so you don’t end up with a problem when your server reboots unexpectedly and the databases won’t mount because of inconsistency.
The second part is to test your backups. Testing them and signing off your daily check sheet to confirm that you can successfully restore your database and mount it means that your backups are in good order.
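The daily backup check can also be partly automated. The sketch below assumes you can export the time of the last successful backup for each database from whatever backup tool you use. The database names, the 24-hour limit and the function name are all assumptions for illustration. It does not replace a real restore test, it only tells you which backups are overdue.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional


def stale_backups(
    last_backup: Dict[str, Optional[datetime]],
    now: datetime,
    max_age: timedelta = timedelta(hours=24),
) -> List[str]:
    """Return database names whose last backup is missing or too old.

    `last_backup` maps a database name to the time its last successful
    backup finished, or to None if no backup has been recorded.
    """
    stale = []
    for name, finished in last_backup.items():
        if finished is None or now - finished > max_age:
            stale.append(name)
    return sorted(stale)
```

Anything this returns is worth chasing with the backup team before you sign off the daily check sheet.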
Ensure you have more than 1 domain controller in your environment
Exchange and domain controllers work very closely together. If you have one domain controller in your environment and it goes down, Exchange loses access to it and eventually falls over because it cannot start its services etc. Having multiple domain controllers in your environment ensures that Exchange keeps working. Yes, it will log an error when it cannot communicate with the domain controller that is down, but it will switch to another working one.
No snapshots
Many IT Admins think they can snapshot a machine and, if they have an Exchange Server failure, just restore the snapshot. Unfortunately, the bad news is that firstly Microsoft doesn’t support this and secondly you will have data loss. Exchange writes data in real time, and if you rely on a snapshot you are going to have more downtime trying to restore data etc., making more work for yourself as an IT Admin at the end of the day.
Monitoring
This is a big and critical activity for an Exchange Admin and could be discussed in much broader terms. It is essential that you as an Exchange Admin have some form of monitoring in place for your Exchange environment. This could be System Center Operations Manager from Microsoft or LabTech etc., but whichever one you settle on, it needs to be set up properly. Not only does it give you full visibility of everything, but you can also act on something that comes up before you have complete downtime. Many times, this kind of monitoring will advise you that a disk is going to fail or that you are nearing capacity on your drives.
Backup Power
Having a generator or UPS attached to your server equipment is vital in your environment. If Sussie boils the kettle and it trips the power in the office, you may well end up with corruption on your Exchange Server databases, and this will result in downtime. Your server equipment should be on its own DB board with adequate power, so that in the event of a failure you can continue to operate.
Mix Server Roles
This is a very important point. Many companies don’t want to spend money on equipment and tell the Admins, “just make it work”. An example is Exchange being installed on a domain controller or on a SQL server. Firstly, this is not a configuration Microsoft recommends, and secondly you will run into a lot of downtime because the SQL admin might need to do a reboot midday etc. Ensure that Exchange is on its own server to prevent downtime caused by other applications.
On a final note, considering all the points and challenges discussed above, every Exchange Admin can rely on Stellar Phoenix and their Exchange Toolkit to come to the rescue in a turbulent Exchange Server database downtime situation and to help ensure that all your data is available.