Most Common Mistakes made by Exchange Administrators/Engineers when deploying Exchange Mailbox Servers

People frequently reach out to me to assist with outages in their Exchange environments. It's alarming how many Exchange Engineers/Administrators hold current Microsoft certifications yet continue to make these mistakes over and over. Here are some tips you should follow when deploying an Exchange Mailbox server.

Place Exchange Database and Logs on Thick Volumes

Never use thin provisioned disks in Exchange. When a thin provisioned volume is presented to a server, the server operating system has no insight into the storage array. Why does this matter? It matters because the storage array only allocates what is actually being used. As Exchange writes data, the array has to zero out each new block and reserve it before the transaction can be written. This is a performance bottleneck for Exchange, especially in large environments. No high-I/O application should ever use thin provisioning. Always be specific about your storage needs if another team handles your storage requests. If you don’t have access to your storage array, request a screenshot from the storage team when they provision storage for you. Always validate Exchange builds with a catalog of validation documents.

Follow Best Practice

Always obtain best practice documentation from all your vendors. Sometimes you will run into contradictory best practice guidelines. When this comes up, you should usually follow the storage vendor’s guidelines. Here is a good example:
Scenario: Deploying Exchange 2013 in a VMware virtual environment with NetApp SAN attached storage. After reading the Microsoft best practice for Exchange, NetApp best practice and VMware best practice documents, you’ll come across situations where Microsoft says not to do something and NetApp says you should. In this example you would always follow the NetApp guidelines (as long as this doesn’t violate a support agreement with Microsoft). Sometimes VMware best practice will contradict the NetApp guidelines, which can be tricky. If you are unsure how to proceed, reach out and get advice from experts. Sometimes you can find deployment documents written jointly by your storage vendor and VMware.

Use the correct Allocation Unit Size when formatting disks

When you format a disk for Exchange, make sure you change the allocation unit size from Default to 64K, unless your vendor best practice documents (see Follow Best Practice above) indicate otherwise! If you aren’t the one who formatted the disk, you can check the allocation unit size with this command (assuming the disk you are looking at is M:):
fsutil fsinfo ntfsinfo m:
The line “Bytes Per Cluster” is the allocation unit size; a 64K volume reports 65536. You can also use diskpart: after selecting the volume, DISKPART> filesystems will show the allocation unit size.
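If you have many volumes to check, the "Bytes Per Cluster" test can be scripted. Here is a minimal Python sketch, assuming you have already captured the text output of the fsutil command. The sample text and the function names are illustrative only and not part of any Exchange tooling; real fsutil output has more lines and the spacing differs between Windows versions.

```python
import re

# Illustrative sample only: real `fsutil fsinfo ntfsinfo` output has more
# lines, and the spacing differs between Windows versions.
SAMPLE_OUTPUT = """\
Bytes Per Sector  :               512
Bytes Per Cluster :               65536
"""

EXPECTED_BYTES = 64 * 1024  # 64K allocation unit size


def bytes_per_cluster(fsutil_output):
    """Return the 'Bytes Per Cluster' value from fsutil output, or None."""
    match = re.search(r"Bytes Per Cluster\s*:\s*([\d,]+)", fsutil_output)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


def is_64k(fsutil_output):
    """True when the reported allocation unit size is exactly 64K."""
    return bytes_per_cluster(fsutil_output) == EXPECTED_BYTES


print(is_64k(SAMPLE_OUTPUT))
```

A volume formatted with the default 4K allocation unit size would report 4096 and fail this check.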

TCP Chimney Offload and Receive Side Scaling should be disabled on Network Controllers

Always use these commands to disable both features on each Exchange DAG member:
netsh int tcp set global chimney=disabled
netsh int tcp set global rss=disabled
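To confirm the change took effect, you can review the output of netsh int tcp show global on each DAG member. Here is a rough Python sketch that flags the two settings if they are not disabled, assuming you have saved that output as text. The sample text below is illustrative; the exact labels and the set of lines vary by Windows version, so treat the label names as assumptions to verify on your own servers.

```python
# Illustrative sample of `netsh int tcp show global` output; the exact
# labels and the set of lines vary by Windows version.
SAMPLE = """\
TCP Global Parameters
----------------------------------------------
Receive-Side Scaling State          : enabled
Chimney Offload State               : disabled
"""

# Assumed label names for the two settings discussed in this section.
WATCHED = ("Receive-Side Scaling State", "Chimney Offload State")


def parse_settings(text):
    """Split 'Name : value' lines into a dict of lowercase values."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            settings[name.strip()] = value.strip().lower()
    return settings


def not_disabled(text):
    """Return the watched settings that are present but not 'disabled'."""
    settings = parse_settings(text)
    return [name for name in WATCHED
            if name in settings and settings[name] != "disabled"]


print(not_disabled(SAMPLE))
```

An empty list means both settings were found disabled, or were not present in the text at all, so check that the labels match your Windows version.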

NEVER use DHCP to assign a DAG IP Address

Always use a static IP address for your DAG. I can’t tell you how many times I’ve been asked to assist with an Exchange outage, only to discover that the DHCP-assigned DAG IP address is a duplicate of another address on the network.

Separate Replication Traffic

Use separate network interfaces for Exchange replication traffic. Make sure you disable replication on the interfaces used for MAPI. It always helps to change the labels of these networks in Exchange so you know which one you are disabling! In several environments I’ve seen servers that did have a separate interface for replication traffic, but the DAG configuration didn’t have replication disabled on the other interface.

Define DAG heartbeat threshold settings accurately for your environment

This is probably the scenario I get asked about the most. I’ve seen environments turn off DRS because heartbeat timeouts during vMotion were causing problems on their servers. Always define the cluster heartbeat settings if a server will be subject to vMotion. Here are the commands to see the heartbeat settings (assuming my cluster name is DAG1), followed by the commands to change them:
Cluster /cluster:dag1 /list /prop:samesubnetdelay
Cluster /cluster:dag1 /list /prop:crosssubnetdelay
Cluster /cluster:dag1 /list /prop:crosssubnetthreshold
Cluster /cluster:dag1 /list /prop:samesubnetthreshold

Cluster /cluster:dag1 /prop samesubnetdelay=VALUE
Cluster /cluster:dag1 /prop samesubnetthreshold=VALUE
Cluster /cluster:dag1 /prop crosssubnetdelay=VALUE
Cluster /cluster:dag1 /prop crosssubnetthreshold=VALUE
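When choosing values, it helps to remember how the two settings combine: the delay is the interval between heartbeats in milliseconds, and the threshold is the number of missed heartbeats tolerated before a node is considered down. Here is a minimal Python sketch of that arithmetic. The function names, the example values and the pause length are placeholders for illustration, not recommendations; measure your own environment.

```python
def heartbeat_tolerance_ms(delay_ms, threshold):
    """Milliseconds of missed heartbeats tolerated before a node is
    considered down (delay between heartbeats x missed-heartbeat count)."""
    return delay_ms * threshold


def survives_pause(delay_ms, threshold, pause_ms):
    """True when a network pause (for example during a vMotion) is shorter
    than the heartbeat tolerance."""
    return pause_ms < heartbeat_tolerance_ms(delay_ms, threshold)


# Placeholder values, not recommendations.
print(heartbeat_tolerance_ms(1000, 5))        # 5000
print(survives_pause(1000, 5, 6000))          # False
print(survives_pause(1000, 10, 6000))         # True
```

The same calculation applies to the cross-subnet pair of settings.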

DAG Latency

This one is simple: site link latency for a DAG should NEVER exceed 500ms round trip. Make sure your network is able to support your Exchange environment.
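If you collect latency measurements between sites over time, a quick check against the 500ms limit is easy to script. Here is a small Python sketch, assuming you already have a list of latency samples in milliseconds from whatever tool you use. The sample values and function name are made up for illustration.

```python
MAX_LATENCY_MS = 500  # limit stated in this article


def latency_report(samples_ms):
    """Summarize latency samples against the 500ms limit."""
    over = [s for s in samples_ms if s > MAX_LATENCY_MS]
    return {
        "worst_ms": max(samples_ms),
        "over_limit": len(over),
        "ok": not over,
    }


# Hypothetical measurements between two sites, in milliseconds.
print(latency_report([42, 55, 61, 510, 48]))
```

Looking at the worst sample rather than the average matters here, because a single spike over the limit is what the DAG will actually experience.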

Make sure VMDKs are provisioned right

If you follow the vendor best practice documents (see Follow Best Practice above), you’ll know there is only one way to provision VMDKs on Exchange servers. Always remember these four words: Thick Provision Eager Zeroed
And make sure you get those screenshots if you don’t have access to your virtualization environment!

VMFS volumes for Exchange should be segmented from other services

Exchange should never have to fight for I/O. Dedicate VMFS volumes to Exchange; this is especially important in large organizations.

Monitor your backups!

If you don’t have something monitoring your Exchange backups, set it up now. This problem will creep up on you rather quickly and the consequences can be catastrophic, resulting in email data loss. Know how long your environment can withstand a backup failure before storage fills up and define your monitoring process to alert accordingly.
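To put a number on "how long your environment can withstand a backup failure", you can estimate the runway from free space and log growth. Here is a minimal Python sketch, assuming transaction logs keep accumulating until the next successful backup and that growth is roughly steady. The function names and the figures are hypothetical; plug in your own measurements.

```python
def hours_until_full(free_gb, log_growth_gb_per_hour):
    """Estimate hours until the log volume fills if backups keep failing."""
    if log_growth_gb_per_hour <= 0:
        raise ValueError("log growth rate must be positive")
    return free_gb / log_growth_gb_per_hour


def should_alert(free_gb, log_growth_gb_per_hour, lead_time_hours):
    """True when the remaining runway is at or below the time you need
    to react (lead_time_hours)."""
    return hours_until_full(free_gb, log_growth_gb_per_hour) <= lead_time_hours


# Hypothetical figures: 120 GB free, logs growing at 2.5 GB per hour.
print(hours_until_full(120, 2.5))       # 48.0
print(should_alert(120, 2.5, 24))       # False
print(should_alert(50, 2.5, 24))        # True
```

Set the lead time to however long it realistically takes your team to notice an alert and get a backup to complete.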

Comments (3)

badabing1

Commented: 2015-07-30

Brilliant article!
Thanks

Rik Tammegat

Commented: 2017-01-30

About storage... my pet peeve :)

I agree with your recommendation about full provisioning for log disks. These grow and shrink periodically, especially when using log backups. The storage array therefore has to allocate and deallocate constantly, which only adds overhead. I do, however, want to put up a side note for thin provisioning with the database disks. Some storage systems detect "zero-outs" on thin provisioned LUNs. The effect is that the writes are acknowledged to the Exchange servers, but won't get written to disk. Result is a faster "zero-out" than with full provisioning.

Bottom line: Know your storage box.

Albert Widjaja

Commented: 2017-03-21

Thanks for sharing RKluxen,

I couldn't agree more on those points that you've shared above.

What about:

1. setting the swapfile to a static size?
2. disabling IPv6?
3. skipping the .NET Framework update on all servers running Exchange Server?