Personal Storage Table (PST) files have been around for years, and thousands of Microsoft Outlook users and email administrators out there would be lost without them. If you've worked with Outlook for very long, the name will immediately ring a bell; if you've ever administered Outlook, you may already know about the problems associated with this notorious file format.
In any corporate environment - or, for that matter, any environment with an Exchange Server - the use of PST files as a permanent solution to an email administrator's problems should be banned. Let's find out why.
Problem 1: File Sizes and Data Security
The number one issue with the PST format prior to Outlook 2003 was that it was ANSI (American National Standards Institute)-based. The ANSI PST format has a maximum size limit of 2GB, and other limitations exist with regard to the number of items which can be stored per folder. However, there was a particularly problematic bug which allowed data to be written to ANSI PSTs past the 2GB limit without warning. This would result in data loss, at least past the 2GB limit, but potentially loss of all the data stored in the file.
To address these concerns, Outlook 2003 introduced a new PST format based on Unicode instead. This format stores up to 20GB of data by default, but it should be noted that upgrading Outlook does not automatically upgrade any existing PST file(s). The conversion must be completed manually, by creating a new Unicode file and transferring the data across.
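If you are unsure which format an existing file uses, the file header can tell you. The following is a minimal Python sketch, not an official tool. It assumes the header layout described in Microsoft's published PST file format specification: the file starts with the bytes "!BDN", and a 16-bit version field at byte offset 10 holds 14 or 15 for ANSI files and 23 or higher for Unicode files. The function name pst_format is my own.

```python
import struct

PST_MAGIC = b"!BDN"        # first four bytes of a PST header
ANSI_VERSIONS = {14, 15}   # version values assumed to mark the ANSI format
UNICODE_MIN_VERSION = 23   # version values from 23 upward assumed to be Unicode


def pst_format(path):
    """Return 'ansi', 'unicode' or 'unknown' based on the file header."""
    with open(path, "rb") as handle:
        header = handle.read(12)
    if len(header) < 12 or header[:4] != PST_MAGIC:
        return "unknown"
    # The version field is a little-endian 16-bit integer at byte offset 10.
    (version,) = struct.unpack_from("<H", header, 10)
    if version in ANSI_VERSIONS:
        return "ansi"
    if version >= UNICODE_MIN_VERSION:
        return "unicode"
    return "unknown"
```

Any file reported as 'ansi' is a candidate for manual conversion before it gets anywhere near the 2GB limit. Run it against a copy of the file, with Outlook closed.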
Despite the improvements made, PST files are still susceptible to corruption issues - which will result in lost data. These become particularly prevalent as files become larger or you increase the volume of data which moves through the file. For most users, the prospect of losing precious or business-critical emails, reminders, tasks and contacts could be cause for significant concern. It shouldn't come as a surprise that you should make a regular backup of your PST file(s), but this is not completely safe, as a PST can go for weeks or months in a partially corrupted state before you realise you have a problem.
Problem 2: Network Access and Backups
PST files must be stored on a local hard disk. Accessing them over a network is not supported by Microsoft, and has not been since Exchange 4.0. Network instability, loss of connectivity and slow reads and writes against the file server can all cause problems. Furthermore, the mechanisms Outlook uses to read and write data in the PST file are not efficient when operated over a network; the commands used are optimised for handling as system calls to the local operating system. Passing these commands over the network interface is not efficient for the file server, and will cause the performance of that server to seriously deteriorate.
This has three implications for system administration:
First, backups, which are already difficult to maintain because corruption can go undetected, become ever more difficult to implement. As the PST cannot be run from the network, you must configure backups on each machine individually - and must ensure the backup does not run while Outlook is running. Backing up the Exchange Server is rather pointless, as the data is offloaded into the PST when the user downloads their email or hits the "Archive" button.
Second, your cost of administration increases significantly. Consider a typical organisation, which may have remote workers and several sites across different areas of the country or perhaps throughout the world. Moving administration away from the server and towards the client undermines the design principles of central administration, requiring more admin time to perform repetitive tasks on PST files. The system may quickly grow beyond your control, becoming ever more difficult to track and maintain.
Third, your network performance will suffer. As previously noted, the commands used by Outlook are not efficient file-handling mechanisms when passed over a network interface. File servers storing PST files for many users have been known to grind to a halt when handling multiple PST files. As an example, the expansion of a PST file to allocate more space on disk requires allocation in the Master File Table (MFT) on an NTFS partition. This is a blocking process, which locks the entire volume on the file server while it takes place, causing all other I/O to that volume to be queued. This results in a service interruption and potential server hangs for users storing PST files or other shared data on those volumes. This is a known and well-documented issue which can be traced back to raw data from performance monitor reports.
Problem 3: File Sharing and Remote Access
PST files do not natively support simultaneous sharing between multiple users. If you attempt to configure this, the mail file may be corrupted -- not to mention the fact you would need to run the file over the network, so problem #2, above, has already been invoked.
Storing data in PST files has no benefits for remote access either. Exchange's Outlook Web Access (OWA) (or Outlook Web App, in Exchange 2010) allows users to access their mailboxes remotely through an interface that closely resembles Outlook. Data in PST files has usually been removed from the mailbox, so it immediately becomes inaccessible to the remote user.
Problem 4: Inefficient use of resources
You've invested in a powerful Exchange Server. It has large, redundant disk arrays, processing power and RAM capacity; it cost you thousands to purchase the hardware and software licenses; and it adds significantly to your energy and data centre cooling bill. If PST files are in use, your server is essentially going to waste; the functionality of the server you are actually using is essentially the same as a free Linux mail server distribution running on an old workstation supporting POP3 clients.
Problem 5: Legal Implications
With the release of Exchange 2010, the Exchange Server product group introduced some fantastic new features which make it very easy to enforce legal holds, retention and compliance across your company's users - or a subset thereof - if required to do so. Multi-Mailbox Search makes it very easy for a legal team to examine the content of mailboxes in the organisation, perhaps retrieving material in defence of a lawsuit or locating evidence to build a case against an employee. There are many such scenarios where this would be required. Another one of the features, dubbed "Dumpster 2.0", ensures that when a mailbox is placed on hold, users are unable to permanently remove or alter anything in it: deleted items and earlier versions of modified items are retained. Mailbox data is effectively immutable, and every change is tracked.
Enter PST files. When the PST is in use, the data is removed from the Exchange Server and stored in local archive files scattered across your network. It becomes next-to-impossible to track what the user stores in those local files or to locate information which could act as vital evidence in a courtroom. The overhead for such discovery skyrockets, as PST files would need to be located and individually examined, which is prohibitive in terms of cost and time for your legal team. If the user wishes to hide evidence, they can do so easily, without any possibility of data recovery.
These new features are a fantastic improvement to Exchange, but are only 100% effective when all of an organisation's mail data is stored within the Exchange infrastructure.
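To get a feel for how much mail already lives outside Exchange, a first step is simply finding the PST files. The following is a rough Python sketch for inventorying a single directory tree, such as a user profile folder or a file share. The function names are my own, and it only reports paths and sizes; it makes no attempt to open or examine the files.

```python
import os
from pathlib import Path


def find_pst_files(root):
    """Yield (path, size_in_bytes) for every .pst file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".pst"):
                full = Path(dirpath) / name
                try:
                    yield full, full.stat().st_size
                except OSError:
                    # The file vanished or could not be read; skip it.
                    continue


def inventory(root):
    """Return the PST files under root, largest first."""
    return sorted(find_pst_files(root), key=lambda item: item[1], reverse=True)
```

Calling inventory() on each machine or share gives you a list to work through, but it also illustrates the problem: the search has to be repeated everywhere a user could have saved a file.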
Despite the considerations above, you might still be wondering how to work around those common problems which PST files are oh so convenient for solving.
Use 1: Archiving
This is a misconception, brought about largely by Outlook's desire to continue annoying its users with AutoArchive prompts. There is no reason whatsoever that mail should be archived to each user's local PC. Consider the actions you would take to archive files off your file server; where would you put the archived data? On your own PC? On your manager's? On the CEO's? You'd do none of those three, as the data is unlikely to be backed up, and you cannot assure data security. Instead, you'd find some space on a share on your archive server - or create a LUN using spare space on some SAN.
The same applies to email. Off-loading email from your Exchange Server to user PCs has significant risks attached to it. Instead, you should use an enterprise mail archiving solution. Previously, such actions would require the purchase of a third-party piece of software, but with Exchange 2010, the product group have again delivered: Personal Archives.
Retention and compliance serve their purpose in many industries which must assure the public that they are acting with integrity. However, as the amount of electronic data continues to grow, it becomes costly for an organisation to provision the resources to store such data in their central data centres. For Exchange, modern organisations might deploy one or more Exchange 2010 Database Availability Groups (DAGs), the clustered, highly available solution for ensuring mailbox data is always online and always available. The Exchange Servers which house mailbox data are likely to have very expensive disk subsystems, perhaps using a highly redundant configuration of 10k or 15k RPM Serial-attached SCSI (SAS) disks. The cost per GB of such storage can escalate very quickly, not to mention the RAM, CPU and power requirements for operating such environments, and the need to provision backups of rapidly changing mailbox data.
The Personal Archive feature in Exchange 2010 is one method of counteracting the costs of storing an ever-increasing amount of data. Data which reaches an age suitable for archiving is generally accessed less frequently than the newer data in a person's mailbox. As such, it seems absurd that highly available DAGs are set aside to handle 5+ year old data. On Exchange 2010 SP1 and higher, it is possible to store a user's personal archive in a separate mailbox database which is hosted on a different set of servers within the same Active Directory site -- so-called tiered storage. Servers can be dedicated to the storage of archive mailboxes, and as such, have lower requirements in terms of their performance and disk configurations. You could reduce overhead by, for example:
dropping back to 7200rpm SATA disks in your archive servers, since the data is modified less frequently and is generally in less demand on a day-to-day basis
dispensing with costly DAG configurations for archived data, provided it is acceptable that older data could be unavailable temporarily in the event of a server failure
running backups much less often, perhaps on a monthly rather than daily or weekly basis
The degree to which the infrastructure is cut back for archive mailboxes is a matter for an organisation to address internally based on its requirements. However, I can guarantee that the use of Personal Archives affords an Exchange Administrator and his/her compliance teams an improved level of access to corporate data. Multi-mailbox search can search personal archives in much the same way as a regular mailbox, making the compliance process a breeze in comparison to the PST approach.
From a user's perspective, personal archives are also much improved over the PST approach, as the data stored therein can still be accessed via Outlook Web App (OWA). Retention policies within Exchange allow the administrator and/or compliance team to control exactly when and how certain mailbox data is moved to the personal archive mailbox, removing the burden of archiving from the users and ensuring it is done correctly, every time.
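The idea behind an age-based archive rule can be sketched in a few lines of Python. This is a conceptual model only, not Exchange's implementation or API: the function name, the (subject, received) tuples and the single age threshold are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def split_by_age(items, archive_after_days, now=None):
    """Partition (subject, received) tuples into (primary, archive) lists."""
    now = now or datetime.now()
    cutoff = now - timedelta(days=archive_after_days)
    primary, archive = [], []
    for item in items:
        _subject, received = item
        # Anything received before the cutoff belongs in the archive.
        (archive if received < cutoff else primary).append(item)
    return primary, archive
```

The point is that the rule is applied centrally and consistently by the server, rather than relying on each user to respond to an AutoArchive prompt.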
Use 2: On the road
For users on the road, there is no need to store their mail in a PST file. Cached Exchange Mode has been available since the days of Exchange 2003 and Outlook 2003 and allows users to take a copy of their mailbox with them when they travel. When they reconnect to the network, the changes are seamlessly synchronised back to the server, including any messages they happen to write and send while in offline mode.
Use 3: ExMerge (Exchange 2003)/Export-Mailbox (newer versions)
This is just about the only use of PST files which I can agree to -- and I'll admit, I've used this approach myself in the past. If you migrate to a new mail system or rebuild your Exchange system, sometimes you cannot avoid using ExMerge to take handy copies of the mailboxes - which can later be re-imported to the new system.
However, most corporations are unlikely to rebuild their entire Exchange infrastructure by taking it offline and formatting the servers. Chances are, you will bring new mailbox servers online before decommissioning the old ones. In this case, there is no reason to use the PST export approach to migrating data; simply use the Move Mailbox tools to migrate the mailboxes between servers.
Within the same Exchange organisation, a mailbox move is seamless and very easy to implement. If performing a cross-forest move, the requirements are slightly more complex, as you must provision user accounts in the new forest with the proper information before commencing a mailbox move. However, the process is really quite simple and has stood the test of time, and is therefore the preferred method of moving mailbox data.
Use 4: Home Users
These are the people to whom the PST is most applicable. If you are connecting via Outlook to a Post Office Protocol (POP) host to download your email, that email will be stored in a PST file. The fact you don't have an Exchange Server doesn't change any of the points above, though; that PST is still susceptible to corruption. If the mail has already been deleted off the server, corruption could lead to data loss.
For this issue, you really have two solutions. The POP3 account in Outlook can be configured to leave email on the server. This acts as a backup; if your PST file becomes corrupted, the ISP still has a copy of your messages, so they can be downloaded again. To configure, open the Tools > Account Settings dialog in Outlook. Select your POP3 account, choose Properties, press More Settings, then switch to the Advanced tab. Under the Delivery section at the bottom of the window you should check the "Leave a copy of messages on the server" checkbox. If you want a backup of all your mail, don't enable the option to remove it from the server after a certain time period.
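At the protocol level, "leave a copy on the server" simply means the client retrieves each message but never asks the server to delete it. The following is a minimal sketch using Python's standard poplib module, not a description of how Outlook itself is written. The function name is my own, and it expects an already authenticated connection object such as one created with poplib.POP3_SSL.

```python
import poplib  # standard library; supplies the POP3 and POP3_SSL connection classes


def download_without_deleting(conn):
    """Fetch every message from an open POP3 connection, keeping the server copy."""
    messages = []
    _response, listings, _octets = conn.list()
    for listing in listings:
        # Each listing looks like b"1 1234": message number, then size in bytes.
        number = int(listing.split()[0])
        _resp, lines, _size = conn.retr(number)
        messages.append(b"\r\n".join(lines))
        # Deliberately no conn.dele(number): skipping the delete is what
        # leaves the message on the server as a fallback copy.
    return messages
```

A client that deletes after download would call dele() for each message before quitting, which is exactly the behaviour you are switching off in Outlook.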
The disadvantage to the POP3 solution becomes apparent if you move to another computer or access your mailbox via your ISP's webmail interface. The message state information (tracking of read/unread or whether the message has been replied to or forwarded) is not transferred back to the ISP, so all the mail you thought you had read and handled will still be marked unread on the ISP's server.
My preferred solution, and the one I use regularly, is an Internet Message Access Protocol (IMAP) account. The IMAP protocol is another mail protocol used to access email; it stands alongside POP. However, using IMAP, you replicate a client-server topology very similar to connecting to an Exchange mailbox with Outlook in Cached Exchange Mode. With IMAP, email generally remains stored in your mailbox at the ISP until you specifically delete it. Nevertheless, you can't get away from PST files completely; they are still there when you use an IMAP account, as Outlook uses them to make a cache of the data for working with the IMAP account in offline mode. However, as the PST isn't the only location where your data is stored, any corruption is not going to lead to data loss.
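Because IMAP keeps the read/unread state on the server, any client can ask for it. As a rough illustration, here is a sketch using Python's standard imaplib module. The function name is my own, and it expects an already logged-in connection such as one created with imaplib.IMAP4_SSL.

```python
import imaplib  # standard library; supplies the IMAP4 and IMAP4_SSL connection classes


def unread_message_ids(conn, mailbox="INBOX"):
    """Return the message IDs the server currently flags as unread."""
    # Open the mailbox read-only so that looking at it does not change any flags.
    status, _data = conn.select(mailbox, readonly=True)
    if status != "OK":
        raise RuntimeError(f"could not open {mailbox!r}")
    status, data = conn.search(None, "UNSEEN")
    if status != "OK":
        raise RuntimeError("search failed")
    # The server replies with one space-separated line of message IDs.
    return data[0].split() if data and data[0] else []
```

Run from two different machines against the same account, this returns the same answer, which is precisely what the POP3 approach cannot offer.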
It should be noted that both the POP solution for leaving data on the server and the IMAP solution have drawbacks, as items in your Calendar, Contacts or Tasks folders will not be stored on the server. IMAP does not support special folders - such as the Calendar or Tasks - and these will not be replicated back with a POP account, so you will still be using a PST file to some extent. Unless you move entirely into the cloud (use web services for email, calendar and contacts) or purchase your own Exchange Server, you won't be able to easily get away from this.
I've covered a fair bit of information regarding PST files here. Hopefully, my points detailing why the use of PSTs is so impractical will now encourage you to reconsider your PST usage, archiving practices and retention policies.
With all your user mail stored safely on the Exchange Server, rather than local PCs, assistants can become delegates for their managers, looking after their mailboxes; the administrator can rest assured that all data is centrally stored and backed up; and you can turn off Outlook AutoArchive, relieving end users of that annoying prompt every couple of weeks.