Table data compression, a multi-valuable database feature.

Tomas Helgi JohannssonDatabase Administrator / Software Engineer
CERTIFIED EXPERT
Over 20 years of experience as an Application Architect/Developer, a Database Administrator, and a PM focusing on performance.
Published:
Data compression is a feature offered by many modern databases like Db2, Oracle, MySQL, MariaDB and SQLServer. However, many companies do not use this feature in their database systems. Either they lack the knowledge of its existence in the application/systems design phase or due to budget issues.

Data compression is a feature offered by many modern databases like Db2, Oracle, MySQL, MariaDB and SQLServer. However, many companies do not use this feature in their database systems. Either they lack the knowledge of its existence in the application/systems design phase or due to budget issues. Some DBMSs offer compression features for an additional license price while other DBMSs offer it free of charge, e.g. might be included in the general package, like in some open source DBMSs.

However, why should you use compression? In my opinion, it is fairly simple. The major benefits of compression are associated with space savings on disks which can range from 40 to 85 % (or more) depending on your data in the tables. Note that BLOB data does not compress well, or at all, as it is in most cases already compressed, e.g. JPEG, etc.. One major beneficial facet of compression is the transfer of data into buffer pools “as-is” which gives reduced I/O and increased hit ratio on the buffer pools and thus better buffer pool usage as more data is put into it.

By compressing data, your SQL queries will benefit on a larger scale as more data will exist in the memory at any given point in time. You will see overall performance improvements in your SQL workload which provide a reduction in CPU, execution time and I/O. The CPU of the compression and decompression process is very small in comparison to other query/DML processing on a uncompressed data. Regardless of the slight CPU overhead of the compression/decompression process, the benefits of using compression will drastically reduce other CPU query/DML workload by far.

Another beneficial aspect lies in the DBMS backup-processes backing up databases/tablespaces/tables "as-is". In doing so, backup sizes become smaller as tables are compressed and as a result awarding you with larger space savings in the backup systems as well as shorter backup and restore times.

Also, transferring backups to and from the database host through your network will take a shorter time as will data transfer between database clusters.

Consider the scenario where you have multiple very large tables, lets say you have 10, which each of them are 1 TB in size uncompressed, thus a total of 10TB of uncompressed data. All of the data is read and/or updated regularly. Querying such large tables require a database host with huge amount of memory and CPUs to handle the workload efficiently. Additionaly, backup and restore time of such large tables can take many hours to complete. Other management of such large tables can also take hours to do, some even with application downtime.

However, lets assume that you can compress each of the tables up to 80 percent. This means that your 10 TB of data is after compression only 2TB in size, that is each table is now only 200GB in size instead of 1TB. The major beneficial benefit you will see after the compression, besides reduced storage size, is that query execution time will drop up to 80 percent ( depending on query complexity ) as well as up to 80 percent reduced backup and restore time.

As discussed above, the benefits of adding compression can have a very good and positive development of your database overall cost, especially if you have very large databases without compression. If you have a DBMS that requires an additional license for compression, then I highly recommend an evaluation and comparison of the license price versus the cost-savings in disk space and CPU together with expected application performance improvements. If you have a DBMS that has the compression feature included in the license, or you are using open source DBMS with this feature, then I strongly urge you to use this effective feature in reducing database costs.

0
1,194 Views
Tomas Helgi JohannssonDatabase Administrator / Software Engineer
CERTIFIED EXPERT
Over 20 years of experience as an Application Architect/Developer, a Database Administrator, and a PM focusing on performance.

Comments (0)

Have a question about something in this article? You can receive help directly from the article author. Sign up for a free trial to get started.