Note: this article covers simple compression. Oracle introduced in version 11g release 2 a new feature called Advanced Compression which is not covered here.
General principle of Oracle compression
Oracle compression is a way of reducing the disk space needed to store data blocks. Oracle compression is included in Enterprise Edition only. Compression is achieved by replacing repeating values in each column within a data block with a code. For example if the following rows are stored in the same data block
tx_id tx_date first_name last_name amount1 01.01.2012 Scott Tiger 1002 01.01.2012 Scott Tiger 59.5
the repeating values in the transaction_date, first_name and last_name columns can each be replaced by a numeric code. The data block will contain a code table which allows Oracle to look up the values corresponding to each code in order to compress and decompress the data.
Advantages and disadvantages of Oracle compression
The most obvious advantage of compression is the reduced disk space required to store the data. Compression ratios (the original data volume divided by the compressed data volume) of 2…4 are typical, although depending on the data the achieved compression can be much higher.
A beneficial side effect of the reduced disk space is faster I/O. Since less disk space is used to store the data, reading (and writing) is also faster, roughly by the same factor as the compression ratio.
Compressing and decompressing the data will cause a slight overhead as the code values need to be converted to actual values and vice versa, but in reality this overhead is negligible.
A drawback of simple table compression is that it works only with the following methods for inserting data into a table:
Direct-path insert (e.g. INSERT /*+ APPEND */ INTO …)
CREATE TABLE … AS SELECT…
Direct path SQL Loader
Data inserted into a compressed table in any other way will not be compressed (this limitation does not exist with Advanced Compression introduced in Oracle 11g Release 2).
Table compression in practice
Compression can be specified when creating a new table, e.g.:
This will reorganize the table and compress the contents.
Updates, deletes and reorganization
Updates in a compressed table may cause row migration. This happens when the updated row does not fit into the original data block and is moved to a new block. This leaves a “hole” in the original data block. Deletions also cause holes in the compressed in the compressed data blocks. These holes will not be filled since the new data is being added through direct-path inserts. If updates and deletes are performed on a compressed table, the compression ratio will deteriorate over time. In order to fix this a periodic reorganization of the table can be done:
In order to achieve the best compression ratio requires some trial and error. The key to high compression ratios is to order the data in a way that places repeating values in wide columns (typically strings) in the same data block. Experimenting with the ORDER BY clause will give an idea about the best sort order when inserting data.
Compression and partitioned tables
Compression can be specified individually for partitions and sub-partitions. When COMPRESS or NOCOMPRESS is not explicitly specified for a partition or sub-partition, the compression setting is inherited from the table specification. When doing updates and deletes on a partitioned table, the table reorganization needs to be done per partition or sub-partition, e.g.:
Oracle compression brings advantages in the form of reduced disk space usage and improved I/O speed. It is most suitable for data that does not change (e.g. in a data warehouse), although table reorganization can be used to re-compress the data after updates and deletes.
Comments (0)