SQL Server 2008 Data Compression

Posted on 2009-06-27
Last Modified: 2012-05-07
Hi experts, I am studying SQL Server 2008, but I do not understand these points:
1. ROW compression enables storing fixed length types in variable length format
2. PAGE compression is built on top of ROW compression (Column Prefix and Page Level Dictionary)
3. Compression comes with additional CPU cost
Please, I need very clear examples.
Question by:nolberta
LVL 14

Assisted Solution

top_rung earned 140 total points
ID: 24728726
Here is a good explanation...

1. Row-level compression drastically reduces the metadata needed for variable-length columns. Instead of storing the length of each field in 2 bytes, it now takes a mere 3 bits. Fields themselves are also smaller: storing a 1 in an int field now takes only a single byte, though of course larger values may use up to 4 bytes.
2. Page-level compression allows common data to be shared across rows (example given in the linked article).
3. Compression at any level will require more CPU processing.
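A hedged T-SQL sketch of point 1 (the table name dbo.Orders is illustrative, not from the question; data compression requires SQL Server 2008 Enterprise or Developer edition):

```sql
-- Rebuild an illustrative table with ROW compression enabled.
-- Fixed-length columns (e.g. int) are then stored in variable-length
-- format, so a small value such as 1 occupies a single byte.
ALTER TABLE dbo.Orders
REBUILD WITH (DATA_COMPRESSION = ROW);
```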

Does this help?

LVL 14

Accepted Solution

top_rung earned 140 total points
ID: 24728742
More info.....

Storing values in variable-length format will typically get you some space savings, but there is a catch. When a fixed-length value is stored in variable-length format, SQL Server needs to store its offset within the row. Note, this offset was not needed when the column value was stored as fixed length, because its relative position in the row was always at the same offset....

...ROW compression means storing fixed length data in variable length storage format and PAGE compression means minimizing the data redundancy....

...enabling compression is an expensive operation [CPU] because the uncompressed data needs to be converted to the new record format and, for PAGE compression, the data redundancy needs to be minimized...
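Because that conversion is CPU- and I/O-intensive, one way to soften the impact (a sketch only; dbo.Orders is a hypothetical name, and ONLINE = ON requires Enterprise edition) is to perform the rebuild online so the table stays available:

```sql
-- Convert existing rows to the new record format without taking
-- the table offline for the duration of the rebuild.
ALTER TABLE dbo.Orders
REBUILD WITH (DATA_COMPRESSION = ROW, ONLINE = ON);
```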

LVL 14

Assisted Solution

top_rung earned 140 total points
ID: 24728786
And lastly...

When you enable Row Level Compression, you are only changing the physical storage format of the data associated with a data type. There is absolutely no change required at the application level once a DBA enables Row Level Compression on one or more tables within a database.

Page Level Compression is basically a superset of Row Level Compression, and it only occurs once a page is full, to optimize performance. Page Level Compression uses Column Prefix and Dictionary Compression along with Row Level Compression. Column-prefix compression looks for a common prefix among the values of a column on a page. Let's say the column stores values like Memory, Memorably and Memorizer within a page for different rows; the SQL Server 2008 Storage Engine will then store the value Memor only once in the page and will refer to this value for all the other occurrences.
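As a sketch of enabling the superset (again, dbo.Orders and IX_Orders_Date are hypothetical names, not from the question):

```sql
-- PAGE compression implies ROW compression, then adds column-prefix
-- and dictionary compression once a page fills up.
ALTER TABLE dbo.Orders
REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Indexes are compressed separately from the base table.
ALTER INDEX IX_Orders_Date ON dbo.Orders
REBUILD WITH (DATA_COMPRESSION = PAGE);
```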

Of the three articles posted, this one will probably be the most useful...

LVL 51

Assisted Solution

by:Mark Wills
Mark Wills earned 60 total points
ID: 24731642
The last article from top_rung, whilst largely a reasonable account of how it works and how to set it up, has a couple of "finer" points about specifics...

From :

We read one response to that same paragraph from Marcel van der Holst - MSFT, which says :

The article is incorrect. When calculating the prefix, SQL Server will pick the longest column value that totally matches the prefix as prefix value. So your second example is almost correct. Instead of NULL for the last value, SQL Server actually uses the value EMPTY, which is a special value to indicate that the value completely matches the prefix.


SQL Server Storage Engine Team (Compression Feature dev).

LVL 51

Assisted Solution

by:Mark Wills
Mark Wills earned 60 total points
ID: 24731751

There are three very good articles from Microsoft resources...
1) from the "team" :
2) from whitepapers : look for the data compression whitepaper :
3) A case study :

My own thoughts on the matter are that compression can pay dividends, but you must do your homework. Certainly saving space is readily achieved, and in theory a smaller data footprint can mean an overall improvement in operating efficiency (e.g. more data per page = more data being cached).

But I also think it becomes more important as various hardware constraints become evident when dealing with "significant" database sizes. And it also depends on your database design. If (for example) you are highly data-capture driven (e.g. sales orders) and the transaction details are highly unique, then how much are you really going to benefit?

We can sometimes see this with a simple "ZIP" program, and while it is a little inappropriate to compare, the analogy is reasonable... Why do some things zip into a fraction of their size, while others only save a small percentage? It has to do with data density and uniqueness of that data. Some things will clearly benefit from compression whilst others won't.
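"Doing your homework" can start with the built-in estimator (a sketch; the object names are illustrative). It loads a sample of the table into tempdb, compresses it, and reports current versus estimated compressed sizes:

```sql
-- Estimate the savings before committing to a rebuild.
EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'Orders',
    @index_id         = NULL,   -- all indexes
    @partition_number = NULL,   -- all partitions
    @data_compression = 'PAGE';

-- Check what is currently compressed.
SELECT OBJECT_NAME(object_id) AS table_name,
       index_id, partition_number, data_compression_desc
FROM sys.partitions
WHERE object_id = OBJECT_ID('dbo.Orders');
```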

In that case study, for example, we also see a pretty significant hardware environment:
For high availability, the SQL Server 2008 instance supporting the SAP database was deployed using Windows Server 2003 Failover Clustering technology to create a two-node active/passive cluster using two HP Integrity Superdome computers, each with 32 processors and 128 GB of RAM. SAP applications are hosted on nine blade servers, each with two dual-core processors and 16 GB of RAM.

So, whilst seemingly attractive, I do believe its origins lie in data-warehousing requirements, and it was successful enough to be considered for OLTP-type environments; you do need to understand the environment and the databases to judge the real benefits in day-to-day operational terms, not just the space saving.
LVL 14

Expert Comment

ID: 24751068
Mark Wills,

Understood. Up until this thread, it was commonly assumed that a link along with information implied that the information presented was taken from the supplied link. For most using the "interwebs" ;) that is simply common logic. As far as allowing the link to speak for itself, providing a snippet of the text is more a courtesy to the asker (and actually a service to EE) so that relevant information would be present on EE moving forward in case the link "dies" in the future. But again, I understand your concern.

Your point is taken, and quotation marks will be supplied around the text to ensure that anyone daring enough to challenge copyrights on something this trivial won't have any ground to stand on - well, not as much ;)

Purely out of curiosity, since I have never seen it done on EE, do Zone Advisors or administrators also provide answers to questions, and do they accumulate points?


LVL 14

Expert Comment

ID: 24756780
Thanks Mark! I didn't realize the advisors/admins were volunteers - I commend everyone's work! Your explanation and thoroughness are why I have come to love EE and why I continue to advocate on the site's behalf amongst my peers and anyone I come into contact with.


LVL 51

Expert Comment

by:Mark Wills
ID: 24760825
A pleasure, and hope to share a few more questions with you in the future...

Author Comment

ID: 24762941
I am very satisfied.

Question has a verified solution.
