sql server 2008 Data Compression

Hi experts, i am studying sql server 2008, but i do not undertand this points:
1. ROW compression enables storing fixed length types in variable length format
2. PAGE compression is built on top of ROW compression (Column Prefix and Page Level Dictionary)
3. Compression comes with additional CPU cost
Please, i need examples very clear
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Here is a good explanation...

1.  Row level compression drastically reduces the meta-data needed for variable length columns. Instead of storing the length of each field in 2 bytes, it now takes a mere 3 bits. Fields themselves are also smaller. Storing a 1 in an int field now only takes a single byte, though of course larger values may use up to 4 bytes.
2.  Page level compression  allows common data to be shared across rows.   (example given in link above).
3.  Compression on any level will require more CPU processing.

Does this help?

More info.....

Storing values in variable length format will typically get you some space savings, but there is a catch. When a fixed length value is stored in variable length format, the SQL Server needs to store its offset within the row.  Note, this offset was not needed when the column value was stored as fixed length because its relative position in the row was always at the same offset....

...ROW compression means storing fixed length data in variable length storage format and PAGE compression means minimizing the data redundancy....

...enabling compression is an expensive operation [CPU]because the uncompressed data needs to be converted to new record format and for PAGE compression, the data redundancy needs to be minimized...



Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
And lastly...

When you are enabling Row Level Compression you are only changing the physical storage format of the data which is associated with a data type. There is absolutely no change required at the application level once a DBA enables Row Level Compression on one or more tables within a database.

Page Level Compression is a basically a superset of Row Level Compression and it will only occur when the Page is full to optimize the performance. Page Level Compression also uses Column Prefix and Dictionary Compression along with Row Level Compression. For example, if a table is partitioned using a column prefix, then all the data within a specific partition will have same or similar prefix. Lets say that the column stores values like Memory, Memorably and Memorizer within a page for different rows, then the SQL Server 2008 Storage Engine will store the value Memor only once in the Page and will refer to this value for all the other occurrences.

This article of the three posted will probably be the most useful...
Newly released Acronis True Image 2019

In announcing the release of the 15th Anniversary Edition of Acronis True Image 2019, the company revealed that its artificial intelligence-based anti-ransomware technology – stopped more than 200,000 ransomware attacks on 150,000 customers last year.

Mark WillsTopic AdvisorCommented:
The last article from top_rung whilst largely a reasonable account of how it works, how to setup, there are a couple of "finer" points about specifics...

From : http://social.msdn.microsoft.com/Forums/en-US/sqldatabaseengine/thread/0d7d41f6-3162-4bef-bc5a-a5f4d64420fc

We read one response to that same paragraph from Marcel van der Holst - MSFT, which says :

The article is incorrect. When calculating the prefix, SQL Server will pick the longest column value that totally matches the prefix as prefix value. So your second example is almost correct. Instead of NULL for the last value, SQL Server actually uses the value EMPTY, which is a special value to indicate that the value completely matches the prefix.


SQL Server Storage Engine Team (Compression Feature dev).

Mark WillsTopic AdvisorCommented:

There are three very good articles from Microsoft resources...
1) from the "team" : http://blogs.msdn.com/chadboyd/archive/2007/07/28/katmai-sql-2008-data-compression-including-backup-compression.aspx
2) from whitepapers : http://www.microsoft.com/sqlserver/2008/en/us/white-papers.aspx look for the data compression whitepaper : http://www.unisys.com/eprise/main/admin/corporate/doc/41371394.pdf
3) A case study : http://www.microsoft.com/caseStudies/Case_Study_Detail.aspx?casestudyid=4000003962

My own thoughts on the matter is compression can pay dividends, but you must do your homework. Certainly saving space is readily achieved, and in theory have a small data footprint, it can mean an overall improvement in operating efficiency (e.g. more data per page = more data being cached).

But also think that it becomes more important as various hardware contraints become evident when dealing with "significant" databases sizes. And it also depends on your database design. If (for example) you are highly data capture driven (e.g. sales orders) and the transaction details are highly unique, then how much are you really going to benefit ?

We can sometimes see this in a simple "ZIP" program and while a little inappropriate to compare, the analogy is reasonable... Why do some things zip into a fraction of the size, and others only save a small percentage ? It has to do with data density and uniqueness of that data. Some things will clearly benefit from compression whilst others won't.

In that case study for example, we also see some pretty significant hardware environment :
For high availability, the SQL Server 2008 instance supporting the SAP database was deployed using Windows Server 2003 Failover Clustering technology to create a two-node active/passive cluster using two HP Integrity Superdome computers, each with 32 processors and 128 GB of RAM. SAP applications are hosted on nine blade servers each with two dual-core processors and 16 GB of RAM

So, whilst seemingly attractive, I do believe that it's origins were born from datawarehousing requirements, and was successful enough to be considered for OLTP type environments, and you do need to understand the environment and databases as to the real benefits in day-to-day operational terms not just space saving.
Mark Wills,

Understood.  Up until this thread, it was commonly assumed that a link along with information implied  that the information presented was taken from the supplied link.  For most using the "interwebs" ;)  that is simply common logic.  As far as allowing the link to speak for itself,  providing a snippet of the text is more a courtesy to the asker (and actually is a service to EE) so that relevant information would be present on EE moving forward in the case that the link "dies" in the future.  But again, I understand your concern.

Your point is taken and " "  (quotation marks) will be supplied around the text to ensure that anyone daring enough to challenge copyrights on something this trivial won't have any ground to stand on - well not as much ;)

Purely of curiosity, since I have never seen it done on EE, do Zone Advisors or administrators also provide answers to questions and do they accumulate points?


Thanks Mark!  I didn't realize the advisors/admins were volunteers - I commend everyone's work!  Your explanation and thoroughness is why I have come to love EE and why I continue to advocate on the site's behalf amongst my peers and anyone I come into contact with.


Mark WillsTopic AdvisorCommented:
A pleasure, and hope to share a few more questions with you in the future...
nolbertaAuthor Commented:
i am very satisfied
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server 2008

From novice to tech pro — start learning today.