Solved

sql server 2008 Data Compression

Posted on 2009-06-27
11
271 Views
Last Modified: 2012-05-07
Hi experts, i am studying sql server 2008, but i do not undertand this points:
1. ROW compression enables storing fixed length types in variable length format
2. PAGE compression is built on top of ROW compression (Column Prefix and Page Level Dictionary)
3. Compression comes with additional CPU cost
Please, i need examples very clear
0
Comment
Question by:nolberta
  • 5
  • 3
11 Comments
 
LVL 14

Assisted Solution

by:top_rung
top_rung earned 140 total points
ID: 24728726
Here is a good explanation...
http://www.infoq.com/news/2007/11/SQL-Server-Compression

1.  Row level compression drastically reduces the meta-data needed for variable length columns. Instead of storing the length of each field in 2 bytes, it now takes a mere 3 bits. Fields themselves are also smaller. Storing a 1 in an int field now only takes a single byte, though of course larger values may use up to 4 bytes.
2.  Page level compression  allows common data to be shared across rows.   (example given in link above).
3.  Compression on any level will require more CPU processing.

Does this help?



0
 
LVL 14

Accepted Solution

by:
top_rung earned 140 total points
ID: 24728742
More info.....

Storing values in variable length format will typically get you some space savings, but there is a catch. When a fixed length value is stored in variable length format, the SQL Server needs to store its offset within the row.  Note, this offset was not needed when the column value was stored as fixed length because its relative position in the row was always at the same offset....

...ROW compression means storing fixed length data in variable length storage format and PAGE compression means minimizing the data redundancy....

...enabling compression is an expensive operation [CPU]because the uncompressed data needs to be converted to new record format and for PAGE compression, the data redundancy needs to be minimized...



http://blogs.msdn.com/sqlserverstorageengine/archive/2007/11/12/types-of-data-compression-in-sql-server-2008.aspx



0
 
LVL 14

Assisted Solution

by:top_rung
top_rung earned 140 total points
ID: 24728786
And lastly...

When you are enabling Row Level Compression you are only changing the physical storage format of the data which is associated with a data type. There is absolutely no change required at the application level once a DBA enables Row Level Compression on one or more tables within a database.


Page Level Compression is a basically a superset of Row Level Compression and it will only occur when the Page is full to optimize the performance. Page Level Compression also uses Column Prefix and Dictionary Compression along with Row Level Compression. For example, if a table is partitioned using a column prefix, then all the data within a specific partition will have same or similar prefix. Lets say that the column stores values like Memory, Memorably and Memorizer within a page for different rows, then the SQL Server 2008 Storage Engine will store the value Memor only once in the Page and will refer to this value for all the other occurrences.


This article of the three posted will probably be the most useful...
http://www.sql-server-performance.com/articles/dba/Data_Compression_in_SQL_Server_2008_p1.aspxFifth
0
 
LVL 51

Assisted Solution

by:Mark Wills
Mark Wills earned 60 total points
ID: 24731642
The last article from top_rung whilst largely a reasonable account of how it works, how to setup, there are a couple of "finer" points about specifics...

From : http://social.msdn.microsoft.com/Forums/en-US/sqldatabaseengine/thread/0d7d41f6-3162-4bef-bc5a-a5f4d64420fc

We read one response to that same paragraph from Marcel van der Holst - MSFT, which says :

The article is incorrect. When calculating the prefix, SQL Server will pick the longest column value that totally matches the prefix as prefix value. So your second example is almost correct. Instead of NULL for the last value, SQL Server actually uses the value EMPTY, which is a special value to indicate that the value completely matches the prefix.

Thanks,

Marcel.
SQL Server Storage Engine Team (Compression Feature dev).





0
Get up to 2TB FREE CLOUD per backup license!

An exclusive Black Friday offer just for Expert Exchange audience! Buy any of our top-rated backup solutions & get up to 2TB free cloud per system! Perform local & cloud backup in the same step, and restore instantly—anytime, anywhere. Grab this deal now before it disappears!

 
LVL 51

Assisted Solution

by:Mark Wills
Mark Wills earned 60 total points
ID: 24731751

There are three very good articles from Microsoft resources...
1) from the "team" : http://blogs.msdn.com/chadboyd/archive/2007/07/28/katmai-sql-2008-data-compression-including-backup-compression.aspx
2) from whitepapers : http://www.microsoft.com/sqlserver/2008/en/us/white-papers.aspx look for the data compression whitepaper : http://www.unisys.com/eprise/main/admin/corporate/doc/41371394.pdf
3) A case study : http://www.microsoft.com/caseStudies/Case_Study_Detail.aspx?casestudyid=4000003962

My own thoughts on the matter is compression can pay dividends, but you must do your homework. Certainly saving space is readily achieved, and in theory have a small data footprint, it can mean an overall improvement in operating efficiency (e.g. more data per page = more data being cached).

But also think that it becomes more important as various hardware contraints become evident when dealing with "significant" databases sizes. And it also depends on your database design. If (for example) you are highly data capture driven (e.g. sales orders) and the transaction details are highly unique, then how much are you really going to benefit ?

We can sometimes see this in a simple "ZIP" program and while a little inappropriate to compare, the analogy is reasonable... Why do some things zip into a fraction of the size, and others only save a small percentage ? It has to do with data density and uniqueness of that data. Some things will clearly benefit from compression whilst others won't.

In that case study for example, we also see some pretty significant hardware environment :
For high availability, the SQL Server 2008 instance supporting the SAP database was deployed using Windows Server 2003 Failover Clustering technology to create a two-node active/passive cluster using two HP Integrity Superdome computers, each with 32 processors and 128 GB of RAM. SAP applications are hosted on nine blade servers each with two dual-core processors and 16 GB of RAM

So, whilst seemingly attractive, I do believe that it's origins were born from datawarehousing requirements, and was successful enough to be considered for OLTP type environments, and you do need to understand the environment and databases as to the real benefits in day-to-day operational terms not just space saving.
0
 
LVL 14

Expert Comment

by:top_rung
ID: 24751068
Mark Wills,

Understood.  Up until this thread, it was commonly assumed that a link along with information implied  that the information presented was taken from the supplied link.  For most using the "interwebs" ;)  that is simply common logic.  As far as allowing the link to speak for itself,  providing a snippet of the text is more a courtesy to the asker (and actually is a service to EE) so that relevant information would be present on EE moving forward in the case that the link "dies" in the future.  But again, I understand your concern.

Your point is taken and " "  (quotation marks) will be supplied around the text to ensure that anyone daring enough to challenge copyrights on something this trivial won't have any ground to stand on - well not as much ;)

Purely of curiosity, since I have never seen it done on EE, do Zone Advisors or administrators also provide answers to questions and do they accumulate points?

Thanks!



0
 
LVL 14

Expert Comment

by:top_rung
ID: 24756780
Thanks Mark!  I didn't realize the advisors/admins were volunteers - I commend everyone's work!  Your explanation and thoroughness is why I have come to love EE and why I continue to advocate on the site's behalf amongst my peers and anyone I come into contact with.

Respectfully!

0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 24760825
A pleasure, and hope to share a few more questions with you in the future...
0
 

Author Comment

by:nolberta
ID: 24762941
i am very satisfied
0

Featured Post

What Should I Do With This Threat Intelligence?

Are you wondering if you actually need threat intelligence? The answer is yes. We explain the basics for creating useful threat intelligence.

Join & Write a Comment

Suggested Solutions

I have written a PowerShell script to "walk" the security structure of each SQL instance to find:         Each Login (Windows or SQL)             * Its Server Roles             * Every database to which the login is mapped             * The associated "Database User" for this …
In this article we will get to know that how can we recover deleted data if it happens accidently. We really can recover deleted rows if we know the time when data is deleted by using the transaction log.
This video shows how to remove a single email address from the Outlook 2010 Auto Suggestion memory. NOTE: For Outlook 2016 and 2013 perform the exact same steps. Open a new email: Click the New email button in Outlook. Start typing the address: …
This tutorial demonstrates a quick way of adding group price to multiple Magento products.

747 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

13 Experts available now in Live!

Get 1:1 Help Now