websss
asked on
Compress NVarChar column to use less disk space?
I have a very large table with 1 billion rows
One of the columns stores full address details (street, town, city, country)
Previously it was varchar(max) but since we moved to multilingual we have to change it to nvarchar(max) which doubles the size
Our database size is now growing by 10.5GB a day and I'm looking at ways to reduce this
Is there a way to compress this data and uncompress on the web app?
Open to ideas and suggestions
SQL Server 2017 enterprise
One of the columns stores full address details (street, town, city, country)
Previously it was varchar(max) but since we moved to multilingual we have to change it to nvarchar(max) which doubles the size
Our database size is now growing by 10.5GB a day and I'm looking at ways to reduce this
Is there a way to compress this data and uncompress on the web app?
Open to ideas and suggestions
SQL Server 2017 enterprise
I can't imagine an address that's more than 4K chars. Change it to nvarchar(4000) and you can use standard SQL Server page compression to drastically reduce the size. You'll trade a small amount of CPU for reduced I/O, which is usually a net big gain overall. It was for us.
If you leave it as nvarchar(max), it won't be compressed. You'd have to use COMPRESS / DECOMPRESS. COMPRESS is relatively very slow. We have even taken to writing the full nvarchar(max) data at transaction, and only later using COMPRESS in batch mode to reduce the size because it takes so long. DECOMPRESS is not nearly as bad, and only becomes an issue if you read a very large number of rows at one time.
If you leave it as nvarchar(max), it won't be compressed. You'd have to use COMPRESS / DECOMPRESS. COMPRESS is relatively very slow. We have even taken to writing the full nvarchar(max) data at transaction, and only later using COMPRESS in batch mode to reduce the size because it takes so long. DECOMPRESS is not nearly as bad, and only becomes an issue if you read a very large number of rows at one time.
ASKER
thanks, I could get away with nvarchar(4000) how would I enable compression ?
Rebuild the clustered index specifying "WITH ( DATA_COMPRESSION = PAGE, FILLFACTOR = 98 )" or whatever other reasonable fillfactor you want.
ALTER INDEX <clus_index_name> ON dbo.<table_name > REBUILD WITH ( DATA_COMPRESSION = PAGE, FILLFACTOR = 98, /*ONLINE = ON,*/ SORT_IN_TEMPB = ON );
ONLINE = ON is optional; if you don't specify it, the table will be locked so that no other task can use it while it is being rebuilt, not even to just read the table.
ALTER INDEX <clus_index_name> ON dbo.<table_name > REBUILD WITH ( DATA_COMPRESSION = PAGE, FILLFACTOR = 98, /*ONLINE = ON,*/ SORT_IN_TEMPB = ON );
ONLINE = ON is optional; if you don't specify it, the table will be locked so that no other task can use it while it is being rebuilt, not even to just read the table.
ASKER
If PAGE doesn't produce large (enough) savings, then use ROW. ROW really should be the default, unless you're severely CPU constrained.
With those results, honestly, you could go with either ROW or PAGE. PAGE compression takes significantly more time to initially INSERT, but reads are not that much affected. ROW compression is basically "free" except for a bit of CPU.
With those results, honestly, you could go with either ROW or PAGE. PAGE compression takes significantly more time to initially INSERT, but reads are not that much affected. ROW compression is basically "free" except for a bit of CPU.
ASKER
Thanks scott
I noticed 3 rows in the results, I assume these are indexes?
I have 3 indexes
So I should just apply this to the clustered Index?
I noticed 3 rows in the results, I assume these are indexes?
I have 3 indexes
So I should just apply this to the clustered Index?
No, not just to clus. Check the other indexes as well, with the same rules: ROW by default, PAGE if it significantly reduces the size. It's up to you to decide what a "significant" reduction is for each index, but anything 50% or more I'd almost surely compress (NOTE: I have lots of spare CPU on my main prod cluster, so I don't worry about CPU at all, you might have a different situation. Compression does take some CPU time.)
This question needs an answer!
Become an EE member today
7 DAY FREE TRIALMembers can start a 7-Day Free trial then enjoy unlimited access to the platform.
View membership options
or
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.
Don't know if the newer versions of SQL Server will allow an ALTER TABLE .. ALTER COLUMN to be performed WITH(ONLINE = ON). Hopefully another expert with more current knowledge can answer that.
Just for kicks and giggles please post the CREATE TABLE statement of the table we're talking about into this question.
Good luck.