There are many articles online describing SQL Server data compression. In this post I will summarize some of the salient points.

Feature Image / License

What is SQL Server Data Compression

SQL Server data compression has been around since SQL Server 2008. To give perspective, SQL Server 2008 and 2008 R2 reach end of life for extended support this July.

Basically we can compress tables and indexes i.e. the data structures in an RDBMS.

Here is the prosaic KB from Microsoft about SQL Server data compression.

Types of Compression

There are 2 different kinds of SQL Server data compression:

Row Level Compression
Page Level Compression

Row Level Compression

Row level compression is the more primitive type of compression. It is superceded by page level compression. This means we cannot have page level compression without already having row compression.

It looks to compress string data types e.g. varchar, char, nvarchar, nchar, text, etc. It can ignore zeros and null marks as well.

Page Level Compression

Page level compression builds off of row level compression and adds additional compression. This will save us more space than row level compression. Further explanation of the compression can be found here.

Before Compressing

A basic tenant of performance tweaking is to establish a baseline and evaluate changes over time. It is no different here – we want to see how much compression saved us.

Do not attempt to compress system tables!

There are some system views and DMVs that can help:

sys.master_files – this shows us each file in the database instance along with some details.
sys.database_files – a database level view of the database files on disk.
sys.dm_os_volume_stats – shows us basic info about the OS volume where the database files are stored.

Additionally you might want to consider testing some queries with the IO statistics turned on – giving you a measurement you can judge progress from.

SET STATISTICS IO ON
SELECT * FROM Purchasing.PurchaseOrderDetail where OrderQty > 500;
SET STATISTICS IO OFF

SET STATISTICS IO ON

SELECT * FROM Purchasing.PurchaseOrderDetail where OrderQty > 500;

SET STATISTICS IO OFF

How Much Space Can We Save?

Fortunately this is simple to see. Microsoft includes a system procedure called sp_estimate_data_compression_savings. It returns the size of the object we ask about and estimates how much space can be saved by the type of compresssion we are considering to apply.

We can call this like:

exec sp_estimate_data_compression_savings @schema, @object, @index, @partition, @typeofcompression;

1	exec sp_estimate_data_compression_savings @schema, @object, @index, @partition, @typeofcompression;

Some examples of calling it:

Pay attention to the column “size_with_requested_compression_setting_(KB)”. This is what we are looking for.

Demo of sp_estimate_data_compression_savings - PAGE and ROW

Why Compress?

There is always a bottleneck to any process. Some resource necessary for the computation will lag behind another. Typically the hierarchy looks like this:

CPU is orders of magnitude faster than disk!

Like encryption, the brunt of the work moves from the disk to the CPU. Since CPU is orders of magnitude faster than disk IOPS we realize a performance gain.

Enabling SQL Server Data Compression

Although this can be done in the GUI (SSMS) I prefer to use T-SQL. It allows for more granular control and the ability to automate later.

Microsoft reference KB about enabling compression

Here is an example of how to enable data compression on an object:

ALTER TABLE [Purchasing].[PurchaseOrderDetail] REBUILD PARTITION = ALL
WITH (DATA_COMPRESSION = PAGE);

1 2	ALTER TABLE [Purchasing].[PurchaseOrderDetail] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE);

Similar logic can be used for compressing an index.

Using iteration (while loops or cursor) we can iterate through database objects and compress / decompress in large numbers at a time.

Which Tables and Indexes are Compressed?

We can see what tables and indexes are compressed or not by using simple system catalog views:

SELECT DISTINCT
 	 s.[name]
	,t.[name]
	,i.[name]
	,i.[type]
	,i.index_id
	,p.partition_number
	,p.[rows]
FROM 
	sys.tables t
	LEFT OUTER JOIN sys.indexes i ON t.object_id = i.object_id
	INNER JOIN sys.schemas s ON t.schema_id = s.schema_id
	LEFT OUTER JOIN sys.partitions p ON i.index_id = p.index_id
		AND t.object_id = p.object_id
WHERE 
	t.type = 'U'
	AND p.data_compression_desc = 'NONE'
ORDER BY p.rows desc

SELECT DISTINCT

s.[name]

,t.[name]

,i.[name]

,i.[type]

,i.index_id

,p.partition_number

,p.[rows]

FROM

sys.tables t

LEFT OUTER JOIN sys.indexes i ON t.object_id = i.object_id

INNER JOIN sys.schemas s ON t.schema_id = s.schema_id

LEFT OUTER JOIN sys.partitions p ON i.index_id = p.index_id

AND t.object_id = p.object_id

WHERE

t.type = 'U'

AND p.data_compression_desc = 'NONE'

ORDER BY p.rows desc

Disabling SQL Server Data Compression

Like the above example we follow similar logic:

ALTER TABLE [Purchasing].[PurchaseOrderDetail] REBUILD PARTITION = ALL
WITH (DATA_COMPRESSION = NONE);

1 2	ALTER TABLE [Purchasing].[PurchaseOrderDetail] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = NONE);

Here is the Microsoft KB explaining disabling compression.

By using some looping or batches with dynamic SQL we can do something like this snippet:

    SELECT    cs_obj_name = CHECKSUM(d.OBJ_NAME), *
    INTO    #comp_tbls
    FROM(
        SELECT    DISTINCT OBJ_NAME = OBJECT_NAME(i.[object_id]), i.[object_id],                     
                CASE WHEN i.[type]=0
                    /* Heap     */ THEN ' ALTER TABLE '+ QUOTENAME(OBJECT_NAME(i.[object_id])) +' REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = NONE);'    
                    /* Index */ ELSE ' ALTER INDEX '+ QUOTENAME(i.name)+' ON '+QUOTENAME(OBJECT_NAME(i.[object_id]))+' REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = NONE);'
                END + CHAR(13) AS cmd    
        FROM    sys.partitions prt WITH(NOLOCK)
                INNER JOIN sys.indexes i WITH(NOLOCK) ON i.[object_id] = prt.[object_id] and prt.index_id = i.index_id
--can also join a table with what you want to compress/decompress
        WHERE    prt.[data_compression] &lt;> 0

SELECT cs_obj_name = CHECKSUM(d.OBJ_NAME), *

INTO #comp_tbls

FROM(

SELECT DISTINCT OBJ_NAME = OBJECT_NAME(i.[object_id]), i.[object_id],

CASE WHEN i.[type]=0

/* Heap */ THEN ' ALTER TABLE '+ QUOTENAME(OBJECT_NAME(i.[object_id])) +' REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = NONE);'

/* Index */ ELSE ' ALTER INDEX '+ QUOTENAME(i.name)+' ON '+QUOTENAME(OBJECT_NAME(i.[object_id]))+' REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = NONE);'

END + CHAR(13) AS cmd

FROM sys.partitions prt WITH(NOLOCK)

INNER JOIN sys.indexes i WITH(NOLOCK) ON i.[object_id] = prt.[object_id] and prt.index_id = i.index_id

--can also join a table with what you want to compress/decompress

WHERE prt.[data_compression] <> 0

After Compression

Ok now we have compressed objects and see performance gains. That’s great! But what about the disk space that was freed up? Can we reclaim it?

Yes – shrinking the database will deallocate the disk space to the object and return it to the OS.

Shrinking databases should be used with caution! This is a legitimate example of when to use it.

What We Didn’t Cover

There is a lot to cover for SQL Server data compression. I did not attempt to cover it all but rather provide a crash course.

Some of the things we omitted:

Columnstore compression
COMPRESS function in SQL Server 2017
Backup compression
Replication and compression

If you liked this post then you might also like: SQL Server on VMware Best Practices – How to Optimize the Architecture

Did you find this helpful? Please subscribe!

One thought on “SQL Server Data Compression – Crunch Time!”

Wait Statistics in SQL Server - Performance Troubleshooting – MlakarTechTalk says:

March 8, 2019 at 6:01 am

[…] If you liked this post then you might also like: SQL Server Data Compression – Crunch Time! […]

MlakarTechTalk

All Things Data | InfoSec | Privacy | Technology

SQL Server Data Compression – Crunch Time!

What is SQL Server Data Compression

Types of Compression

Row Level Compression

Page Level Compression

Before Compressing

How Much Space Can We Save?

Why Compress?

Enabling SQL Server Data Compression

Which Tables and Indexes are Compressed?

Disabling SQL Server Data Compression

After Compression

What We Didn’t Cover

Did you find this helpful? Please subscribe!

Like this:

One thought on “SQL Server Data Compression – Crunch Time!”

Leave a CommentCancel reply

What is SQL Server Data Compression

Types of Compression

Row Level Compression

Page Level Compression

Before Compressing

How Much Space Can We Save?

Why Compress?

Enabling SQL Server Data Compression

Which Tables and Indexes are Compressed?

Disabling SQL Server Data Compression

After Compression

What We Didn’t Cover

Did you find this helpful? Please subscribe!

Share this:

Like this:

One thought on “SQL Server Data Compression – Crunch Time!”

Leave a CommentCancel reply