SQL 2008 Dictionary Data Compression - is this only for WHOLE column matches?

For simplicity, suppose each row has a single varchar(50) column and these are my rows:

1) The quick brown fox jumps over the lazy dog
2) I like brown foxes
3) fox
4) fox

So page compression would find the word fox, for instance, and put it in the dictionary. Would only rows 3 and 4 benefit, while rows 1 and 2 would not because they contain other data as well? This is what seems to be happening empirically.

I have a bunch of XML data with highly repetitive segments (angle brackets, repeated attribute names, etc.). As a control case, I saved it out to a CSV and zipped it, and it compresses fantastically, but I get almost 0% compression in SQL Server. I've confirmed that SOME compression does take place by looking at page_count and compressed_page_count in sys.dm_db_index_physical_stats, so I know nothing in my structure inherently prevents compression altogether, such as rows that are too large.
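To make the distinction concrete, here is a minimal Python sketch of what I suspect is happening. It is a toy model and not SQL Server's actual algorithm: `whole_value_dictionary_savings` is a hypothetical helper that assumes a dictionary entry only helps when an entire column value repeats, and `zlib` stands in for the zip control case, which matches repeated substrings anywhere in the data. The `xml_rows` values are made up for illustration.

```python
import zlib
from collections import Counter

rows = [
    "The quick brown fox jumps over the lazy dog",
    "I like brown foxes",
    "fox",
    "fox",
]

def whole_value_dictionary_savings(values):
    """Toy model (an assumption, not SQL Server's implementation):
    a value is stored once, and every additional exact repeat costs 0 bytes."""
    counts = Counter(values)
    return sum(len(v.encode("utf-8")) * (n - 1)
               for v, n in counts.items() if n > 1)

def rows_equal_to(values, token):
    """1-based row numbers whose whole value equals the token."""
    return [i + 1 for i, v in enumerate(values) if v == token]

def rows_containing(values, token):
    """1-based row numbers that contain the token anywhere."""
    return [i + 1 for i, v in enumerate(values) if token in v]

# Made-up XML-like rows: heavy substring repetition, no exact repeats.
xml_rows = ['<item id="%d" name="n%d"/>' % (i, i) for i in range(200)]
xml_raw = "\n".join(xml_rows).encode("utf-8")
xml_zipped = zlib.compress(xml_raw, 9)

if __name__ == "__main__":
    print("rows equal to 'fox':  ", rows_equal_to(rows, "fox"))
    print("rows containing 'fox':", rows_containing(rows, "fox"))
    print("whole-value savings on XML rows:",
          whole_value_dictionary_savings(xml_rows), "bytes")
    print("raw vs zlib size on XML rows:", len(xml_raw), len(xml_zipped))
```

Under this model only rows 3 and 4 gain anything from a "fox" entry, and the XML-like rows gain nothing at all even though a substring-based compressor shrinks them, which matches what I am seeing.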

The example at the link below only shows whole-column matches, not substrings, which is why I am asking explicitly.

http://blogs.msdn.com/sqlserverstorageengine/archive/2008/01/18/details-on-page-compression-page-dictionary.aspx
