For simplicity, suppose I have a single varchar(50) column per row, and these are my rows:
1) The quick brown fox jumps over the lazy dog
2) I like brown foxes
3) fox
4) fox
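For concreteness, a minimal repro of this setup (table and column names are just placeholders I made up for this question):

```sql
-- One varchar(50) column, the four rows above.
CREATE TABLE dbo.FoxDemo (Val varchar(50) NOT NULL);

INSERT INTO dbo.FoxDemo (Val) VALUES
    ('The quick brown fox jumps over the lazy dog'),
    ('I like brown foxes'),
    ('fox'),
    ('fox');

-- Enable page compression.
ALTER TABLE dbo.FoxDemo REBUILD WITH (DATA_COMPRESSION = PAGE);
```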
So page compression would find the word "fox", for instance, and put it in the dictionary. Would only rows 3 and 4 benefit, while rows 1 and 2 would not because they also contain other data? This is what seems to be happening empirically.

I have a bunch of XML data with highly repetitive segments (angle brackets, repeated attribute names, etc.) that compresses fantastically if I save it out to a CSV and zip it as a control case, yet I get almost 0% compression in SQL Server. I've confirmed that SOME compression does take place by comparing page_count and compressed_page_count in sys.dm_db_index_physical_stats, so nothing in my structure (such as overly large rows) is inherently preventing compression altogether.
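The check I ran was along these lines (dbo.XmlData stands in for my real table):

```sql
-- Compare total pages against pages that actually received page compression.
-- DETAILED mode is required for compressed_page_count to be populated.
SELECT  index_id,
        page_count,
        compressed_page_count
FROM    sys.dm_db_index_physical_stats(
            DB_ID(), OBJECT_ID('dbo.XmlData'), NULL, NULL, 'DETAILED');
```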
The example here only covers whole column values, not substrings within a value, which is why I'm asking explicitly.
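For completeness, here is how the expected savings can be estimated without a full rebuild (again using the placeholder table name):

```sql
-- Ask SQL Server to estimate what PAGE compression would save on this table.
EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'XmlData',
    @index_id         = NULL,
    @partition_number = NULL,
    @data_compression = 'PAGE';
```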