I have byte arrays that can be a few dozen megabytes in size. Arrays that large are awkward to handle, especially when you have many of them, so I would like to compress them. They compress well, generally at a 3:1 ratio with DotNetZip set to BestSpeed.

The data in different arrays can be nearly identical. Given that, I was hoping to find a way to compress the arrays differentially in code, much like version control or backup software does. That way, if I have three 30 MB arrays that differ only in sparse places, my zip file would be closer to 10 MB than 30 MB.

I have tried many queries on Google and Stack Overflow, using terms like compressed, archival, backup, diff, differential... but none of them turn up anything useful. What should I be looking for?

A: 

You may want to look into how the rsync protocol works on Unix. It essentially computes the differences between two files and encodes them as a compact delta, which can then be applied to one file to reconstruct the other.

You may be able to adapt that to what you're trying to do.
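
To make the idea concrete, here is a rough Python sketch of a block-based delta between two byte arrays. It is a simplification, not rsync itself: it only compares blocks at fixed boundaries, so it suits arrays that differ by in-place changes, whereas rsync uses a rolling checksum so it can also match data that has shifted position. The names `make_delta` and `apply_delta`, the 4096-byte block size, and the op format are all made up for illustration, and it assumes SHA-1 collisions between blocks can be ignored.

```python
import hashlib
import struct
import zlib

# Hypothetical op header: 1-byte op code, 8-byte base offset, 4-byte length.
HEADER = struct.Struct("<cQI")


def make_delta(base: bytes, target: bytes, block_size: int = 4096) -> bytes:
    """Encode `target` as copy/data ops relative to `base`, then compress."""
    # Map each base block's hash to the offset where it first occurs.
    index = {}
    for off in range(0, len(base), block_size):
        block = base[off:off + block_size]
        index.setdefault(hashlib.sha1(block).digest(), off)

    ops = bytearray()
    for off in range(0, len(target), block_size):
        block = target[off:off + block_size]
        src = index.get(hashlib.sha1(block).digest())
        if src is not None:
            # Block already exists in base: store only a reference to it.
            ops += HEADER.pack(b"C", src, len(block))
        else:
            # Block is new: store its bytes literally.
            ops += HEADER.pack(b"D", 0, len(block)) + block
    return zlib.compress(bytes(ops), 1)  # level 1 favours speed


def apply_delta(base: bytes, delta: bytes) -> bytes:
    """Rebuild the target array from `base` and a delta from make_delta."""
    raw = zlib.decompress(delta)
    out = bytearray()
    pos = 0
    while pos < len(raw):
        op, src, length = HEADER.unpack_from(raw, pos)
        pos += HEADER.size
        if op == b"C":
            out += base[src:src + length]
        else:
            out += raw[pos:pos + length]
            pos += length
    return bytes(out)
```

With something like this you would store one array compressed in full and keep only a delta for each of the others. An array that differs from the base in a few sparse places costs roughly one block per changed region plus a small header per unchanged block.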

LBushkin