Good morning

I just built a small data de-duplication chunk of code in C# and want to check whether anyone has done something similar before and, if so, how. Is there any publicly available code for this?

The code I wrote is on GitHub at http://gist.github.com/273880.

At the moment there is no physical backing store and no way of recording which blocks belong to which files, but I was thinking of putting these into a DB. I am also using SHA512 for the checksumming, but that might be overkill...
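For anyone reading along, here is a rough sketch of the idea described above, including the missing "which blocks belong to which files" part. It is in Python rather than C# to keep it short, and it is not the code from the gist. The fixed 4096-byte block size, the `DedupStore` class, and its `put`/`get` methods are all assumptions made up for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size; the post does not state one


class DedupStore:
    """Hypothetical in-memory block store keyed by SHA-512 digest."""

    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.blocks = {}  # digest -> block bytes; each unique block is kept once
        self.files = {}   # file name -> ordered list of digests for that file

    def put(self, name, data):
        """Split data into fixed-size blocks and record which blocks make up the file."""
        digests = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.sha512(block).digest()
            # Store the block only if this digest has not been seen before.
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        self.files[name] = digests

    def get(self, name):
        """Rebuild a file by concatenating its blocks in order."""
        return b"".join(self.blocks[d] for d in self.files[name])
```

The per-file digest list is what would end up in a DB table or manifest file. On the overkill question: a SHA-512 digest is 64 bytes per block and SHA-256 is 32, so the choice mostly affects index size.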

+1  A: 

I think this is basically a limited dictionary coder.

Kaleb Brasee
Looking at the entry, correct, it would be. But it seems to be the way all these de-dupe systems are built. The one in ZFS, from what I can gather, is built on the same principle: each block is checked for duplication before it is written to disk, and if there is a dupe, it's just a link. I am thinking of building this into a DB or file-backed store and sharing it over SMB. I can't think of another way of doing this...
TiernanO
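A minimal sketch of the "check before write, link if duplicate" step from the comment above, backed by a DB. This is Python with the standard `sqlite3` module, not C#, and it is not how ZFS implements it. The `blocks` table layout, the reference count column, and the `open_store`/`write_block` helpers are assumptions for illustration only.

```python
import hashlib
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a hypothetical block table; defaults to an in-memory DB."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS blocks ("
        "digest BLOB PRIMARY KEY, data BLOB NOT NULL, refs INTEGER NOT NULL)"
    )
    return db


def write_block(db, block):
    """Write a block unless it is already stored; return its digest either way."""
    digest = hashlib.sha512(block).digest()
    # If the digest already exists, only bump the reference count (the "link").
    cur = db.execute("UPDATE blocks SET refs = refs + 1 WHERE digest = ?", (digest,))
    if cur.rowcount == 0:
        # First time this block is seen: store the actual bytes.
        db.execute(
            "INSERT INTO blocks (digest, data, refs) VALUES (?, ?, 1)",
            (digest, block),
        )
    db.commit()
    return digest
```

The reference count is there so a block can be dropped once no file points at it any more. Deleting and sharing the store over SMB are left out of this sketch.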