Is it possible to use changes in the base64 encoding of an object to measure the degree of change in the object itself?

Suppose I send a document attachment to several users, and each one makes changes to it and emails it back to me. Can I use the string distance between the original base64 and each received base64 to detect which version has the most changes? Would that be a valid metric?

If not, would there be any other metrics to quantify the deltas?

A: 

You should do the same thing that diff does, then compute your metric on, for example, the size of the diff file.
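As a rough sketch of that idea, assuming the attachments can be decoded to plain text, Python's standard `difflib` can stand in for `diff`. The helper name `changed_lines` and the sample strings are made up for illustration, and counting changed lines is only one of several possible "diff size" measures.

```python
import difflib

def changed_lines(original: str, revised: str) -> int:
    """Count lines that a line-based diff reports as inserted, deleted, or replaced."""
    a = original.splitlines()
    b = revised.splitlines()
    # autojunk=False keeps difflib from ignoring very frequent lines in long inputs.
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # A replaced block counts as the larger of its two sides.
            changed += max(i2 - i1, j2 - j1)
    return changed

# Hypothetical example: one original and two returned versions.
original = "a\nb\nc\n"
small_edit = "a\nB\nc\n"
large_edit = "a\nX\nY\nc\nd\n"
print(changed_lines(original, small_edit))  # 1
print(changed_lines(original, large_edit))  # 3
```

The version with the highest count would be treated as the most heavily edited one.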

Axarydax
+4  A: 

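That would depend entirely on the type of document you had encoded. If it was a text file, then sure, the differences in the base64 encoding are probably on a par with the actual changes, at least for in-place edits; an insertion or deletion whose length is not a multiple of 3 bytes shifts the 3-byte grouping and can change most of the encoded characters after it. However, you may have a file format where changes to the contents effectively produce a completely different binary file. An example of this would be a ZIP file.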
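A small experiment can make the difference visible. This is only a sketch: `zlib.compress` produces a DEFLATE stream (the compression method ZIP archives commonly use) rather than a real ZIP file, the sample document is invented, and `matching_fraction` is a deliberately crude position-by-position comparison, not a proper string distance.

```python
import base64
import zlib

def matching_fraction(a: str, b: str) -> float:
    """Fraction of positions holding the same character (1.0 means identical)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    same = sum(x == y for x, y in zip(a, b))
    return same / longest

def b64(data: bytes) -> str:
    return base64.b64encode(data).decode("ascii")

# Hypothetical sample document: 300 short records.
original = "".join(f"record {i}: value {i * 7}\n" for i in range(300)).encode()
# One in-place, same-length edit near the start of the document.
edited = original.replace(b"record 5:", b"record X:", 1)

plain_score = matching_fraction(b64(original), b64(edited))
packed_score = matching_fraction(
    b64(zlib.compress(original)), b64(zlib.compress(edited))
)
print(f"plain text: {plain_score:.4f}")
print(f"compressed: {packed_score:.4f}")
```

The plain-text encodings should agree almost everywhere, while the compressed encodings agree in noticeably fewer positions even though the underlying edit is a single character.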

Robin Day
A: 

In theory, yes, if you do a smart diff (detecting insertions, deletions, and modifications).

In practice, no, unless the documents are absolutely plain text. Binary formats can't be meaningfully diff'd.
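For the plain-text case, one common way to count insertions, deletions, and modifications is the Levenshtein edit distance. The following is a minimal sketch of that technique, not necessarily the "smart diff" meant above; the names `alice` and `bob` and their texts are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]

# Hypothetical returned versions, ranked from most to least changed.
original = "the quick brown fox"
versions = {
    "alice": "the quick brown fox!",
    "bob": "a slow brown dog",
}
ranked = sorted(
    versions,
    key=lambda name: levenshtein(original, versions[name]),
    reverse=True,
)
print(ranked)
```

This runs in time proportional to the product of the two lengths, so it suits short texts better than large attachments.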

egrunin
A: 

Base64 packs groups of three 8-bit values into four 6-bit values. If you change one 8-bit value by one bit, you'll affect only one of the 6-bit values. If you change two bits, there is about a 5/12 chance that they land in different 6-bit values. So if you're counting bits, the two are entirely equivalent; otherwise, you will introduce noise that depends on the metric you use.
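Both claims can be checked by brute force. The sketch below assumes that the two changed bits fall in the same byte and are picked independently and uniformly (so the same bit may be picked twice), which is one reading that reproduces 5/12 exactly; requiring two distinct bits gives 10/21 instead. The function names are illustrative only.

```python
import base64
from fractions import Fraction

def changed_chars(a: bytes, b: bytes) -> int:
    """Number of positions where the base64 encodings of a and b differ."""
    ea, eb = base64.b64encode(a), base64.b64encode(b)
    return sum(x != y for x, y in zip(ea, eb))

# Flipping any single bit of a 3-byte group changes exactly one base64 character.
group = bytes([0b10110010, 0b01101100, 0b11100001])
for bit in range(24):
    flipped = bytearray(group)
    flipped[bit // 8] ^= 0x80 >> (bit % 8)
    assert changed_chars(group, bytes(flipped)) == 1

def split_probability() -> Fraction:
    """Chance that two bit positions in one byte belong to different 6-bit values."""
    total = 0
    split = 0
    for byte in range(3):          # position of the byte inside the 3-byte group
        for i in range(8):         # first bit, chosen within that byte
            for j in range(8):     # second bit, chosen independently
                total += 1
                if (byte * 8 + i) // 6 != (byte * 8 + j) // 6:
                    split += 1
    return Fraction(split, total)

print(split_probability())  # 5/12
```

So a character-level distance on the base64 text can count a two-bit change in one byte as either one or two changed characters, which is the noise described above.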

Rex Kerr