I have a situation where I want to create a signature of a data structure:

my $signature = ds_to_sig(
  { foo   => 'bar',
    baz   => 'bundy',
    boing => undef,
    number => 1_234_567,
  }
);

The aim should be that if the data structure changes then so should the signature.

Is there an established way to do this?

A: 

I think the word you're looking for is "hashing".

Basically, you put your data structure through a function that generates a value which is, for practical purposes, unique to that data. This value would be your signature.

Rik
+13  A: 

I think what you're looking for is a hash function. I would recommend an approach like this:

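use Storable qw(freeze);        # freeze is not exported by default
use Digest::MD5 qw(md5_hex);

# Sort hash keys while serializing, so equal data gives equal output.
$Storable::canonical = 1;

sub ds_to_sig {
    my $structure = shift;
    return md5_hex(freeze($structure));
}

The hash used here is md5_hex from Digest::MD5, but any hash function would do.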

Leon Timmermans
Hehe. Two virtually identical answers in less than 2 minutes.
Rik
Make that 3 in 3 minutes! I guess that can only mean we've got it right ;-)
Leon Timmermans
The key there is $Storable::canonical. Without that, Storable doesn't guarantee the order of the elements.
brian d foy
You should probably be using 'nfreeze' for cross-platform consistency.
EvdB
A: 

Can't you use an object instead of a struct? That way you could see if an object is an instance of a type without having to compare hashes, etc.

demianturner
Data structures and objects are largely interchangeable in Perl 5: objects are really just blessed data references. Either way, I want to get a signature of the contents of the data.
EvdB
The real problem with this approach is that he's after the data. Since the data is used to maintain state on objects, you'd have to instantiate a new object every time the state changed, thus negating this approach.
Ovid
+7  A: 

Use Storable::nfreeze to turn it into a binary representation (nstore is the variant that writes to a file), and then calculate a checksum (for example with the Digest module).

Both modules are core modules.
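
A minimal sketch of that combination, not a definitive implementation. The helper name ds_to_sig is borrowed from the question, and SHA-256 is an assumed choice of algorithm; the generic Digest interface should accept any algorithm it can load.

```perl
use strict;
use warnings;

use Storable qw(nfreeze);
use Digest;

# Sort hash keys while serializing so equal data always yields equal bytes.
$Storable::canonical = 1;

# Hypothetical helper: serialize with nfreeze, then checksum the bytes.
sub ds_to_sig {
    my ($structure) = @_;

    # SHA-256 is an assumption here, not something the answer prescribes.
    return Digest->new('SHA-256')->add( nfreeze($structure) )->hexdigest;
}

my $before = ds_to_sig( { foo => 'bar', number => 1_234_567 } );
my $after  = ds_to_sig( { foo => 'bar', number => 1_234_568 } );

print "signature changed\n" if $before ne $after;
```

Note that nfreeze expects a reference, so a plain scalar would have to be passed as \$value.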

moritz
I had just edited my code to do exactly that. You and I are on the same track again!
Leon Timmermans
+7  A: 

The best way to do this is to use a deep-structure serialization system like Storable. Two structures with the same data will produce the same blob of Storable output, so they can be compared.

#!/usr/bin/perl

use strict;
use warnings;

use Storable ('freeze');

$Storable::canonical = 1;

my $one = { foo => 42, bar => [ 1, 2, 3 ] };
my $two = { foo => 42, bar => [ 1, 2, 3 ] };

my $one_s = freeze $one;
my $two_s = freeze $two;

print "match\n" if $one_s eq $two_s;

...And to prove the inverse:

$one = [ 4, 5, 6 ];
$one_s = freeze $one;

print "no match\n" if $one_s ne $two_s;
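
A related check, sketched under the assumption that $Storable::canonical is set as above: two hashes holding the same pairs, but filled in opposite orders, should still freeze to identical bytes. The variable names are made up for illustration.

```perl
use strict;
use warnings;

use Storable qw(freeze);

$Storable::canonical = 1;

# Same key/value pairs, inserted in opposite orders.
my @keys = qw(foo bar baz boing);

my %forward;
$forward{$_} = length $_ for @keys;

my %backward;
$backward{$_} = length $_ for reverse @keys;

print "match regardless of insertion order\n"
    if freeze( \%forward ) eq freeze( \%backward );
```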
friedo
You need to set $Storable::canonical to a true value. It might not matter in little examples, but it matters in much larger ones.
brian d foy
+4  A: 

Digest::MD5->new->add(
    Data::Dumper->new([$structure])
        ->Purity(0)
        ->Terse(1)
        ->Indent(0)
        ->Useqq(1)
        ->Sortkeys(1)
        ->Dump()
)->b64digest();
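
To try this against the question's example, the expression could be wrapped in a function. This is only a sketch: the wrapper name ds_to_sig comes from the question, and the intermediate $dump variable is added for readability.

```perl
use strict;
use warnings;

use Data::Dumper;
use Digest::MD5;

# Hypothetical wrapper giving the expression the name used in the question.
sub ds_to_sig {
    my ($structure) = @_;

    my $dump = Data::Dumper->new( [$structure] )
        ->Purity(0)      # no extra statements to rebuild nested references
        ->Terse(1)       # no "$VAR1 =" prefix
        ->Indent(0)      # everything on a single line
        ->Useqq(1)       # double-quoted strings with escapes
        ->Sortkeys(1)    # stable hash key order
        ->Dump;

    return Digest::MD5->new->add($dump)->b64digest;
}

my $sig = ds_to_sig(
    { foo => 'bar', baz => 'bundy', boing => undef, number => 1_234_567 }
);
print "$sig\n";
```

Sortkeys(1) is what makes the dump, and therefore the digest, independent of hash ordering; without it, two equal structures could produce different signatures.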