tags:

views:

129

answers:

3

I would like to represent a set in Perl. What I usually do is using a hash with some dummy value, e.g.:

my %hash=();
$hash{"element1"}=1;
$hash{"element5"}=1;

Then use if (defined %hash{$element_name)) to decide whether an element is in the set.

Is this a common practice? Any suggestions on improving this?

Also, should I use defined or exists?

Thank you

+3  A: 

Use one of the many Set modules on CPAN. Judging from your example, Set::Light or Set::Scalar seem appropriate.


I can defend this advice with the usual arguments pro CPAN (disregarding possible synergy effects).

  1. How can we know that look-up is all that is needed, both now and in the future? Experience teaches that even the simplest programs expand and sprawl. Using a module would anticipate that.
  2. An API is much nicer for maintenance, or people who need to read and understand the code in general, than an ad-hoc implementation as it allows to think about partial problems at different levels of abstraction.
  3. Related to that, if it turns out that the overhead is undesirable, it is easy to go from a module to a simple by removing indirections or paring data structures and source code. But on the other hand, if one would need more features, it is moderately more difficult to achieve the other way around.
  4. CPAN modules are already tested and to some extent thoroughly debugged, perhaps also the API underwent improvement steps over the time, whereas with ad-hoc, programmers usually implement the first design that comes to mind.

Rarely it turns out that picking a module at the beginning is the wrong choice.

daxim
Is there something those give you other than overhead? Set operations or something? If all I am doing is checking for existence of an item in a set, why would I use a library to do something the language already does incredibly well?
Chas. Owens
Any suggestions on modules that support sets of sets?
David B
Sets of sets sounds like a job for Set::Object, whose instances can nest.
daxim
Why use a library? So that your code reads as what it does, not how it does it. `$set->insert('foo')` makes it clear that you are inserting 'foo' into a set. `$hash{foo} = 1` looks like you are mapping foo to 1, which you aren't. Add to the mix membership tests, for which defined and exists work equally well, and things can get confusing when you start mixing the two. Or when you start using = 0 to delete something from the set, but are using exists checks somewhere. Basically, using a hash as a set is hard to read and hard to maintain. And oh yeah, Set::Object is faster than hashes.
jrockway
A: 

That's how I've always done it. I would tend to use exists rather than defined but they should both work in this context.

Stephen Darlington
+4  A: 

Yes, building hash sets that way is a common idiom. Note that:

my @keys = qw/a b c d/;
my %hash;
@hash{@keys} = ();

is preferable to using 1 as the value because undef takes up significantly less space. This also forces you to uses exists (which is the right choice anyway).

Chas. Owens
+1 Thanks. Please see related question about "deep sets": http://stackoverflow.com/questions/3700069/how-to-check-if-a-key-exists-in-a-deep-perl-hash
David B