views:

75

answers:

5

I think I've just been looking at this too long. I have some data that looks like this:

@a = (
    { name => 'ethan', depth => 0 },
    { name => 'victoria', depth => 1 },
    { name => 'stephen', depth => 2 },
    { name => 'christopher', depth => 3 },
    { name => 'isabella', depth => 2 },
    { name => 'ethan', depth => 3 },
    { name => 'emma', depth => 0 },
    { name => 'michael', depth => 1 },
    { name => 'olivia', depth => 2 },
    { name => 'alexander', depth => 3 },
    { name => 'stephen', depth => 1 },
    { name => 'sophia', depth => 0 },
    { name => 'michael', depth => 1 },
    { name => 'ava', depth => 1 },
    { name => 'joshua', depth => 2 }
);

This is a simple 'tree' data structure. Every time 'depth' = 0, that is the beginning of a new 'tree'. What I would like to know is in how many of these trees do each of the names appear? The desired result would be a single hash with the names as the key, and the count as the value.

The kink in this is, if you look closely, the first tree contains the name 'ethan' twice. Any tree can have any name more than once, but that should only count as 1, since they all occur in the same tree. However, 'michael' would have a count of 2, since he appears in two different trees.

I can think of a few ways of doing this, but they all involve multiple loops and seem somewhat brute force and inelegant. Hopefully, someone else out there can come up with a better solution.

A: 

The way I'd go about solving this is to use a recursive function that you map over the array. The edge case is is a depth of 0, in which case you return. otherwise, hold on to the root name, incrementing its value in a hash counter, and then recurse into the subtree where you increment that name as long as it doesn't match the root name, then recurse again, etc.

Daenyth
Little too complicated. Unless the spec changes, you don't need recursion for this.
cHao
@cHao: Well, it depends on the spec. If the subtrees can themselves contain subtrees then you need to do something at least similar to recursion
Daenyth
No, it's not that complicated. Basically what I outlined above.
coding_hero
@Daenyth: Hence "unless the spec changes". Right now, only 0 means we're in a new tree.
cHao
I misunderstood the OP then. It seemed to imply that the structure could be nested. If it's flat, then no, you don't need recursion.
Daenyth
+2  A: 

I'm not 100% sure about your problem spec -- is this the correct output?

  alexander 1
        ava 1
christopher 1
       emma 1
      ethan 1
   isabella 1
     joshua 1
    michael 2
     olivia 1
     sophia 1
    stephen 2
   victoria 1

If so, then this code seems to do the job:

my @names = (
  # your @a above
);

my (%seen, %count);

for my $entry (@names) {
  if ($entry->{depth} == 0) {
    ++$count{$_} for keys %seen;
    %seen = ();
  }
  ++$seen{ $entry->{name} };
}

++$count{$_} for keys %seen;
print "$_\t$count{$_}\n" for sort keys %count;

that is, just keep a tally of names that only gets shuffled into the global count when we reach the root of the tree.

hobbs
Doesn't this code miss the 'root' entry in each tree? I'd think you'd need to say `%seen = ( $entry->{name} => 1 )` rather than `%seen = ()`.
cHao
@hobbs - yes, that is the correct output.
coding_hero
@coding_hero: Is it? I don't see where `emma` and `sophia` are counted.
cHao
@cHao - You are correct. I mostly scanned the lists for the values > 1 to see if he got the point. I didn't look for every name.
coding_hero
@cHao good call. Corrected.
hobbs
This will work for me. Thanks! Points all around!
coding_hero
+1  A: 
%count = ();
for (@a)
{
    %found = () unless $_->{depth};
    my $name = $_->{name};
    unless ($found{$name}) 
    {
        ++$count{$name};
        $found{$name} = 1;
    }
}
return %count;

Basically, what we're doing is keeping a hash of the names we've found in the current tree (which is cleared out whenever the tree switches). If the current name hasn't been found yet, we bump the count and note that we've found it so it won't get counted again til the tree switches again.

cHao
+1  A: 

Here is one more way:

#!/usr/bin/perl

use strict; use warnings;

my @data = (
    { name => 'ethan', depth => 0 },
    { name => 'victoria', depth => 1 },
    { name => 'stephen', depth => 2 },
    { name => 'christopher', depth => 3 },
    { name => 'isabella', depth => 2 },
    { name => 'ethan', depth => 3 },
    { name => 'emma', depth => 0 },
    { name => 'michael', depth => 1 },
    { name => 'olivia', depth => 2 },
    { name => 'alexander', depth => 3 },
    { name => 'stephen', depth => 1 },
    { name => 'sophia', depth => 0 },
    { name => 'michael', depth => 1 },
    { name => 'ava', depth => 1 },
    { name => 'joshua', depth => 2 }
);

my @trees;

for my $x ( @data ) {
    push @trees, {} unless $x->{depth};
    $trees[-1]->{ $x->{name} } = undef;
}

my @names = keys %{ { map { $_->{name} => undef } @data } };
for my $name ( sort @names ) {
    printf "%s appears in %d tree(s)\n",
        $name, scalar grep { exists $_->{$name} } @trees;
}
Sinan Ünür
+1  A: 

I think this is shortest and simplest solution

my (%count, %seen);
for my $e (@a) {
  %seen = () unless $e->{depth};
  $count{$e->{name}}++ unless $seen{$e->{name}}++;
}

print "$_ => $count{$_}\n" for sort keys %count;

Result:

alexander => 1
ava => 1
christopher => 1
emma => 1
ethan => 1
isabella => 1
joshua => 1
michael => 2
olivia => 1
sophia => 1
stephen => 2
victoria => 1
Hynek -Pichi- Vychodil