After executing these lines in Perl:

my $data = `curl '$url'`;
my $pets = XMLin($data)->{pets};

I have an array reference that contains references to hashes:

$VAR1 = [
      {
        'title' => 'cat',
        'count' => '210'
      },
      {
        'title' => 'dog',
        'count' => '210'
      }
]

In Perl, how do I sort the hashes first by count and secondarily by title, and then print to STDOUT the count followed by the title, one record per line?

+5  A: 

Assuming you want counts in descending order and titles ascending:

print map join(" ", @$_{qw/ count title /}) . "\n",
      sort { $b->{count} <=> $a->{count}
                         ||
             $a->{title} cmp $b->{title} }
      @$pets;

That's compact code written in a functional style. To help understand it, let's look at equivalent code in a more familiar, imperative style.

Perl's sort operator takes an optional SUBNAME parameter that allows you to factor out your comparison and give it a name that describes what it does. When I do this, I like to begin the sub's name with by_ to make sort by_... read more naturally.

To start, you might have written

sub by_count_or_title {
  $b->{count} <=> $a->{count}
              ||
  $a->{title} cmp $b->{title}
}

my @sorted = sort by_count_or_title @$pets;

Note that no comma follows the SUBNAME in this form!
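To see the named comparison at work, here is a minimal, self-contained sketch. The 'bird' record is hypothetical, added only so that the counts differ; the question's data has two records with the same count.

```perl
use strict;
use warnings;

# Sample records; the 'bird' entry is hypothetical, added so the counts differ.
my $pets = [
  { title => 'dog',  count => '210' },
  { title => 'bird', count => '35'  },
  { title => 'cat',  count => '210' },
];

# Counts descending; ties broken by title ascending.
sub by_count_or_title {
  $b->{count} <=> $a->{count}
              ||
  $a->{title} cmp $b->{title}
}

# No comma between the sub name and the list.
my @sorted = sort by_count_or_title @$pets;

# One "count title" line per record.
my @lines = map { "$_->{count} $_->{title}\n" } @sorted;
print @lines;
```

With this sample data it prints 210 cat, 210 dog, and 35 bird, in that order, one per line.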

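To address another commenter's question, you could use or rather than || in by_count_or_title if you find it more readable. Both <=> and cmp have higher precedence (which you might think of as binding more tightly) than || and or, so inside the comparison it's strictly a matter of style. The two operators differ only in precedence: or binds much more loosely than ||, which matters elsewhere, for example next to an assignment.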
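As a small illustration of where || and or do and don't differ, consider the sketch below. The helper subs zero and five and the variable names are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helpers, used only to make the precedence visible.
sub zero { return 0 }
sub five { return 5 }

# || binds tighter than =, so the right-hand side is (zero() || five()).
my $with_pipes = zero() || five();

# or binds looser than =, so this parses as (my $with_or = zero()) or five();
my $with_or = zero() or five();

print "||: $with_pipes\n";
print "or: $with_or\n";

# Inside a sort comparison there is no assignment to compete with,
# so both spellings order the records the same way.
my @recs = ( { title => 'dog', count => 210 }, { title => 'cat', count => 210 } );

my @via_pipes = map { $_->{title} }
                sort { $b->{count} <=> $a->{count} || $a->{title} cmp $b->{title} }
                @recs;
my @via_or    = map { $_->{title} }
                sort { $b->{count} <=> $a->{count} or $a->{title} cmp $b->{title} }
                @recs;

print "@via_pipes\n";
print "@via_or\n";
```

Here $with_pipes ends up as 5 and $with_or as 0, while both sorts yield cat then dog.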

To print the sorted array, a more familiar choice might be

foreach my $p (@sorted) {
  print "$p->{count} $p->{title}\n";
}

Perl uses $_ if you don't specify the variable that gets each value, so the following has the same meaning:

for (@sorted) {
  print "$_->{count} $_->{title}\n";
}

The for and foreach keywords are synonyms, but I find that the uses above, i.e., foreach if I'm going to name a variable or for otherwise, read most naturally.

Using map, a close cousin of foreach, isn't much different:

map print("$_->{count} $_->{title}\n"), @sorted;

You could also promote print through the map:

print map "$_->{count} $_->{title}\n",
      @sorted;

Finally, to avoid repetition of $_->{...}, the hash slice @$_{"count", "title"} gives us the values associated with count and title in the loop's current record. Having the values, we need to join them with a single space and append a newline to the result, so

print map join(" ", @$_{qw/ count title /}) . "\n",
      @sorted;

Remember that qw// is shorthand for writing a list of strings. As this example shows, read a map expression back-to-front (or bottom-to-top the way I indented it): first sort the records, then format them, then print them.
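If the hash slice is unfamiliar, this short sketch shows it next to the one-element-at-a-time spelling. The variable names are only for illustration.

```perl
use strict;
use warnings;

my $rec = { title => 'cat', count => '210' };

# Hash slice through a reference: one expression, several values,
# returned in the order the keys are listed.
my @vals = @$rec{qw/ count title /};    # same as @{$rec}{'count', 'title'}

# Equivalent, one element at a time.
my @same = ( $rec->{count}, $rec->{title} );

# Join with a single space and append a newline, as in the map above.
my $line = join(" ", @vals) . "\n";
print $line;
```

This is why a single set of curly braces is enough: the slice takes a list of keys and hands back the matching list of values.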

For one final alternative, you could eliminate the temporary @sorted but still call the named comparison:

print map join(" ", @$_{qw/ count title /}) . "\n",
      sort by_count_or_title
      @$pets;
Greg Bacon
What does "map" mean/do in the code above? And why didn't we need to specify the two attributes in two different sets of curly braces?
syker
@syker : `map` takes each hash in your sorted array and prints its `count` and `title` values on a line, space-separated. @gbacon : It might be worth explaining if there is any difference between using `||` and `or` in your sort as I always see `||` in sort-by-anything examples. Is there a possibility that the more-readable `or` behaves differently under any circumstances?
Zaid
@syker @Zaid See updated answer.
Greg Bacon
@gbacon You the man.
syker
@syker map is similar to a foreach. Check it out on perldoc (http://perldoc.perl.org/functions/map.html)
vol7ron