ansaurus

Question

How can I sort a Perl list in an arbitrary order?

Answer 1

+6 A:

Set up the association between strings and their respective positions with

my @all_possible_strings_in_order = qw/ name street city state postalcode /;

my %order = map +($order[$_] => $_), 0 .. $#all_possible_strings_in_order;

Now you can create a comparison function for use with Perl's sort operator:

sub by_order { $order{$a} <=> $order{$b} }

For example:

my @sorted = sort by_order qw/ city state name /;
print "@sorted\n";
# prints: name city state

Greg Bacon 2009-12-08 03:32:04

`my %order; @order{@all_possible_strings_in_order} = 1 .. @all_possible_strings_in_order` -- little bit less evil / more readable? :)

hobbs 2009-12-08 03:34:19

I prefer the `map` version (though with curlies), but in another question I profiled the two and found the slice assignment to be slightly faster (with a constant value repeated).

Chris Lutz 2009-12-08 03:36:19

(Also, in my version, I believe we need the `map` version to initialize the `state` variable in one line. Not certain about that, though.)

Chris Lutz 2009-12-08 03:38:27

Yes, that's nice, but I have a weird style hangup about separate declarations and initializations. Is the map really so bad? Too much Haskell for me?

Greg Bacon 2009-12-08 03:40:00

@gbacon: then `@$_{@all}=1..@all for \my %order;`

ysth 2009-12-08 04:12:59

Hash slices always trump anything else, just like waffles and ponies!

Ether 2009-12-08 04:14:17

%order = ( '' => 4 ) due to missing array values in the unary assignment. should be: +($allpossible_strings_in_order[$_] => $_)

zen 2009-12-08 04:58:50

Answer 2

+1 A:

Here's an idea that's fairly simple.

Take your first string from your unsorted list, search for it in your master list, find its index in the master list, then place it in a list, and keep track of the index.

Take your second string, find its index in the master list. if that index is greater than the first, place it in your new list behind the first, otherwise in front.

Keep this up for all remaining strings, maintaining a list of all the indices, so that you always know where to place the next string re the strings already sorted.

Hope this is clear enough to help.

John Doner

John R Doner 2009-12-08 03:33:20

Answer 3

+2 A:

If you have Perl 5.10, you could use this (names shortened for clarity):

use feature 'state';

sub bylist {
  state %hash = map { $all_possible[$_] => $_ } 0 .. $#all_possible;
  $hash{$_[0]} cmp $hash{$_[1]};
}

my @sorted = sort bylist @list_to_sort;

The state keyword creates what in C is known as a static variable - it's local to the bylist subroutine, but it won't be reinitialized. That way, you don't have to set anything up beforehand, but don't have to recalculate the value every time you want to use it.

I believe there's a hack to get this to happen in older Perls, but I wouldn't use it. If you don't have 5.10, just use gbacon's idea that he shamelessly stole from my brain while I was typing this :P

Chris Lutz 2009-12-08 03:35:06

Muwhahahahahaha!

Greg Bacon 2009-12-08 03:44:47

The "hack" -- which you should never, ever do, and no longer works in Perl 5.10 -- is `my %hash if 0; %hash or %hash = ...`

ephemient 2009-12-08 05:36:19

The non-hack work-around is to enclose the sub declaration in a block and define the state variable there. e.g. `{ my $x; sub foo { $x ||= 1; ... } }`

Michael Carman 2009-12-08 13:50:01

Or even `{my %hash = ...; sub bylist { ... }}` though that will pre-calculate even if `bylist` is never entered. If you have to use pre-5.10 Perl, this is most recommended, and still works in 5.10. On the other hand, it's probably not the "hack" Chris was thinking of :)

ephemient 2009-12-08 14:39:59

@ephemient: `{my %hash = ...; sub bylist {} }` only initializes if program flow reaches the block, which means you would have to put the sub definition in the middle of your program logic (and before calling it) rather than at the end as is normal.

Michael Carman 2009-12-08 15:32:28

@Michael - You could make that block a `BEGIN {}` block, so that it's guaranteed to be initialized. (Though I tend to put my subroutine definitions at the beginning. Personal style.)

Chris Lutz 2009-12-08 18:35:11

This really seems like half of an answer.

Brad Gilbert 2009-12-09 16:48:53

Answer 4

+1 A:

The most naive way to do this would be to sort based on a comparison function, where the comparison function comp(a,b) = "which of a and b comes first in the master list?", if I understand correctly.

So yeah, your idea looks about right. If you have to do a lot of sorts between changes to @all_possible_strings_in_order, then you should build the whole map once. If the order list changes every sort, you may be able to gain some speed with some clever lazy search, but maybe not.


my %order;
my $i = 0;
foreach my $s (@all_possible_strings_in_order) {
    $order{$s} = $i++;
}

my @sorted = sort {$order{$a} <=> $order{$b}} @list_that_needs_to_be_sorted;

I imagine this should be quite fast.

Peter Cordes 2009-12-08 03:36:00

Your Perl needs a little work. `foreach my $s in (@list)` ? What's that `in` mean?

Chris Lutz 2009-12-08 03:37:08

is it normal to get 4 answers in the span of 3 minutes, after no answers for 45 minutes? I feel like a perl noob for not using map or something with a range operator to generate the order hash.

Peter Cordes 2009-12-08 03:41:31

@Chris: It means I've been using bash all afternoon.

Peter Cordes 2009-12-08 03:44:35

No, but then again it's not really normal to get Perl questions on Stack Overflow these days. (And `map` and `..` aren't always the best answers. The only way for the OP to figure out what is best in this case is to profile it.)

Chris Lutz 2009-12-08 03:44:42

@Chris Lutz: ?? there are Perl questions on SO almost every single day. Not the flood that some languages get, and not the ~20/day that perlmonks gets, but a steady trickle.

ysth 2009-12-08 04:18:54

It's a very small trickle compared to Python, even though there's a lot more Perl code and programmers than Python. (Also, I conveniently happen to miss most of them.)

Chris Lutz 2009-12-08 04:25:17

Answer 5

+5 A:

A different approach (one that won't work if the list to be sorted can have duplicates that need to be preserved):

my %set;
@set{ @list_that_needs_to_be_sorted } = ();
my @sorted = grep exists $set{$_}, @all_possible_strings_in_order;

ysth 2009-12-08 04:22:35

Answer 6

+2 A:

You can just go over the master list and push any element that occurs in the unsorted list onto the result list, while removing it from the unsorted list. If your unsorted list is short (from your example, I reckon about 5 elements), this should be faster and smaller than building hash tables each time (you said you couldn't do it once beforehand).

An optimization might be to make a trie from the unsorted list, but whether this is better is dependent on the size of each list.

Svante 2009-12-08 08:47:19

That's the same as ysth's answer.

Peter Cordes 2009-12-09 05:30:39

ansaurus

tags:

views:

answers:

How can I sort a Perl list in an arbitrary order?

related questions