I want to do the inverse of sort(1): write every line of stdin to stdout in random order, in Perl.

+4  A: 

This Perl snippet does the trick:

#! /usr/bin/perl
# randomize cat

# fisher_yates_shuffle code copied from Perl Cookbook 
# (By Tom Christiansen & Nathan Torkington; ISBN 1-56592-243-3)

use strict;
use warnings;

my @lines = <>;                     # read all input lines into memory
fisher_yates_shuffle( \@lines );    # permutes @lines in place
foreach my $line (@lines) {
    print $line;
}

# fisher_yates_shuffle( \@array ) : generate a random permutation
# of @array in place
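sub fisher_yates_shuffle {
    my $array = shift;
    return unless @$array;    # nothing to shuffle; on empty input --$i below would start at -1
    my $i;
    for ($i = @$array; --$i; ) {
        my $j = int rand ($i+1);            # random index in 0..$i
        next if $i == $j;
        @$array[$i,$j] = @$array[$j,$i];    # swap elements $i and $j
    }
}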

__END__
Steve Schnepp
+9  A: 

I bet real Perl hackers will tear this apart, but here it goes nonetheless.

use strict;
use warnings;
use List::Util 'shuffle';

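my @lines = ();
my $bufsize = 512;    # number of lines shuffled together as one block
while(<STDIN>) {
    push @lines, $_;
    if (@lines == $bufsize) {
        print shuffle(@lines);    # emit the full block in random order
        @lines = ();              # start a new block
    }
}
print shuffle(@lines);    # emit whatever is left in the last, partial block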

Difference between this and the other solution:

  • It does not read all the input before shuffling (which is a memory hog); instead it shuffles each block of $bufsize lines separately. The result is not truly random, because a line never leaves its block, and it is slow as a dog compared to the other option.
  • It uses a module that returns a new list instead of an in-place Fisher-Yates implementation. The two are interchangeable (except that with the in-place version you would have to separate the print from the shuffle). For more information, type perldoc -q rand in your shell.
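
To make the "not truly random" point concrete, here is a minimal sketch. It assumes a hypothetical helper, chunked_shuffle (my own name, not part of List::Util), that mimics the buffered loop above on an in-memory list, with a tiny block size of 3 chosen only for illustration.

```perl
use strict;
use warnings;
use List::Util 'shuffle';

# Hypothetical helper (not from any module): shuffle @lines block by block,
# $bufsize elements at a time, the way the buffered loop does.
sub chunked_shuffle {
    my ($bufsize, @lines) = @_;
    my @out;
    # splice removes up to $bufsize elements from the front on each pass
    while (my @block = splice @lines, 0, $bufsize) {
        push @out, shuffle(@block);
    }
    return @out;
}

my @result = chunked_shuffle(3, 1 .. 9);

# The first three outputs are always some ordering of 1, 2 and 3.
print join(' ', sort { $a <=> $b } @result[0 .. 2]), "\n";   # prints "1 2 3"
print "@result\n";                                            # order varies per run
```

However often this runs, 1 can never come out after 4: elements only move within their own block. A full shuffle of the whole input has no such restriction, which is the trade-off for the bounded memory use.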
Vinko Vrsalovic
Why do you use $current? Why would you manually maintain the length of the array when the array already knows this?
Leon Timmermans
Because I make mistakes :-) Fixed.
Vinko Vrsalovic
I think you want if (@lines == $bufsize). As is, you will shuffle every 514 lines.
ysth
I'm curious - Why do you think "real Perl hackers" would tear it apart? It's readable, it uses strict and warnings, and it gets the job done.
Sherm Pendley
Well, the real Perl hackers already pointed out two mistakes in a row...
Vinko Vrsalovic
Okay, my mistake. I thought you meant those who would turn it into one line of unreadable gibberish. *Real* Perl hackers understand the difference between production code and "Perl golf," and aside from fixing a few typos, wouldn't have a problem with your code. :-)
Sherm Pendley
-1 because it's not truly random.
@dehmann: I say so myself in the answer, which the OP accepted... I even state the reason it is not truly random.
Vinko Vrsalovic
+3  A: 
use List::Util 'shuffle';
print shuffle <>;

Or, if you worry about the last line lacking \n:

chomp(my @lines = <>);
print "$_\n" for shuffle @lines;
ysth
This is the equivalent of TheSnide's solution using shuffle (eats everything up first, then shuffles)
Vinko Vrsalovic
Just a whole lot shorter (and to my mind clearer).
ysth