I searched SO for a similar Q/A to no avail. I want to shuffle the lines of a text file randomly and write the result to a new file. The file may have several thousand lines.

How can I do that with cat, awk, cut, etc.?

+1  A: 

Here is a first try that's easy on the coder but hard on the CPU: it prepends a random number to each line, sorts the lines by that number, and then strips the number from each line. In effect, the lines are sorted randomly:

cat myfile | awk 'BEGIN{srand();}{print rand()"\t"$0}' | sort -k1 -n | cut -f2- > myfile.shuffled
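The same prepend, sort, strip idea can be sketched with awk reading the file directly. This is only an illustration: the function name shuffle_lines and the sample file are assumptions, not part of the original answer.

```shell
# Hypothetical helper: print the lines of the file named in $1 in random order.
shuffle_lines() {
    # 1. prepend a random number and a tab to every line
    # 2. sort numerically on that first field only
    # 3. cut the random number off again
    awk 'BEGIN { srand() } { print rand() "\t" $0 }' "$1" |
        sort -k1,1 -n |
        cut -f2-
}

# Sample input (an assumption for the demo): the numbers 1 to 10.
seq 1 10 > myfile
shuffle_lines myfile > myfile.shuffled
```

Note that srand() without an argument seeds from the time of day, so two runs within the same second will likely produce the same order.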
Amaç Herdağdelen
UUOC. pass the file to awk itself.
ghostdog74
Right, I debug with `head myfile | awk ...`. Then I just change it to cat; that's why it was left there.
Amaç Herdağdelen
+6  A: 

You can use shuf, on some systems at least (it doesn't appear to be in POSIX).

As jleedev pointed out: sort -R might also be an option. On some systems at least; well, you get the picture.
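A minimal sketch of both commands, assuming GNU coreutils is installed (the file names are made up for the example):

```shell
# Assumes GNU coreutils; neither shuf nor sort -R is specified by POSIX.
seq 1 10 > myfile

# shuf writes a random permutation of the input lines.
shuf myfile > myfile.shuffled

# sort -R orders lines by a random hash of the sort key,
# so identical lines end up next to each other.
sort -R myfile > myfile.random_sorted
```

The two differ when the input has duplicate lines: shuf scatters them independently, while sort -R keeps equal lines adjacent.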

Joey
Cool, I hadn't known about shuf. It looks like it's part of coreutils but the version I have installed on my server doesn't have shuf.
Amaç Herdağdelen
Yeah, seems to be GNU-only stuff, unfortunately. Nothing of it in POSIX.
Joey
It's odd that GNU coreutils has both `shuf` and `sort -R`.
jleedev
+2  A: 

Here's an awk script:

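awk 'BEGIN { srand() }
{ lines[++d] = $0 }    # store every line, indexed 1..d
END {
    # Draw random indices until all d lines have been printed.
    # An index that was already printed (and deleted) is skipped.
    while (e < d) {
        RANDOM = int(1 + rand() * d)
        if (RANDOM in lines) {
            print lines[RANDOM]
            delete lines[RANDOM]
            ++e
        }
    }
}' file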

Output:

$ cat file
1
2
3
4
5
6
7
8
9
10

$ ./shell.sh
7
5
10
9
6
8
2
1
3
4
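The loop in the script above retries whenever it draws an index that has already been printed, so most draws are wasted once few lines remain. As an alternative (a different technique, not the script above), a Fisher-Yates shuffle does one swap per line. The sketch below is an illustration only; the function name fisher_yates and the output file name are assumptions.

```shell
# Hypothetical helper: Fisher-Yates shuffle of the lines of the file in $1.
fisher_yates() {
    awk 'BEGIN { srand() }
    { lines[NR] = $0 }                    # store every line, indexed 1..NR
    END {
        for (i = NR; i > 1; i--) {
            j = int(1 + rand() * i)       # random index in 1..i
            tmp = lines[i]; lines[i] = lines[j]; lines[j] = tmp
        }
        for (i = 1; i <= NR; i++) print lines[i]
    }' "$1"
}

seq 1 10 > file
fisher_yates file > file.shuffled
```

Like the original, this holds the whole file in memory, which is fine for a few thousand lines.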
ghostdog74
A: 

I use a tiny Perl script, which I call "unsort":

#!/usr/bin/perl
use List::Util 'shuffle';
@list = <STDIN>;
print shuffle(@list);

I've also got a NUL-delimited version, called "unsort0" ... handy for use with find -print0 and so on.

PS: Voted up 'shuf' too, I had no idea that was there in coreutils these days ... the above may still be useful if your system doesn't have 'shuf'.

Sharkey