I searched SO for a similar Q/A to no avail. I want to shuffle the lines of a text file randomly and create a new file. The file may have several thousands of lines.
How can I do that with cat, awk, cut, etc.?
I searched SO for a similar Q/A to no avail. I want to shuffle the lines of a text file randomly and create a new file. The file may have several thousands of lines.
How can I do that with cat, awk, cut, etc.?
Here is a first try that's easy on the coder but hard on the CPU which prepends a random number to each line, sorts them and then strips the random number from each line. In effect, the lines are sorted randomly:
cat myfile | awk 'BEGIN{srand();}{print rand()"\t"$0}' | sort -k1 -n | cut -f2- > myfile.shuffled
You can use shuf
. On some systems at least (doesn't appear to be in POSIX).
As jleedev pointed out: sort -R
might also be an option. On some systems at least; well, you get the picture.
here's an awk script
awk 'BEGIN{srand() }
{ lines[++d]=$0 }
END{
while (1){
if (e==d) {break}
RANDOM = int(1 + rand() * d)
if ( RANDOM in lines ){
print lines[RANDOM]
delete lines[RANDOM]
++e
}
}
}' file
output
$ cat file
1
2
3
4
5
6
7
8
9
10
$ ./shell.sh
7
5
10
9
6
8
2
1
3
4
I use a tiny perl script, which I call "unsort":
#!/usr/bin/perl
use List::Util 'shuffle';
@list = <STDIN>;
print shuffle(@list);
I've also got a NULL-delimited version, called "unsort0" ... handy for use with find -print0 and so on.
PS: Voted up 'shuf' too, I had no idea that was there in coreutils these days ... the above may still be useful if your systems doesn't have 'shuf'.