ansaurus

Question

What's an easy way to read a random line from a file on the Unix command line?

Answer 1

+1 A:

cat + perl <http://snippets.openminds.be/posts/show/8>

edit: perl -e 'srand; rand($.) < 1 && ($line = $_) while <>; print $line;' filename

thanks Mr. Muskrat

Tracker1 2009-01-15 19:06:24

Just for the purposes of inclusion (in case the referred site goes down), here's the code that Tracker1 pointed to:

cat filename | perl -e 'while (<>) { push(@_,$_); } print @_[rand()*@_];'

Anirvan 2009-01-15 19:16:44

This is a useless use of cat. Here's a slight modification of the code found in perlfaq5 (and courtesy of the Camel book):

perl -e 'srand; rand($.) < 1 && ($line = $_) while <>; print $line;' filename

Mr. Muskrat 2009-01-15 21:55:17

surprise! It went down

Nathan Fellman 2009-05-13 06:57:14

err... the linked site, that is

Nathan Fellman 2009-05-22 04:48:48
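For readers unfamiliar with the idiom, the perlfaq5 one-liner in this answer reads the file once and keeps line number n with probability 1/n, which leaves every line equally likely at the end. The following is a minimal Python sketch of that same idea, not the original author's code; the function name random_line and the sample data are made up for illustration.

```python
import random


def random_line(lines, rng=random):
    """Return one uniformly chosen item from an iterable, reading it once.

    Only the current choice is kept in memory, so this also works on
    input that is too large to load, such as a big file or a pipe.
    """
    chosen = None
    for n, line in enumerate(lines, start=1):
        # Same test as Perl's `rand($.) < 1`: true with probability 1/n.
        if rng.random() * n < 1:
            chosen = line
    return chosen


# Hypothetical usage on an in-memory sample; an open file object works the same way.
sample = ["alpha\n", "beta\n", "gamma\n"]
print(random_line(sample), end="")
```

The first line is always taken (probability 1/1), and each later line replaces it with a shrinking probability, so no line count is needed up front.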

Answer 2

+2 A:

Here's a simple Python script that will do the job:

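import random, sys
# Read every line of the file named on the command line into memory.
with open(sys.argv[1]) as f:
    lines = f.readlines()
# Pick one line uniformly; each line keeps its own trailing newline.
print(random.choice(lines), end="")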

Usage:

python randline.py file_to_get_random_line_from

Adam Rosenfield 2009-01-15 19:07:22

Answer 3

+3 A:

using a bash script:

#!/bin/bash
# replace with file to read
FILE=tmp.txt
# count number of lines
NUM=$(wc -l < "${FILE}")
# generate random line number in range 1..NUM
X=$(( RANDOM % NUM + 1 ))
# extract X-th line
sed -n "${X}p" "${FILE}"

Paolo Tedesco 2009-01-15 19:12:25

Random can be 0, sed needs 1 for the first line. sed -n 0p returns error.

asalamon74 2009-01-15 19:20:15

mhm - how about $1 for "tmp.txt" and $2 for NUM ?

blabla999 2009-01-15 19:22:04

But even with the bug it's worth a point: it needs neither perl nor python and is about as efficient as you can get (it reads the file twice but never loads it into memory, so it works even with huge files).

blabla999 2009-01-15 19:28:09

@asalamon74: thanks. @blabla999: if we make a function out of it, OK for $1, but why not compute NUM?

Paolo Tedesco 2009-01-15 19:28:26

Changing the sed line to:

head -${X} ${FILE} | tail -1

should do it

JeffK 2009-01-15 19:34:24

useless use of cat detected, wc happily takes files directly

Hasturkun 2009-01-15 21:00:37

@Hasturkun: beware - the output of wc depends on whether it reads stdin or a file name off its command line. Granted, 'wc -l < $FILE' would be OK; using 'wc -l $FILE' (no redirection) would be a bug.

Jonathan Leffler 2009-01-16 08:06:37

Paolo Tedesco 2009-01-16 08:26:11
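To illustrate the point made in the comments about reading the file twice without holding it in memory, here is a rough Python sketch of the same count-then-pick approach. It is only an illustration of the technique, not a replacement for the script above; the function name random_line_two_pass and the throwaway demo file are assumptions.

```python
import itertools
import os
import random
import tempfile


def random_line_two_pass(path):
    """Return a random line from the file at `path`, or None if it is empty."""
    # Pass 1: count lines without storing them (the role of `wc -l < file`).
    with open(path) as f:
        count = sum(1 for _ in f)
    if count == 0:
        return None
    # Zero-based index here; the shell versions add 1 because sed counts from 1.
    index = random.randrange(count)
    # Pass 2: read forward to the chosen line (the role of `sed -n "${X}p"`).
    with open(path) as f:
        return next(itertools.islice(f, index, None))


# Hypothetical usage on a throwaway three-line file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("one\ntwo\nthree\n")
print(random_line_two_pass(tmp.name), end="")
os.remove(tmp.name)
```

Unlike the bash version, random.randrange is not capped at 32767, so this sketch can reach every line of a large file.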

Answer 4

+3 A:

Single bash line:

sed -n $((1+$RANDOM%`wc -l test.txt | cut -f 1 -d ' '`))p test.txt

Slight problem: duplicate filename.

asalamon74 2009-01-15 19:17:59

Slighter problem: running this on /usr/share/dict/words tends to favor words starting with "A". Playing with it, I get about 90% "A" words to 10% "B" words. None starting with numbers yet, even though those make up the head of the file.

bibby 2010-09-30 05:01:32
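A likely explanation for the skew described in the comment above is that bash's $RANDOM only yields integers from 0 to 32767, so 1 + RANDOM % NUM can never exceed 32768, and lines past that point in a longer file are never chosen. The Python sketch below works through that arithmetic; the function names and the 100,000-line and 20,000-line figures are hypothetical, chosen only to show the effect.

```python
RANDOM_MAX = 32767  # bash's $RANDOM yields integers from 0 to 32767


def reachable_lines(num_lines, rand_max=RANDOM_MAX):
    """Count the distinct line numbers that 1 + (r % num_lines) can produce."""
    return min(num_lines, rand_max + 1)


def pick_probability(line_no, num_lines, rand_max=RANDOM_MAX):
    """Probability that 1 + (r % num_lines) == line_no for uniform r in 0..rand_max."""
    total = rand_max + 1
    target = line_no - 1
    if not 0 <= target < num_lines:
        return 0.0
    # Count the values r in [0, total) with r % num_lines == target.
    hits = len(range(target, total, num_lines))
    return hits / total


# Hypothetical 100,000-line file: only a prefix of the file is reachable.
print(reachable_lines(100_000))
print(pick_probability(40_000, 100_000))

# Hypothetical 20,000-line file: every line is reachable, but early lines
# are favored because 32768 is not a multiple of 20000 (modulo bias).
print(pick_probability(1, 20_000), pick_probability(20_000, 20_000))
```

The same limit applies to the other $RANDOM-based answers in this thread; shuf and the Perl and Python approaches do not share it.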

Answer 5

+12 A:

There is a utility called rl that does exactly what you want. In Debian it's in the randomize-lines package.

Or you can use shuf:

shuf -n 1 "$FILE"

unbeknown 2009-01-15 19:30:38

i really like that shuf approach!

Johannes Schaub - litb 2009-01-15 19:39:06

Answer 6

+2 A:

Another alternative:

head -$((${RANDOM} % `wc -l < file` + 1)) file | tail -1

PolyThinker 2009-01-16 08:54:15
