How can I easily (quick and dirty) change, say, 10 random lines of a file with a simple shell script?

I thought about abusing ed by generating random commands and line ranges, but I'd like to know if there is a better way.

+1  A: 
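awk 'BEGIN{srand()}
{ lines[++c]=$0 }               # slurp the whole file into memory
END{
  n = (c < 10) ? c : 10         # never ask for more lines than the file has
  while(d<n){
   RANDOM = int(1 + rand() * c) # random line number in 1..c
   if( !( RANDOM in r)  ) {     # skip line numbers already picked
     r[RANDOM]
     print "do something with " lines[RANDOM]
     ++d
   }
  }
}' file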

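Or, if you have the shuf command:

shuf -n 10 "$file" | while IFS= read -r line
do
  # note: $line is used as a sed regex, so lines containing / or regex metacharacters need escaping
  sed -i "s/$line/replacement/" "$file"
done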
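
The loop above matches on line content, so it can misfire when a picked line contains a slash or appears more than once. A minimal sketch of a variant that picks line numbers instead, assuming GNU shuf (for `-i`) and GNU sed (for `-i`); the demo file, the `script` temp file, and the text "replacement" are placeholders, not part of the original answer:

```shell
# Demo input: 100 numbered lines in a temp file (placeholder data).
file=$(mktemp)
script=$(mktemp)
seq 1 100 > "$file"

total=$(wc -l < "$file")

# shuf -i LO-HI -n 10 prints 10 distinct integers from the range.
# The inner sed turns each number N into the sed command "Ns/.*/replacement/".
shuf -i "1-$total" -n 10 | sed 's|$|s/.*/replacement/|' > "$script"

# Apply all 10 substitutions in a single pass over the file.
sed -i -f "$script" "$file"

changed=$(grep -c '^replacement$' "$file")
echo "changed lines: $changed"
```

Because the line numbers are distinct, each command hits a different line, and the file is rewritten once rather than ten times.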
ghostdog74
You probably want to do this: `BEGIN { srand() }`
Dennis Williamson
+2  A: 

This seems to be quite a bit faster (note that asort requires gawk):

file=/your/input/file
c=$(wc -l < "$file")
awk -v c=$c 'BEGIN {
                    srand();
                    for (i=0;i<10;i++) lines[i] = int(1 + rand() * c);
                    asort(lines);
                    p = 1
             }
             {
                 if (NR == lines[p]) {
                     ++p
                     print "do something with " $0
                 }
                 else print 
             }' "$file"
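
A possible variation on the single-pass idea, sketched under the assumption that the picked line numbers should be distinct: collect them in an awk array used as a set and test `NR in pick`, which also avoids the gawk-only asort. The function name `pick_lines`, the "CHANGED " marker, and the demo file are hypothetical:

```shell
# Hypothetical helper: $1 = file, $2 = number of lines to modify.
pick_lines() {
  awk -v c="$(wc -l < "$1")" -v n="$2" '
    BEGIN {
      srand()
      if (n > c) n = c                      # cannot pick more lines than exist
      while (d < n) {
        k = int(1 + rand() * c)             # random line number in 1..c
        if (!(k in pick)) { pick[k]; ++d }  # keep only unseen line numbers
      }
    }
    NR in pick { print "CHANGED " $0; next }
    { print }
  ' "$1"
}

# Demo: an 11-line file should always end up with 10 modified lines.
demo=$(mktemp)
seq 1 11 > "$demo"
changed=$(pick_lines "$demo" 10 | grep -c '^CHANGED ')
echo "changed lines: $changed"
```

The rejection loop gets slower as the requested count approaches the file length, so this is only a sketch for small counts.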


Dennis Williamson
Can you explain why the shuf command doesn't keep order? shuf just pulls out 10 random lines, and sed is then used to change each of them. It's not efficient, though, I'll admit.
ghostdog74
@ghostdog74: I'm not sure I understand. `shuf` is short for "shuffle" and that's what it does.
Dennis Williamson
Yes, but shuf -n 10 gets 10 random lines out of the file.
ghostdog74
@ghostdog74: I'm sorry, I overlooked a tiny little `-i` which caused me to misunderstand what yours is doing. My apologies.
Dennis Williamson
This solution assumes lines[] will be unique. Run the above code against a file with 11 lines. You will rarely have 10 lines modified; usually there will be fewer.
themis
@themis: That's why I commented and upvoted your answer.
Dennis Williamson
+1  A: 

Playing off @Dennis' version, this one always modifies exactly 10 lines (or every line, if the file has fewer than 10). Drawing the random line numbers into a separate array can produce duplicates and, consequently, fewer than 10 modifications.

file=~/testfile
c=$(wc -l < "$file")
awk -v c=$c '
    BEGIN {
        srand();
        count = 10;
    }

    {
        if (c*rand() < count) {
            --count;
            print "do something with " $0;
        } else
            print;
        --c;
    }
' "$file"
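
To check the "always 10" claim, here is a small sketch that wraps the same test, `c*rand() < count`, in a function and counts the modified lines on an 11-line file. The function name `sample`, the "CHANGED " marker, and the temp file are hypothetical additions for the demo:

```shell
# Hypothetical helper: $1 = file, $2 = number of lines to modify.
sample() {
  awk -v c="$(wc -l < "$1")" -v count="$2" '
    BEGIN { srand() }
    {
      # c = lines not yet seen, count = modifications still owed.
      # When c == count the test is always true; when count == 0 it is always false.
      if (c * rand() < count) { --count; print "CHANGED " $0 }
      else print
      --c
    }
  ' "$1"
}

tmp=$(mktemp)
seq 1 11 > "$tmp"
changed=$(sample "$tmp" 10 | grep -c '^CHANGED ')
echo "changed lines: $changed"
```

Each line is chosen with probability count/c over the lines that remain, so the quota is always met by the end of the file and never exceeded.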
themis
Brilliant idea.
Dennis Williamson