What's a good algorithm to shuffle the lines in a file without reading the whole file in advance?
I guess it would look something like this: Start reading the file line by line from the beginning, storing the line at each point and decide if you want to print one of the lines stored so far (and then remove from storage) or do nothing and proceed to the next line.
Can someone verify / prove this and/or maybe post working (perl, python, etc.) code?
Related questions, but not looking at memory-efficient algorithms:
- http://stackoverflow.com/questions/2153882/how-can-i-shuffle-the-lines-of-a-text-file-in-unix-command-line
- http://stackoverflow.com/questions/886237/how-can-i-randomize-the-lines-in-a-file-using-a-standard-tools-on-redhat-linux
- http://stackoverflow.com/questions/286640/how-can-i-print-the-lines-in-stdin-in-random-order-in-perl