This is my first Perl script. Ever:

#!/usr/bin/perl

if ($#ARGV < 1) { die("usage: <size_in_bytes> <file_name>\n"); }

open(FILE,">" . $ARGV[1]) or die "Can't open file for writing\n";

# you can control the range of characters here
my $minimum = 32;
my $range = 96;

for ($i=0; $i < $ARGV[0]; $i++) {
    print FILE chr(int(rand($range)) + $minimum);
}

close(FILE);

Its purpose is to generate a file of a specified size, filled with random characters.

It works but it is pretty slow. It takes a few seconds to write a 10MB random file.
Does anyone have suggestions/tips on how to make it faster/better? Also feel free to point out common newbie mistakes.

+3  A: 

Stream the data from /dev/random.

#!/usr/bin/perl
if ($#ARGV < 0) { die("usage: <size_in_bytes>\n"); }
# File::Copy's copy() can't limit the number of bytes (its third argument
# is a buffer size), so read the requested amount from /dev/random directly.
open(my $in,  "<", "/dev/random") or die "Can't open /dev/random: $!";
open(my $out, ">", "tmp")         or die "Can't open tmp: $!";
read($in, my $buf, $ARGV[0]) or die "Read failed: $!";
print $out $buf;

code not tested.

Edit: Since you want a range, do this.

Your range starts at 32 ($minimum) and is 96 characters wide. The bitwise trick needs a power-of-two width, and 96 isn't one, but 64 is (64 = 01000000b, or 0x40 in hex). So, for a 64-character range, simply take each random byte, perform a bitwise AND against the width minus one (00111111b, or 0x3F), and then add the lower bound (00100000b, or 0x20).

This lets you take any random data (just read raw bytes from /dev/random) and transform it so it falls within your range.
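
A minimal sketch of that mapping (the map_to_range name is mine, and it assumes the 64-character range 0x20..0x5F):

#!/usr/bin/perl
use strict;
use warnings;

# Sketch only: squeeze arbitrary raw bytes into the 64 characters
# 0x20..0x5F. AND with 0x3F (range width minus one) keeps the low six
# bits; adding 0x20 supplies the lower bound.
sub map_to_range {
    my ($raw) = @_;
    return join '', map { chr((ord($_) & 0x3F) + 0x20) } split //, $raw;
}

# Example: pull 16 raw bytes from /dev/random and print the mapped string.
open(my $in, "<", "/dev/random") or die "Can't open /dev/random: $!";
read($in, my $raw, 16) or die "Read failed: $!";
print map_to_range($raw), "\n";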

You mean /dev/random, probably. Note that it's available on unix-like systems only.
Karel Bílek
Yes, thank you for spotting that. I'm betting it's on unix, based on where perl is located (/usr/bin).
This doesn't allow me to control which characters get written or not.
quantumSoup
Oh, I hadn't noticed that. But you might still want to know how portable your code is :)
Karel Bílek
Related SO question on a /dev/random equivalent for Windows: http://stackoverflow.com/questions/191335/windows-equivalent-of-dev-random
@quantumSoup Then yours works fine; just build up the string and print it every 4 KB instead of every byte.
@user257493: `/usr/bin/perl` is the recommended shebang path for Perl scripts, regardless of platform. On Windows, for example, it's ignored, though still parsed for switches.
Jon Purdy
+1  A: 

If you need random numbers from a range, I'm not aware of a more efficient way. Here is your script, adjusted to my liking:

#!/usr/bin/perl

use warnings;
use strict;

die("usage: $0 <size_in_bytes> <file_name>\n") unless @ARGV == 2;

my ($num_bytes, $fname) = @ARGV;

open(FILE, ">", $fname) or die "Can't open $fname for writing ($!)";

my $minimum = 32;
my $range = 96;

for (1 .. $num_bytes) {
    print FILE pack( "c", int(rand($range)) + $minimum);
}

close(FILE);

I use pack("c") when I really need binary. chr() might be fine too, but IIRC it actually depends on what character encoding your environment is using (think ASCII vs. UTF-8).

BTW, if you really need a binary file, then for Windows compatibility you might want to add binmode FILE; after the open.
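
As a quick illustration of both points (the file name here is just a placeholder):

open(FILE, ">", "random.bin") or die "Can't open random.bin: $!";
binmode FILE;               # raw bytes: no CRLF translation, no encoding layer
print FILE pack("c", 200);  # always emits the single byte 0xC8
print FILE chr(200);        # what gets written can depend on the handle's I/O layers
close(FILE);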

Otherwise, if the range is optional, you can simply run dd if=/dev/random of=$filename bs=1 count=$size_of_the_output (or, on Linux, use the faster, crypto-unsafe /dev/urandom). But that would be much slower, as /dev/random really tries to deliver truly random bits - only as they become available. And if there are not enough of them (e.g. your platform has no H/W RNG), performance will really suffer compared to libc's blazingly fast pseudo-random number generator (which Perl uses internally to implement rand()).

Dummy00001
+3  A: 
  1. You could get more than one random character out of each call to rand.
  2. Collect several characters together before calling print. Printing one character at a time is inefficient.

 

for (my $bytes = 0; $bytes < $num_bytes; $bytes += 4) {
    # one rand() call yields four base-$range "digits"
    my $rand = int(rand($range ** 4));
    my $string = '';
    for (1..4) {
        $string .= chr($rand % $range + $minimum);
        $rand = int($rand / $range);
    }
    # trim the final chunk so the file is exactly $num_bytes long
    $string = substr($string, 0, $num_bytes - $bytes) if $bytes + 4 > $num_bytes;
    print FILE $string;
}
mobrule
mobrule
I *think* perl works best with 4k chunks sent to the stream. Related SO question on changing the stream buffer size: http://stackoverflow.com/questions/1251062/how-can-i-set-the-file-read-buffer-size-in-perl-to-optimize-it-for-large-files
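
A rough sketch of that buffering idea (the 4096-byte chunk size and the variable names are my own assumptions; $minimum and $range follow the question):

#!/usr/bin/perl
use strict;
use warnings;

die("usage: $0 <size_in_bytes> <file_name>\n") unless @ARGV == 2;

my ($num_bytes, $fname) = @ARGV;
my ($minimum, $range)   = (32, 96);   # same character range as the question
my $chunk_size          = 4096;       # assumed buffer size, per the comment above

open(my $fh, ">", $fname) or die "Can't open $fname for writing ($!)";

# Build up to 4 KB of random characters in memory, then print them in one call.
my $written = 0;
while ($written < $num_bytes) {
    my $n = $num_bytes - $written;
    $n = $chunk_size if $n > $chunk_size;
    print $fh join '', map { chr(int(rand($range)) + $minimum) } 1 .. $n;
    $written += $n;
}

close($fh);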