tags:

views:

179

answers:

5

I have inherited some code from a guy whose favorite past time was to shorten every line to its absolute minimum (and sometimes only to make it look cool). His code is hard to understand but I managed to understand (and rewrite) most of it.

Now I have stumbled on a piece of code which, no matter how hard I try, I cannot understand.

my @heads = grep {s/\.txt$//} OSA::Fast::IO::Ls->ls($SysKey,'fo','osr/tiparlo',qr{^\d+\.txt$}) || ();
my @selected_heads = ();
for my $i (0..1) {
   $selected_heads[$i] = int rand scalar @heads;
   for my $j (0..@heads-1) {
      last if (!grep $j eq $_, @selected_heads[0..$i-1]);
      $selected_heads[$i] = ($selected_heads[$i] + 1) % @heads; #WTF?
   }
   my $head_nr = sprintf "%04d", $i;
   OSA::Fast::IO::Cp->cp($SysKey,'',"osr/tiparlo/$heads[$selected_heads[$i]].txt","$recdir/heads/$head_nr.txt");
   OSA::Fast::IO::Cp->cp($SysKey,'',"osr/tiparlo/$heads[$selected_heads[$i]].cache","$recdir/heads/$head_nr.cache");
}

From what I can understand, this is supposed to be some kind of randomizer, but I never saw a more complex way to achieve randomness. Or are my assumptions wrong? At least, that's what this code is supposed to do. Select 2 random files and copy them.

=== NOTES ===

The OSA Framework is a Framework of our own. They are named after their UNIX counterparts and do some basic testing so that the application does not need to bother with that.

A: 

Here's yet another way to select 2 unique random indices:

my @selected_heads = ();
my @indices = 0..$#heads;
for my $i (0..1) {
  my $j = int rand (@heads - $i);
  push @selected_heads, $indices[$j];
  $indices[$j] = $indices[@heads - $i - 1];
}
J.F. Sebastian
+3  A: 

It selects a random element from @heads.

Then it adds on another random but different element from @heads (if it is the element previously selected, it scrolls through @heads till it find an element not previously selected).

In summary, it selects N (in your case N=2) different random indexes in @heads array and then copies files corresponding to those indexes.

Personally I would write it a bit differently:

# ...
%selected_previously = ();
foreach my $i (0..$N) { # Generalize for N random files instead of 2
    my $random_head_index = int rand scalar @heads;
    while ($selected_previously[$random_head_index]++) {
        $random_head_index = $random_head_index + 1) % @heads; # Cache me!!!
    }
    # NOTE: "++" in the while() might be considered a bit of a hack
    # More readable version: $selected_previously[$random_head_index]=1; here.
DVK
+1  A: 

The part you labeled "WTF" isn't so troubling, it's just simply making sure that $selected_heads[$i] remains as a valid subscript of @head. The really troubling part is that it is a pretty inefficient way of making sure he's not selecting the same file.

Then again, if the size of @heads is small, stepping from 0..$#heads is probably more efficient than just generating int rand( 2 ) and testing if they are the same.

But basically it copies two files at random (why?) as a '.txt' file and a '.cache' file.

Axeman
+12  A: 

This looks like some C code with Perl syntax. Sometimes knowing the language the person is thinking in helps you figure out what's going on. In this case, the person's brain is infected with the inner workings of memory management, pointer arithmetic, and other low level concerns, so he wants to minutely control everything:

my @selected_heads = ();

# a tricky way to make a two element array
for my $i (0..1) {

   # choose a random file
   $selected_heads[$i] = int rand @heads;

   # for all the files (could use $#heads instead)
   for my $j (0..@heads-1) {
      # stop if the chosen file is not already in @selected_heads
      # it's that damned ! in front of the grep that's mind-warping
      last if (!grep $j eq $_, @selected_heads[0..$i-1]);

      # if we are this far, the two files we selected are the same
      # choose a different file if we're this far
      $selected_heads[$i] = ($selected_heads[$i] + 1) % @heads; #WTF?
   }

...
}

This is a lot of work because the original programmer either doesn't understand hashes or doesn't like them.

my %selected_heads;
until( keys %selected_heads == 2 )
    {
    my $try = int rand @heads;
    redo if exists $selected_heads{$try};
    $selected_heads{$try}++;
    }

my @selected_heads = keys %selected_heads;

If you still hate hashes and have Perl 5.10 or later, you can use smart-matching to check if a value is in an array:

my @selected_heads;
until( @selected_heads == 2 )
    {
    my $try = int rand @heads;
    redo if $try ~~ @selected_heads;
    push @selected_heads, $try;
    }

However, you have a special constraint on this problem. Since you know there are only two elements, you just have to check if the element you want to add is the prior element. In the first case it won't be undef, so the first addition always works. In the second case, it just can't be the last element in the array:

my @selected_heads;
until( @selected_heads == 2 )
    {
    my $try = int rand @heads;
    redo if $try eq $selected_heads[-1];
    push @selected_heads, $try;
    }

Huh. I can't remember the last time I used until when it actually fit the problem. :)

Note that all of these solutions have the problem that they can cause an infinite loop if the number of original files is less than 2. I'd add a guard condition higher up so the no and single file cases through an error and perhaps the two file case doesn't bother to order them.

Another way you might do this is to shuffle (say, with List::Util) the entire list of original files and just take off the first two files:

use List::Util qw(shuffle);

my @input = 'a' .. 'z';

my @two = ( shuffle( @input ) )[0,1];

print "selected: @two\n";
brian d foy
And for Perls below 5.10, `List::Util::first` is a great way of checking if an element is in a list. It has XS bindings so it is faster than writing your own `foreach { ... last if ... }` loop.
Ether
+1 for `redo` :)
friedo
Funny how the rewrite is shorter than the original!
Zaid
@Zaid a good programmer makes concision and readability work together, not against each other. If you come up with the *right* way to say exactly what you mean, then you shouldn't have to say a *lot*.
hobbs
+1  A: 

How about just

for my $i (0..1) {
    my $selected = splice( @heads, rand @heads, 1 );
    my $head_nr = sprintf "%04d", $i;
    OSA::Fast::IO::Cp->cp($SysKey,'',"osr/tiparlo/$selected.txt","$recdir/heads/$head_nr.txt");
    OSA::Fast::IO::Cp->cp($SysKey,'',"osr/tiparlo/$selected.cache","$recdir/heads/$head_nr.cache");
}

unless @heads or @selected_heads are used later.

ysth