I was going through the exercises at http://rubykoans.com/ and was struck by the following Ruby quirk, which I can't explain:

array = [:peanut, :butter, :and, :jelly]

array[0]      # => :peanut             OK!
array[0,1]    # => [:peanut]           OK!
array[0,2]    # => [:peanut, :butter]  OK!
array[0,0]    # => []                  OK!
array[2]      # => :and                OK!
array[2,2]    # => [:and, :jelly]      OK!
array[2,20]   # => [:and, :jelly]      OK!
array[4]      # => nil                 OK!
array[4,0]    # => []                  HUH??  Why's that?
array[4,100]  # => []                  Still HUH, but consistent with the previous one
array[5]      # => nil                 Consistent with array[4] => nil
array[5,0]    # => nil                 WOW.  Now I don't understand anything anymore...

# So why is array[5,0] not equal to array[4,0]?

Is there any reason why array slicing behaves this weirdly when you start at the (n+1)th position?

+4  A: 

Makes sense to me. Think of the first number in a slice as identifying not an element but a place between elements, so that it can define spans (and not the elements themselves):

  :peanut   :butter   :and   :jelly
0         1         2      3        4

so, 4 is still within the array, just barely; if you request 0 elements, you get the empty end of the array. But there is no index 5, so you can't slice from there.

When you don't slice, but index (like array[4]), you are actually pointing at elements themselves, so the indices only go from 0 to 3.

Basically what I'm saying is that slicing and indexing are two different operations, and inferring the behaviour of one from the other is where your problem lies.
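
A minimal sketch of the "places between elements" model described above, using the array from the question. The variable names (slice_starts, gap_results, past_end) are only illustrative and not anything from Ruby itself:

```ruby
array = [:peanut, :butter, :and, :jelly]

# Slice starts are the gaps between elements: 0..array.size, five positions.
slice_starts = (0..array.size).to_a

# Element indices point at the elements themselves: 0...array.size, four positions.
element_indices = (0...array.size).to_a

# A zero-length slice is an empty array at every gap, including the one at the end.
gap_results = slice_starts.map { |i| array[i, 0] }

# One step past the last gap there is nothing left to slice from.
past_end = array[array.size + 1, 0]

p gap_results  # [[], [], [], [], []]
p past_end     # nil
```

Indexing with array[4] still gives nil, because 4 is a valid gap but not a valid element index.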

Amadan
A: 

I agree that this seems like strange behavior, but even the official documentation on Array#slice demonstrates the same behavior as in your example, in the "special cases" below:

   a = [ "a", "b", "c", "d", "e" ]
   a[2] +  a[0] + a[1]    #=> "cab"
   a[6]                   #=> nil
   a[1, 2]                #=> [ "b", "c" ]
   a[1..3]                #=> [ "b", "c", "d" ]
   a[4..7]                #=> [ "e" ]
   a[6..10]               #=> nil
   a[-3, 3]               #=> [ "c", "d", "e" ]
   # special cases
   a[5]                   #=> nil
   a[5, 1]                #=> []
   a[5..10]               #=> []

Unfortunately, even their description of Array#slice doesn't seem to offer any insight as to why it works this way:

Element Reference—Returns the element at index, or returns a subarray starting at start and continuing for length elements, or returns a subarray specified by range. Negative indices count backward from the end of the array (-1 is the last element). Returns nil if the index (or starting index) are out of range.

Mark Rushakoff
+1  A: 

At least note that the behavior is consistent. From 5 on up everything acts the same; the weirdness only occurs at [4,N].

Maybe this pattern helps, or maybe I'm just tired and it doesn't help at all.

array[0,4]  # => [:peanut, :butter, :and, :jelly]
array[1,3]  # => [:butter, :and, :jelly]
array[2,2]  # => [:and, :jelly]
array[3,1]  # => [:jelly]
array[4,0]  # => []

At [4,0], we catch the end of the array. I'd actually find it rather odd, as far as beauty in patterns goes, if the last one returned nil. Because of a context like this, 4 is an acceptable option for the first parameter so that the empty array can be returned. Once we hit 5 and up, though, the method likely exits immediately by nature of being totally and completely out of bounds.
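
The pattern can be generated rather than typed out, which may make the role of the end position easier to see. This is only a sketch; tails and rejoined are names made up for the example:

```ruby
array = [:peanut, :butter, :and, :jelly]

# For each start i from 0 up to array.size, ask for exactly the remaining elements.
tails = (0..array.size).map { |i| array[i, array.size - i] }

tails.each_with_index do |tail, i|
  puts "array[#{i},#{array.size - i}] => #{tail.inspect}"
end

# Splitting at any position i and gluing the halves back together restores
# the array. At i == array.size this relies on array[4,0] being [] and not nil.
rejoined = (0..array.size).map { |i| array[0, i] + array[i, array.size - i] }
```

If array[4,0] returned nil, the last split would raise a TypeError on the + instead of giving back the original array.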

Matchu
A: 

This has to do with the fact that slice returns an array. Relevant source documentation from Array#slice:

 *  call-seq:
 *     array[index]                -> obj      or nil
 *     array[start, length]        -> an_array or nil
 *     array[range]                -> an_array or nil
 *     array.slice(index)          -> obj      or nil
 *     array.slice(start, length)  -> an_array or nil
 *     array.slice(range)          -> an_array or nil

which suggests to me that if you give a start that is out of bounds, it will return nil. Thus, in your example, array[4,0] asks for the 4th element, which exists, but asks to return an array of zero elements, while array[5,0] asks for an index out of bounds, so it returns nil. This perhaps makes more sense if you remember that the slice method is returning a new array, not altering the original data structure.

EDIT:

After reviewing the comments I decided to edit this answer. slice runs the following code when it is called with two arguments:

if (argc == 2) {
    if (SYMBOL_P(argv[0])) {
        rb_raise(rb_eTypeError, "Symbol as array index");
    }
    beg = NUM2LONG(argv[0]);
    len = NUM2LONG(argv[1]);
    if (beg < 0) {
        beg += RARRAY(ary)->len;
    }
    return rb_ary_subseq(ary, beg, len);
}

If you look in array.c, where the rb_ary_subseq function is defined, you see that it returns nil only if the start is greater than the array's length, not merely if it is past the last index:

if (beg > RARRAY_LEN(ary)) return Qnil;

In this case, this is what happens when 4 is passed in: it checks that there are 4 elements and thus does not trigger the nil return. It then goes on and returns an empty array if the second arg is set to zero. If 5 is passed in, however, there are not 5 elements in the array, so it returns nil before the zero arg is evaluated. Code here at line 944.
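
To make the check easier to experiment with, here is a rough Ruby rendering of that bounds logic. It is a sketch under assumptions, not the real implementation: subseq is a made-up name, and only the beg > length comparison comes from the C quoted above. The negative-argument guard and the clamping of the length are my assumptions, chosen to match what Array#[] is observed to return.

```ruby
# Hypothetical helper that mimics array[start, length]; not a real Ruby method.
def subseq(ary, beg, len)
  beg += ary.size if beg < 0                    # negative start counts from the end
  return nil if beg > ary.size                  # quoted check: strictly greater than length
  return nil if beg < 0 || len < 0              # assumption: still-negative start or negative length
  len = ary.size - beg if beg + len > ary.size  # assumption: clamp to the elements that remain
  (beg...(beg + len)).map { |i| ary[i] }        # empty range when len is 0, so []
end

array = [:peanut, :butter, :and, :jelly]

p subseq(array, 4, 0)  # [], because 4 > 4 is false
p subseq(array, 5, 0)  # nil, because 5 > 4 is true
```

The whole difference between the two calls is the > in the first guard; had it been >=, both would return nil.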

I believe this to be a bug, or at least unpredictable behavior that violates the 'Principle of Least Surprise'. When I get a few minutes I will at least submit a failing test patch to ruby core.

Jed Schneider
But... the element indicated by the 4 in array[4,0] doesn't exist either, because it is actually the 5th element (0-based counting, see the examples). So it is out of bounds as well.
Pascal Van Hecke
You're right. I went back and looked at the source, and it looks like the first argument is compared against the length inside the C code, not against the last index. I will edit my answer to reflect this. I think this could be submitted as a bug.
Jed Schneider