Can anyone recommend a good concise reference for the Python slice notation? I'm a seasoned programmer but new to Python and this notation needs a bit of picking up. It looks extremely powerful, but I haven't quite got my head round it and am looking for a good guide.
Do you mean http://www.python.org/doc/2.5.2/ref/slicings.html#tok-slicing ?
Or http://docs.python.org/reference/expressions.html#grammar-token-slicing ?
Or http://docs.python.org/3.0/library/functions.html#slice
Or something else?
It's pretty simple really:
a[start:end] # items start through end-1
a[start:] # items start through the rest of the array
a[:end] # items from the beginning through end-1
a[:] # a copy of the whole array
There is also the step
value, which can be used with any of the above:
a[start:end:step] # start through not past end, by step
The key point to remember is that the :end
value represents the first value that is not in the selected slice. So, the difference beween end
and start
is the number of elements selected (if step
is 1, the default).
The other feature is that start
or end
may be a negative number, which means it counts from the end of the array instead of the beginning. So:
a[-1] # last item in the array
a[-2:] # last two items in the array
a[:-2] # everything except the last two items
Python is kind to the programmer if there are fewer items than you ask for. For example, if you ask for a[:-2]
and a
only contains one element, you get an empty list instead of an error. Sometimes you would prefer the error, so you have to be aware that this may happen.
The tutorial talks about it:
http://docs.python.org/tutorial/introduction.html#strings
(Scroll down a bit until you get to the part about slicing.)
The ASCII art diagram is helpful too for remembering how slices work:
+---+---+---+---+---+
| H | e | l | p | A |
+---+---+---+---+---+
0 1 2 3 4 5
-5 -4 -3 -2 -1
"One way to remember how slices work is to think of the indices as pointing between characters, with the left edge of the first character numbered 0."
Enumerating the possibilities allowed by the grammar:
>>> seq[:] # [seq[0], seq[1], ..., seq[-1] ]
>>> seq[low:] # [seq[low], seq[low+1], ..., seq[-1] ]
>>> seq[:high] # [seq[0], seq[1], ..., seq[high-1]]
>>> seq[low:high] # [seq[low], seq[low+1], ..., seq[high-1]]
>>> seq[::stride] # [seq[0], seq[stride], ..., seq[-1] ]
>>> seq[low::stride] # [seq[low], seq[low+stride], ..., seq[-1] ]
>>> seq[:high:stride] # [seq[0], seq[stride], ..., seq[high-1]]
>>> seq[low:high:stride] # [seq[low], seq[low+stride], ..., seq[high-1]]
Of course, if (high-low)%stride != 0
, then the end point will be a little lower than high-1
.
Extended slicing (with commas and ellipses) are mostly used only by special data structures (like Numpy); the basic sequences don't support them.
>>> class slicee:
... def __getitem__(self, item):
... return `item`
...
>>> slicee()[0, 1:2, ::5, ...]
'(0, slice(1, 2, None), slice(None, None, 10), Ellipsis)'
And a couple of things that weren't immediately obvious to me when I first saw the slicing syntax:
>>> x = [1,2,3,4,5,6]
>>> x[::-1]
[6,5,4,3,2,1]
Easy way to reverse sequences!
And if you wanted, for some reason, every second item in the reversed sequence:
>>> x = [1,2,3,4,5,6]
>>> x[::-2]
[6,4,2]
I use the "an index points between elements" method of thinking about it myself, but one way of describing it which sometimes helps others get it is this:
mylist[X:Y]
X is the index of the first element you want.
Y is the index of the first element you don't want.
After using it a bit I realise that the simplest description is that it is exactly the same as the arguments in a for loop...
(from:to:step)
any of them are optional
(:to:step)
(from::step)
(from:to)
then the negative indexing just needs you to add the length of the string to the negative indices to understand it.
This works for me anyway...