views:

203

answers:

4

Is it a linked list, an array? I searched around and only found people guessing. My C knowledge isn't good enough to look at the source code.

+5  A: 

This is implementation dependent, but IIRC:

  • CPython uses an array of pointers
  • Jython uses an ArrayList
  • IronPython apparently also uses an array. You can browse the source code to find out.

Thus they all have O(1) random access.

NullUserException
Implementation dependent as in a python interpreter which implemented lists as linked lists would be a valid implementation of the python language? In other words: O(1) random access into lists is not guaranteed? Doesn't that make it impossible to write efficient code without relying on implementation details?
sepp2k
@sepp I believe lists in Python are just ordered collections; the implementation and/or performance requirements of said implementation are not explicitly stated
NullUserException
@sppe2k: Since Python doesn't really have a standard or formal specification (though there are some pieces of documentations that say "... is guaranteed to ..."), you can't be 100% sure as in "this is guaranteed by some piece of paper". But since `O(1)` for list indexing is a pretty common and valid assumption, no implementation would dare to break it.
delnan
@Paul It says nothing about how the underlying implementation of lists should be done.
NullUserException
It just doesn't happen to specify the big O running time of things. Language syntax specification doesn't necessarily mean the same thing as implementation details, it just happens to often be the case.
Paul McMillan
+6  A: 

In CPython, lists are arrays of pointers. Other implementations of Python may choose to store them in different ways.

Amber
+3  A: 

It's an array. Practical proof: Indexing takes (of course with extremely small differences (0.0013 µsecs!)) the same time regardless of index:

...>python -m timeit --setup="x = [None]*1000" "x[500]"
10000000 loops, best of 3: 0.0579 usec per loop

...>python -m timeit --setup="x = [None]*1000" "x[0]"
10000000 loops, best of 3: 0.0566 usec per loop

I would be astounded if IronPython or Jython used linked lists - they would ruin the performance of many many widely-used libraries built on the assumption that lists are arrays.

delnan
Nice proof of concept. My results for `x[0]` and `x[500]` weren't exactly equal, but they were very close (close enough to be rounding error or just differences in CPU usage). By the way, my CPU is better than yours :)
Rafe Kettler
@Ralf: I know my CPU (most other hardware too, for that matter) is old and dog slow - on the bright side, I can assume that code that runs fast enough for me is fast enough for all users :D
delnan
@delnan: Python almost always runs fast enough for all users
Rafe Kettler
@delnan: -1 Your "practical proof" is a nonsense, as are the 6 upvotes. About 98% of the time is taken up doing `x=[None]*1000`, leaving the measurement of any possible list access difference rather imprecise. You need to separate out the initialisation: `-s "x=[None]*100" "x[0]"`
John Machin
@John: Aaah yes, good catch... fixed and edited it (also now using len-1 (999) as first index).
delnan
@delnan: A smarter linked-list implementation of anything might keep pointers to both the first element and the last element. 999 is less "proof" than 500.
John Machin
Shows that it's not a naive implementation of a linked list. Doesn't definitively show that it's an array.
Michael Mior
A: 

The C code is pretty simple, actually. Expanding one macro and pruning some irrelevant comments, the basic structure is in listobject.h, which defines a list as:

typedef struct {
    PyObject_HEAD
    Py_ssize_t ob_size;

    /* Vector of pointers to list elements.  list[0] is ob_item[0], etc. */
    PyObject **ob_item;

    /* ob_item contains space for 'allocated' elements.  The number
     * currently in use is ob_size.
     * Invariants:
     *     0 <= ob_size <= allocated
     *     len(list) == ob_size
     *     ob_item == NULL implies ob_size == allocated == 0
     */
    Py_ssize_t allocated;
} PyListObject;

PyObject_HEAD contains a reference count and a type identifier. So, it's a vector/array with pre-allocation.

larsmans