views:

359

answers:

3

What is the typical underlying data structure used to implement Python's built-in list data type?

+11  A: 

CPython:

typedef struct {
    PyObject_VAR_HEAD
    /* Vector of pointers to list elements.  list[0] is ob_item[0], etc. */
    PyObject **ob_item;

    /* ob_item contains space for 'allocated' elements.  The number
     * currently in use is ob_size.
     * Invariants:
     *     0 <= ob_size <= allocated
     *     len(list) == ob_size
     *     ob_item == NULL implies ob_size == allocated == 0
     * list.sort() temporarily sets allocated to -1 to detect mutations.
     *
     * Items must normally not be NULL, except during construction when
     * the list is not yet visible outside the function that builds it.
     */
    Py_ssize_t allocated;
} PyListObject;

As can be seen on the following line, the list is declared as an array of pointers to PyObjects.

PyObject **ob_item;
Georg
+5  A: 

In the Jython implementation, it's an ArrayList<PyObject>.

nategood
+10  A: 

List objects are implemented as arrays. They are optimized for fast fixed-length operations and incur O(n) memory movement costs for pop(0) and insert(0, v) operations which change both the size and position of the underlying data representation.

See also: http://docs.python.org/library/collections.html#collections.deque

Btw, I find it interesting that the Python tutorial on data structures recommends using pop(0) to simulate a queue but does not mention O(n) or the deque option.

http://docs.python.org/tutorial/datastructures.html#using-lists-as-queues

e1i45
Very good point about the tutorial! That should be fixed.
Brandon Craig Rhodes
The tutorial existed long long before the deque module, that's why.Report it to bugs.python.org , if possible with a patch for a correct sentence and the tutorial will no longer gives incorrect hints.
Bluebird75