views:

5607

answers:

8

If you are creating a 1D array, you can implement it as a list, or else use the array module in the standard library. I have always used lists for 1D arrays.

What is the reason or circumstance where I would want to use the array module instead?

Is it for performance and memory optimization, or am I missing something obvious?

+3  A: 

My understanding is that arrays are stored more efficiently (i.e. as contiguous blocks of memory vs. pointers to Python objects), but I am not aware of any performance benefit. Additionally, with arrays you must store primitives of the same type, whereas lists can store anything.
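A rough way to see the storage difference is to compare sizes with sys.getsizeof. This is only a sketch, assuming CPython: getsizeof on a list counts the pointer table but not the int objects it points to, while for an array it includes the raw buffer. The names n, as_list and as_array are just for illustration.

```python
import sys
from array import array

n = 100_000
as_list = list(range(n))
as_array = array('i', range(n))   # 'i' = signed C int

# The list holds n pointers; the int objects themselves live elsewhere.
list_container_bytes = sys.getsizeof(as_list)
# The array holds the raw C values inline in one contiguous buffer.
array_bytes = sys.getsizeof(as_array)
payload_bytes = as_array.itemsize * len(as_array)

print(list_container_bytes, array_bytes, payload_bytes)

# A list accepts any object; the array rejects anything that is not an int.
as_list.append("text")
rejected = False
try:
    as_array.append("text")
except TypeError:
    rejected = True
print("array rejected a str:", rejected)
```

The exact numbers depend on the platform and Python build, so treat them as indicative only.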

Ben Hoffstein
+14  A: 

For almost all cases the normal list is the right choice. The array module is more like a thin wrapper over C arrays, which gives you strongly typed containers (see docs), with access to more C-like types such as signed/unsigned short or double, which are not part of the built-in types. I'd say use the array module only if you really need it; in all other cases stick with lists.
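To illustrate the "strongly typed" part, here is a small sketch using three of the module's type codes ('h', 'H' and 'd'). The variable names are made up for the example, and the value ranges assume the usual 2-byte short.

```python
from array import array

shorts = array('h', [-32768, 0, 32767])   # signed short
ushorts = array('H', [0, 65535])          # unsigned short
doubles = array('d', [1.5, 2.25])         # C double

# Values outside the element type's range are refused rather than stored.
try:
    ushorts.append(-1)
    overflowed = False
except OverflowError:
    overflowed = True

# Ints are converted to the element type on the way in.
doubles.append(3)

print(shorts.itemsize, ushorts.itemsize, doubles.itemsize)
print(overflowed, doubles[-1])
```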

André
Should be faster as well.
Mikael Jansson
Possibly. I've never really used it, though; it would be interesting to run some micro-benchmarks.
André
Actually, I did a quick test - I timed summing a list with 100M entries and the same test with the corresponding array, and the list was actually about 10% faster.
Moe
@Moe My guess is that lists are more optimized because they are more widely used than the array module, but that's just a guess of course.
André
Lists are faster, because operations on an array's "raw" data need to continuously create and destroy Python objects when reading from or writing to the array.
ΤΖΩΤΖΙΟΥ
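For anyone who wants to repeat the timing test from the comments, a minimal sketch with timeit might look like the following. It uses 1M entries instead of 100M to keep it quick, and the 'q' type code (signed 64-bit) is my choice, not necessarily what was used above. Results will vary by machine and Python version, so no expected numbers are given.

```python
import timeit
from array import array

n = 1_000_000
as_list = list(range(n))
as_array = array('q', range(n))   # 'q' = signed 64-bit integer

# sum() over the array has to build a Python int for every element it reads.
list_time = timeit.timeit(lambda: sum(as_list), number=5)
array_time = timeit.timeit(lambda: sum(as_array), number=5)

print(f"list:  {list_time:.3f}s")
print(f"array: {array_time:.3f}s")
```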
+1  A: 

An array can only be used for specific types, whereas lists can be used for any object.

Arrays can also only hold data of one type, whereas a list can have entries of various object types.

Arrays are also more efficient for some numerical computation.

Hortitude
The built-in Python arrays are not efficient performance-wise, only memory-wise.
ΤΖΩΤΖΙΟΥ
There ARE instances where arrays are more efficient in terms of processing. See my post below: http://stackoverflow.com/questions/176011/python-list-vs-array-when-to-use#178590
Jason Baker
A: 

If you're going to be using arrays, consider the numpy or scipy packages, which give you arrays with a lot more flexibility.

fivebells
+25  A: 

Basically, Python lists are very flexible and can hold completely heterogeneous, arbitrary data, and they can be appended to very efficiently, in amortized constant time. If you need to shrink and grow your array time-efficiently and without hassle, they are the way to go. But they use a lot more space than C arrays.

The array.array type, on the other hand, is just a thin wrapper on C arrays. It can hold only homogeneous data, all of the same type, and so it uses only sizeof(one object) * length bytes of memory. Mostly, you should use it when you need to expose a C array to an extension or a system call (for example, ioctl or fcntl). It's also a good way to represent a mutable string (array('B', bytes)) until that actually becomes available in Python 3.0.

However, if you want to do math on a homogeneous array of numeric data, then you're much better off using NumPy, which can automatically vectorize operations on complex multi-dimensional arrays.

To make a long story short: array.array is useful when you need a homogeneous C array of data for reasons other than doing math.
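As a small sketch of the mutable-string and "expose a C array" uses mentioned above, written for Python 3 (where bytearray also covers the mutable-bytes case): buffer_info() returns the address and element count of the underlying buffer, which is what C-level code would need. The variable names are only illustrative.

```python
from array import array

buf = array('B', b'hello')   # 'B' = unsigned char, one byte per element
buf[0] = ord('j')            # mutate in place, which bytes does not allow
as_bytes = buf.tobytes()
print(as_bytes)

# Address and length of the raw buffer, as a C API would want them.
address, length = buf.buffer_info()
size_in_bytes = length * buf.itemsize
print(hex(address), size_in_bytes)
```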

Dan
A: 

The standard library arrays are useful for binary I/O, such as translating a list of ints to a string to write to, say, a wave file. That said, as many have already noted, if you're going to do any real work then you should consider using NumPy.
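Here is a minimal sketch of that wave-file case, writing to an in-memory buffer instead of a real file. The sample values and the 8000 Hz rate are arbitrary, and it assumes 'h' is a 2-byte short on your platform (WAV data is little-endian, hence the byteswap on big-endian hosts).

```python
import io
import sys
import wave
from array import array

samples = array('h', [0, 1000, -1000, 32767, -32768])  # 16-bit signed PCM

# WAV stores samples little-endian; swap a copy if the host is big-endian.
pcm = array('h', samples)
if sys.byteorder == 'big':
    pcm.byteswap()

out = io.BytesIO()
with wave.open(out, 'wb') as wav:
    wav.setnchannels(1)
    wav.setsampwidth(pcm.itemsize)
    wav.setframerate(8000)
    wav.writeframes(pcm.tobytes())   # the array's raw bytes, no per-item packing

# Read the frames back into an array to check the round trip.
out.seek(0)
with wave.open(out, 'rb') as wav:
    frame_count = wav.getnframes()
    decoded = array('h')
    decoded.frombytes(wav.readframes(frame_count))
if sys.byteorder == 'big':
    decoded.byteswap()

print(decoded.tolist())
```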

giltay
+2  A: 

The array module is kind of one of those things that you probably don't have a need for if you don't know why you would use it (and take note that I'm not trying to say that in a condescending manner!). Most of the time, the array module is used to interface with C code. To give you a more direct answer to your question about performance:

Arrays are more efficient than lists for some uses. If you need to allocate an array that you KNOW will not change, then arrays can be faster and use less memory. GvR has an optimization anecdote in which the array module comes out to be the winner (long read, but worth it).

On the other hand, part of the reason why lists eat up more memory than arrays is that Python will allocate a few extra elements when all allocated elements get used. This means that appending items to lists is faster. So if you plan on adding items, a list is the way to go.
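You can watch that over-allocation happen by checking sys.getsizeof after each append. This is just a sketch, assuming CPython; the exact growth pattern is an implementation detail and differs between versions, so I'm not listing expected output.

```python
import sys

items = []
last_size = sys.getsizeof(items)
resize_points = []   # list lengths at which the allocation actually grew

for i in range(64):
    items.append(i)
    size = sys.getsizeof(items)
    if size != last_size:
        resize_points.append(len(items))
        last_size = size

# Far fewer reallocations than appends: most appends reuse spare capacity.
print(resize_points)
```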

TL;DR I'd only use an array if you had an exceptional optimization need or you need to interface with C code (and can't use pyrex).

Jason Baker
A: 

Hello everybody... I have a related question that may clarify the discussion, this time around a very exceptional use: I'm using NumPy to set up the following array, which will be populated later:

LVArray = zeros([N, newsize]).

My variable newsize grows exponentially (newsize = (u*6)(u**3)), where u is an integration delta in an MCMC algorithm, whereas N is around 360, at most 1000. At a very early stage, there is not enough memory to even define this NumPy zeros array.

I'm looking for a better and more efficient way of storing my data now that I cannot use my array (or of generating it, if you have any idea for creating a Python generator for this use)... Would a list do this? Would an array.array handle this? Please discuss time consumption, but I certainly care more about memory capacity... And can SciPy do better, and how? Need ideas please, you are the best for that...

Thanks a lot for carrying a question in place of another answer here!

Angelica Echavarria

Angelica
This is not a discussion forum. You need to ask a question by clicking "Ask Question" at the top right of this page and delete this non-answer.
SilentGhost