I'm working through some Python tutorials and have reached a point where I need to decide which data type or structure to use in a given situation.

I'm not clear on the differences between arrays, lists, dictionaries and tuples.

How do you decide which one is appropriate? My current understanding doesn't let me distinguish between them at all; they seem to be the same thing.

What are the benefits/typical use cases for each one?

A: 

Do you really require speed/efficiency? Then go with a pure and simple dict.

jldupont
+3  A: 

The best type for counting elements like this is usually defaultdict:

from collections import defaultdict

s = 'asdhbaklfbdkabhvsdybvailybvdaklybdfklabhdvhba'
d = defaultdict(int)   # missing keys start at int(), i.e. 0

for c in s:
    d[c] += 1

print(d['a'])   # prints 7
Triptych
Why would a defaultdict be better than an ordinary dict?
Lennart Regebro
defaultdict creates default values for any keys not already in the dict. In the case of int, the default is 0. This saves you from having to detect the first time an item is recorded in the dict.
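
To make the difference concrete, here is a minimal sketch that counts the same characters both ways. The sample string and variable names are made up for illustration.

```python
from collections import defaultdict

text = "banana"  # hypothetical sample input

# Plain dict: the first occurrence of each key needs special handling.
plain = {}
for ch in text:
    if ch not in plain:
        plain[ch] = 0
    plain[ch] += 1

# defaultdict(int): a missing key is created with int(), which is 0.
counts = defaultdict(int)
for ch in text:
    counts[ch] += 1

print(plain == counts)  # the two approaches produce the same counts
print(counts["a"])
```

Both versions end up with the same mapping; the defaultdict version just drops the membership check.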
Donal Boyle
I've modified the question quite substantially, so for new readers this answer might seem out of context. It was a good one for the original question though!
Rich Bradshaw
+6  A: 

How do you decide which data type to use? Easy:

You look at which are available and choose the one that does what you want. And if there isn't one, you make one.

In this case a dict is a pretty obvious solution.

Lennart Regebro
+1. Your second paragraph is precisely the right answer here.
Daniel Pryden
+3  A: 

Tuples first. These are list-like things that cannot be modified. Because the contents of a tuple cannot change, you can use a tuple as a key in a dictionary. That's the most useful place for them, in my opinion. For instance, if you have a list like item = ["Ford pickup", 1993, 9995] and you want to make a little in-memory database of prices keyed by model and year, you might try something like:

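item = ["Ford pickup", 1993, 9995]
db = {}                       # the in-memory "database"
ikey = (item[0], item[1])     # tuple(a, b) would raise TypeError; use a tuple literal
idata = item[2]
db[ikey] = idata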
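
As a quick check of the claim that immutability is what makes tuples usable as keys, the sketch below tries the same lookup key as a tuple and as a list. The name list_key_error is introduced here only for illustration.

```python
db = {}
db[("Ford pickup", 1993)] = 9995       # a tuple is hashable, so it works as a key

list_key_error = None
try:
    db[["Ford pickup", 1993]] = 9995   # a list is mutable and therefore unhashable
except TypeError as exc:
    list_key_error = str(exc)

print(db[("Ford pickup", 1993)])
print(list_key_error)                  # the exact wording varies by Python version
```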

Lists are like arrays or vectors in other programming languages and are usually used for the same kinds of things in Python. However, they are more flexible in that you can put different types of things into the same list. They are generally the most flexible data structure, since a single element of one list can itself be a whole list, but for real data crunching they may not be efficient enough.

a = [1, "fred", 7.3]   # mixed types in one list
b = []
b.append(1)            # b is now [1]
b[0] = "fred"          # replace the first element
b.append(a)            # now the second element of b is the whole list a
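
One detail worth knowing about nesting: appending a list stores a reference to it, not a copy. The following is a small sketch of that behavior, reusing the names a and b from above; c is a made-up name for the copied variant.

```python
a = [1, "fred", 7.3]
b = ["fred"]
b.append(a)             # b[1] refers to the very same list object as a

a.append("new")         # mutating a is visible through b
print(b[1] is a)
print(b)

c = ["fred", list(a)]   # list(a) stores a shallow copy instead
a.append("later")
print(c[1])             # the copy does not see the later append
```

If the nested list should not change when the original does, copy it at the point of insertion as shown with c.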

Dictionaries are often used a lot like lists, but you can use any hashable (in practice, immutable) object as the index. Unlike lists, dictionaries can't be sorted in place. Older Python versions gave them no defined order at all; since Python 3.7 a dict preserves insertion order, and collections.OrderedDict in the standard library covers cases that need an explicitly ordered mapping. Before that, the usual workaround was a custom class combining a sorted list and a dictionary; there are examples on the Python Cookbook site.

c = {}
d = ("ford pickup", 1993)
c[d] = 9995
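
Although a dict cannot be sorted in place, you can always build a sorted list from its keys or items. The sketch below assumes a few made-up entries alongside the one above.

```python
prices = {
    ("ford pickup", 1993): 9995,
    ("honda civic", 1998): 4500,   # made-up entry
    ("audi a4", 2001): 7200,       # made-up entry
}

by_key = sorted(prices)                                   # sorted list of the tuple keys
by_price = sorted(prices.items(), key=lambda kv: kv[1])   # (key, value) pairs, cheapest first

for (model, year), price in by_price:
    print(model, year, price)
```

sorted() returns a new list and leaves the dictionary itself untouched.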

Arrays get closer to the bit level: every element has the same fixed type, which suits heavy-duty data crunching where you don't want the frills of lists or dictionaries. In Python this means the standard library's array module or, in practice, usually NumPy arrays. They are not often used outside of scientific applications. Leave these until you know for sure that you need them.
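
For a feel of what that looks like, here is a minimal sketch that assumes "arrays" means the standard library's array module (NumPy is third-party and not shown). The values and the name rejected are made up.

```python
from array import array

nums = array("d", [1.0, 2.5, 4.0])   # typecode 'd' stores C doubles
nums.append(3.5)
total = sum(nums)

rejected = False
try:
    nums.append("fred")              # unlike a list, mixed types are not allowed
except TypeError:
    rejected = True

print(total)
print(rejected)
```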

Lists and dicts are the real workhorses of Python data storage.

Michael Dillon
I'd add that tuples correspond most directly to mathematical tuples (pairs, triples, etc.), as opposed to lists which are sequences of objects. So use a tuple when you have a collection of things which comprise one item (e.g. x and y coordinates), and lists when they are conceptually separate items.
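
A rough sketch of that distinction, with made-up coordinates: each point is a tuple because x and y together form one item, while the path is a list because it is a sequence of separate points that can grow.

```python
point = (3, 4)                 # one item made of two parts: x and y
x, y = point                   # tuples unpack naturally into their parts
distance = (x ** 2 + y ** 2) ** 0.5

path = [(0, 0), point]         # a list of separate points
path.append((6, 8))            # the list can grow; each point stays fixed

print(distance)
print(path)
```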
Michael E
A: 

Personal: I mostly work with lists and dictionaries. It seems that this satisfies most cases.

Sometimes: Tuples can be helpful if you want to pair or match elements. Besides that, I don't really use them.

However: I write high-level scripts that don't need to drill down to the level of "efficiency" where every byte of memory and every nanosecond matters. I don't believe most people need to drill this deep.

TIMEX