views:

155

answers:

4

I've got a set of data that has three attributes, say A, B, and C, where A is kind of the index (i.e., A is used to look up the other two attributes.) What would be the best data structure for such data?

I used two dictionaries, with A as the index of each. However, there's key errors when the query to the data doesn't match any instance of A.

+3  A: 

Could you use a dictionary with A as the key, and the item in the dictionary as a 2-item tuple or list? If you're working with any of that data in a consistent way, you could also consider storing a class that has 2 properties in the dictionary.

Jordan
+2  A: 
AtoB = {"A1":"B1", "A2":"B2"}
AtoB.get("A3", None)
=> None
gilesc
This is probably the best answer if you know that your dictionary will never contain a specific value, here it's `None`
Manux
actually, just AtoB.get('A3') will be enough, since None is the default and he said keys/vals are strings. The only difference hence will be that while AtoB[] throws exception, .get() will return None instead
Nas Banov
A: 

I would suggest using a dictionary with A keyed to (B, C), so d = {A : (B, C)}.

Your problem of nonexistent keys resulting in errors can be solved in the following manner (thanks to intuited for the has_key function I forgot about):

if d.has_key(A):
    # do whatever you want
else:
    # A does not exist in your dictionary and you can handle this properly

Another option is the to do it in a try: except: block, but I tend to agree with intuited that if keys will be missing at a rate that isn't completely unusual, has_key will show better performance and (arguably) make better conceptual sense.

nearlymonolith
Wouldn't you rather follow the python EAFP?try: a in d.keys()catch KeyError: pass
mcpeterson
If it's not exceptional for a key to be missing from the dictionary, then it seems more appropriate to check first. d.has_key(A) is going to be a more efficient way to check, since it doesn't create a list of the entire key set. Also using a namedtuple (collections.namedtuple) can be a convenient and efficient way to allow readable references to B and C.
intuited
I guess that would be more "pythonic" (python wasn't my first language, so the pythonic method is still something I'm learning along the way). In that case I'd expect something more along the lines of try: d[A] catch KeyError: #something else. See edited answer.
nearlymonolith
If there's a reasonable default value then `d.get(A)` would be the best pattern, and also the fastest. Otherwise it is best/fastest/easiest-to-read-and-type to just test like you do, except to use `A in d` which is a more modern convention.(You definitely should *not* use `A in d.keys()` as that enumerates all the keys into a list *every* time you run that test.)
Ian Bicking
You dont even need to generate the key list(`d.keys()`), just use the syntax `if key in dict`.
Manux
In the comments above: `try ... catch KeyError ...` should be `try: ... except KeyError: ...`. There is no `catch` keyword in python.
Michael Dunn
Note that `has_key` is deprecated in future versions of Python. It's best to start using the `if key in dict` syntax instead.
Alex JL
A: 

As an alternative to a dict of tuples, you could use an object to store B and C:

class Thing:
    def __init__(self, B=None, C=None):
        self.B = B
        self.C = C

stuff = {A1: Thing(B1, C1)}
B1 = stuff[A1].B

processTriple(A1, stuff[A1].B, stuff[A1].C)

nothing = stuff.get(A2)

While it's a tiny bit wasteful, you could store A in the object too so that each instance represents a complete triple. Then you can pass those around more effectively than the sample above.

Walter Mundt