views:

365

answers:

9

Say you want to pass a dictionary of values to a function, or otherwise want to work with a short-lived dictionary that won't be reused. There are two easy ways to do this:

Use the dict() function to create a dictionary:

foo.update(dict(bar=42, baz='qux'))

Use an anonymous dictionary:

foo.update({'bar': 42, 'baz': 'qux'})

Which do you prefer? Are there reasons other than personal style for choosing one over the other?

+2  A: 

I prefer your "anonymous dictionary" method and I think this is purely a personal style thing. I just find the latter version more readable but it's also what I'm used to seeing.

Swingley
+1  A: 

I prefer the anonymous dictionary, too, just out of personal style.

Raphink
+10  A: 

I prefer the anonymous dict option.

I don't like the dict() option for the same reason I don't like:

 i = int("1")

With the dict() option you're needlessly calling a function which is adding overhead you don't need:

>>> from timeit import Timer
>>> Timer("mydict = {'a' : 1, 'b' : 2, 'c' : 'three'}").timeit()
0.91826782454194589
>>> Timer("mydict = dict(a=1, b=2, c='three')").timeit()
1.9494664824719337
Dave Webb
So explicit, if only more python looked like this!
dagoof
I've been using anonymous dicts for years (starting back in my Perl days). I'd never actually thought to wonder why until this morning, and your answer is pretty close to what I'd felt but never formally expressed.
Just Some Guy
+1  A: 

If I have a lot of arguments, sometimes it is nice to omit the quotes on the keys:

DoSomething(dict(
   Name = 'Joe',
   Age = 20,
   Gender = 'Male',
   ))

This is a very subjective question, BTW. :)

gahooa
OK, it's a little subjective. But if everyone's come to the same subjective conclusion, then it seemed prudent to investigate it. :-)
Just Some Guy
+1  A: 

I think the dict() function is really there for when you're creating a dict from something else, maybe something that easily produces the necessary keyword args. The anonymous method is best for 'dict literals' in the same way you'd use "" for strings, not str().

Aaron Lockey
+5  A: 

I think in this specific case I'd probably prefer this:

foo.update(bar=42, baz='qux')

In the more general case, I often prefer the literal syntax (what you call an anonymous dictionary, though it's just as anonymous to use {} as it is to use dict()). I think that speaks more clearly to the maintenance programmer (often me), partly because it stands out so nicely with syntax-highlighting text editors. It also ensures that when I have to add a key which is not representable as a Python name, like something with spaces, then I don't have to go and rewrite the whole line.

Peter Hansen
Excellent points! Thanks.
Just Some Guy
+2  A: 

The dict() method has the added overhead of a function call.

>>>import timeit,dis
>>> timeit.Timer("{'bar': 42, 'baz': 'qux'}").repeat()
[0.59602910425766709, 0.60173793037941437, 0.59139834811408321]
>>> timeit.Timer("dict(bar=42, baz='qux')").repeat()
[0.98166498814792646, 0.97745355904172015, 0.99231773870701545]

>>> dis.dis(compile("{'bar': 42, 'baz': 'qux'}","","exec"))
  1           0 BUILD_MAP                0
              3 DUP_TOP
              4 LOAD_CONST               0 (42)
              7 ROT_TWO
              8 LOAD_CONST               1 ('bar')
             11 STORE_SUBSCR
             12 DUP_TOP
             13 LOAD_CONST               2 ('qux')
             16 ROT_TWO
             17 LOAD_CONST               3 ('baz')
             20 STORE_SUBSCR
             21 POP_TOP
             22 LOAD_CONST               4 (None)
             25 RETURN_VALUE

>>> dis.dis(compile("dict(bar=42, baz='qux')","","exec"))
  1           0 LOAD_NAME                0 (dict)
              3 LOAD_CONST               0 ('bar')
              6 LOAD_CONST               1 (42)
              9 LOAD_CONST               2 ('baz')
             12 LOAD_CONST               3 ('qux')
             15 CALL_FUNCTION          512
             18 POP_TOP
             19 LOAD_CONST               4 (None)
             22 RETURN_VALUE
Mark Peters
cool. but the anonymous dict seems to do more operations .. what do those number columns represent? i did not find it in the docs
mykhal
A: 

Actually, if the receiving function will only receive a dictionary with not pre-dertermined keywords, I normally use the ** passing convention.

In this example, that would be:

class Foo(object):
   def update(self, **param_dict):
       for key in param_dict:
          ....
foo = Foo()
....
foo.update(bar=42, baz='qux')
jsbueno
+1  A: 

My answer will largely talk about the design of APIs to use dicts vs. keyword args. But it's also applicable the individual use of {...} vs. dict(...).

The bottom line: be consistent. If most of your code will refer to 'bar' as a string - keep it a string in {...}; if you normally refer to it the identifier bar - use dict(bar=...).

Constraints

Before talking about style, note that the keyword bar=42 syntax works only for strings and only if they are valid identifiers. If you need arbitrary punctuation, spaces, unicode - or even non-string keys - the question is over => only the {'bar': 42} syntax will work.

This also means that when designing an API, you must allow full dicts, and not only keyword arguments - unless you are sure that only strings, and only valid identifiers are allowed. (Technically, update(**{'spaces & punctuation': 42}) works. But it's ugly. And numbers/tuples/unicode won't work.)

Note that dict() and dict.update() combine both APIs: you can pass a single dict, you can pass keyword args, and you can even pass both (the later I think is undocumented). So if you want to be nice, allow both:

def update(self, *args, **kwargs):
    """Callable as dict() - with either a mapping or keyword args:

    .update(mapping)
    .update(**kwargs)
    """
    mapping = dict(*args, **kwargs)
    # do something with `mapping`...

This is especially recommended for a method named .update(), to follow the least-surprise rule.

Style

I find it nice to distinguish internal from external strings. By internal I mean arbitrary identifiers denoting something only inside the program (variable names, object attributes) or possibly between several programs (DB columns, XML attribute names). They are normally only visible to developers. External strings are intended for human consumption.

[Some Python coders (me included) observe the convention of using 'single_quotes' for internal strings vs. "Double quotes" for external strings. This is definitely not universal, though.]

Your question is about the proper uses of barewords (Perl term) - syntax sugars allowing to omit the quotes quotes altogether on internal strings. Some languages (notably LISP) allow them widely; the Pythonic opportunities to employ barewords are attribute access - foo.bar and keyword arguments - update(bar=...).

The stylistic dilemma here is "Are your strings internal enough to look like identifiers?"

If the keys are external strings, the answer is definitely NO:

foo.update({"The answer to the big question": 42})

# which you later might access as:
foo["The answer to the big question"]

If the keys refer to Python identifiers (e.g. object attributes), then I'd say YES:

foo.update(dict(bar=42))
# As others mentioned, in that case the cleaner API (if possible)
# would be to receive them as **kwargs directly:
foo.update(bar=42)

# which you later might access as:
foo.bar

If the keys refer to identifiers outside your Python program, such as XML attr names, or DB column names, using barewords may be good or bad choice - but you it's best to choose one style and be consistent.

Consistency is good because there is a psychological barrier between identifiers and strings. It exists because strings rarely cross it - only when using introspection to do meta-programming. And syntax highlighting only reinforces it. So if you read the code and see a green 'bar' in one place and a black foo.bar in a second place, you won't immediately make a connection.

Another important rule of thumb is: Barewords are good iff they are (mostly) fixed. E.g. if you refer to fixed DB columns mostly in your code, than using barewords to refer to them might be nice; but if half the time the column is a parameter, then it's better to use strings.

This is because parameter/constant is the most important difference people associate with the identifiers/strings barrier. The difference between column (variable) and "person" (constant) is the most readable way to convey this difference. Making them both identifiers would blur the distinction, as well as backfiring syntactically - you'd need to use **{column: value} and getattr(obj, column) etc. a lot.

Beni Cherniavsky-Paskin