views:

335

answers:

5

I discovered this pattern (or anti-pattern) and I am very happy with it.

I feel it is very agile:

def example():
    age = ...
    name = ...
    print "hello %(name)s you are %(age)s years old" % locals()

Sometimes I use its cousin:

def example2(obj):
    print "The file at %(path)s has %(length)s bytes" % obj.__dict__

I don't need to create an artificial tuple and count parameters and keep the %s matching positions inside the tuple.

Do you like it? Do/Would you use it? Yes/No, please explain.

+5  A: 

I think it is a great pattern because you are leveraging built-in functionality to reduce the code you need to write. I personally find it quite Pythonic.

I never write code that I don't need to write - less code is better than more code and this practice of using locals() for example allows me to write less code and is also very easy to read and understand.

Andrew Hare
+3  A: 

The "%(name)s" % <dictionary> or even better, the "{name}".format(<parameters>) have the merit of

  • being more readable than "%0s"
  • being independent from the argument order
  • not compelling to use all the arguments in the string

I would tend to favour the str.format(), since it should be the way to do that in Python 3 (as per PEP 3101), and is already available from 2.6. With locals() though, you would have to do this:

print("hello {name} you are {age} years old".format(**locals()))
RedGlyph
+2  A: 

Using the built-in vars([object]) (documentation) might make the second look better to you:

def example2(obj):
    print "The file at %(path)s has %(length)s bytes" % vars(obj)

The effect is of course just the same.

kaizer.se
+1  A: 

Never in a million years. It's unclear what the context for formatting is: locals could include almost any variable. self.__dict__ is not as vague. Perfectly awful to leave future developers scratching their heads over what's local and what's not local.

It's an intentional mystery. Why saddle your organization with future maintenance headaches like that?

S.Lott
Why is it unclear? It looks perfectly clear to me.
alex tingle
The context is the function in which locals() appears. If it's a nice short function, you can just look at the vars. If it's a very long function, it should be refactored. How is self.__dict__ any clearer?
foosion
I don't understand why referencing a local variable name in a format string is any more unclear than referencing a local variable name in code.
Robert Rossney
A local variable name is clear. `locals()` is unclear.
S.Lott
`self.__dict__` is based on the class definition -- which is often tied up with the `__init__()` method and carefully documented in the doc string. `locals()` is often a fairly random collection of names.
S.Lott
I don't particularly like it either, but not because it's unclear. Python's scoping is simpler than many other languages, but a design pattern like this still seems to be asking for trouble, because you WILL find yourself getting lazy and using outside of a specifically defined function, and then you WILL run into scoping problems.
Paul McMillan
@Robert Rossney: I never said a variable _name_ was unclear. I said `locals()` is unclear because the variable names are harder to find. they're buried in the format string.
S.Lott
+14  A: 

It's OK for small applications and allegedly "one-off" scripts, especially with the vars enhancement mentioned by @kaizer.se and the .format version mentioned by @RedGlyph.

However, for large applications with a long maintenance life and many maintainers this practice can lead to maintenance headaches, and I think that's where @S.Lott's answer is coming from. Let me explain some of the issues involved, as they may not be obvious to anybody who doesn't have the scars from developing and maintaining large applications (or reusable components for such beasts).

In a "serious" application, you would not have your format string hard-coded -- or, if you had, it would be in some form such as _('Hello {name}.'), where the _ comes from gettext or similar i18n / L10n frameworks. The point is that such an application (or reusable modules that can happen to be used in such applications) must support internationalization (AKA i18n) and locatization (AKA L10n): you want your application to be able to emit "Hello Paul" in certain countries and cultures, "Hola Paul" in some others, "Ciao Paul" in others yet, and so forth. So, the format string gets more or less automatically substituted with another at runtime, depending on the current localization settings; instead of being hardcoded, it lives in some sort of database. For all intents and purposes, imagine that format string always being a variable, not a string literal.

So, what you have is essentially

formatstring.format(**locals())

and you can't trivially check exactly what local names the formatting is going to be using. You'd have to open and peruse the L10N database, identify the format strings that are going to be used here in different settings, verify all of them.

So in practice you don't know what local names are going to get used -- which horribly crimps the maintenance of the function. You dare not rename or remove any local variable, as it might horribly break the user experience for users with some (to you) obscure combinaton of language, locale and preferences

If you have superb integration / regression testing, the breakage will be caught before the beta release -- but QA will scream at you and the release will be delayed... and, let's be honest, while aiming for 100% coverage with unit tests is reasonable, it really isn't with integration tests, once you consider the combinatorial explosion of settings [[for L10N and for many more reasons]] and supported versions of all dependencies. So, you just don't blithely go around risking breakages because "they'll be caught in QA" (if you do, you may not last long in an environment that develops large apps or reusable components;-).

So, in practice, you'll never remove the "name" local variable even though the User Experience folks have long switched that greeting to a more appropriate "Welcome, Dread Overlord!" (and suitably L10n'ed versions thereof). All because you went for locals()...

So you're accumulating cruft because of the way you've crimped your ability to maintain and edit your code -- and maybe that "name" local variable only exists because it's been fetched from a DB or the like, so keeping it (or some other local) around is not just cruft, it's reducing your performance too. Is the surface convenience of locals() worth that?-)

But wait, there's worse! Among the many useful services a lint-like program (like, for example, pylint) can do for you, is to warn you about unused local variables (wish it could do it for unused globals as well, but, for reusable components, that's just a tad too hard;-). This way you'll catch most occasional misspellings such as if ...: nmae = ... very rapidly and cheaply, rather than by seeing a unit-test break and doing sleuth work to find out why it broke (you do have obsessive, pervasive unit tests that would catch this eventually, right?-) -- lint will tell you about an unused local variable nmae and you will immediately fix it.

But if you have in your code a blah.format(**locals()), or equivalently a blah % locals()... you're SOL, pal!-) How is poor lint going to know whether nmae is in fact an unused variable, or actually it does get used by whatever external function or method you're passing locals() to? It can't -- either it's going to warn anyway (causing a "cry wolf" effect that eventually leads you to ignore or disable such warnings), or it's never going to warn (with the same final effect: no warnings;-).

Compare this to the "explicit is better than implicit" alternative...:

blah.format(name=name)

There -- none of the maintenance, performance, and am-I-hampering-lint worries, applies any more; bliss! You make it immediately clear to everybody concerned (lint included;-) exactly what local variables are being used, and exactly for what purposes.

I could go on, but I think this post is already pretty long;-).

So, summarizing: "γνῶθι σεαυτόν!" Hmm, I mean, "know thyself!". And by "thyself" I actually mean "the purpose and scope of your code". If it's a 1-off-or-thereabouts thingy, never going to be i18n'd and L10n'd, will hardly need future maintenance, will never be reused in a broader context, etc, etc, then go ahead and use locals() for its small but neat convenience; if you know otherwise, or even if you're not entirely certain, err on the side of caution, and make things more explicit -- suffer the small inconvenience of spelling out exactly what you're going, and enjoy all the resulting advantages.

BTW, this is just one of the examples where Python is striving to support both "small, one-off, exploratory, maybe interactive" programming (by allowing and supporting risky conveniences that extend well beyond locals() -- think of import *, eval, exec, and several other ways you can mush up namespaces and risk maintenance impacts for the sake of convenience), as well as "large, reusable, enterprise-y" apps and components. It can do a pretty good job at both, but only if you "know thyself" and avoid using the "convenience" parts except when you're absolutely certain you can in fact afford them. More often than not, the key consideration is, "what does this do to my namespaces, and awareness of their formation and use by the compiler, lint &c, human readers and maintainers, and so on?".

Remember, "Namespaces are one honking great idea -- let's do more of those!" is how the Zen of Python concludes... but Python, as a "language for consenting adults", lets you define the boundaries of what that implies, as a consequence of your development environment, targets, and practices. Use this power responsibly!-)

Alex Martelli
An excellent answer. I think most programs are not internationalized, and so this is not a problem in a very large number of cases. But in those cases, yes, string interpolation is bad.
Paul Biggar
@Paul, I beg to differ: string interpolation is _excellent_, _especially_ for i18n/L10n support -- it just needs to happen on explicitly named variables! The problem is not with interpolation, it's with passing `locals()` to external functions or methods. BTW, Python's growing support for Unicode (now the default text string in `3.*`) is exactly an attempt to help change the fact that "most programs are not i18n'd" -- many more _should_ be, than currently _are_; as the 'net (via smartphones, netbooks, -).
Alex Martelli
I think there is the possibility for fiddling with locales/gettext to insert `{self}`,`{password}` or other objects you don't expect shown into the format string. This may be a security risk. Best to be explicit for real code
gnibbler
@Alex: I would define `"hello %(name)s you are %(age)s years old" % locals()` to be string interpolation (in the style of Perl or PHP anyway). Have you a different definition?
Paul Biggar
@Paul, I call this "string formatting" -- the names are totally arbitrary (it's obviously the same statement whether you're using a specially made dict or an existing one), and there is no interpolation involved -- see the definition of interpolation at http://en.wikipedia.org/wiki/Interpolation . Merging an 'a' and a 'c' to get a 'b', now **that** would be an interpolation of strings.
Alex Martelli
@Alex: In PHP or Perl, `"hello {$name}s you are $age years old"` is string interpolation. This is the same pattern in Python. I presume you're kidding about the interpolation definition--"string interpolation" has long been a term to describe this type of string formatting. Its even the title of PEP215.
Paul Biggar
Alex Martelli