I've discovered a new pattern. I wonder if anyone else has this pattern or has any opinion about it.

Basically, I have a hard time scrubbing up and down source files to figure out what module imports are available and so forth, so now, instead of

import foo
from bar.baz import quux

def myFunction():
    foo.this.that(quux)

I move all my imports into the function where they're actually used, like this:

def myFunction():
    import foo
    from bar.baz import quux

    foo.this.that(quux)

This does a few things. First, I rarely accidentally pollute my modules with the contents of other modules. I could set the __all__ variable for the module, but then I'd have to update it as the module evolves, and that doesn't help the namespace pollution for code that actually lives in the module.
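
For reference, the __all__ approach would look something like this (a hypothetical module; note that __all__ only restricts "from mymodule import *" and doesn't stop helper names from living in the module's own namespace, which is the pollution complaint above):

# mymodule.py -- hypothetical example
import os            # still visible to others as mymodule.os

__all__ = ['useful_function']   # the only name exported by a star-import

def useful_function():
    return os.getcwd()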

Second, I rarely end up with a litany of imports at the top of my modules, half or more of which I no longer need because I've refactored the code. Finally, I find this pattern MUCH easier to read, since every referenced name is right there in the function body.

+13  A: 

A few problems with this approach:

  • It's not immediately obvious when opening the file which modules it depends on.
  • It will confuse programs that have to analyze dependencies, such as py2exe, py2app etc.
  • What about modules that you use in many functions? You will either end up with a lot of redundant imports or you'll have to have some at the top of the file and some inside functions.

So... the preferred way is to put all imports at the top of the file. I've found that if my imports get hard to keep track of, it usually means I have too much code and I'd be better off splitting it into two or more files.

Some situations where I have found imports inside functions to be useful:

  • To deal with circular dependencies (if you really really can't avoid them)
  • Platform-specific code (a sketch follows below)
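
A minimal sketch of the platform-specific case (the function and return values are made up; the point is that the Windows-only and Unix-only modules are only imported where they exist):

import sys

def user_home():
    if sys.platform == 'win32':
        import _winreg   # Windows-only module (Python 2 name); a top-level
                         # import would break this file on other platforms
        # registry lookup would go here
        return 'C:\\Users'
    else:
        import os
        import pwd       # Unix-only module
        return pwd.getpwuid(os.getuid()).pw_dir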

Also: putting imports inside each function is actually not appreciably slower than at the top of the file. The first time each module is loaded it is put into sys.modules, and each subsequent import costs only the time to look up the module, which is fairly fast (it is not reloaded).
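
To see the caching for yourself, a small sketch (Python 2, to match the other examples in this thread):

import sys

import random                  # first import: the module is loaded and cached
print 'random' in sys.modules  # True -- later imports just hit this cache

def f():
    import random              # only a sys.modules lookup plus a name binding
    return random.random()

print f()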

dF
+1: And it's slow. Each function call has to repeat the module-import check.
S.Lott
+23  A: 

This does have a few disadvantages.

Performance

The import statement is executed every time the function is called, and this has a non-trivial performance cost.

Top Import

import random

def f():
    L = []
    for i in xrange(1000):
        L.append(random.random())

for i in xrange(1000):
    f()


$ time python test.py
real    0m0.857s
user    0m0.848s
sys     0m0.008s

Import in Function Body

def f():
    L = []
    for i in xrange(1000):
        import random
        L.append(random.random())

for i in xrange(1000):
    f()

$ time python test2.py
real    0m2.850s
user    0m2.836s
sys     0m0.012s

Testing

On the off chance you want to test your module through runtime modification, inline imports may make it more difficult. Instead of doing

import mymodule
mymodule.othermodule = module_stub

You'll have to do

import othermodule
othermodule.foo = foo_stub

This means that you'll have to patch othermodule globally, as opposed to just changing what the reference in mymodule points to.
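
To make the difference concrete, a small sketch (the stub module is made up; installing it in sys.modules is one way to patch an import that happens inside a function, since that import is resolved through sys.modules on every call):

import sys
import types

module_stub = types.ModuleType('othermodule')  # hypothetical stand-in
module_stub.foo = lambda: 'stubbed'

# top-level import in mymodule: swapping the one reference is enough
#     import mymodule
#     mymodule.othermodule = module_stub
# import inside mymodule's functions: the stub must be installed
# globally, where every importer will see it
sys.modules['othermodule'] = module_stub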

Dependency Tracking

This makes it non-obvious what modules your module depends on. This is especially irritating if you use many third party libraries or are re-organizing code.

I had to maintain some legacy code that used imports inline all over the place, and it made the code extremely difficult to refactor or repackage.

Ryan
You might want to clarify this -- imports are checked every time, but the module is only loaded once.
S.Lott
Thanks for the input. Even if modules are cached, it still *does* have a large performance impact, as you can see from my tests.
Ryan
Yes, but now you've made it clear. It was very misleading. Removed my negative vote.
nosklo
Not a great example since you put the import _inside_ the for loop rather than just inside the definition of f(). But, yes, in general the local import does have a cost.
davidavr
I mostly did that out of laziness (a nested loop to get 10^6 executions) instead of a single loop with xrange(10**6). Performance would be similar if I used an un-nested loop and upped the count in the test body.
Ryan
being lazy about lazy-loading .... hmmm
fuentesjr
+2  A: 

From a performance point of view, see this question: Should Python import statements always be at the top of a module?

In general, I only use local imports in order to break dependency cycles.
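
For example, a minimal sketch of the circular-dependency case (two hypothetical modules that refer to each other):

# a.py
import b

def from_a():
    return 'a'

# b.py -- a top-level "import a" could run while a.py is still
# half-initialized, so the import is deferred to call time instead
def from_b():
    import a
    return a.from_a() + 'b'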

sykora
A suggestion: break dependency cycles by placing everything both modules need into a third module, and have both modules import that one.
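A sketch of that suggestion (hypothetical module names):

# common.py -- holds the definitions both sides need
def shared_helper():
    return 42

# a.py and b.py now each just do "import common"; the cycle is gone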
nosklo
@nosklo: Excellent suggestion. It's trivial to break dependency cycles in Python through refactoring.
S.Lott
+2  A: 

Another useful thing to note is that parts of using "import" inside a function have been removed in Python 3.0.

There is a brief mention of it under "Removed Syntax" here:

http://docs.python.org/3.0/whatsnew/3.0.html

Russell Bryant
-1: Wrong. Only the "from xxx import *" form has been disabled from functions.
nosklo
He said that *parts* of import have been disabled. Don't be so quick to down-vote people who give useful information.
Daniel
+2  A: 

I believe this is a recommended approach in some cases/scenarios. For example, in Google App Engine lazy-loading big modules is recommended, since it minimizes the warm-up cost of instantiating new Python VMs/interpreters. Have a look at a Google engineer's presentation describing this. However, keep in mind this doesn't mean you should lazy-load all your modules.
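
A sketch of that idea (the handler name is made up; the point is only that a freshly started interpreter doesn't pay for the import until this code path actually runs):

def handle_report(request_body):
    import xml.dom.minidom  # comparatively expensive to import
    doc = xml.dom.minidom.parseString(request_body)
    return doc.toxml()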

fuentesjr
+3  A: 

You might want to take a look at Import statement overhead in the Python wiki. In short: if the module has already been loaded (look in sys.modules), your code will run slower. If your module hasn't been loaded yet, and foo will only get loaded when needed (which can be zero times), the overall performance will be better.

RSabet
A: 

I would suggest that you try to avoid from foo import bar imports. I only use them inside packages, where the splitting into modules is an implementation detail and there won't be many of them anyway.

In all other places, where you import a package, just use import foo and then reference it by the full name foo.bar. This way you can always tell where a given element comes from, and you don't have to maintain a list of imported elements (in reality such a list is always outdated and imports no-longer-used elements).

If foo is a really long name, you can simplify it with import foo as f and then write f.bar. This is still far more convenient and explicit than maintaining all the from imports.
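
For example, with a standard-library module:

import xml.dom.minidom as md      # short alias instead of a from-import

doc = md.parseString('<root/>')   # where parseString comes from stays obvious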

nikow
+1  A: 

People have explained very well why to avoid inline imports, but not really offered alternative workflows that address the reasons you want them in the first place.

I have a hard time scrubbing up and down source files to figure out what module imports are available and so forth

To check for unused imports I use pylint. It does static(ish) analysis of Python code, and one of the (many) things it checks for is unused imports. For example, the following script..

import urllib
import urllib2

urllib.urlopen("http://stackoverflow.com")

..would generate the following message:

example.py:2 [W0611] Unused import urllib2
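
(That message comes from running pylint over the file, e.g. pylint example.py; the exact message format varies between pylint versions.)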

As for checking available imports, I generally rely on TextMate's (fairly simplistic) completion - when you press Esc, it completes the current word with others in the document. If I have done import urllib, urll[Esc] will expand to urllib; if not, I jump to the start of the file and add the import.

dbr