views:

387

answers:

8

Is it bad practice to do the following and not explicitly handle a file object and call its close() method?

for line in open('hello.txt'):
    print line

NB - this is for versions of Python that do not yet have the with statement.

I ask as the Python documentation seems to recommend this :-

f = open("hello.txt")
try:
    for line in f:
        print line
finally:
    f.close()

Which seems more verbose than necessary.

+1  A: 

You need to close handles so that memory is freed up. Not really needed until dealing with a lot of files at a time.

Sarfraz
It's still generally considered good practice to close the file. I suppose it won't really harm anything to leave one or two files open, but it's not *that* much work to close it.
Jason Baker
A: 

Yes, because otherwise you might leak resources.

When you’re done with a file, call f.close() to close it and free up any system resources taken up by the open file.

From the Python docs.

This'll happen for you when the program exits, but otherwise Python's keeping around resources it no longer needs up to that point.

Dominic Rodger
+3  A: 

Upon exit the Python interpreter (or the kernel in the event of a crash) will close the file, but it's still a good practice to close them when you don't need them. For 1 or 2, or 10, files it might not be a problem, but for more it might bring the whole system down.

Most importantly, it is a sign that the person who wrote the code actually cares about his work.

Emil Ivanov
Actually, you don't even have to wait for the process to stop. It will be closed when garbage collected. See my answer for more details.
Jason Baker
+2  A: 

It's kind of hinted at all over the place, but to make it the most clear, yes, you need to close that file. In Python 2.5 (using future) and in Python 2.6, you no longer need the wordy version:

from __future__ import with_statement
with open("hello.txt") as f:
    for line in f:
        print line
Douglas Mayle
+11  A: 

Close is always necessary when dealing with files, it is not a good idea to leave open file handles all over the place. They will eventually be closed when the file object is garbage collected but you do not know when that will be and in the mean time you will be wasting system resources by holding to file handles you no longer need.

If you are using Python 2.5 and higher the close() can be called for you automatically using the with statement:

from __future__ import with_statement # Only needed in Python 2.5
with open("hello.txt") as f:
    for line in f:
        print line

This is has the same effect as the code you have:

f = open("hello.txt")
try:
    for line in f:
        print line
finally:
    f.close()

The with statement is direct language support for the Resource Acquisition Is Initialization idiom commonly used in C++. It allows the safe use and clean up of all sorts of resources, for example it can be used to always ensure that database connections are closed or locks are always released like below.

mylock = threading.Lock()
with mylock:
    pass # do some thread safe stuff
Tendayi Mawushe
in small, short-running scripts it may be prudent to trade off clarity of the code against having one open file hanging arodund for a short time.
hop
The `with` statement is simple enough that it's hard to justify not using it in the interest of clarity.
Robert Rossney
true that for 2.5+
hop
+7  A: 

In actuality, the file will be closed when it is garbage collected. See this question for more on how that works.

It is still recommended that you use a try/finally block or a with statement though. If there is an exception when using one of the file object's methods, a reference will be stored in the traceback (which is stored as a global variable) until you clear it or another exception occurs.

Thus, it's bad to rely on garbage collection to close your file for you.

Also, if you've written to the file, you can't guarantee that the changes will be saved to the file until it is closed or flushed.

Jason Baker
+1 rest of the answers are just quoting the docs, you gave a good explanation for using try/finally.
Sushant
A: 

No, I don't believe the longer idiom is necessary, and here is why:

I grepped /usr/lib/python2.6/ for the pattern 'for\s+.*\s+in\s+open\(' and found many examples of

for line in open('hello.txt'):
    print line

and so far zero instances of

f = open("hello.txt")
try:
    for line in f:
        print line
finally:
    f.close()

See below for the list of files in the standard library that use the for ... in open idiom.

This naturally leads to the question: If the Python developers accept the shorter idiom in the standard libraries, how can we be improving anything by using something different in our own code if our code depends on standard libraries?

I think the answer is, the longer idiom does not improve anything.

I also ran

#!/usr/bin/env python
try:
    for i,line in enumerate(open('a')):
        print line
        raw_input()
        if i==5:
            break
except Exception:
    pass

raw_input()

and checked /proc/PID/fd for when the file descriptor was closed. It appears that when you break out of the for loop, the file is closed for you.

On the basis of these experiments, I do not believe the long try...finally...close idiom is necessary.

Here is the result of the grep:

/usr/lib/python2.6/dist-packages/NvidiaDetector/nvidiadetector.py:89:tempList = [ x.strip() for x in open(obsolete).readlines() ]
/usr/lib/python2.6/dist-packages/rpy_io.py:49:for line in open(file).readlines():
/usr/lib/python2.6/dist-packages/setuptools/command/easy_install.py:1376:for line in open(self.filename,'rt'):
/usr/lib/python2.6/dist-packages/GDebi/DscSrcPackage.py:47:for line in open(file):
/usr/lib/python2.6/dist-packages/aptsources/distinfo.py:220:[x.strip() for x in open(value)])
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeCache.py:989:for line in open("/proc/mounts"):
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeAufs.py:100:for line in open("/proc/mounts"):
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeAufs.py:205:for line in open("/proc/mounts"):
/usr/lib/python2.6/dist-packages/DistUpgrade/distinfo.py:220:[x.strip() for x in open(value)])
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeViewKDE.py:826:for c in open(sys.argv[2]).read():
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeConfigParser.py:45:items = [x.strip() for x in open(p)]
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeQuirks.py:684:for line in open(cpuinfo):
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeQuirks.py:692:for line in open("/proc/mounts"):
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeQuirks.py:726:for line in open("/etc/fstab"):
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeQuirks.py:762:for line in open(fstab):
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeQuirks.py:801:for line in open("/etc/fstab"):
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeQuirks.py:874:for line in open(XORG):
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeQuirks.py:939:for line in open(os.path.join(modaliasesdir,filename)):
/usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeController.py:1307:for line in open(template):
/usr/lib/python2.6/dist-packages/DistUpgrade/xorg_fix_proprietary.py:23:for raw in open(xorg_source):
/usr/lib/python2.6/dist-packages/DistUpgrade/xorg_fix_proprietary.py:58:for line in open(xorg):
/usr/lib/python2.6/dist-packages/DistUpgrade/xorg_fix_proprietary.py:82:for line in open(xorg):
/usr/lib/python2.6/dist-packages/jockey/oslib.py:377:for line in open(self.apt_jockey_source):
/usr/lib/python2.6/dist-packages/jockey/oslib.py:393:for line in open(f):
/usr/lib/python2.6/dist-packages/jockey/backend.py:651:for line in open(path):
/usr/lib/python2.6/dist-packages/jockey/detection.py:277:for line in open(alias_file):
/usr/lib/python2.6/dist-packages/jockey/detection.py:597:for l in open(os.path.join(path, 'uevent')):
/usr/lib/python2.6/dist-packages/apt/cdrom.py:83:for line in open(fname):
/usr/lib/python2.6/dist-packages/problem_report.py:1119:for line in open('/proc/mounts'):
/usr/lib/python2.6/dist-packages/apport/packaging_impl.py:128:for line in open(f):
/usr/lib/python2.6/dist-packages/apport/packaging_impl.py:190:for line in open(sumfile):
/usr/lib/python2.6/dist-packages/apport/packaging_impl.py:641:for l in open('/etc/apt/sources.list'):
/usr/lib/python2.6/dist-packages/apport/hookutils.py:190:for line in open('/proc/asound/cards'):
/usr/lib/python2.6/dist-packages/apport/hookutils.py:290:for line in open('/var/log/syslog'):
/usr/lib/python2.6/dist-packages/apport/hookutils.py:493:mods = [l.split()[0] for l in open(module_list)]
/usr/lib/python2.6/dist-packages/softwareproperties/SoftwareProperties.py:597:for line in open(f):
/usr/lib/python2.6/dist-packages/softwareproperties/gtk/SoftwarePropertiesGtk.py:883:for x in open(tmp.name):
/usr/lib/python2.6/dist-packages/lsb_release.py:253:for line in open('/etc/lsb-release'):
/usr/lib/python2.6/dist-packages/numpy/distutils/system_info.py:815:for d in open(ld_so_conf,'r').readlines():
/usr/lib/python2.6/dist-packages/LanguageSelector/LocaleInfo.py:72:for line in open(languagelist_file):
/usr/lib/python2.6/dist-packages/LanguageSelector/LocaleInfo.py:187:for line in open(environment).readlines():
/usr/lib/python2.6/dist-packages/LanguageSelector/LocaleInfo.py:193:for line in open(environment).readlines():
/usr/lib/python2.6/dist-packages/LanguageSelector/LanguageSelector.py:125:for line in open(fname):
/usr/lib/python2.6/dist-packages/LanguageSelector/LanguageSelector.py:140:for line in open(fname):
/usr/lib/python2.6/dist-packages/LanguageSelector/LanguageSelector.py:171:for line in open(fname):
/usr/lib/python2.6/dist-packages/LanguageSelector/LanguageSelector.py:210:for line in open(fname):
/usr/lib/python2.6/dist-packages/LanguageSelector/macros.py:16:for l in open(file):
/usr/lib/python2.6/dist-packages/LanguageSelector/macros.py:37:for l in open(self.LANGCODE_TO_LOCALE):
/usr/lib/python2.6/dist-packages/LanguageSelector/LangCache.py:94:for l in open(self.BLACKLIST):
/usr/lib/python2.6/dist-packages/LanguageSelector/LangCache.py:99:for l in open(self.LANGCODE_TO_LOCALE):
/usr/lib/python2.6/dist-packages/LanguageSelector/LangCache.py:111:for l in open(self.PACKAGE_DEPENDS):
/usr/lib/python2.6/dist-packages/LanguageSelector/ImSwitch.py:78:for l in open(self.blacklist_file):
unutbu
This analysis is an interesting, but it's obviously incomplete. First, it doesn't pass the smell test: if there are only 52 places in the Python standard library where a file gets opened, I will cook and eat my shoe. For a trivial example, the `logging` module implements its own `open` and `close` methods, which your RE won't find. Also: it implements a `close` method. There's a reason.
Robert Rossney
CPython uses refcounting and will close these files deterministically. It makes sense that the library shipped with CPython is able to rely on CPython-specific behavior.
Roger Pate
In the same vein, you didn't see leaked resources in your test because you didn't keep trackback objects hanging around, or other ways to keep references to the now-unneeded-but-open files---and CPython's current refcounting closes them deterministically for you.
Roger Pate
@Robert Rossney: Thanks for your comment. The above is not a list of all places files gets opened. It is only a list of places where my grep found the `for ... in open(...)` idiom. I don't follow what you are saying regarding the logging module. It does not use short idiom, but neither does it use the long idiom. Could you explain some more how logging is relevant?
unutbu
@Roger Pate: Thanks to you too. Indeed, what I grepped only applies to CPython. And I agree that if you do something unusual with traceback, it might prevent the file descriptor from ever closing. Could you show some code of how this might happen? I think it would help to see code that demonstrates the problem with the short idiom. Is it possible for the user of traceback to `del` the file object manually? Might that be a better solution and having to use the long idiom everywhere?
unutbu
In the short idiom the file object goes unnamed, so how one might muck with it in a traceback frame is beyond me. This is also why I believe the short idiom is safe. The file object exists only in the context of the for-loop. Once the for-loop ends naturally, or terminates due to a break or exception, there are no more references to the file object so it is free to be garbage collected.
unutbu
+2  A: 

Strange that for all the discussion in this topic of the importance of freeing system resources, nobody has mentioned what seems to me an obviously more significant reason to close a file deterministically: so that it can be opened again.

There are certainly cases where it doesn't matter. If a file object goes out of scope or gets deleted, the underlying file will get closed. (When it gets closed depends on the specific implementation of Python you're using.) That's generally going to be good enough - if you know exactly when the file variable is going to go out of scope, and if you know that you don't care if the file gets closed deterministically.

But why should you even be troubling yourself with that kind of analysis when the with statement exists?

Robert Rossney