I need a file system walker that I can tell to skip directories I want left untouched, including every subdirectory below them. As far as I can tell, os.walk and os.path.walk don't do this on their own.
A:
So I made this home-rolled walker function:
import os
from os.path import join, isdir, islink, isfile

def mywalk(top, topdown=True, onerror=None, ignore_list=('.ignore',)):
    try:
        names = os.listdir(top)
    except Exception as err:
        if onerror is not None:
            onerror(err)
        return
    # Skip this directory (and everything below it) if it contains
    # any entry whose name is in ignore_list, e.g. a '.ignore' marker file.
    if any(name in ignore_list for name in names):
        return
    dirs, nondirs = [], []
    for name in names:
        if isdir(join(top, name)):
            dirs.append(name)
        else:
            nondirs.append(name)
    if topdown:
        yield top, dirs, nondirs
    for name in dirs:
        path = join(top, name)
        # Do not follow symbolic links to directories.
        if not islink(path):
            for x in mywalk(path, topdown, onerror, ignore_list):
                yield x
    if not topdown:
        yield top, dirs, nondirs
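For comparison, the same marker-file behaviour can probably be had from os.walk itself by emptying dirnames in place. This is only a sketch: the name walk_skip_marked and the demo tree are made up for illustration, and it assumes the default top-down traversal.

```python
import os
import tempfile


def walk_skip_marked(top, ignore_list=('.ignore',)):
    """Like os.walk, but skip any directory containing a marker entry.

    Hypothetical helper: a directory holding an entry named in
    ignore_list is not yielded, and nothing below it is visited.
    """
    for dirpath, dirnames, filenames in os.walk(top):
        if any(name in ignore_list for name in dirnames + filenames):
            dirnames[:] = []  # in place, so os.walk will not descend
            continue
        yield dirpath, dirnames, filenames


# Throwaway demo tree: keep/a.txt, skip/.ignore, skip/sub/b.txt
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, 'keep'))
os.makedirs(os.path.join(top, 'skip', 'sub'))
for rel in ('keep/a.txt', 'skip/.ignore', 'skip/sub/b.txt'):
    open(os.path.join(top, *rel.split('/')), 'w').close()

visited = sorted(os.path.relpath(d, top) for d, _, _ in walk_skip_marked(top))
print(visited)  # ['.', 'keep']
```

Unlike mywalk, this sketch only works top-down, because pruning dirnames has no effect once os.walk has already descended.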
Johan Carlsson
2009-05-29 08:59:52
+2
A:
It is possible to modify the second element of os.walk's return values in-place:
[...] the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search [...]
import os

def fwalk(root, predicate):
    for dirpath, dirnames, filenames in os.walk(root):
        # Slice assignment changes the list os.walk holds, pruning the search.
        dirnames[:] = [d for d in dirnames if predicate(dirpath, d)]
        yield dirpath, dirnames, filenames
Now, you can just hand in a predicate for subdirectories:
>>> ignore_list = [...]
>>> list(fwalk("some/root", lambda r, d: d not in ignore_list))
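The slice assignment matters because os.walk keeps a reference to the very list it yields, so only in-place changes affect the traversal. A small sketch on a temporary tree that contrasts the two spellings (the helper names and directory names here are made up for illustration):

```python
import os
import tempfile


def walk_rebinding(root, skip):
    # Rebinding the local name creates a new list; os.walk never sees it.
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames = [d for d in dirnames if d not in skip]
        yield dirpath


def walk_slice_assign(root, skip):
    # Slice assignment mutates the list os.walk holds, so pruning works.
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d not in skip]
        yield dirpath


# Throwaway demo tree: src/ and .git/objects/
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'src'))
os.makedirs(os.path.join(root, '.git', 'objects'))


def rel(paths):
    return sorted(os.path.relpath(p, root) for p in paths)


rebinding = rel(walk_rebinding(root, {'.git'}))
pruned = rel(walk_slice_assign(root, {'.git'}))
print(rebinding)  # still includes .git and .git/objects
print(pruned)     # ['.', 'src']
```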
Torsten Marek
2009-05-29 10:05:38
+4
A:
Actually, os.walk may do exactly what you want. Say I have a list (perhaps a set) of directories to ignore in ignore. Then this should work:
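import os

def my_walk(top_dir, ignore):
    for dirpath, dirnames, filenames in os.walk(top_dir):
        # Entries in ignore are compared against os.path.join(dirpath, dn),
        # so they must be spelled the same way (relative to top_dir or absolute).
        dirnames[:] = [
            dn for dn in dirnames
            if os.path.join(dirpath, dn) not in ignore]
        yield dirpath, dirnames, filenames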
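Since the comparison above is on whole path strings, an entry such as "some/root/cache/" (trailing separator) would not match. One way around that, sketched here under the assumption that normalizing both sides with os.path.abspath is acceptable; walk_ignoring_paths and the demo directories are hypothetical names:

```python
import os
import tempfile


def walk_ignoring_paths(top_dir, ignore):
    """Variant of my_walk that compares normalized absolute paths."""
    ignore = {os.path.abspath(p) for p in ignore}
    for dirpath, dirnames, filenames in os.walk(top_dir):
        dirnames[:] = [
            dn for dn in dirnames
            if os.path.abspath(os.path.join(dirpath, dn)) not in ignore]
        yield dirpath, dirnames, filenames


# Throwaway demo tree: a/cache/ and b/cache/
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, 'a', 'cache'))
os.makedirs(os.path.join(top, 'b', 'cache'))

# Trailing separator on purpose: abspath() normalizes it away.
ignore = [os.path.join(top, 'a', 'cache') + os.sep]
seen = sorted(os.path.relpath(d, top)
              for d, _, _ in walk_ignoring_paths(top, ignore))
print(seen)
```

Only a/cache is pruned; b/cache is still visited, which is the difference between ignoring by path and ignoring by bare directory name.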
Rick Copeland
2009-05-29 10:06:40
I somehow forgot about slice assignment; I took the liberty of adding that to my code.
Torsten Marek
2009-05-29 10:10:50
This is the expected way of doing so, even says so in the documentation of os.path.walk().
unwind
2009-05-29 10:12:14
No, I mean full slice assignment as a way of modifying the whole list, not the fact that you can change it.
Torsten Marek
2009-05-29 10:13:48
@Torsten Marek: you start your comment with “No”, while you don't say anything different than unwind, who mentioned the docs, and I quote: “When topdown is True, the caller can modify the dirnames list in-place (perhaps using del or slice assignment)”.
ΤΖΩΤΖΙΟΥ
2009-05-29 15:03:26
@TZ...: I believe that @Torsten was treating @unwind's comment as a response to @Torsten's initial comment, in which case it makes perfect sense (to me, at least).
Rick Copeland
2009-05-29 15:13:10
I think it should be `if os.path.join(dirpath, dn)`... `dirname` is not defined.
thornomad
2009-11-22 22:24:50
@thornomad: corrected, thanks!
Rick Copeland
2009-11-24 16:10:31