Hi, I feel that there is (or should be) a Python function out there that recursively splits a path string into its constituent files and directories (beyond basename and dirname). I've written one, but since I use Python for shell scripting on 5+ computers, I was hoping for something from the standard library, or something simpler, that I can use on the fly.

import os

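def recsplit(x):
    # A string is split once into (head, tail); a tuple carries the parts found so far.
    if isinstance(x, str): return recsplit(os.path.split(x))
    # Stop once the head is exhausted ('', '.' or '/'); otherwise split the head again.
    else: return (x[0]=='' or x[0] == '.' or x[0]=='/') and x[1:] or \
          recsplit(os.path.split(x[0]) + x[1:])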

>>> print recsplit('main/sub1/sub2/sub3/file')
('main', 'sub1', 'sub2', 'sub3', 'file')

Any leads/ideas? ~Thanks~

A: 

path = 'main/sub1/sub2/sub3/file'
path.split(os.path.sep)

gg42
Thanks - very elegant, but I guess it doesn't necessarily work on Windows machines, as stated above...
Stephen
It's exactly what Vazquez-Abrams posted. I'm not sure why one would vote it *up*.
Devin Jeanpierre
+2  A: 

UPDATE: After all the mucking about with altsep, the currently selected answer doesn't even split on backslashes.

>>> import re, os.path
>>> seps = os.path.sep
>>> if os.path.altsep:
...   seps += os.path.altsep
...
>>> seps
'\\/'
>>> somepath = r"C:\foo/bar.txt"
>>> print re.split('[%s]' % (seps,), somepath)
['C:\\foo', 'bar.txt'] # Whoops!! it was splitting using [\/] same as [/]
>>> print re.split('[%r]' % (seps,), somepath)
['C:', 'foo', 'bar.txt'] # after fixing it
>>> print re.split('[%r]' % seps, somepath)
['C:', 'foo', 'bar.txt'] # removed redundant cruft
>>>
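
One caveat about the %r variant in the session above: repr() wraps the string in quote characters, so the quote itself ends up inside the character class and the pattern also splits on apostrophes. Here is a minimal sketch of an alternative, assuming re.escape() is acceptable for building the class. ntpath (the Windows flavour of os.path) is used only so the result is the same on any platform, and split_on_seps is a hypothetical helper name, not something from the original answer.

```python
import ntpath
import re

# ntpath stands in for os.path on Windows; using it directly is an
# assumption of this sketch so that it behaves the same on every platform.
seps = ntpath.sep + (ntpath.altsep or '')   # backslash followed by slash


def split_on_seps(path):
    # re.escape() escapes the backslash without adding quote characters
    return re.split('[%s]' % re.escape(seps), path)


print(split_on_seps(r"C:\foo/bar.txt"))      # ['C:', 'foo', 'bar.txt']
print(split_on_seps("it's/here"))            # ["it's", 'here']

# The %r pattern also treats the quote character as a separator:
print(re.split('[%r]' % seps, "it's/here"))  # ['it', 's', 'here']
```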

Now back to what we ought to be doing:

(end of update)

1. Consider carefully what you are asking for -- you may get what you want, not what you need.

If you have relative paths
r"./foo/bar.txt" (unix) and r"C:foo\bar.txt" (windows)
do you want
[".", "foo", "bar.txt"] (unix) and ["C:foo", "bar.txt"] (windows)
(do notice the C:foo in there) or do you want
["", "CWD", "foo", "bar.txt"] (unix) and ["C:", "CWD", "foo", "bar.txt"] (windows)
where CWD is the current working directory (a single one per process on unix; on windows each drive has its own, here that of C:)?

2. You don't need to faff about with os.path.altsep -- os.path.normpath() will make the separators uniform, and tidy up other weirdnesses like foo/bar/zot/../../whoopsy/daisy/somewhere/else
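
To illustrate point 2, here is a small sketch of what normpath() does to the kinds of paths mentioned above. It calls posixpath and ntpath (the unix and windows flavours of os.path) directly, which is an assumption made only so the output does not depend on the machine it runs on. Note that normpath() is purely lexical: it never looks at the filesystem.

```python
import ntpath
import posixpath

# ".." segments are collapsed lexically
print(posixpath.normpath('foo/bar/zot/../../whoopsy/daisy/somewhere/else'))
# foo/whoopsy/daisy/somewhere/else

# On windows, forward slashes are converted to backslashes
print(ntpath.normpath(r"C:/hello\world.txt"))
# C:\hello\world.txt

# A drive-relative path stays drive-relative; only the "." is dropped
print(ntpath.normpath(r"C:foo\.\bar.txt"))
# C:foo\bar.txt
```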

Solution step 1: unkink your path with one of os.path.normpath() or os.path.abspath().

Step 2: doing unkinked_path.split(os.path.sep) is not a good idea. You should pull it apart with os.path.splitdrive(), then use multiple applications of os.path.split().
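
The two steps might be sketched as follows. This is one possible reading, not a definitive implementation: split_path and its pathmod parameter are hypothetical names, and passing posixpath or ntpath explicitly (instead of relying on os.path) is an assumption made so the examples run identically on any platform. The drive and the root separator are kept as separate items, which lets the caller tell C:\foo from the drive-relative C:foo.

```python
import ntpath
import posixpath


def split_path(path, pathmod):
    """Return the components of `path` as a list: drive first (if any),
    then the root separator (if the path is absolute), then each name."""
    path = pathmod.normpath(path)            # step 1: unkink the path
    drive, rest = pathmod.splitdrive(path)   # step 2: peel off the drive...
    parts = []
    while True:                              # ...then apply split() repeatedly
        head, tail = pathmod.split(rest)
        if head == rest:                     # nothing left but the root or ''
            if head:
                parts.append(head)
            break
        if tail:
            parts.append(tail)
        rest = head
    if drive:
        parts.append(drive)
    parts.reverse()
    return parts


print(split_path('main/sub1/sub2/sub3/file', posixpath))
# ['main', 'sub1', 'sub2', 'sub3', 'file']
print(split_path('/usr/lib', posixpath))
# ['/', 'usr', 'lib']
print(split_path(r"C:/hello\world.txt", ntpath))
# ['C:', '\\', 'hello', 'world.txt']
print(split_path(r"C:foo\bar.txt", ntpath))
# ['C:', 'foo', 'bar.txt']
```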

Here are some examples of what would happen in step 1 on windows:

>>> os.path.abspath(r"C:/hello\world.txt")
'C:\\hello\\world.txt'
>>> os.path.abspath(r"C:hello\world.txt")
'C:\\Documents and Settings\\sjm_2\\hello\\world.txt'
>>> os.path.abspath(r"/hello\world.txt")
'C:\\hello\\world.txt'
>>> os.path.abspath(r"hello\world.txt")
'C:\\Documents and Settings\\sjm_2\\hello\\world.txt'
>>> os.path.abspath(r"e:hello\world.txt")
'E:\\emoh_ruo\\hello\\world.txt'
>>>

(the current drive is C, the CWD on drive C is \Documents and Settings\sjm_2, and the CWD on drive E is \emoh_ruo)

I'd like to suggest that you write step 2 without the conglomeration of "and" and "or" operators that you have in your example. Write code as if your eventual replacement knows where you live and owns a chainsaw :-)

John Machin
Why "unkinked_path.split(os.path.sep) is not a good idea" (apart from the drive letter possibly being stuck in there)?
Stephen
Thanks by the way
Stephen
Further experimentation suggests that if you've used os.path.abspath(), it doesn't make a difference. Using whatever.split(os.path.sep) certainly avoids the struggle of getting a recursive routine to work.
John Machin