What is the best way to get a list of all files in a directory, sorted by date [created | modified], using Python on a Windows machine?

+12  A: 

I've done this in the past for a Python script to determine the last updated files in a directory:

import glob
import os

search_dir = "/mydir/"
# keep only files (drops directories and broken symlinks)
# thanks to J.F. Sebastian for pointing out that the requirement was a list
# of files (presumably not including directories)
# list() keeps this working on Python 3, where filter() returns an iterator
files = list(filter(os.path.isfile, glob.glob(search_dir + "*")))
files.sort(key=os.path.getmtime)

That should do what you're looking for based on file mtime.

EDIT: Note that you can also use os.listdir() in place of glob.glob() if desired. I used glob in my original code because I only wanted files with a particular set of file extensions, which glob() is better suited to. With listdir, it would look like this:

import os

search_dir = "/mydir/"
# join first so isfile() and getmtime() see full paths,
# whatever the current working directory is
files = [os.path.join(search_dir, f) for f in os.listdir(search_dir)]
files = [f for f in files if os.path.isfile(f)]  # drop directories
files.sort(key=os.path.getmtime)
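Since the reason for using glob was filtering by extension, here is a minimal sketch of that case. The function name `files_by_mtime` and its `patterns` parameter are made up for illustration; only `glob.glob`, `os.path.isfile` and `os.path.getmtime` come from the code above.

```python
import glob
import os


def files_by_mtime(search_dir, patterns=("*",)):
    """Return paths of regular files in search_dir that match any of the
    glob patterns, oldest modification time first."""
    matches = set()  # a set, so a file matching two patterns appears once
    for pattern in patterns:
        matches.update(glob.glob(os.path.join(search_dir, pattern)))
    files = [p for p in matches if os.path.isfile(p)]
    # sort on (mtime, path) so files with equal mtimes come out in a fixed order
    return sorted(files, key=lambda p: (os.path.getmtime(p), p))
```

For example, `files_by_mtime("/mydir/", ("*.txt", "*.log"))` would return only the .txt and .log files. As with any glob-based approach, names starting with a period are skipped unless the pattern itself starts with one.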
Jay
glob() is nice, but keep in mind that it skips files starting with a period. *nix systems treat such files as hidden (thus omitting them from listings), but in Windows they are normal files.
efotinis
These solutions don't exclude dirs from list.
Constantin
Your os.listdir solution is missing the os.path.join: `files.sort(lambda x,y: cmp(os.path.getmtime(os.path.join(search_dir,x)), os.path.getmtime(os.path.join(search_dir,y))))`
Peter Hoffmann
`files.sort(key=lambda fn: os.path.getmtime(os.path.join(search_dir, fn)))`
J.F. Sebastian
`files = filter(os.path.isfile, os.listdir(search_dir))`
J.F. Sebastian
Your solution doesn't sort by creation date as OP asks. See http://stackoverflow.com/questions/168409/how-do-you-get-a-directory-listing-sorted-by-creation-date-in-python/539024#539024
J.F. Sebastian
@J.F. - the question actually asks "date [created | modified]" so mtime is a better choice than ctime.
Jay
@J.F. - thanks for pointing out the "key" param to sort. It was added in Python 2.4, and this code was originally written for Python 2.3, so I wasn't aware of it at the time. Learn something new every day!
Jay
A mere `files.sort(key=os.path.getmtime)` should work (without `lambda`).
J.F. Sebastian
A: 

Maybe you should use shell commands. In Unix/Linux, find piped with sort will probably be able to do what you want.

stephanea
+3  A: 

Here's a one-liner:

import os
import time
from pprint import pprint

pprint([(x[0], time.ctime(x[1].st_ctime))
        for x in sorted([(fn, os.stat(fn)) for fn in os.listdir(".")],
                        key=lambda x: x[1].st_ctime)])

This calls os.listdir() to get a list of the filenames, then calls os.stat() for each one to get st_ctime (the creation time on Windows; on Unix it is the time of the last metadata change), then sorts on that value.

Note that this method only calls os.stat() once for each file, which will be more efficient than calling it for each comparison in a sort.
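The same stat-once idea, spelled out as a function that also skips directories. This is only a sketch: the name `entries_by_ctime` is made up, and silently skipping entries that cannot be stat'ed (dead symlinks, for instance) is my assumption about what you would want.

```python
import os
import stat


def entries_by_ctime(dirpath="."):
    """Return (name, st_ctime) pairs for regular files, oldest first."""
    decorated = []
    for name in os.listdir(dirpath):
        try:
            st = os.stat(os.path.join(dirpath, name))  # one stat() per entry
        except OSError:
            continue  # assumption: ignore entries that cannot be stat'ed
        if stat.S_ISREG(st.st_mode):  # regular files only, no directories
            decorated.append((st.st_ctime, name))
    decorated.sort()  # sorts on the cached timestamps, no further stat() calls
    return [(name, ctime) for ctime, name in decorated]
```

Each entry is stat'ed exactly once, and the sort works on the stored timestamps, so no extra system calls happen during comparisons.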

Greg Hewgill
that's hardly pythonic, though it does solve the job (disclaimer: didn't test the code).
Adriano Varoli Piazza
This solution doesn't exclude dirs from list.
Constantin
@Constantin: that's true, but a quick [... if stat.S_ISREG(x[1].st_mode)] would handle that.
Greg Hewgill
@Greg: could you wrap your code to kill the horizontal scrollbar?
J.F. Sebastian
+5  A: 

Here's my version:

import os

def getfiles(dirpath):
    a = [s for s in os.listdir(dirpath)
         if os.path.isfile(os.path.join(dirpath, s))]
    a.sort(key=lambda s: os.path.getmtime(os.path.join(dirpath, s)))
    return a

First, we build a list of the file names. isfile() is used to skip directories; it can be omitted if directories should be included. Then, we sort the list in place, using the modification date as the key.
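If you also want to see the dates, or want the newest files first, a variant along these lines might do. It is a sketch only: `describe_files` and `newest_first` are hypothetical names, and the timestamp format is an arbitrary choice.

```python
import os
import time


def describe_files(dirpath, newest_first=False):
    """Return "YYYY-MM-DD HH:MM:SS  name" lines for the regular files in
    dirpath, ordered by modification time."""
    names = [s for s in os.listdir(dirpath)
             if os.path.isfile(os.path.join(dirpath, s))]
    names.sort(key=lambda s: os.path.getmtime(os.path.join(dirpath, s)),
               reverse=newest_first)
    lines = []
    for s in names:
        mtime = os.path.getmtime(os.path.join(dirpath, s))
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime))
        lines.append("%s  %s" % (stamp, s))  # two spaces between date and name
    return lines
```

Calling `describe_files(".", newest_first=True)` would list the most recently modified file first.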

efotinis
+3  A: 
import os

sorted(filter(os.path.isfile, os.listdir('.')),
    key=lambda p: os.stat(p).st_mtime)

You could use next(os.walk('.'))[-1] instead of filtering with os.path.isfile (on Python 2 before 2.6, spell it os.walk('.').next()[-1]), but that leaves dead symlinks in the list, and os.stat will fail on them.
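For what it's worth, here is a rough sketch of the os.walk variant that drops dead symlinks before sorting and accepts any directory rather than only the current one. The function name `files_sorted_by_mtime` is made up for illustration.

```python
import os


def files_sorted_by_mtime(dirpath="."):
    """Return non-directory names from the top level of dirpath, oldest
    modification time first, skipping symlinks whose target is gone."""
    # os.walk yields (dirpath, dirnames, filenames); only the first triple
    # (the top level) is needed
    _, _, filenames = next(os.walk(dirpath))
    # os.path.exists() follows symlinks, so it is False for dead ones
    live = [fn for fn in filenames
            if os.path.exists(os.path.join(dirpath, fn))]
    return sorted(live,
                  key=lambda fn: os.stat(os.path.join(dirpath, fn)).st_mtime)
```

Note that `next(os.walk(dirpath))` raises StopIteration if dirpath does not exist, because os.walk ignores listing errors by default.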

fivebells
That only works in the current directory though.
Tom
+3  A: 

Here's a more verbose version of @Greg Hewgill's answer. It conforms most closely to the question's requirements: it returns regular files only, and it distinguishes between creation and modification dates (at least on Windows).

#!/usr/bin/env python
from stat import S_ISREG, ST_CTIME, ST_MODE
import os, sys, time

# path to the directory (relative or absolute)
dirpath = sys.argv[1] if len(sys.argv) == 2 else r'.'

# get all entries in the directory w/ stats
entries = (os.path.join(dirpath, fn) for fn in os.listdir(dirpath))
entries = ((os.stat(path), path) for path in entries)

# leave only regular files, insert creation date
entries = ((stat[ST_CTIME], path)
           for stat, path in entries if S_ISREG(stat[ST_MODE]))
# NOTE: on Windows `ST_CTIME` is the creation date,
#   but on Unix it is the time of the last metadata change
# NOTE: use `ST_MTIME` (also in the stat module) to sort by modification date

for cdate, path in sorted(entries):
    # formatted this way so the line works on both Python 2 and Python 3
    print("%s %s" % (time.ctime(cdate), os.path.basename(path)))

Example:

$ python stat_creation_date.py
Thu Feb 11 13:31:07 2009 stat_creation_date.py
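To switch between the two dates without editing the script, something like the following sketch could work. The function `list_files` and its `by` parameter are hypothetical names, and "created" is only accurate where st_ctime really is the creation time (that is, on Windows).

```python
import os
import stat


def list_files(dirpath=".", by="modified"):
    """Return (timestamp, path) pairs for regular files, oldest first.

    by="created"  uses st_ctime (creation time on Windows only)
    by="modified" uses st_mtime
    """
    if by not in ("created", "modified"):
        raise ValueError("by must be 'created' or 'modified'")
    attr = "st_ctime" if by == "created" else "st_mtime"
    result = []
    for fn in os.listdir(dirpath):
        path = os.path.join(dirpath, fn)
        st = os.stat(path)  # raises OSError for dead symlinks; not handled here
        if stat.S_ISREG(st.st_mode):  # leave only regular files
            result.append((getattr(st, attr), path))
    return sorted(result)
```

Usage would be `list_files(r'.', by="created")` or `list_files(r'.', by="modified")`, and each timestamp can be passed to time.ctime() for printing as in the script above.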
J.F. Sebastian