I am having some difficulty writing a function that will search through a directory for a file that matches a specific regular expression (which I have compiled using 're.compile'). So my question is: How do I search through a directory (I plan to use os.walk) for a file that matches a specific regular expression? An example would be very much appreciated. Thanks in advance.

+2  A: 
  1. Read about the RE pattern's match method.

  2. Read all answers to "How do I copy files with specific file extension to a folder in my python (version 2.5) script?"

  3. Pick one that uses fnmatch. Replace fnmatch with re.match. This requires careful thought. It's not a cut-and-paste.

  4. Then, ask specific questions.
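
A minimal sketch of what step 3 could look like, assuming the fnmatch version walks the tree with os.walk and tests each bare file name. The helper name `find_files` is hypothetical and does not come from the linked answers:

```python
import os
import re


def find_files(top, pattern):
    """Yield full paths under `top` whose file name matches `pattern`.

    `pattern` is a compiled regular expression (the result of re.compile).
    `find_files` is a hypothetical helper name used only for illustration.
    """
    for root, dirs, files in os.walk(top):
        for name in files:
            # An fnmatch version would test fnmatch.fnmatch(name, '*.txt') here.
            # With a regex, call the compiled pattern's match method instead.
            # match() only anchors at the start, so the pattern itself must
            # say where the name ends (for example with a trailing $).
            if pattern.match(name):
                yield os.path.join(root, name)
```

It could be called as `list(find_files('.', re.compile(r'.*\.txt$')))`. The careful thought is in the pattern: a wildcard such as `*.txt` covers the whole name, while a regex needs explicit anchoring.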

S.Lott
+7  A: 

This finds all files whose names start with two digits and end in gif. You can add the files to a global list if you wish:

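import os
import re

# File names that start with two digits and end in "gif".
r = re.compile(r'\d{2}.+gif$')
for root, dirs, files in os.walk('/home/vinko'):
    # match() anchors at the start of the name; $ anchors at the end.
    matches = [os.path.join(root, x) for x in files if r.match(x)]
    if matches:
        print(matches)  # Or append to a global list, whatever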
Vinko Vrsalovic
Thank you Vinko! :)
Amara
Your regexp also matches 'a.gif.bak', which does not end with 'gif'.
J.F. Sebastian
You should use `match`, not `search`.
S.Lott
fnmatch.filter(files, '*.gif') will do just fine. A better example would be r'(?i)Merlin.*?s(\d{2})e(\d{2}).*?(?:avi|mkv)$'.
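
To illustrate the contrast, here is a small sketch that runs both approaches over a made-up list of file names. The names are hypothetical and chosen only to exercise the two patterns:

```python
import fnmatch
import re

# Hypothetical file names, invented for this illustration.
files = [
    'Merlin.S01E02.avi',
    'merlin_s03e11_hdtv.mkv',
    'holiday.gif',
    'Merlin.s01e02.srt',
]

# A plain extension test needs no regex at all.
gifs = fnmatch.filter(files, '*.gif')

# The regex from the comment does something a wildcard cannot:
# it also captures the season and episode numbers.
episode = re.compile(r'(?i)Merlin.*?s(\d{2})e(\d{2}).*?(?:avi|mkv)$')
episodes = {}
for name in files:
    m = episode.match(name)
    if m:
        episodes[name] = m.groups()  # (season, episode) as strings
```

The wildcard is enough when only the extension matters. The regex earns its keep once parts of the name have to be extracted or the match must be case-insensitive.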
J.F. Sebastian
@S.Lott: why? @J.F. Sebastian: Right. That's a better example.
Vinko Vrsalovic
About search() vs. match(): as far as I can tell, it depends on how you are going to write your regex. search() returns a match if the pattern matches anywhere in the string, whereas match() returns a match only if the pattern matches at the beginning.
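
A short sketch of that difference, using the pattern from the answer and two hypothetical file names:

```python
import re

pattern = re.compile(r'\d{2}.+gif$')

# Hypothetical file names: only the first one starts with two digits.
names = ['12_cat.gif', 'cat_12_dog.gif']

# match() only succeeds when the pattern matches at the start of the string.
matched = [n for n in names if pattern.match(n)]

# search() scans the whole string, so digits in the middle are enough.
searched = [n for n in names if pattern.search(n)]

# Adding ^ to the pattern makes search() behave like match() here.
anchored = [n for n in names if re.search(r'^\d{2}.+gif$', n)]
```

Here `matched` and `anchored` contain only '12_cat.gif', while `searched` contains both names.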
Vinko Vrsalovic
+1  A: 

If the pattern you have to match is simple enough to grab with filesystem wildcards, I recommend you take a look at the glob module, which exists for this exact purpose.
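
As a rough sketch, assuming the files sit directly in one directory, a wildcard close to the "two digits, ends in gif" regex could be used like this. The helper name `glob_in` is hypothetical:

```python
import glob
import os


def glob_in(directory, wildcard):
    """Return sorted paths directly inside `directory` that match `wildcard`.

    `glob_in` is a hypothetical helper name. A plain wildcard like this does
    not descend into subdirectories, unlike os.walk.
    """
    return sorted(glob.glob(os.path.join(directory, wildcard)))
```

For example, `glob_in('/home/vinko', '[0-9][0-9]*gif')` would list names that start with two digits and end in gif. Wildcards have no repetition counts or capture groups, so anything more involved still calls for a regex.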

Tim Keating
+1  A: 

One of the Stack Overflow founders is a big fan of RegexBuddy from JGSoft. I tried it on a whim when I was writing a simple file-moving script at work, and it makes generating the best regex for a job quite easy in the language of your choice. If you're having trouble developing the regex itself, this is a nice tool for checking your logic. I guess I'm a big fan now as well.

jonny