I need a grep-like way of searching a file for a regular expression 'RE' from the Unix command line. For example, when I type on the command line:

python pythonfile.py 'RE' 'file-to-be-searched'

I need the regular expression 'RE' to be searched for in the file, and the lines that contain a match to be printed.

OK, so here's the code I have:

import re
import sys

search_term = sys.argv[1]
f = sys.argv[2]

for line in open(f, 'r'):
    if re.search(search_term, line):
        print line,
        if line == None:
            print 'no matches found'

But when I enter a word which isn't present, 'no matches found' doesn't print.

A: 
  1. use sys.argv to get the command-line parameters
  2. use open(), read() to manipulate file
  3. use the Python re module to match lines
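
The three steps above could be sketched roughly as follows. This is only an illustration, not the answerer's own code: the helper names `grep_file` and `main` are hypothetical, and `sys.stdout.write` is used instead of `print` so that the sketch runs under both Python 2 and Python 3.

```python
import re
import sys


def grep_file(pattern, path):
    """Return the lines of the file at `path` that contain a match for `pattern`."""
    matches = []
    with open(path, 'r') as handle:        # step 2: open the file
        for line in handle:
            if re.search(pattern, line):   # step 3: match with the re module
                matches.append(line)
    return matches


def main(argv):
    # step 1: argv[1] is the regular expression, argv[2] the file to search
    for line in grep_file(argv[1], argv[2]):
        sys.stdout.write(line)  # each line already ends with a newline


# To use this as a script, call: main(sys.argv)
```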
jldupont
+3  A: 

The natural question is why not just use grep?! But assuming you can't ...

import re
import sys

file = open(sys.argv[2], "r")

for line in file:
    if re.search(sys.argv[1], line):
        print line,

Things to note:

  • use search instead of match to find the pattern anywhere in the string
  • the trailing comma after print suppresses the extra newline (the line already ends with one)
  • argv includes the Python file name, so the arguments start at index 1

This doesn't handle multiple file arguments (like grep does) or expand wildcards (like the Unix shell would). If you want this functionality you could get it using the following:

import re
import sys
import glob

for arg in sys.argv[2:]:
    for file in glob.iglob(arg):
        for line in open(file, 'r'):
            if re.search(sys.argv[1], line):
                print line,
Nick Fortescue
You need to import re first.
Tommy Herbert
thanks, sys too. I've edited
Nick Fortescue
nice semicolon ;)
The MYYN
doh! too much Java recently. Thanks, I'll fix
Nick Fortescue
you should compile your regex before using the loops.
This has two down votes and I have no idea why. Anyone who downvoted want to leave a comment? I know you could add regex compilation etc, but I thought that would detract from the clarity of the answer. I don't think there is anything incorrect, and I've run the code, unlike some of the other answers
Nick Fortescue
This answer was perfect for me, thanks. Just another quick question: how would I print a message if no matches were found?
David
Add a counter and increment it whenever a match happens. At the end, check it in an if and print a message if no matches were found.
Nick Fortescue
OK, I put a line counter in which counts the number of lines. But when I execute the program nothing is printed, i.e. it won't print 'no matches found'.
David
can you add this problem as another stack overflow question, and show your source code as it looks at the moment? put a reference to this question, and give me a comment with the new question number
Nick Fortescue
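
For reference, the counter approach suggested in the comments above (together with compiling the regex once, outside the loop) might look something like this. It is a sketch under assumptions: the function name `grep_count` and its `out` parameter are hypothetical, and it writes with `out.write` so it behaves the same under Python 2 and Python 3.

```python
import re
import sys


def grep_count(pattern, lines, out=sys.stdout):
    """Write matching lines to `out`; report when nothing matched."""
    regex = re.compile(pattern)  # compile once, before the loop
    count = 0
    for line in lines:
        if regex.search(line):
            out.write(line)      # line already ends with a newline
            count += 1
    if count == 0:               # checked after the loop, not inside it
        out.write('no matches found\n')
    return count
```

With a file this could be called as `grep_count(sys.argv[1], open(sys.argv[2]))`. The check has to happen after the loop: a line read from a file is never None, and a test placed inside the `if re.search(...)` branch only runs for lines that did match, which is why the code in the question never prints the message.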
+1  A: 

adapted from a grep in python ...

accepts multiple filenames (sys.argv[2:]), so shell globs work as the filename argument; no exception handling:

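#!/usr/bin/env python
import re, sys, os

# search every regular file named on the command line
for f in filter(os.path.isfile, sys.argv[2:]):
    for line in open(f):
        # re.search finds the pattern anywhere in the line, as grep does;
        # re.match would only match at the start of the line
        if re.search(sys.argv[1], line):
            print line,  # trailing comma: line already ends with a newline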

sys.argv[1] and sys.argv[2:] work if you run it as a standalone executable, meaning

chmod +x

first

The MYYN
what's the difference between `re.match` and `re.search` ?
OscarRyz
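
A short illustration of the difference, for reference: `re.match` only tries to match at the beginning of the string, while `re.search` scans the whole string for the first match. The sample string and variable names below are made up for the example.

```python
import re

line = "hello world\n"

# re.match only tries to match at the beginning of the string
at_start = re.match("world", line)     # None: "world" is not at position 0

# re.search scans the whole string for the first match
anywhere = re.search("world", line)    # match object for "world"

# anchoring the pattern with ^ makes re.search behave like re.match here
anchored = re.search("^world", line)   # None again
```

For a grep-like tool, `re.search` is usually the one you want, since grep reports a line when the pattern occurs anywhere in it.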
The shell expands wildcards before the program (your script in this case) sees them. So there is no need to use glob, and the script will even show unexpected behavior when used like 'pythongrep foo *' where the files in the current directory are 'a b c': it will only search in file 'a', since it's actually executed as 'pythongrep foo a b c' after the shell expansion.
Harmen
A: 

Here is a nice tutorial for the re module in Python: http://www.amk.ca/python/howto/regex/ .

Jabba