ansaurus

Question

Match start and end of file in python with regex

Answer 1

+1 A:

Maybe you should pose your question more clearly, e.g. say what you are trying to do. That said, you can slurp the file into a single string and match your pattern against it using re.

import re

# Read the whole file into a single string
with open("file") as f:
    data = f.read()
pat = re.compile(r"^.*pattern.*$", re.M | re.DOTALL)
print(pat.findall(data))

There are better ways to do what you want, whatever it is, without re.

ghostdog74 2010-03-02 10:51:38

Because .* is greedy, that will find only one instance of 'pattern' in a file. Since you specified the re.M flag, $ matches just before every newline in the file, so with greedy .* and re.DOTALL, the first .* will match everything before the last 'pattern' in the file, and the second will match everything after the last 'pattern'.

Mark Tolonen 2010-03-02 12:44:49
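The greedy behaviour described in the comment above can be checked on a small string. The following is a minimal sketch: the sample text and variable names are made up for illustration, and dropping re.DOTALL is just one possible way to get one match per line, not necessarily what the asker needs.

```python
import re

# Hypothetical sample data: two lines contain 'pattern', one does not.
data = "first pattern here\nnothing\nsecond pattern there"

# With re.DOTALL, '.' also matches newlines, so the greedy '.*' spans
# the whole string and findall returns a single match.
greedy = re.compile(r"^.*pattern.*$", re.M | re.DOTALL)
whole = greedy.findall(data)
print(len(whole))

# Without re.DOTALL, '.' stops at newlines, so each matching line
# is reported separately.
per_line = re.compile(r"^.*pattern.*$", re.M)
lines = per_line.findall(data)
print(lines)
```

With re.DOTALL the single match is the entire string; without it, only the two lines containing 'pattern' come back.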

Whatever. This is not a full solution, as we are not sure what the OP really wants to do. The best I can do is tell him he can read the whole file as a string and run a regex on it just like on any normal string.

ghostdog74 2010-03-02 12:55:07

Answer 2

A:

regex $ is NOT your friend; see this SO answer

John Machin 2010-03-02 13:13:29
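The linked answer is not reproduced here. One well-known pitfall with $, which may be what this answer has in mind (that is an assumption), is that even without re.MULTILINE it also matches just before a trailing newline, whereas \Z matches only at the very end of the string. A minimal sketch with made-up sample text:

```python
import re

# Hypothetical sample: the text ends with a trailing newline.
text = "last line\n"

# '$' matches at the end of the string AND just before a final newline.
dollar = re.search(r"line$", text)
print(dollar is not None)

# '\Z' matches only at the very end of the string, so the trailing
# newline prevents a match.
strict = re.search(r"line\Z", text)
print(strict is None)

# Allowing for the newline explicitly makes the intent clear.
explicit = re.search(r"line\n?\Z", text)
print(explicit is not None)
```

If a trailing newline should count as part of the file, \Z with an explicit optional \n states that directly instead of relying on the leniency of $.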

Answer 3

+2 A:

Read the whole file into a string; then \A matches only at the beginning of the string, and \Z matches only at the end of the string. With re.MULTILINE, '^' matches at the beginning of the string and just after each newline, and '$' matches at the end of the string and just before each newline. See the Python documentation for re syntax.

import re

data = '''sentence one.
sentence two.
a bad sentence
sentence three.
sentence four.'''

# find lines ending in a period
print(re.findall(r'^.*\.$', data, re.MULTILINE))
# match if the first line ends in a period
print(re.findall(r'\A^.*\.$', data, re.MULTILINE))
# match if the last line ends in a period
print(re.findall(r'^.*\.$\Z', data, re.MULTILINE))

Output:

['sentence one.', 'sentence two.', 'sentence three.', 'sentence four.']
['sentence one.']
['sentence four.']

Mark Tolonen 2010-03-02 13:14:35
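Applied to an actual file, the \A and \Z anchors from the answer above could be used roughly as follows. This is only a sketch under assumptions: the file contents, the BEGIN and END markers, and the variable names are hypothetical, and a temporary file is created so the example runs on its own.

```python
import os
import re
import tempfile

# Write a small hypothetical file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("BEGIN\nsome content\nEND\n")
    path = tmp.name

# Read the whole file into one string, then clean up.
with open(path) as f:
    data = f.read()
os.remove(path)

# \A anchors at the very start of the file contents.
starts_ok = re.search(r"\ABEGIN$", data, re.MULTILINE) is not None
# \Z anchors at the very end; \n? tolerates a final newline.
ends_ok = re.search(r"^END\n?\Z", data, re.MULTILINE) is not None

print(starts_ok, ends_ok)
```

Reading the whole file at once is fine for small inputs; for very large files a different approach may be preferable.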
