views:

73

answers:

1

I'm working on a "grep-like" utility in Python for searching Oracle source code files. Coding standards have changed over time, so trying to find something like "all deletes from table a.foo" could span multiple lines, or not, depending on the age of that piece of code:

s = """-- multiline DDL statement
DELETE
    a.foo f
WHERE
    f.bar = 'XYZ';

DELETE a.foo f
WHERE f.bar = 'ABC';

DELETE a.foo WHERE bar = 'PDQ';
"""

import re

p = re.compile( r'\bDELETE\b.+?a\.foo', re.MULTILINE | re.DOTALL )

for m in re.finditer( p, s ):
    print s[ m.start() : m.end() ]

This outputs:

DELETE
    a.foo
DELETE a.foo
DELETE a.foo

What I want:

[2] DELETE
[3]     a.foo
[7] DELETE a.foo
[10] DELETE a.foo

Is there a quick/simple/builtin way to map string indices to line numbers?

+1  A: 
lineno = s.count("\n",0,m.start())+1
Martin v. Löwis
Thanks, you rock!
kurosch