I was going to suggest the compiler module, but it ignores comments:
f.py:
# For Translators: some useful info about the sentence below
_("Some string blah blah")
..and the compiler module:
>>> import compiler
>>> m = compiler.parseFile("f.py")
>>> m
Module(None, Stmt([Discard(CallFunc(Name('_'), [Const('Some string blah blah')], None, None))]))
The AST module in Python 2.6 seems to do the same.
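For anyone on Python 3, where the compiler module no longer exists, here is a minimal sketch of the same check with the standard ast module. The `source` string just mirrors f.py above, and `comment_survives` / `string_survives` are names I made up for illustration:

```python
import ast

# Same content as f.py above, inlined so the snippet is self-contained
source = (
    "# For Translators: some useful info about the sentence below\n"
    '_("Some string blah blah")\n'
)

tree = ast.parse(source)
dumped = ast.dump(tree)

# The comment never reaches the tree; only the _() call and its argument do
comment_survives = "For Translators" in dumped
string_survives = "Some string blah blah" in dumped
print(comment_survives, string_survives)
```

So the comment is dropped before the tree is built, and there is nothing left to walk.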
I'm not sure whether it's an option for you, but if you use triple-quoted strings instead of comments..
"""For Translators: some useful info about the sentence below"""
_("Some string blah blah")
..you can reliably parse the Python file with the compiler module:
>>> m = compiler.parseFile("f.py")
>>> m
Module('For Translators: some useful info about the sentence below', Stmt([Discard(CallFunc(Name('_'), [Const('Some string blah blah')], None, None))]))
I made an attempt at writing a more complete script to extract docstrings - it's incomplete, but seems to grab most docstrings: http://pastie.org/446156 (or on github.com/dbr/so_scripts)
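As a rough idea of what pairing the strings with the calls could look like, here is a small sketch using the ast module. It is not the linked script, and it assumes Python 3.8+, where string literals show up as `ast.Constant` nodes. The helper names (`is_str_expr`, `gettext_message`, `translator_notes`) are made up for this example, and it only looks at statements in a `body` list, so `else` / `finally` blocks are skipped:

```python
import ast

source = '''"""For Translators: some useful info about the sentence below"""
_("Some string blah blah")
'''

def is_str_expr(node):
    """True for a statement that consists only of a string literal."""
    return (isinstance(node, ast.Expr)
            and isinstance(node.value, ast.Constant)
            and isinstance(node.value.value, str))

def gettext_message(node):
    """Return the literal passed to _("..."), or None for anything else."""
    if not (isinstance(node, ast.Expr) and isinstance(node.value, ast.Call)):
        return None
    call = node.value
    if not (isinstance(call.func, ast.Name) and call.func.id == "_"):
        return None
    if (call.args and isinstance(call.args[0], ast.Constant)
            and isinstance(call.args[0].value, str)):
        return call.args[0].value
    return None

def translator_notes(source):
    """Collect (line, note, message) for bare strings directly above a _() call."""
    notes = []
    for parent in ast.walk(ast.parse(source)):
        body = getattr(parent, "body", None)
        if not isinstance(body, list):  # e.g. Lambda.body is a single expression
            continue
        for prev, cur in zip(body, body[1:]):
            message = gettext_message(cur)
            if is_str_expr(prev) and message is not None:
                notes.append((cur.lineno, prev.value.value, message))
    return notes

notes = translator_notes(source)
print(notes)
```

Walking every `body` list rather than only the module docstring means a bare string inside a function or class is picked up too, as long as it sits directly above the `_()` call.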
The other, much simpler, option would be to use regular expressions, for example:
import re

f = """# For Translators: some useful info about the sentence below
_("Some string blah blah")
""".split("\n")

for i, line in enumerate(f):
    # Capture the text of a "For Translators:" comment
    m = re.findall(r"\s*# (For Translators: .*)$", line)
    # Only report the match if there is a following line to show
    if m and i + 1 < len(f):
        print "Line Number:", i + 1
        print "Message:", m
        print "Line:", f[i + 1]
..outputs:
Line Number: 1
Message: ['For Translators: some useful info about the sentence below']
Line: _("Some string blah blah")
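A regex will also match a "# For Translators:" that happens to sit inside a string literal. A middle ground between the two approaches might be the standard tokenize module, which (unlike the parsers above) does keep comments. This is only a sketch for Python 3; `translator_comments` and its `prefix` parameter are hypothetical names for this example:

```python
import io
import tokenize

source = (
    "# For Translators: some useful info about the sentence below\n"
    '_("Some string blah blah")\n'
)

def translator_comments(source, prefix="For Translators:"):
    """Return (line_number, comment_text, next_line) for each matching comment."""
    lines = source.splitlines()
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.COMMENT:
            continue
        text = tok.string.lstrip("#").strip()
        if text.startswith(prefix):
            row = tok.start[0]  # 1-based line number of the comment
            # lines is 0-based, so lines[row] is the line after the comment
            next_line = lines[row] if row < len(lines) else ""
            found.append((row, text, next_line))
    return found

results = translator_comments(source)
print(results)
```

Because the tokenizer only reports real COMMENT tokens, text inside string literals is never mistaken for a translator note.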
I'm not sure how the .pot file is generated, so I can't be any help at all with that part..