The following sample code:
import token, tokenize, StringIO

def generate_tokens(src):
    rawstr = StringIO.StringIO(unicode(src))
    tokens = tokenize.generate_tokens(rawstr.readline)
    for i, item in enumerate(tokens):
        toktype, toktext, (srow, scol), (erow, ecol), line = item
        print i, token.tok_name[toktype], toktext

s = \
"""
def test(x):
    \"\"\" test with an unterminated docstring
"""

generate_tokens(s)
raises the following exception:
... (stripped a little)
  File "/usr/lib/python2.6/tokenize.py", line 296, in generate_tokens
    raise TokenError, ("EOF in multi-line string", strstart)
tokenize.TokenError: ('EOF in multi-line string', (3, 5))
Some questions about this behaviour:
- Should I catch and 'selectively' ignore tokenize.TokenError here? Or should I stop trying to generate tokens from non-compliant/non-complete code? If so, how would I check for that?
- Can this error (or similar errors) be caused by anything other than an unterminated docstring?
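For context, here is a minimal sketch of what "catch and selectively ignore" might look like. It is written in Python 3 syntax (`io.StringIO` in place of `StringIO`, `next()` instead of `print`-based iteration); the helper name `tolerant_tokens` is my own invention, not a stdlib API:

```python
import io
import token
import tokenize

def tolerant_tokens(src):
    """Yield tokens until the stream ends or tokenize gives up mid-construct."""
    gen = tokenize.generate_tokens(io.StringIO(src).readline)
    while True:
        try:
            yield next(gen)
        except StopIteration:
            return
        except tokenize.TokenError:
            # Raised at EOF inside an open construct, e.g.
            # 'EOF in multi-line string' (unterminated triple quote) or
            # 'EOF in multi-line statement' (unclosed bracket).
            return

# Complete source tokenizes all the way to ENDMARKER.
full = [token.tok_name[t.type] for t in tolerant_tokens('x = 1\n')]

# Unterminated docstring: tokens before the open string are still produced.
partial = [token.tok_name[t.type]
           for t in tolerant_tokens('def test(x):\n    """ oops\n')]

# An unclosed bracket triggers the sibling 'EOF in multi-line statement' error.
bracket = [token.tok_name[t.type] for t in tolerant_tokens('spam = [1, 2,\n')]
```

Note that the same exception type also covers unclosed brackets ('EOF in multi-line statement'), so truly "selective" handling would have to inspect the exception's message, not just its type.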