I have code that roughly looks like this (the entire code is a bit too long to copy here):

import re
from subprocess import Popen, PIPE

goodOutput = re.compile(r'\S+: 0x[0-9a-fA-F]{8} \d \d\s+->\s+0x[0-9a-fA-F]{8}')

p = Popen(['/tmp/myexe', param], stdout=PIPE, stderr=PIPE, cwd='/tmp')

stdout, stderr = p.communicate()

ret = goodOutput.match(stdout)
if ret is None:
    print "No match in: " + stdout

match() doesn't match this, but if I copy the stdout shown by the print statement and paste that string into the script above as the value of stdout, it matches. So the regex pattern should be all right. It also works if I read the string from stdin (stdout = sys.stdin.read()).

I've tried to rstrip() stdout as well, but that didn't help either (besides, shouldn't match() make this unnecessary?).

When I print stdout with repr() the string looks like

'xxx[a]: 0xff2eff00 4 7\t->\t0xff2eff00\n'

and if I try to match() against this, it doesn't match. Is this an issue with the tab and newline characters, and if so, what should I do?

A: 

There still seem to be either typos in your regex or errors that lead to it not matching (extraneous }, too much whitespace).

Try

goodOutput = re.compile(r"\s*[^:]+:\s*0x[0-9a-fA-F]{8}\s+\d\s+\d\s+->\s+0x[0-9a-fA-F]{8}")

and see if that helps.

Also, try re.search() vs. re.match() and see if that makes any difference.
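To illustrate the difference, here is a minimal sketch. The pattern name hex_word is made up for this example, and the sample line is the one repr() printed in the question. match() only succeeds if the pattern matches at the very start of the string, while search() scans forward for the first position where it matches.

```python
import re

# Illustrative pattern (not from the question): a single 8-digit hex word.
hex_word = re.compile(r'0x[0-9a-fA-F]{8}')

# Sample line as printed by repr() in the question.
line = 'xxx[a]: 0xff2eff00 4 7\t->\t0xff2eff00\n'

# match() is anchored at position 0; the line starts with "xxx", so this is None.
at_start = hex_word.match(line)

# search() tries every position and finds the first hex word further in.
anywhere = hex_word.search(line)
first = anywhere.group()
```

If search() succeeds where match() fails, something precedes the text you expected at the start of the string.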

Tim Pietzcker
Makis
A: 

Are you sure there are no leading spaces or other invisible characters in stdout? If you copy and paste the text that follows them, but not the characters themselves, that would explain why your test 'by hand' works.

If so, you may want to use re.search (matches anywhere in the string) instead of re.match (matches only at the beginning), or remove the leading characters first.
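A rough sketch of that check, assuming the stray character is a leading newline (the noisy string here is invented for the example; the pattern is the one from the question):

```python
import re

# Pattern copied from the question.
good_output = re.compile(r'\S+: 0x[0-9a-fA-F]{8} \d \d\s+->\s+0x[0-9a-fA-F]{8}')

# Assumed input: the sample line with one leading newline in front of it.
noisy = '\n' + 'xxx[a]: 0xff2eff00 4 7\t->\t0xff2eff00\n'

# repr() of the first few characters exposes whitespace that print would hide.
head = repr(noisy[:4])

# match() fails because \S+ cannot match the newline at position 0.
direct = good_output.match(noisy)

# Either search anywhere in the string, or strip leading whitespace first.
found = good_output.search(noisy)
after_strip = good_output.match(noisy.lstrip())
```

Note that rstrip() alone would not help in this situation, since it only removes trailing whitespace.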

kriss
I checked the output with repr() and it doesn't show any leading spaces. I edited my question to show what repr() printed.
Makis
A: 

Your regex contained some stray characters. With a corrected version of it, everything matches:

>>> import re
>>> s = 'xxx[a]: 0xff2eff00 4 7\t->\t0xff2eff00\n'
>>> re.match(r'\S+: 0x[0-9a-f]{8} \d \d\s+->\s+0x[0-9a-f]{8}', s, re.I).group()
'xxx[a]: 0xff2eff00 4 7\t->\t0xff2eff00'
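The same check can be run end to end through Popen. This is only a sketch: since /tmp/myexe is not available, a child Python process (child_code, a stand-in invented for this example) writes the sample line instead. universal_newlines=True is passed so that communicate() returns text; on Python 3 it would otherwise return bytes, which a str pattern cannot match.

```python
import re
import sys
from subprocess import Popen, PIPE

# Same pattern as in the session above, compiled with re.I.
good_output = re.compile(r'\S+: 0x[0-9a-f]{8} \d \d\s+->\s+0x[0-9a-f]{8}', re.I)

# Hypothetical stand-in for /tmp/myexe: a child process that writes the sample line.
# The raw string keeps \t and \n as escapes for the child to interpret.
child_code = r"import sys; sys.stdout.write('xxx[a]: 0xff2eff00 4 7\t->\t0xff2eff00\n')"

p = Popen([sys.executable, '-c', child_code], stdout=PIPE, stderr=PIPE,
          universal_newlines=True)
stdout, stderr = p.communicate()

# match() is anchored only at the start, so the trailing newline is harmless.
match = good_output.match(stdout)
matched_text = match.group() if match else None
```

If this matches but the real program's output does not, comparing repr(stdout) from both runs should show where they differ.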
SilentGhost