ansaurus

Question

Matching multiple regex groups and removing them

Answer 1

+1 A:

How about replacing (^LINE: \d+$)|(^\w+:) with an empty string ""?

Use \n instead of ^ and $ if you also want to remove the empty lines that are left behind.
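As a rough illustration of the suggestion above, here is a minimal Python sketch. The sample text is borrowed from the data shown elsewhere in this thread, and the names `anchored`, `with_newlines`, `stripped` and `compact` are made up for the example. Note that in Python, ^ and $ only match at every line boundary when re.MULTILINE is set. The second pattern is one possible reading of the "\n" advice, not necessarily the exact expression the answer had in mind.

```python
import re

# Sample input modeled on the data shown in this thread (an assumption).
text = """LINE: 1
TOKENKIND: somedata
TOKENKIND: somedata
LINE: 2
TOKENKIND: somedata
LINE: 3"""

# re.MULTILINE makes ^ and $ match at each line boundary,
# not only at the start and end of the whole string.
anchored = re.compile(r'(^LINE: \d+$)|(^\w+:)', re.MULTILINE)
stripped = anchored.sub('', text)

# Hypothetical variant: also consume the newline after each LINE row,
# so that no empty lines are left behind.
with_newlines = re.compile(r'(^LINE: \d+\n?)|(^\w+:)', re.MULTILINE)
compact = with_newlines.sub('', text)

print(repr(stripped))
print(repr(compact))
```

With the first pattern the LINE rows become empty lines; with the second they disappear entirely. In both cases each remaining value keeps the space that followed the colon.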

Amarghosh 2009-11-24 16:21:08

Sorry, I don't think I was precise enough. What I would like to know is whether my for loop is the correct way of ignoring anything matched by WHITESPACE, LINE and TOKEN.

greenie 2009-11-24 16:24:38

Alex has posted an improved and pythonified version of this.

Amarghosh 2009-11-24 16:47:06

Answer 2

+2 A:

import re

x = '''LINE: 1
TOKENKIND: somedata
TOKENKIND: somedata
LINE: 2
TOKENKIND: somedata
LINE: 3'''

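# Match either a whole "LINE: n" row (with surrounding whitespace) or a "TOKENKIND:" label.
# re.DOTALL only changes how '.' matches, so it has no effect on this pattern.
junkre = re.compile(r'(\s*LINE:\s*\d*\s*)|(\s*TOKENKIND:)', re.DOTALL)

print(junkre.sub('', x))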

Alex Martelli 2009-11-24 16:26:15

Perfect. Removing my for loop and using sub() worked fine. Thanks for your help.

greenie 2009-11-24 16:31:53

Answer 3

+1 A:

There's no need to use a regex here. It's Python after all, not Perl. Keep it simple and use its string manipulation capabilities.

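with open("file") as f:  # the with block closes the file automatically
    for line in f:
        # Skip the "LINE: n" rows entirely.
        if line.startswith("LINE:"):
            continue
        # Keep only the data after the "TOKENKIND:" label.
        if "TOKENKIND" in line:
            print(line.split(" ", 1)[-1].strip())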
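To try the same line-by-line idea without a file on disk, something like the following sketch could work. The function name `extract_token_data` and the sample string are hypothetical, chosen only for this example; the filtering logic mirrors the loop above and collects the values in a list instead of printing them.

```python
def extract_token_data(lines):
    """Return the data part of every TOKENKIND line, skipping LINE rows."""
    values = []
    for line in lines:
        if line.startswith("LINE:"):
            continue
        if "TOKENKIND" in line:
            # Split once on the first space and keep what follows the label.
            values.append(line.split(" ", 1)[-1].strip())
    return values


# Sample input modeled on the data shown in this thread (an assumption).
sample = """LINE: 1
TOKENKIND: somedata
TOKENKIND: somedata
LINE: 2
TOKENKIND: somedata
LINE: 3"""

result = extract_token_data(sample.splitlines())
print(result)
```

Because the function accepts any iterable of lines, an open file object can be passed in the same way as the list produced by splitlines().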

2009-11-25 00:55:30
