I need help with two regexes.

  1. Get all the text up to an open bracket.

e.g. this is so cool (234) => 'this is so cool'

  2. Get the text inside the brackets, i.e. the number '234'.
+3  A: 

Up until the paren: regex = re.compile(r"(.*?)\s*\(")

Inside the first set of parens: regex = re.compile(r".*?\((.*?)\)")

Edit: Single regex version: regex = re.compile(r"(.*?)\s*\((.*?)\)")

Example output:

>>> import re
>>> r1 = re.compile(r"(.*?)\s*\(")
>>> r2 = re.compile(r".*?\((.*?)\)")
>>> text = "this is so cool (234)"
>>> m1 = r1.match(text)
>>> m1.group(1)
'this is so cool'
>>> m2 = r2.match(text)
>>> m2.group(1)
'234'
>>> r3 = re.compile(r"(.*?)\s*\((.*?)\)")
>>> m3 = r3.match(text)
>>> m3.group(1)
'this is so cool'
>>> m3.group(2)
'234'
>>> 

Note of course that this won't work correctly with multiple sets of parens, since it expects only one parenthesized block of text (as in your example). The language of balanced parentheses nested to arbitrary depth is not regular, so a plain regular expression cannot match it in general.
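For inputs with several or nested sets of parens, a small depth counter is one way around that limitation. The following is only a minimal sketch, not part of the answer above; the function name `top_level_groups` is made up for illustration, and it assumes unbalanced parentheses can simply be ignored.

```python
def top_level_groups(text):
    """Return the contents of each top-level parenthesized block in text."""
    groups = []
    depth = 0      # how many parens are currently open
    start = None   # index just after the opening paren of the current top-level block
    for i, ch in enumerate(text):
        if ch == "(":
            if depth == 0:
                start = i + 1
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                groups.append(text[start:i])
    return groups


print(top_level_groups("this is so cool (234)"))  # ['234']
print(top_level_groups("a (b (c) d) e (f)"))      # ['b (c) d', 'f']
```

Nested blocks stay inside their enclosing group here; the function could be called again on each result to descend a level.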

eldarerathis
Why two regular expressions? Can you not build one with two capture groups?
Ben
Sure. I just did this because he asked for two. Edit: Added a version that just has two capture groups as well.
eldarerathis
@eldarerathis: Maybe he did, but if you think there is a better solution, it would be more elegant to offer it.
Ben
It shouldn't take much to combine the two if the OP wants to. For instance, `(.*?)\s*\((.*?)\)` should do the trick.
Justin Peel
@Justin Peel - I see his point, though. If I'm going to give two, I can just as easily give three, especially if the third is really the better choice.
eldarerathis
A: 

No need for a regular expression.

>>> s="this is so cool (234)"
>>> s.split("(")[0]
'this is so cool '
>>> s="this is so cool (234) test (123)"
>>> for i in s.split(")"):
...     if "(" in i:
...         print(i.split("(")[-1])
...
234
123
ghostdog74
Sorry, that's an awful solution. What if he had `a(b)c`? Your code would return `bc` in the 2nd case.
Mark
A: 

Here is my own library function version without regex.

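def between(left, right, s):
    # Split at the first `left` delimiter, then split the remainder at the first `right`.
    before, _, a = s.partition(left)
    a, _, after = a.partition(right)
    return before, a, after

s = "this is so cool (234)"
# Prints the text before, inside, and after the delimiters, one per line.
print('\n'.join(between('(', ')', s)))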
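As a rough usage sketch (the variable names and the second test string are my own, not from the answer), the three parts can be unpacked directly. It also shows what `str.partition` does when a delimiter is missing: the whole string comes back as the first part.

```python
def between(left, right, s):
    # Same idea as the function above: split at the first `left`, then at the first `right`.
    before, _, rest = s.partition(left)
    inside, _, after = rest.partition(right)
    return before, inside, after


before, inside, after = between("(", ")", "this is so cool (234)")
print(repr(before.rstrip()))  # 'this is so cool'
print(repr(inside))           # '234'

# With no delimiters present, partition returns the whole string as the first part.
print(between("(", ")", "no parens here"))  # ('no parens here', '', '')
```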
Tony Veijalainen
A: 

Sounds to me like you could just do this:

re.findall('[^()]+', mystring)

Splitting would work, too:

re.split('[()]', mystring)

Either way, provided there is some text before the first parenthesis, that text will be the first item in the resulting list, and the text inside the first set of parens will be the second item.
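Applied to the string from the question, the two calls might be used like this. It is just a sketch: the `.strip()` is my addition to drop the space left before the paren, and `before` and `inside` are illustrative names.

```python
import re

mystring = "this is so cool (234)"

# findall keeps every run of characters that are not parentheses.
parts = re.findall(r"[^()]+", mystring)
before = parts[0].strip()  # strip the trailing space before the "("
inside = parts[1]
print(before)  # this is so cool
print(inside)  # 234

# split cuts at each parenthesis, leaving an empty string after the final ")".
print(re.split(r"[()]", mystring))  # ['this is so cool ', '234', '']
```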

Alan Moore