views:

311

answers:

8

How do i look for the following pattern using regular expression in python? for the two cases

Am looking for str2 after the "=" sign

  • Case 1: str1=str2
  • Case 2: str1 = str2

please note there can be a space or none between the either side of the "=" sign

Mine is like this, but only works for one of the cases!

m=re.search('(?<=str\s\=\s)\w+','str = str2')

returns str2

Help!

Gath

A: 

I think a regex is overkill if you only want to deal with the above two cases. Here's what I'd do-

>>> case1 = "str1=str2"
>>> case2 = "str1 = str2"
>>> case2.split()
['str1', '=', 'str2']
>>> ''.join(case2.split())
'str1=str2'
>>> case1[5:]
'str2'
>>> ''.join(case2.split())[5:]
'str2'
>>>

Assumption

I assume you are looking for the specific token 'str1'. I also assume that str1 can be assigned different values. Something like what you'd have in a configuration file => propertyName = value.

This is just my opinion.

I knew that other ways were possible! SilentGhost gives a nice (better!) alternative.

Hope this helps.

batbrat
+3  A: 
re.search(r'=\s*(.*)', 'str = str2').group(1)

or if you just want a single word:

re.search(r'=\s*(\w+)', 'str = str2').group(1)

Extended to specific initial string:

re.search(r'\bstr\s*=\s*(\w+)', 'str=str2').group(1)

\b = word boundary, so won't match "somestr=foo"

It would be quicker to go trough all options once, instead of searching for single options one at the time:

option_str = "a=b, c=d, g=h"
options = dict(re.findall(r'(\w+)\s*=\s*(\w+)', option_str))
options['c']  # -> 'd'
MizardX
Here you are assuming the we only have the "string" with "=" sign but there can be many "strings" with the same pattern i.e a=b, c=d, g=h, thats why we emphasize on the (?<=str) on the regexp ..Gath
Gath
Fixed while you where writing :)
MizardX
+8  A: 

if you indeed have only such simple strings to parse you don't need regular expression. you can just partition on = and strip (or even lstrip) last element of a resulting tuple:

>>> case = 'str = str2'
>>> case.partition('=')[2].lstrip()
'str2'

it'll be much faster than regexps. and just to show how fast i've made a simple test:

>>> timeit.timeit("'str1 = str2 '.partition('=')[2].strip()")
0.49051564213846177
>>> timeit.timeit("'str1 = str2 '.split('=')[1].strip()")
0.97673281637025866
>>> timeit.timeit('import re')
0.65663786250422618
SilentGhost
You should use strip() instead of lstrip(). Because it will fail in this case = "str = str2 "
aatifh
text of my post clearly say `strip (or even lstrip)`. I trust that all interesting parties will be able to modify my example to satisfy their needs.
SilentGhost
A: 

Expanding on @batbrat's answer, and the other suggestions, you can use re.split() to separate the input string. The pattern can use \s (whitespace) or an explicit space.

>>> import re
>>> c1="str1=str2"
>>> c2="str1 = str2"
>>> re.split(' ?= ?',c1)
['str1', 'str2']
>>> re.split(' ?= ?',c2)
['str1', 'str2']
>>> re.split(r'\s?=\s?',c1)
['str1', 'str2']
>>> re.split(r'\s?=\s?',c2)
['str1', 'str2']
>>>
gimel
Interesting! Gath
Gath
+1  A: 

If your data is fixed then you can do this without using regex. Just split it on '='. For example:

>>> case1 = "str1=str2"
>>> case2 = "str1 = str2"

>>> str2 = case1.split('=')[1].strip()
>>> str2 = case2.split('=')[1].strip()

This YOURCASE.split('=')[1].strip() statement will work for any cases.

aatifh
A: 

Two cases:

  • (case 1) if there is a single space before the '=', then there must also be a single space after the '='

    m=re.search(r'(?<=\S)(?:\s=\s|=)(\w+)','str = str2')
    print m.group(1)
    
  • (case 2) otherwise,

    m=re.search(r'(?<=\S)\s?=\s?(\w+)','str = str2')
    print m.group(1)
    

In the first case, you could also use the "(?P=…" construct for the second space or lack of it, but it still wouldn't work for a positive lookbehind assertion, since it wouldn't be a constant length subexpression.

ΤΖΩΤΖΙΟΥ
A: 

Related idea: I find using graphical regular expression tool helpful when trying to figure out correct pattern: http://kodos.sf.net.

+1  A: 

Simply use split function