How would I extract the word 'wrestle' from the following:
type=weaksubj len=1 word1=wrestle pos1=verb stemmed1=y priorpolarity=negative
using a regular expression?
Thanks
How would I extract the word 'wrestle' from the following:
type=weaksubj len=1 word1=wrestle pos1=verb stemmed1=y priorpolarity=negative
using a regular expression?
Thanks
Given the following regex...
/word1=(\w+)/
...$1 or whatever your first match is in your language will be wrestle.
Assuming it is always separated by spaces
word1=([^ ]+)
Then you can get the value by the first group match.
The question is not very clear, but I guess this is what you are looking for:
word1=(\w+)
Your match will be in the 1st group. Here's some sample Python code:
import re
yourstring = 'type=weaksubj len=1 word1=wrestle pos1=verb stemmed1=y priorpolarity=negative'
m = re.search(r'word1=(\w+)', yourstring)
print m.group(1)
As seen on codepad. A more generalized solution:
import re
def get_attr(str, attr):
m = re.search(attr + r'=(\w+)', str)
return None if not m else m.group(1)
str = 'type=weaksubj len=1 word1=wrestle pos1=verb stemmed1=y priorpolarity=negative'
print get_attr(str, 'word1') # wrestle
print get_attr(str, 'type') # weaksubj
print get_attr(str, 'foo') # None
Also available on codepad
Maybe re is unnecessary when str.split looks like it will suffice:
>>> s = "type=weaksubj len=1 word1=wrestle pos1=verb stemmed1=y priorpolarity=negative"
>>> dd = dict(ss.split('=',1) for ss in s.split())
>>> dd['word1']
'wrestle'