r'(^|^A)(\S+)(B$|$)'

ends up matching everything, so it is effectively equivalent to ^\S+$.

How do I write a regex that matches strings that begin with A or end with B (possibly both, but not neither)?

PS: I also need to refer to the (\S+) group in the substitution (replace) step.

Example:

Match Aanything and anythingB, and refer to the anything group in the replacement.

+2  A: 
(^A.*B$)|(^A.*$)|(^.*B$)
Boris Pavlović
Yes, this matches, but how do I know which group to use in the replacement part?
BOYPT
You can drop the grouping of the alternatives, thanks to the precedence of the "|" operator, but you would want to capture the relevant substring. This leaves you with ``^A(\S*)B$|^A(\S*)$|(\S*)B$``. The ugly thing here is that the desired substring now lands in one of three match groups, and you don't know which one in advance. So you might want to use the 'max(match.groups())' approach of mykhal.
ThomasH
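As a minimal sketch of the comment above, assuming Python's `re` module: when the wanted text can land in any one of three groups, `match.lastindex` tells you which group took part, so the replacement does not need a hard-coded group number. The helper name `replace_core` is hypothetical, and a leading `^` is added to the third alternative so that all three are anchored.

```python
import re

# Three alternatives, each with its own capture group.
PATTERN = re.compile(r'^A(\S*)B$|^A(\S*)$|^(\S*)B$')

def replace_core(text, new_core):
    """Replace the part between the optional A prefix and B suffix.

    Returns None when the text neither starts with A nor ends with B.
    """
    match = PATTERN.match(text)
    if match is None:
        return None
    # lastindex is the number of the one group that participated in the match.
    start, end = match.span(match.lastindex)
    return text[:start] + new_core + text[end:]

print(replace_core('AanythingB', 'nothing'))  # AnothingB
print(replace_core('anything', 'nothing'))    # None
```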
+1  A: 

try this:

/(^A|B$)/
動靜能量
While this matches at the right strings, it fails to capture the relevant substring, which is the actual difficulty here.
ThomasH
+2  A: 

^A|B$ or ^A|.*B$ (depending on whether the match function matches from the beginning of the string)

UPDATE

It's difficult to write a single regexp for this.

a possibility is:

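import re

match = re.match(r'^(?:A(\S+))|(?:(\S+)B)$', string)
if match:
    # match.groups() is either (capture, None) or (None, capture);
    # Python 3 cannot compare str with None, so skip the None before max()
    capture = max(g for g in match.groups() if g is not None)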
mykhal
Actually, what I need is to get the (\S+) group. The pattern `^A|.*B$` matches "`A`" and "`anythingB`", but I need "`anything`".
BOYPT
+2  A: 

Is this the desired behavior?

var rx = /^((?:A)?)(.*?)((?:B)?)$/;
"Aanything".match(rx)
> ["Aanything", "A", "anything", ""]
"anythingB".match(rx)
> ["anythingB", "", "anything", "B"]
"AanythingB".match(rx)
> ["AanythingB", "A", "anything", "B"]
"anything".match(rx)
> ["anything", "", "anything", ""]
"AanythingB".replace(rx, '$1nothing$3');
> "AnothingB"
"AanythingB".replace(rx, '$2');
> "anything"
Fordi
This regex misses the "not neither" requirement of the OP.
ThomasH
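One way to add the "not neither" requirement to a single-group regex, sketched here in Python rather than JavaScript and not taken from any answer in this thread: a lookahead at the start rejects strings that neither begin with A nor end with B. The name `core` is hypothetical, and `\S` is used in place of `.` to follow the question.

```python
import re

# (?=A|\S*B$) fails unless the string starts with A or ends with B.
# The lazy (\S*?) leaves a trailing B to the optional (B?) when one is present.
PATTERN = re.compile(r'^(?=A|\S*B$)(A?)(\S*?)(B?)$')

def core(text):
    """Return the text between the optional A prefix and B suffix, or None."""
    match = PATTERN.match(text)
    return match.group(2) if match else None

for sample in ['Aanything', 'anythingB', 'AanythingB', 'anything']:
    print(sample, '->', core(sample))

# The wanted text is always group 2, so a replacement can refer to it directly.
print(PATTERN.sub(r'\g<1>nothing\g<3>', 'AanythingB'))  # AnothingB
```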
+1  A: 
BOYPT
Wouldn't `if re.match(r'^A\S+$', s): s = s[1:]`; `if re.match(r'^\S+B$', s): s = s[:-1]` be much simpler?
mykhal
Your final answer has unbalanced parentheses :)
mykhal
Shouldn't the condition be on `A`? Oh, it is on the samples, but not in the "final answer"...
Kobi
oh, i forgot to update the final answer line, done now.
BOYPT
A: 

If you don't mind the extra weight in the case where both prefix "A" and suffix "B" exist, you can use a shorter regex:

reMatcher= re.compile(r"(?<=\AA).*|.*(?=B\Z)")

(using \A for ^ and \Z for $)

This one keeps the "A" prefix (instead of the "B" suffix, as your solution does) when both "A" and "B" are at their respective ends:

'A text here' matches ' text here'
'more text hereB' matches 'more text here'
'AYES!B' matches 'AYES!'
'neither' doesn't match

Otherwise, a non-regex solution (some would say a more “Pythonic” one) is:

def strip_prefix_suffix(text, prefix, suffix):
    left = len(prefix) if text.startswith(prefix) else 0
    right = -len(suffix) if text.endswith(suffix) else None
    return text[left:right] if left or right else None

If there is no match, the function returns None to differentiate from a possible '' (e.g. when called as strip_prefix_suffix('AB', 'A', 'B')).
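A quick check of the function above on the question's sample strings. The function is repeated so the snippet runs on its own, and the prefix and suffix are assumed to be non-empty.

```python
def strip_prefix_suffix(text, prefix, suffix):
    left = len(prefix) if text.startswith(prefix) else 0
    right = -len(suffix) if text.endswith(suffix) else None
    return text[left:right] if left or right else None

samples = ['Aanything', 'anythingB', 'AanythingB', 'anything', 'AB']
results = [strip_prefix_suffix(s, 'A', 'B') for s in samples]
print(results)
# ['anything', 'anything', 'anything', None, '']
```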

PS I should also say that this regex:

(?<=\AA).*(?=B\Z)|(?<=\AA).*|.*(?=B\Z)

should work, but it doesn't; it works just like the one I suggested, and I can't understand why. Breaking down the regex into parts, we can see something weird:

>>> text= 'AYES!B'
>>> re.compile('(?<=\\AA).*(?=B\\Z)').search(text).group(0)
'YES!'
>>> re.compile('(?<=\\AA).*').search(text).group(0)
'YES!B'
>>> re.compile('.*(?=B\\Z)').search(text).group(0)
'AYES!'
>>> re.compile('(?<=\\AA).*(?=B\\Z)|(?<=\\AA).*').search(text).group(0)
'YES!'
>>> re.compile('(?<=\\AA).*(?=B\\Z)|.*(?=B\\Z)').search(text).group(0)
'AYES!'
>>> re.compile('(?<=\\AA).*|.*(?=B\\Z)').search(text).group(0)
'AYES!'
>>> re.compile('(?<=\\AA).*(?=B\\Z)|(?<=\\AA).*|.*(?=B\\Z)').search(text).group(0)
'AYES!'

For some strange reason, the .*(?=B\\Z) subexpression takes precedence, even though it's the last alternative.
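One likely explanation, offered as a sketch rather than a verdict on the behaviour: `search` tries each starting position from left to right and returns the first position where any alternative matches, so the order of the alternatives only decides between matches that start at the same position. Both lookbehind alternatives can only match from index 1 (just after the `A`), while `.*(?=B\Z)` already matches from index 0. Printing `start()` makes this visible. The last pattern is an assumed workaround, not something from the answer above: it consumes the prefix and suffix so that every alternative starts at index 0.

```python
import re

text = 'AYES!B'

# The lookbehind alternative can only match from index 1, just after the 'A'.
first = re.compile(r'(?<=\AA).*(?=B\Z)').search(text)
print(first.start(), first.group(0))        # 1 YES!

# The last alternative already matches from index 0.
last = re.compile(r'.*(?=B\Z)').search(text)
print(last.start(), last.group(0))          # 0 AYES!

# search() reports the leftmost match, so index 0 wins in the combined regex.
combined = re.compile(r'(?<=\AA).*(?=B\Z)|(?<=\AA).*|.*(?=B\Z)').search(text)
print(combined.start(), combined.group(0))  # 0 AYES!

# Assumed workaround: every alternative starts at index 0, so order matters again.
anchored = re.compile(r'\A(?:A(.*)B|A(.*)|(.*)B)\Z').search(text)
print(anchored.group(anchored.lastindex))   # YES!
```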

ΤΖΩΤΖΙΟΥ
I've opened an [issue](http://bugs.python.org/issue10139) in the Python bug tracker, since it's a possible bug in the re engine.
ΤΖΩΤΖΙΟΥ