I have a response:

MS1:111980613994
124 MS2:222980613994124

I have the following regex:

MS\d:(\d(?:\r?\n?)){15}

As I understand it, the "(?:\r?\n?)" part should let the group match the line break but exclude it from the capture, so that I get a contiguous value from the group.

The problem is that for "MS1:xxx" it matches the [CR][LF] and includes it in the group, when it should be excluded from the capture.

Help please.

A: 

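Perhaps what you mean to do here is place the [CR][LF] matching part outside of the captured group, and put the {15} quantifier inside it so the group holds all 15 digits rather than just the last one, something like: MS\d:(\d{15})(?:\r?\n?)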

Jefromi
Unless there can be newlines scattered through the 15 digits. Also, you really don't need grouping on the newlines in that case.
Tim Sylvester
+2  A: 

No, the (?:...) syntax just means that the group (pair of parens) is not itself a capture group. If it is enclosed by another capture group, that enclosing capture group will still capture the characters matched by the non-capturing group.
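A minimal Python sketch of that behavior, assuming a made-up sample string (a digit, a CR LF, then another digit) rather than the real response:

```python
import re

# Hypothetical sample: a digit followed by CR LF, then another digit.
sample = "4\r\n1"

# Non-capturing group nested inside a capture group:
# the outer group still captures what the inner group matched.
nested = re.search(r"(\d(?:\r?\n?))", sample)
print(repr(nested.group(1)))  # '4\r\n'

# Non-capturing group placed outside the capture group:
# only the digit is captured.
outside = re.search(r"(\d)(?:\r?\n?)", sample)
print(repr(outside.group(1)))  # '4'
```

Other engines may differ in details, but the nesting rule itself should be the same.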

If you really want to ignore embedded \rs and \ns your best bet is to strip them out in a second step. You don't say what language you're using, but something equivalent to this (Python) should work:

s = re.sub(r'[\r\n]', '', s)
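Put together, the two steps might look roughly like this. It is only a sketch: the response string is reconstructed from the question (assuming the break inside the first value is a CR LF), and the pattern wraps the whole repeated run in one capture group, which is my variation rather than the pattern from the question.

```python
import re

# Response reconstructed from the question; the CR LF inside the
# first value is an assumption about what the line break really is.
response = "MS1:111980613994\r\n124 MS2:222980613994124"

# Step 1: capture the whole run of 15 digits, line breaks included.
pattern = re.compile(r"MS\d:((?:\d\r?\n?){15})")

# Step 2: strip CR and LF from each captured value.
values = [re.sub(r"[\r\n]", "", raw) for raw in pattern.findall(response)]
print(values)  # ['111980613994124', '222980613994124']
```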
Laurence Gonsalves
A: 

Thanks - that clears it up. I was just using Expresso to test out my pattern.

*What* clears it up? If one of the replies answered your question, mark it "accepted".
Alan Moore
A: 

As far as I know, you'll have to use two regexes: one is "MS\d:(\d(?:\r?\n?)){15}", and the other removes the line breaks from the matches.

Please refer to "Regular expression to skip character in capture group".

boxoft
A: 

How about MS\d:(?:(\d)\r?\n?){15}
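One caveat worth checking in your engine: a capture group inside a quantified group is overwritten on every repetition. In Python, for example, group 1 ends up holding only the last digit, as this sketch shows (the response string is reconstructed from the question, assuming a CR LF break). In .NET the individual repetitions should be available through Group.Captures, but I have not tested that here.

```python
import re

# Response reconstructed from the question (CR LF is an assumption).
response = "MS1:111980613994\r\n124 MS2:222980613994124"

m = re.search(r"MS\d:(?:(\d)\r?\n?){15}", response)

# The overall match spans all 15 digits, including the line break...
print(repr(m.group(0)))  # 'MS1:111980613994\r\n124'

# ...but group 1 keeps only the digit from the final repetition.
print(repr(m.group(1)))  # '4'
```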

raccoon