I need a regex that will match strings of lowercase letters and dashes that do not contain two consecutive dashes.

I came close with this regex, which uses a negative lookbehind (I see no alternative):

([-a-z](?<!--))+

Given the following input:

qsdsdqf--sqdfqsdfazer--azerzaer-azerzear

it produces three matches:

qsdsdqf-
sqdfqsdfazer-
azerzaer-azerzear

What I want however is:

qsdsdqf-
-sqdfqsdfazer-
-azerzaer-azerzear

So my regex drops the leading dash of the second and third matches, which I don't want.
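
To make the behavior reproducible, here is a minimal sketch in Python (my choice for illustration; the question does not name a regex flavor). It runs the regex above over the sample input. The lookbehind inspects the characters before the current position no matter where the match attempt started, so the second dash of each pair is rejected even as the first character of a new match.

```python
import re

SAMPLE = "qsdsdqf--sqdfqsdfazer--azerzaer-azerzear"

# The regex from the question: every consumed character must not
# complete a "--" pair with the character before it.
ORIGINAL = re.compile(r"([-a-z](?<!--))+")


def original_matches(text):
    """Return the full text of every match of the question's regex."""
    return [m.group(0) for m in ORIGINAL.finditer(text)]


print(original_matches(SAMPLE))
```

In this sketch the second and third matches start on a letter, which is the lost dash described above.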

Can anyone give me a hint or a regex that does this?

+4  A: 

This should work:

-?([^-]-?)*

It makes sure that there is at least one non-dash character between every two dashes.

sth
As he asked for strings of letters, you can probably replace [^-] with [a-z] (or \w if underscores are allowed, but that does not change the idea).
kriss
I would use `+` instead of `*`; as it is now, you'll always get an extra match of an empty string after the last "real" match. Also, you're matching one character at a time inside a capturing group that's controlled by a quantifier; that's hideously inefficient. I doubt it will matter in this case, but it's not something you want to make a habit of. At least tack a `+` onto the `[^-]`.
Alan Moore
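
The difference between the answer's regex and the `+` variant suggested in the comment can be checked with a short sketch. This assumes Python 3.7 or later (where an empty match may directly follow a non-empty one); the names `STAR`, `PLUS` and `all_matches` are made up for the example, and the non-capturing group in `PLUS` is one possible reading of the comment rather than a quote from it.

```python
import re

SAMPLE = "qsdsdqf--sqdfqsdfazer--azerzaer-azerzear"

# The regex from the answer, unchanged.
STAR = re.compile(r"-?([^-]-?)*")

# One way to apply the comment's advice: consume runs of non-dash
# characters with [^-]+ and require at least one run with the outer +.
PLUS = re.compile(r"-?(?:[^-]+-?)+")


def all_matches(pattern, text):
    """Return the full text of every match, including empty ones."""
    return [m.group(0) for m in pattern.finditer(text)]


print(all_matches(STAR, SAMPLE))  # ends with an extra empty match
print(all_matches(PLUS, SAMPLE))  # no empty match
```

Both patterns find the three wanted substrings; only the `*` version can also match the empty string at the end of the input.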
A: 

Looks to me like you do want to match strings that contain double hyphens, but you want to break them into substrings that don't. Have you considered splitting it between pairs of hyphens? In other words, split on:

(?<=-)(?=-)
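
A minimal sketch of the split approach, written in Python purely for illustration (the regex itself was tested in Java). It assumes Python 3.7 or later, since earlier versions of `re.split` do not split on zero-width matches; `split_between_dashes` is a name invented for the example.

```python
import re

SAMPLE = "qsdsdqf--sqdfqsdfazer--azerzaer-azerzear"

# Zero-width position that has a dash on its left and a dash on its right.
BETWEEN_DASHES = re.compile(r"(?<=-)(?=-)")


def split_between_dashes(text):
    """Cut the text between every pair of adjacent dashes."""
    return BETWEEN_DASHES.split(text)


print(split_between_dashes(SAMPLE))
```

Because the split point is zero-width, no characters are consumed and each dash stays attached to its neighbor.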

As for your regex, I think this is what you were getting at:

(?:[^-]+|-(?<!--)|\G-)+

The -(?<!--) will match one hyphen, but if the next character is also a hyphen the match ends. Next time around, \G- picks up the second hyphen because it's the next character; the only way that can happen (except at the beginning of the string) is if a previous match broke off at that point.

Be aware that this regex is more flavor dependent than most; I tested it in Java, but not all flavors support \G and lookbehinds.

Alan Moore