Hi,

I've got the following regular expression:

(-[^\w+])|([\w+]-[\w+])

I want to use it to replace dashes with whitespace:

test -test             should not be replaced
test - test            should be replaced
test-test              should be replaced

So only in the test -test case should the dash NOT be replaced.

Currently ([\w+]-[\w+]) also replaces the t's on either side of the dash.

        var specialCharsExcept = new Regex(@"([\w+]-[\w+])", RegexOptions.IgnoreCase);

        if (string.IsNullOrEmpty(term))
            return "";

        return specialCharsExcept.Replace(term, " ");
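The behaviour described above can be reproduced in a few lines. This is only a sketch in Python rather than C#; it assumes that Python's `re` module treats this particular pattern the same way as .NET's `Regex`. `[\w+]` is a character class that matches one word character (or a literal `+`), and whatever it matches is consumed and therefore replaced together with the dash.

```python
import re

# The original pattern: one word character (or '+'), a dash, one more.
consuming = re.compile(r'([\w+]-[\w+])')

# The characters on both sides belong to the match, so they are replaced too.
broken = consuming.sub(' ', 'test-test')
print(repr(broken))

# With spaces around the dash there is no char-dash-char sequence, so nothing matches.
untouched = consuming.sub(' ', 'test - test')
print(repr(untouched))
```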

Any help? Thanks in advance

PS: I am using C#.

Update

I'm trying to use your reg exp for the following case now.

some - test "some test"   - the expression should not be applied to anything within the quotes

Is this possible?

+1  A: 

OK, changed according to the comment.

>>> import re
>>> r = r' +(-) +|(?<=\w)-(?=\w)'
>>> re.sub(r, ' ', 'test - test')
'test test'
>>> re.sub(r, ' ', 'test-test')
'test test'
>>> re.sub(r, ' ', 'test -test')
'test -test'

EDIT: Corrected according to the comment. The trick is to add the 'lookahead assertion' with ?= and the lookbehind with ?<=. These are checked but do not become part of the match.
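As a rough illustration of that point, the sketch below isolates the second alternative of the pattern and inspects what is actually matched. The variable names are only for illustration.

```python
import re

# Lookbehind (?<=\w) and lookahead (?=\w) test the neighbouring
# characters without consuming them.
pattern = re.compile(r'(?<=\w)-(?=\w)')

m = pattern.search('test-test')
matched_text = m.group()   # the text that was matched
matched_span = m.span()    # start and end index of the match

replaced = pattern.sub(' ', 'test-test')
print(repr(matched_text), matched_span, repr(replaced))
```

Only the dash itself is matched, so the letters around it survive the replacement.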

Khelben
Not exactly. I just want to replace the dash, not the whole input. The second example should be substituted as well. Only in the test -test case may the dash stay.
Chris
I've edited it. It also strips the blank spaces in the second case.
Khelben
According to the test cases, the dash in `test-test` should also be replaced. `(?=\w)-` is a lookahead, so it's looking for a dash that is itself a word character; it should be a lookbehind. Also, consider using `r` in your sample code; it will become more readable.
Kobi
@Kobi, you're absolutely right. I don't know how I thought I'd got it right :-( I was convinced I'd used the `r`. I've added the lookbehind mark.
Khelben
It's cool. This one is very confusing. Look at the *atrocity* I came up with...
Kobi
Really an atrocity :-P In my code, a regular expression is usually preceded by three or four lines explaining it (at least in Python you can put the comments inside the expression).
Khelben
+5  A: 

Try this crazy one:

-(?!\w(?<=\s-\w))

This regex:

  • Searches for a dash that isn't followed by a (word character with whitespace two characters before it).
  • Takes care of test- test and -test, which you don't have in your test cases.
  • Selects only the dash, so you can replace it (this is really what made the definition so complicated).
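The points above can be checked with a quick sketch. It is written in Python rather than C#, on the assumption that `re` handles a fixed-width lookbehind nested inside a lookahead the same way .NET does for this pattern. The list of cases is just the question's test cases plus the two extra ones mentioned above.

```python
import re

# A dash, unless it is followed by a word character AND preceded by whitespace.
dash = re.compile(r'-(?!\w(?<=\s-\w))')

cases = ['test -test', 'test - test', 'test-test', 'test- test', '-test']
results = {s: dash.sub(' ', s) for s in cases}

for s, out in results.items():
    print(repr(s), '->', repr(out))
```

Because only the dash is replaced, the spaces around it are kept: `test - test` ends up with three spaces between the words. Collapse runs of whitespace afterwards if that matters.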

By the way, you don't need RegexOptions.IgnoreCase because your regex has no literal letters; you aren't trying to capture /test/ from "Test TEST". This will do:

Regex specialCharsExcept = new Regex(@"-(?!\w(?<=\s-\w))");
return specialCharsExcept.Replace(term, " ");
Kobi
+1 for being right ;-)
Erik
Regexps can be truly difficult :-O
Khelben
Can you tell me how to extend this expression so that the rule is disabled when "test -test" is enclosed in quotes? For example: some -test "some -test"; the content within the quotes should not be replaced.
Chris