I am working on a regex pattern for a search box that should allow an optional '+' sign to include a term in the search and a '-' sign to exclude a term from it. For example, +apple orange -peach should search for apples and oranges but not for peaches. The pattern should also allow phrases in double quotes mixed with single words, for example "red apple" -"black grape" +orange, the same way most internet search engines work.

I am running two regular expressions. The first picks out all the negatives, which is simple because the '-' is required:

(?<=[\-]"?)((?<=")(?<exclude>[^"]+)|(?<exclude>[^\s,\+\-"]+))

The second picks out the positives, which is a little more complex because the '+' is optional:

((?<=[\+\s]")(?<include>[^\s"\+\-][^"]+))|(?<include>(?<![\-\w]"?)([\w][^,\s\-\+]+))(?<!")

The positive search is where I am having a problem. It works fine when I run it in RegexBuddy, but in .NET the pattern picks up the second word of a negative phrase. For example, in -"black grape" it matches the word 'grape' even though that word is followed by a double quote.

Any suggestions?

+1  A: 

Try this expression:

[\+-]?(\w+|"[\w\s]+")

It starts with an optional + or -, then matches either a single word or a quoted phrase made up of words and spaces.
Another tip: to experiment with regular expressions, download a tool like Expresso or The Regulator.

Here is an example using named groups, so that the regex itself separates the sign from the value:

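using System;
using System.Text.RegularExpressions;

class Program {
    static void Main(string[] args) {
        string test = "\"red apple\" -\"black grape\" +orange";
        // "sign" captures the optional + or -; "value" captures the word, or the phrase without its quotes.
        Regex r = new Regex("(?<sign>[\\+-]?)((?<value>\\w+)|\"(?<value>[\\w\\s]+)\")", RegexOptions.Compiled);

        foreach (Match m in r.Matches(test)) {
            Console.WriteLine(m.Groups["sign"].Value);
            Console.WriteLine(m.Groups["value"].Value);
        }
    }
}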
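
Since the goal is one set of terms to include and one to exclude, the sign group can be used to sort the matches in a single pass. The following is only a minimal sketch built on the expression above; the class name SearchTerms and the method Split are hypothetical names chosen for illustration. It assumes that a term with no sign counts as an include, and note that a hyphenated word such as well-known would be split into an included "well" and an excluded "known".

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SearchTerms
{
    // Same expression as above: optional sign, then a bare word or a quoted phrase.
    private static readonly Regex TermPattern = new Regex(
        "(?<sign>[\\+-]?)((?<value>\\w+)|\"(?<value>[\\w\\s]+)\")");

    // Hypothetical helper: appends each term of the query to the include or exclude list.
    public static void Split(string query, List<string> include, List<string> exclude)
    {
        foreach (Match m in TermPattern.Matches(query))
        {
            // "value" never contains the sign or the surrounding quotes.
            string value = m.Groups["value"].Value;

            if (m.Groups["sign"].Value == "-")
                exclude.Add(value);
            else
                include.Add(value); // "+" or no sign at all (assumed to mean include)
        }
    }
}
```

Called with the query "red apple" -"black grape" +orange, this should leave "red apple" and "orange" in the include list and "black grape" in the exclude list.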
Paolo Tedesco
Nice and simple, but there is a reason why I have look-ahead and look-behind in my pattern: I do not want to pick up +, - or the double quotes in my result. I was able to fix my positive search pattern:

((?<=[\+\s]")(?<include>[^\s\"\+\-][^"]+))(?=\")|(?<![\-\w"])(?<include>[\w][^,\s\-\+"]+)(?![\w"])
Alex K