+1  Q: 

basic regex help

$text_expression = 'word1 word2 "phrase 1" "phrase 2" -word3 -word4 -"phrase \"hello\" 3" -"phrase 4"';

I want to find strings that contain (word1 OR word2 OR 'phrase 1' OR 'phrase 2') AND don't contain (word3 OR word4 OR 'phrase "hello" 3' OR 'phrase 4').

What would be the regex equivalent of the $text_expression above? One that produces an array like:

[contains] => array (
    [0] => word1
    [1] => word2
    [2] => phrase 1
)
[doesnt contain] => array (
    [0] => word3
    [1] => word4
    [2] => phrase "hello" 3
)

ps: I can formulate the string another way if it's going to make it easier (e.g. use other chars instead of quotes and dashes)

A: 

I could, and I would, but for your benefit, may I humbly suggest investing 2 hours in a regex tutorial? It will pay off very quickly.

louisgab
may i humbly admit that my brain is incapable of regex =) ? pls do help, i promise i'll commit 2 hours to an open source project...
Devrim
Voted up. If you're a programmer you'll run into regexes many, many times. This is a pretty basic question. If someone is "incapable of regex" at this level and doesn't want to change it, then it's a good time to think if they're capable of programming.
viraptor
thx viraptor. you're the man.
Devrim
Please don't take it personally :) you'll get to write regexes harder than this soon if you actually do programming, so it's a good time to start being capable of them ;)
viraptor
i guess bragging is easier than answering the question...
Devrim
@Devrim My point is not about answering the question or not. I'd rather help you learn to fish than give you a fish. A regex tutorial will have you matching words within 10 minutes; then apply Radomir Dopieralski's solution. Use the remaining 110 minutes to get the hang of it. You can also avoid regexes entirely by using strpos() to detect matches and negative matches.
louisgab
@louisgab this won't end, let's stop it here. am i lazy or u simply don't know? obv stackoverflow can't exist if every java question is answered with 'read a java book', or invest your time in going to college. on the other hand there are people who place silly stuff that is easily accessible via a google search. i think i asked a specific problem in regex but u guys picked up on 'i can't process regex' and i honestly think, u can't solve this particular problem - because i'm good at regex. and i know when youngsters like u and viraptor brag about what they know. all fine. enough about fish.
Devrim
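For reference, the strpos() suggestion from the comments above could look roughly like the sketch below. The helper name matchesTerms and its signature are made up for illustration, and it assumes plain case-sensitive substring matching (so "word10" would also count as containing "word1").

```php
<?php
// Hypothetical helper (name and signature are assumptions, not from the thread).
// Returns true when $text contains at least one of $wanted and none of $unwanted.
function matchesTerms(string $text, array $wanted, array $unwanted): bool
{
    // Reject as soon as any unwanted term appears.
    foreach ($unwanted as $term) {
        if (strpos($text, $term) !== false) {
            return false;
        }
    }
    // Accept as soon as any wanted term appears.
    foreach ($wanted as $term) {
        if (strpos($text, $term) !== false) {
            return true;
        }
    }
    return false;
}

$wanted   = ['word1', 'word2', 'phrase 1', 'phrase 2'];
$unwanted = ['word3', 'word4', 'phrase "hello" 3', 'phrase 4'];

var_dump(matchesTerms('some word1 here', $wanted, $unwanted));      // bool(true)
var_dump(matchesTerms('word1 but also word3', $wanted, $unwanted)); // bool(false)
var_dump(matchesTerms('nothing relevant', $wanted, $unwanted));     // bool(false)
```

Note the `!== false` comparison: strpos() returns 0 for a match at the start of the string, which a loose check would treat as "not found".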
+1  A: 

If you insist on a regex solution, you can use lookarounds.

^(?=.*(want|need|desired))(?!.*(noway|dontwant|nonono)).*$

(?=…) is positive lookahead; it asserts that a given pattern can be matched. (?!…) is negative lookahead; it asserts that a given pattern can NOT be matched.

The (this|that|somethingelse) part is an alternation group: it matches any one of the listed alternatives.

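Tested against the following lines (as seen on rubular.com), the pattern matches only the ones marked "match":

i want you (match)
i need you (match)
nonono i don't want you (no match)
noway noway noway (no match)
i in noway desired you (no match)
you desired me, though (match)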
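
Since the question uses PHP, here is a small sketch of how the same pattern could be tried with preg_match(). The `/` delimiters and the variable names are my additions; the pattern itself is unchanged.

```php
<?php
// The lookahead pattern from above, wrapped in PHP regex delimiters.
$pattern = '/^(?=.*(want|need|desired))(?!.*(noway|dontwant|nonono)).*$/';

$lines = [
    'i want you',
    'i need you',
    "nonono i don't want you",
    'noway noway noway',
    'i in noway desired you',
    'you desired me, though',
];

foreach ($lines as $line) {
    // preg_match() returns 1 when the pattern matches, 0 when it does not.
    $result = preg_match($pattern, $line) === 1 ? 'match' : 'no match';
    echo $line, ' => ', $result, PHP_EOL;
}
```

Only 'i want you', 'i need you' and 'you desired me, though' are reported as matches. The other three either contain one of the forbidden words or none of the wanted ones.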

polygenelubricants
i'd love not to insist on a regex solution, do u have something in mind?
Devrim
+1  A: 

Please find a good parsing library... This regex would be too complicated to use safely (mostly because of string escaping and escape-escaping). You could use a PEG parser for example.

PS. I'm assuming you want to parse the actual query $string, not produce a regex which will filter the text as described in the question.
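
If the goal really is to parse the query string itself, a query syntax this small can probably be handled by a hand-written scanner before reaching for a full parsing library. The following is only a rough sketch under several assumptions: the function name parseQuery and the array keys are invented here, terms are separated by whitespace, a leading `-` marks an excluded term, and inside quotes a backslash escapes the next character.

```php
<?php
// Hypothetical function name and output keys; a sketch, not a hardened parser.
function parseQuery(string $query): array
{
    $result = ['contains' => [], 'doesnt contain' => []];
    $length = strlen($query);
    $i = 0;

    while ($i < $length) {
        // Skip whitespace between terms.
        if (ctype_space($query[$i])) {
            $i++;
            continue;
        }

        // A leading dash marks an excluded term.
        $bucket = 'contains';
        if ($query[$i] === '-') {
            $bucket = 'doesnt contain';
            $i++;
        }

        $term = '';
        if ($i < $length && $query[$i] === '"') {
            // Quoted phrase: read up to the closing quote, honoring backslash escapes.
            $i++;
            while ($i < $length && $query[$i] !== '"') {
                if ($query[$i] === '\\' && $i + 1 < $length) {
                    $i++; // drop the backslash, keep the escaped character
                }
                $term .= $query[$i];
                $i++;
            }
            $i++; // step over the closing quote
        } else {
            // Bare word: read up to the next whitespace.
            while ($i < $length && !ctype_space($query[$i])) {
                $term .= $query[$i];
                $i++;
            }
        }

        if ($term !== '') {
            $result[$bucket][] = $term;
        }
    }

    return $result;
}

$text_expression = 'word1 word2 "phrase 1" "phrase 2" -word3 -word4 -"phrase \"hello\" 3" -"phrase 4"';
print_r(parseQuery($text_expression));
```

For the sample expression this yields four entries under each key, with the escaped phrase coming out as phrase "hello" 3. Unterminated quotes and other malformed input are not reported as errors here, which is one reason a real parsing library may still be the safer choice.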

viraptor
actually i'm looking for the regex which will filter the text.. checking PEG now, looks interesting...
Devrim
Ah, in that case PEG is not something you should use. http://stackoverflow.com/questions/3551507/basic-regex-help/3551594#3551594 is much better if you simply want to filter the text.
viraptor
+1  A: 

Negative matching with a regular expression is possible, but very complicated. Maybe you want to search for the first part first, and then filter the results with the second part. You "or" regular expressions with |, so look for word1|word2|phrase 1|phrase 2 first, and then remove the results that match word3|word4|phrase "hello" 3|phrase 4 (escaping the words and phrases before joining them with | is probably a good idea).
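
In PHP, this two-step idea might be sketched as follows. The helper buildAlternation and the sample lines are invented for illustration; preg_quote() does the escaping and preg_grep() with PREG_GREP_INVERT does the "remove results that match" step.

```php
<?php
// Hypothetical helper: escapes each literal term, then joins them with | into one pattern.
function buildAlternation(array $terms): string
{
    $escaped = array_map(function ($term) {
        // The second argument also escapes the / delimiter used below.
        return preg_quote($term, '/');
    }, $terms);

    return '/' . implode('|', $escaped) . '/';
}

$include = buildAlternation(['word1', 'word2', 'phrase 1', 'phrase 2']);
$exclude = buildAlternation(['word3', 'word4', 'phrase "hello" 3', 'phrase 4']);

// Made-up sample data.
$candidates = [
    'this has word1 only',
    'word2 together with word4',
    'a line with phrase 1 and phrase "hello" 3',
    'nothing of interest',
    'just phrase 2',
];

// Step 1: keep the lines that match any wanted term.
$kept = preg_grep($include, $candidates);
// Step 2: drop the lines that match any unwanted term.
$kept = preg_grep($exclude, $kept, PREG_GREP_INVERT);

print_r(array_values($kept));
```

With the sample data above, only 'this has word1 only' and 'just phrase 2' survive both steps. preg_grep() preserves the original array keys, hence the array_values() call before printing.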

Radomir Dopieralski