I'm pretty sure regular expressions are the way to go, but my head hurts whenever I try to work out the specific regular expression.

What regular expression do I need to find if a Java String (contains the text "ERROR" or the text "WARNING") AND (contains the text "parsing"), where all matches are case-insensitive?

EDIT: I've presented a specific case, but my problem is more general. There may be other clauses, but they all involve matching a specific word, ignoring case. There may be 1, 2, 3 or more clauses.

+5  A: 

If you're not 100% comfortable with regular expressions, don't try to use them for something like this. Just do this instead:

String s = test_string.toLowerCase();
if (s.contains("parsing") && (s.contains("error") || s.contains("warning"))) {
    ....

because when you come back to your code in six months' time you'll understand it at a glance.

Edit: Here's a regular expression to do it:

(?i)(?=.*parsing)(.*(error|warning).*)

but it's rather inefficient. For cases where you have an OR condition, a hybrid approach where you search for several simple regular expressions and combine the results programmatically with Java is usually best, both in terms of readability and efficiency.
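A minimal sketch of that hybrid approach, assuming one precompiled pattern per word and plain boolean operators to combine them. The class name `HybridMatch` and the method name `isParsingProblem` are made up for illustration:

```java
import java.util.regex.Pattern;

class HybridMatch {
    // One simple pattern per word; CASE_INSENSITIVE plays the role of (?i).
    private static final Pattern ERROR = Pattern.compile("error", Pattern.CASE_INSENSITIVE);
    private static final Pattern WARNING = Pattern.compile("warning", Pattern.CASE_INSENSITIVE);
    private static final Pattern PARSING = Pattern.compile("parsing", Pattern.CASE_INSENSITIVE);

    // find() searches anywhere in the input, unlike matches(),
    // which requires the whole input to match the pattern.
    static boolean isParsingProblem(String s) {
        boolean errorOrWarning = ERROR.matcher(s).find() || WARNING.matcher(s).find();
        return errorOrWarning && PARSING.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isParsingProblem("WARNING at line X doing parsing of Y"));   // true
        System.out.println(isParsingProblem("parsing ERROR Hello world"));              // true
        System.out.println(isParsingProblem("A problem at line X doing parsing of Y")); // false
    }
}
```

The AND/OR logic stays in ordinary Java, so adding another clause means adding one pattern and one boolean term rather than reworking a single large expression.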

RichieHindle
Richie, you are right. However I've given a specific case to make the question easy to understand, but my problem is more general.
Steve McLeod
s.contains("error") or s.contains("warning") should actually be s.contains("error") || s.contains("warning"), isn't it?
Shivasubramanian A
A: 

try:

 if ((str.indexOf("WARNING") > -1 || str.indexOf("ERROR") > -1) && str.indexOf("parsing") > -1)
Kheu
A: 

Regular Expressions are not needed here. Try this:

if((string1.toUpperCase().indexOf("ERROR",0) >= 0 ||  
  string1.toUpperCase().indexOf("WARNING",0) >= 0 ) &&
  string1.toUpperCase().indexOf("PARSING",0) >= 0 )

This also takes care of the case-insensitivity requirement.

Crimson
+3  A: 

If you really want to use regular expressions, you can use the positive lookahead operator:

(?i)(?=.*?(?:ERROR|WARNING))(?=.*?parsing).*

Examples:

Pattern p = Pattern.compile("(?=.*?(?:ERROR|WARNING))(?=.*?parsing).*", Pattern.CASE_INSENSITIVE); // you can also use (?i) at the beginning
System.out.println(p.matcher("WARNING at line X doing parsing of Y").matches()); // true
System.out.println(p.matcher("An error at line X doing parsing of Y").matches()); // true
System.out.println(p.matcher("ERROR Hello parsing world").matches()); // true    
System.out.println(p.matcher("A problem at line X doing parsing of Y").matches()); // false
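For the more general case with a variable number of clauses, the same lookahead idea could be assembled programmatically. This is only a sketch: the class `LookaheadBuilder` and its `build` method are hypothetical names, each inner list is assumed to be an OR group, and the outer list is assumed to AND the groups together. `Pattern.quote` is used so that the words are treated literally, and `DOTALL` is an extra assumption so that multi-line input also works.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class LookaheadBuilder {
    // Builds one (?=.*?(?:a|b)) lookahead per OR group, followed by ".*".
    static Pattern build(List<List<String>> clauses) {
        StringBuilder regex = new StringBuilder();
        for (List<String> alternatives : clauses) {
            String group = alternatives.stream()
                    .map(Pattern::quote)                 // treat each word as a literal
                    .collect(Collectors.joining("|"));
            regex.append("(?=.*?(?:").append(group).append("))");
        }
        regex.append(".*"); // consume the input so that matches() can succeed
        return Pattern.compile(regex.toString(), Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    }

    public static void main(String[] args) {
        Pattern p = build(List.of(List.of("ERROR", "WARNING"), List.of("parsing")));
        System.out.println(p.matcher("parsing ERROR Hello world").matches());              // true
        System.out.println(p.matcher("A problem at line X doing parsing of Y").matches()); // false
    }
}
```

Because every clause is a lookahead anchored at the start of the input, the order in which the words appear in the text does not matter.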
JG
System.out.println("matches = " + Pattern.matches("(?=.*?(?:ERROR|WARNING))(?=.*?parsing)", "ERROR Hello parsing world")); returns false... something wrong there.
Steve McLeod
Yes, the `.*` was missing at the end. I've edited my post.
JG
A: 

I usually use this applet to experiment with reg. ex. The expression may look like this:

if (str.matches("(?i)^.*?(WARNING|ERROR).*?parsing.*$")) {
...

But as stated in the answers above, it's better not to use a reg. ex. here.

Superfilin
System.out.println(Pattern.compile("(?i)^.*?(WARNING|ERROR).*?parsing.*$").matcher("parsing ERROR Hello world").matches()); returns false...
Steve McLeod
A: 

I think this regexp will do the trick (but there must be a better way to do it):

(.*(ERROR|WARNING).*parsing)|(.*parsing.*(ERROR|WARNING))
Pascal Thivent
A: 

If you have a variable number of words that you want to match, I would do something like this:

String mystring = "Text I want to match";
String[] matchings = {"warning", "error", "parse" /* , .... */};
int matches = 0;
for (int i = 0; i < matchings.length; i++) {
  if (mystring.contains(matchings[i])) {
    matches++;
  }
}

if (matches == matchings.length) {
   System.out.println("All Matches found");
} else {
   System.out.println("Some word is not matching :(");
}

Note: I haven't compiled this code, so it could contain typos.
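The loop above requires every word to be present and is case-sensitive. A possible variation that also handles OR groups and ignores case, without any regular expressions, might look like the sketch below. The names `ClauseMatcher` and `matchesAll` are invented for this example, and the words in each group are assumed to be given in lower case.

```java
import java.util.List;
import java.util.Locale;

class ClauseMatcher {
    // Outer list: every group must match (AND).
    // Inner list: any one word in the group is enough (OR).
    static boolean matchesAll(String text, List<List<String>> clauses) {
        // Lower-case the text once; Locale.ROOT avoids locale-specific surprises.
        String lower = text.toLowerCase(Locale.ROOT);
        return clauses.stream()
                .allMatch(group -> group.stream().anyMatch(lower::contains));
    }

    public static void main(String[] args) {
        List<List<String>> clauses = List.of(List.of("error", "warning"), List.of("parsing"));
        System.out.println(matchesAll("parsing ERROR Hello world", clauses));              // true
        System.out.println(matchesAll("A problem at line X doing parsing of Y", clauses)); // false
    }
}
```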

Carlos Tasada