Hi all, I have a String template that I need to process with a regex. I need to extract the list of #if ... #endif blocks from the template below. I tried the following regular expression:

String regexIfEndIf="\\#if(.*)\\#endif";

I ran it with the following code:

Pattern pattern = Pattern.compile(regexIfEndIf, Pattern.DOTALL);
Matcher matcher = pattern.matcher(template);
while (matcher.find()) {
  System.out.println("Found a match:[" + matcher.group() + "]");
}

The System.out call above prints everything from the first #if to the last #endif as a single match. I need two blocks instead: the first matcher.find() should return the first #if ... #endif block, and the second matcher.find() should return the second one. Please help me fix the regex so that it returns each #if ... #endif block separately. The template I used is below.

String template =
    "This is a sample document."
        + "#if ( $variable1 )"
        + "FIRST This text can be repeated many times until do while is called."
        + "#elseif ( $variable2 )"
        + "Some sample text after 1st ElseIf."
        + "#elseif($variable2)"
        + "This text can be repeated many times until do while is called. SECOND ELSEIF"
        + "#else "
        + "sample else condition  "
        + "#endif "
        + "Some sample text."
        + "This is the second sample document."
        + "#if ( $variable1 )"
        + "SECOND FIRST This text can be repeated many times until do while is called."
        + "#elseif ( $variable2 )"
        + "SECOND Some sample text after 1st ElseIf."
        + "#elseif($variable2)"
        + "SECOND This text can be repeated many times until do while is called. SECOND ELSEIF"
        + "#else " + "SECOND sample else condition  " + "#endif "
        + "SECOND Some sample text.";
+3  A: 

The easiest way to do this is to use a lazy quantifier:

"#if(.*?)#endif"

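Also, # is not a regex metacharacter, so you don't need to escape it (the exception is a pattern compiled with Pattern.COMMENTS, where an unescaped # starts a comment).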
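To make the difference concrete, here is a minimal sketch of the lazy version applied to a shortened template. The class name LazyIfBlocks, the helper findBlocks, and the sample string are made up for illustration; only the pattern itself comes from the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question.
class LazyIfBlocks {
    // (.*?) is lazy: it stops at the first #endif after each #if.
    // DOTALL lets '.' cross line breaks if the template has any.
    private static final Pattern IF_BLOCK =
            Pattern.compile("#if(.*?)#endif", Pattern.DOTALL);

    // Returns every non-nested #if ... #endif block, in order of appearance.
    static List<String> findBlocks(String template) {
        List<String> blocks = new ArrayList<>();
        Matcher matcher = IF_BLOCK.matcher(template);
        while (matcher.find()) {
            blocks.add(matcher.group());
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the template in the question.
        String template = "Intro. #if ( $a ) one #else two #endif "
                + "middle #if ( $b ) three #endif end";
        for (String block : findBlocks(template)) {
            System.out.println("Found a match:[" + block + "]");
        }
    }
}
```

With the greedy (.*) the loop would report one match spanning both blocks. With (.*?) each find() should stop at the nearest #endif, so the two blocks come back separately.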

Ian Henry
@Null - thanks, my bad. Slip of the brain...
Ian Henry
Is the nomenclature 'lazy' or 'not greedy'?
Tony Ennis
@deepa, bear in mind that the solution here won't work for arbitrarily nested #ifs. Regexps don't do nesting well.
Tony Ennis
@Tony, strict regular expressions (in the Computer Science theory sense) can't do nesting at all. Some extensions make very limited nesting possible, but it is extremely hard to get right and the syntax is horrible.
imgx64
@Tony: 'not greedy' could refer to possessive quantification as well. Unless you consider possessive to be a weird subclass of greedy, I suppose.
Ian Henry
Could you please help me handle matching #if-#elseif blocks even when they are nested? Tony said that it is difficult to handle.
Deepa
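One possible way to cope with nesting, sketched here as an assumption and not taken from any of the posts above, is to stop asking a single regex to match whole blocks. The regex only locates the #if and #endif directives, and a depth counter pairs them up. The class name NestedIfBlocks and the method topLevelBlocks are hypothetical, and the sketch assumes each directive is followed by a non-word character such as a space or '('.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class illustrating a depth-counting scan.
class NestedIfBlocks {
    // Finds only opening and closing directives. #elseif and #else are
    // not matched, so they never change the depth.
    private static final Pattern DIRECTIVE =
            Pattern.compile("#if\\b|#endif\\b");

    // Returns each outermost #if ... #endif block, nested blocks included.
    static List<String> topLevelBlocks(String template) {
        List<String> blocks = new ArrayList<>();
        Matcher m = DIRECTIVE.matcher(template);
        int depth = 0;   // number of #if directives currently open
        int start = -1;  // index where the current outermost #if begins
        while (m.find()) {
            if (m.group().equals("#if")) {
                if (depth == 0) {
                    start = m.start();
                }
                depth++;
            } else if (depth > 0) {
                depth--;
                if (depth == 0) {
                    blocks.add(template.substring(start, m.end()));
                }
            }
            // A stray #endif seen at depth 0 is ignored.
        }
        return blocks;
    }
}
```

For example, topLevelBlocks("x #if (a) A #if (b) B #endif C #endif y") should return the single outer block with the inner one still inside it. An unclosed #if at the end of the template is silently dropped in this sketch; a real template processor would probably want to report it as an error.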