Hi,

This might be a piece of cake for Java experts. Please help me out:

I have a block of comments in my program like this:

/********* 
block of comments - line 1
line 2
.....
***/

How could I retrieve "block of comments" using regex?

thanks.

A: 

Not sure about the multi-line issues, but if it were all on one line, you could do this:

^\/\*+.*\*+\/$

That breaks down to:

^ start of a line
\/\*+ start of a comment, one or more *'s (both characters escaped)
.* any number of characters
\*+\/ end of a comment, one or more *'s (both characters escaped)
$ end of a line
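
The breakdown above can be tried out in Java roughly as follows. This is only a sketch under a few assumptions: the class and method names are made up for illustration, a reluctant capturing group is added so the text between the stars can be pulled out, and MULTILINE is set so that ^ and $ anchor at each line rather than at the whole input. The forward slash is not special in Java regex, so it is left unescaped here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class SingleLineComment {
    // MULTILINE makes ^ and $ match at the start and end of every line.
    private static final Pattern ONE_LINE =
        Pattern.compile("^/\\*+(.*?)\\*+/$", Pattern.MULTILINE);

    /** Returns the trimmed comment text, or null if no line is a complete comment. */
    static String extract(String source) {
        Matcher m = ONE_LINE.matcher(source);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String src = "int x = 1;\n/*** block of comments ***/\nint y = 2;";
        System.out.println(extract(src)); // prints: block of comments
    }
}
```

Because . does not match line breaks by default, this version never matches a comment that spans several lines.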

By the way, it's "regex" not "regrex" :)

ChessWhiz
I think both your \* need to be \*+
Ethan Shepherd
@Ethan Thanks. Didn't realize that those extra *'s shouldn't be matched.
ChessWhiz
+2  A: 

Something like this should do:

    String str =
        "some text\n"+
        "/*********\n" +
        "block of comments - line 1\n" +
        "line 2\n"+
        "....\n" +
        "***/\n" +
        "some more text";

    Pattern p = Pattern.compile("/\\*+(.*?)\\*+/", Pattern.DOTALL);
    Matcher m = p.matcher(str);

    if (m.find())
        System.out.println(m.group(1));

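(DOTALL makes the . in the pattern match newline characters as well.) Prints the following, with a blank line before and after it because the captured group includes the newlines next to the comment delimiters: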

block of comments - line 1
line 2
....
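
To see what the DOTALL flag changes, here is a small sketch that runs the same pattern with and without it. The class name, the firstComment helper and the shortened sample string are illustrative assumptions, not part of the answer above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for comparing regex flags.
class DotallDemo {
    /** Returns the first captured comment body for the given flags, or null if nothing matches. */
    static String firstComment(String source, int flags) {
        Matcher m = Pattern.compile("/\\*+(.*?)\\*+/", flags).matcher(source);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String src = "some text\n/*********\nline 1\nline 2\n***/\nsome more text";
        // Without DOTALL the dot stops at line breaks, so the multi-line comment is not found.
        System.out.println(firstComment(src, 0));
        // With DOTALL the dot crosses line breaks and the body is captured.
        System.out.println(firstComment(src, Pattern.DOTALL));
    }
}
```

The captured group keeps the newline right after the opening stars and the one right before the closing stars, so calling trim() on the result is one way to drop them.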
aioobe
This messes up if you put /* in a string in the code. No one should do that, but I'm just saying.
Buttink
Ah, yes. Good point.
aioobe
+2  A: 

    Pattern regex = Pattern.compile("/\\*[^\\r\\n]*[\\r\\n]+(.*?)[\\r\\n]+[^\\r\\n]*\\*+/", Pattern.DOTALL);

This works because comments can't be nested in Java.

It is important to use a reluctant quantifier (.*?), or we will match everything from the start of the first comment to the end of the last comment in a file, regardless of whether there is actual code in between.
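
The difference is easy to demonstrate on a string with two comments and some code between them. This is a minimal sketch with simplified patterns; the class name and the firstMatch helper are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class comparing greedy and reluctant quantifiers.
class GreedyVsReluctant {
    /** Returns the text of the first match of regex in source, or null if there is none. */
    static String firstMatch(String regex, String source) {
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(source);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String src = "/* first */ int x = 1; /* second */";
        // Greedy .* runs to the last */ and swallows the code in between.
        System.out.println(firstMatch("/\\*.*\\*/", src));  // prints: /* first */ int x = 1; /* second */
        // Reluctant .*? stops at the first */ it can reach.
        System.out.println(firstMatch("/\\*.*?\\*/", src)); // prints: /* first */
    }
}
```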

/\* matches /*

[^\r\n]* matches whatever else is on the rest of this line.

[\r\n]+ matches one or more line-break characters (\r or \n).

.*? matches as few characters as possible.

[\r\n]+ matches one or more line-break characters (\r or \n).

[^\r\n]* matches any characters on the line of the closing */.

\*+/ matches one or more *'s followed by /, i.e. the closing */.
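
Putting the pieces together, the pattern can be run against the comment block from the question. This is a sketch only; the class name and the extract helper are assumptions added for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper around the pattern from this answer.
class BlockCommentBody {
    private static final Pattern BLOCK = Pattern.compile(
        "/\\*[^\\r\\n]*[\\r\\n]+(.*?)[\\r\\n]+[^\\r\\n]*\\*+/", Pattern.DOTALL);

    /** Returns the lines between the opening and closing comment lines, or null if none is found. */
    static String extract(String source) {
        Matcher m = BLOCK.matcher(source);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String src =
            "some text\n" +
            "/*********\n" +
            "block of comments - line 1\n" +
            "line 2\n" +
            "....\n" +
            "***/\n" +
            "some more text";
        System.out.println(extract(src));
    }
}
```

Since the line breaks after the opening line and before the closing line are matched outside the group, the captured text starts at "block of comments" and ends at "...." with no surrounding blank lines. Note that the pattern requires at least one line break on each side of the body, so it is not meant for comments written on a single line.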

Tim Pietzcker