I want to match a block of code multiple times in a file but can't work out the regular expression to do this. An example of the code block is:

//@debug
...
// code in here
...
//@end-debug (possibly more comments here on same line)

Each code block I'm trying to match will start with //@debug and stop at the end of the line containing //@end-debug

I have this at the moment:

/(\/{2}\@debug)(.|\s)*(\/{2}\@end-debug).*/

But this matches one big block from the first //@debug all the way to end of the line of the very last //@end-debug in the file.

Any ideas?

A: 

What language are you using? Python regular expressions (which I believe are largely equivalent to Perl 5 regexps) have the concept of 'greedy' vs. 'non-greedy' quantifiers, and you choose between them in the pattern itself by marking individual quantifiers.

Search for "greedy vs non-greedy" on this page; this page might be better.

Non-greedy quantifiers have the same syntax as regular greedy ones, except with the quantifier followed by a question-mark. For example, a non-greedy pattern might look like: "/A[A-Z]*?B/". In English, this means "match an A, followed by only as many capital letters as are needed to find a B."
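
A minimal Python sketch of that difference, using a made-up input string (the names text, greedy and lazy are only for illustration):

```python
import re

text = "AXXBYYB"  # hypothetical sample input

# Greedy: [A-Z]* takes as many capitals as it can, so the match ends at the LAST B.
greedy = re.search(r"A[A-Z]*B", text).group(0)

# Non-greedy: [A-Z]*? takes as few capitals as it can, so the match ends at the FIRST B.
lazy = re.search(r"A[A-Z]*?B", text).group(0)

print(greedy)  # AXXBYYB
print(lazy)    # AXXB
```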

Dustin Getz
also, this guy is your friend: http://www.regexbuddy.com/
Dustin Getz
+4  A: 

Basically your regular expression is greedy. This means the quantifiers grab as much as they possibly can, which gives the result you've seen: (.|\s)* runs all the way to the last //@end-debug in the file. Just change it to non-greedy where appropriate. In your case use:

/(\/{2}\@debug)(.|\s)*?(\/{2}\@end-debug).*/
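
To see the effect, here is a rough Python sketch that runs both versions of the pattern over a small made-up file. The sample text and variable names are assumptions for illustration, and the pattern is written as a Python raw string, so the slashes and @ need no escaping:

```python
import re

# Hypothetical file contents with two debug blocks.
source = """\
int a = 1;
//@debug
log(a);
//@end-debug first block
int b = 2;
//@debug
log(b);
//@end-debug second block
int c = 3;
"""

# Original (greedy) pattern and the non-greedy fix.
greedy = re.compile(r"(/{2}@debug)(.|\s)*(/{2}@end-debug).*")
lazy = re.compile(r"(/{2}@debug)(.|\s)*?(/{2}@end-debug).*")

greedy_blocks = [m.group(0) for m in greedy.finditer(source)]
lazy_blocks = [m.group(0) for m in lazy.finditer(source)]

print(len(greedy_blocks))  # 1: one big match spanning both blocks
print(len(lazy_blocks))    # 2: one match per block
```

The trailing .* still stops at the end of the line here because the dot does not match a newline by default.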

cletus
This did the trick! Thanks very much.
bishboria
I'll vote this up, if you indent the code block, so it will be highlighted.
Brad Gilbert
+1  A: 

You shouldn't have to use that (.|\s) hack, either, but the syntax for doing it the correct way depends on the language or tool you're using. In Perl or JavaScript (where the s flag requires ES2018 or later), you could do this:

/\/\/@debug.*?\/\/@end-debug[^\r\n]*/sg

The /s modifier lets the dot match carriage-returns and linefeeds, resulting in a regex that's both easier to read and more efficient. It also means I had to change the second .* to [^\r\n]*, but it's worth it. The /g modifier is what lets the regex match multiple times (i.e., "globally").
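
For comparison, a rough Python equivalent might look like the sketch below, assuming re.DOTALL as the counterpart of /s and re.findall as the counterpart of /g. The sample text and names are made up. The second pattern shows why the trailing .* has to become [^\r\n]* once the dot matches newlines:

```python
import re

# Hypothetical file contents with two debug blocks.
source = (
    "keep 1\n"
    "//@debug\n"
    "trace 1\n"
    "//@end-debug trailing note\n"
    "keep 2\n"
    "//@debug\n"
    "trace 2\n"
    "//@end-debug\n"
    "keep 3\n"
)

# DOTALL lets "." cross line breaks; [^\r\n]* stops at the end of the marker line.
pattern = re.compile(r"//@debug.*?//@end-debug[^\r\n]*", re.DOTALL)
blocks = pattern.findall(source)

# With DOTALL on, a trailing ".*" no longer stops at the newline
# and swallows the rest of the text.
runaway = re.compile(r"//@debug.*?//@end-debug.*", re.DOTALL)
overrun = runaway.findall(source)

print(len(blocks))   # 2
print(len(overrun))  # 1
```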

Alan Moore