Hi, I'm trying to extract two blocks of code from a text file using Java regex. However, I can only extract the last block. Could someone point out what is wrong with my code?
Thanks.
Here it is:
import java.util.regex.*;
INPUT_START_OR_BLANK_LINE = /(?:\A|\n\n)/
FOUR_SPACES_OR_TAB = /(?:[ ]{4}|\t)/
CODE = /.*\n+/
CODE_LINES = /(?:$FOUR_SPACES_OR_TAB$CODE)/
LOOKAHEAD_FOR_NON_CODE_LINE = /(?:(?=^[ ]{0,4}\S)|\Z)/
// this regular expression will find all of the consecutive code lines in a markdown file
// in a markdown file, if the line starts with a tab or at least 4 spaces, it's a code line
// slightly modified from one in markdownj
// see: http://github.com/myabc/markdownj/tree/master/src/java/com/petebevin/markdown/MarkdownProcessor.java
MARKDOWN_CODE_BLOCK = "(?m)" +
"$INPUT_START_OR_BLANK_LINE" +
"($CODE_LINES+)" +
"$LOOKAHEAD_FOR_NON_CODE_LINE"
def text="""
Normal paragraph ....
first Code block begin
all codes
first code block end
how about this line?
that is not good but what we have are very important
for the purpose of text. yes, we are good.
second Code block begin
all codes
second code block end
how about this line?
Normal returns
"""
Pattern p = Pattern.compile(MARKDOWN_CODE_BLOCK);
Matcher m = p.matcher(text);
while (m.find()) {
    m.group().eachLine { println it }
}
The code was adapted from http://naleid.com/blog/2009/01/01/using-groovy-regular-expressions-to-parse-code-from-a-markdown-file/
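For comparison, here is a minimal plain-Java sketch of the behaviour being asked for: a find() loop that collects every indented block, not just the last one. The class name CodeBlockExtractor, the extract method, the sample text, and the simplified pattern are all assumptions made for illustration; this is not the markdownj pattern. It treats any run of consecutive lines that start with four spaces or a tab as one block, does not require a blank line before the block, and will split a block at an empty line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CodeBlockExtractor {
    // Hypothetical simplified pattern: one or more consecutive lines,
    // each starting with four spaces or a tab. (?m) makes ^ match at
    // the start of every line; \z covers a last line with no newline.
    private static final Pattern CODE_BLOCK =
            Pattern.compile("(?m)(?:^(?:[ ]{4}|\\t).*(?:\\n|\\z))+");

    // Returns every indented block found in the text, in order.
    static List<String> extract(String text) {
        List<String> blocks = new ArrayList<>();
        Matcher m = CODE_BLOCK.matcher(text);
        while (m.find()) {
            blocks.add(m.group());
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Sample input with explicit indentation and blank lines.
        String text = "Normal paragraph\n"
                + "\n"
                + "    first code block begin\n"
                + "    first code block end\n"
                + "\n"
                + "more prose\n"
                + "\n"
                + "    second code block begin\n"
                + "    second code block end\n"
                + "\n"
                + "Normal returns\n";
        for (String block : extract(text)) {
            System.out.println("--- block ---");
            System.out.print(block);
        }
    }
}
```

Because each match consumes only the indented lines themselves, the blank line after one block is still available when the matcher searches for the next block. That may be worth checking in the original pattern, where CODE ends in \n+ and the next match needs \n\n in front of it.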