I need a regex that extracts the text inside a pair of delimiters, but I'm having a problem extracting the value between the delimiters [DATA n] and [END DATA].

Here's my regex

(?<=\[DATA\s+\d+\]).*(?=\[END DATA\])

Here's the example data I want to match:

Some text here

[DATA 1]
data one 
some more data
[END DATA]
[DATA 2]
data two
more data
data
[END DATA]
[DATA n]
more data 
data 
[END DATA]
+2  A: 

In regex, [ text between ] is called a character class, and the regex engine will match only one of the characters between the brackets. You just need to put backslashes before the brackets to make them literal:

(?<=\[DATA\s+\d+\]).*(?=\[END DATA\])
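
As a rough illustration of the difference between a character class and escaped brackets, here is a minimal Python sketch. Python's `re` module is an assumption here (the thread itself uses Expresso), and the lookbehind is left out because Python's `re` does not accept variable-width lookbehinds.

```python
import re

line = "[END DATA]"

# Unescaped: [END DATA] is a character class, so each match is a single
# character taken from the set E, N, D, space, A, T.
as_class = re.findall(r"[END DATA]", line)

# Escaped: the brackets are literal, so the whole delimiter matches once.
as_literal = re.findall(r"\[END DATA\]", line)

print(as_class)
print(as_literal)
```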
too much php
Sorry, I edited my post; it should contain the \ for the brackets. Thanks. But it still does not extract the data inside the delimiters. By the way, I'm using Expresso.
dynamicvoid
+3  A: 

You appear to be using regular expression features like lookbehind and lookahead when you don't really need them. Try:

\[DATA\s+\d+\](.*?)\[END DATA\]

There's only one capture group in this regular expression, (.*?). After using this, the result you're looking for should be in capture group 1.

Note also that I've used the non-greedy .*? match that will match up until the first following instance of [END DATA]. Without this, if you use just .*, you'll capture everything up to the last [END DATA].
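
The capture-group approach and the greedy versus non-greedy difference can be sketched in Python as follows. This is only an illustration under two assumptions: Python's `re` module stands in for whatever engine is actually in use, and `re.DOTALL` is added so that the dot can cross the line breaks in the sample data.

```python
import re

sample = """Some text here

[DATA 1]
data one
some more data
[END DATA]
[DATA 2]
data two
more data
data
[END DATA]
"""

# re.DOTALL (an assumption for this sketch) lets "." match newlines.
lazy = re.compile(r"\[DATA\s+\d+\](.*?)\[END DATA\]", re.DOTALL)
greedy = re.compile(r"\[DATA\s+\d+\](.*)\[END DATA\]", re.DOTALL)

# findall returns the contents of capture group 1 for every match.
lazy_blocks = [block.strip() for block in lazy.findall(sample)]
greedy_blocks = [block.strip() for block in greedy.findall(sample)]

print(len(lazy_blocks))    # one entry per [DATA n] ... [END DATA] pair
print(len(greedy_blocks))  # a single entry running to the last [END DATA]
```

With the non-greedy pattern each block is returned separately; with the greedy pattern the single result still contains the first [END DATA] and the following [DATA 2] header.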

Greg Hewgill
+1  A: 

Use \ as the escape character, and put the + after \d so that it repeats the digits rather than the closing bracket:

\[DATA\s\d+\]([^\[]+)\[[^\]]+\]
Jet
+3  A: 

The dot special character doesn't match newlines by default. Make sure you are using the single-line modifier for your implementation of regex, or use [\S\s]*? instead of .*?

See http://www.regular-expressions.info/modifiers.html and http://www.regular-expressions.info/dot.html for details.
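
To make the newline behaviour concrete, here is a small Python sketch. Python's `re` is an assumption, and what other engines call the single-line modifier is `re.DOTALL` there. It compares the plain pattern with the modifier passed as a flag, the modifier written inline, and the [\s\S] workaround.

```python
import re

text = "[DATA 1]\ndata one\nsome more data\n[END DATA]\n"

# Without the modifier, "." stops at the first newline, so nothing matches.
plain = re.findall(r"\[DATA\s+\d+\](.*?)\[END DATA\]", text)

# Single-line (dot-all) mode passed as a flag.
dotall = re.findall(r"\[DATA\s+\d+\](.*?)\[END DATA\]", text, re.DOTALL)

# The same mode written inline at the start of the pattern.
inline = re.findall(r"(?s)\[DATA\s+\d+\](.*?)\[END DATA\]", text)

# Engine-independent workaround: [\s\S] matches any character, newlines included.
char_class = re.findall(r"\[DATA\s+\d+\]([\s\S]*?)\[END DATA\]", text)

print(plain)
print(dotall == inline == char_class)
```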

Bennor McCarthy
Better to specify the single-line modifier inline using `(?s)` at the front.
Sean Fausett
Yeah, that's definitely an option. I don't think that's supported by all regex implementations, though. As far as I know, JavaScript doesn't support that syntax. The question's not specific, so I just suggested something that should work in most (if not all) cases.
Bennor McCarthy
I want to extract all the data inside the delimiters, and it should be multiline. Yes, I agree that .*? on its own is not applicable.
dynamicvoid