Hello everyone,

I'm having a bit of trouble with my regex and was wondering if anyone could please shed some light on what to do.

Basically, I have this Regex:

\[(link='\d+') (type='\w+')](.*|)\[/link]

For example, when I pass it the string:

[link='8' type='gig']Blur[/link] are playing [link='19' type='venue']Hyde Park[/link]

It only returns a single match from the opening [link] tag to the last [/link] tag.

I'm just wondering if anyone could please help me with what to put in my (.*|) section to only select one [link][/link] section at a time.

Thanks!

A: 

Regular Expressions Info is a fantastic site. This page gives an example of dealing with HTML tags. There's also an Eclipse plugin that lets you develop expressions and see the matching in real time.

Josh
+3  A: 

You need to make the wildcard selection ungreedy with the "?" operator. I make it:

/\[(link='\d+')\s+(type='\w+')\](.*?)\[\/link\]/

Of course, this all falls down for any kind of nesting, in which case the language is no longer regular and regexes aren't suitable; find a parser.
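For anyone who wants to try it, here is a minimal sketch of how that pattern might be used in JavaScript, assuming an environment that supports String.prototype.matchAll. The sample string is the one from the question; the names input, linkPattern and links are made up for illustration.

```javascript
// Sketch only: sample text is taken from the question, variable names are illustrative.
const input =
  "[link='8' type='gig']Blur[/link] are playing [link='19' type='venue']Hyde Park[/link]";

// The lazy quantifier (.*?) stops at the first [/link] rather than the last one.
// The g flag is required by matchAll.
const linkPattern = /\[(link='\d+')\s+(type='\w+')\](.*?)\[\/link\]/g;

// Collect the three capture groups of every match.
const links = [...input.matchAll(linkPattern)].map((m) => ({
  link: m[1],
  type: m[2],
  text: m[3],
}));

console.log(links);
```

With the sample string this should give two separate matches, one with the text Blur and one with the text Hyde Park. As noted above, it will still break on nested [link] sections.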

annakata
I had to change some other aspects of the regex for it to make sense to my ecmascript brain...
annakata
Thanks a lot! Works perfectly!
fishkopter
@annakata: I think this question would have been a reasonable candidate for the "regexhtmlparserquestions" tag you once put up. ;-)
Tomalak
sigh, I do miss that tag :)
annakata
There is still one question that has it. You can still go for the Taxonomist badge. :-)
Tomalak
A: 

You need to make the .* in the middle of your regex non-greedy. Look up the syntax and/or flag for non-greedy mode in your flavor of regular expressions.
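To see the difference side by side, here is a rough comparison in JavaScript, which is only an assumption about the flavor since the question doesn't say which one is in use. As far as I know JavaScript has no "ungreedy" flag, so the ? suffix on the quantifier is the way to do it there; other flavors may differ. The names sample, greedy and lazy are just for illustration.

```javascript
// Sketch only: compares a greedy and a lazy quantifier on the question's sample text.
const sample =
  "[link='8' type='gig']Blur[/link] are playing [link='19' type='venue']Hyde Park[/link]";

// Greedy: .* grabs as much as possible, then backtracks to the LAST [/link].
const greedy = /\[link='\d+' type='\w+'\](.*)\[\/link\]/g;

// Lazy: .*? grabs as little as possible, so it stops at the FIRST [/link].
const lazy = /\[link='\d+' type='\w+'\](.*?)\[\/link\]/g;

const greedyTexts = [...sample.matchAll(greedy)].map((m) => m[1]);
const lazyTexts = [...sample.matchAll(lazy)].map((m) => m[1]);

console.log(greedyTexts.length); // one match spanning both tags
console.log(lazyTexts);          // one entry per [link]...[/link] section
```

The greedy version reproduces the behavior described in the question (a single match from the first opening tag to the last closing tag), while the lazy version returns each section separately.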

Darron