[quote=Username here]quoted text here[/quote]

Reply text here

I need a regular expression that stores the "Username here", "quoted text here" and "Reply text here" in an array.

This expression needs to support nesting as well. Example:

[quote=Username2 here][quote=Username here]quoted text here[/quote]

Reply text here[/quote]

Reply text here
A: 

Assuming you do not need the values returned in a nested structure or with the quotes matched up (which is impossible in a regex), you can simply split on the parts you do not need:

preg_split('#(\[quote=|\[quote]|]|\[/quote])#', $yourstring);
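
To make the split approach concrete, here is a minimal sketch run against the nested example from the question. The sample string, the `PREG_SPLIT_NO_EMPTY` flag and the trimming step are my additions for illustration. `#` is used as the delimiter so the `/` in `[/quote]` does not need escaping.

```php
<?php
// Sample input modelled on the nested example in the question (an assumption).
$yourstring = "[quote=Username2 here][quote=Username here]quoted text here[/quote]\n\n"
            . "Reply text here[/quote]\n\nReply text here";

// Split on every tag fragment; PREG_SPLIT_NO_EMPTY drops the empty strings
// that appear between two adjacent tags.
$parts = preg_split('#\[quote=|\[quote]|]|\[/quote]#', $yourstring, -1, PREG_SPLIT_NO_EMPTY);

// Trim surrounding whitespace and discard pieces that were only newlines.
$parts = array_values(array_filter(array_map('trim', $parts), 'strlen'));

print_r($parts);
```

The result is a flat list of user names and text fragments in document order. Nothing in it records which quote a fragment belongs to, so this only helps when the nesting itself does not matter.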
Matijs
That will not match arbitrary nested quote blocks.
Bart Kiers
As I said: some things you cannot do in a regular expression. Matching with nesting is one such thing. Very unfair to subtract points when I cannot get you the mathematically impossible.
Matijs
Apologies, I thought you posted a `preg_match(...)` solution, I now see it was a `preg_split(...)`. I will remove my down-vote.
Bart Kiers
Note that regex implementations that support look-arounds, back references etc. cannot be considered mathematically "regular". And when an implementation does not support recursive constructs, you can still match a fixed number of nested tags by using look-aheads.
Bart Kiers
Ah well, your solution does win hands down.
Matijs
@Matijs, perhaps, but although I know a bit of regex-trickery, I wouldn't quickly use a recursive regex construct in any code. Not even in one of the many pet-projects I'm working on! :)
Bart Kiers
+3  A: 

This regex matches a nested quote block (in group 1) followed by a trailing reply (in group 2):

(\[quote=[^]]*](?:(?R)|.)*\[/quote])(.*)

A little demo:

$text = '[quote=Username2 here][quote=Username here]quoted text[/quote]Reply text[/quote]More text';
preg_match('#(\[quote=[^]]*](?:(?R)|.)*\[/quote])(.*)#is', $text, $match);
print_r($match);

produces:

Array
(
    [0] => [quote=Username2 here][quote=Username here]quoted text[/quote]Reply text[/quote]More text
    [1] => [quote=Username2 here][quote=Username here]quoted text[/quote]Reply text[/quote]
    [2] => More text
)

A little explanation:

(                  # open group 1
  \[quote=[^]]*]   #   match '[quote= ... ]'
  (?:(?R)|.)*      #   recursively match the entire pattern or any character and repeat it zero or more times
  \[/quote]        #   match '[/quote]'
)                  # close group 1
(                  # open group 2
  .*               #   match zero or more trailing chars after the last '[/quote]'
)                  # close group 2

But using these recursive regex constructs supported by PHP might make one's head spin... I'd opt for a little parser like John Kugelman suggested.
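
For illustration, a little parser could look roughly like the sketch below. This is my own guess at the idea, not necessarily what that answer proposed, and the function names `tokenize_quotes` and `parse_quotes` are hypothetical. It splits the text into tags and text pieces, then builds a tree by recursing on every opening tag.

```php
<?php
// Hypothetical helper: split the input into tag tokens and text tokens.
function tokenize_quotes(string $text): array
{
    return preg_split('#(\[quote=[^\]]*\]|\[/quote\])#', $text, -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
}

// Hypothetical helper: build a tree of ['user' => ..., 'children' => [...]].
// Children are either plain strings or nested nodes of the same shape.
// $i is the current token position, shared across the recursive calls.
function parse_quotes(array $tokens, int &$i, ?string $user = null): array
{
    $node = ['user' => $user, 'children' => []];
    while ($i < count($tokens)) {
        $token = $tokens[$i++];
        if ($token === '[/quote]' && $user !== null) {
            return $node;                       // closes the current block
        }
        if (preg_match('#^\[quote=([^\]]*)\]$#', $token, $m)) {
            $node['children'][] = parse_quotes($tokens, $i, $m[1]);
        } elseif (trim($token) !== '') {
            $node['children'][] = trim($token); // plain text (or a stray tag)
        }
    }
    return $node;                               // end of input
}

$text = '[quote=Username2 here][quote=Username here]quoted text[/quote]Reply text[/quote]More text';
$tokens = tokenize_quotes($text);
$i = 0;
$tree = parse_quotes($tokens, $i);
print_r($tree);
```

The root node has `user` set to `null` and holds the outermost quote plus the trailing text. Unlike the regex, this keeps track of which reply belongs to which quote, and an unclosed quote simply ends at the end of the input instead of making the whole match fail.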

Bart Kiers