I'm extracting a string from the Wikipedia API that initially looks like this: link text. I want to strip out every {{...}} along with everything between the braces (which could be any kind of text). For that I thought about using a recursive function with preg_match and preg_replace, something like:

function drop_brax($text)
{
    if(preg_match('/{{(.)*}}/',$text)) 
    return drop_brax(preg_replace('/{{(.)*}}/','',$text));
    return $text;
}

This function will not work because of a situation like this:

{{ I like mocachino {{ but I also like banana}} and frutis }}

This will strip everything between the first occurrence of {{ and }} (and leave "and frutis }}" behind). How can I do this properly (while keeping the nice recursive form)?

A: 

Here's a much simpler function (without the sexy recursiveness, though):

function drop_brax($text)
{
    $text = str_replace("{{", "", $text);
    $text = str_replace("}}", "", $text);

    return $text;
}

(edit)

Never mind, I realize I didn't read the question thoroughly enough. BRB!

vrutberg
That will not strip everything between them; it will just remove all the braces.
sombe
`str_replace(array('{{','}}'),'', $text);`
Ewan Todd
Yeah, I realized that after I posted and read your question again.
vrutberg
+3  A: 

Try something like this:

$text = '...{{aa{{bb}}cc}}...{{aa{{bb{{cc}}bb{{cc}}bb}}dd}}...';
preg_match_all('/\{\{(?:[^{}]|(?R))*}}/', $text, $matches);
print_r($matches);

output:

Array
(
    [0] => Array
        (
            [0] => {{aa{{bb}}cc}}
            [1] => {{aa{{bb{{cc}}bb{{cc}}bb}}dd}}
        )
)

And a short explanation:

\{\{      # match two opening brackets
(?:       # start non-capturing group 1
  [^{}]   #   match any character except '{' and '}'
  |       #   OR
  (?R)    #   recursively call the entire pattern: \{\{(?:[^{}]|(?R))*}}
)         # end non-capturing group 1
*         # repeat non-capturing group 1 zero or more times
}}        # match two closing brackets
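
Since the question asks to remove the groups rather than list them, here is a minimal sketch that feeds the same recursive pattern to preg_replace. The function name drop_brax_regex is hypothetical (it is not part of the answer above), and the sketch assumes the braces in the input are balanced; unbalanced ones are simply left in place.

```php
<?php
// Hypothetical helper: removes every balanced {{...}} group, nested ones
// included, by replacing each match of the recursive pattern with ''.
function drop_brax_regex(string $text): string
{
    $result = preg_replace('/\{\{(?:[^{}]|(?R))*\}\}/', '', $text);

    // preg_replace() returns null on error (for example when the backtrack
    // limit is hit); fall back to the unchanged input in that case.
    return $result ?? $text;
}

// The spaces on both sides of the removed group are kept, so the result
// below contains two spaces between "text" and "some".
echo drop_brax_regex('some text {{ I like mocachino {{ but I also like banana}} and frutis }} some text');
// prints "some text  some text"
```

Because the pattern only matches balanced groups, a stray {{ or }} stays in the text instead of swallowing whatever follows it.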
Bart Kiers
I tried it, and so far so good. I'm going to give it a couple more tests. Thank you very much!
sombe
You're welcome Gal.
Bart Kiers
A: 

To make this fully recursive, you will need a parser:

function drop_brax($str)
{
    $buffer = '';   // text collected outside of any braces
    $depth = 0;     // current brace nesting level
    $strlen_str = strlen($str);
    for ($i = 0; $i < $strlen_str; $i++)
    {
        $char = $str[$i];

        switch ($char)
        {
            case '{':
                $depth++;
                break;
            case '}':
                $depth--;
                break;
            default:
                // keep the character only when we are outside all braces
                if ($depth === 0) {
                    $buffer .= $char;
                }
        }
    }
    return $buffer;
}

$str = 'some text {{ I like mocachino {{ but I also like banana}} and frutis }} some text';
$str = drop_brax($str);
echo $str;

output:

some text  some text
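
Note that the parser above changes depth on every single brace, so text inside a lone {...} is dropped as well. If only doubled braces should count as delimiters, a variant along these lines might do. This is just a sketch: the name drop_double_brax is hypothetical, and leaving a stray }} in place is an assumption about the desired behavior.

```php
<?php
// Hypothetical variant: only "{{" and "}}" change the nesting depth;
// single braces are treated as ordinary characters.
function drop_double_brax(string $str): string
{
    $buffer = '';
    $depth = 0;
    $len = strlen($str);
    for ($i = 0; $i < $len; $i++) {
        $pair = substr($str, $i, 2); // current and next character
        if ($pair === '{{') {
            $depth++;
            $i++; // skip the second brace of the pair
        } elseif ($pair === '}}' && $depth > 0) {
            $depth--;
            $i++; // skip the second brace of the pair
        } elseif ($depth === 0) {
            // outside all {{...}} groups: keep the character
            $buffer .= $str[$i];
        }
    }
    return $buffer;
}

echo drop_double_brax('a {b} c {{ x {{ y }} z }} d');
// prints "a {b} c  d"
```

A closing }} that has no matching opener is kept as plain text, so the depth never goes negative.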
antpaw
I tried both your suggestion and Bart K.'s, and his turned out to be faster. Nevertheless, thanks a lot for your help! I appreciate it.
sombe