Hi,

I'm trying to match a string like this: {{name|arg1|arg2|...|argX}} with a regular expression.

I'm using preg_match with

/{{(\w+)\|(\w+)(?:\|(.+))*}}/

but I get something like this whenever I use more than two args:

Array
(
    [0] => {{name|arg1|arg2|arg3|arg4}}
    [1] => name
    [2] => arg1
    [3] => arg2|arg3|arg4
)

The first two items cannot contain spaces, the rest can. Perhaps I've been working too long on this, but I can't find the error. Any help would be greatly appreciated ;)

Thanks in advance, Jan

+4  A: 

Don't use regular expressions for this kind of simple task. What you really need is:

$inner = substr($string, 2, -2);
$parts = explode('|', $inner);

# And if you want to make sure the string has opening/closing braces,
# check the original $string (not $inner, which no longer has them):
$length = strlen($string);
assert($string[0] === '{');
assert($string[1] === '{');
assert($string[$length - 1] === '}');
assert($string[$length - 2] === '}');
soulmerge
Okay, I have to clarify then: I'm trying to match an unknown number of said expressions in an HTML template page, so a simple substr is not a possibility... but you gave me an idea ;) I'm going to use the regex to find {{([^}]+)}} and then continue with the explode. Thanks!
Jan
That's a rather lazy answer, and it turns a simple task into writing a million checks if you really want to be sure. And instead of stripping the brackets with substr, why not do a str_replace( array( '{','}' ), '', $string )?
Polygraf
This answer was written before the clarification. And using str_replace() does not let you **verify** that the brackets are in the string.
soulmerge
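The two-step idea from the comment above (find each {{...}} block with a regex, then explode it) could be sketched roughly as follows. The template text and the $tags array layout are made up for illustration, and this sketch does not enforce the rule that the first two items contain no spaces:

```php
<?php
// Hypothetical template text; tag names and arguments are invented.
$html = '<p>{{name|arg1|arg two}}</p><div>{{other|x}}</div>';

// Step 1: find every {{...}} block; group 1 is the text between the braces.
preg_match_all('/\{\{([^}]+)\}\}/', $html, $matches);

// Step 2: split each block on "|" into a name followed by its arguments.
$tags = array();
foreach ($matches[1] as $inner) {
    $parts = explode('|', $inner);
    $tags[] = array(
        'name' => array_shift($parts), // first piece is the name
        'args' => $parts,              // whatever is left are the arguments
    );
}

print_r($tags);
```

If the no-spaces rule matters, each name and first argument can be checked afterwards, for example with preg_match('/^\w+$/', ...).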
A: 

Should work for anywhere from 1 to N arguments

<?php

$pattern = "/^\{\{([a-z]+)(?:\}\}$|(?:\|([a-z]+))(?:\|([a-z ]+))*\}\}$)/i";

$tests = array(
    "{{name}}"                          // should pass
  , "{{name|argOne}}"                   // should pass
  , "{{name|argOne|arg Two}}"           // should pass
  , "{{name|argOne|arg Two|arg Three}}" // should pass
  , "{{na me}}"                         // should fail
  , "{{name|arg One}}"                  // should fail
  , "{{name|arg One|arg Two}}"          // should fail
  , "{{name|argOne|arg Two|arg3}}"      // should fail
  );

foreach ( $tests as $test )
{
  if ( preg_match( $pattern, $test, $matches ) )
  {
    echo $test, ': Matched!<pre>', print_r( $matches, 1 ), '</pre>';
  } else {
    echo $test, ': Did not match =(<br>';
  }
}
Peter Bailey
+3  A: 

The problem is here: \|(.+)

Quantifiers such as + are greedy by default: they match as many characters as possible. Since . matches any character, the later | separators are swallowed too, which is not what you want.

To prevent this, exclude | from the characters the group may match ("match anything except |"), which gives \|([^\|]+).
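A small comparison of the two sub-patterns on their own, as a sketch; the $subject string is just the tail of the question's example:

```php
<?php
$subject = '|arg2|arg3|arg4';

// Greedy dot: the group runs to the end of the string, pipes included.
preg_match('/\|(.+)/', $subject, $greedy);

// Negated class: the group stops just before the next pipe.
preg_match('/\|([^|]+)/', $subject, $bounded);

echo $greedy[1], "\n";   // arg2|arg3|arg4
echo $bounded[1], "\n";  // arg2
```

Inside a character class the | needs no backslash, so [^|] and [^\|] are equivalent.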

MaxVT
A: 

Of course you would get something like this :) A regular expression cannot return a dynamic number of captured groups, which is what you would need for the arguments in your case.

Looking at what you want to do, you should keep the current regular expression, explode the extra args on '|', and add them to an args array.
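A rough sketch of that suggestion, using the pattern from the question (braces escaped here for readability) and the question's example string; the variable names are arbitrary:

```php
<?php
$string  = '{{name|arg1|arg2|arg3|arg4}}';
$pattern = '/\{\{(\w+)\|(\w+)(?:\|(.+))*\}\}/';

$name = null;
$args = array();

if (preg_match($pattern, $string, $m)) {
    $name = $m[1];
    $args = array($m[2]); // first argument, no spaces allowed by \w+

    // Group 3 is only set when there are more than one argument;
    // it holds the rest as a single "a|b|c" string.
    if (isset($m[3])) {
        $args = array_merge($args, explode('|', $m[3]));
    }
}

print_r($args);
```

One caveat: when several tags sit on the same line of a page, the greedy .+ can run across them, so a narrower class such as [^}]+ may be safer there.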

bisko
A: 

Indeed, this is from the PCRE manual:

When a capturing subpattern is repeated, the value captured is the substring that matched the final iteration. For example, after (tweedle[dume]{3}\s*)+ has matched "tweedledum tweedledee" the value of the captured substring is "tweedledee". However, if there are nested capturing subpatterns, the corresponding captured values may have been set in previous iterations. For example, after /(a|(b))+/ matches "aba" the value of the second captured substring is "b".
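The behaviour described in the manual can be observed with a short script. This is only an illustration: the first pattern is a variant of the question's pattern with [^|}]+ in place of .+, not something from the manual:

```php
<?php
$string = '{{name|arg1|arg2|arg3|arg4}}';

// Group 3 sits inside a repeated non-capturing group.
preg_match('/\{\{(\w+)\|(\w+)(?:\|([^|}]+))*\}\}/', $string, $m);
echo $m[3], "\n"; // arg4: only the final iteration is kept

// The manual's own example behaves the same way.
preg_match('/(tweedle[dume]{3}\s*)+/', 'tweedledum tweedledee', $t);
echo $t[1], "\n"; // tweedledee
```

So even with the separator excluded from the group, a single preg_match call yields one value per capturing group, not one per argument.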
