Hey guys, question about regex in PHP!

For the given pattern, kinda like a shell terminal syntax:

application>function -arg1 value -arg2 value -arg3 value -arg4 value

I want to parse the arguments. This is my regex code:

$command=' -arg1 value -arg2 value -arg3 value -arg4 value ';

// note that the string begins and ends with a space character
// now I'm trying to parse the arguments

$cmd->arguments=new \stdClass();

preg_replace('`\s-(.*)\s([a-zA-Z0-9-_]*)\s`Ue',
'$cmd->arguments->$1="$2";',$command);

// this regex only picks up every other match and returns

$cmd->arguments=stdClass(

    [arg1]=>value,
    [arg3]=>value

)

arg2 and arg4 are skipped. Any idea why? Thanks in advance!

A: 

Use getopt() http://www.tuxradar.com/practicalphp/21/2/4

berkes
This may be a good alternative, but why recommend another approach? All OP asked for was why regex doesn't work -- that does not mean that this is the exact way it was intended to be used. There may be other valid reasons for doing it this way, which were hidden when the test case was written.
MJB
Indeed. I only considered his example, which would be very easily achievable with getopt().
berkes
Actually, I am working with code from somebody else, and its syntax is similar to shell syntax but differs in some cases. Thanks though!
fabjoa
+1  A: 

To answer your question: your regex has a \s at both the start and the end, and every match consumes the characters it matches. The first match (arg1) therefore swallows the space in front of -arg2 as its trailing \s, so the next place where \s- can match is in front of -arg3.

It also might be easier to just trim() the string and then explode() it at the spaces (or preg_split() it if the spacing is irregular; the old split() function was removed in PHP 7).
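A minimal sketch of the trim-and-split idea, assuming the tokens strictly alternate between -name and value. The function name parseArgsBySplitting is hypothetical, not something from the question:

```php
<?php
// Hypothetical helper: splits the command on whitespace instead of using
// a capturing regex. Assumes tokens alternate as "-name value -name value".
function parseArgsBySplitting(string $command): array
{
    // Split on runs of whitespace; PREG_SPLIT_NO_EMPTY drops empty tokens.
    $tokens = preg_split('/\s+/', trim($command), -1, PREG_SPLIT_NO_EMPTY);

    $args = [];
    for ($i = 0; $i + 1 < count($tokens); $i += 2) {
        // Strip the leading dash from the name and pair it with the next token.
        $args[ltrim($tokens[$i], '-')] = $tokens[$i + 1];
    }
    return $args;
}

$args = parseArgsBySplitting(' -arg1 value -arg2 value -arg3 value -arg4 value ');
```

This only holds up while no value contains a space and every argument has a value; anything closer to real shell quoting would need a proper tokenizer.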

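Edit: By the way, removing the \s at the end should solve your problem. Since the U modifier makes the last group lazy, it would then match nothing once it is the final item in the pattern, so it is safer to turn the trailing \s into a lookahead (?=\s), which checks for the space without consuming it.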
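To see the difference, here is a rough sketch that counts the matches of the original pattern and then parses with the trailing \s turned into a lookahead. It uses preg_match_all() instead of the e modifier, which was removed in PHP 7. The function name parseArgs is hypothetical, and the character class is written as [a-zA-Z0-9_-] so that the hyphen is unambiguous:

```php
<?php
// Hypothetical helper: same idea as the question's regex, but the trailing
// space is only checked with a lookahead, so it is not consumed and stays
// available as the leading \s of the next match.
function parseArgs(string $command): stdClass
{
    $args = new stdClass();
    preg_match_all(
        '/\s-(.*?)\s([a-zA-Z0-9_-]*)(?=\s)/',
        $command,
        $matches,
        PREG_SET_ORDER
    );
    foreach ($matches as $m) {
        $args->{$m[1]} = $m[2]; // $m[1] is the name, $m[2] is the value
    }
    return $args;
}

$command = ' -arg1 value -arg2 value -arg3 value -arg4 value ';

// Original shape of the pattern: the trailing \s is consumed,
// so only every other argument is found.
$consuming = preg_match_all('/\s-(.*?)\s([a-zA-Z0-9_-]*)\s/', $command);

$args = parseArgs($command);
```

Here $consuming counts the matches of the consuming pattern, while $args should hold all four arguments.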

jeroen
Cool, makes sense. I was really lost in the regex logic for a while... Thanks!
fabjoa
You're welcome!
jeroen
A: 

As jeroen said, your specific issue is the \s at the beginning and end of your regex.

It is easy to rewrite this regex so that the spaces are not needed at all, except in between the arg and value. Consider this regex:

-(.*?)\s+([a-zA-Z0-9-_]*)

Matches:

    -arg1 value -arg2 value -arg3 value  -arg4  value  //spaces before and between...
-arg1 value    -arg2 value -arg3 value   -arg4    value //no lead spaces...      
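A possible way to apply this pattern in PHP, as a sketch only: preg_match_all() with PREG_SET_ORDER collects every name/value pair. The function name parseArgsLoose is hypothetical, and the hyphen in the character class has been moved to the end so it cannot be read as a range:

```php
<?php
// Hypothetical helper built around the pattern above. No leading or
// trailing whitespace is required; \s+ tolerates several spaces between
// the name and its value.
function parseArgsLoose(string $command): array
{
    preg_match_all(
        '/-(.*?)\s+([a-zA-Z0-9_-]*)/',
        $command,
        $matches,
        PREG_SET_ORDER
    );

    $args = [];
    foreach ($matches as $m) {
        $args[$m[1]] = $m[2]; // name => value
    }
    return $args;
}

$withLeadingSpaces = parseArgsLoose('    -arg1 value -arg2 value -arg3 value  -arg4  value  ');
$noLeadingSpaces   = parseArgsLoose('-arg1 value    -arg2 value -arg3 value   -arg4    value');
```

One caveat: because the value class allows a hyphen, an argument with a missing value would pick up the next -name as its value, so this assumes every argument really has one.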
drewk