Hi all,

I am absolutely useless at regular expressions, so I'd appreciate your help.

I have a string, such as this:

$foo = 'Hello __("How are you") I am __("very good thank you")';

I know it's a strange string, but stay with me please :P

I need a regular expression that finds the content inside __("Look for content here") and puts it in an array.

That is, the regular expression would find "How are you" and "very good thank you".

Thanks very much.

+4  A: 

Try this:

preg_match_all('/(?<=__\(").*?(?="\))/s', $foo, $matches);
print_r($matches);

which means:

(?<=     # start positive look behind
  __\("  #   match the characters '__("'
)        # end positive look behind
.*?      # match any character and repeat it zero or more times, reluctantly
(?=      # start positive look ahead
  "\)    #   match the characters '")'
)        # end positive look ahead

EDIT

And as Greg mentioned: for someone not too familiar with look-arounds, it might be more readable to leave them out. You then match everything: __(", the string and "), and wrap the part of the regex that matches the string, .*?, inside parentheses to capture only those characters. You will then need to get your matches through $matches[1]. A demo:

preg_match_all('/__\("(.*?)"\)/', $foo, $matches);
print_r($matches[1]);
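
As a quick check that the two approaches agree, here is a minimal sketch run against the sample string from the question. The variable names $lookaround and $grouped are only illustrative.

```php
<?php
// Sample string from the question.
$foo = 'Hello __("How are you") I am __("very good thank you")';

// Look-around version: the full match ($matches[0]) is already just the content.
preg_match_all('/(?<=__\(").*?(?="\))/s', $foo, $lookaround);

// Capture-group version: the content ends up in group 1.
preg_match_all('/__\("(.*?)"\)/', $foo, $grouped);

// Both lists hold the same two strings in the same order.
var_dump($lookaround[0] === $grouped[1]);
print_r($grouped[1]);
```

So the only practical difference for this input is where the results live: $matches[0] with look-arounds, $matches[1] with the capture group.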
Bart Kiers
You sir are AMAZING.
Jamie
Wouldn't it be simpler to use `/__\("(.*?)"\)/` and then extract the matching group? I always find those lookbehind and lookahead matches hard to read.
Greg Hewgill
Why, thank you Jamie, nice to know there is at least one more person who thinks I am, besides my 2.5-year-old son! :)
Bart Kiers
@Greg Hewgill, yes, that is also an option. Perhaps one preferable for Jamie. I'll edit shortly.
Bart Kiers
@anomareh, yes, I wrote a little tool myself that spits out such an explanation.
Bart Kiers
@Greg Hewgill: Yes, it will *absolutely* be simpler and better to use your regular expression, because the look-behind assertion has to be tested at each position of the test string. I would also use a negated character class instead of a non-greedy universal character expression: `/__\("([^"]*)"\)/`.
Gumbo
@bart Ah before you formatted it, it looked like a jumbled mess from some web app :p Neat tool.
anomareh
+1  A: 

If you want to use Gumbo's suggestion, credit goes to him for the pattern:

$foo = 'Hello __("How are you")I am __("very good thank you")';

preg_match_all('/__\("([^"]*)"\)/', $foo, $matches);

Make sure to use $matches[1] for your results; $matches[0] holds the full matches, including the surrounding __(" and ").

var_dump() of $matches:

array
  0 => 
    array
      0 => string '__("How are you")' (length=17)
      1 => string '__("very good thank you")' (length=25)
  1 => 
    array
      0 => string 'How are you' (length=11)
      1 => string 'very good thank you' (length=19)
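
One thing to keep in mind when choosing between .*? and [^"]*: they behave differently when the wrapped text itself contains a double quote. The sketch below uses a made-up input ($bar is not from the question) to show this.

```php
<?php
// Hypothetical input with double quotes inside the wrapped text.
$bar = '__("say "hi" now")';

// Lazy dot: keeps expanding until the first '")', so it spans the inner quotes.
preg_match_all('/__\("(.*?)"\)/', $bar, $lazy);

// Negated class: can never cross a double quote, so this input does not match.
preg_match_all('/__\("([^"]*)"\)/', $bar, $negated);

print_r($lazy[1]);    // one entry: say "hi" now
print_r($negated[1]); // empty array
```

If your strings never contain a double quote, as in the question, both patterns return the same results.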
anomareh