Hi

I'm looking for a way of pattern matching the "geometry" of an array, the order in which the elements appear, not the contents of each element directly.

Let me outline what I mean by some examples. Given the target array:

array('T_STRING','T_VARIABLE','ASSIGN','T_STRING','LPAREN','T_VARIABLE','COMMA','T_VARIABLE','RPAREN');
// As a matter of fact, these would be the tokens for the PHP code "foo $var = Foo($arg1,$arg2)"

Then the following "pattern" would match, returning the 0-based indexes of the matches, as well as the indexes of the groupings, just like preg_match_all() would do for strings:

array('T_STRING', '?', '(', 'T_VARIABLE', 'ASSIGN', ')', '?',
    'T_STRING', 'LPAREN', '(', 'T_VARIABLE', 'COMMA', '?', ')', '?', 'RPAREN');

This is only a simplified PoC; the way I intend to use it is much more complicated. I don't want to use the full parser generator from PEAR (the Lemon port to PHP), which would be overkill.

Are you aware of a function (possibly not an internal PHP function) or project which does just that?

Thank you.

+1  A: 

Whenever I hear "pattern matching" I think "regex".

Flatten that array into a string and match it against the pattern you're looking for using a regex. You may be able to do a symbol replacement to keep the regex small and manageable:

Your array above could be reduced to a string like this:

$arrayPattern = 'SVASL_PVCVR_P';

Now you can use RegEx to match against it.

if (preg_match('/VA/', $arrayPattern)) 
  print "You've got a Variable followed by an Assign!";
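To also get array indexes back, one option is to encode every token as exactly one character, so that a byte offset in the string equals the token's index in the array. The following is only a minimal sketch under that assumption: the function name `match_token_geometry`, the symbol table (with single-character `L` and `R` instead of `L_P` and `R_P`), and the regex are all made up for illustration. The argument group is repeated with `*` rather than `?` (also an assumption) so that it covers both arguments of the example.

```php
<?php
// Hypothetical one-character alphabet; any distinct single bytes would do.
$symbols = array(
    'T_STRING'   => 'S',
    'T_VARIABLE' => 'V',
    'ASSIGN'     => 'A',
    'LPAREN'     => 'L',
    'RPAREN'     => 'R',
    'COMMA'      => 'C',
);

$tokens = array('T_STRING', 'T_VARIABLE', 'ASSIGN', 'T_STRING', 'LPAREN',
                'T_VARIABLE', 'COMMA', 'T_VARIABLE', 'RPAREN');

// Returns, for every match, a map of group number => array(first index, last index).
function match_token_geometry(array $tokens, array $symbols, $regex)
{
    // One character per token: string offsets are then token indexes.
    $encoded = '';
    foreach ($tokens as $token) {
        $encoded .= $symbols[$token];
    }

    // PREG_OFFSET_CAPTURE turns every capture into array(text, offset).
    preg_match_all($regex, $encoded, $matches, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);

    $result = array();
    foreach ($matches as $match) {
        $groups = array();
        foreach ($match as $group => $capture) {
            list($text, $offset) = $capture;
            if ($offset < 0 || $text === '') {
                continue; // group did not take part in the match, or matched nothing
            }
            $groups[$group] = array($offset, $offset + strlen($text) - 1);
        }
        $result[] = $groups;
    }
    return $result;
}

// Group 0 is the whole match, group 1 the optional assignment, group 2 the arguments.
print_r(match_token_geometry($tokens, $symbols, '/S?(VA)?SL((?:VC?)*)R/'));
```

For the example array this reports indexes 0..8 for the whole match, 1..2 for the assignment group and 5..7 for the argument group. With more token types than there are convenient single characters, the same idea would need a fixed-width code per token and a division of the offsets by that width.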

Just a thought....

ChronoFish
Certainly something I've already thought about, but how about the part "returning the 0-based indexes of the matches, as well as the indexes of the groupings", which I didn't write in bold text by mistake? ...
Flavius
As you may know, if you're going to use regexp, the third param in `preg_match($expression, $content, $matches)` can return the matched components. You may have to do some extra work from there, but it should give you a decent foothold. However, in the long run, I'm not sure regexp is the best way to go about this.
Justin Johnson
+1  A: 

If you are looking for code analysis, then these slides by Sebastian Bergmann might be of use to you. Starting with slide 17, there are examples of analysis by tokens.
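For a rough idea of what analysis by tokens can look like, here is a small sketch using PHP's built-in tokenizer functions `token_get_all()` and `token_name()`. It is not taken from the slides; the helper name `token_names` is made up for illustration. Note that the tokenizer returns punctuation such as `=`, `(` and `,` as plain one-character strings, not as names like `ASSIGN` or `LPAREN`.

```php
<?php
// Turn PHP source into a flat list of token names, skipping whitespace.
function token_names($source)
{
    $names = array();
    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            // Named tokens come back as array(id, text, line).
            if ($token[0] === T_WHITESPACE) {
                continue;
            }
            $names[] = token_name($token[0]);
        } else {
            // Single-character tokens come back as plain strings.
            $names[] = $token;
        }
    }
    return $names;
}

print_r(token_names('<?php $var = foo($arg1, $arg2);'));
```

The resulting list (`T_OPEN_TAG`, `T_VARIABLE`, `=`, `T_STRING`, `(`, ...) is the kind of array whose "geometry" the question wants to match.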

Gordon
Thanks, but 404
Flavius
Weird. It's the correct URL. I wrapped it in a tiny URL and it should be working now. Does it?
Gordon
Yep, now it does.
Flavius