Hi

I am using the following code to match all variables in a script starting with '$'. However, I would like the results to contain no duplicates, i.e. to be distinct/unique:

preg_match_all('/\$[a-zA-Z0-9]+/', $code, $variables);

Any advice?

+2  A: 

Use array_unique to remove the duplicates from your output array:

preg_match_all('/\$[a-zA-Z0-9]+/', $code, $variables);
$variables = array_unique($variables[0]);

But I hope you’re not trying to parse PHP with that. Use token_get_all to get the tokens of the given PHP code.
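For the case where the input really is PHP source, here is a minimal sketch of the token_get_all approach. The sample `$code` string is made up for illustration, and using array keys for de-duplication is just one possible choice:

```php
<?php
// Hypothetical input; in the question this would be the contents of $code.
$code = '<?php $a = 1; $b = $a + 2; // $not a var' . "\n" . 'echo "text"; $a++;';

$seen = [];
foreach (token_get_all($code) as $token) {
    // Tokens are either single-character strings or [id, text, line] arrays.
    if (is_array($token) && $token[0] === T_VARIABLE) {
        $seen[$token[1]] = true; // array keys are unique by construction
    }
}
$variables = array_keys($seen);
```

Because the tokenizer classifies the comment as a comment token rather than as code, `$not` is never reported as a variable here.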

Gumbo
I agree that this is the easiest solution, although I am curious whether there is a way to do this in the regex itself, so as not to turn this O(1) method into O(N)...
icco
@icco: I’m not sure how `array_unique` is implemented. But I guess they’re using a hash table for the lookup, and that means O(1) per lookup.
Gumbo
Ahh, that would make sense. Good point.
icco
Cheers, I did see that PHP function, but did not notice that I needed the [0] due to the large result set. Doh! I am using this to obfuscate JS, as Google Closure Compiler in advanced mode does not allow some of the coding practices in my JS, namely the use of 'this' in static methods.
Gavin
@icco, sure you could do it in regex, but the lookahead will make it a nightmare (performance-wise). This should do it: `(\$[a-zA-Z0-9]+)(?!.*\1)`. Also, each variable would be matched at its last occurrence, so the order in the `$variables` array would differ from the order of first appearance.
Bart Kiers
+1  A: 

Try the following code:

preg_match_all('/\$[a-zA-Z0-9]+/', $code, $variables);
$variables = array_unique($variables[0]); // index 0 holds the full-pattern matches
pako
+1  A: 

Don't do that with regex. After you have collected them all in $variables, simply filter them using normal programming logic/operations, for example with array_unique as Gumbo mentioned.
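A minimal sketch of that post-filtering step, assuming a made-up sample input and a `$matches` name chosen here for clarity. Note that array_unique keeps the original keys, so array_values is added to re-index the result:

```php
<?php
// Hypothetical input standing in for the script being processed.
$code = 'var $a = $b + $a; $c = $b;';

preg_match_all('/\$[a-zA-Z0-9]+/', $code, $matches);

// $matches[0] holds every full match, duplicates included.
// array_unique keeps the first occurrence of each value (and its key);
// array_values then renumbers the keys from 0.
$variables = array_values(array_unique($matches[0]));
```

Without array_values the result would keep the keys 0, 1 and 3, which matters only if later code relies on consecutive indexes.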

Also, what will your regex do in these cases:

// this is $not a var
foo('and this $var should also not appear!');
/* and what about $this one? */

All three "variables" ($not, $var and $this) aren't variables, but will be matched by your regex.
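To see this for yourself, here is a small sketch that runs the pattern from the question over those three lines (the `$matches` name and the nowdoc wrapper are only for illustration):

```php
<?php
// The three example lines, kept verbatim in a nowdoc so nothing is interpolated.
$code = <<<'JS'
// this is $not a var
foo('and this $var should also not appear!');
/* and what about $this one? */
JS;

preg_match_all('/\$[a-zA-Z0-9]+/', $code, $matches);

// The regex has no notion of comments or strings, so all three are matched.
$falsePositives = $matches[0];
```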

Bart Kiers
Thanks for your reply, but this is not an issue in my case. My code is used to obfuscate JS before it is packed (which strips comments), and there aren't any instances of $ in strings.
Gavin
Okay, good to hear that, then you can safely use `preg_match_all`. I thought it worth mentioning just in case.
Bart Kiers