I have a script that recursively scans a directory, pulls class names out of PHP files, and stores those class names in an array. It works nicely, even on the rather large Zend Framework library folders.

The issue is that classes that extend other classes are not being included in the array.

Here is my current preg_match:

if (preg_match("/class\s*(\w*)\s*\{/i",strip_comments(file_get_contents($file)),$matches)) $classes[] = $matches[1];

I know that the last \s* is not right; there should be something there that can catch "{" or " extends Some_Other_Class {" .

+2  A: 

Try:

/^class ([a-zA-Z0-9_]+)/
Delan Azabani
Not exhaustive, but should get most of the sane ones. See the note here: http://us2.php.net/manual/en/userlandnaming.php
banzaimonkey
Ah, thanks, I tried to keep it as simple as possible to avoid errors in my regex.
Delan Azabani
Note that your regex will only return the first character of the class name.
icio
Whoops! Made a mistake there... I'll fix it.
Delan Azabani
Do I necessarily need the start-of-string anchor (^)? The file usually has a lot of content before the class declaration, such as comments (which are being stripped) and require statements. If I understand regex correctly, this will match only if "class" is the very first word in the entire file. Is that correct?
talentedmrjones
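On the ^ question above: in PCRE, ^ without the m (multiline) modifier matches only at the very start of the subject string, so a file with anything before the class declaration will not match. With the m modifier, ^ also matches right after each newline. A minimal sketch, assuming a made-up file body in $source (the file contents and class names are hypothetical):

```php
<?php
// Hypothetical file contents: a require statement precedes the class declaration.
$source = "<?php\nrequire_once 'Zend/Loader.php';\n\nclass My_Class extends Some_Other_Class\n{\n}\n";

// Without the m modifier, ^ matches only at the start of the whole string,
// so this fails because the string begins with "<?php".
$withoutM = preg_match('/^\s*class\s+(\w+)/', $source, $a);

// With the m modifier, ^ also matches immediately after each newline,
// so a class declaration that starts a line is found.
$withM = preg_match('/^\s*class\s+(\w+)/m', $source, $b);

var_dump($withoutM); // int(0)
var_dump($withM);    // int(1)
echo $b[1], "\n";    // My_Class
```

So the anchor can stay, as long as the m modifier is added and class declarations start at the beginning of a line (leading whitespace aside).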
A: 

Your pattern should simply take the first word following the class keyword as the class name. Your current pattern instead looks for a single word between the class keyword and the opening brace {. That fails when a class extends another, because there is more than a single word between those delimiters, so the pattern does not match.

Here's a pattern to try out:

/^\s*class\s+([a-zA-Z0-9_]+)/
icio
I've tried that, but in some cases the word "class" appears within the message passed to a thrown exception. I'm using the "{" to ensure I'm capturing the right word after "class".
talentedmrjones
The class name won't be the whole matched string, but the first group. To access it, call `preg_match_all` to populate `$matches`, then read the class names with `foreach($matches[1] as $className) echo $className, "\n";`. Further, could you provide some test cases and your implementation of the code in case there are other problems?
icio
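To make the `preg_match_all` suggestion concrete, here is a rough sketch that keeps the "{" check from the question but allows optional extends and implements clauses before it. The sample source and class names are invented for illustration. The pattern assumes comments are already stripped, that declarations start a line, and that parent and interface names contain no namespace separators.

```php
<?php
// Hypothetical source, comments already stripped. Note the word "class"
// inside the exception message, which must not be picked up.
$source = <<<'PHP'
abstract class Base_Thing
{
    public function fail()
    {
        throw new Exception("Unknown class name {given}");
    }
}
final class Child_Thing extends Base_Thing implements Countable, Serializable
{
}
PHP;

// ^ with the m modifier anchors at the start of each line; the optional groups
// skip over "extends X" and "implements A, B" before the opening brace.
$pattern = '/^\s*(?:(?:abstract|final)\s+)*class\s+(\w+)'
         . '(?:\s+extends\s+\w+)?'
         . '(?:\s+implements\s+[\w\s,]+)?'
         . '\s*\{/m';

preg_match_all($pattern, $source, $matches);

// Group 1 holds the class names, one entry per match.
$classes = $matches[1];
foreach ($classes as $className) {
    echo $className, "\n";
}
```

The line anchor is what keeps the exception message out: the throw line begins with "throw", not "class", so the pattern never starts matching there.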
A: 

I ended up using this for each PHP file in the include path:

$handle = @fopen($path.'/'.$dir, "r");
$stop=false;
if ($handle)
{
    while (!$stop&&!feof($handle))
    {
        $line = fgets($handle, 4096);
        $matches=array();
        if (preg_match('#^(\s*)((?:(?:abstract|final|static)\s+)*)class\s+'.$input.'([-a-zA-Z0-9_]+)(?:\s+extends\s+([-a-zA-Z0-9_]+))?(?:\s+implements\s+([-a-zA-Z0-9_,\s]+))?#',$line,$matches))
        {
            $stop=true;
            $classes[]=$matches[3];
        }
    }
    fclose($handle);
}

Seems to work pretty well. I found it in another Coda plugin that does something similar. The only catch is that it seems to hang sometimes; I'm not sure whether that's a bug or it's just being slow.

talentedmrjones