I have several finished, older PHP projects with a lot of includes that I would like to document in javadoc/phpDocumentor style.

Working through each file manually, and being forced to do a code review alongside the documenting, would be the best thing. Because of time constraints, though, I am interested in tools that help me automate the task as much as possible.

The tool I am thinking about would ideally have the following features:

  • Parse a PHP project tree and tell me where there are undocumented files, classes, and functions/methods (i.e. elements missing the appropriate docblock comment)

  • Provide a reasonably easy way to add the missing docblocks by creating the empty structures and, ideally, opening the file in an editor (internal or external, I don't care) so I can put in the description.

Optional:

  • Automatic recognition of parameter types, return values and such. But that's not really required.

The language in question is PHP, though I could imagine that a C/Java tool might be able to handle PHP files after some tweaking.

Thanks for your great input!

+1  A: 

You can use PHP_CodeSniffer to test your code against a predefined set of coding guidelines. It will also check for missing docblocks and generate a report you can use to identify the files.

Cassy
This is a good starting point, I hadn't thought of Code Sniffer for this. Cheers!
Pekka
+17  A: 

I think PHP_Codesniffer can indicate when there is no docblock -- see the examples of reports on this page (quoting one of those) :

--------------------------------------------------------------------------------
FOUND 5 ERROR(S) AND 1 WARNING(S) AFFECTING 5 LINE(S)
--------------------------------------------------------------------------------
  2 | ERROR   | Missing file doc comment
 20 | ERROR   | PHP keywords must be lowercase; expected "false" but found
    |         | "FALSE"
 47 | ERROR   | Line not indented correctly; expected 4 spaces but found 1
 47 | WARNING | Equals sign not aligned with surrounding assignments
 51 | ERROR   | Missing function doc comment
 88 | ERROR   | Line not indented correctly; expected 9 spaces but found 6
--------------------------------------------------------------------------------

I suppose you could use PHP_CodeSniffer to at least get a list of all files/classes/methods that lack documentation; from what I remember, it can generate XML as output, which would be easier to parse with an automated tool -- that could be the first step of some docblock generator ;-)
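As a rough sketch of that first step, the snippet below reads a PHP_CodeSniffer XML report and keeps only the messages about missing doc comments. It assumes the report was produced with something like `phpcs --report=xml`, and that each message is an `<error>` or `<warning>` element carrying a `line` attribute inside a `<file name="...">` element. The function name `findMissingDocblocks` and the inlined sample report are hypothetical, made up for illustration.

```php
<?php
// Hypothetical sample shaped like a PHP_CodeSniffer XML report (assumption);
// in practice this string would be read from the file phpcs wrote.
$sampleReport = <<<'XML'
<phpcs version="x.y.z">
  <file name="/path/to/MyClass.php" errors="3" warnings="1">
    <error line="2" column="1">Missing file doc comment</error>
    <error line="20" column="9">PHP keywords must be lowercase; expected "false" but found "FALSE"</error>
    <warning line="47" column="5">Equals sign not aligned with surrounding assignments</warning>
    <error line="51" column="1">Missing function doc comment</error>
  </file>
</phpcs>
XML;

/**
 * Returns one entry per "Missing ... doc comment" message in the report.
 *
 * @param string $xml XML report content
 * @return array list of arrays with the keys file, line and message
 */
function findMissingDocblocks(string $xml): array
{
    $report = simplexml_load_string($xml);
    if ($report === false) {
        throw new InvalidArgumentException('The report is not valid XML');
    }

    $missing = [];
    foreach ($report->file as $file) {
        // children() yields both the <error> and the <warning> elements
        foreach ($file->children() as $message) {
            $text = trim((string) $message);
            $isMissingDoc = stripos($text, 'Missing') === 0
                && stripos($text, 'doc comment') !== false;
            if ($isMissingDoc) {
                $missing[] = [
                    'file'    => (string) $file['name'],
                    'line'    => (int) $message['line'],
                    'message' => $text,
                ];
            }
        }
    }
    return $missing;
}

foreach (findMissingDocblocks($sampleReport) as $entry) {
    echo $entry['file'], ':', $entry['line'], ' ', $entry['message'], PHP_EOL;
}
```

Matching on the message text is the simplest filter; if the report also carries a sniff identifier per message, filtering on that would probably be more robust.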


Also, if you are using phpDocumentor to generate the documentation, can't that tool report missing docblocks as well?

After a couple of tests, it can -- for instance, running it on a class-file with not much documentation, with the --undocumentedelements option, such as this :

phpdoc --filename MyClass.php --target doc --undocumentedelements

Gives this in the middle of the output :

Reading file /home/squale/developpement/tests/temp/test-phpdoc/MyClass.php -- Parsing file
WARNING in MyClass.php on line 2: Class "MyClass" has no Class-level DocBlock.
WARNING in MyClass.php on line 2: no @package tag was used in a DocBlock for class MyClass
WARNING in MyClass.php on line 5: Method "__construct" has no method-level DocBlock.
WARNING in MyClass.php on line 16: File "/home/squale/developpement/tests/temp/test-phpdoc/MyClass.php" has no page-level DocBlock, use @package in the first DocBlock to create one
done

But, here, too, even if it's useful as a reporting tool, it's not that helpful when it comes to generating the missing docblocks...


Now, I don't know of any tool that will pre-generate the missing docblocks for you : I generally use PHP_CodeSniffer and/or phpDocumentor in my continuous integration mechanism, it reports missing docblocks, and then each developer adds what is missing from their IDE...

... Which works pretty well : there is generally not more than a couple of missing docblocks every day, so the task can be done by hand (and Eclipse PDT provides a feature to pre-generate the docblock for a method, when you are editing a specific file/method).

Apart from that, I don't really know any fully-automated tool to generate docblocks... But I'm pretty sure we could manage to create an interesting tool, using either :


After a bit more searching, though, I found this blog-post (it's in French -- maybe some people here will be able to understand) : Ajout automatique de Tags phpDoc à l'aide de PHP_Beautifier.
(Possible translation of the title : "Automatically adding phpDoc tags, using PHP_Beautifier")

The idea is actually not bad :

  • The PHP_Beautifier tool is pretty nice and powerful when it comes to formatting PHP code that's not well formatted
    • I've used it many times for code that I couldn't even read ^^
  • And it can be extended, using what it calls "filters".

The idea that's used in the blog-post I linked to is to :

  • create a new PHP_Beautifier filter, that will detect the following tokens :
    • T_CLASS
    • T_FUNCTION
    • T_INTERFACE
  • And add a "draft" doc-block just before them, if there is not already one
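To make the detection step concrete, here is a minimal sketch that uses PHP's built-in tokenizer (`token_get_all`) instead of a PHP_Beautifier filter, so it is not the blog-post's code. It looks for the same three tokens and reports the ones that are not preceded by a docblock. The function name `findUndocumented` and the sample source are assumptions made for illustration.

```php
<?php
/**
 * Lists classes, interfaces and functions that have no docblock.
 *
 * @param string $source PHP source code, including the opening tag
 * @return array list of arrays with the keys type, name and line
 */
function findUndocumented(string $source): array
{
    $targets = [T_CLASS, T_FUNCTION, T_INTERFACE];
    // Tokens allowed between a docblock and the declaration keyword
    $skippable = [T_WHITESPACE, T_PUBLIC, T_PROTECTED, T_PRIVATE,
                  T_STATIC, T_ABSTRACT, T_FINAL];

    $tokens = token_get_all($source);
    $count = count($tokens);
    $found = [];

    foreach ($tokens as $i => $token) {
        if (!is_array($token) || !in_array($token[0], $targets, true)) {
            continue;
        }
        // Walk backwards over whitespace and modifiers
        $j = $i - 1;
        while ($j >= 0 && is_array($tokens[$j])
            && in_array($tokens[$j][0], $skippable, true)) {
            $j--;
        }
        $previous = ($j >= 0 && is_array($tokens[$j])) ? $tokens[$j][0] : null;
        if ($previous === T_DOC_COMMENT || $previous === T_DOUBLE_COLON) {
            continue; // already documented, or a Foo::class constant
        }
        // The name is the next T_STRING; closures have none
        $name = '(anonymous)';
        for ($k = $i + 1; $k < $count; $k++) {
            if ($tokens[$k] === '(' || $tokens[$k] === '{') {
                break;
            }
            if (is_array($tokens[$k]) && $tokens[$k][0] === T_STRING) {
                $name = $tokens[$k][1];
                break;
            }
        }
        $found[] = [
            'type' => token_name($token[0]),
            'name' => $name,
            'line' => $token[2],
        ];
    }
    return $found;
}

// Hypothetical sample: one undocumented class and method, one documented method
$sample = <<<'PHP'
<?php
class MyClass {
    public function __construct($myString, $myInt) {
    }

    /**
     * Method with some comment
     */
    public function doSomething(array $params = array()) {
    }
}
PHP;

foreach (findUndocumented($sample) as $item) {
    echo $item['type'], ' ', $item['name'], ' (line ', $item['line'], ')', PHP_EOL;
}
```

A filter that inserts a draft docblock would use the same check and then emit the comment text in front of the declaration.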


To run the tool on some MyClass.php file, I've had to first install PHP_Beautifier :

pear install --alldeps Php_Beautifier-beta

Then, download the filter to the directory I was working in (could have put it in the default directory, of course) :

wget http://fxnion.free.fr/downloads/phpDoc.filter.phpcs
cp phpDoc.filter.phpcs phpDoc.filter.php

And, after that, I created a new beautifier-1.php script (Based on what's proposed in the blog-post I linked to, once again), which will :

  • Load the content of my MyClass.php file
  • Instantiate PHP_Beautifier
  • Add some filters to beautify the code
  • Add the phpDoc filter we just downloaded
  • Beautify the source of our file, and echo it to the standard output.


The code of the beautifier-1.php script will look like this :
(Once again, the biggest part is a copy-paste from the blog-post ; I only translated the comments, and changed a couple of small things)

<?php
require_once 'PHP/Beautifier.php';

// Load the content of my source-file, with missing docblocks
$sourcecode = file_get_contents('MyClass.php');

$oToken = new PHP_Beautifier();

// The phpDoc.filter.php file is not in the default directory,
// but in the "current" one => we need to add it to the list of
// directories that PHP_Beautifier will search in for filters
$oToken->addFilterDirectory(dirname(__FILE__));

// Adding some nice filters, to format the code
$oToken->addFilter('ArrayNested');
$oToken->addFilter('Lowercase');
$oToken->addFilter('IndentStyles', array('style'=>'k&r'));

// Adding the phpDoc filter, asking it to add a license
// at the beginning of the file
$oToken->addFilter('phpDoc', array('license'=>'php'));

// The code is in $sourcecode
// We could also have used the setInputFile method,
// instead of having the code in a variable
$oToken->setInputString($sourcecode);
$oToken->process();

// And here we get the result, all clean !
echo $oToken->get();

Note that I also had to patch two small things in phpDoc.filter.php, to avoid a warning and a notice...
The corresponding patch can be downloaded there : http://extern.pascal-martin.fr/so/phpDoc.filter-pmn.patch


Now, if we run that beautifier-1.php script :

$ php ./beautifier-1.php

With a MyClass.php file that initially contains this code :

<?php
class MyClass {
    public function __construct($myString, $myInt) {
        // 
    }

    /**
     * Method with some comment
     * @param array $params blah blah
     */
    public function doSomething(array $params = array()) {
        // ...
    }

    protected $_myVar;
}

Here's the kind of result we get -- once our file is Beautified :

<?php
/**
 *
 * PHP version 5
 *
 * LICENSE: This source file is subject to version 3.0 of the PHP license
 * that is available through the world-wide-web at the following URI:
 * http://www.php.net/license/3_0.txt.  If you did not receive a copy of
 * the PHP License and are unable to obtain it through the web, please
 * send a note to license@php.net so we can mail you a copy immediately.
 * @category   PHP
 * @package
 * @subpackage Filter
 * @author FirstName LastName <mail>
 * @copyright 2009 FirstName LastName
 * @link
 * @license     http://www.php.net/license/3_0.txt  PHP License 3.0
 * @version    CVS: $Id:$
 */


/**
 * @todo Description of class MyClass
 * @author 
 * @version 
 * @package 
 * @subpackage 
 * @category 
 * @link 
 */
class MyClass {

    /**
     * @todo Description of function __construct
     * @param  $myString 
     * @param  $myInt
     * @return 
     */
    public function __construct($myString, $myInt) {
        //

    }
    /**
     * Method with some comment
     * @param array $params blah blah
     */
    public function doSomething(array $params = array()) {
        // ...

    }

    protected $_myVar;
}

We can note :

  • The license block at the beginning of the file
  • The docblock that's been added on the MyClass class
  • The docblock that's been added on the __construct method
  • The docblock on the doSomething method was already present in our code : it has not been removed.
  • There are some @todo tags ^^


Now, it's not perfect, of course :

  • It doesn't document all the stuff we could want it to
    • For instance, here, it didn't document the protected $_myVar
  • It doesn't enhance existing docblocks
  • And it doesn't open the file in any graphical editor
    • But that would be much harder, I guess...


But I'm pretty sure that this idea could be used as a starting point to something a lot more interesting :

  • About the stuff that doesn't get documented : adding new tags that will be recognized should not be too hard
    • You just have to add them to a list at the beginning of the filter
  • Enhancing existing docblocks might be harder, I have to admit
  • A nice thing is this could be fully-automated
  • Using Eclipse PDT, maybe this could be set as an External Tool, so we can at least launch it from our IDE ?
Pascal MARTIN
incredible answer .. +1
Tobias
+1, you must document a lot ... :)
Gaby
Great, great stuff @Pascal. I can't do a live implementation of this right now but at first glance, it seems to be doing what I need. I will be getting back to you with the results.
Pekka
@Pekka : thanks ! (wooo, +550 feels great ^^ ) If you use this and go further, one day, would you release the result as open source ? I'm pretty sure this would be useful to some developers (me included, actually ;-) )
Pascal MARTIN
@Pascal you're welcome, you deserve them! :) If something usable comes out of it, I definitely will publish the results.
Pekka
+1  A: 

php-tracer-weaver can instrument code and generate docblocks with the parameter types, deduced through runtime analysis.

troelskn
Ohh, this looks very interesting. I can't check it out right away but definitely will during the next week.
Pekka
+1  A: 

The 1.4.x versions of phpDocumentor have the -ue option (--undocumentedelements) [1], which will cause undocumented elements to be listed as warnings on the errors.html page that it generates during its doc run.

Further, PHP_DocBlockGenerator [2] from PEAR looks like it can generate missing docblocks for you.

[1] -- http://manual.phpdoc.org/HTMLSmartyConverter/HandS/phpDocumentor/tutorial_phpDocumentor.howto.pkg.html#using.command-line.undocumentedelements

[2] -- http://pear.php.net/package/PHP_DocBlockGenerator

ashnazg
A: 

We use PHP_CodeSniffer for this at work, with the PEAR or Zend coding standards. It will not allow you to edit the files on the fly, but it will definitely give you a list, with line numbers and a description of what kind of docblock is missing.

HTH, Jc

JC
A: 

Since PHPCS was already mentioned, I'll throw in the Reflection API as a way to check for missing DocBlocks. The article linked below is a short tutorial on how you could approach your problem:
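For a rough idea of what such a check can look like, here is a minimal sketch built on `ReflectionClass::getDocComment()` and `ReflectionMethod::getDocComment()`, which return `false` when no docblock is attached. It is not taken from the linked article; the helper name `missingDocblocks` and the `Example` class are made up for illustration.

```php
<?php
// Hypothetical class used only to exercise the check below.
class Example
{
    public function undocumented()
    {
    }

    /**
     * This one has a docblock.
     */
    public function documented()
    {
    }
}

/**
 * Lists the class and methods of $className that have no docblock.
 *
 * @param string $className name of a class that is already loaded
 * @return array human-readable descriptions of undocumented elements
 */
function missingDocblocks(string $className): array
{
    $class = new ReflectionClass($className);
    $missing = [];

    if ($class->getDocComment() === false) {
        $missing[] = 'class ' . $class->getName();
    }

    foreach ($class->getMethods() as $method) {
        // Skip methods inherited from a parent class
        if ($method->getDeclaringClass()->getName() !== $class->getName()) {
            continue;
        }
        if ($method->getDocComment() === false) {
            $missing[] = 'method ' . $class->getName() . '::' . $method->getName() . '()';
        }
    }
    return $missing;
}

foreach (missingDocblocks('Example') as $element) {
    echo $element, PHP_EOL;
}
```

Note that reflection needs the classes to be loaded, so every file has to be included first; for old projects with side effects at include time, a tokenizer-based approach may be safer.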

Gordon
A: 

You want to actually automate the problem of filling in the "javadoc" type data?

The DMS Software Reengineering Toolkit could be configured to do this.

It parses source text just like compilers do, builds internal compiler structures, lets you implement arbitrary analyses, make modifications to those structures, and then regenerate ("prettyprint") the source text changed according to the structure changes. It even preserves comments and formatting of the original text; you can of course insert additional comments and they will appear, which seems to be your primary goal. DMS does this for many languages, including PHP.

What you would want to do is parse each PHP file, locate every class/method, generate the "javadoc" comments that should accompany that entity (different for classes and methods, right?) and then check whether corresponding comments are actually present in the compiler structures. If not, simply insert them. PrettyPrint the final result. Because DMS has access to the compiler structures that represent the code, it shouldn't be difficult to generate parameter and return info, as you suggested. What it can't do, of course, is generate comments about intended purpose; but it could generate a placeholder for you to fill in later.

Ira Baxter
A: 

No idea if it's any help, but if Codesniffer can point out the functions/methods, then a decent PHP IDE (I offer PHPEd) can easily inspect and scaffold the PHPDoc comments for each function.

Simply type /** above each function and press ENTER, and PHPEd will auto-complete the docblock with the @param and @return tags filled in, ready for your extra descriptions. Here's the first one I tried in order to provide an example:

  /**
  * put your comment here...
  * 
  * @param mixed $url
  * @param mixed $method
  * @param mixed $timeout
  * @param mixed $vars
  * @param mixed $allow_redirects
  * @return mixed
  */
  public static function curl_get_file_contents($url, $method = 'get', $timeout = 30, $vars = array(), $allow_redirects = true)

This is easily tweaked to:

  /**
  * Retrieves a file using the cURL extension
  * 
  * @param string $url
  * @param string $method
  * @param int $timeout
  * @param array $vars parameters to pass to cURL
  * @param bool $allow_redirects whether to follow any redirects $url serves up
  * @return mixed
  */
  public static function curl_get_file_contents($url, $method = 'get', $timeout = 30, $vars = array(), $allow_redirects = true)  

Not exactly an automated solution, but quick enough for me as a lazy developer :)
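For anyone without such an IDE, a similar skeleton can be produced with PHP's Reflection API. The sketch below is only an approximation of that kind of scaffold, not how PHPEd works internally: the helper name `draftDocblock` is hypothetical, the stub function merely mirrors the signature above, and any parameter without a declared type is documented as `mixed`.

```php
<?php
// Stub with the same signature as the example above (body omitted on purpose).
function curl_get_file_contents($url, $method = 'get', $timeout = 30, $vars = array(), $allow_redirects = true)
{
}

/**
 * Builds a draft docblock for a function or method.
 *
 * @param ReflectionFunctionAbstract $function function or method to document
 * @return string the docblock text, ending with a newline
 */
function draftDocblock(ReflectionFunctionAbstract $function): string
{
    $lines = ['/**', ' * put your comment here...', ' *'];

    foreach ($function->getParameters() as $param) {
        // Use the declared type when there is a single named one, else "mixed"
        $type = $param->getType();
        $typeName = $type instanceof ReflectionNamedType ? $type->getName() : 'mixed';
        $lines[] = ' * @param ' . $typeName . ' $' . $param->getName();
    }

    $lines[] = ' * @return mixed';
    $lines[] = ' */';

    return implode("\n", $lines) . "\n";
}

echo draftDocblock(new ReflectionFunction('curl_get_file_contents'));
```

The descriptions and the real types still have to be filled in by hand, exactly as with the IDE-generated version.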

Raise