I'm looking for a ready-made grammar and parser for PHP (at least 5.2), ideally a utility or library that can parse PHP code into a readable AST, e.g. XML. The parser itself doesn't have to be written in PHP; the implementation language doesn't matter much.

A: 

phpParseTree

The Parse_Tree extension generates an XML parse tree from PHP code.

Sjoerd
This appears to produce only a stream of tokens wrapped in XML tags. It does not look like a parse tree in spite of what the target web site says.
Ira Baxter
no luck compiling this ;(
stereofrog
A: 

The PHP compiler is open source. Go to http://php.net and download the latest version of the source tree. You will find the parser in there.

Martin York
I don't believe it produces an AST.
Ira Baxter
I rather doubt it will. But it should provide exactly what is needed so you can build your own.
Martin York
All you need is a grammar you can trust, and a parse tree builder. That's a lot harder than it looks, partly because PHP is such a mess of a language. I've done this.
Ira Baxter
+1  A: 

Our DMS Software Reengineering Toolkit is generalized compiler technology used to parse, analyze, and transform arbitrary computer languages. It parses to ASTs, and has support for building symbol tables and various types of flow graphs.

It has a PHP Front End that is fully PHP 5.x compliant and automatically builds full ASTs, using DMS as a foundation. It can export XML, but our experience (and the design of DMS) says you get a lot more mileage by staying "inside" DMS with the AST data structure: do your work there, using DMS's huge library of AST manipulation and pattern-matching facilities, and then generate your result, rather than trying to handle the huge amounts of XML that you would otherwise get.

This front end has been used in a number of production tools.

Ira Baxter
thanks, but this looks too pricey for my needs
stereofrog
A: 

To answer my own question: I've managed to compile phc on my OS X box, and the parser part seems to work well.

 phc --dump-xml=ast foo.php > bar.xml

creates an XML representation of the AST.

stereofrog
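
For a quick look at an XML dump such as the bar.xml produced above, a small generic script may help. This is only a sketch: it assumes nothing about phc's actual element names or schema and simply walks whatever XML it is given. The helper names (`local_name`, `outline`, `tree_depth`) are made up for this illustration. The depth check is a rough way to tell a real tree from a flat token stream wrapped in XML tags, which was the complaint about the Parse_Tree output.

```python
import sys
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a namespace prefix such as '{uri}Name' down to 'Name'."""
    return tag.rsplit("}", 1)[-1]


def outline(element, depth=0, max_depth=None):
    """Yield one indented line per element, in document order."""
    if max_depth is not None and depth > max_depth:
        return
    yield "  " * depth + local_name(element.tag)
    for child in element:
        yield from outline(child, depth + 1, max_depth)


def tree_depth(element):
    """Return the nesting depth; a flat list of tokens under one root is 2."""
    return 1 + max((tree_depth(child) for child in element), default=0)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python outline.py bar.xml
    root = ET.parse(sys.argv[1]).getroot()
    print("nesting depth:", tree_depth(root))
    for line in outline(root, max_depth=4):
        print(line)
```

A nesting depth of 2 would suggest a flat token list, while a genuine AST for non-trivial code should nest considerably deeper. For very large dumps, recursion over the whole document may be slow or hit Python's recursion limit, so an iterative walk would be a reasonable alternative.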