Does anybody know of a Python module for parsing a Doxygen-style C++ comment string? I mean a string like this (simple example):

  /**
   * A constructor.
   * A more elaborate description of the constructor.
   * @param param1 test1
   * @param param2 test2
   */

I would like to extract the brief description, the detailed description, the parameters, the return value, etc. I'm currently doing this with string methods and regular expressions, but my solution is not very robust. Alternatively, can anybody recommend an easy-to-use Python parser library that I can set up quickly?

Thanks in advance

+1  A: 

You should take a look at how Doxygen itself is implemented to see how it handles parsing. I very much doubt it uses regular expressions.

Taybin
I did, and this was actually the motivation for this question. It uses lex and a 179 KB source file to generate a lexer. I'm looking for a simpler solution here (hence my additional question about a simple Python parser library).
Sebastian
+1  A: 

You might be able to set something up using the SimpleParse module, but this requires writing an EBNF grammar, which might be more of an investment than you are interested in.
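If a full grammar is too much, a line-based pass may be enough for simple comments like the one in the question. The following is only a sketch using the standard library (not SimpleParse). The function names and the result layout are made up for illustration, and it assumes that the first free-text line is the brief description and that each command fits on one line, which is simpler than what Doxygen actually does.

```python
import re

# A Doxygen command at the start of a line, e.g. "@param[in] name text"
# or "\return text". The optional "[in]"/"[out]" direction is skipped.
_COMMAND = re.compile(r"^[@\\](\w+)(?:\[[^\]]*\])?\s*(.*)$")


def strip_comment_markers(comment):
    """Drop the /** and */ delimiters and the leading '*' of each line."""
    body = comment.strip()
    if body.startswith("/**"):
        body = body[3:]
    if body.endswith("*/"):
        body = body[:-2]
    lines = []
    for line in body.splitlines():
        line = line.strip()
        if line.startswith("*"):
            line = line[1:].strip()
        lines.append(line)
    return lines


def parse_doxygen_comment(comment):
    """Return a dict with 'brief', 'details', 'params' and 'returns'.

    Assumption of this sketch: commands other than brief, param and
    return/returns are ignored, and continuation lines are not handled.
    """
    result = {"brief": "", "details": "", "params": {}, "returns": ""}
    free_text = []
    for line in strip_comment_markers(comment):
        if not line:
            continue
        match = _COMMAND.match(line)
        if match is None:
            free_text.append(line)
            continue
        command, rest = match.groups()
        if command == "param":
            name, _, description = rest.partition(" ")
            result["params"][name] = description.strip()
        elif command in ("return", "returns"):
            result["returns"] = rest
        elif command == "brief":
            result["brief"] = rest
    # Without an explicit @brief, treat the first free-text line as the brief.
    if not result["brief"] and free_text:
        result["brief"] = free_text.pop(0)
    result["details"] = " ".join(free_text)
    return result


EXAMPLE = """
  /**
   * A constructor.
   * A more elaborate description of the constructor.
   * @param param1 test1
   * @param param2 test2
   */
"""

parsed = parse_doxygen_comment(EXAMPLE)
print(parsed["brief"])
print(parsed["params"])
```

This will break on multi-line parameter descriptions, nested commands and the `///` or `//!` comment styles, so it is a starting point rather than a robust replacement for a real grammar.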

The Sphinx/Doxygen bridge (Breathe) uses the XML output of Doxygen and acts on that instead. Perhaps a similar approach could work here: run Doxygen to extract XML-formatted docs and then leverage some of the code from Breathe to get at the data you require.
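As a rough illustration of the XML route, the sketch below reads member documentation with the standard library's `xml.etree.ElementTree` rather than with Breathe's own code. The `SAMPLE` string is hand-written to approximate what Doxygen emits with `GENERATE_XML = YES`; the exact element layout can vary between Doxygen versions, so treat it as an assumption and check it against your real output. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written sample that approximates Doxygen's XML output (an assumption).
SAMPLE = """<doxygen>
  <compounddef kind="class">
    <sectiondef kind="public-func">
      <memberdef kind="function">
        <name>Test</name>
        <briefdescription><para>A constructor.</para></briefdescription>
        <detaileddescription>
          <para>A more elaborate description of the constructor.
            <parameterlist kind="param">
              <parameteritem>
                <parameternamelist>
                  <parametername>param1</parametername>
                </parameternamelist>
                <parameterdescription><para>test1</para></parameterdescription>
              </parameteritem>
              <parameteritem>
                <parameternamelist>
                  <parametername>param2</parametername>
                </parameternamelist>
                <parameterdescription><para>test2</para></parameterdescription>
              </parameteritem>
            </parameterlist>
          </para>
        </detaileddescription>
      </memberdef>
    </sectiondef>
  </compounddef>
</doxygen>"""


def _flatten(element):
    """Join all text below an element and normalize whitespace."""
    if element is None:
        return ""
    return " ".join("".join(element.itertext()).split())


def summarize_members(xml_text):
    """Collect name, brief, details and params for every memberdef."""
    root = ET.fromstring(xml_text)
    members = []
    for member in root.iter("memberdef"):
        # Only the text that precedes nested elements counts as details here,
        # so parameter descriptions are not repeated in the details string.
        details = [
            " ".join((para.text or "").split())
            for para in member.findall("detaileddescription/para")
        ]
        params = {}
        for item in member.iter("parameteritem"):
            name = item.findtext("parameternamelist/parametername", default="")
            params[name.strip()] = _flatten(item.find("parameterdescription"))
        members.append({
            "name": member.findtext("name", default=""),
            "brief": _flatten(member.find("briefdescription")),
            "details": " ".join(d for d in details if d),
            "params": params,
        })
    return members


members = summarize_members(SAMPLE)
print(members[0]["brief"])
print(members[0]["params"])
```

For real projects the input would be the per-class or per-file XML files in Doxygen's output directory, read with `ET.parse(path)` instead of the embedded string.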

Mark Streatfield