I'm working on a C project that has seen many different authors and many different documentation styles.

I'm a big fan of Doxygen and other documentation generation tools, and I would like to migrate this project to use one of these systems.

Is anybody aware of a tool that can scan source code comments for keywords like "Description", "Author", "File Name" and other sorts of context, and intelligently convert the comments to a standard format? If not, I suppose I could write a crazy script, or convert manually.

Thanks

+2  A: 

The only approach I can think of comes from O'Reilly's book on Lex and Yacc: chapter 2 has a section showing how to parse source code for comments, including both the // and /*..*/ forms, and output them on the command line. There's a link on the page for the examples. Download the file progs.zip; the file you're looking for is ch2-09.l, which needs to be built and can easily be modified to output the comments. That output can then be used in a script to filter out 'Name', 'Description' etc...
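For anyone without the book to hand, here is a minimal hand-written sketch of the same idea in plain C. It is not the ch2-09.l code from the book; `next_comment` and `print_comments` are hypothetical names made up for this illustration. It assumes the whole source file is already in memory as a NUL-terminated string, and it skips string and character literals so that a `//` inside a string is not mistaken for a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Find the next comment at or after `p`. On success, store where the
 * comment starts and how long it is, and return a pointer just past it.
 * Return NULL when no comment remains. (Hypothetical helper.) */
const char *next_comment(const char *p, const char **start, size_t *len)
{
    assert(p != NULL && start != NULL && len != NULL);

    while (*p) {
        if (p[0] == '/' && (p[1] == '/' || p[1] == '*')) {
            const char *s = p;
            if (p[1] == '/') {
                /* Line comment: runs to the end of the line. */
                while (*p && *p != '\n')
                    p++;
            } else {
                /* Block comment: runs to the closing delimiter. */
                p += 2;
                while (*p && !(p[0] == '*' && p[1] == '/'))
                    p++;
                if (*p)
                    p += 2;
            }
            *start = s;
            *len = (size_t)(p - s);
            return p;
        }
        if (*p == '"' || *p == '\'') {
            /* String or character literal: skip it, honouring escapes. */
            char quote = *p++;
            while (*p && *p != quote) {
                if (*p == '\\' && p[1])
                    p++;
                p++;
            }
            if (*p)
                p++;
        } else {
            p++;
        }
    }
    return NULL;
}

/* Print every comment in `src`, one per output line. */
void print_comments(const char *src)
{
    const char *s;
    size_t n;
    while ((src = next_comment(src, &s, &n)) != NULL)
        printf("%.*s\n", (int)n, s);
}
```

The sketch ignores some corner cases a real lexer would handle, such as backslash line continuations inside `//` comments and trigraphs, so treat it as a starting point rather than a replacement for a generated scanner.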

I can post instructions here on how to do this if you are interested.

Edit: I think I have found what you are looking for, a prebuilt comment documentation extractor here.

Hope this helps, Best regards, Tom.

tommieb75
A: 

I think, as tommieb75 suggests, a proper parser is the way to handle this.

I'd suggest looking at ANTLR, since it supports rewriting the token buffer in place, which I think would minimise what you have to do to preserve whitespace etc - see section 9.7 of The Definitive ANTLR Reference.

therefromhere
A: 

If you have a relatively limited set of styles to parse, it would be very simple to write a Visual Studio macro that searches a file for comments and then reformats them into a new style, using certain titles or tags to split them apart.
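The title-splitting step could look roughly like the following. This is a C sketch rather than a Visual Studio macro, and it is only an illustration: the `TAGS` table and the `convert_line` function are hypothetical, and the choice of which legacy title maps to which Doxygen command (`@brief`, `@author`, `@file`) is an assumption you would adjust for your own code base. It handles one comment line at a time and expects a "Title: text" layout.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mapping from legacy comment titles to Doxygen commands. */
struct tag_map { const char *title; const char *tag; };

static const struct tag_map TAGS[] = {
    { "Description", "@brief"  },
    { "Author",      "@author" },
    { "File Name",   "@file"   },
};

/* Case-insensitive test: does `s` start with `prefix`? */
static int starts_with_ci(const char *s, const char *prefix)
{
    while (*prefix) {
        if (tolower((unsigned char)*s) != tolower((unsigned char)*prefix))
            return 0;
        s++;
        prefix++;
    }
    return 1;
}

/* Rewrite one comment line such as " * Author: Jane" as " * @author Jane".
 * Lines without a known title are copied unchanged.
 * Returns 1 if a title was converted, 0 otherwise. */
int convert_line(const char *line, char *out, size_t out_size)
{
    assert(line != NULL && out != NULL && out_size > 0);

    /* Skip leading whitespace and comment decoration ("/", "*"). */
    const char *body = line;
    while (*body == ' ' || *body == '\t' || *body == '*' || *body == '/')
        body++;

    for (size_t i = 0; i < sizeof TAGS / sizeof TAGS[0]; i++) {
        if (!starts_with_ci(body, TAGS[i].title))
            continue;

        /* Require a colon after the title, allowing spaces around it. */
        const char *rest = body + strlen(TAGS[i].title);
        while (*rest == ' ' || *rest == '\t')
            rest++;
        if (*rest != ':')
            continue;
        rest++;
        while (*rest == ' ' || *rest == '\t')
            rest++;

        /* Keep the original decoration, then emit the tag and the text. */
        snprintf(out, out_size, "%.*s%s %s",
                 (int)(body - line), line, TAGS[i].tag, rest);
        return 1;
    }

    snprintf(out, out_size, "%s", line);
    return 0;
}
```

For example, `convert_line(" * Author: Jane Doe", out, sizeof out)` leaves `" * @author Jane Doe"` in `out`. Note that Doxygen only picks up blocks that open with a marker such as `/**` or `/*!`, so the comment opener still has to be changed separately; this sketch does not do that.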

This is what my free AtomineerUtils add-in does when updating existing doc comments. Unfortunately, it only parses a few specific XML and Doxygen-based forms of comment structure, so it won't currently handle your particular needs - but the principle is the same.

However, it could help you in a secondary sense - if you can convert the raw comments into a rough XML or doxygen-compatible format, you could use AtomineerUtils to then clean up the comments - it applies word wrapping, consistent element ordering and spacing etc automatically, and ensures that the comment accurately describes the code element it documents. So it could potentially save you doing a lot of work in the macro to get things tidy - and of course you can continue to use it so any new comments continue in the same style.

Jason Williams