Hi,

I have a string that may or may not contain C++ comments (multi-line and single-line), and I need to strip those comments out before I can use the string. My current idea is to use an NSScanner: find the positions of the opening and closing multi-line comment delimiters and delete that portion of the string, then find each single-line comment opener and the EOL character that follows it and delete that portion as well.

Would you do it differently? What would be your approach? If it matters, the string can be several megabytes in size so performance is an issue.

+1  A: 

From this thread, I thought the best suggestion was to run the string through the C++ preprocessor.

mobrule
Though if the string also has `#include`, `#define`, `#etc` directives, there may be undesirable side-effects.
mobrule
My users may not have developer tools installed on their systems.
Rui Pacheco
A: 

Don't forget to keep track of quote marks, too. Test cases:

  • "/*Ceci n'est pas une commentaire*/"
  • '/**/' (Mac OS/Mac OS X OSType literal)
  • '//!\n'
  • "This string does not contain a // comment"

In all of these cases, you should not detect a comment.

The converse is also true:

  • //Ceci n'est pas une "string"
  • /*This comment does not contain an OS'Type' literal*/
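
One way to satisfy these cases is a single pass that tracks whether the scanner is currently in code, inside a quoted literal, or inside a comment. What follows is only a minimal C sketch under stated assumptions, not a definitive implementation: `strip_comments` is a hypothetical name, it edits a NUL-terminated buffer in place, and it does not handle raw string literals, digit separators such as `1'000`, or line comments continued with a trailing backslash.

```c
#include <stddef.h>

/* Hypothetical helper: removes C++ comments from s in place and returns the
   new length. Quote marks are tracked so that comment markers inside string
   or character literals are left alone. */
size_t strip_comments(char *s)
{
    enum { CODE, LINE_COMMENT, BLOCK_COMMENT, IN_QUOTE } state = CODE;
    char quote = 0;   /* the quote character that opened the current literal */
    size_t out = 0;   /* write position; never runs ahead of the read position */

    for (size_t i = 0; s[i] != '\0'; i++) {
        char c = s[i];
        char next = s[i + 1];   /* at worst this reads the terminator */

        switch (state) {
        case CODE:
            if (c == '/' && next == '/') {
                state = LINE_COMMENT;
                i++;                      /* skip the second slash */
            } else if (c == '/' && next == '*') {
                state = BLOCK_COMMENT;
                i++;                      /* skip the asterisk */
            } else {
                if (c == '"' || c == '\'') {
                    state = IN_QUOTE;
                    quote = c;
                }
                s[out++] = c;
            }
            break;
        case LINE_COMMENT:
            if (c == '\n') {              /* the newline itself is kept */
                state = CODE;
                s[out++] = c;
            }
            break;
        case BLOCK_COMMENT:
            if (c == '*' && next == '/') {
                state = CODE;
                i++;                      /* skip the closing slash */
            }
            break;
        case IN_QUOTE:
            s[out++] = c;
            if (c == '\\' && next != '\0') {
                s[out++] = next;          /* copy the escaped character as-is */
                i++;
            } else if (c == quote) {
                state = CODE;
            }
            break;
        }
    }
    s[out] = '\0';
    return out;
}
```

Because the output is never longer than the input, the sketch needs no second buffer, which may matter for multi-megabyte strings. Note that it deletes a block comment outright instead of replacing it with a space, so `a/**/b` becomes `ab`.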
Peter Hosey
Yep, I know. Who said software was easy?
Rui Pacheco
A: 

My solution:

Go through the string using NSScanner and mark the position of each multi-line comment, each single-line comment, and each string (anything between single or double quotes). Store the positions in an array of NSValues that represent the range of each item.

Then iterate through the array of comments, making sure that each comment is not inside a string. A comment is inside a string when its location is at or after the string's location and before the string's location + length, so the check is that this holds for none of the strings.

And voila. Any comment that doesn't fall inside a string can be safely deleted, as it is a valid comment.
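
The containment check described above could be sketched roughly as follows in plain C. This is an illustration only: `Range` is a hypothetical stand-in for NSRange, `comment_is_deletable` is a made-up name, and the ranges are assumed to have been collected already. It also assumes every string range is genuine, that is, not produced by a quote mark that itself sits inside a comment.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for NSRange: a start offset plus a length. */
typedef struct {
    size_t location;
    size_t length;
} Range;

/* True when offset lies in [r.location, r.location + r.length). */
static bool range_contains(Range r, size_t offset)
{
    return offset >= r.location && offset - r.location < r.length;
}

/* True when the comment does not start inside any of the string ranges,
   which is the condition under which it is treated as a real comment. */
bool comment_is_deletable(Range comment, const Range *strings, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (range_contains(strings[i], comment.location)) {
            return false;
        }
    }
    return true;
}
```

This scans every string range for every comment. If both arrays are kept sorted by location, a single merged walk over the two would likely be cheaper for very large inputs.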

Rui Pacheco