I need to convert single-line comments (//...) to block comments (/*...*/). I have nearly accomplished this with the following code; however, I need the function to skip any single-line comment that is already inside a block comment. Currently it matches every single-line comment, even one inside a block comment, which must not happen. This is PHP, by the way. Thanks in advance for any help!

 ## Convert Single Line Comment to Block Comments
 function singleLineComments( &$output ) {
  $output = preg_replace_callback('#//(.*)#m',
   create_function(
     '$match',
     'return "/* " . trim(mb_substr($match[1], 0)) . " */";'
   ), $output
  );
 }
+1  A: 

You could try a negative look behind: http://www.regular-expressions.info/lookaround.html

## Convert Single Line Comment to Block Comments
function sinlgeLineComments( &$output ) {
  $output = preg_replace_callback('#^((?:(?!/\*).)*?)//(.*)#m',
    function ($match) {
      // $match[1] is the text before the comment, $match[2] is the comment body
      return $match[1] . "/* " . trim($match[2]) . " */";
    }, $output
  );
}

However, I worry about strings that contain //. For example, $x = "some string // with slashes"; would get converted.

If your source file is PHP, you could use the tokenizer to parse the file with better precision.

http://php.net/manual/en/tokenizer.examples.php

Edit: I forgot about the fixed-length restriction on lookbehind, which you can work around by nesting a negative lookahead inside the expression. The above should work now. I tested it with:

$foo = "// this is foo";
sinlgeLineComments($foo);
echo $foo . "\n";

$foo2 = "/* something // this is foo2 */";
sinlgeLineComments($foo2);
echo $foo2 . "\n";

$foo3 = "the quick brown fox";
sinlgeLineComments($foo3);
echo $foo3 . "\n";
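
For the tokenizer route suggested above, a minimal sketch might look like the following. The function name convertLineCommentsWithTokenizer is hypothetical, and the sketch assumes the input really is PHP source that starts with an opening <?php tag (otherwise token_get_all() treats everything as inline HTML). It also assumes that breaking up a "*/" inside the comment body is acceptable.

```php
<?php
// Hypothetical helper (not from the original answer): rewrites "//" comments
// using PHP's tokenizer, so string literals and block comments stay untouched.
function convertLineCommentsWithTokenizer(string $source): string
{
    $result = '';
    foreach (token_get_all($source) as $token) {
        if (!is_array($token)) {
            // Single-character tokens such as ";" or "=" arrive as plain strings.
            $result .= $token;
            continue;
        }
        list($id, $text) = $token;
        if ($id === T_COMMENT && strncmp($text, '//', 2) === 0) {
            // Older PHP versions include the trailing newline in the token;
            // keep it outside the new block comment.
            $body = rtrim($text, "\r\n");
            $newline = substr($text, strlen($body));
            // Assumption: break up any "*/" so it cannot close the comment early.
            $body = str_replace('*/', '* /', substr($body, 2));
            $text = '/* ' . trim($body) . ' */' . $newline;
        }
        $result .= $text;
    }
    return $result;
}
```

Because the tokenizer already knows where strings and comments begin and end, the "// inside a string" concern does not arise here. The trade-off is that this only applies when the input is PHP code.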
Lance Rushing
Well, I'm not too worried if $x = "some string // with slashes"; becomes $x = "some string /* with slashes */";. That would actually be preferred. On the other hand, I made the changes you suggested and got a compilation error. Warning: preg_replace_callback() [function.preg-replace-callback]: Compilation failed: lookbehind assertion is not fixed length at offset 6 in C:\wamp\www\LessCSS\Site\cleaner\inc\util.php on line 29
roydukkey
PHP's look-behind only supports fixed length assertions. That means you can't write a look-behind regex that matches an undefined number of characters, which rules out the use of * and ?. More info here: http://www.php.net/manual/en/regexp.reference.assertions.php
Ahmad Mageed
thanks for the heads up. should work now.
Lance Rushing
Doesn't work for this: `/* foo\n// shouldn't match\nbar */` - You don't want it to match the second line, but it does.
Alan Moore
@Lance Rushing: I've managed to get the code working, except for strings that contain newline characters, as Alan Moore pointed out. Here is an updated function; it seems you didn't update the processor. `## Convert Single Line Comment to Block Comments function sinlgeLineComments( ' ), $output ); }` Should not match: `$output = "/* foo\n// shouldn't match\nbar */\nfoo more//Should Match"`
roydukkey
aww why no code formatting
roydukkey
+2  A: 

As already mentioned, "//..." can occur inside block comments and string literals. So if you create a small "parser" with the aid of a bit of regex trickery, you can first match either of those things (string literals or block comments) and only then test whether "//..." is present.

Here's a small demo:

$code ='A
B
// okay!
/*
C
D
// ignore me E F G
H
*/
I
// yes!
K
L = "foo // bar // string";
done // one more!';

$regex = '@
  ("(?:\\\\.|[^\r\n\\\\"])*+")  # group 1: matches double quoted string literals
  |
  (/\*[\s\S]*?\*/)              # group 2: matches multi-line comment blocks
  |
  (//[^\r\n]*+)                 # group 3: matches single line comments
@x';

preg_match_all($regex, $code, $matches, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);

foreach($matches as $m) {
  if(isset($m[3])) {
    echo "replace the string '{$m[3][0]}' starting at offset: {$m[3][1]}\n";
  }
}

Which produces the following output (the offsets shown correspond to the sample using \r\n line endings):

replace the string '// okay!' starting at offset: 6
replace the string '// yes!' starting at offset: 56
replace the string '// one more!' starting at offset: 102

Of course, there are more string literals possible in PHP, but you get my drift, I presume.
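
To go from reporting offsets to actually rewriting the comments, the same alternation can be handed to preg_replace_callback(). This is only a sketch under a few assumptions: the function name convertLineComments is made up for illustration, group 3 here captures just the comment body rather than the leading slashes, and a "*/" inside the body is broken up so it cannot close the new block comment early.

```php
<?php
// Hypothetical helper, not part of the original answer.
function convertLineComments(string $code): string
{
    $regex = '@
      ("(?:\\\\.|[^\r\n\\\\"])*+")  # group 1: double quoted string literal
      |
      (/\*[\s\S]*?\*/)              # group 2: block comment
      |
      //([^\r\n]*+)                 # group 3: body of a single line comment
    @x';

    return preg_replace_callback($regex, function (array $m): string {
        // Strings and block comments (groups 1 and 2) are returned unchanged.
        // PHP omits trailing unmatched groups, so $m[3] exists only for "//".
        if (!isset($m[3])) {
            return $m[0];
        }
        // Assumption: break up any "*/" so it cannot end the comment early.
        $body = str_replace('*/', '* /', trim($m[3]));
        return '/* ' . $body . ' */';
    }, $code);
}
```

Single-quoted strings and unquoted text containing "//" (a URL, for instance) are not protected by this pattern, so further alternatives would be needed for such input.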

HTH.

Bart Kiers