I am trying to extract all substrings of a string that lie between /* and */. I know this will probably need a regular expression, but I'm having a hard time getting the regex right, since the star character is itself a regex metacharacter used for repetition. I am trying to use the preg_match function in PHP. Here is what I have come up with so far, but I'm not having much luck.

<?php
   $aString = "abcdef/*ghij*/klmn/*opqrs*/tuvwxyz";
   preg_match("/*/.*/", $aString, $anArray);

   for ($i = 0; $i < count($anArray); $i++)
      echo $anArray[$i] . "\n";
?>
A: 

Escape the * to match it literally, and add parentheses to capture the content, like this: /\*(.*)\*/. You should also use preg_match_all to find all matches in your string.

(Also, var_dump($anArray) is easier than a for loop.)
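
A minimal sketch of this suggestion, assuming the sample string from the question. Note one substitution that is not part of the answer above: the greedy `.*` is replaced by the lazy `.*?`, because a greedy match would run from the first /* to the last */ and return a single long match. The `#` delimiter is used only so that the slashes need no escaping.

```php
<?php
$aString = "abcdef/*ghij*/klmn/*opqrs*/tuvwxyz";

// \* matches a literal asterisk; (.*?) lazily captures the comment body.
preg_match_all('#/\*(.*?)\*/#', $aString, $matches);

// $matches[0] holds the full matches, $matches[1] the captured bodies.
var_dump($matches[1]);
```

For comments that span several lines, the `s` modifier would also be needed so that `.` matches newlines.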

Ugo Méda
A: 

Working code:

 $aString = "abcdef/*ghij*/klmn/*opqrs*/tuvwxyz";

 // SIMPLE VERSION: THE COMMENT BODY MAY NOT CONTAIN AN ASTERISK
 // \/\* matches the literal /*
 // [^\*]* matches any run of characters except * (asterisk)
 // \*\/ matches the literal */
 preg_match_all("#\/\*[^\*]*\*\/#", $aString, $anArray);

 // BETTER VERSION: THE COMMENT BODY MAY CONTAIN ASTERISKS
 // See http://www.regular-expressions.info/refadv.html for an explanation of ?: and ?!
 preg_match_all("#\/\*" . "((?:(?!\*\/).)*)" . "\*\/#", $aString, $anArray);


 var_dump($anArray); // easier for debugging than for-loop

Output for better version:

array(2) {
  [0]=>
  array(2) {
    [0]=>
    string(8) "/*ghij*/"
    [1]=>
    string(9) "/*opqrs*/"
  }
  [1]=>
  array(2) {
    [0]=>
    string(4) "ghij"
    [1]=>
    string(5) "opqrs"
  }
}
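
To see how the two versions differ, here is a small sketch run on a hypothetical input whose comment body contains an asterisk. The input string and variable names are assumptions made for illustration. The patterns are the same as above, minus the unnecessary backslashes before the slashes.

```php
<?php
$input = "abc/*gh*ij*/def";

// Simple version: [^*]* cannot cross the inner asterisk, so nothing matches.
$simpleCount = preg_match_all('#/\*[^*]*\*/#', $input, $simple);

// Better version: (?!\*/). consumes one character at a time, but only
// while the closing */ does not start at the current position.
$betterCount = preg_match_all('#/\*((?:(?!\*/).)*)\*/#', $input, $better);

var_dump($simpleCount); // number of matches found by the simple version
var_dump($better[1]);   // comment bodies captured by the better version
```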
MartyIX
Thanks very much, works great.
jazzdawg
If I put a star somewhere in there (e.g. `/*gh*ij*`), it will fail.
NullUserException
@NullUserException: Yes, I was aware of that, and I've added a new version which should work better.
MartyIX
Those comments are really helpful, thanks again
jazzdawg
Why are you escaping the forward slash with `\/`?
NullUserException
@NullUserException: I simply don't remember which characters need escaping. :-[
MartyIX
A: 

That is not going to work. You need:

$regex = '!/\*(.*?)\*/!';
preg_match($regex, $aString, $anArray);

echo '<pre>';
print_r($anArray);
echo '</pre>';

If you are just trying to highlight PHP syntax, you can use highlight_string().
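
A minimal sketch of highlight_string(), assuming a short hypothetical snippet as input. The exact HTML markup it produces varies between PHP versions, so the sketch only prints the result.

```php
<?php
// Hypothetical snippet to highlight; the opening tag is part of the input.
$source = '<?php /* a comment */ echo "hi"; ?>';

// Passing true as the second argument returns the HTML instead of printing it.
$html = highlight_string($source, true);

echo $html;
```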

NullUserException
A: 
$aString = "abcdef/*ghij*/klmn/*opqrs*/tuvwxyz";
preg_match_all("/\/\*(.*?)\*\//", $aString, $anArray, PREG_SET_ORDER);
var_dump($anArray);
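
With PREG_SET_ORDER, each element of the result is one complete match: index 0 is the whole match and index 1 is the captured group. A short sketch of collecting just the comment bodies follows (the names `$sets` and `$bodies` are chosen here for illustration):

```php
<?php
$aString = "abcdef/*ghij*/klmn/*opqrs*/tuvwxyz";
preg_match_all("/\/\*(.*?)\*\//", $aString, $sets, PREG_SET_ORDER);

$bodies = [];
foreach ($sets as $set) {
    // $set[0] is the whole match (e.g. "/*ghij*/"), $set[1] the captured body.
    $bodies[] = $set[1];
}

echo implode("\n", $bodies), "\n";
```

Without the flag, preg_match_all defaults to PREG_PATTERN_ORDER, which groups all full matches in `[0]` and all captured bodies in `[1]`.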
stillstanding
A: 

If (as you say in one of the comments) you're attempting to display PHP code in HTML, there's actually a built-in function (highlight_file) that does precisely this.

Feel free to ignore this if you're using it as a learning exercise, etc. :-)

middaparka
A: 

To extract comment sections out of PHP code, use the Tokenizer.

token_get_all() will parse the code, and return an array of elements.

Comments will be represented as T_COMMENT elements.

This has the great advantage of catching all possible ways of having comments in PHP code:

/* This way, */

// This way

# and this way
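
A minimal sketch of that approach, assuming a small hypothetical source string (`$code` and `$comments` are names made up for this example). It also collects T_DOC_COMMENT tokens, which is how the tokenizer reports `/** ... */` blocks.

```php
<?php
// Hypothetical sample source; token_get_all() expects the opening <?php tag.
$code = <<<'CODE'
<?php
/* This way, */
// This way
# and this way
$str = "/* foo */";
CODE;

$comments = [];
foreach (token_get_all($code) as $token) {
    // Tokens are either single-character strings or [id, text, line] arrays.
    if (is_array($token) && in_array($token[0], [T_COMMENT, T_DOC_COMMENT], true)) {
        // trim() drops the trailing newline some PHP versions keep on // and # comments.
        $comments[] = trim($token[1]);
    }
}

print_r($comments);
```

The `/* foo */` inside the string literal is reported as part of a string token, so it is not collected.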
Pekka
Thanks I'll have a look into that
jazzdawg
More important: It won’t give you false positives like in `$str = "/* foo */";`
Gumbo
@Gumbo good point.
Pekka