I need to do a preg_replace on all of the PHP tags in a string, as well as any characters sitting between the PHP tags.

For example, if the file contents were:

Hey there!
<?php some_stuff() ?>
Woohoo!

All that should be left is:

Hey there!
Woohoo!

Here's my code:

$file_contents = file_get_contents('somefilename.php');
$regex = '#([<?php](.*)[\?>])#e';
$file_contents = preg_replace($regex, '<<GENERATED CONTENT>>', $file_contents);

FAIL.

My regular expression skills are poor; can someone please fix my regex? Thank you.

+2  A: 

Try this regex:

#<\?.*?\?>#

Should work on short tags (without 'php') too.

I think the main issues with your attempt were that the question marks need to be escaped with backslashes, and that you were using square brackets where you shouldn't have been. Square brackets mean "match any one of these characters", so [<?php] matches a single character, not the sequence <?php.
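
A minimal sketch of how the corrected pattern could be applied to the sample input from the question. Two things here are assumptions that go beyond the answer above: the `s` modifier, which lets `.` match newlines so that multi-line PHP blocks are removed too, and the trailing `\R?`, which swallows the line break right after the closing tag. The `e` modifier from the question is dropped because it evaluates the replacement as PHP code and is not needed for a plain substitution.

```php
<?php
// Sample input taken from the question.
$file_contents = "Hey there!\n<?php some_stuff() ?>\nWoohoo!\n";

// Pattern, piece by piece:
//   <\?   a literal opening tag start (the question mark is escaped)
//   .*?   as few characters as possible (non-greedy)
//   \?    an escaped question mark, followed by a literal ">"
//   \R?   an optional line break after the closing tag (assumption)
//   s     modifier so "." also matches newlines (assumption)
$regex = '#<\?.*?\?>\R?#s';

$stripped = preg_replace($regex, '', $file_contents);
echo $stripped;
```

Pass the placeholder string from the question instead of `''` as the second argument if the blocks should be replaced rather than removed.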

thomasrutter
Thank you, thomasrutter, that has worked.
Callum
A: 

You can try:

$regex = '#<\?php.*?\?>#i';

The regex used: <\?php.*?\?>

  • < : a literal <
  • \? : ? is a metacharacter, so it must be escaped to match a literal ?
  • .*? : a non-greedy match of any characters, stopping at the first ?> that follows
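
To illustrate why the non-greedy quantifier matters, here is a small sketch comparing `.*` with `.*?` on a string that contains two PHP blocks. The input string and variable names are made up for the example.

```php
<?php
// Two separate PHP blocks with text between them (hypothetical input).
$input = 'A<?php one(); ?>B<?PHP two(); ?>C';

// Greedy: ".*" runs to the LAST closing tag, so the "B" between the blocks is lost.
$greedy = preg_replace('#<\?php.*\?>#i', '', $input);

// Non-greedy: ".*?" stops at the FIRST closing tag after each opening tag.
// The "i" modifier makes the upper-case opening tag match as well.
$lazy = preg_replace('#<\?php.*?\?>#i', '', $input);

echo $greedy, "\n"; // AC
echo $lazy, "\n";   // ABC
```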
codaddict
This will also work and is a good explanation. This requires <?php not just <? as in my answer, so pick whichever you need.
thomasrutter
+1  A: 
$regex="/<?php (.*?)?\>/"

You can also try this; it should work for you.

Ankur Mukherjee
You need to escape the first and last question marks with backslashes.
thomasrutter
A: 

Use the right tool for the job. The PHP tokenizer contains all the functionality you need to strip PHP code away from the surrounding content:

source.php

<p>Some  HTML</p>
<?php echo("hello world"); ?>
<p>More HTML</p>
<?php
/*
 Strip this out please
 */
?>
<p>Ok Then</p>

tokenize.php

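<?php
$source = file_get_contents('source.php');
$tokens = token_get_all($source);
foreach ($tokens as $token) {
    // Array tokens are [id, text, line number]; single characters are plain strings.
    // Text outside the PHP tags is reported with the id T_INLINE_HTML.
    if (is_array($token) && $token[0] === T_INLINE_HTML) {
        echo $token[1];
    }
}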

Output:

<p>Some  HTML</p>
<p>More HTML</p>
<p>Ok Then</p>

This is a simple example. The docs list all the parser tokens you can check for.
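
As a rough sketch, the tokenizer approach could be wrapped in a reusable helper that filters on the T_INLINE_HTML token type. The function name `strip_php` and the sample string are hypothetical. The sample also shows a case where a regex is likely to go wrong: closing-tag characters inside a string literal.

```php
<?php
// Hypothetical helper: keep only the text that lies outside PHP tags.
function strip_php(string $source): string
{
    $out = '';
    foreach (token_get_all($source) as $token) {
        // Array tokens are [id, text, line number]; single characters are plain strings.
        if (is_array($token) && $token[0] === T_INLINE_HTML) {
            $out .= $token[1];
        }
    }
    return $out;
}

// The closing-tag characters inside the string literal do not end the PHP block.
$source = 'A<?php echo "?>"; ?>B';

echo strip_php($source), "\n"; // AB

// A regex stops at the first closing-tag sequence and leaves part of the block behind.
echo preg_replace('#<\?.*?\?>#s', '', $source), "\n";
```

The tokenizer follows PHP's own lexing rules, so it is not confused by tag-like text inside strings, at the cost of requiring the tokenizer extension to be available.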

pygorex1