Hey everyone, I'm working on a PHP application that needs to parse a .tpl file containing HTML, and I want the HTML to support variables and basic if statements. An if statement looks something like this:

<!--if({VERSION} == 2)-->
Hello World
<!--endif -->

To parse that, I've tried using preg_replace with no luck. The pattern that I tried was

/<!--if\(([^\]*)\)-->([^<]*)<!--endif-->/e

which gets replaced with

if($1) { echo "$2"; }

Any ideas as to why this won't work and what I can do to get it up and running?

+3  A: 

You have a space between endif and --> but your regular expression doesn't allow this.
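A minimal sketch of that mismatch, assuming you just want to tolerate optional whitespace before the closing `-->`. The `\s*` is one possible fix and is not taken from the question:

```php
<?php
// The closing marker exactly as written in the question, with a space before "-->".
$closing = '<!--endif -->';

// Strict pattern: no whitespace allowed between "endif" and "-->".
$strict = '/<!--endif-->/';

// Tolerant pattern: \s* permits optional whitespace there (an illustrative choice).
$tolerant = '/<!--endif\s*-->/';

var_dump(preg_match($strict, $closing));   // int(0)
var_dump(preg_match($tolerant, $closing)); // int(1)
```

The tolerant pattern still matches `<!--endif-->` without the space, so templates written either way would be accepted.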

Incidentally, this seems horribly insecure... Is there any reason you're not using a pre-built templating engine like Smarty?

Greg
+2  A: 

Testing your regular expression, I see your backslash is applied to the square bracket. To use a backslash inside square brackets inside a quoted string, you need to escape it twice:

'/<!--if\(([^\\\]*)\)-->([^<]*)<!--endif-->/e'

But I don't know why you're inventing a new template logic framework, when solutions like Smarty and PHP itself exist.
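To see which pattern the regex engine actually receives for a given number of backslashes, it can help to print the string literals. This is only a sketch of PHP's single-quoted string rules; the `$fragments` array is made up for illustration:

```php
<?php
// Character-class fragments written with one to four backslashes,
// exactly as they would appear in single-quoted PHP source.
$fragments = [
    1 => '[^\]',
    2 => '[^\\]',
    3 => '[^\\\]',
    4 => '[^\\\\]',
];

// Inside single quotes PHP only rewrites \\ (to \) and \' (to ');
// every other backslash sequence, including \], is kept verbatim.
foreach ($fragments as $count => $fragment) {
    echo $count, ' backslash(es) in source: regex engine sees ', $fragment, "\n";
}
```

With one or two backslashes the engine receives `[^\]`, where the bracket is escaped and the class stays open. With three or four it receives `[^\\]`, a class that excludes only the backslash character.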


Here's test code, in response to the comments below.

testinput.tpl:

<!--if({VERSION} == 2)-->
Hello World
<!--endif-->

match.php:

<?php
// Read the test template shown above.
$template = file_get_contents('testinput.tpl');
// Each line prints 1 if its pattern matches the template, 0 if it does not.
print preg_match('/<!--if\(([^\\\]*)\)-->/e', $template) . "\n";
print preg_match('/<!--endif-->/e', $template) . "\n";
print preg_match('/<!--if\(([^\\\]*)\)-->([^<]*)<!--endif-->/e', $template) . "\n";

test run:

$ php match.php
1
1
1
Bill Karwin
I was about to ask the exact same question... So I'll just upvote yours! ;-)
PhiLho
Three backslashes is as bad as one, Bill; you're still escaping the square bracket.
Alan Moore
Really? I just ran it, and it returns a positive match against the input. Whereas with one or two backslashes it gives an error about imbalanced square bracket.
Bill Karwin
Hmm. With four backslashes it also returns a positive match.
Bill Karwin
How are you testing it, and what did the capturing groups contain? I tried it in EditPad's search box and there were no errors, but it didn't match anything. Java won't even accept it with an odd number of backslashes.
Alan Moore
Test code posted above.
Bill Karwin
I can't test PHP directly, but I've tried this in some online PHP regex testers, and it works with two backslashes, but not with one or three. Not that it matters; there shouldn't be any backslashes in that character class. Just a right parenthesis.
Alan Moore
A: 

I agree with RoBorg: Smarty is a good way to keep your PHP and HTML separate. If something already exists, you don't need to reinvent the wheel.

bju1046
Smarty seems to have a lot of functionality that I don't really need. What I have now works fine, since it only processes {VARIABLES} and ifs.
Vestonian
A: 

I think you meant to do this:

'/<!--if\(([^)]*)\)-->([^<]*)<!--endif-->/'
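As a rough sketch of how a pattern like this might be used: the `/e` modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so `preg_replace_callback` is the usual substitute. The helper name `render_ifs()`, the `$vars` array, the added `\s*` before `-->`, and the restriction to `{NAME} == value` conditions are all assumptions made for this example, not part of the question:

```php
<?php
// Hypothetical helper for this sketch; render_ifs() and $vars are not from the thread.
function render_ifs(string $template, array $vars): string
{
    // [^)]* captures the condition; \s* tolerates the space in "<!--endif -->".
    $pattern = '/<!--if\(([^)]*)\)-->([^<]*)<!--endif\s*-->/';

    $result = preg_replace_callback($pattern, function (array $m) use ($vars): string {
        // Only conditions of the form "{NAME} == value" are understood;
        // any other block is left in the output unchanged.
        if (!preg_match('/^\s*\{(\w+)\}\s*==\s*(\S+)\s*$/', $m[1], $cond)) {
            return $m[0];
        }
        $actual = isset($vars[$cond[1]]) ? (string) $vars[$cond[1]] : '';
        // Plain string comparison, so no template text is ever run as PHP.
        return $actual === $cond[2] ? $m[2] : '';
    }, $template);

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $template;
}

$template = "<!--if({VERSION} == 2)-->\nHello World\n<!--endif -->";
echo render_ifs($template, ['VERSION' => 2]);
```

Comparing strings in the callback, rather than building PHP code from the template, also sidesteps the security concern raised in the other answers.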

Your regex has only one character class in it:

[^\]*)\)-->([^<]

Here's what's happening:

  • The first closing square bracket is escaped by the backslash, so it's matched literally.
  • The parentheses that were supposed to close the first capturing group and open the second one are also taken literally; it isn't necessary to escape parens inside a character class.
  • The first hyphen is taken as a metacharacter; it forms the range [)*+,-].
  • The second opening square bracket is taken as a literal square bracket because it's inside a character class.
  • The second caret is taken as a literal caret because it's not the first character in the class.

So, after removing the duplicates and sorting the characters into their ASCII order, your character class is equivalent to this:

[^()*+,\-<>\[\]^]

And the parentheses outside the character class are still balanced, so the regex compiles, but it doesn't even come close to matching what you wanted it to.
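One way to check this reading is to run the question's pattern next to a version with the accidental character class written out explicitly. This is only a sketch: the `/e` modifier is dropped, and the two sample strings are invented for the comparison:

```php
<?php
// The pattern from the question (without /e) and the same pattern with the
// accidental character class spelled out as described above.
$original   = '/<!--if\(([^\]*)\)-->([^<]*)<!--endif-->/';
$equivalent = '/<!--if\(([^()*+,\-<>\[\]^]*)<!--endif-->/';

$samples = [
    // The real template: the ")" after the condition is in the excluded set.
    "<!--if({VERSION} == 2)-->\nHello World\n<!--endif-->",
    // No excluded characters between "<!--if(" and "<!--endif-->".
    '<!--if({VERSION} == 2<!--endif-->',
];

foreach ($samples as $sample) {
    // Both patterns should agree on every sample.
    var_dump(preg_match($original, $sample), preg_match($equivalent, $sample));
}
```

Both patterns reject the real template and accept the second, malformed string, which is the opposite of what the question was after.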

Alan Moore