Update- Thanks for all the responses. This Q is getting kind of messy, so I started a sequel if anyone's interested.


I was throwing together a quick script for a friend and stumbled across a really simple way of doing templating in PHP.

Basically, the idea is to parse the html document as a heredoc string, so variables inside of it will be expanded by PHP.

A passthrough function allows for expression evaluation and function and static method calls within the string:

function passthrough($s){return $s;}
$_="passthrough";
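
To see the passthrough trick in isolation, here is a small self-contained sketch that uses an inline heredoc instead of a template file. The values of $cols and $lang are made up for illustration.

```php
<?php
// Passthrough helper from above: returns its argument unchanged.
function passthrough($s) { return $s; }
$_ = 'passthrough';

// Sample data, assumed for this sketch only.
$cols = 3;
$lang = array('T_SEND' => 'Send');

// {$_(...)} calls passthrough() through the variable function in $_,
// so the expression inside the parentheses is evaluated and interpolated.
$html = <<<HTML
<td colspan="{$_($cols + 1)}">{$lang['T_SEND']}</td>
HTML;

echo $html, "\n"; // prints: <td colspan="4">Send</td>
```

Because heredoc interpolation only accepts expressions that start with a variable, routing everything through $_ is what makes arithmetic and function calls possible here.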

The code to parse the document inside a heredoc string is ridiculously simple:

$t=file_get_contents('my_template.html');
eval("\$r=<<<_END_OF_FILE_\n$t\n_END_OF_FILE_;\n");
echo $r;

The only problem is, it uses eval.

Questions

  • Can anyone think of a way to do this sort of templating without using eval, but without adding a parser or a ton of regex madness?

  • Any suggestions for escaping stray dollar signs that don't belong to PHP variables without writing a full-on parser? Does the stray dollar sign problem render this approach not viable for 'serious' use?
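
One possible angle on the stray dollar sign question, as a rough sketch only: escape every $name whose name is not on a whitelist of known template variables before the text reaches the heredoc. The function name escapeStrayDollars and the whitelist parameter are hypothetical, and this does nothing for expressions more complex than a plain variable name.

```php
<?php
// Hypothetical helper: backslash-escape "$name" sequences that do not refer
// to a known template variable, so the heredoc leaves them as literal text.
function escapeStrayDollars($template, array $knownVars)
{
    return preg_replace_callback(
        // A dollar sign not already escaped, followed by a valid identifier.
        '/(?<!\\\\)\$([A-Za-z_][A-Za-z0-9_]*)/',
        function (array $m) use ($knownVars) {
            // Keep known variables as they are; escape everything else.
            return in_array($m[1], $knownVars, true) ? $m[0] : '\\' . $m[0];
        },
        $template
    );
}

// Example: only $name is a real template variable here.
echo escapeStrayDollars('Cost: $USD, hi {$name}', array('name')), "\n";
```

A dollar sign followed by a digit or a space is already left alone by PHP's string interpolation, so only identifier-like sequences need this treatment.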


Here's some sample templated HTML code.

<script>var _lang = {$_(json_encode($lang))};</script>
<script src='/blah.js'></script>
<link href='/blah.css' type='text/css' rel='stylesheet'>

<form class="inquiry" method="post" action="process.php" onsubmit="return validate(this)">

  <div class="filter">
    <h2> 
      {$lang['T_FILTER_TITLE']}
    </h2>
    <a href='#{$lang['T_FILTER_ALL']}' onclick='applyFilter();'>
      {$lang['T_FILTER_ALL']}
    </a>
    {$filter_html}
  </div>

  <table class="inventory" id="inventory_table">
    {$table_rows}
    <tr class="static"><th colspan="{$_($cols+1)}">
      {$lang['T_FORM_HELP']}
    </th></tr>
    {$form_fields}
    <tr class="static">
      <td id="validation" class="send" colspan="{$cols}">&nbsp;</td>
      <td colspan="1" class="send"><input type="submit" value="{$lang['T_SEND']}" /></td>
    </tr>
  </table>

</form>

Why use templating?


There's been some discussion of whether creating a templating layer is necessary in PHP, which, admittedly, is already pretty good at templating.

Some quick reasons templating is useful:

  • You can control it

    If you preprocess the file before it goes to the interpreter, you have more control over it. You can inject stuff, lock down permissions, scrape for malicious php / javascript, cache it, run it through an xsl template, whatever.

  • Good MVC design

    Templating promotes separation of view from model and controller.

    When jumping in and out of <?php ?> tags in your view, it's easy to get lazy and do some database queries or perform some other server action. Using a method like the above, only one statement may be used per 'block' (no semicolons), so it's much more difficult to get caught in that trap. <?= ... ?> have pretty much the same benefit, but...

  • Short tags aren't always enabled

    ...and we want our app to run on various configurations.

When I initially hack a concept together it starts out as one php file. But before it grows I'm not happy unless all php files have only one <?php at the beginning, and one ?> at the end, and preferably all are classes except stuff like the controller, settings, image server, etc.

I don't want much PHP in my views at all, because designers become confused when dreamweaver or whatever poops the bed when it sees something like this:

<a href="<?php $img="$img_server/{$row['pic']}.png"; echo $img; ?>">
  <img src="<?php echo $img; ?>" /></a>

This is hard enough for a programmer to look at. The average graphic designer won't go anywhere near it. Something like this is much easier to cope with:

<a href="{$img}"><img src="{$img}" /></a>

The programmer kept his nasty code out of the html, and now the designer can work his design magic. Yay!

Quick update

Taking everyone's advice into consideration, I think preprocessing the files is the way to go, and the intermediate files should be as close to normal "php templating" as possible, with the templates being syntactic sugar. Eval is still in place for now while I play with it. The heredoc thing has sort of changed its role. I'll write more later and try to respond to some of the answers, but for now...

<?php



class HereTemplate {

  static $loops;

  public function __construct () {
    self::$loops=array();
  }

  public function passthrough ($v) { return $v; }

  public function parse_markup ($markup, $no_escape=null, $vars=array()) {
    extract($vars);
    $eot='_EOT_'.rand(1,999999).'_EOT_';
    $do='passthrough';
    if (!$no_escape) $markup=preg_replace(
      array(
        '#{?{each.*(\$\w*).*(\$\w*).*(\$\w*).*}}?#', 
        '#{?{each.*(\$\w*).*(\$\w*).*}}?#', 
        '#{?{each}}?#',
        '#{{#', '#}}#',
        '#{_#', '#_}#',
        ),
      array(
        "<?php foreach (\\1 as \\2=>\\3) { ?>", 
        "<?php foreach (\\1 as \\2) { ?>", 
        "<?php } ?>",
        "<?php echo <<<$eot\n{\$this->passthrough(", ")}\n$eot\n ?>",
        "<?php ", " ?>",
        ), 
      $markup);
    ob_start(); 
    eval(" ?>$markup<?php ");
    echo $markup;
    return ob_get_clean();
  }

  public function parse_file ($file) {
    // include $file;
    return $this->parse_markup(file_get_contents($file));
  }

}


// test stuff


$ht = new HereTemplate();
echo $ht->parse_file($argv[1]);


?>

...

<html>

{{each $_SERVER $key $value}

<div id="{{$key}}">

{{!print_r($value)}}

</div>

{each}}



</html>
+6  A: 

I'm gonna do something silly and suggest something that requires no templating engine at all and costs at most 5 characters more per variable/call than what you have there: replace {$foo} with <?=$foo?> and then you can use include for all your templating needs.

If all you need is variable replacement, though, this is a templating function I actually use:

function fillTemplate($tplName,$tplVars){
  $tpl=file_get_contents("tplDir/".$tplName);
  foreach($tplVars as $k=>$v){
    $tpl = preg_replace('/\{'.preg_quote($k, '/').'\}/',$v,$tpl);
  }
  return $tpl;
}

If you want to be able to call functions or have loops, there is basically no way around calling eval short of pre-processing.
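
For the pure variable replacement case, a variant worth considering is strtr, which avoids regular expressions entirely, so neither the keys nor the values need any quoting. This is only a sketch; the name fillTemplateString is made up, and it works on a string rather than reading from tplDir.

```php
<?php
// Sketch: replace {key} placeholders using strtr instead of preg_replace.
// Values are inserted literally, so "$1" or backslashes in a value are safe.
function fillTemplateString($tpl, array $tplVars)
{
    $pairs = array();
    foreach ($tplVars as $k => $v) {
        $pairs['{' . $k . '}'] = (string) $v;
    }
    // strtr replaces all placeholders in a single pass.
    return strtr($tpl, $pairs);
}

echo fillTemplateString('Hi {name}, you owe $1 to {who}', array('name' => 'Ann', 'who' => 'Bob')), "\n";
// prints: Hi Ann, you owe $1 to Bob
```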

tobyodavies
I considered something like replacing `{{` with `<?php` and `}}` with `?>`, but it could get dicey. Maybe a different delimiter character would be better...
no
Ah, I just realized what you're getting at. Short tags are disabled on most of the servers I deploy to. Also it looks terrible inside of a tag attribute, and breaks syntax highlighting on many editors.
no
If you want to avoid eval you could preprocess the templates to replace `{...}` with `<?php echo ...?>` but that makes it somewhat more of a pain and somewhat less clear in that it requires 'compilation' of sorts
tobyodavies
Preprocessing is the best, I think. eval will kill performance.
Savageman
@savageman: PHP is fully interpreted anyway, it doesn't (by default) store its byte code between invocations, it has to parse, byte-compile and evaluate each page on every hit... PHP itself is basically calling `eval_which_handles_open_and_close_tags(file_get_contents($url))` on every page load, so it shouldn't significantly change performance
tobyodavies
Except it's really dumb to not use an op-code cache. And if you use one (which is recommended), the eval'd won't be cached.
Savageman
Do you want `preg_quote()` instead of `preg_escape()` ?
alex
Yes, i wrote that from memory... changed now
tobyodavies
@tobyodavies: I'm not sure, but I think we users have just as much access to `eval_which_handles_open_and_close_tags`, because I vaguely remember that `eval('?>' . $phpfile);` works.
Bart van Heukelom
The worst part: a template language cannot be limited to just outputting variables. There will always be logical blocks. Why answer this question if you have never used a template yourself?
Col. Shrapnel
@Col. My point was that if you want to avoid the performance problems associated with eval, you can use PHP short tags (or preprocess to PHP tags, see earlier comment) or limit yourself to plain variable replacement; otherwise you need eval or, worse, your own (or someone else's) interpreter for a mini language. Please read the answer before commenting.
tobyodavies
A: 

There is no ultimate solution. Each has pros and cons. But you already concluded what you want. And it seems a very sensible direction. So I suggest you just find the most efficient way to achieve it.

You basically only need to enclose your documents in some heredoc syntactic sugar. At the start of each file:

<?=<<<EOF

And at the end of each template file:

EOF;
?>

Achievement award. But obviously this confuses most syntax highlighting engines. I could fix my text editor, it's open source. But Dreamweaver is a different thing. So the only useful option is to use a small pre-compiler script that can convert between templates with raw $varnames-HTML and Heredoc-enclosed Templates. It's a very basic regex and file rewriting approach:

#!/usr/bin/php -Cq
<?php
foreach (glob("*.tpl") as $fn) {
    $file = file_get_contents($fn);
    if (preg_match("/<\?.+<<</m", $file)) {  // remove
        $file = preg_replace("/<\?(=|php\s+print)\s*<<<\s*EOF\s*|\s+EOF;\s*\?>\s*/m", "", $file);
    }
    else {   // add heredoc wrapper
        $file = "<?php print <<<EOF\n" . trim($file) . "\nEOF;\n?>";
    }
    file_put_contents($fn, $file);
}
?>

This is a given - somewhere you will need templates with a slight amount of if-else logic. For coherent handling you should therefore have all templates behave as proper PHP without a special eval/regex handling wrapper. This allows you to easily switch between heredoc templates, but also have a few with normal <?php print output. Mix and match as appropriate, and the designers can work on the majority of files but avoid the few complex cases. For example, for my templates I'm often using just:

include(template("index"));   // works for heredoc & normal php templ

No extra handler, and works for both common template types (raw php and smartyish html files). The only downside is the occasional use of said converter script.
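
The template() helper isn't shown here; it presumably just maps a name to a file path. A minimal sketch of what it might look like follows, where the "templates" directory and the ".tpl" extension are assumptions and not something this answer specifies.

```php
<?php
// Hypothetical helper: turn a template name into a path usable with include().
// The default directory and the ".tpl" extension are assumed for this sketch.
function template($name, $dir = 'templates')
{
    // basename() drops directory parts, so "../secret" cannot leave $dir.
    $file = basename($name);
    if (substr($file, -4) !== '.tpl') {
        $file .= '.tpl';
    }
    $path = rtrim($dir, '/') . '/' . $file;
    if (!is_file($path)) {
        throw new RuntimeException("Template not found: $path");
    }
    return $path;
}
```

Since the helper only returns a path, the include itself happens in the caller's scope, which is what lets the template see the caller's variables.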

I'd also add an extract(array_map("htmlspecialchars",get_defined_vars())); on top of each template for security.
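
One caveat with that one-liner is that htmlspecialchars only handles strings, so array or object variables would trip it up. A slightly more defensive sketch, with the helper name escapeVars being made up, could escape strings and pass everything else through untouched:

```php
<?php
// Sketch: HTML-escape string values only; arrays, objects and numbers are
// returned unchanged (nested strings inside arrays are NOT escaped here).
function escapeVars(array $vars)
{
    return array_map(function ($v) {
        return is_string($v) ? htmlspecialchars($v, ENT_QUOTES, 'UTF-8') : $v;
    }, $vars);
}

// At the top of a template this could be used as:
//   extract(escapeVars(get_defined_vars()));
```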

Anyway, your passthrough method is exceptionally clever I have to say. I'd call the heredoc alias $php however, so $_ is still available for gettext.

<a href="calc.html">{$php(1+5+7*3)}</a> is more readable than Smarty

I think I'm going to adopt this trick myself.

<div>{$php(include(template($ifelse ? "if.tpl" : "else.tpl")))}</div>

Is stretching it a bit, but it seems after all possible to have simple logic in heredoc templates. Might lead to template-fileritis, yet helps enforcing a most simple template logic.

Offtopic: If the three <<<heredoc&EOF; syntax lines still appear too dirty, then the best no-eval option is using a regular expression based parser. I do not agree with the common myth that that's slower than native PHP. In fact I believe the PHP tokenizer and parser lag behind PCRE. Especially if it's solely about interpolating variables. It's just that the latter isn't APC/Zend-cached, you'd be on your own there.

mario
+5  A: 

If you don't want to use a big template engine like Twig (which I sincerely recommend) you can still get good results with little code.

The basic idea that all the template engines share is to compile a template with friendly, easy-to-understand syntax to fast and cacheable PHP code. Normally they would accomplish this by parsing your source code and then compiling it. But even if you don't want to use something that complicated you can achieve good results using regular expressions.

So, basic idea:

function renderTemplate($templateName, $templateVars) {
    $templateLocation = 'tpl/'      . $templateName . '.php';
    $cacheLocation    = 'tplCache/' . $templateName . '.php';
    if (!file_exists($cacheLocation) || filemtime($cacheLocation) < filemtime($templateLocation)) {
        // compile template and save to cache location
    }

    // extract template variables ($templateVars['a'] => $a)
    extract($templateVars);

    // run template
    include 'tplCache/' . $templateName . '.php';
}

So basically we first compile the template and then execute it. Compilation is only done if either the cached template doesn't yet exist or there is a newer version of the template than the one in the cache.

So, let's talk about compiling. We will define two syntaxes: For output and for control structures. Output is always escaped by default. If you don't want to escape it you must mark it as "safe". This gives additional security. So, here an example of our syntax:

{% foreach ($posts as $post): }
    <h1>{ $post->name }</h1>
    <p>{ $post->body }</p>
    {!! $post->link }
{% endforeach; }

So, you use { something } to escape and echo something. You use {!! something} to directly echo something, without escaping it. And you use {% command } to execute some bit of PHP code without echoing it (for example for control structures).

So, here's the compilation code for that:

$code = file_get_contents($templateLocation);

// handle the more specific {% and {!! tags first, so the generic { } rule doesn't swallow them
$code = preg_replace('~\{%\s*(.+?)\s*\}~', '<?php $1 ?>', $code);
$code = preg_replace('~\{!!\s*(.+?)\s*\}~', '<?php echo $1 ?>', $code);
$code = preg_replace('~\{\s*(.+?)\s*\}~', '<?php echo htmlspecialchars($1, ENT_QUOTES) ?>', $code);

file_put_contents($cacheLocation, $code);

And that's it. Note, though, that this is more error prone than a real template engine. But it will work for most cases. Furthermore, note that this allows the writer of the template to execute arbitrary code. That's both a pro and a con.
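
To make the effect of the three rules concrete, here is a small sketch that applies them to a string instead of a file. The function name compileTemplate is made up for this illustration, and the `{%` and `{!!` rules run before the generic one so that a plain `{ ... }` match cannot swallow them.

```php
<?php
// Sketch: compile template source into PHP source with three regex rules.
function compileTemplate($source)
{
    // Control structures: the code is emitted without echo.
    $source = preg_replace('~\{%\s*(.+?)\s*\}~', '<?php $1 ?>', $source);
    // Raw output: echoed without escaping.
    $source = preg_replace('~\{!!\s*(.+?)\s*\}~', '<?php echo $1 ?>', $source);
    // Default output: escaped, then echoed.
    return preg_replace(
        '~\{\s*(.+?)\s*\}~',
        '<?php echo htmlspecialchars($1, ENT_QUOTES) ?>',
        $source
    );
}

echo compileTemplate('<h1>{ $post->name }</h1>'), "\n";
```

With this input the heading line comes out as `<h1><?php echo htmlspecialchars($post->name, ENT_QUOTES) ?></h1>`. Any literal braces in the template, for example in inline CSS or JavaScript, would also be rewritten, which is one of the ways this is more fragile than a real parser.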

So, here's the whole code:

function renderTemplate($templateName, $templateVars) {
    $templateLocation = 'tpl/'      . $templateName . '.php';
    $cacheLocation    = 'tplCache/' . $templateName . '.php';
    if (!file_exists($cacheLocation) || filemtime($cacheLocation) < filemtime($templateLocation)) {
        $code = file_get_contents($templateLocation);

        // handle the more specific {% and {!! tags first, so the generic { } rule doesn't swallow them
        $code = preg_replace('~\{%\s*(.+?)\s*\}~', '<?php $1 ?>', $code);
        $code = preg_replace('~\{!!\s*(.+?)\s*\}~', '<?php echo $1 ?>', $code);
        $code = preg_replace('~\{\s*(.+?)\s*\}~', '<?php echo htmlspecialchars($1, ENT_QUOTES) ?>', $code);

        file_put_contents($cacheLocation, $code);
    }

    // extract template variables ($templateVars['a'] => $a)
    extract($templateVars);

    // run template
    include 'tplCache/' . $templateName . '.php';
}

I haven't tested the above code ;) It's only the basic idea.

nikic
This would be my approach too. It has a couple benefits: 1) It avoids eval(), 2) It avoids the overhead of parsing the template on each page load, and 3) Op code cachers shouldn't have any problem caching the final php code.
mellowsoon
A template language cannot be limited to just outputting variables. There will always be logical blocks. Why answer this question if you have never used a template yourself?
Col. Shrapnel
@Col: Well, this template language *does* support logical blocks. As you can see in my syntax example you could for example use a `foreach` loop. Same goes for `if` or any other language construct PHP supports. And why do you say that I never used a template myself? Imho Twig is considered a template engine and thus I consider myself a user of templates. PS: Have you downvoted this answer Col?
nikic
Didn't notice it at first. So you have just changed `<?` to `{%` and back. Well, it's even uglier than I thought at first. I'd downvote this frankenstein twice if I could.
Col. Shrapnel
@Col: That's exactly what you would do. The only difference is that `{%` would be short_tags-independent (okay, I already know your opinion on that). Furthermore, nobody prevents you from using `<?` or `<?php` instead. This "template language" allows both `{%` and the native versions. But you are right, two syntaxes to do one thing are bad. I probably should remove the `{% }` syntax and instead expand `<? ?>` to `<?php ?>`. I'll think about it, thanks for the feedback.
nikic
+13  A: 

PHP was itself originally intended as a templating language (ie a simple method of allowing you to embed code inside HTML).

As you see from your own examples, it got too complicated to justify being used in this way most of the time, so good practice moved away from that to using it more as a traditional language, and only breaking out of the <?php ?> tags as little as possible.

The trouble was that people still wanted a templating language, so platforms like Smarty were invented. But if you look at them now, Smarty supports stuff like its own variables and foreach loops... and before long, Smarty templates start to have the same issues as PHP templates used to have; you may as well just have used native PHP in the first place.

What I'm trying to say here is that the ideals of a simple templating language aren't actually that easy to get right. It's virtually impossible to make it both simple enough not to scare off the designers and at the same time give it enough flexibility to actually do what you need it to do.

Spudley
Absolutely!!! +1
Otar
+1 for dissuading people from writing a template language for a template language!
GWW
I think that `foreach` loops in a template are nothing bad. Looping and conditionals are required to build templates.
nikic
@nikic - I have no problem with foreach or any other programming aspect in a templating language; you're right, they are necessary (albeit often overused). My point was to answer point in the question about PHP tags being too complex for the designer who is scared off by programming constructs, by showing that all template languages suffer from this, if they are going to be powerful enough to actually be useful. The designer is just going to have to get used to having programming code in his templates, and in that case he may just as well stick with plain PHP.
Spudley
+1  A: 

Personally, I wouldn't touch with a stick any templating system where forgetting to escape a variable creates a remote code execution vulnerability.

Tgr
Care to explain the downvote?
Tgr
+1  A: 

Personally i'm using this template engine: http://articles.sitepoint.com/article/beyond-template-engine/5

I really like it a lot, especially because of its simplicity. It's kinda similar to your latest incarnation, but IMHO a better approach than using heredoc and putting yet another layer of parsing above the PHP one. No eval() either, but output buffering, and scoped template variables, too. Use like this:

<?php   
require_once('template.php');   

// Create a template object for the outer template and set its variables.     
$tpl = new Template('./templates/');   
$tpl->set('title', 'User List');   

// Create a template object for the inner template and set its variables.
// The fetch_user_list() function simply returns an array of users.
$body = new Template('./templates/');   
$body->set('user_list', fetch_user_list());   

// Set the fetched template of the inner template to the 'body' variable
// in the outer template.
$tpl->set('body', $body->fetch('user_list.tpl.php'));   

// Echo the results.
echo $tpl->fetch('index.tpl.php');   
?>
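
The Template class itself lives in the linked article and isn't reproduced here. As a rough idea of how little such a class needs, here is a hypothetical stand-in (named MiniTemplate to make clear it is not the article's code) built on extract, output buffering and include:

```php
<?php
// Hypothetical stand-in for a Template class; NOT the article's original code.
class MiniTemplate
{
    private $dir;
    private $vars = array();

    public function __construct($dir)
    {
        $this->dir = rtrim($dir, '/') . '/';
    }

    // Make $value available inside the template as a variable called $name.
    public function set($name, $value)
    {
        $this->vars[$name] = $value;
    }

    // Render a template file and return its output as a string.
    public function fetch($file)
    {
        // EXTR_SKIP keeps template variables from overwriting $file.
        extract($this->vars, EXTR_SKIP);
        ob_start();
        include $this->dir . $file;
        return ob_get_clean();
    }
}
```

The variables only exist inside fetch(), which is where the scoping of template variables comes from.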

The outer template would look like this:

<html>
  <head>
    <title><?=$title;?></title>
  </head>
  <body>
    <h2><?=$title;?></h2>
        <?=$body;?>
  </body>
</html>

and the inner one (goes inside the outer template's $body variable) like this:

<table>
   <tr>
       <th>Id</th>
       <th>Name</th>
       <th>Email</th>
       <th>Banned</th>
   </tr>
<? foreach($user_list as $user): ?>
   <tr>
       <td align="center"><?=$user['id'];?></td>
       <td><?=$user['name'];?></td>
       <td><a href="mailto:<?=$user['email'];?>"><?=$user['email'];?></a></td>
       <td align="center"><?=($user['banned'] ? 'X' : '&nbsp;');?></td>
   </tr>
<? endforeach; ?>
</table>

If you don't like / can't use short-tags then replace them with echos. That's as close to dirt-simple as you can get, while still having all the features you'll need IMHO.
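
If doing that replacement by hand is tedious, it could be scripted. The following is only a sketch with a made-up function name, and it is regex based, so it would also rewrite a `<?` that happens to appear inside a string or in inline text.

```php
<?php
// Sketch: rewrite PHP short tags in a template to their long forms.
function expandShortTags($src)
{
    // The short echo tag becomes an explicit echo statement.
    $src = preg_replace('/<\?=\s*/', '<?php echo ', $src);
    // A bare short open tag (not already "<?php", and not "<?xml") gets "php" added.
    return preg_replace('/<\?(?!php|xml)/i', '<?php', $src);
}

echo expandShortTags('<td><?=$user["id"];?></td>'), "\n";
```

The output of the example call is `<td><?php echo $user["id"];?></td>`.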

DanMan