I have a need to write a basic scripting/templating engine that will run under PHP. Ideally, I would be able to mix my own markup language with an (X)HTML template and run the document through a server-side parser to dynamically replace my own markup with (X)HTML served out of a database.

Unfortunately, for all my knowledge of PHP and scripting, I'm not quite sure where to start. My first instinct was to run the entire document through some kind of regex parser and map my custom markup to specific PHP functions ... but that seems a bit slow and heavy-handed to me.

What resources/tutorials/examples exist that can point me in the right direction? For comparison, I really like the new Razor templating engine for .NET MVC ... I don't want to completely knock it off for a PHP project, but building something similar would be great.


Update

OK, let me refine my explanation a bit more ... I develop websites for WordPress. A lot of my clients want to customize their websites but run away whenever I start talking about PHP. It's a scripting language that looks too complex for the lay user to even want to get interested.

What I want to do is create my own form of markup specifically for WordPress. So rather than having PHP function calls (get_header() and get_footer() and if(have_posts())...) in the theme file, you'd have namespaced XML (<wpml:header /> and <wpml:footer /> and <wpml:loop> ... </wpml:loop>) that translates to the same thing. It would do a better job of separating your template files from the server-side script (there are several themes that place whole PHP functions directly in the theme's PHP template files!!!) and would make it easier for non-developers to begin working with and customizing a WordPress theme.

With that in mind, the already suggested solutions of TWIG and makrell definitely support the idea of embedding script "nuggets" in the file, but they don't really help me parse the custom XML/XHTML markup into something recognizable by the server-side code.

So ... where do I start when building a new server-side markup processor?

A: 

I'd start with XML: define what a typical page's markup would look like, then move on to parsing that XML in your chosen language and generating HTML from it.

The XML should be a bunch of nodes that describe your particular language.

So...

<MyPage>
  <MyElement id="myid" type="MyType1">
    <MyElement id="myid" type="MyType1" Text="Some text"/>
  </MyElement>
  etc...
</MyPage>
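
To make the "decipher the XML, then create HTML" step concrete, here is a minimal sketch using PHP's SimpleXML. The element and attribute names follow the snippet above; the ids "outer" and "inner", the function names render_element and render_page, and the choice to emit div elements are assumptions made purely for illustration.

```php
<?php
// Render one MyElement as a <div>, recursing into nested MyElement children.
// Mapping "type" to a CSS class is an arbitrary choice for this sketch.
function render_element(SimpleXMLElement $el): string
{
    $html = '<div id="' . htmlspecialchars((string) $el['id'])
          . '" class="' . htmlspecialchars((string) $el['type']) . '">';
    if (isset($el['Text'])) {
        $html .= htmlspecialchars((string) $el['Text']);
    }
    foreach ($el->MyElement as $child) {
        $html .= render_element($child);
    }
    return $html . '</div>';
}

// Parse the whole page description and render its top-level elements.
function render_page(string $xml): string
{
    $page = new SimpleXMLElement($xml);
    $html = '';
    foreach ($page->MyElement as $el) {
        $html .= render_element($el);
    }
    return $html;
}

$xml = '<MyPage>'
     . '<MyElement id="outer" type="MyType1">'
     . '<MyElement id="inner" type="MyType1" Text="Some text"/>'
     . '</MyElement>'
     . '</MyPage>';

echo render_page($xml), "\n";
```

A real engine would dispatch on the type attribute to different renderers instead of always producing a div.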

Before embarking on something like this, I'd look more carefully on the internet to see whether something pre-built already suits your needs. A project like this has a very real potential to get out of control and become impossible to maintain.

griegs
A kitten is killed whenever a new DSL is written in XML.
pst
+4  A: 

It sounds like what you need is a templating language that supports being extended by custom tokens. Given that PHP itself meets that need, I'm guessing you also want sandboxing of some sort.

For that, I'd suggest TWIG.

By default, it uses the same basic syntax as Django and Jinja2 for Python or Liquid for Ruby (though, while not recommended, that is configurable) and it's compiled to cached PHP for speed.

It supports sandboxing and parameter auto-escaping as well as block substitution and inheritance, you choose what variables it gets access to, and you can set up any combination you want of default and custom tokens and filters.

Smarty might also meet your needs, but I'm not sure whether it has all the aforementioned features, its syntax is, in my opinion, not as elegant, and I'm told it's more pain than it's worth.

Whatever you do, think long and hard before inventing your own templating language. It's generally a huge pain in the long run and tends to end up on The Daily WTF next to BobX sooner or later.

Update: I get the impression you're obsessed with using namespaced XML for your templating. Is it really worth reinventing an entire templating engine just so your users can use <wpml:header /> rather than {{header}}? TWIG doesn't let users embed arbitrary scripts... just variables and flow-control constructs you've explicitly OKed.

ssokolow
I would have given +1 if you hadn't mentioned smarty. Twig rocks, Smarty sucks, that's it.
nikic
@nikic: I was just trying to be fair to a package I've never personally used. In another answer, I believe I said something along the lines of "...and Smarty, because mentioning it is apparently mandatory for reasons I'm not really clear on".
ssokolow
@nikic: Not upvoting the answer because "Smarty sucks" according to your subjective judgment isn't exactly fair. @ssokolow even mentioned that it's not as good a choice as Twig. +1 to you, sir.
musicfreak
@musicfreak: If I say something I normally have reasons to say it. Twig overpowers Smarty in performance, security and functionality. Thus I see no reason to even mention Smarty.
nikic
@nikic: But +1 is supposed to mean "this answer was useful", so denying a +1 because only half the thing met with your approval and then explicitly telling me about it just makes you look childish and spiteful. It doesn't help that your original post simply said "Smarty sucks" rather than "Twig overpowers Smarty in performance, security and functionality. Thus I see no reason to even mention Smarty." StackOverflow is big enough that you can safely assume that everything you say is probably going to form someone's first impression of you.
ssokolow
@nikic: There is nothing wrong with listing all available options and having an open mind about them.
musicfreak
Yes, you're right, it's okay to list all options. If I weren't okay with it, I would have downvoted. But I don't see why I should upvote something only because it's okay. I would have upvoted to show support for Twig, but as the answer mentions both, I see no reason to upvote. Still, you're right, I obviously expressed my opinion too radically. Smarty is a good templating engine, but Twig is better.
nikic
I appreciate your update, but my "obsession" with using XML is more for the user than for me. Most of the people who have asked me for this system freak out when they see any kind of markup on a page that doesn't look like HTML ... tags like `{{header}}`, while useful for me, wouldn't be useful for my users.
EAMann
A well templated php language can have just as simple a learning curve as XML namespaces.
Moses
+4  A: 

Your question pretty much precisely describes makrell, the meta-templating engine I wrote some time ago (and have used a lot since then). With makrell you define and change your "template language" on the fly, so you're not limited to any particular syntax convention. See examples here http://stereofrog.com/files/makrell_samples.php

Based on your update, there are essentially two approaches. First, you can treat the template as a string and apply more or less sophisticated replacements to it. This is where makrell (being nothing more than preg_replace on steroids) can help you. The downside of this approach is that your source document is assumed to be a dumb string, with no internal structure, hence no validation and no error correction. The advantage is that your code will be quite fast - for the same reason.
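
As a rough illustration of the string-replacement approach (this is not makrell's API, just plain preg_replace_callback), a sketch might look like the following. The tag names come from the question; the function name wpml_compile and the tag-to-snippet map are hypothetical. Nothing from WordPress is called here, the template is only rewritten into PHP source.

```php
<?php
// Rewrite self-closing <wpml:name /> tags into PHP snippets.
// The map below is a hypothetical example; unknown tags are left untouched.
function wpml_compile(string $template): string
{
    $map = [
        'header' => '<?php get_header(); ?>',
        'footer' => '<?php get_footer(); ?>',
    ];

    return preg_replace_callback(
        '~<wpml:([a-z_]+)\s*/>~',
        function (array $m) use ($map): string {
            // $m[0] is the whole tag, $m[1] the name after the prefix.
            return $map[$m[1]] ?? $m[0];
        },
        $template
    );
}

echo wpml_compile("<wpml:header />\n<p>Hello</p>\n<wpml:footer />\n");
```

Because the template is treated as a flat string, a malformed or misspelled tag simply passes through unchanged, which is exactly the "no validation" downside described above.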

Another option is to parse your template into an xml document and transform it to another xml document, with your custom tags replaced with other tags (e.g. <?php processing instructions). In this case, XSL is what you're looking for.
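
For a feel of the structured approach, here is a minimal sketch that uses PHP's DOM extension rather than XSLT itself: it parses the template, finds elements in the custom namespace, and replaces them with php processing instructions. The namespace URI, the function name wpml_transform, and the tag map are all assumptions for illustration.

```php
<?php
// Hypothetical namespace URI bound to the wpml prefix in templates.
const WPML_NS = 'http://example.com/wpml';

// Parse the template as XML and swap each known wpml element
// for a php processing instruction carrying the mapped code.
function wpml_transform(string $xml, array $map): string
{
    $doc = new DOMDocument();
    if (!$doc->loadXML($xml)) {
        throw new InvalidArgumentException('Template is not well-formed XML');
    }

    // Copy the live node list into an array first, since replacing
    // nodes while iterating the live list would skip elements.
    $nodes = iterator_to_array($doc->getElementsByTagNameNS(WPML_NS, '*'));
    foreach ($nodes as $node) {
        if (isset($map[$node->localName])) {
            $pi = $doc->createProcessingInstruction('php', $map[$node->localName]);
            $node->parentNode->replaceChild($pi, $node);
        }
    }

    // Serialize only the root element, without the XML declaration.
    return $doc->saveXML($doc->documentElement);
}

$template = '<html xmlns:wpml="' . WPML_NS . '">'
          . '<body><wpml:header/><p>Hi</p><wpml:footer/></body></html>';

echo wpml_transform($template, [
    'header' => 'get_header();',
    'footer' => 'get_footer();',
]), "\n";
```

Unlike the string approach, a template that is not well-formed XML is rejected up front, at the cost of a full parse.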

stereofrog
+1  A: 

For custom XML you could use one of PHP's XML parsers, preferably the SAX (event-based) one for performance.
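
A small sketch of what the SAX route could look like with PHP's built-in xml_parser functions. It only collects the names of the custom tags it meets; the wpml prefix comes from the question, while the function name wpml_find_tags and the namespace URI in the sample are hypothetical.

```php
<?php
// Stream through the template and collect the local names of all wpml:* elements.
function wpml_find_tags(string $xml): array
{
    $found = [];
    $parser = xml_parser_create();
    // Keep element names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        function ($parser, string $name, array $attrs) use (&$found): void {
            // Called for every opening tag, in document order.
            if (strpos($name, 'wpml:') === 0) {
                $found[] = substr($name, 5);
            }
        },
        function ($parser, string $name): void {
            // Closing tags are ignored in this sketch.
        }
    );

    if (xml_parse($parser, $xml, true) !== 1) {
        throw new InvalidArgumentException(
            xml_error_string(xml_get_error_code($parser))
        );
    }
    return $found;
}

$sample = '<page xmlns:wpml="http://example.com/wpml">'
        . '<wpml:header /><wpml:loop><p>Post</p></wpml:loop></page>';

print_r(wpml_find_tags($sample));
```

A real processor would emit output from the start and end handlers instead of just recording names, but the event-driven shape stays the same.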

Smarty is a very good PHP template engine with built-in tags, blocks and functions. You can extend those to create your own and even remove the built-in ones (for Smarty 3).

If you need to create your own scripting language, I suggest you look at parser generators such as Lex and Yacc. You'll have to define your grammar much like the SQLite syntax diagrams do, only textually rather than graphically. Other grammar-based parser generators are available; the two I mentioned are among the oldest and best known, but they were built to generate C code.

You'll probably want to avoid writing the parser by hand (for example with regular expressions); very soon you'll have many inconsistencies in your script. That said, regular expressions are themselves a kind of language interpreted by an automaton.

You can mix the two: an XML parser and a general parser. Read up on finite-state machines (FSMs).
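
To give an idea of what a finite-state machine means here, this is a deliberately tiny two-state tokenizer that splits a template into text and tag tokens. The function name fsm_tokenize and the token format are made up for this sketch, and it assumes that ">" never appears inside an attribute value.

```php
<?php
// Split a template into ['text', ...] and ['tag', ...] tokens
// using a machine with two states: 'text' and 'tag'.
function fsm_tokenize(string $input): array
{
    $tokens = [];
    $state = 'text';
    $buffer = '';

    foreach (str_split($input) as $char) {
        if ($state === 'text' && $char === '<') {
            // Transition text -> tag: flush collected text, start a tag.
            if ($buffer !== '') {
                $tokens[] = ['text', $buffer];
            }
            $buffer = '<';
            $state = 'tag';
        } elseif ($state === 'tag' && $char === '>') {
            // Transition tag -> text: emit the finished tag.
            $tokens[] = ['tag', $buffer . '>'];
            $buffer = '';
            $state = 'text';
        } else {
            // No transition: keep accumulating in the current state.
            $buffer .= $char;
        }
    }

    // Input must not end in the middle of a tag.
    if ($state === 'tag') {
        throw new UnexpectedValueException('Unterminated tag: ' . $buffer);
    }
    if ($buffer !== '') {
        $tokens[] = ['text', $buffer];
    }
    return $tokens;
}

print_r(fsm_tokenize('Hello <wpml:header /> world'));
```

Adding more states (for example, "inside a quoted attribute") is how such a machine grows to handle the cases where regular expressions start to become inconsistent.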

Wernight