tags:

views:

83

answers:

3

I'm hacking a little template engine. I've a class (pompously named the template compiler) that produce a string of dynamically generated code.

for instance :

def dynamic_function(arg):
  #statement
  return rendered_template

At rendering time, I call the built-in function exec against this code, with a custom globals dictionary (in order to control as mush as possible the code inserted into the template by a potential malicious user).

But, I need to cache the compiled template to avoid compiling it each execution. I wonder if it's better to store the string as plain text and load it each time or use compile to produce code_object and store that object (using the shelve module for example).

Maybe it worth mentioning that ultimately I would like to make my template engine thread safe.

Thank for reading ! Thomas

edit : as S.Lott is underlining better doesn't have a sense itself. I mean by better faster, consume less memory simpler and easier debugging. Of course, more and tastier free coffee would have been even better.

A: 

Mako template library caches the compiled template as a python module and uses the built in imp module to handle byte-code caching and code loading. This seems reasonably robust to changes in the interpreter, fast and easily debuggable (you can view the source of the generated code in the cache).

See the mako.template module for how it handles this.

Ants Aasma
I use the exec function because of the ability it provide me to control the scope and the module/function available in that scope (e.g : remove the __import__ function to prevent savage importing :p)
thomas
You can still access it via \_\_builtins\_\_: exec('\_\_builtins\_\_["\_\_import\_\_"]("os").abort()', {}, {})
Ants Aasma
+2  A: 

You don't mention where you are going to store these templates, but if you are persisting them "permanently", keep in mind that Python doesn't guarantee byte-code compatibility across major versions. So either choose a method that does guarantee compatibility (like storing the source code), or also store enough information alongside the compiled template that you can throw away the compiled template when it is invalid.

The same goes for the marshal module, for example: a value marshaled with Python 2.5 is not promised to be readable with Python 2.6.

Ned Batchelder
You're right, I forgot that the byte code compatibility isn't guaranteed acrross interpreter version. A version check would make the code more complex, and I don't want that.
thomas
+1  A: 

Personally, i'd store text. You can look at text with an editor or whatever, which will make debugging and mucking about easier. It's also vastly easier to write unit tests for, if you want to test that the contents of the cache file are what you expect.

Later, if you find that your system isn't fast enough, and if profiling reveals that parsing the cached templates is taking a lot of time, you can try switching to storing bytecode - but only then. As long as the storage mechanism is properly encapsulated, this change should be fairly painless.

Tom Anderson