views: 132, answers: 5

Hello,

Recently I needed to generate a huge HTML page containing a report with a table of several thousand rows. Obviously, I did not want to build the whole HTML (or the underlying tree) in memory. As a result, I built the page with good old string interpolation, but I do not like that solution.

Thus, I wonder whether there are Python templating engines that can yield the resulting page content in parts.

UPD 1: I am not interested in a list of all available frameworks and templating engines. I am interested in templating solutions that I can use independently of any framework and that can yield content in portions instead of building the whole result in memory.

I understand the usability benefits of partial content loading with client-side scripting, but that is outside the scope of my current question. Say I want to generate a huge HTML/XML document and stream it into a local file.

+1  A: 

It would be more user-friendly (assuming JavaScript is enabled) to build the table via JavaScript, e.g. with a jQuery plugin that loads content automatically as you scroll down. Then only a few rows are loaded initially, and more rows are loaded on demand as the user scrolls.

If that's not an option, you could use three templates: one for everything before the rows, one for everything after the rows, and a third one for a single row. First send the rendered before-rows template, then generate and send the rows one by one, and finally send the after-rows template. That way you only ever hold one block/row in memory instead of the whole table; a sketch follows below.
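A minimal sketch of that approach using only string.Template from the standard library (the template strings and the row data are made up for illustration):

    # Three templates: everything before the rows, a single row, everything after.
    # Each rendered piece is written out as soon as it is ready, so at most one
    # row is held in memory at a time.
    from string import Template

    BEFORE = Template("<html><body><table>\n")
    ROW = Template("<tr><td>$name</td><td>$value</td></tr>\n")
    AFTER = Template("</table></body></html>\n")

    def write_report(rows, out):
        # `rows` can be any iterable of (name, value) pairs, e.g. a DB cursor.
        out.write(BEFORE.substitute())
        for name, value in rows:
            out.write(ROW.substitute(name=name, value=value))
        out.write(AFTER.substitute())

    with open("report.html", "w") as out:
        write_report([("a", 1), ("b", 2)], out)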

ThiefMaster
Well, thank you for the suggestion, but unfortunately this does not answer the question.
newtover
A: 

Are you using a web framework for this? Pylons (http://www.pylonshq.com) supports several templating engines, and Django (http://www.djangoproject.com/) has its own templating language.

I think an answer that included lazy loading of the rows with JavaScript would work for a web view, but I presume the report is going to need to be printed, in which case you'll have to build the whole thing at some point, right?

UberAlex
I am sorry, I updated my question.
newtover
If you look at the Pylons website, you will find several template languages that exist independently of the framework.
UberAlex
+2  A: 

There is no problem with building something like this in memory. Several thousand rows is by no means big.

For your templating needs you can use a lightweight markup language such as reStructuredText (RST) as an intermediate format.

There are tools that generate HTML from these markup languages; see the sketch below.
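For instance, a rough sketch of that route with reStructuredText and docutils (the table layout, the sample data, and the choice of docutils as the converter are assumptions for illustration):

    # Emit the report as an RST "simple table", then let docutils render HTML.
    from docutils.core import publish_string

    def rows_to_rst(rows):
        # Header and borders of an RST simple table; one line per data row.
        lines = ["=====  =====", "name   value", "=====  ====="]
        for name, value in rows:
            lines.append("%-6s %s" % (name, value))
        lines.append("=====  =====")
        return "\n".join(lines) + "\n"

    rst = rows_to_rst([("a", 1), ("b", 2)])
    html = publish_string(source=rst, writer_name="html")  # returns bytes
    with open("report.html", "wb") as out:
        out.write(html)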

drozzy
Thanks. I would agree that 10000 rows is not so huge in my case (it depends on the contents, though), but if I took the data from MySQL, for example, the size might double. Anyway, your suggestion is to generate the data in an intermediate format and then apply a transformation with one of the listed tools, isn't it? I noticed that some of the solutions are not even Python-based.
newtover
Yes, things like RST can be generated much more succinctly than HTML. It is then just a matter of calling a third-party tool to convert it to HTML. I believe all of the examples I provided DO have a Python library. Another advantage of an intermediate format is that you can easily build additional outputs, like an XML representation, or even expose it as an API of some sort (but that is beside the point).
drozzy
Regarding doubling the size of the database query: you can easily hold many more than 20k records in memory. If you cannot, you can write them to a file (in an intermediate or CSV representation) and then convert that to HTML, XML, or anything else. The slowest part will be the database query (plus any network lag), so I would not worry about the write/read-to-file and conversion process. I suggest running a few tests with files of a million rows or so, converting them from, say, RST to HTML with something like Sphinx.
drozzy
+2  A: 

Most popular template engines have a way to generate the rendered result in chunks or to write it to a file object piece by piece. For example:
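For instance, Jinja2 can yield the rendered output lazily via Template.generate() or write it out in chunks via Template.stream(); a minimal sketch (the template text and the row generator are invented for illustration):

    from jinja2 import Template

    TEMPLATE = Template(
        "<table>\n"
        "{% for name, value in rows %}"
        "<tr><td>{{ name }}</td><td>{{ value }}</td></tr>\n"
        "{% endfor %}"
        "</table>\n"
    )

    def make_rows(n):
        # Any iterable or generator works; rows are pulled while rendering.
        for i in range(n):
            yield (i, i * i)

    # generate() yields the rendered output chunk by chunk...
    with open("report.html", "w") as out:
        for chunk in TEMPLATE.generate(rows=make_rows(10000)):
            out.write(chunk)

    # ...and stream().dump() writes straight to a file.
    stream = TEMPLATE.stream(rows=make_rows(10000))
    stream.enable_buffering(size=50)  # group a few chunks per write
    stream.dump("report2.html", encoding="utf-8")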

Denis Otkidach
Thank you, this is what I was looking for.
newtover
+1  A: 

You don't need a streaming templating engine. I do this all the time, and long before you run into anything remotely heavy server-side, the browser will start to choke. Rendering a 10000-row table will peg the CPU for several seconds in pretty much any browser; scrolling it will be bothersomely choppy in Chrome, and memory usage will rise regardless of the browser.

What you can do (and I have implemented before, even though in retrospect it turned out not to be necessary) is use client-side XSLT. Printing the XSLT processing instruction and the opening and closing root tags as plain strings is easy and fairly safe; then you can stream each individual row as a standalone XML element using whatever XML writer technique you prefer.
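A rough sketch of that idea with xml.sax.saxutils.XMLGenerator from the standard library (the element names and the report.xsl stylesheet are assumptions for illustration):

    from xml.sax.saxutils import XMLGenerator

    def write_report(rows, path):
        with open(path, "w", encoding="utf-8") as out:
            # The stylesheet PI and the root tags are written as plain strings.
            out.write('<?xml version="1.0" encoding="utf-8"?>\n')
            out.write('<?xml-stylesheet type="text/xsl" href="report.xsl"?>\n')
            out.write("<report>\n")
            gen = XMLGenerator(out, encoding="utf-8")
            for name, value in rows:
                # Each row goes out as a small standalone element.
                gen.startElement("row", {})
                gen.startElement("name", {})
                gen.characters(str(name))
                gen.endElement("name")
                gen.startElement("value", {})
                gen.characters(str(value))
                gen.endElement("value")
                gen.endElement("row")
                out.write("\n")
            out.write("</report>\n")

    write_report([("a", 1), ("b", 2)], "report.xml")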

However, you really don't need this, and likely never will: if your HTML generator ever gets too slow, the browser will be an order of magnitude more problematic.

So, unless you have benchmarked this and determined that you really have a problem, don't waste your time. If you do have a problem, you can solve it without fundamentally changing the method: in-memory generation can work just fine.

Eamon Nerbonne
That is a nice comment, since the question is more theoretical for me. I actually return a large HTML table that is also valid XML, intended to be easy to use for automation. In my situation I would use an XML writer that writes to a file, but I did not find a convenient one in Python's standard library. I had not thought of the problem from the browser side. Thank you.
newtover