views: 372
answers: 3
What is the cost of parsing a large XML file using PHP on every page request?

I would like to implement custom tags in HTML.

<?xml version="1.0"?>
<html>
    <head>
        <title>The Title</title>
    </head>
    <body>
        <textbox name="txtUsername" />
    </body>
</html>

After I load this XML file in PHP, I search for the custom tags using XPath and manipulate or replace them.
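
A minimal sketch of that flow, assuming the file is named page.xml and that each <textbox> should become a plain <input> element:

<?php
// Load the XML and locate the custom tags with XPath.
$doc = new DOMDocument();
$doc->load('page.xml');
$xpath = new DOMXPath($doc);

foreach ($xpath->query('//textbox') as $tag) {
    // Replace the custom tag with a real <input> element,
    // carrying over the name attribute.
    $input = $doc->createElement('input');
    $input->setAttribute('type', 'text');
    $input->setAttribute('name', $tag->getAttribute('name'));
    $tag->parentNode->replaceChild($input, $tag);
}

echo $doc->saveHTML();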

Is this very costly, or is it acceptable? And what about applying this approach to a large-scale website?

In the past I have also used XSLT on large sites, and it didn't seem to slow things down. This approach is somewhat similar to XSLT, only done manually.

+2  A: 

I would guess it's fairly costly, but the best way to find out is to test it yourself and measure the peak memory usage and the time required to run the script.
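
For example, a rough timing sketch (the file name page.xml is an assumption) could look like this:

<?php
$start = microtime(true);

// The heavy step: parse the XML and run the XPath query.
$doc = new DOMDocument();
$doc->load('page.xml');
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//textbox');

printf("matched %d tags in %.4f s\n", $nodes->length, microtime(true) - $start);
printf("peak memory: %d bytes\n", memory_get_peak_usage());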

You might be able to cache some intermediate state so that the heavy XML parsing doesn't have to be done every time. For example, you could replace the custom tags with actual PHP code, as Smarty does, and then include that generated/cached PHP file instead.

The cached file could look like the code in Soulmerge's answer.
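
A sketch of that compile-and-cache step; transform() is a hypothetical function that parses the XML once and rewrites the custom tags as plain PHP source:

<?php
$source = 'page.xml';
$cached = 'cache/page.php';

// Recompile only when the source is newer than the cached copy.
if (!file_exists($cached) || filemtime($source) > filemtime($cached)) {
    file_put_contents($cached, transform($source));
}

// Every subsequent request just includes the generated PHP file.
include $cached;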

Tom Haigh
+1. I'd cache the generated page if possible, to eliminate all parsing.
You
@You: If you only cached the generated page, you would have to re-parse the XML whenever any aspect of the page changed, which may or may not be a problem.
Tom Haigh
+1  A: 

Parsing the XML should be fast, as long as you use built-in classes like DOMXPath and your XML files are not too large.

However, I would rather replace the custom tags with function calls and include the file in PHP. That should be a lot faster, since you're then not doing any string manipulation in PHP:

<?xml version="1.0"?>
<html>
    <head>
        <title>The Title</title>
    </head>
    <body>
        <?php textbox('txtUsername') ?>
    </body>
</html>
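
A minimal sketch of what such a textbox() function might look like (the exact markup and escaping are assumptions):

<?php
// Hypothetical helper: renders the custom tag as a real input element.
function textbox($name)
{
    printf('<input type="text" name="%s" />', htmlspecialchars($name, ENT_QUOTES));
}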
soulmerge
That's what I am using at the moment: <?=HTML::TextBox('txtName')?>. But XML is much more powerful and allows me to do attribute replacement on DOMElements, like jQuery but on the server side. Thank you for your suggestion.
Christian Toma
+1  A: 

Is this very costly or is it acceptable?

Don't guess. Measure.

troelskn
I measured it and it's lightning fast on my server. But I asked this question to see whether it can work on a large-scale website.
Christian Toma
Then measure it under heavy load. You can use a tool like Siege (http://www.joedog.org/index/siege-home). Performance is really a very subjective thing. One approach could be to find out how many requests per second you can support and make your judgement from there. You might also compare it against a cached version to see what the theoretical optimum would give you.
troelskn