Hi there, I run a system which needs to update various XML files from data stored in a database. The script is a server-side PHP file monitored by a daemon, so that it executes, finishes to free resources, and is then restarted.

I have some benchmarking within the script, and when I have to update 100 XML files, it takes about 15 seconds to complete. A typical XML file is around 6 KB. I am creating the XML using PHP's DOM and writing it with $dom->save(). The database is fully normalised and the correct indexes are in place; the 3 queries that fetch the data I need for the XML take only around 0.05 seconds. Therefore the bottleneck seems to be in creating the XML via DOM and writing the file itself.

Does anyone have any ideas how I could really speed up the process? I have considered using a CRC check to see whether the XML needs to be rewritten, but this would still require me to read the XML file I would be updating, which I don't do at the moment, so surely it's just as bad as saving a new file over the top of the old one? Also, I don't think it's possible to edit only certain parts of the XML, as the structure isn't uniform: the order of the nodes can change depending on which data is not null after being updated.

Really appreciate your thoughts on this!

+2  A: 

Fifteen seconds to write a few XML files? That sounds way too much. Can you do some more profiling and find out which function exactly is the bottleneck?
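As a rough sketch of that kind of profiling, the loop below times the DOM build and the save separately with microtime(). The sample rows, element names, and target path are made up for illustration; in the real script the rows would come from the three queries.

```php
<?php
// Hypothetical sample rows standing in for the query results.
$rows = [
    ['id' => 1, 'name' => 'First item'],
    ['id' => 2, 'name' => 'Second & third'],
];

$timings = ['build' => 0.0, 'save' => 0.0];
$target  = sys_get_temp_dir() . '/profile-sample.xml'; // assumed scratch location

for ($i = 0; $i < 100; $i++) {
    // Phase 1: build the document in memory.
    $start = microtime(true);
    $dom  = new DOMDocument('1.0', 'UTF-8');
    $root = $dom->appendChild($dom->createElement('items'));
    foreach ($rows as $row) {
        $item = $root->appendChild($dom->createElement('item'));
        $item->setAttribute('id', (string) $row['id']);
        // createTextNode() takes care of escaping characters such as "&".
        $item->appendChild($dom->createTextNode($row['name']));
    }
    $timings['build'] += microtime(true) - $start;

    // Phase 2: serialise and write to disk.
    $start = microtime(true);
    $dom->save($target);
    $timings['save'] += microtime(true) - $start;
}

printf("build: %.4fs, save: %.4fs\n", $timings['build'], $timings['save']);
```

If neither phase comes anywhere near 15 seconds for 100 files, the time is probably going somewhere else in the script (for example in how the data is looped over), which is what a profiler such as Xdebug would show in more detail.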

Have you considered writing plain text XML (fwrite($fp, "<item>value</item>")) instead of building it by DOM? Sounds justifiable in this case.
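A minimal sketch of the plain-text approach, assuming a simple list of items (the element names, the row layout, and the helper names xmlEscape() and buildItemsXml() are invented for this example). The one thing DOM does for you that you must now do yourself is escaping, here via htmlspecialchars() with ENT_XML1.

```php
<?php
// Escape a value for use in XML text or attribute content.
function xmlEscape(string $value): string
{
    return htmlspecialchars($value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
}

// Build the whole document as one string (hypothetical <items>/<item> layout).
function buildItemsXml(array $rows): string
{
    $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n<items>\n";
    foreach ($rows as $row) {
        $xml .= '  <item id="' . xmlEscape((string) $row['id']) . '">'
              . xmlEscape($row['name']) . "</item>\n";
    }
    return $xml . "</items>\n";
}

$xml = buildItemsXml([
    ['id' => 1, 'name' => 'First item'],
    ['id' => 2, 'name' => 'Second & third'],
]);

echo $xml;
```

PHP's built-in XMLWriter is a middle ground if hand-built strings feel too fragile: it streams the output and handles escaping, without building a full DOM tree.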

Otherwise, for caching, there's always filemtime(), which you could use to quickly get the "last modified" time of your XML file and see whether the DB entry is newer than that. In a system like the one you describe, there should be no need to compare the contents.
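Roughly, the check could look like this. It assumes each DB row carries a last-updated Unix timestamp (a hypothetical updated_at column, not something mentioned in the question), and the function name xmlNeedsRewrite() is made up.

```php
<?php
/**
 * True when the XML file is missing or not newer than the DB row.
 * $rowUpdatedAt is a Unix timestamp from an assumed updated_at column.
 */
function xmlNeedsRewrite(string $path, int $rowUpdatedAt): bool
{
    // PHP caches stat() results, so drop the cached entry for this file first.
    clearstatcache(true, $path);

    if (!is_file($path)) {
        return true;
    }

    // "<=" rather than "<": mtime has one-second resolution here, so a row
    // changed in the same second as the last write is rewritten to be safe.
    return filemtime($path) <= $rowUpdatedAt;
}

// Demo with a scratch file whose mtime is forced to a known value.
$path = tempnam(sys_get_temp_dir(), 'xml');
touch($path, 1000);

$rowIsNewer = xmlNeedsRewrite($path, 2000); // row changed after the file was written
$rowIsOlder = xmlNeedsRewrite($path, 500);  // file is already up to date

var_dump($rowIsNewer); // bool(true)
var_dump($rowIsOlder); // bool(false)

unlink($path);
```

This only costs a stat() call per file, so unchanged files are skipped without being read or rebuilt.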

Pekka
Avoid fwrite() calls. It's better to build the XML in a variable and then save it with file_put_contents(). It will perform better, and it makes it far less likely that a reader gets a half-written file (for a hard guarantee, write to a temporary file and rename() it over the old one).
Josh Davis
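A sketch of the single-write idea from the comment above, taken one step further: the XML string is written to a temporary file in the same directory and then moved into place with rename(), which on POSIX filesystems replaces the target in one step. The function name writeFileAtomically() and the demo path are made up for this example.

```php
<?php
// Write $contents to $path so that readers see either the old or the new file.
function writeFileAtomically(string $path, string $contents): bool
{
    // Same directory as the target, so rename() stays on one filesystem.
    $tmp = tempnam(dirname($path), 'tmp');
    if ($tmp === false) {
        return false;
    }

    // One write call for the whole document; verify the byte count.
    if (file_put_contents($tmp, $contents) !== strlen($contents)) {
        unlink($tmp);
        return false;
    }

    // Note: tempnam() creates the file with restrictive permissions;
    // chmod($tmp, ...) here if other users must be able to read it.
    if (!rename($tmp, $path)) {
        unlink($tmp);
        return false;
    }

    return true;
}

$path = sys_get_temp_dir() . '/atomic-demo.xml'; // assumed scratch location
$xml  = "<items><item>value</item></items>\n";

$ok = writeFileAtomically($path, $xml);
var_dump($ok);
```

If the extra rename() is not wanted, file_put_contents($path, $xml, LOCK_EX) is a simpler alternative, though it only protects readers that also take a lock.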