Hey guys,

I'm working on a PHP project and need to parse a large XML file (>240MB) from a URL. I used XMLReader; it works on localhost but not on shared hosting (BlueHost), where it shows a 404 error: http://webmashing.com/meilleures-des/cronjob?type=sejours

Does this require a dedicated server? If so, please give me suggestions.

By the way, would splitting the XML file help?

+1  A: 

XMLReader is a pull parser, so it doesn't load the entire file into memory while parsing; splitting the file will have no effect other than adding complexity to your code. However, if your script holds on to all the details it parses, that will take up a lot of memory.
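To make the streaming point concrete, here is a minimal sketch of reading a file one element at a time with XMLReader. The function name `eachElement`, the element name passed to it, and the callback are all hypothetical, and the closure-based usage assumes PHP 5.3 or later (the callback would need to be a named function on 5.2).

```php
<?php
// Hypothetical helper: streams through $uri and hands the XML of every
// <$elementName> element to $callback, one at a time. Nothing is accumulated
// here, so memory use depends on the size of a single element, not the file.
function eachElement($uri, $elementName, $callback)
{
    $reader = new XMLReader();
    if (!$reader->open($uri)) {
        throw new RuntimeException("Cannot open $uri");
    }

    $seen = 0;
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT
            && $reader->name === $elementName) {
            // readOuterXml() returns only the current element as a string.
            call_user_func($callback, $reader->readOuterXml());
            $seen++;
        }
    }

    $reader->close();
    return $seen;
}
```

A caller would pass a callback that stores or prints one record and then lets it go, for example writing it to a database, rather than appending every record to an array. Opening a remote URL this way typically also depends on `allow_url_fopen` being enabled on the host.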

You should, however, be getting some error or message from running the script on your shared hosting that identifies the problem. Was their version of PHP built with --enable-libxml? Are you getting a memory allocation error?

Mark Baker
Hey Mark, thanks for your answer. The PHP version is 5.2.13. I tried running the script with a small XML file and it works well. For the large file I added ini_set('memory_limit', '-1'); and set_time_limit(0); to my code, but it still gives a 404 error!
Enable error logging in your script. You need to know why it's failing on the shared hosting.
Mark Baker
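One way to enable error logging from inside the script is sketched below. The log file name `php-error.log` is an assumption, and some shared hosts restrict `ini_set()`, so the same settings may have to go in php.ini or .htaccess instead.

```php
<?php
// Report every error level, but write to a log instead of the browser.
error_reporting(E_ALL);
ini_set('display_errors', '0');
ini_set('log_errors', '1');

// Assumed location: a log file next to this script. The directory must be
// writable by the web server user for anything to show up.
ini_set('error_log', dirname(__FILE__) . '/php-error.log');
```

These lines belong at the very top of the script, before the parsing starts, so that a memory or timeout failure later on leaves a trace in the log.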
A: 

You could use a SAX (Simple API for XML) parser, which is also a good solution for reading huge XML files. It does not load the whole file into memory, so it avoids the memory exhaustion problem. It will still take time to read such a huge file. You may need to check whether your PHP has the libxml and libxml2 modules installed, using the phpinfo() function.
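As a rough sketch of the SAX approach, the following feeds a file to PHP's event-based XML parser in small chunks. The function name `countStartTags` and the 8192-byte chunk size are assumptions for illustration, and the closures assume PHP 5.3 or later.

```php
<?php
// Hypothetical helper: counts opening tags named $tagName using the
// event-based (SAX-style) parser, reading the file in 8 KB chunks.
function countStartTags($path, $tagName)
{
    $count = 0;
    $parser = xml_parser_create();
    // Keep element names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$count, $tagName) {
            if ($name === $tagName) {
                $count++;
            }
        },
        function ($p, $name) {
            // End-tag events are not needed for counting.
        }
    );

    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    while (!feof($handle)) {
        $chunk = fread($handle, 8192);
        // The third argument tells the parser when the last chunk arrives.
        if (!xml_parse($parser, $chunk, feof($handle))) {
            $message = xml_error_string(xml_get_error_code($parser));
            fclose($handle);
            throw new RuntimeException("XML error: $message");
        }
    }
    fclose($handle);
    return $count;
}
```

Only one chunk is held in memory at a time; the handlers receive start and end tag events as the parser encounters them.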

But it is better to go with XMLReader, as it is faster and saves memory. You can check peak memory usage with memory_get_peak_usage(). Read the file row by row, and unset each row from the array once the operation on that row is done.
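The row-by-row pattern with a memory check could look roughly like this. `peakMemoryMb` and `handleRow` are hypothetical names, and the loop uses generated rows as a stand-in for rows built from the current XMLReader node.

```php
<?php
// Hypothetical helper: peak memory allocated by PHP so far, in megabytes.
function peakMemoryMb()
{
    return round(memory_get_peak_usage(true) / 1048576, 2);
}

// Hypothetical stand-in for the real per-row work (e.g. a database insert).
function handleRow(array $row)
{
    return strlen($row['title']);
}

$total = 0;
for ($i = 0; $i < 1000; $i++) {
    // In a real script this row would come from the current XML node.
    $row = array('title' => "Row $i");
    $total += handleRow($row);
    unset($row); // drop the row before reading the next one
}

echo "Peak memory: " . peakMemoryMb() . " MB\n";
```

If the reported peak grows with the size of the input file, something in the loop is still holding on to rows that should have been released.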

Kamlesh Bhure
A: 

I'm guessing it's a memory-related issue (set the memory and execution time limits).

For what it's worth, I have used vtd-xml (Java implementation) to parse files over 500MB with success (low memory footprint and fast, maybe the fastest execution time).

andreas