I'm trying to read some large text files (between 50 MB and 200 MB) and do a simple text replacement (essentially, the XML I have hasn't been properly escaped in a few regular cases). Here's a simplified version of the function:

<?php
function cleanFile($file1, $file2) {
  $input_file  = fopen($file1, "r");
  $output_file = fopen($file2, "w");
  while (!feof($input_file)) {
    $buffer = trim(fgets($input_file, 4096));
    if (substr($buffer, 0, 6) == '<text>' AND substr($buffer, 0, 15) != '<text><![CDATA[') {
      $buffer = str_replace('<text>', '<text><![CDATA[', $buffer);
      $buffer = str_replace('</text>', ']]></text>', $buffer);
    }
    fputs($output_file, $buffer . "\n");
  }
  fclose($input_file);
  fclose($output_file);
}
?>

What I don't get is that for the largest files, around 150 MB, PHP memory usage goes off the chart (around 2 GB) before failing. I thought this was the most memory-efficient way to go about reading large files. Is there some method I am missing that would be more memory-efficient? Perhaps some setting that's keeping things in memory when they should be collected?

In other words, it's not working, I don't know why, and as far as I can tell I'm not doing anything incorrectly. Any direction for me to go? Thanks for any input.

+2  A: 

PHP isn't really designed for this. Offload the work to a different process that you call or start from PHP. I suggest using Python or Perl.

Randolpho
unfortunately, it's not an option at this point to choose another language. :(
jacobangel
Then do it with PHP in a separate process. The point is that you shouldn't be parsing that large a file as part of your request. You should offload the work to a separate process, return a response, and then allow a second request to determine whether or not the process id done. Asynchronous FTW.
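For illustration, a minimal sketch of that kick-off-and-poll pattern in PHP alone (the clean_file.php worker script and the .done marker file are conventions I'm inventing here, not something from the question):

<?php
// Kick off the cleanup in a background PHP CLI process and return
// immediately, so the web request doesn't block on the 150 MB file.
function startCleanup($in, $out) {
  $cmd = sprintf('php clean_file.php %s %s > /dev/null 2>&1 &',
                 escapeshellarg($in), escapeshellarg($out));
  exec($cmd);
}

// A later request can poll for a marker file the worker writes on exit.
function cleanupIsDone($out) {
  return file_exists($out . '.done');
}
?>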
Randolpho
Agreed. My guess is that you are receiving the file via FTP, batch process, etc. Why not parse the file as soon as it lands on the file system instead of waiting for someone to pull it down via a web request?
matt eisenberg
heh.. just noticed the typo... I meant "process *is* done" not "process id done". :D
Randolpho
went with this option :)
jacobangel
Glad to hear it! :)
Randolpho
+1  A: 

From my meagre understanding of PHP's garbage collection, the following might help:

  1. unset $buffer when you are done writing it out to disk, explicitly telling the GC to clean it up.
  2. put the if block in another function, so the GC runs when that function exits.

The reasoning behind these recommendations is that I suspect the garbage collector is not freeing up memory because everything is done inside a single function, and the GC is garbage.
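For example, a minimal sketch with both suggestions applied to the loop from the question (cleanLine is a name I've made up):

<?php
function cleanLine($buffer) {
  // Suggestion 2: do the replacement inside its own function, so its
  // locals go out of scope on every call.
  if (substr($buffer, 0, 6) == '<text>' AND substr($buffer, 0, 15) != '<text><![CDATA[') {
    $buffer = str_replace('<text>', '<text><![CDATA[', $buffer);
    $buffer = str_replace('</text>', ']]></text>', $buffer);
  }
  return $buffer;
}

while (!feof($input_file)) {
  $buffer = cleanLine(trim(fgets($input_file, 4096)));
  fputs($output_file, $buffer . "\n");
  // Suggestion 1: explicitly drop the reference once the line is written.
  unset($buffer);
}
?>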

freespace
Tried this. It did free up a bit of memory, but not enough. I wish I knew what precisely it was doing with the memory.
jacobangel
A: 

I expect this to fail in many cases. You are reading in chunks of up to 4096 bytes, and there is no guarantee that the cut-off won't fall in the middle of a <text> element, in which case your str_replace would not work.

Have you considered using a regular expression?
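Something along these lines, for instance (just a sketch; the pattern assumes each <text>...</text> pair sits on one line, and $buffer is the line read from the file):

<?php
// Wrap the body of any <text> element that doesn't already start with
// a CDATA section, in a single pass.
$buffer = preg_replace(
  '#<text>(?!<!\[CDATA\[)(.*?)</text>#',
  '<text><![CDATA[$1]]></text>',
  $buffer
);
?>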

jeyoung