views: 2500
answers: 2

Hi all,

My PHP web application has an API that can receive reasonably large files (up to 32 MB) which arrive base64 encoded. The goal is to write these files, decoded of course, somewhere on my filesystem. What would be the least resource-intensive way of doing this?

Edit: Receiving the files through an API means that I have a 32 MB string in my PHP app, not a 32 MB source file somewhere on disk. I need to get that string decoded and onto the filesystem.

Using PHP's own base64_decode() isn't cutting it because it uses a lot of memory, so I keep running into PHP's memory limit (I know I could raise that limit, but I don't feel good about allowing PHP to use 256 MB or so per process).

Any other options? Could I do it manually? Or write the file to disk encoded and call some external command? Any thoughts?

+7  A: 

Decode the data in smaller chunks. Four characters of Base64 data equal three bytes of “Base256” data.

So you could read the input in chunks of 1024 characters and decode each chunk to 768 octets of binary data (the chunk size must be a multiple of four so that no base64 quantum is split across chunks):

$chunkSize = 1024; // must be a multiple of 4 so each chunk decodes cleanly
$src = fopen('base64.data', 'rb');
$dst = fopen('binary.data', 'wb');
while (!feof($src)) {
    // decode one chunk at a time; only $chunkSize bytes are in memory at once
    fwrite($dst, base64_decode(fread($src, $chunkSize)));
}
fclose($dst);
fclose($src);
Gumbo
Thanks. One thing before I mark this as accepted: in my original question I mention that the source file comes through an API. So it's a variable (a 32 MB string) in PHP, not a file you read from. Is there something I can use instead of your fread() that returns chunks of a string efficiently, i.e. without making too many duplicate copies that gobble up memory?
Sander Marechal
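A minimal sketch of the same chunking idea applied to an in-memory string, assuming the payload is already in a variable $data (the output file name is illustrative). substr() copies out only one chunk at a time, so the only large allocation is the source string itself:

$chunkSize = 1024; // must be a multiple of 4 so each slice decodes cleanly
$dst = fopen('binary.data', 'wb');
for ($i = 0, $len = strlen($data); $i < $len; $i += $chunkSize) {
    // slice off one chunk, decode it, and append it to the output file
    fwrite($dst, base64_decode(substr($data, $i, $chunkSize)));
}
fclose($dst);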
You can read from the input via `php://input`. See http://docs.php.net/manual/en/wrappers.php.php
Gumbo
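A minimal sketch of reading the raw request body that way, assuming the entire POST body is the base64 payload (for a structured body such as XML-RPC, see the parser sketch further down):

$src = fopen('php://input', 'rb');
$dst = fopen('binary.data', 'wb');
while (!feof($src)) {
    // php://input streams the raw POST body, so nothing is buffered in a variable
    fwrite($dst, base64_decode(fread($src, 1024)));
}
fclose($src);
fclose($dst);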
+4  A: 

Even though this has an accepted answer, I have a different suggestion.

If you are pulling the data from an API, you should not store the entire payload in a variable. With curl or other HTTP fetchers you can stream the data straight into a file.

Assuming you are fetching the data through a simple GET URL:

$url = 'http://www.example.com/myfile.base64';
$target = 'localfile.data';

// the filter decodes transparently as the remote stream is read
$rhandle = fopen($url, 'r');
stream_filter_append($rhandle, 'convert.base64-decode');

$whandle = fopen($target, 'w');

// copies in internal chunks; the whole payload never sits in memory
stream_copy_to_stream($rhandle, $whandle);
fclose($rhandle);
fclose($whandle);
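Note that fopen() on an HTTP URL requires the allow_url_fopen ini setting to be enabled.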

Benefits:

  • Should be faster (less copying of huge variables)
  • Very little memory overhead

If you must grab the data from a temporary variable, I can suggest this approach:

$data = 'your base64 data';
$target = 'localfile.data';

$whandle = fopen($target, 'w');
// the filter decodes transparently as data is written to the file
stream_filter_append($whandle, 'convert.base64-decode', STREAM_FILTER_WRITE);

// fwrite() writes through the handle; file_put_contents() expects a path, not a resource
fwrite($whandle, $data);

fclose($whandle);
Evert
A nice idea, but not what I am looking for. In my case, client applications are pushing big files over XML-RPC (HTTP POST) to my server (along with a couple of other parameters). Clients can be behind NAT and firewalls, so fetching the data from the client using GET is not possible.
Sander Marechal
If the structure of the XML-RPC response is somewhat static, you could manually traverse the response body, so you can still completely avoid all the memory usage. If you must put the data in a temporary variable, you can change the setup a bit (I'm updating the example right after this comment ;)).
Evert
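A minimal sketch of that manual traversal, assuming an XML-RPC POST body read from php://input and the spec's <base64> element; the output file name and 8 KB read size are illustrative. The expat-based xml_parser_* API delivers character data in arbitrary pieces, so the handler buffers base64 characters and decodes only whole quanta (multiples of four):

$inBase64 = false;
$buffer   = ''; // pending base64 chars, decoded four at a time
$out      = fopen('upload.decoded', 'wb');

$parser = xml_parser_create();

xml_set_element_handler(
    $parser,
    function ($p, $name, $attribs) use (&$inBase64) {
        if ($name === 'BASE64') { // expat upper-cases element names by default
            $inBase64 = true;
        }
    },
    function ($p, $name) use (&$inBase64, &$buffer, $out) {
        if ($name === 'BASE64') {
            fwrite($out, base64_decode($buffer)); // flush the final partial chunk
            $buffer   = '';
            $inBase64 = false;
        }
    }
);

xml_set_character_data_handler($parser, function ($p, $data) use (&$inBase64, &$buffer, $out) {
    if (!$inBase64) {
        return;
    }
    // strip whitespace, then decode only whole base64 quanta (multiples of 4)
    $buffer   .= preg_replace('/\s+/', '', $data);
    $decodable = strlen($buffer) - (strlen($buffer) % 4);
    if ($decodable > 0) {
        fwrite($out, base64_decode(substr($buffer, 0, $decodable)));
        $buffer = substr($buffer, $decodable);
    }
});

// feed the request body to the parser in small pieces
$src = fopen('php://input', 'rb');
while (!feof($src)) {
    xml_parse($parser, fread($src, 8192));
}
xml_parse($parser, '', true);
xml_parser_free($parser);
fclose($src);
fclose($out);

Peak memory stays near the 8 KB read buffer regardless of the payload size.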
Thanks for the update. I find it superior to the answer I originally accepted.
Sander Marechal
In regards to your second example: would this differ at all if you had to accept a file upload via POST and push it from your server to a SOAP server?
Chris