tags:

views:

4866

answers:

5

I have code looking something like this:

$data = file_get_contents($tempFile); // perhaps 30MB of file data, now in PHP's memory
$hash = md5($data);
$query = "INSERT INTO some_table
          SET BlobData = '" . mysql_real_escape_string($data) . "',
          BlobHash = '$hash'
          ";
mysql_query($query);

I know this isn't very efficient as each of the '.' operators will reallocate a bigger memory block and the 30MB string will be copied several times.

Is there anything more efficient than the following solution?

$data = file_get_contents($tempFile); // perhaps 30MB of file data, now in PHP's memory
$hash = md5($data);
$query = "INSERT INTO some_table SET BlobData = '%s', BlobHash = '$hash'";
mysql_query(sprintf($query, mysql_real_escape_string($data)));
A: 

Have you benchmarked the output buffering trick?

ob_start();
echo 'INSERT INTO some_table SET BlobData = \'', mysql_real_escape_string( $data ), '\', BlobHash = \'', $hash, '\'';
mysql_query( ob_get_clean() );

Another thing you could do is convert to mysqli or MDB2, which support bound parameters. That would allow you to skip the mysql_real_escape_string call and the string concatenations.

Don Neufeld
A: 

Well if memory usage is your problem, you can read the large file in chunks and insert these chunks with CONCAT into the database field.

Sample code (not tested):

    $id = 1337;
$h = fopen("path/to/file.ext", "r");
while (!feof($h)) 
 {
 $buffer = fread($h, 4096);
 $sql = "UPDATE table SET my_field = CONCAT(my_field, '" . mysql_real_escape_string($buffer) . "') WHERE Id = " . $id;
 mysql_query($sql);
 }

This method will be slower but you'll require only 4Kb of your memory.

Joe Scylla
+3  A: 

If you are using PDO, and prepared statments you can use the PDO::PARAM_LOB type. See example #2 on the LOB page showing how to insert an image to a database using the file pointer.

http://us2.php.net/manual/en/pdo.lobs.php

Zoredache
Problem with PDO is it only emulates prepared statements if you do not have the mysqli library. So all this does is move the concatenate into the PDO library.
jmucchiello
A: 

Would it make any difference if you didn't put the query into another variable, but instead passed it directly into the MySQL command. I've never tried this, but it may make a difference as the whole string isn't stored in another variable.

Darryl Hein
+4  A: 

You have two issues here:

#1, there are several different ways you can compute the MD5 hash:

  • Do as you do and load into PHP as a string and use PHP's md5()
  • Use PHP's md5_file()
  • As of PHP 5.1+ you can use PHP's streams API with either of md5 or md5_file to avoid loading entirely into memory
  • Use exec() to call the system's md5sum command
  • Use MySQL's MD5() function to compute the hash

Since these are all trivial to implement it would be easy for you to implement and benchmark them all for memory usage and speed. Here are some benchmarks showing system md5 via exec to be a lot faster than PHP's md5_file as file size increases. Doing it your way is definitely the worst way as far as memory usage is concerned.

#2, mysql_real_escape_string performs a database query, so you're actually transmitting your blob data to the database, getting it back as a string, and transmitting it again(!) with the INSERT query. So it's traveling to/from the DB server 3x instead of 1x and using 2x the memory in PHP.

It's going to be more efficient to use PHP5 prepared statements and only send this data to the database once. Read the linked article section, you'll see it mentions that when you are binding parameters, you can use the blob type to stream blob data to the DB in chunks. The PHP docs for mysqli_stmt::send_long_data have a great simple example of this that INSERTs a file into a blob column just like you are.

By doing that, and by using either the streams API, md5_file or exec with the system md5 command, you can do your entire INSERT without ever loading the entire file into memory, which means memory usage for your series of operations can be as low as you want!

joelhardi