tags:

views:

90

answers:

3

Using the code below each image download) file_get_contents() ) takes on average 8-15 seconds.....

If I do not use a context on file_get_contents() then image download is less than a second.

If I change the $opts to, below then i get same performance as file_get_contents() without a context which takes appox 13 seconds to process 2,500 imagesx.

$opts = array(
    'http'=>array(
        'protocol_version'=>'1.1',
        'method'=>'GET',
        'header'=>array(
            'Connection: close'
        ),
        'user_agent'=>'Image Resizer'
     )
); 

Reproduce:

    $start_time = mktime();
$products = array(
        array( 'code'=>'A123', 'image_url'=>'http://www.google.com/intl/en_ALL/images/srpr/logo1w.png' ),
        array( 'code'=>'A124', 'image_url'=>'http://www.google.com/intl/en_ALL/images/srpr/logo1w.png' ),
        array( 'code'=>'A125', 'image_url'=>'http://www.google.com/intl/en_ALL/images/srpr/logo1w.png' ),
        array( 'code'=>'A126', 'image_url'=>'http://www.google.com/intl/en_ALL/images/srpr/logo1w.png' ),
        array( 'code'=>'A127', 'image_url'=>'http://www.google.com/intl/en_ALL/images/srpr/logo1w.png' ),
        array( 'code'=>'A128', 'image_url'=>'http://www.google.com/intl/en_ALL/images/srpr/logo1w.png' ),
        array( 'code'=>'A134', 'image_url'=>'http://www.google.com/intl/en_ALL/images/srpr/logo1w.png' ),
        array( 'code'=>'A135', 'image_url'=>'http://www.google.com/intl/en_ALL/images/srpr/logo1w.png' ),
        array( 'code'=>'A146', 'image_url'=>'http://www.google.com/intl/en_ALL/images/srpr/logo1w.png' ),
        array( 'code'=>'A165', 'image_url'=>'http://www.google.com/intl/en_ALL/images/srpr/logo1w.png' )
    );

    if ( count( $products ) > 0 ) {
        $opts = array(
            'http'=>array(
                'protocol_version'=>'1.1',
                'method'=>'GET',
                'user_agent'=>'Image Resizer'
            )
        ); 
        $context = stream_context_create($opts);
        $def_width = 100;
        $max_width  = $def_width;
        foreach( $products as $product ) {
            $code = $product['code'];
            $folder = substr( $code, 0, 3 );
            echo( 'Looking at: ' .$product['code'] ."<br />" );
            $file = '/tmp/' .$folder .'/' .$code .'_' .$def_width .'.jpg';
            $filemtime = @filemtime($file);
            $gen_file = true;
            if ( $filemtime !== false ) {
                $file_age = (time() - $filemtime);
                if ( $file_age <= 300 ) {
                    $gen_file = false;
                }
            }
            echo( '&nbsp;&nbsp;&nbsp;&nbsp;File not cached or cached file has expired<br />' );
            if ( $gen_file ) {
                echo( '&nbsp;&nbsp;&nbsp;&nbsp;Getting File...');
                $imgStr = file_get_contents( $product['image_url'], false, $context );
                $img = @imagecreatefromstring($imgStr);
                if ( is_resource( $img ) ) {
                    echo( 'DONE' .'<br />' );
                    $image_x = imagesx($img);
                    $image_y = imagesy($img);
                    if ( $def_width >= $image_x ) {
                        $def_width = $image_x;
                    }
                    echo( '&nbsp;&nbsp;&nbsp;&nbsp;Calculating Scale<br />' );
                    $ts = min( $max_width/$image_x,$max_width/$image_y);
                    $thumbhght = $ts * $image_y;
                    $thumbwth = $ts * $image_x;

                    $thumb_image_resized = imagecreatetruecolor( $thumbwth, $thumbhght);
                    imagecopyresampled($thumb_image_resized, $img, 0, 0, 0, 0, $thumbwth, $thumbhght, $image_x, $image_y );
                    echo( '&nbsp;&nbsp;&nbsp;&nbsp;Checking For Directory<br />' );
                    if ( !is_dir( '/tmp/' .$folder ) ) {
                        mkdir( '/tmp/' .$folder );
                    }
                    echo( '&nbsp;&nbsp;&nbsp;&nbsp;Writing File<br />' );
                    $new_file = '/tmp/' .$folder .'/' .$code .'_' .$def_width .'.jpg';

                    imagejpeg( $thumb_image_resized, $new_file, 100);
                    echo( '&nbsp;&nbsp;&nbsp;&nbsp;DONE<br />' );

                    imagedestroy($img);
                    imagedestroy($thumb_image_resized);
                } else {
                    echo( 'Problem Getting Image<br />' );
                    die();
                }
            } else {
                echo( '&nbsp;&nbsp;&nbsp;&nbsp;Already Exists<br />' );
            }
        }
    }
    $end_time = mktime();
    echo( 'Completed In...' .($end_time - $start_time ) .' seconds(s)<br />' );
A: 

Your context tells file_get_contents() to close the HTTP connection every time. Perhaps that is why the code takes so long, as it closes and re-opens connections many times? I am not familiar with the internals of file_get_contents(), but you might be able to tweak the context to use "Connection: keep-alive" for all but the last connection, and "Connection: close" for the last.

Kai Chan
I would expect `file_get_contents()` to close its connection regardless -- it closes file handles when reading files from disk. If you want performance, cURL is a better bet. You can simply reuse the same curl handle over and over; the connection is kept open by default, if I recall correctly.
Frank Farmer
I agree with you about using cURL, unless there is a reason dorgan has to use file_get_contents().
Kai Chan
+3  A: 

HTTP 1.1 requests are pipelined by default. If you don't "Connection: Close", it assumes "Connection: Keep-Alive", and then you have to wait for the connection to time out (since you never explicitly closed it) before the next loop will start.

TML
+1  A: 

I ended up using http://code.google.com/p/php-pipeline which gives me the ability to pipeline on multiple connections.

dorgan