I've got a client with a Magento shop. They create a txt file containing all of their products to upload to googlebase, but because of the quantity of products (20k), the script bombs out once it has consumed about 1GB of memory. It's being run via cron.

Is there a way to either zip or segment the array, or write it to the file as it's created, rather than creating the whole array and then writing it?

<?php
define('SAVE_FEED_LOCATION', '/home/public_html/export/googlebase/google_base_feed_cron.txt');

set_time_limit(0);

require_once '/home/public_html/app/Mage.php';
Mage::app('default');

try{
    $handle = fopen(SAVE_FEED_LOCATION, 'w');


    $heading = array('id','title','description','link','image_link','price','product_type','condition','c:product_code');
    $feed_line=implode("\t", $heading)."\r\n";
    fwrite($handle, $feed_line);

    $products = Mage::getModel('catalog/product')->getCollection();
    $products->addAttributeToFilter('status', 1);//enabled
    $products->addAttributeToFilter('visibility', 4);//catalog, search
    $products->addAttributeToFilter('type_id', 'simple');//simple only (until fix is made)
    $products->addAttributeToSelect('*');
    $prodIds=$products->getAllIds();

    foreach($prodIds as $productId) {

        $product = Mage::getModel('catalog/product'); 

        $product->load($productId);

        $product_data = array();
        $product_data['sku']=$product->getSku();
        $product_data['title']=$product->getName();
        $product_data['description']=$product->getShortDescription();
        $product_data['link']=$product->getProductUrl(). '?source=googleps';
        $product_data['image_link']=Mage::getBaseUrl(Mage_Core_Model_Store::URL_TYPE_MEDIA).'catalog/product'.$product->getImage();

        // Get price of item
        if ($product->getSpecialPrice()) {
            $product_data['price'] = $product->getSpecialPrice();
        } else {
            $product_data['price'] = $product->getPrice();
        }


        $product_data['product_type']='';
        $product_data['condition']='new';
        $product_data['c:product_code']=$product_data['sku'];


        foreach($product->getCategoryIds() as $_categoryId){
            $category = Mage::getModel('catalog/category')->load($_categoryId);
            $product_data['product_type'].=$category->getName().', ';
        }
        $product_data['product_type']=rtrim($product_data['product_type'],', ');



        //sanitize data
        foreach ($product_data as $k => $val) {
            $bad  = array('"', "\r\n", "\n", "\r", "\t");
            $good = array('', ' ', ' ', ' ', '');
            $product_data[$k] = '"' . str_replace($bad, $good, $val) . '"';
        }


        $feed_line = implode("\t", $product_data)."\r\n";
        fwrite($handle, $feed_line);
        fflush($handle);
    }

    //---------------------- CLOSE THE FEED FILE
    fclose($handle);

}
catch(Exception $e){
    die($e->getMessage());
}

?>

A: 

I have two quick answers here:

1) Try to increase PHP's allowed maximum memory size for the command line (since it is a cron script, the CLI configuration applies, not the web server's).
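For example (the 512M value, the script path and the schedule below are only placeholders, so adjust them to your server), the limit can be raised either at the top of the script or directly in the crontab entry:

// Option 1: raise the limit inside the script itself
ini_set('memory_limit', '512M');

// Option 2: pass the limit to the CLI binary from cron instead, e.g.
// 0 3 * * * php -d memory_limit=512M /home/public_html/export/googlebase/google_base_feed_cron.php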

2) The way the senior developers where I currently work solve similar issues is something like the following:

Create a date attribute with a name like googlebase_uploaded, and have the cron script process at most a fixed number of products per run (something like const MAX_PRODUCTS_TO_WRITE). Then append to the file and flag each product that got appended.

What I am trying to say is: slice the work into smaller chunks that won't break the script.
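A rough sketch of that idea, reusing the SAVE_FEED_LOCATION constant and Mage bootstrap from the question's script. The attribute name googlebase_uploaded, the batch size and the left-join filter are my assumptions, not tested code:

define('MAX_PRODUCTS_TO_WRITE', 1000); // illustrative batch size

// Fetch only products that have not been flagged yet, one batch at a time.
$products = Mage::getModel('catalog/product')->getCollection()
    ->addAttributeToFilter('status', 1)
    ->addAttributeToFilter('visibility', 4)
    ->addAttributeToFilter('type_id', 'simple')
    ->addAttributeToFilter('googlebase_uploaded', array('null' => true), 'left')
    ->setPageSize(MAX_PRODUCTS_TO_WRITE)
    ->setCurPage(1);

$handle = fopen(SAVE_FEED_LOCATION, 'a'); // append instead of overwriting

foreach ($products as $item) {
    $product = Mage::getModel('catalog/product')->load($item->getId());

    // ... build $product_data and fwrite() the feed line exactly as in the original script ...

    // Flag the product so the next cron run skips it.
    $product->setGooglebaseUploaded(date('Y-m-d H:i:s'))->save();
}

fclose($handle);

Each run then touches at most MAX_PRODUCTS_TO_WRITE products, so memory usage and execution time stay bounded no matter how large the catalogue grows.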

Unfortunately this is where I miss Java and C#.

dimitris mistriotis