Q: 

find_in_batches

How can I implement find_in_batches in a CakePHP model?

A: 

Just an idea...

function find_in_batches($value, $options){
    foreach($options as $option){
        switch($option){
            // Here you can add possible options
            case 'option_1':
                if($value != 'something') return false;
                break;
            case 'option_2':
                if($value > 5) return false;
                break;
        }
    }
    return true;
}

$array = array(
    'Value_1' => array('option_1', 'option_2'),
    'Value_2' => array('option_2')
);

foreach($array as $value => $options){
    if(find_in_batches($value, $options)){
        // Do something with this value.
    }
}
Harmen
Sorry, but what's this supposed to do…?
deceze
A: 

The difficult part is that PHP has little support for lambda-style functions, and that the records returned from the database are plain arrays of data, not objects with methods whose effects would be reflected in the database. As such, writing a meaningful, general-purpose find_in_batches method that iterates a callback over database records in the model is tricky.

The general idea would look like this:

$total = $this->find('count');
$counter = 0;
$batch_size = 1000;

// Use < rather than <=: once $counter reaches $total there is nothing left to fetch.
while ($counter < $total) {
    $records = $this->find('all', array('limit' => $batch_size, 'offset' => $counter));

    foreach ($records as $record) {
        // do something with $record
    }

    $counter += $batch_size;
}

The problem is the // do something with $record part. To do anything interesting with it, you'll probably have to call the function like this:

// Assuming PHP 5.3+
$Model->find_in_batches(function ($record) {
    // do something
    $Model->save($record);
});

The problem is that $Model won't be in scope inside the function unless you explicitly import it with a use ($Model) clause or pass it in as an argument, both of which are rather clunky. Not to mention that any PHP version older than 5.3 needs an even clunkier syntax just to get a callback passed at all.
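
For what it's worth, here is a rough sketch of what the callback version could look like on PHP 5.3+, with the model imported into the closure via use. FakeModel and its in-memory rows are made up purely so the snippet runs on its own; a real Cake model would run find('count') and find('all') against the database, and the $batch_size parameter is my own addition.

```php
<?php
// Hypothetical stand-in for a Cake model, backed by an array so the sketch runs anywhere.
class FakeModel {
    public $rows = array();
    public $saved = array();

    public function __construct($n) {
        for ($i = 1; $i <= $n; $i++) {
            $this->rows[] = array('id' => $i);
        }
    }

    // Mimics only the two find types used here: 'count' and 'all' with limit/offset.
    public function find($type, $options = array()) {
        if ($type === 'count') {
            return count($this->rows);
        }
        $limit  = isset($options['limit'])  ? $options['limit']  : count($this->rows);
        $offset = isset($options['offset']) ? $options['offset'] : 0;
        return array_slice($this->rows, $offset, $limit);
    }

    // Just remembers which ids were "saved".
    public function save($record) {
        $this->saved[] = $record['id'];
        return true;
    }

    // Fetches $batch_size records at a time and hands each one to $callback.
    public function find_in_batches($callback, $batch_size = 1000) {
        $total = $this->find('count');
        for ($offset = 0; $offset < $total; $offset += $batch_size) {
            $records = $this->find('all', array('limit' => $batch_size, 'offset' => $offset));
            foreach ($records as $record) {
                call_user_func($callback, $record);
            }
        }
    }
}

$Model = new FakeModel(25);
// "use ($Model)" imports the variable into the closure's scope.
$Model->find_in_batches(function ($record) use ($Model) {
    $Model->save($record);
}, 10);
```

It works, but you still have to remember the use clause every time, which is the clunkiness I mean.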


The best solution for Cake would probably be:

Model:

var $batch_offset = 0;
var $batch_size = 1000;

function start_batch($offset = 0, $size = 1000) {
    $this->batch_offset = $offset;
    $this->batch_size = $size;
}

function find_in_batches($conditions = array()) {
    $records = $this->find('all', array(
        'conditions' => $conditions,
        'limit'      => $this->batch_size,
        'offset'     => $this->batch_offset
    ));
    $this->batch_offset += $this->batch_size;
    return $records;
}

Controller:

$this->Model->start_batch();
while ($records = $this->Model->find_in_batches()) {
    foreach ($records as $record) {
        // do something
        $this->Model->save($record);
    }
}

Pretty ugly if you ask me.


You could of course go crazy with Iterator objects that would fetch new batches of data as necessary internally but just expose a normal array externally, but that wouldn't easily match with Cake's models.
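
Just to illustrate the Iterator idea, here is a minimal sketch. BatchIterator is a made-up class, not anything Cake ships, and the $fetch callback with its ($limit, $offset) signature is my own assumption. In a model it would presumably wrap find('all') with 'limit' and 'offset'; here it slices a plain array so the snippet is self-contained.

```php
<?php
// Hypothetical iterator: pulls batches through $fetch($limit, $offset) on demand.
// The #[\ReturnTypeWillChange] lines silence PHP 8.1+ deprecation notices;
// older PHP versions treat them as ordinary # comments.
class BatchIterator implements Iterator {
    private $fetch;
    private $size;
    private $offset = 0;        // offset of the current batch
    private $batch = array();   // records of the current batch
    private $index = 0;         // position inside the current batch

    public function __construct($fetch, $size = 1000) {
        $this->fetch = $fetch;
        $this->size = $size;
    }

    // Loads the batch starting at the current offset.
    private function load() {
        $this->batch = call_user_func($this->fetch, $this->size, $this->offset);
        $this->index = 0;
    }

    #[\ReturnTypeWillChange]
    public function rewind() {
        $this->offset = 0;
        $this->load();
    }

    #[\ReturnTypeWillChange]
    public function valid() {
        return $this->index < count($this->batch);
    }

    #[\ReturnTypeWillChange]
    public function current() {
        return $this->batch[$this->index];
    }

    #[\ReturnTypeWillChange]
    public function key() {
        return $this->offset + $this->index;
    }

    #[\ReturnTypeWillChange]
    public function next() {
        $this->index++;
        // Only fetch again if the batch was full; a short batch means we hit the end.
        if ($this->index >= count($this->batch) && count($this->batch) === $this->size) {
            $this->offset += $this->size;
            $this->load();
        }
    }
}

$rows = range(1, 25);
$calls = 0;
$fetch = function ($limit, $offset) use ($rows, &$calls) {
    $calls++;
    return array_slice($rows, $offset, $limit);
};

$seen = array();
foreach (new BatchIterator($fetch, 10) as $key => $row) {
    $seen[$key] = $row;
}
```

The calling code gets an ordinary foreach, but that's a lot of machinery for what is really just a loop with an offset.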

Overall it doesn't seem like a programming style suitable to Cake/PHP and you're probably better off writing a loop like this every time you need it. Or just stick with Rails. ;)

deceze