I've been using the Batch API successfully to do processing that would normally lead to PHP timeouts or out of memory errors, and it's been working nicely.

I've looked through the code a little, but I'm still unclear about what's happening behind the scenes.

Could someone familiar with the process describe how it works?

A: 

From a great example implementation:

Each batch operation callback will iterate over and over until $context['finished'] is set to 1. After each pass, batch.inc will check its timer and see if it is time for a new HTTP request, i.e. when more than 1 second has elapsed since the last request.

An entire batch that processes very quickly might only need a single http request even if it iterates through the callback several times, while slower processes might initiate a new http request on every iteration of the callback.

This means you should set your processing up to do in each iteration only as much as you can do without a php timeout, then let batch.inc decide if it needs to make a fresh http request.

In other words: you must split up your batch of tasks into chunks (or single tasks) that won't time out. Drupal will end its current call and open a new HTTP request if it sees the PHP timeout nearing.
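
To make the iteration concrete, here is a minimal, Drupal-free sketch of that loop. The function names (example_operation, example_process_request) and the chunk size of 5 are made up for illustration, and the one-second time budget is an assumption; the real loop in batch.inc does more (multiple operations, progress messages, persistence), so treat this as an approximation only. The $context keys 'sandbox', 'results' and 'finished' are the ones the Batch API passes to an operation callback.

```php
<?php
// Hypothetical operation callback: handles a small chunk per call and
// reports progress through $context['finished'] (a value from 0 to 1).
function example_operation(array $items, array &$context): void {
  if (!isset($context['sandbox']['position'])) {
    // First call: set up state that must survive between iterations.
    $context['sandbox']['position'] = 0;
    $context['results'] = [];
  }
  // Do only a small, safe amount of work per iteration.
  $chunk = array_slice($items, $context['sandbox']['position'], 5);
  foreach ($chunk as $item) {
    $context['results'][] = strtoupper($item);
  }
  $context['sandbox']['position'] += count($chunk);
  $total = count($items);
  $context['finished'] = $total === 0 ? 1 : $context['sandbox']['position'] / $total;
}

// Simplified stand-in for what happens inside one HTTP request: call the
// callback repeatedly until it reports completion or the time budget
// (assumed to be one second here) is used up.
function example_process_request(array $items, array &$context, float $budget_seconds = 1.0): bool {
  $start = microtime(true);
  do {
    example_operation($items, $context);
    $done = $context['finished'] >= 1;
  } while (!$done && (microtime(true) - $start) < $budget_seconds);
  return $done;
}

$items = range('a', 'l'); // 12 items to process
$context = ['sandbox' => [], 'results' => [], 'finished' => 0];
$requests = 0;
do {
  // Each pass of this outer loop stands in for a fresh HTTP request.
  $requests++;
  $done = example_process_request($items, $context);
} while (!$done);
```

Because each call to the callback is cheap, the whole list is normally finished inside the first simulated request; a slower callback would hit the time budget and force more passes of the outer loop.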

berkes
Thanks for the answer. That great documentation is what I used to get things working. What I'm wondering is more how it works than how to use it.
wynz
I highlighted the parts that answer your question. Or did I misunderstand your question?
berkes
A: 

I've looked through the code a little, but I'm still unclear about what's happening behind the scenes.

Could someone familiar with the process describe how it works?

What happens is that, to avoid PHP timeouts, the browser periodically requests (through AJAX) the URL http://example.com/batch?id=$id, and each of those requests causes another portion of the batch operations to be executed.
See _batch_page(), which is the function called by system_batch_page(), the menu callback for the /batch URL.
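
A rough sketch of that request cycle, outside Drupal, might look like the following. Everything here is hypothetical: example_batch_create() and example_batch_ping() are invented names, a plain array stands in for whatever storage keeps the batch state between requests, and a fixed number of items per ping replaces the timer that decides when a real request stops working.

```php
<?php
// Stand-in for persistent storage of batch state between requests.
$store = [];

// Hypothetical: register a batch and return its id.
function example_batch_create(array &$store, array $items): int {
  $id = count($store) + 1;
  $store[$id] = serialize(['items' => $items, 'position' => 0, 'results' => []]);
  return $id;
}

// Hypothetical handler for one ping of the batch URL: load the saved state,
// do a bounded amount of work, save the state again, and return the
// percentage completed so the client knows whether to ping again.
function example_batch_ping(array &$store, int $id, int $per_request = 4): int {
  $batch = unserialize($store[$id]);
  $chunk = array_slice($batch['items'], $batch['position'], $per_request);
  foreach ($chunk as $item) {
    $batch['results'][] = $item * 2;
  }
  $batch['position'] += count($chunk);
  $store[$id] = serialize($batch);
  $total = count($batch['items']);
  return $total === 0 ? 100 : (int) floor(100 * $batch['position'] / $total);
}

// This loop plays the browser's role: keep pinging until the server
// reports 100%.
$id = example_batch_create($store, range(1, 10));
$pings = 0;
do {
  $pings++;
  $percent = example_batch_ping($store, $id);
} while ($percent < 100);
```

The point of the design is that no single request ever has to finish the whole job: each one picks up from the saved position, so the work only advances while the client keeps asking.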

kiamlaluno
Aaah, I get it now. So new operations keep starting as long as the browser sits there pinging. I see there's even a little trick for users without JavaScript, using a <meta http-equiv="Refresh"> tag. And to save the job status in between operations, it looks like the job gets saved to the batch table in the database. It's all clear now, thanks :)
wynz
