Hi all,
I am building a website in CakePHP that processes files uploaded through an XML-RPC API and through a web frontend. Files need to be scanned by ClamAV, thumbnails need to be generated, and so on. This is all resource-intensive work that takes time, and the user should not have to wait for it. So, I am looking into asynchronous processing with PHP in general and CakePHP in particular.
I came across the MultiTask plugin for CakePHP, which looks promising. I also came across various message queue implementations such as dropr and beanstalkd. Of course, I will also need a background process of some kind, probably implemented as a Cake Shell. I saw that MultiTask uses PHP_Fork to implement a multithreaded PHP daemon.
I need some advice on how to fit all these pieces together in the best way.
- Is it a good idea to have a long-running daemon written in PHP? What should I watch out for?
- What are the advantages of external message queue implementations? The MultiTask plugin does not use an external message queue. It rolls its own, using a MySQL table to store tasks.
- What message queue should I use? dropr? beanstalkd? Something else?
- How should I implement the backend processor? Is a forking PHP daemon a good idea or just asking for trouble?
My current plan is either to use the MultiTask plugin as-is or to modify it to use beanstalkd instead of its own MySQL table implementation. Jobs in the queue can simply consist of a task name and an array of parameters. The PHP daemon would watch for incoming jobs and hand them to one of its child threads. These would simply execute the CakePHP Task with the given parameters.
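To make the plan concrete, here is a minimal sketch of the job format and the dispatch step as I imagine them. The function names (encodeJob, dispatchJob), the task names and the in-memory SplQueue are all made up for illustration. The real queue would be the MultiTask MySQL table or beanstalkd, the handlers would be CakePHP Tasks, and the forking part is left out entirely.

```php
<?php
// Hypothetical helper names; none of this comes from MultiTask or beanstalkd.

// Serialize a job as a task name plus an array of parameters.
function encodeJob(string $task, array $params): string
{
    return json_encode(['task' => $task, 'params' => $params]);
}

// Decode a job payload and run the handler registered for its task name.
function dispatchJob(string $payload, array $handlers)
{
    $job = json_decode($payload, true);
    if (!is_array($job)
        || !isset($job['task'])
        || !is_string($job['task'])
        || !isset($handlers[$job['task']])
    ) {
        throw new InvalidArgumentException('Unknown or malformed job');
    }
    $params = (isset($job['params']) && is_array($job['params'])) ? $job['params'] : [];
    return call_user_func($handlers[$job['task']], $params);
}

// In-memory stand-in for the real queue (assumption: MySQL table or beanstalkd).
$queue = new SplQueue();
$queue->enqueue(encodeJob('thumbnail', ['file' => 'a.jpg', 'width' => 120]));
$queue->enqueue(encodeJob('virus_scan', ['file' => 'a.jpg']));

// Stand-ins for the CakePHP Tasks that would do the actual work.
$handlers = [
    'thumbnail'  => function (array $p) { return "thumb:{$p['file']}:{$p['width']}"; },
    'virus_scan' => function (array $p) { return "scan:{$p['file']}"; },
];

// The daemon loop, reduced to draining the queue once.
$results = [];
while (!$queue->isEmpty()) {
    $results[] = dispatchJob($queue->dequeue(), $handlers);
}
```

In the real daemon the loop would block waiting for new jobs instead of exiting when the queue is empty, and each job would be handed to a child rather than run inline.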
Any opinions, advice, comments, gotchas or flames on this?