Suppose you have a database of users that is constantly changing and growing. You want to do something to each user every 24 hours, without fail. It doesn't matter what that something is: maybe sending each of them an email, or maybe something more complex that takes an unpredictable length of time.

For a small site, I can imagine a simple cron job, or even a continually running process that iterates over the database in chunks and processes each user in turn. With a small database you can be confident that all users will be processed in time. I even tried a technique whereby each user was processed every day at the time of day they registered, but this spread the load unevenly, with peaks and troughs.
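For reference, the chunked approach looks roughly like the sketch below. It is only an illustration: `fetch_users_after` and `process_user` are hypothetical stand-ins for the real database query and the per-user task, and paging by "id greater than the last one seen" is my own assumption, chosen so that rows added mid-pass don't shift the pages.

```python
from typing import Callable, Iterator, List

# Hypothetical signature: fetch_users_after(last_id, limit) returns up to
# `limit` user IDs greater than `last_id`, in ascending order.
FetchFn = Callable[[int, int], List[int]]


def iter_chunks(fetch_users_after: FetchFn, chunk_size: int = 1000) -> Iterator[List[int]]:
    """Yield chunks of user IDs, paging by 'id > last seen id'."""
    last_id = 0
    while True:
        chunk = fetch_users_after(last_id, chunk_size)
        if not chunk:
            return
        yield chunk
        last_id = chunk[-1]


def run_daily_pass(fetch_users_after: FetchFn,
                   process_user: Callable[[int], None],
                   chunk_size: int = 1000) -> int:
    """Process every user once, in order; return how many were processed."""
    processed = 0
    for chunk in iter_chunks(fetch_users_after, chunk_size):
        for user_id in chunk:
            process_user(user_id)
            processed += 1
    return processed


# In-memory stand-in for the database (assumption, for illustration only).
USERS = list(range(1, 26))


def fetch_from_memory(last_id: int, limit: int) -> List[int]:
    return [u for u in USERS if u > last_id][:limit]
```

A single pass like this is sequential, so the total time grows with the number of users and with how long each `process_user` call takes, which is exactly where it stops being reliable.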

With a large database, and more importance placed on completing the task in time, none of the above scales. What is the best way to achieve this? I'm thinking roughly of something involving multiple workers and a message queueing system like RabbitMQ.
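To make the idea concrete, here is a minimal sketch of the shape I'm imagining. It uses Python's standard `queue` and `threading` modules purely as a stand-in for a real broker such as RabbitMQ, and `run_workers` and `process_user` are hypothetical names, so treat it as an illustration of the pattern rather than a working design.

```python
import queue
import threading
from typing import Callable, Iterable


def run_workers(user_ids: Iterable[int],
                process_user: Callable[[int], None],
                num_workers: int = 4) -> None:
    """Enqueue one job per user, then let several workers drain the queue."""
    jobs: "queue.Queue[int]" = queue.Queue()
    for user_id in user_ids:
        jobs.put(user_id)

    def worker() -> None:
        # All jobs are enqueued before the workers start, so an empty
        # queue means there is nothing left to do.
        while True:
            try:
                user_id = jobs.get_nowait()
            except queue.Empty:
                return
            try:
                process_user(user_id)
            finally:
                jobs.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

With a real broker the producer and the workers would be separate processes, possibly on separate machines, and slow users would only hold up the one worker handling them.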

Examples of sites that do this, and the methods/tech they use, would be greatly appreciated.