I've been using PHP for quite some time now and I was wondering what this whole "message queue" thing is all about. Let's take Facebook for example. I can update my status, but then that status update has to be shown to all my friends (let's say I have 3000 followers). It's even more work if there are comments and everyone who left a comment has to be notified via email. From the examples I've seen, it appears that all a message queue does is take the "message" (my status update) and put it into some temporary space (filesystem or DB table). I then have a cron job that pulls it out and updates my table.

With that said, how do I go about manipulating that data? I guess I'm getting confused as to how this would really help me. How do I translate the following steps into a message queue and then schedule the function to run at a later time?

1 - Update my status.
2 - Now publish it across my page and all my friends' pages.
3 - If a comment is left, email that latest comment to those who "subscribe" to that comment.

My question is, how do I manipulate that data? Do I just insert the "comment" then have a "job" that pulls that comment out and plug it into a function that processes it?

Here's an example I plan on looking into.

http://www.freeopenbook.com/php-hacks/phphks-CHP-5-SECT-18.html

Thanks in advance.

+3  A: 

My question is, how do I manipulate that data? Do I just insert the "comment" then have a "job" that pulls that comment out and plug it into a function that processes it?

Exactly.
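
To make that concrete, here is a minimal sketch of the insert-then-process pattern, assuming a SQLite table stands in for the "temporary space". It is written in Python for brevity, but the same shape carries over to PHP. The table name `comment_queue`, its columns, and the function names are all made up for illustration; they are not part of any particular queue product.

```python
import sqlite3

def open_queue(path=":memory:"):
    """Create (if needed) a table that acts as the queue. The schema is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS comment_queue ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " body TEXT NOT NULL,"
        " processed INTEGER NOT NULL DEFAULT 0)"
    )
    return db

def enqueue(db, body):
    """Called from the web request: store the message and return immediately."""
    db.execute("INSERT INTO comment_queue (body) VALUES (?)", (body,))
    db.commit()

def run_job(db, handler):
    """Called later (for example from cron): process every pending message in order."""
    rows = db.execute(
        "SELECT id, body FROM comment_queue WHERE processed = 0 ORDER BY id"
    ).fetchall()
    for msg_id, body in rows:
        handler(body)  # the function that actually does the work
        db.execute("UPDATE comment_queue SET processed = 1 WHERE id = ?", (msg_id,))
        db.commit()
    return len(rows)
```

The web request only calls `enqueue`; the scheduled job calls `run_job` with whatever function should handle each message. A dedicated queue service adds delivery guarantees on top of this, but the flow is the same.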

Publishing status updates across Facebook pages probably doesn't involve message queuing - I don't actually know their specific design, but I'd guess that updated data is just supplied on demand via a query when users load their pages. (Unless Facebook has a separate process to denormalize status update data.) [1]

By contrast, sending status update email notifications is a great candidate for message queuing.

A typical implementation would involve writing a new message (generally minimal, perhaps just your user id) to a specific message queue - perhaps the "EmailStatusUpdateNotifications" queue.

Another process then dequeues messages and knows exactly what to do with them. A process dedicated to sending status update email messages would use the user id (the contents of the message) to load your current status and a list of your friends' email addresses, build the email messages, and dispatch them.
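
As a rough sketch of that producer/consumer split (in Python, using the standard library's in-memory `queue.Queue` as a stand-in for a real queue service): the `STATUSES` and `FRIEND_EMAILS` dictionaries are hypothetical placeholders for the application's database, and `send` is whatever function actually dispatches mail.

```python
import queue

# Hypothetical in-memory stand-ins for the application's database.
STATUSES = {42: "Learning about message queues"}
FRIEND_EMAILS = {42: ["ann@example.com", "bob@example.com"]}

# Stand-in for the "EmailStatusUpdateNotifications" queue; a real deployment
# would use a queue service that outlives the web request.
email_status_update_notifications = queue.Queue()

def on_status_changed(user_id):
    """Producer: the message body is just the user id."""
    email_status_update_notifications.put(user_id)

def build_emails(user_id):
    """Use the user id to load the current status and the friends' addresses."""
    status = STATUSES[user_id]
    return [
        {"to": address, "subject": "Status update", "body": status}
        for address in FRIEND_EMAILS.get(user_id, [])
    ]

def drain(q, send):
    """Consumer: dequeue until the queue is empty, dispatching each email."""
    sent = 0
    while True:
        try:
            user_id = q.get_nowait()
        except queue.Empty:
            return sent
        for email in build_emails(user_id):
            send(email)
            sent += 1
```

Because the message carries only the user id, the consumer always loads the current status at send time rather than whatever the status was when the message was written.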

[1] It turns out you can find a lot of good information about Facebook's architecture in "Why are Facebook, Digg, and Twitter so hard to scale?" at High Scalability.

Jeff Sternal
Thank you for the explanation. Can the message queue have multiple "columns"? If I'm storing user_id and the message, my function would need to parse that somehow. Or is it just a flat "file"?
luckytaxi
Exactly (again) - you have control over the content (or body) and format of your messages, and your processor or function just needs to know how to parse them. (Most message queue implementations also offer a variety of serialization formats, but you usually don't need to deal with those directly unless you're browsing the raw queue content.)
Jeff Sternal
Ah, OK. I guess XML would come in handy then, eh?
luckytaxi
Definitely, if you need complex messages and have good library support for it. For simple messages you might just use comma- or semicolon-delimited lists (or JSON, etc.).
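
For instance, a message with a user_id and a message field could be serialized as JSON along these lines. This is only a sketch in Python using the standard `json` module; the field names and the two helper functions are assumptions for illustration.

```python
import json

def encode_message(user_id, text):
    """Producer side: pack the fields into one string for the queue."""
    return json.dumps({"user_id": user_id, "message": text})

def decode_message(raw):
    """Consumer side: unpack the string back into its fields."""
    data = json.loads(raw)
    return data["user_id"], data["message"]
```

One advantage over a hand-rolled delimited list is that the message text can itself contain commas or semicolons without breaking the parser.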
Jeff Sternal
Is this whole concept any different from "job scheduling"? I think Six Apart has a product called Gearman that pushes out jobs to an available resource.
luckytaxi
Message queuing is an asynchronous, pull-oriented technology, so worker processes don't need to be available (or even exist) at the time messages are created. Supporting that workflow entails a lot of additional support for auditing messages and their delivery, providing failure recovery points, and ensuring their transactional integrity (among other things).
Jeff Sternal
+2  A: 

Do I just insert the "comment" then have a "job" that pulls that comment out and plug it into a function that processes it?

One of the points of message queues is to decouple services and to do processing asynchronously.

You'd have a message queue service running. When someone changes their status, you'd send a message to a particular queue. You might do that from the PHP code that runs when a user changes their status.

You'd then have a service or background job running somewhere that pulls messages from that queue. It'd certainly be something external to the PHP process that sent the message in response to an HTTP call. That job pulls messages from the queue and does the message processing, like figuring out who to send the mail to and then sending it.

Now you have a flexible way to handle such mail updates.

  • It'd be quite easy for that mail update service to run on a different machine from the web server.
  • It's easier to scale if you need more power.
  • You could more easily implement delayed sending - e.g. you might wait a minute or two before sending the mail, and if the user updates their status again in the meantime, you send just one mail, not two.
  • Had you done all this within the HTTP request of the user who changed their status, they'd have to wait until all that processing was finished.
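
The delayed-sending idea could look roughly like this inside the background job. It is a sketch in Python under stated assumptions: the `DelayedNotifier` class and its method names are invented for illustration, and time is passed in explicitly as `now` (in seconds) to keep the example simple.

```python
class DelayedNotifier:
    """Hold each user's latest status for a while so bursts produce one mail."""

    def __init__(self, delay_seconds=60):
        self.delay_seconds = delay_seconds
        self.pending = {}  # user_id -> (send_at, latest_status)

    def on_message(self, user_id, status, now):
        # A newer update replaces the pending one and restarts the wait.
        self.pending[user_id] = (now + self.delay_seconds, status)

    def pop_due(self, now):
        """Return (user_id, status) pairs whose wait has elapsed, removing them."""
        due = [
            (user_id, status)
            for user_id, (send_at, status) in self.pending.items()
            if send_at <= now
        ]
        for user_id, _ in due:
            del self.pending[user_id]
        return due
```

The job would call `on_message` for every message it dequeues and periodically call `pop_due` to get the mails that are ready to go out. If a user updates twice within the delay, only the second status is ever returned.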
leeeroy
Yeah, I ran into an issue with sending a boatload of emails! I'll have to figure out which language to use when pulling the data out. I wonder how well PHP works via the CLI.
luckytaxi