views:

1102

answers:

7

Does anybody know of a good, reliable open source queuing server/platform?

I'm working on a project in which we need to process millions of items from a queue per day. My challenge is that the queue is not FIFO: new items are prioritized and can be inserted somewhere in the middle of the queue. Also, to process these items we'll be using a parallel server distribution platform, so the queue needs to be accessible from many servers.

Thanks!

+4  A: 

Not quite open source, but quite open: Amazon's Simple Queue Service (SQS). Very reliable, very easy to use.

Open source Ruby: Beanstalkd.

JBoss has Java-Messaging-Service support. See the message bean page.

Apache has a JMS product; all Apache products are open source.

James A. Rosen
Beanstalkd is a daemon written in C. It has client libraries for a number of languages. It is far from a Ruby-only solution.
Alister Bulman
+4  A: 

I like Gaius's idea of Amazon's SQS; however, it has a large delay between messages. Some benchmarks show 15-30 seconds per message; others are as slow as a minute per message. So if speed is an issue, you might want to run your own message-oriented middleware (MOM).

I would recommend ActiveMQ from Apache. We have done benchmarks and its speeds are pretty close to raw socket connections. I have never used it in a large-scale production app, though.

Bernie Perez
+5  A: 

ActiveMQ is quite a good product: easy to set up, easy to deploy, easy to work with. It can be clustered, supports many protocols, etc. We are trying it in production and it seems reliable (it has its own problems, but not many).

I've tried JBoss Messaging, but I found it much more difficult to use and less mature (it's a refactoring of JBoss MQ, and it's not quite stable or complete...).

[Edit]Sorry, I did not read your question carefully enough... I checked the JMS specification: it does define priority ordering of messages, but there are only 10 priority levels in JMS, and implementors are not required to respect them (I did not check what AMQ does). Priority management can also be achieved by using different queues...[/Edit]

Laurent K
+3  A: 

Besides ActiveMQ, which should meet your requirements, you could also take a look at RabbitMQ or OpenAMQ.

Thraidh
A: 

Write your own, using a MySQL table indexed on insertion time and priority. On its own that doesn't give you an atomic pop, but a little transaction-oriented magic with a flag field does: update the top item that has a null flag field so the flag holds your client's unique identifier, commit, select the item with that identifier, then delete it when you are done with the item. Now you've got not just a queueing system but a management dashboard too, because you can see who is working on what by selecting all the queue-table items with non-null flags.
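To make the flag-field steps above concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table, column and function names (`queue`, `claimed_by`, `claim`, `finish`, `in_progress`) are made up for illustration. It also assumes an auto-increment id is good enough as the insertion-time ordering and that each worker holds one item at a time.

```python
import sqlite3

def create_queue(conn):
    # claimed_by is the "flag field": NULL means nobody is working on the row.
    conn.execute("""
        CREATE TABLE IF NOT EXISTS queue (
            id INTEGER PRIMARY KEY AUTOINCREMENT,  -- stands in for insertion time
            priority INTEGER NOT NULL,
            payload TEXT NOT NULL,
            claimed_by TEXT
        )""")
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_queue_order "
        "ON queue (claimed_by, priority DESC, id)")
    conn.commit()

def push(conn, payload, priority=0):
    conn.execute(
        "INSERT INTO queue (priority, payload) VALUES (?, ?)",
        (priority, payload))
    conn.commit()

def claim(conn, worker_id):
    """Flag the top unclaimed row with worker_id, commit, then read it back."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute("""
            UPDATE queue SET claimed_by = ?
            WHERE id = (SELECT id FROM queue
                        WHERE claimed_by IS NULL
                        ORDER BY priority DESC, id ASC
                        LIMIT 1)""", (worker_id,))
        if cur.rowcount == 0:
            return None  # nothing left to claim
    # Assumes this worker has no other claimed row outstanding.
    return conn.execute(
        "SELECT id, payload FROM queue WHERE claimed_by = ? "
        "ORDER BY priority DESC, id ASC LIMIT 1", (worker_id,)).fetchone()

def finish(conn, item_id):
    """Delete the row once the worker is done with it."""
    with conn:
        conn.execute("DELETE FROM queue WHERE id = ?", (item_id,))

def in_progress(conn):
    """The 'dashboard' query: who is working on what."""
    return conn.execute(
        "SELECT id, payload, claimed_by FROM queue "
        "WHERE claimed_by IS NOT NULL ORDER BY id").fetchall()
```

A worker would loop on `claim(conn, "worker-1")`, process the returned payload, and call `finish` with the row id. On a real multi-client database, the locking behaviour of that UPDATE is what decides how well this scales, so treat this as a starting point only.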

davidnicol
There's some merit to this, if your service already depends on a central database and performance requirements aren't very high.
Seun Osewa
+11  A: 

As @davidnicol mentions, you can use a database. The downside is that good load balancing across many threads/processes is hard: you often get one thread locking the head of the queue, which makes dispatch effectively single-threaded.

One of the main uses of a message queue is usually to get a reliable load balancer - you can then run as many Competing Consumers as you want all pulling from the same queue to give you massive scalability.

If you go the message queue route, then Apache ActiveMQ is the most popular open source implementation, and I'd recommend starting with that as it's got the biggest and most active community (my personal favourite metric for choosing between similar open source projects).

There are various ways to implement priority queues with ActiveMQ - the main tradeoff is whether you can live with the latency the Resequencer pattern introduces, or whether using selectors with separate process/thread pools of consumers for different priority ranges is a better, lower-latency solution.

While the selector approach is not the purest priority queue implementation, it does tend to work better in practice: it avoids waiting around for higher priority messages to bubble up, and it stops slow-to-process low priority messages from hogging your processors.
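As a rough illustration of the "different consumer pools for different priority ranges" idea, here is a small in-process sketch in Python. It does not use ActiveMQ or JMS at all: plain `queue.Queue` objects stand in for the broker, routing by band stands in for whatever selector you would configure, and the band names, priority floors and worker counts are made-up numbers.

```python
import queue
import threading

# Hypothetical bands: (name, lowest priority in the band, number of workers).
# Listed from highest to lowest so the first match wins.
BANDS = [("high", 7, 4), ("normal", 4, 2), ("low", 0, 1)]

def band_for(priority):
    """Return the name of the first band whose floor the priority reaches."""
    for name, floor, _workers in BANDS:
        if priority >= floor:
            return name
    raise ValueError(f"priority {priority} is below every band")

def start_pools(handler):
    """Create one queue per band and start that band's own worker threads."""
    queues = {name: queue.Queue() for name, _floor, _workers in BANDS}

    def worker(q):
        while True:
            item = q.get()
            try:
                if item is None:  # sentinel: stop this worker
                    return
                handler(item)
            finally:
                q.task_done()

    threads = []
    for name, _floor, workers in BANDS:
        for _ in range(workers):
            t = threading.Thread(target=worker, args=(queues[name],), daemon=True)
            t.start()
            threads.append(t)
    return queues, threads

def submit(queues, priority, payload):
    """Route an item to the queue of the band its priority falls in."""
    queues[band_for(priority)].put((priority, payload))
```

Because each band has its own workers, a slow low priority job can only tie up the "low" pool, and high priority items never wait behind it. The price is that ordering within a band is plain FIFO, not strict priority order.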

James Strachan
Thanks. I learned about the Resequencer pattern from your answer.
Stephen Harmon
+1  A: 

BTW a different aspect to your question is what API to use to talk to whatever technology you choose.

My recommendation with any middleware or infrastructure is to try to hide the middleware from your business logic, as this article describes. Keeping your business logic separate from the middleware lets you change the middleware implementation easily, as there's really no one-size-fits-all middleware technology - they all have their own pros and cons. Plus requirements do change - particularly in terms of load, volume, throughput, synchrony and latency - so you sometimes do need to switch middleware technology during the lifetime of a project.
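One way to picture that separation, sketched in Python: the business code talks to a small interface, and each middleware gets its own adapter behind it. The names here (`WorkQueue`, `InMemoryWorkQueue`, `process_all`) are invented for the example; a real adapter for ActiveMQ, SQS or a database table would implement the same two methods.

```python
from abc import ABC, abstractmethod
from collections import deque

class WorkQueue(ABC):
    """The only queue API the business logic is allowed to see."""

    @abstractmethod
    def send(self, payload):
        """Enqueue one item."""

    @abstractmethod
    def receive(self):
        """Return the next item, or None if the queue is empty."""

class InMemoryWorkQueue(WorkQueue):
    """Trivial adapter, handy for tests; swap in a broker-backed one later."""

    def __init__(self):
        self._items = deque()

    def send(self, payload):
        self._items.append(payload)

    def receive(self):
        return self._items.popleft() if self._items else None

def process_all(work_queue, handle):
    """Business logic: drains the queue without knowing what is behind it."""
    count = 0
    while (item := work_queue.receive()) is not None:
        handle(item)
        count += 1
    return count
```

Switching middleware then means writing one new `WorkQueue` subclass and leaving `process_all` and everything like it untouched.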

James Strachan