views:

855

answers:

1

I'm trying to determine my options for clustering my ServiceMix 3.3.1/Camel 2.1/AMQ 5.3 application. I'm performing high volume message processing and I need to cluster for high availability and horizontal scalability.

Here is basically what my application does...HTTP->QUEUE->PROCESS->DATABASE->TOPIC

from("jetty:http://0.0.0.0/inbound") .to("activemq:inboundQueue");

from("activemq:inboundQueue?maxConcurrentConsumers=50") .process(decode()) .process(transform()) .process(validate()) .process(saveToDatabase()) .to("activemq:topic:ouboundTopic");

So, I've read all the ServiceMix and AcitveMQ clustering pages, but am still not sure which way to go.

I know I can use a Master/Slave setup for HA, but that doesn't help with scalability.

I've read about network of brokers, but am not sure how this applies. For example, if I deploy identical Camel routes on multiple nodes in a cluster, how will they "interact" exactly? If I point my HTTP producer at one node (NodeA), which messages will get sent to NodeB? Will the queues/topics be shared between Node A/B...if so how, are messages split or duplicated? Also, how would an external client subscribe to my "outboundTopic" exactly (and get all messages, etc)?

Alternatively, I've been thinking that I should just share a broker between multiple ServiceMix instances. That would be cleaner in that there would only be one set of queues/topics to manage and I could scale by adding more instances. But, now I'm limited to the scalability of a single broker and I'm back to a single point of failure...

If anyone can clarify the trade-offs for me...I'd appreciate it.

+2  A: 

There are a multiple strategies to scale up when you're using the ServiceMix/Camel/ActiveMQ. Because each piece of software offers so many options, there are a variety of paths you could take depending on what portion of the app needs to scale. Below is a high level list of a few strategies:

  • Increase the number of inbound Jetty instances - This requires starting more instances of the web server and either load balancing requests across the multiple instances or exposing multiple URLs and sending all requests to the same inbound queue in ActiveMQ.

  • Increase the number of ActiveMQ instances - By starting up additional ActiveMQ instances and networking them together, you're creating a network of brokers. In some circles this is referred to this as distributed queues because a given queue can be made available across all of the brokers in the network. But if you're going to start up multiple instances of ActiveMQ, you should just consider starting up additional instances of ServiceMix.

  • Increase the number of ServiceMix instances - Each instance of ServiceMix embeds an instance of ActiveMQ. By increasing the number of instances of ServiceMix, not only are you increasing the number of ActiveMQ instances (which can be networked together to form a network of brokers) but you then have the ability to deploy more copies of your application across these instances of ServiceMix. If you need to increase the number of ActiveMQ or ServiceMix instances, then you can deploy a consuming application with the appropriate amount of concurrent consumers for each instance. Messages don't get split or duplicated, they are distributed in a round robin fashion to all consumers on the queue, no matter where they are located, based on consumer demand. I.e., if one ActiveMQ instance in the network has no consumers, it won't have any messages on it's instance of the queue to be consumed. This leads to my last suggestion, increasing the number of consumers polling the inbound queue.

  • Increase the number of JMS consumers on the inbound queue - This is probably the easiest, most powerful and most manageable way to increase throughput. It's the easiest because you're deploying additional instances of your consuming application so that they compete for messages from the inbound queue (no matter whether they're competing for a local queue or a queue that is distributed through a network of brokers). This may be as simple as increasing the number of concurrent consumers or a bit more involved by splitting apart the portion of the application that contains the consumers and deploying it to many instances of ServiceMix. It's the most powerful because it's not usually difficult and scaling event driven applications is always done by increasing the number of consumers. It's the most manageable because you have the ability to change the way that your applications are packaged so that the consuming application is wholly separate giving it the ability to be distributed.

This last suggestion is the most powerful way to scale your application. As long as the incoming HTTP endpoint can handle a large amount of traffic, you may only need to increase the consumers on the inbound queue. The big reason for doing this is that either the consumers or the bean to which they hand off is doing all the heavy lifting, the major amount of processing and validation. Typically it's this process that ultimately needs the most resources.

Hopefully this provides the information you need to begin to go in one direction, or possibly a few, depending on what portion of your app you actually need to scale. If you have any questions, please let me know.

Bruce

bsnyder
Bruce, thanks. I have been using the "maxConcurrentConsumers" property to multi-thread the consumers from the inbound queue.Now I'm trying to take the next step and distribute the load across multiple machines. So, it sounds like I could just setup multiple identical SMX instances configured in a network of brokers and that would distribute the load for my needs.Do MessageGroups still provide thread affinity with a network of brokers? Also, I need to make the outboundTopic messages available to a portal...would the portal need to connect to each broker then to get all the messages?
BenODay
I don't believe that message groups will provide exclusivity in a network of brokers. Total ordering only works for a single broker at a time so I think that message groups are the same way. As long as all messages are sent to the outbound topic then all messages should be consumed, no matter which broker where the subscription is registered. Since it's a topic, you could use a durable subscription, too. Though I'm not sure that a single consumer with a durable subscription is a good idea since you're using multiple concurrent consumers to pull messages from the inbound queue.
bsnyder