I have several servers that produce xml files from Python and some other servers that consume those xmls using Java. I've only recently looked into JMS and ActiveMQ and decided to try using it to pass the xml files.

So I set up ActiveMQ daemons on the consumer servers and figured I'd implement some cycling method on the producers to distribute the xmls evenly across the consumers.

Python  -----------------------------> ActiveMQ ---> Java
        \                           /
         \                         /
          ---------------------------> ActiveMQ ---> Java
                                 /  /
Python  ----------------------------

For testing, I ran one producer and one consumer and looked at the results.

To my surprise, the messages from the producer were distributed across all the ActiveMQ servers on the network. Since I only ran one consumer it only received the xmls that got to the ActiveMQ daemon on that machine, and the rest of the xmls were waiting patiently on the other ActiveMQ daemons on other machines.

Python  -----------------------------> ActiveMQ ---> Java (work)
                                          |
                                          |
                                       ActiveMQ (xmls piling up)

EDIT: This is not what actually happened, sorry. See below for details

Now, I'm not complaining, since this is what I wanted anyway, but I'm a little confused: what is the proper way to implement this many-to-many queue that I'm after?

Should I set up ActiveMQ daemons on my producer machines as well, send the xmls to the localhost ActiveMQs and trust the automatic discovery to get the xmls to consumers?

Python ---> ActiveMQ ------------------------------ ActiveMQ ---> Java
                |                                      |
                |                                      |
                |                                -- ActiveMQ ---> Java
                |                               |  
Python ---> ActiveMQ----------------------------

Should I stick to my original plan, and cycle the messages across the consumer machines, just to be safe?
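To make the "cycling" plan concrete, here is a minimal sketch of what I have in mind on the producer side. The broker host names are made up, and 61613 is assumed to be the STOMP port on each consumer machine; the function only decides which broker each xml goes to and does no network I/O.

```python
from itertools import cycle

# Hypothetical consumer-side broker addresses (assumption, not real hosts).
BROKERS = [("consumer1.example", 61613), ("consumer2.example", 61613)]


def assign_round_robin(xml_docs, brokers):
    """Pair each XML document with the next broker in a fixed rotation."""
    rotation = cycle(brokers)
    return [(next(rotation), doc) for doc in xml_docs]


if __name__ == "__main__":
    docs = ["<job id='%d'/>" % i for i in range(5)]
    for (host, port), doc in assign_round_robin(docs, BROKERS):
        # A real producer would send `doc` to (host, port) here.
        print(host, port, doc)
```

The obvious weakness is that the producer has to know the list of consumer machines, and a dead consumer keeps receiving its share until the list is updated.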

Or is there an API I should use that hides those details from my processes?

BTW, the producers are python processes using STOMP and the consumers are java using JMS.
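For reference, this is roughly what a producer puts on the wire for each xml. It is a sketch of a plain STOMP SEND frame (command line, headers, blank line, body, NUL terminator), not the code of any particular STOMP library; the queue name `/queue/xmls` is a made-up example.

```python
def stomp_send_frame(destination, body):
    """Build a STOMP SEND frame as bytes for the given destination and body."""
    payload = body.encode("utf-8")
    # content-length is optional in STOMP but makes the body length explicit.
    head = "SEND\ndestination:%s\ncontent-length:%d\n\n" % (destination, len(payload))
    return head.encode("utf-8") + payload + b"\x00"


if __name__ == "__main__":
    # "/queue/xmls" is a hypothetical queue name.
    print(stomp_send_frame("/queue/xmls", "<job id='1'/>"))
```

A real client would first exchange CONNECT/CONNECTED frames with the broker over a TCP socket before sending this.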

I apologize if your eyes are hurting from my crappy ASCII art, I wasn't sure if I'm being clear enough with just words.

EDIT

Apparently, when I was running "one producer and one consumer" I didn't notice that the other consumers were already running. They were just not doing anything useful with the xmls they processed. That's why I was seeing partial results.

After reading a bit more and experimenting a little, I figured out the following:

By default, ActiveMQ will auto-discover other ActiveMQ instances on the local network, and create a store-and-forward network of brokers. This means that the producer can post the xmls to any ActiveMQ instance and they will find their way to consumers listening on other ActiveMQ instances on the same network.

Note that the documentation claims that auto-discovery is not recommended for production setups.

The accepted answer below still holds true, though. The simplest way to use ActiveMQ is to use just one or two servers as "Queue Servers". Nevertheless, I chose to go with my original plan, because I think it will reduce network traffic (with a middle server, xmls have to go into it and then out of it again).

+1  A: 

Hi, Itsadok, I think you're probably not considering the use of messaging correctly.

Running one MOM broker instance per consumer (whether ActiveMQ, RabbitMQ, or any other MOM broker) doesn't really make sense conceptually. Rather, it's best to think of your MOM broker as a router of messages.

In that case, you would have one ActiveMQ broker instance (which might be sharded or otherwise scaled if you have scaling problems, or replicated if you have HA considerations) which all producers and all consumers connect to. Then all XML goes to the same broker instance, and all consumers read from the same broker instance. The broker then decides which consumer each message should go to, based on whatever heuristics it uses.
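As a rough illustration of that idea, here is a toy model in Python. It is not ActiveMQ code: a thread-safe in-process queue stands in for the single broker, and a few threads stand in for consumers that all read from it. The function name and the sentinel convention are assumptions made for this sketch.

```python
import queue
import threading


def run_competing_consumers(messages, consumer_count):
    """Deliver each message to exactly one of several competing consumers."""
    broker = queue.Queue()  # stands in for one queue on one broker
    received = {i: [] for i in range(consumer_count)}

    def consume(consumer_id):
        while True:
            msg = broker.get()
            if msg is None:  # sentinel: no more work for this consumer
                break
            received[consumer_id].append(msg)

    workers = [
        threading.Thread(target=consume, args=(i,)) for i in range(consumer_count)
    ]
    for worker in workers:
        worker.start()
    for msg in messages:
        broker.put(msg)
    for _ in workers:
        broker.put(None)  # one sentinel per consumer
    for worker in workers:
        worker.join()
    return received


if __name__ == "__main__":
    result = run_competing_consumers(["<job id='%d'/>" % i for i in range(10)], 3)
    for consumer_id, msgs in result.items():
        print(consumer_id, len(msgs))
```

Producers never need to know how many consumers exist; which consumer gets which message is decided entirely on the queue side, and the split between consumers can differ from run to run.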

This also means that you can add and remove producers and consumers dynamically, as your load changes or as systems fail, without reconfiguring anything: they all connect to the same broker.

Kirk Wylie
Thanks! Can you also explain why by default two ActiveMQ servers auto-detect each other and split the messages? What's the scenario here? (Maybe I should open another question).
itsadok
I'd recommend opening another question, because if nothing else, I don't actually understand the ins-and-outs of ActiveMQ clustering enough to be able to provide a response. I'm sure some ActiveMQ experts probably do though!
Kirk Wylie