views:

164

answers:

1

A recent presentation I saw regarding RabbitMQ mentioned the use of something called an "idempotency barrier" for message de-duplication. Is this just a fancy name for a message conflator, or is it something more specific? If so, what exactly is it? A Google search yielded results mostly related to RabbitMQ, with little explanation of what the term means.

+8  A: 

Idempotency is the property of a function where applying the function multiple times produces the same state as applying it once (f(f(x)) = f(x)). This is useful in a messaging environment because delayed or redelivered messages do not cause unexpected behavior. Conflation implies that duplicate messages would be merged before delivery to prevent duplication; idempotency implies instead that the messaging framework permits multiple transmissions and executions of duplicate messages, while guaranteeing that executing multiple copies of a message has the same result as executing one copy.
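To illustrate the distinction, here is a minimal sketch (the handler names and message shape are hypothetical, not from any RabbitMQ API): setting a value is idempotent, so a duplicate delivery is harmless, while incrementing is not.

```python
# Toy message handlers illustrating idempotent vs. non-idempotent behavior.
state = {"balance": 0}

def apply_set_balance(msg):
    # Idempotent: applying the same message twice leaves the same state.
    state["balance"] = msg["value"]

def apply_add_to_balance(msg):
    # Not idempotent: a duplicate delivery changes the result.
    state["balance"] += msg["value"]

msg = {"value": 100}
apply_set_balance(msg)
apply_set_balance(msg)               # duplicate delivery: state unchanged
print(state["balance"])              # 100

apply_add_to_balance({"value": 10})
apply_add_to_balance({"value": 10})  # duplicate delivery: wrong result
print(state["balance"])              # 120, not 110
```

With idempotent handlers, the framework can safely redeliver a message whenever it is unsure the first delivery succeeded.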

hasalottajava
If I understood correctly, your last sentence seems to imply that idempotency is actually a function of the message consumer, rather than the framework. If this is the case, what does it mean for the framework to be idempotent? If the framework allows duplicates to be executed, it would be up to the client to detect duplicates and either ignore them or remain unaffected.
omerkudat
In a messaging environment, idempotent behavior can be created in one of two ways. You can encapsulate it into the message itself (think of tail recursion), or you can have the client implement some caching mechanism, as you describe. The caching technique you are thinking of is conceptually simple, but it doesn't scale well: the cache must grow continually for as long as the message consumer keeps running, which could lead to an out-of-memory condition under high throughput.
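The caching approach described above can be sketched like this (a toy in-memory version; in practice the message id would come from a header or business key, which is an assumption here):

```python
# Consumer-side de-duplication: remember every message id ever seen.
# Note that seen_ids grows without bound -- the scaling problem noted above.
seen_ids = set()
processed = []

def handle(message_id, payload):
    if message_id in seen_ids:
        return  # duplicate redelivery: ignore
    seen_ids.add(message_id)
    processed.append(payload)  # stand-in for the real side effect

handle("m1", "create order")
handle("m1", "create order")  # redelivered, skipped
handle("m2", "ship order")
print(processed)  # ['create order', 'ship order']
```

This makes any handler effectively idempotent, at the cost of unbounded memory for the seen-id set.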
hasalottajava
You can implement a sliding window (similar to TCP) if you know what your SLAs are in terms of delayed messages. I.e., if your SLA says only the last 1000 messages need to be kept in the cache, then you've got bounded memory. This arrangement should suffice for most situations, since they will have a realistic upper bound for delays, but it obviously doesn't allow "infinite" delays if you ever need that. Bottom line: figure out your SLAs.
Michael Hart