views:

228

answers:

4

This is in relation to the ongoing discussion in my previous Question

http://stackoverflow.com/questions/1810313/performance-penalty-of-message-passing-as-opposed-to-shared-data

One of issues being discussed was the amount of work needed to distributed algorithms in Erlang using Message Passing vs. Shared state. My viewpoint is that implementing distributed leader election using a shared state (perhaps a Record in DB) is easier than designing an algorithm that is robust to loss of messages. Isn't it?

The problem with implementing message passing based algorithms is that we must either make the distributed algorithm robust to loss of messages or somehow make sure that messages are always delivered even if multiple attempts are required. Of course, distributed leader election is a well-known problem and I think a robust message passing algorithm already exists (may be nancy lynch's book gives it), but I am just taking this problem as an example to drive a point.

Thanks!

+2  A: 

As many things in life: it depends.

It depends on the scale we are talking about here. What are the characteristics of your shared-state? If it is a SPOF ("Single Point Of Failure"), then you'll be facing fault-tolerance issues as well as probably access issue (bandwidth, processing).

If we are talking a small scale system with lesser concern given to fault-tolerance, then yes it might be easier just to use a central DB.

Then again, I am a bit confused about the question: Message Passing deals with the communication aspects whereas a database is used to deal with storage aspects. Why do I get the impression these things get mixed up in your question? (it could be just me of course).


On Message Passing robustness: what is the issue here really? The communication layer being TCP helps a lot albeit being imperfect as anything on this earth. If you are concerned about the overall complications of communicating between nodes, then this won't go away regardless of the platform / language you choose. Erlang just makes distributed communication a lot easier.

jldupont
Yes. That's my point. But the algorithms given in Textbooks assume that message is always delivered. Since that cannot be guaranteed in the real world, basically the algorithm fails in such conditions and give up right? Practically speaking, what % of messages are lost?
ajay
Depends on the "substrate" (i.e. what makes up the communication medium) and the loading on the "server side". As the load increases on the network, the packet-loss ratio goes up.
jldupont
Consider my point of view: if the algorithm in question **must** be distributed, then this implies **communication between nodes**. In this case, Erlang gives you very compelling tools. On the other hand, if you are talking about an isolated environment, Erlang looses its charms.
jldupont
In a real DB you are never allowed to directly access the internal data records, you always have to go through an interface. So talking about shared state is not really valid here.
rvirding
Why talking about a DB? Because any form of communication between 2 nodes can be simulated using the DB to store the state information.Lets say node A wants to tell node B that it is READY.Two options. (Option 1) Use Message Passing to send a Message="READY" to node B (Option2 ) Update a column of a record to value "READY". node B periodically checks the same column. As soon as it sees the value "READY" node B has got the message! Message Passing simulated successfully using the DB.
ajay
ajay
... and the replication activity relies on **communication message** that can fail and of course the replication process recovers from the said communication loss/temporary failures. Furthermore, the nodes themselves (assuming you are referring to nodes existing on separate physical hardware) need **communication** facilities to reach the said DB.
jldupont
+1  A: 

I think you make some weird assumptions in your questions/discussions. You accept that messages can disappear but that working with shared state will always work and never fail, and then you try to compare the complexity of algorithms. It doesn't work like that. When things go wrong, which they will, they will do so however they are programmed. Using mutable shared state for building concurrent systems is difficult, using it to build robust concurrent systems is a right bastard.

In any real distributed system if you want to make sure it is robust you will always have to take into account that messages may not arrive. Exactly how this is done depends on the application.

rvirding
+4  A: 

Message sending is asynchronous and safe, the message is guaranteed to eventually reach the recipient, provided that the recipient exists.

If a process dies, an exit signal will be emitted to all linked processes.

In a distributed environment, you can even use erlang:monitor_node/2 to monitor the status of a node. From the docs,

A message{nodedown, Node}is received if the connection to it is lost.

IMO, all you need to do is to handle connection loss.

TP
+1  A: 

From where you say "The problem with implementing...", you have a tautology.

Irrespective of if you implement your algorithm by using message passing or shared-state, your code MUST handle failures in order to be robust. By all means you can not bother with error handling, but then your code is by definition not robust. If you say "Erlang has to do such-and-such in order to be robust" and then say "but I can update a row in a database and it will always work" then you're not comparing apples to apples. In any case the statement about the database (or shared-state) always working at the first attempt is obviously false.

So to implement using message passing you need an API that wraps the message passing constructs to a high enough level such that it is suitable to be used by your application programmers, but it is still up to them to check for and handle failures.

It boils down to this (this is why I said it is a tautology) :

a) If using Message Passing, you code must implement "the message is delivered successfully or you get an error (timeout)".

b) If using a database then your code must implement "the row is updated successfully, or you get an error." (The same applies to any shared-state solution).

Since a and b are equivalent, the "problem" you talk about applies equally to message passing and to your database approach, and so there's not really much more to discuss.

Now as to which is easier (or even more appropriate) there is no simple answer. The sensible answer is "use whatever approach fits your problem domain better". When talking about Erlang you should also consider what libraries/tools you are using, such as OTP, as they can have a massive impact on how you go about implementing these things.

Tim