I'm studying Erlang's process model at the moment. I have hit a snag in a tech report (section 3, paragraph 2) on Erlang:

This explains why it in some cases can be more efficient to run several SMP VM's with one scheduler each instead on one SMP VM with several schedulers. Of course the running of several VM's require that the application can run in many parallel tasks which has no or very little communication with each other.

Now this paragraph is confusing me. I can see the single-process, multiple-scheduler scenario, but I am failing to see multiple processes with a single scheduler each. Presumably each process would have a different node name, and this would mean that a given application cannot be used with this model without modification; the virtue of not requiring modification is mentioned as a key feature of SMP in the report. If the multiple processes had the same node name, then performance would be disastrous due to inter-Erlang-process messaging storms -- this assumes the use of in-memory Mnesia. Is there some process model that is not introduced in the article and that I am missing here?

What is the author trying to say here? Is he trying to suggest that an application would have to be rewritten (to take multiple unique node names into account) for the multi-process, single-scheduler case?

-- edit 1: Clarification of Source of Problem --

The question has been answered through discussion; the following is an outline of the trouble I had.

The issue behind this question is that the documentation, as I recall, does not touch on the scenario of running multiple Erlang emulators per physical machine -- it has always been shown that the emulator represents your physical machine (in industrial usage). The scenario of having to explicitly partition a program for computational efficiency has also never been considered. This sudden introduction has been the source of my woe.

The convention is still biased towards creating LOTS of processes, and the future holds many improvements for the SMP emulator for Erlang. This means that a single node per machine is still a very viable option, assuming favourable application design.

A: 

I believe the answer is in the preceding paragraph:

The SMP VM with only one scheduler is slightly slower (10%) than the non SMP VM. This is because the SMP VM need to use locks for all shared datastructures. But as long as there are no lock-conflicts the overhead caused by locking is not that high (it is the lock conflicts that takes time).

A scheduler's reliance on locks for shared data structures can impose an overhead on a given system. It seems to follow that having multiple schedulers on one SMP VM imposes a collectively greater overhead, since the schedulers can then conflict over those locks.

jkndrkn
Yes, I understand that; however, my problem is with the running of **several** Erlang VMs with one scheduler each and its implications.
Hassan Syed
+4  A: 

Rewrite after reading article:

This explains why it in some cases can be more efficient to run several SMP VM's with one scheduler each instead on one SMP VM with several schedulers.

  • Non-SMP VM has no locks, so it runs fast.
  • Single-scheduler SMP VM is 10% slower, due to the cost of checking locks.
  • Multiple-scheduler SMP VM is slower again, due to using/waiting for locks.

Of course the running of several VM's require that the application can run in many parallel tasks which has no or very little communication with each other.

  • I think: nodes on the same server have to have different names.
  • Messaging between VM nodes will be slower, due to its inter-process nature, than messaging within a single VM node.
Simeon Pilgrim
Thank you. It looks like we got the same information out of the article: there is no magic process model that lets Erlang processes collaboratively serve a single node name :D
Hassan Syed
@Vainstah, what do you mean by Erlang process? "there is no magic process model that lets Erlang processes collaboratively serve a single node-name" is simply nonsense.
Zed
All nodes, whether on one machine or not, always have different names. Node names are unique.
rvirding
@zed well, it is possible for the Erlang emulator machinery to seamlessly use multiple processes to serve the same node name on the same physical machine (or even across machines) -- this would be nonsense. For example, there is nothing in the programming model (that I can see) preventing an implementation where erlang-process-1 (e1) in physical-process-1 (p1) communicates with e2p2 as if they were in the same process. I agree completely with you that this is a crazy way of doing things, and this was another source of confusion for me. All this would require is an enhanced PID structure.
Hassan Syed
+1  A: 

If you have multiple schedulers in a single VM, they will inevitably contend over various resources (e.g. the ets meta table, the atom table, the scheduler run-queue during migration, etc.) because of the inner architecture. If you have a single scheduler, contention will obviously not occur. Lock checking and acquiring will still be done though, so running a non-SMP VM instead should yield even better performance (but requires rebuilding the VM from source).

Take a four-core machine for example. Option one means that you run four instances of the Erlang VM, each with a single scheduler, affinity set to different processor cores. Option two means running a single Erlang VM with four schedulers, each scheduler's affinity set to different processor cores.

If you have a whole lot of independent processes to run, option one will result in better performance, because the four cores will be fully utilized (theoretically). In contrast, in option two, this won't be possible, because the lock contention will make execution on cores wait for each other every now and then.

On the other hand, if your processes need to chatter a lot, option two is the way to go, because inter-process communication within one VM is way cheaper than communication between different VMs. You gain more with this than you lose with lock contention.
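
As a rough sketch of the two layouts on a four-core machine: the launch commands in the comments use the standard `erl` flags `+S` (number of schedulers) and `-sname` (short node name), while the node names and the module name `vm_layout` are made up for illustration. Core affinity is assumed to be set separately (for example with an OS-level tool) and is not shown.

```erlang
%% Option one (assumed launch commands, one VM per core):
%%   erl +S 1 -sname vm1
%%   erl +S 1 -sname vm2
%%   erl +S 1 -sname vm3
%%   erl +S 1 -sname vm4
%%
%% Option two (one VM, four schedulers):
%%   erl +S 4 -sname vm
-module(vm_layout).
-export([describe/0]).

%% Reports how the VM this is called on is configured: its node name,
%% the number of schedulers currently online, and the other nodes it
%% is connected to (empty until the nodes have contacted each other).
describe() ->
    #{node => node(),
      schedulers_online => erlang:system_info(schedulers_online),
      connected_nodes => nodes()}.
```

Calling `vm_layout:describe()` in each VM should make the difference visible: under option one every VM reports a single scheduler and its own node name, under option two there is one node name with four schedulers.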

Zed
Are you trying to say that in option one your application would have to be written using four separate nodes (each node with a unique name)? Such a suggestion from this author seems quite odd for Erlang, as these are emulator/hardware-specific configuration issues. I suppose the application could reflect on how many Erlang VMs are up and get the names of the nodes in order to parallelize across them -- dynamically.
Hassan Syed
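
The dynamic approach described in the comment above could look roughly like the following. This is only a sketch under stated assumptions: the module name `node_fanout` and the round-robin policy are invented for illustration, the tasks are assumed to be independent `{Module, Function, Args}` tuples, and the modules involved are assumed to be loaded on every node. It uses `nodes/0` to discover connected nodes and the `rpc` module's `async_call/4` and `yield/1` to run the tasks.

```erlang
-module(node_fanout).
-export([run/1]).

%% Runs each {Module, Function, Args} task on one of the currently
%% known nodes (this node plus all connected nodes), assigning tasks
%% round-robin, and returns the results in task order.
run(Tasks) ->
    Nodes = [node() | nodes()],
    Keys = dispatch(Tasks, Nodes, Nodes, []),
    [rpc:yield(Key) || Key <- Keys].

%% No tasks left: return the collected keys in the original order.
dispatch([], _Remaining, _All, Acc) ->
    lists:reverse(Acc);
%% Every node has been used once: start over from the full node list.
dispatch(Tasks, [], All, Acc) ->
    dispatch(Tasks, All, All, Acc);
%% Start the next task asynchronously on the next node in line.
dispatch([{M, F, A} | Rest], [Node | Remaining], All, Acc) ->
    Key = rpc:async_call(Node, M, F, A),
    dispatch(Rest, Remaining, All, [Key | Acc]).
```

For example, `node_fanout:run([{erlang, node, []}, {erlang, node, []}])` would show which node each task landed on. The application code itself does not name any node; only the number of VMs started on the machine changes.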
If you have ten servers, then the question becomes running 10 VMs, or 40. It's not that different.
Zed
btw, Kenneth Lundin ("this author") is the Manager of the Erlang/OTP team.
Zed
As @Zed says, it depends on how you want to structure your application. One benefit of having many nodes is that it allows you to take down the system in smaller chunks than if you had one, or a few, nodes, which may be important to you.
rvirding
+1  A: 

There are some advantages with several nodes on one physical machine.

1) Resource locking overhead as mentioned.

2) Fail-over. In telecom products you really don't want to have the beam come crashing down on you. If you have NIFs or linked-in drivers in your system this might occur.

3) Memory locality. A few nodes give you a poor man's way to force processes onto a few cores. This could be a big boost, typically for NUMA architectures but also for SMP. The scheduler doesn't take NUMA into account (yet). You can spawn a process on a specific scheduler and lock it there so that it won't migrate, but that is an undocumented feature ... or it was removed altogether. I forget.

With several nodes you will need a load balancer between the nodes of course but that is the usual way to do it anyways. Some logic that supervises the nodes.

However, the numbers from the EUC papers are over a year old [@] and I wouldn't recommend a multi-node approach if you don't really need it. The runtime system is much better at handling these types of problems today. A lot of lock overhead has been removed and the mrq-scheduler has been improved.

@ 2009's numbers look like this.

Edit:

Regarding 3), the spawn feature I mentioned is:

spawn_opt(fun() -> ... end, [{scheduler, Id}]) -> pid(),
    where Id is an integer and refers to a specific scheduler.

I wouldn't recommend using it since it is undocumented.
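
For experimentation only, a minimal sketch of how the option could be tried out. The module name `sched_pin` is made up, and because `{scheduler, Id}` is undocumented, its behaviour (including whether it is accepted at all) may differ between releases; `erlang:system_info(scheduler_id)` is the documented call used here to see where the process actually ran.

```erlang
-module(sched_pin).
-export([where/1]).

%% Spawns a process with the undocumented {scheduler, Id} option and
%% returns the id of the scheduler it reports running on, or the atom
%% 'timeout' if no reply arrives within one second.
%% Id is assumed to be between 1 and erlang:system_info(schedulers).
where(Id) ->
    Parent = self(),
    spawn_opt(fun() ->
                      Parent ! {ran_on, erlang:system_info(scheduler_id)}
              end,
              [{scheduler, Id}]),
    receive
        {ran_on, SchedulerId} -> SchedulerId
    after 1000 ->
        timeout
    end.
```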

psyeugenic
@Hassan, the numbers in Patrik's presentation are from ~August 2008 in the Tilera case and from ~May 2009 in the Nehalem case.
psyeugenic