tags:

views:

156

answers:

2

I am facing problem with writing quicksort in erlang. What I am doing is I am spawning two processes and then blocking the current process till I get the response from both the left and right sub-arrays. Getting both these responses I send a message to its parent giving it the computed list. Parent ! {self(), Lone ++ [H] ++ Ltwo}

But I am getting error of gettting undef in both the sub-processes. Here is the code.

 22 quick(Parent, []) -> Parent ! {self(), []};
 23 quick(Parent, [H | T]) ->
 24     Pone = spawn_link(main, quick, [ self(), [ X || X <- T, H >= X ] ]) ,
 25     Ptwo = spawn_link(main, quick, [ self(), [ Y || Y <- T, H < Y ] ]) ,
 26     receive
 27         {Pone, Lone} ->
 28             receive
 29                 {Ptwo, Ltwo} -> Parent ! {self(), Lone ++ [H] ++ Ltwo}
 30             end;
 31         {Ptwo, Ltwo} ->
 32             receive
 33                 {Pone, Lone} -> Parent ! {self(), Lone ++ [H] ++ Ltwo}
 34             end
 35     end.
 36 
 37 sortquick(List) ->
 38     quick(self(), List).

called as:

main:sortquick([12,4,7,22,25]).
+11  A: 

The code itself is not the problem. The quick sort works fine. The reason, probably, why you are getting an undef in the subprocesses is due to the fact that the function quick/2 is not exported at all. When you call spawn_link with the module and function, the function needs to be exported.

You can fix this by either added

-export([quick/2]).

Or by changing the spawn_links to something like

spawn_link(fun() -> quick(Self, [Y || Y <- T, H < Y]) end

Though if you do it the latter way, you'll need to create a variable

Self = self()

Before you make the calls or else it wont return to the proper process.

Olives
Thanks, it works by exporting quick/2. May I ask why why I need to export quick/2 ?
pranjal
Because if it is not exported, then it can only be called from functions that are native to your module. Think of it as a private function.Then when you call erlang:spawn_link/3 it is a different modules effectively trying to call main:quick/2. But it can't because main:quick/2 is a private function and no other module knows about it.
Olives
+1  A: 

The code above works fine by exporting quick/2 function.

Later on I compared the run time of the spawned quicksort vs the unspawned quicksort. The spawned quicksort is taking anywhere between 15sec to 32 sec to sort a list of 1 million random numbers in range (1,1000000).

The unspawned quicksort is(code below):

 55 quicksort([]) -> [];
 56 quicksort([H]) -> [H];
 57 quicksort([H | T]) ->
 58     [Headnew | Tailnew] = [H | T],
 59     quicksort([ X || X <- Tailnew, Headnew >= X ]) ++ [Headnew] ++ quicksort([ Y || Y <- Tailnew, Headnew < Y ]).

takes anywhere between 5sec and 8sec to sort same list of a million random numbers.

I am testing code on my old Thinkpad, 1.7 Ghz processor (single core) and 512 Mb RAM.

Any explanation for spawned quicksort to perform poorer than unspawned one ?

pranjal
Well, if you only have a single core then you don't have any ability to do work in parallel, so I wouldn't expect any benefit.On the other hand, while processes may be cheap in Erlang they aren't free and message sends and spawns require a complete copy of the list. Also think about what happens near the base case, you spawn lots of processes to send back the empty list.It might be interesting to write a version that spawns processes for the first couple of steps and then falls back to the unspawned version. (If you have a multicore machine to try it on.)
cthulahoops
On my dual core laptop it's still slightly slower even when I spawn two processes only.
cthulahoops
I tested on dual core, in fact I limited the spawning when array size reaches below 100. (total list size being 1 million), I get around 2.4 sec with spawned version and around 3.2 sec with unspawned version.
pranjal
Don't forget that for each spawned process Erlang copies the list to that processes own heap. With a rough calculation `log2Len * Len` list elements are copied in total, which is quite a lot for a list of a million integers.
Zed
`lists:sort/1` for 1 million takes about 800ms. `lists:sort/1` is merge sort written in smart way. If you consider that this list can be sorted by algorithm with worst case O(N*log2N) (i.e. merge sort) than even one copying of all list members between processes can matter and it does. You have to have a lot of cores to make parallel version faster and tune it carefully.
Hynek -Pichi- Vychodil