views:

78

answers:

5

Hi,

I am fairly new with concurrent programming and I am learning it.

I am implementing a quick sort in Java JDK 7 (Fork Join API) to sort a list of objects (100K).

While using this recursive piece of code without using concurrency,i observe no memory explosion, everything is fine.

I just added the code to use it on multi cores (by extending the class RecursiveAction) and then the memory usage jumped very high until it reached its limits. By doing some profiling i observe a high creation rate of threads and i think its expectable. But, is a java Thread by itself much more memory demanding or am i missing something here ?

Quicksort must requires a lot of threads but not much than regular objects.

Should I stop creating RecursiveAction Threads when i meet a threshold and then just switch to a sequential piece of code (no more threads)?

Thank you very much.

A: 

Can you describe how you're making QuickSort multi-threaded? Are you creating a thread at each recursive step? How many threads are being created exactly?

Amir Afghani
My first and very naive way is to create one new Thread/sublist to partition. I am trying to find another way but I struggle with Fork Join way of creating threads as it creates a big tree of dependant threads where a node wont be done until both of its children are also done.
charpentier damien
A: 

Yes, switching over to sequential code is a good idea when the unit of work is in the region of ca. 10,000-100,000 operations. This is just a rule of thumb. So, for quick sort, I'd drop out to sequential execution when the size to be sorted is less than say 10-20,000 elements, depending upon the complexity of the comparison operation.

What's the size of the ForkJoinPool - usually it's set to create the same number of threads as processors, so you shouldn't be seeing too many threads. If you've manually set the parallelism to be high (say, in the hundreds or thousands) then you will see high (virtual) memory use, since each thread allocates space for the stack (256K by default on 32-bit windows and linux.)

mdma
Hi mdma,I am playing with ForkJoinPool a bit. I will come back with more outputs. Thank you.
charpentier damien
+3  A: 

Java threads usually take 256k/512k(depeding in OS, jdk versions..) of stack space alone, by default.

You're wasting huge resources and speed if you're running more threads than you have processors/cores for a CPU intensive process such as doing quicksort, so try to not run more threads than you have cores.

nos
So fast ! Thank you very much for the tip !
charpentier damien
A: 

As a rule of thumb for a CPU bound computation, once your number of threads exceeds the number of available cores, adding more threads is not going to speed things up. In fact, it will probably slow you down due to the overheads of creating the threads, the resources tied down by each thread (e.g. the thread stacks), and the cost of synchronizing.

Indeed, even if you had an infinite number of cores, it would not be worth creating threads to do small tasks. Even with thread pools and other clever tricks, if the amount of work to be done in a task is too small the overheads of using a thread will exceed any savings. (It is difficult to predict exactly where that threshold is, and it certainly depends on the nature of the task as well as platform-related factors.)

Stephen C
Exactly what i faced here. Thank you.The thing is, the parallelization and threads control will be tougher than i thought.Indeed with fork join, in a Thread i can create two subthreads (one per sublist around the pivot). One of the two can be done very quickly but the parent thread wont be done until the second child is also done with its work ( join() method). This scheme repeats itself a lot.By limiting the number of threads in the ForkJoinPool to two, I will end up with one of the two cores still running because the other one can be done very quickly.I have to rethink it.
charpentier damien
A: 

I changed my code and so far I have better results. I invoke the main Thread task in the ForkJoinPool, in the Threads, I dont create more threads if there are a lot more active threads than available cores in the ForkJoinPool.

I dont do synchronism through the join() method. As a result a parent thread will die as soon as it created its offsprings. In the main function that invoked the root task. I wait for the tasks to be completed, aka no more active threads. Its seems to work fine as the memory stays normal and i gained lots of time over a the same piece of code executed sequentially.

I am going to learn more.

Thank you all !

charpentier damien