ansaurus

Question

OpenMP parallelization on a recursive function

Answer 1

A:

To parallelise the loop over the child elements, simply put a pragma before the loop:

#pragma omp parallel for
for (int i = 0; i < elements; i++)
{
    // work for element i; iterations may run concurrently and in any order
}

Job done.

Now, you're quite right that no threading library can do one bit before another in a fully parallel way (obviously!). OpenMP is not designed to emulate a general-purpose thread library, but it does have synchronisation: a 'wait for all to finish' directive (barrier), mutual exclusion (critical, atomic and the omp_set_lock routines), and a way to mark part of a loop body as running one iteration at a time, in loop order (ordered). It also allows you to store values "outside" the parallel section in shared variables. This may help you to assign the indexes in a parallel loop while other threads are assigning elements.
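As a rough illustration of those directives, here is a minimal sketch. The function names squares_in_order and count_matches are made up for this example and are not from the question. The first function computes values in parallel but appends them to a shared vector in loop order using ordered; the second protects a shared counter with critical. If OpenMP is not enabled, the pragmas are ignored and the code simply runs serially.

```cpp
#include <vector>

// Hypothetical example: compute squares in parallel, store them in loop order.
std::vector<int> squares_in_order(int n) {
    std::vector<int> out;  // declared outside the parallel loop, so shared
    #pragma omp parallel for ordered
    for (int i = 0; i < n; ++i) {
        int sq = i * i;  // this part may run concurrently
        #pragma omp ordered
        out.push_back(sq);  // this part runs one iteration at a time, in order
    }
    return out;
}

// Hypothetical example: count matching values with a protected shared counter.
int count_matches(const std::vector<int>& values, int target) {
    int count = 0;  // shared between threads
    const int n = static_cast<int>(values.size());
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        if (values[i] == target) {
            #pragma omp critical
            {
                ++count;  // only one thread at a time executes this block
            }
        }
    }
    return count;
}
```

Note that an ordered block serialises that part of the loop, so it only pays off when the work outside the block dominates.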

Take a look at a getting started guide.

If you're using Visual C++, you'll also need to set the /openmp flag in your compiler build settings.

gbjbaanb 2009-05-07 17:17:42

Answer 2

+1 A:

I think you should clarify your question (e.g. what exactly must be done serially, and why).

OpenMP (like many other parallelization libraries) does not guarantee the order in which the various parallel sections will be executed, and since they are truly parallel (on a multicore machine) there might be race conditions if different sections write the same data. If that's ok for your problem, surely you can use it.
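To make the race-condition point concrete, here is a small sketch of two patterns that avoid it. The function names are invented for illustration. In doubled, each iteration writes only to its own output slot, so the execution order does not matter. In sum_with_reduction, every iteration updates the same variable, which would be a race without the reduction clause.

```cpp
#include <vector>

// Safe: iteration i writes only out[i], so no two threads touch the same data.
std::vector<int> doubled(const std::vector<int>& data) {
    std::vector<int> out(data.size());
    const int n = static_cast<int>(data.size());
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        out[i] = 2 * data[i];
    }
    return out;
}

// All iterations update 'total'. Without the reduction clause this would be a
// race; with it, each thread sums into a private copy that is combined at the end.
long sum_with_reduction(const std::vector<int>& data) {
    long total = 0;
    const int n = static_cast<int>(data.size());
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```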

Davide 2009-05-07 17:24:16

Davide, thanks for making me think through the process a little more. In editing my question and thinking it through more rigorously, I figured out a sufficient answer.

Anthony Johnson 2009-05-07 18:09:16

Answer 3

+1 A:

As gbjbaanb mentioned, you can do this easily - it just requires a pragma statement to parallelize this.

However, there are a few things to watch out for:

First, you mention that order is crucial here. If you need to preserve ordering when flattening a hierarchical structure, parallelizing (at this level) is going to be problematic. You're likely going to completely lose your ordering.

Also, parallelizing recursive functions has many problems. Take an extreme case - say you have a dual core machine, and you have a tree where each "parent" node has 4 children. If the tree is deep, you very, very quickly "over-parallelize" the problem: you end up with far more parallel chunks than cores, and the overhead of creating and scheduling them typically makes things worse, not better, performance-wise.

If you're going to do this, you should probably add a level parameter, and only parallelize the first couple of levels. Take my 4-children-per-parent example: if you parallelize the first 2 levels, you are already breaking this into 16 parallel chunks (called from 4 parallel chunks).
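One way the level parameter could look is sketched below, using the if clause of parallel for to switch parallelism off below a chosen depth. The Node type, make_tree, and sum_tree are hypothetical stand-ins, since the question's actual data structure isn't shown, and summing values is just a placeholder for the real per-node work.

```cpp
#include <vector>

// Hypothetical tree node, for illustration only.
struct Node {
    int value;
    std::vector<Node> children;
};

// Builds a tree where every node has value 1 and 'fanout' children,
// with 'levels' levels below the root.
Node make_tree(int levels, int fanout) {
    Node n;
    n.value = 1;
    if (levels > 0) {
        for (int i = 0; i < fanout; ++i) {
            n.children.push_back(make_tree(levels - 1, fanout));
        }
    }
    return n;
}

// Sums all values in the subtree. Only the first 'max_parallel_levels' levels
// request a parallel loop; deeper calls run as ordinary serial recursion.
long sum_tree(const Node& node, int level, int max_parallel_levels) {
    long total = node.value;
    const int n = static_cast<int>(node.children.size());
    #pragma omp parallel for reduction(+:total) if(level < max_parallel_levels)
    for (int i = 0; i < n; ++i) {
        total += sum_tree(node.children[i], level + 1, max_parallel_levels);
    }
    return total;
}
```

A call such as sum_tree(root, 0, 2) parallelizes at most the first two levels. Whether the second level actually gets extra threads depends on the runtime's nested-parallelism settings; many implementations run nested parallel regions on a single thread unless nesting is explicitly enabled.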

From what you mentioned, I'd leave this portion serial, and focus instead on the second portion, where you mention:

"Then it traverses that array multiple times to draw objects/overlays, etc."

That sounds like an ideal place to parallelize.

Reed Copsey 2009-05-07 17:30:57

Reed, I agree that traversing a one-dimensional array is much easier to parallelize than a recursive tree search, but since OpenGL is not thread-safe, the actual drawing part has to be done in serial. However, I think I've got a valid solution where I can do a minimalist recursive algorithm to make the array index associations in serial, and then do the filling of the array in parallel.

Anthony Johnson 2009-05-07 18:24:40

Answer 4

A:

Here's a modified piece of pseudo-code that should work.

populatearray(thescene)
{
  recursivepopulatearray(thescene)

  #pragma omp parallel for
  for each element in array
    populate array element based on associated object
}

recursivepopulatearray(theobject)
{
  for each childobject in theobject
  {
     assign array index and associate element with childobject
     recursivepopulatearray(childobject)
  }
}
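For readers who want something compilable, the pseudo-code above might translate to C++ roughly as follows. This is only a sketch: SceneObject, Element, and the name_length field are invented placeholders, with name_length standing in for whatever expensive per-object data the real array holds.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical scene node; the real scene type is not shown in the question.
struct SceneObject {
    std::string name;
    std::vector<SceneObject> children;
};

// Hypothetical array entry: a link to its object plus data filled in later.
struct Element {
    const SceneObject* source;
    std::size_t name_length;
};

// Serial pass: a depth-first walk that fixes each object's array index,
// so the ordering is deterministic.
void assign_indices(const SceneObject& object, std::vector<Element>& array) {
    for (std::size_t c = 0; c < object.children.size(); ++c) {
        Element e;
        e.source = &object.children[c];
        e.name_length = 0;
        array.push_back(e);
        assign_indices(object.children[c], array);
    }
}

std::vector<Element> populate_array(const SceneObject& scene) {
    std::vector<Element> array;
    assign_indices(scene, array);

    // Parallel pass: each iteration touches only its own element.
    const int n = static_cast<int>(array.size());
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        array[i].name_length = array[i].source->name.size();
    }
    return array;
}
```

The elements store pointers into the scene, so the scene must outlive the returned array and must not be modified while the array is in use.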

Anthony Johnson 2009-05-07 19:49:06

Answer 5

A:

Questions about OpenMP programming can get answered by the experts over at the OpenMP Forum at openmp.org, the official OpenMP website.

2009-05-17 07:04:16
