I have a single-threaded Linux app which I would like to make parallel. It reads a data file, creates objects, and places them in a vector. Then it calls a compute-intensive method (.5 second+) on each object. I want to call the method in parallel with object creation. While I've looked at Qt and TBB, I am open to other options.

I planned to start the thread(s) while the vector was empty. Each one would call makeSolids (below), which has a while loop that would run until interpDone==true and all objects in the vector have been processed. However, I'm a n00b when it comes to threading, and I've been looking for a ready-made solution.

QtConcurrent::map(Iter begin,Iter end,function()) looks very easy, but I can't use it on a vector that's changing in size, can I? And how would I tell it to wait for more data?

I also looked at Intel's TBB, but it looked like my main thread would halt if I used parallel_for or parallel_while. That stinks, since their memory manager was recommended (Open CASCADE's mmgt has poor performance when multithreaded).

/**intended to be called by a thread
\param start the first item to get from the vector
\param incr how many items to advance on each step (4 for 4 threads)
*/
void g2m::makeSolids(uint start, uint incr) {
  uint curr = start;
  while ((!interpDone) || (lineVector.size() > curr)) {
    if (lineVector.size() > curr) {
      if (lineVector[curr]->isMotion()) {
        ((canonMotion*)lineVector[curr])->setSolidMode(SWEPT);
        ((canonMotion*)lineVector[curr])->computeSolid();
      }
      lineVector[curr]->setDispMode(BEST);
      lineVector[curr]->display();

      curr += incr;
    } else {
      uio::sleep(); //wait a little bit for interp
    }
  }
}

EDIT: To summarize, what's the simplest way to process a vector at the same time that the main thread is populating the vector?

+1  A: 

Firstly, to benefit from threading you need to find similarly slow tasks for each thread to do. You said your per-object processing takes .5s+; how long does your file reading / object creation take? It could easily be a tenth or a thousandth of that time, in which case your multithreading approach is going to produce negligible benefit. If that's the case (yes, I'll answer your original question soon in case it's not), then think about processing multiple objects simultaneously.

Given that your processing takes quite a while, the thread creation overhead isn't terribly significant, so you could simply have your main file reading/object creation thread spawn a new thread and direct it at the newly created object. The main thread then continues reading/creating subsequent objects. Once all objects are read/created, and all the processing threads launched, the main thread "joins" (waits for) the worker threads.

If this will create too many threads (thousands), then put a limit on how far ahead the main thread is allowed to get: it might read/create 10 objects then join 5, then read/create 10, join 10, read/create 10, join 10 etc. until finished.

Now, if you really want the read/create to be in parallel with the processing, but the processing to be serialised, then you can still use the above approach but join after each object. That's kind of weird if you're designing this with only this approach in mind, but good because you can easily experiment with the object processing parallelism above as well.

Alternatively, you can use a more complex approach that involves just the main thread (the one the OS creates when your program starts) and a single worker thread that the main thread must start. They should be coordinated using a mutex (a variable ensuring mutually exclusive, i.e. non-concurrent, access to data) and a condition variable, which allows the worker thread to block efficiently until the main thread has provided more work. "Mutex" and "condition variable" are the standard terms in the POSIX threading that Linux uses, so they should also appear in the documentation of the particular libraries you're interested in. In short, the worker thread waits until the main read/create thread signals that another object is ready for processing. You may want to keep a counter holding the index of the last fully created, ready-for-processing object, so the worker thread can maintain its own count of processed objects and work through the ready ones before once again waiting on the condition variable.
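
A rough sketch of that mutex / condition variable arrangement, for illustration only. It assumes the C++11 standard library wrappers (std::mutex, std::condition_variable, std::thread) rather than raw POSIX calls, and the names Item, WorkQueue, workerLoop and runPipeline are made up for the example. Item::compute() stands in for the slow per-object method.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's line objects.
struct Item {
    int value = 0;
    bool processed = false;
    void compute() { processed = true; }  // placeholder for the slow method
};

// Shared state: the vector and a "no more items" flag, both guarded by mtx.
struct WorkQueue {
    std::mutex mtx;
    std::condition_variable cv;
    std::vector<std::unique_ptr<Item>> items;
    bool done = false;

    // Called by the main (producer) thread after constructing each object.
    void push(std::unique_ptr<Item> item) {
        {
            std::lock_guard<std::mutex> lock(mtx);
            items.push_back(std::move(item));
        }
        cv.notify_one();
    }

    // Called by the main thread once the input is exhausted.
    void finish() {
        {
            std::lock_guard<std::mutex> lock(mtx);
            done = true;
        }
        cv.notify_all();
    }
};

// Worker: handles items in order and blocks (no polling) when it catches up.
// Returns the number of items it processed.
std::size_t workerLoop(WorkQueue& q) {
    std::size_t next = 0;  // index of the next unprocessed item
    for (;;) {
        Item* item = nullptr;
        {
            std::unique_lock<std::mutex> lock(q.mtx);
            q.cv.wait(lock, [&] { return q.done || next < q.items.size(); });
            if (next >= q.items.size()) return next;  // finished and drained
            // The Item lives on the heap, so this pointer stays valid even
            // if the vector reallocates while the producer keeps pushing.
            item = q.items[next].get();
        }
        item->compute();  // slow work runs outside the lock
        ++next;
    }
}

// Runs a producer (this thread) and one worker; returns the processed count.
std::size_t runPipeline(int count) {
    WorkQueue q;
    std::size_t processed = 0;
    std::thread worker([&] { processed = workerLoop(q); });
    for (int i = 0; i < count; ++i) {
        std::unique_ptr<Item> item(new Item());
        item->value = i;
        q.push(std::move(item));
    }
    q.finish();
    worker.join();
    assert(q.items.size() == static_cast<std::size_t>(count));
    return processed;
}
```

Calling runPipeline(25) creates 25 items on the calling thread while the worker processes them as they arrive. Storing pointers rather than objects in the vector is a deliberate choice here: it lets the worker release the lock before the slow call without worrying about the vector growing underneath it.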

Tony
I'll add this: don't run any more active threads than you have cores to process them. Multi-threading on a single-core single-CPU machine is slower than simple serial processing.
Pontus Gagge
Ummm... I think you have something backwards. Reading the data and creating the objects is quick, and I couldn't do that in parallel if I wanted to - each object's ctor depends on calculations done by the previous obj's ctor. What I want to do in parallel is call some methods on each object (see the function in OP).
Mark
@Pontus Gagge: There are valid times to create more threads than you have cores; one I can see right away (related to the qt4 tag) is if you want your GUI to remain responsive and updating while other processing is happening. Then you'd want at least 2 threads, even on a single-core system.
Caleb Huitt - cjhuitt
@Mark: not me that got it backwards... "Then it calls a compute-intensive method (.5 second+) on each object. I want to call the method in parallel with object creation." ;-)
Tony
A: 

It's hard to tell whether you have been thinking about this problem deeply and there is more to it than you are letting on, or you are just overthinking it, or you are just wary of threading.

Reading the file and creating the objects is fast; the one method is slow. The dependency is each consecutive ctor depends on the outcome of the previous ctor - a little odd - but otherwise there are no data integrity issues so there doesn't seem to be anything that needs to be protected by mutexes and such.

Why is this more complicated than something like this (in crude pseudo-code):

while (! eof)
{
    readfile;
    object O(data);
    push_back(O);
    pthread_create(...., O, makeSolid);
}


while(x < vector.size())
{
    pthread_join();
    x++;
}
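
For what it's worth, the pseudo-code above could be fleshed out roughly as follows. This is only a sketch: it assumes C++11's std::thread in place of raw pthread_create / pthread_join, and Solid, makeSolid and createAndProcess are placeholder names invented for the example, with a trivial calculation standing in for the slow method.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical stand-in for an object with one slow method.
struct Solid {
    explicit Solid(int s) : seed(s) {}
    int seed;
    int result = 0;
    void makeSolid() { result = seed * 2; }  // placeholder for the slow call
};

// Creates `count` objects one after another, starts a thread for each as
// soon as it exists, then joins every thread. Returns the sum of results.
int createAndProcess(int count) {
    std::vector<std::unique_ptr<Solid>> objects;
    std::vector<std::thread> threads;

    for (int i = 0; i < count; ++i) {
        // Heap allocation keeps each object's address stable while the
        // vector grows, so the worker's pointer never dangles.
        objects.push_back(std::unique_ptr<Solid>(new Solid(i)));
        Solid* obj = objects.back().get();
        threads.emplace_back([obj] { obj->makeSolid(); });
    }

    assert(threads.size() == objects.size());
    for (std::thread& t : threads) t.join();  // wait for every worker

    int total = 0;
    for (const std::unique_ptr<Solid>& obj : objects) total += obj->result;
    return total;
}
```

Each thread touches only its own object, so no mutex is needed in this version; the results are read only after the joins.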

If you don't want to loop on the joins in your main then spawn off a thread to wait on them by passing a vector of TIDs.

If the number of created objects/threads is insane, use a thread pool. Or put a counter in the creation loop to limit the number of threads that can be created before running ones are joined.
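
One possible shape for the counter idea, again only a sketch using std::thread (an assumption, as are the names processInBatches and maxThreads). It joins the whole batch whenever the limit is reached, which is the crudest form of limiting; a real thread pool would keep the workers busier.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Processes `count` work items with at most `maxThreads` worker threads
// alive at any one time. Returns the sum of the per-item results.
int processInBatches(int count, std::size_t maxThreads) {
    assert(maxThreads >= 1);
    // Sized up front so the vector never reallocates while threads write.
    std::vector<int> results(static_cast<std::size_t>(count), 0);
    std::vector<std::thread> running;

    for (int i = 0; i < count; ++i) {
        if (running.size() == maxThreads) {
            // Limit reached: wait for the current batch before continuing.
            for (std::thread& t : running) t.join();
            running.clear();
        }
        int* slot = &results[static_cast<std::size_t>(i)];
        // Placeholder for the slow per-object method: store i + 1.
        running.emplace_back([slot, i] { *slot = i + 1; });
    }
    for (std::thread& t : running) t.join();  // join the final partial batch

    return std::accumulate(results.begin(), results.end(), 0);
}
```

With 100k objects you would probably pick maxThreads close to the number of cores, for example from std::thread::hardware_concurrency(), which may return 0 if the value cannot be determined.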

Duck
`charlie_foxtrot = n00b + overthink;` :-D I had seen some threading examples a while back, and I thought they looked pretty complicated. As a result, I didn't even consider bare-bones stuff like `pthread`. That example proves that it's much simpler than I had imagined.
Mark
Now I remember... the example I'd seen used fork() to create another process, and then the two communicated. So of course it wasn't this simple!
Mark
@Mark - Well I did leave out details but that is the idea. I just noticed your comment about 100k objects so you should definitely think about a strategy for limiting the threads.
Duck
A: 

@Caleb: quite -- perhaps I should have emphasized active threads. The GUI thread should always be considered one.

Pontus Gagge