I do some C++ programming related to mapping software and mathematical modeling. Some programs take anywhere from one to five hours to run and output a result; however, they only use 50% of my Core Duo. I tried the code on another dual-processor machine with the same result.

Is there a way to force a program to use all available processor resources and memory?

Note: I'm using Ubuntu and g++

+11  A: 

A thread can only run on one core at a time. If you want to use both cores, you need to find a way to do half the work in another thread.

Whether this is possible, and if so how to divide the work between threads, is completely dependent on the specific work you're doing.

To actually create a new thread, see the Boost.Thread docs, or the pthreads docs, or the Win32 API docs.
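As a rough illustration of "do half the work in another thread", here is a minimal sketch. It uses std::thread, which is an assumption on my part (it needs a C++11 compiler, e.g. g++ -std=c++11 -pthread); Boost.Thread and pthreads follow the same pattern. The workload (summing a vector) and the names sum_range and sum_on_two_threads are made up for the example and stand in for whatever your program really computes.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical workload: sum the elements in [begin, end) and store the result in `out`.
void sum_range(const std::vector<double>& data, std::size_t begin,
               std::size_t end, double& out) {
    assert(begin <= end && end <= data.size());
    out = std::accumulate(data.begin() + begin, data.begin() + end, 0.0);
}

// A second thread handles the first half of the data while the calling
// thread handles the second half.
double sum_on_two_threads(const std::vector<double>& data) {
    const std::size_t mid = data.size() / 2;
    double first_half = 0.0;
    double second_half = 0.0;

    // Each thread writes only to its own output variable, so no locking is needed.
    std::thread worker(sum_range, std::cref(data), std::size_t(0), mid,
                       std::ref(first_half));
    sum_range(data, mid, data.size(), second_half);

    worker.join();  // wait for the worker before reading first_half
    return first_half + second_half;
}
```

Calling sum_on_two_threads(values) on a large vector should keep both cores busy for the duration of the sum. This only works so cleanly because the two halves are independent of each other.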

[Edit: other people have suggested using libraries to handle the threads for you. I didn't mention these because I have no experience with them, not because I don't think they're a good idea. They probably are, but it all depends on your algorithm and your platform. Threads are almost universal, but beware that multithreaded programming is often difficult: you can create a lot of problems for yourself.]

Steve Jessop
@Edit: Yea, I don't see the point in using libraries in C++ (like boost). =/ @Mild-Offtopic: Yea, just learned threads, roughly, in C++ recently. :-) Good advice. :)
Zack
@Zack: do you mean you don't see the point in using libraries in C++ in general? Because Boost is amazing.
rlbond
The point in using concurrency libraries/data structures/design patterns is that it simplifies some of the issues related to concurrency. With a suitable design, it will be enough for just a few components to be concurrency-aware and the rest can be single-threaded, while still taking advantage of all available CPU cores.
Esko Luontola
+2  A: 

To make full use of a multicore processor, you need to make the program multithreaded.

Esko Luontola
A: 

By 50%, do you mean just one core?

If the application is neither multi-process nor multi-threaded, there's no way it can use both cores at once.

R. Bemrose
+5  A: 

You need to have as many threads running as there are CPU cores available in order to be able to potentially use all the processor time. (You can still be pre-empted by other tasks, though.)

There are many ways to do this, and which one fits depends completely on what you're processing. You may, however, be able to use OpenMP or a library like TBB to do it almost transparently.

Tim Sylvester
+5  A: 

The quickest method would be to read up on OpenMP and use it to parallelise your program.

Compile with the command g++ -fopenmp, provided that your g++ version is >= 4 (OpenMP support first appeared in GCC 4.2).
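For a sense of what that looks like, here is a minimal sketch of an OpenMP loop. The workload (a sum of square roots) and the function name sum_of_square_roots are invented for the example; the real loop body would be your own computation, and it has to be safe to run iterations in any order.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical workload: add up the square root of every element.
double sum_of_square_roots(const std::vector<double>& data) {
    double total = 0.0;
    // A signed loop index keeps the loop valid for older OpenMP versions.
    const long n = static_cast<long>(data.size());

    // Each thread accumulates into a private copy of `total`;
    // OpenMP adds the copies together when the loop finishes.
    #pragma omp parallel for reduction(+ : total)
    for (long i = 0; i < n; ++i) {
        total += std::sqrt(data[static_cast<std::size_t>(i)]);
    }

    assert(total >= 0.0);  // square roots of valid (non-negative) inputs cannot sum below zero
    return total;
}
```

If the file is compiled without -fopenmp, the pragma is ignored and the loop simply runs on one thread, which makes it easy to compare the two builds.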

Simon Walker
Did you mean "version >= 4"?
dmckee
I did thanks, changed
Simon Walker
+3  A: 

You're right that you'll need to use a threaded approach to use more than one core. Boost has a threading library, but that's not the whole problem: you also need to change your algorithm to work in a threaded environment.

There are some algorithms that simply cannot run in parallel -- for example, SHA-1 makes a number of "passes" over its data, but they cannot be threaded because each pass relies on the output of the pass before it.

In order to parallelize your program, you'll need to be sure your algorithm can "divide and conquer" the problem into independent chunks, which it can then process in parallel before combining them into a full result.
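A minimal sketch of that "split, process, combine" shape, assuming a C++11 compiler (g++ -std=c++11 -pthread) and using std::async rather than Boost. The per-chunk work here is just a sum, and the names process_chunk and process_in_chunks are hypothetical; the point is only the structure.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical per-chunk work: sum the elements in [begin, end).
// It reads only its own slice, so chunks are independent of each other.
double process_chunk(const std::vector<double>& data, std::size_t begin,
                     std::size_t end) {
    return std::accumulate(data.begin() + begin, data.begin() + end, 0.0);
}

double process_in_chunks(const std::vector<double>& data,
                         std::size_t num_chunks) {
    assert(num_chunks > 0);
    // Round up so that every element lands in some chunk.
    const std::size_t chunk_size = (data.size() + num_chunks - 1) / num_chunks;

    // Divide: start one asynchronous task per chunk.
    std::vector<std::future<double>> partials;
    for (std::size_t begin = 0; begin < data.size(); begin += chunk_size) {
        const std::size_t end = std::min(begin + chunk_size, data.size());
        partials.push_back(std::async(std::launch::async, process_chunk,
                                      std::cref(data), begin, end));
    }

    // Combine: wait for each partial result and add it to the total.
    double total = 0.0;
    for (auto& partial : partials) {
        total += partial.get();
    }
    return total;
}
```

Calling process_chunk(data, 0, data.size()) gives the single-threaded answer, which is a convenient reference to check the chunked version against.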

Whatever you do, be very careful to verify the correctness of your answer. Save the single-threaded code, so you can compare its output to that of your multi-threaded code; threading is notoriously hard to do, and full of potential errors.

It may be a better use of your time to avoid threading entirely and profile your code instead: you may be able to get dramatic speed improvements by optimizing the most frequently executed code, without going anywhere near the challenges of threading.

ojrac
+1 for the importance of dividing your problem set.
windfinder
+1  A: 

An alternative to multi-threading is to use more than one process. You would still need to divide & conquer your problem into multiple independent chunks.
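On a POSIX system such as Ubuntu, one way this could look is a fork() plus a pipe to send the child's result back. This is only a sketch under assumptions: the workload is a plain sum, the names sum_slice and sum_with_two_processes are made up, and a real program would likely pass far more than one double between processes.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical workload: sum the elements in [begin, end).
double sum_slice(const std::vector<double>& data, std::size_t begin,
                 std::size_t end) {
    assert(begin <= end && end <= data.size());
    return std::accumulate(data.begin() + begin, data.begin() + end, 0.0);
}

// The child process sums the first half and sends its result through a pipe;
// the parent sums the second half. Returns false if a system call fails.
bool sum_with_two_processes(const std::vector<double>& data, double& result) {
    int fds[2];
    if (pipe(fds) != 0) return false;

    const std::size_t mid = data.size() / 2;
    const pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }

    if (pid == 0) {
        // Child: it has its own copy of `data`, so nothing is shared or locked.
        close(fds[0]);
        const double partial = sum_slice(data, 0, mid);
        const ssize_t written = write(fds[1], &partial, sizeof partial);
        _exit(written == static_cast<ssize_t>(sizeof partial) ? 0 : 1);
    }

    // Parent: do the other half, then collect the child's result.
    close(fds[1]);
    const double second_half = sum_slice(data, mid, data.size());
    double first_half = 0.0;
    const ssize_t got = read(fds[0], &first_half, sizeof first_half);
    close(fds[0]);

    int status = 0;
    waitpid(pid, &status, 0);
    if (got != static_cast<ssize_t>(sizeof first_half)) return false;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) return false;

    result = first_half + second_half;
    return true;
}
```

The two processes never touch the same memory after the fork, which is why no locking appears anywhere in the sketch.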

jon hanson
+1 Why do people always forget this option? In some cases it can both be easier to get right and scale better than threads (no locking to access the same malloc/free heap in the same address space).
Daniel Earwicker
A: 

Add a while(1) { } somewhere in main()?

Or, to echo the real advice: either launch multiple processes or rewrite the code to use threads. I'd recommend running multiple processes since that is easier, although it doesn't really help if you need to speed up a single run.

MSN
A: 

To get to 100% for each thread, you will need to:

(in each thread):

  • Eliminate all secondary storage I/O (disk reads/writes)
  • Eliminate all display I/O (screen writes/prints)
  • Eliminate all locking mechanisms (mutexes, semaphores)
  • Eliminate all primary storage I/O (operate strictly out of registers and cache, not DRAM).

Good luck on your rewrite!

kmarsh
Sometimes it doesn't pay to tell the truth.
kmarsh