Hi,

When I run my multi-threaded code, the system (Linux) sometimes moves the threads from one processor to another. As I have as many threads as I have processors, this invalidates caches for no good reason and it confuses my tracing activities.

Do you know how to bind threads to processors, and why would a system do this?

+9  A: 

Use sched_setaffinity (this is Linux-specific).
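As a rough illustration of the call, the sketch below pins the calling thread to a single CPU and then reads the affinity mask back. The helper names (pin_current_thread_to_cpu, current_thread_allowed_on, first_allowed_cpu) are made up for this example and are not part of any library; only sched_setaffinity, sched_getaffinity and the CPU_* macros come from glibc. It assumes Linux with glibc, and it picks a CPU from the current mask because CPU 0 is not guaranteed to be available inside a container or cpuset.

```c
#define _GNU_SOURCE   /* needed for cpu_set_t and the CPU_* macros */
#include <sched.h>

/* Hypothetical helper: restrict the calling thread to one CPU.
 * Returns 0 on success, -1 on failure (errno is set by the kernel). */
int pin_current_thread_to_cpu(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* A pid of 0 means "the calling thread". */
    return sched_setaffinity(0, sizeof(set), &set);
}

/* Hypothetical helper: 1 if the calling thread may run on `cpu`,
 * 0 if it may not, -1 if the mask could not be read. */
int current_thread_allowed_on(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    return CPU_ISSET(cpu, &set) ? 1 : 0;
}

/* Hypothetical helper: lowest-numbered CPU in the calling thread's
 * current affinity mask, or -1 if the mask could not be read. */
int first_allowed_cpu(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}
```

In a program with one worker thread per processor, each worker would call something like pin_current_thread_to_cpu with its own CPU number at startup. Whether that actually helps depends on what else is running on the machine.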

Why would a scheduler switch threads between different processors? Well, imagine that your thread last ran on processor 1 and is currently waiting to be scheduled for execution again. In the meantime, a different thread is currently running on processor 1, but processor 2 is free. In this situation, it's reasonable for the scheduler to switch your thread to processor 2. However, a sophisticated scheduler will try to avoid "bouncing" a thread between processors more than necessary.

Martin B
I would expect the scheduler to avoid this if there are fewer threads than processors ...
Ben
That's true... you said in your question, though, that you have "many more threads than processors" -- was that supposed to be the other way round?
Martin B
No, it is because knittl edited my question without knowing what I was talking about. I have as many threads as I have processors.
Ben
@Ben: unless you have written your own OS, there are not "less threads than processors". Other things on the system are continually taking timeslices, and this can cause some or all of your threads to be descheduled. When they come to be rescheduled, the core on which they last ran may or may not be available.
Steve Jessop
@Ben: (+onebyone) For the scheduler, the response time is as important as the cache efficiency. Note that by constraining the system to cpu cores, the response time may go up a lot. Also the system will be less portable. Check if this is worth the gains.
Adriaan
+2  A: 

You can do this from bash. There is a wonderful taskset command that I learned about in this question (you may also find a valuable discussion there on how the scheduler should operate). The command takes the pid of a process and binds it to the specified processor(s).

taskset -c -p 0 PID

binds the process with PID to processor (core) number 0.

What does this have to do with threads? Each thread is assigned an identifier, known as a "tid", that can be used in the same places as a pid. You can get it with the gettid syscall. You can also see it, for example, in the top program by pressing H (some processes will split into many seemingly identical entries with different pids; those are threads).
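The same idea can be sketched in C: obtain the calling thread's tid and pass it where sched_setaffinity expects a pid. The helper names (current_tid, pin_tid_to_cpu, lowest_allowed_cpu) are invented for this example. The raw syscall(SYS_gettid) form is used on the assumption that the glibc in use may be too old to provide a gettid() wrapper.

```c
#define _GNU_SOURCE   /* needed for cpu_set_t, CPU_* macros and syscall() */
#include <sched.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: kernel thread ID (tid) of the calling thread. */
pid_t current_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* Hypothetical helper: restrict the thread with ID `tid` to one CPU.
 * Returns 0 on success, -1 on failure (errno is set). */
int pin_tid_to_cpu(pid_t tid, int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* sched_setaffinity accepts a tid in its pid argument. */
    return sched_setaffinity(tid, sizeof(set), &set);
}

/* Hypothetical helper: lowest-numbered CPU that thread `tid` may
 * currently run on, or -1 if its mask could not be read. */
int lowest_allowed_cpu(pid_t tid)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(tid, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}
```

Each worker thread could call current_tid() once and report the value, so that it can be matched against the per-thread entries shown by top, or passed to taskset from the shell.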

Pavel Shved