I'm working on an embedded C++ application running on Linux. I've recently encountered some really strange performance problems with pthreads.

My system has 8 threads passing information back and forth, protected by a pthread mutex. When I run my application stand-alone, taking the mutex lock is abysmally slow. Locking and unlocking the mutex ~200 times takes 2.4 seconds on a 500 MHz ARM board, and longer on my 200 MHz board.

The strange thing is that when I run my application under GDB, it runs extremely quickly. The same block of code that took 2.4 seconds stand-alone takes about 2 ms when GDB is running.

I've tested this behavior on 2 different ARM-based SBCs: one running Linux 2.4.26 with gcc 3.4.4 and glibc 2.3.2, and the other running Linux 2.6.21 also with gcc 3.4.4 and glibc 2.3.2.

After extensive testing, I suspect that the problem lies in the pthreads library, which happens to be the same version in both boards' toolchains. This would be unfortunate, as my SBC supplier doesn't offer a very wide variety of toolchains for their boards and I'm afraid that they'll all have this problem. Does anyone have any insight into what could be causing poor performance when not running under GDB?

A: 

Are you sure you're initializing the mutex the way you think you are? Is it a file-scope variable, or is it allocated dynamically? I'm thinking that GDB ends up giving you a different set of options on the mutex due to the luck of the memory initialization draw.

bmargulies
Thanks for the feedback. It's a global variable, initialized like so: pthread_mutex_t pThreadMutex = PTHREAD_MUTEX_INITIALIZER;
Maha
Urk. That should be safe in either case. I suppose you could try an explicit init and make sure you like the options.
bmargulies
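
As a rough illustration of the explicit init suggested above, here is a minimal sketch. The helper names init_mutex_explicit and detects_self_relock are made up for this example and are not part of pthreads. It assumes the toolchain's pthread library supports pthread_mutexattr_settype and PTHREAD_MUTEX_ERRORCHECK.

```cpp
#include <pthread.h>
#include <cerrno>

// Hypothetical helper: initialize a mutex with an explicitly chosen type
// instead of relying on PTHREAD_MUTEX_INITIALIZER.
// Returns 0 on success or a pthread error code.
int init_mutex_explicit(pthread_mutex_t *m, int type) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;

    rc = pthread_mutexattr_settype(&attr, type);
    if (rc == 0) rc = pthread_mutex_init(m, &attr);

    // The attribute object is no longer needed once the mutex exists.
    pthread_mutexattr_destroy(&attr);
    return rc;
}

// Hypothetical check: returns true if a second lock from the same thread
// is reported as EDEADLK. Only call this on an error-checking or recursive
// mutex; on a default mutex the second lock may block forever.
bool detects_self_relock(pthread_mutex_t *m) {
    if (pthread_mutex_lock(m) != 0) return false;
    int rc = pthread_mutex_lock(m);
    if (rc == 0) pthread_mutex_unlock(m);  // recursive type: undo the nested lock
    pthread_mutex_unlock(m);
    return rc == EDEADLK;
}
```

Initializing with PTHREAD_MUTEX_ERRORCHECK while debugging makes an accidental re-lock from the same thread show up as an error code rather than as a hang or a stall.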
+1  A: 

One thing you could look at is the spin count value, which is almost certainly different on ARM than on an Intel system.

You can try using "pthread_mutex_trylock()" and then calling "sched_yield()" explicitly if it can't acquire the lock, maybe adding a millisecond sleep inside the loop. This breaks the lock operation into steps and lets you see whether there is contention on the lock.
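
A minimal sketch of that trylock loop, under the assumption that you just want to count how often the lock was busy. The function name lock_with_yield is invented for this example.

```cpp
#include <pthread.h>
#include <sched.h>
#include <cerrno>

// Hypothetical helper: acquire the mutex by polling with trylock.
// Returns the number of failed attempts before the lock was taken
// (0 means no contention), or -1 on an unexpected error.
long lock_with_yield(pthread_mutex_t *m) {
    long misses = 0;
    for (;;) {
        int rc = pthread_mutex_trylock(m);
        if (rc == 0) return misses;   // lock acquired
        if (rc != EBUSY) return -1;   // something other than contention
        ++misses;
        sched_yield();                // let the current holder run
    }
}
```

Logging the returned count around the slow section should show whether the time is going into contention. Note that if the calling thread already holds a non-recursive mutex, this loop never returns, so it is a diagnostic aid rather than a replacement for pthread_mutex_lock.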

I would bet that you need to look at the source code or at least the disassembled "pthread_mutex_lock" to fix it.

Lothar
+3  A: 

I've never had a problem with pthreads on ARM, and I suspect there is a race or an initialisation problem in your code. Try to reduce your code to the minimum that reproduces the problem. You should post that code here, or at least the part you think is relevant.

And don't forget: usually, select isn't broken.

Are you using LinuxThreads or NPTL ("kernel" threads)? If you are using the latter, you can also try running your application under strace.

shodanex
You were right, it wasn't broken of course. :) I narrowed the problem down to a 3rd-party framework we're using for state machine functionality. It was using a standard pthread mutex instead of a recursive one. Changing the mutex to recursive causes the program to run the same way it does under GDB. Thanks for your feedback!
Maha
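
For anyone landing here later, a minimal sketch of what switching to a recursive mutex looks like. The framework's actual code isn't shown in this thread, so this only illustrates the general technique, assuming the same thread locks the mutex more than once. The names init_recursive_mutex and nested_lock_roundtrip are made up for the example.

```cpp
#include <pthread.h>

// Hypothetical helper: create a mutex that the owning thread may lock
// repeatedly. Returns 0 on success or a pthread error code.
int init_recursive_mutex(pthread_mutex_t *m) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;

    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0) rc = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    return rc;
}

// Locks the mutex twice from the same thread, then unlocks twice.
// Returns 0 if every step succeeded. Only call this on a recursive
// mutex; on a default mutex the nested lock may never return.
int nested_lock_roundtrip(pthread_mutex_t *m) {
    int rc = pthread_mutex_lock(m);
    if (rc != 0) return rc;

    rc = pthread_mutex_lock(m);       // nested lock by the owner
    if (rc != 0) {
        pthread_mutex_unlock(m);
        return rc;
    }

    pthread_mutex_unlock(m);          // release the nested lock
    return pthread_mutex_unlock(m);   // release the outer lock
}
```

A recursive mutex has to be unlocked as many times as it was locked before another thread can take it.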