Given: a multithreaded (~20 threads) C++ application running on RHEL 5.3. When tested under load, top shows CPU usage jumping around within the 10-40% range every second.

The design is mostly pretty simple: most of the threads implement the active object design pattern. Each thread has a thread-safe queue; requests from other threads are pushed to that queue, while the thread only polls the queue and processes incoming requests. A processed request causes a new request to be pushed to the next processing thread.
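To make the design concrete, here is a minimal sketch of such an active object. It is not the application's actual code: the class name `ActiveObject` and its `post` method are hypothetical, and it assumes a C++11 compiler (the stock compiler on RHEL 5.3 likely predates C++11, in which case pthreads or ACE primitives would play the same roles). The worker blocks on a condition variable instead of polling, which is worth comparing against how the real queues wait, since a queue that spins or wakes on a short timeout can burn CPU by itself.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical active object: one worker thread draining a thread-safe queue.
class ActiveObject {
public:
    ActiveObject() : done_(false), worker_(&ActiveObject::run, this) {}

    // Signals shutdown, lets the worker drain what is left, then joins it.
    ~ActiveObject() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        ready_.notify_one();
        worker_.join();
    }

    // Called from other threads: enqueue a request and wake the worker.
    void post(std::function<void()> request) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(request));
        }
        ready_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> request;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                // Sleep until there is work or shutdown was requested.
                ready_.wait(lock, [this] { return done_ || !queue_.empty(); });
                if (queue_.empty()) return;  // shutdown requested, nothing left
                request = std::move(queue_.front());
                queue_.pop();
            }
            request();  // run the request outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::function<void()> > queue_;
    bool done_;
    std::thread worker_;  // declared last so it starts after the other members exist
};
```

A pipeline stage is then just a request that posts to the next object, for example `first.post([&] { second.post(work); });`.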

The process has several TCP/UDP connections, over each of which data is received and sent under high load.

I know I have not provided sufficient data. This is a pretty big application, and I'm not very familiar with all its parts. It is now being ported from Windows to Linux on top of the ACE library (used for the networking part).

Supposing the problem is in the application and not an external one, what techniques/tools/approaches can be used to discover the problem? For example, I suspect that this may be caused by some mutex contention.
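One cheap way to test the mutex-contention suspicion from inside the application is to count how often a lock attempt finds the mutex already held. The sketch below is only an illustration under assumptions: `CountingMutex` is a hypothetical wrapper, it uses C++11 `std::mutex` rather than whatever lock type the application (or ACE) actually uses, and because `try_lock` is allowed to fail spuriously the contended count is an upper estimate.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical instrumented mutex: counts total and contended acquisitions.
class CountingMutex {
public:
    CountingMutex() : acquisitions_(0), contended_(0) {}

    void lock() {
        // If the fast path fails, someone else holds the lock: record it,
        // then fall back to a normal blocking lock.
        if (!mutex_.try_lock()) {
            ++contended_;
            mutex_.lock();
        }
        ++acquisitions_;
    }

    void unlock() { mutex_.unlock(); }

    unsigned long acquisitions() const { return acquisitions_.load(); }
    unsigned long contended() const { return contended_.load(); }

private:
    std::mutex mutex_;
    std::atomic<unsigned long> acquisitions_;
    std::atomic<unsigned long> contended_;
};
```

Because it provides `lock()` and `unlock()`, it works with `std::lock_guard<CountingMutex>`. Logging `contended()` against `acquisitions()` for each queue's mutex once per second would show whether contention lines up with the CPU jumps.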

+1  A: 

I faced a similar problem some time back, and here are the steps that helped me.

1) Start by using strace to see where the application spends its time executing system calls.

2) Use OProfile to profile both the application and the kernel.

3) If you are using an SMP system, look at the NUMA settings; in my case they caused havoc. /proc/appPID/numa_maps gives a quick look at how memory is being accessed. NUMA misses can cause the jumps.

4) You mentioned TCP connections in your app. Look at the MTU size and check that it is set to the right value, and, depending on the type of data being transferred, use Nagle's delay appropriately.
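For step 4, this is roughly what turning Nagle's algorithm off looks like on a plain Linux TCP socket. It is a sketch, not a recommendation: the helper names `disable_nagle` and `nagle_disabled` are made up for illustration, and an application built on ACE would set the same `TCP_NODELAY` option through whatever wrapper its socket class exposes. Whether disabling it helps depends on the traffic; it usually matters for small, latency-sensitive messages.

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: disables Nagle's algorithm on a TCP socket.
// Returns true on success.
bool disable_nagle(int fd) {
    int on = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on)) == 0;
}

// Hypothetical helper: reports whether TCP_NODELAY is currently set.
// Returns false if the option cannot be read.
bool nagle_disabled(int fd) {
    int value = 0;
    socklen_t len = sizeof(value);
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &value, &len) != 0) {
        return false;
    }
    return value != 0;
}
```

Calling `nagle_disabled(fd)` on each existing connection is a quick way to check what the ported code currently does before changing anything.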

pv