Have you used any profiling tool like Intel VTune Analyzer?

What are your recommendations for a multithreaded C++ application on Linux and Windows? I am primarily interested in cache misses, memory usage, memory leaks and CPU usage.

I use valgrind (only on UNIX), but mainly for finding memory errors and leaks.

+1  A: 

The Rational PurifyPlus suite includes both a well-proven leak detector and a pretty good profiler. I'm not sure whether it goes down to the level of cache misses, though. You might need VTune for that.

PurifyPlus is available on both various Unices and Windows, so it should cover your requirements, but unfortunately, in contrast to Valgrind, it isn't free.

Timo Geusch
+4  A: 

The following are good tools for multithreaded applications. You can try an evaluation copy of each.

  1. Runtime sanity-check tool: Intel Thread Checker / VTune
  2. Memory consistency-check tool (memory usage, memory leaks): Memory Validator
  3. Performance analysis (CPU usage): AQTime

EDIT: Intel Thread Checker can be used to diagnose data races, deadlocks, stalled threads, abandoned locks, etc. Be patient when analyzing the results, as it is easy to get confused.

A few tips:

  1. Disable the features that are not required. (When looking for deadlocks, data race detection can be disabled, and vice versa.)
  2. Use the instrumentation level that fits your need. Levels like "All Function" and "Full Image" are used for data races, whereas "API Imports" can be used for deadlock detection.
  3. Use the context-sensitive menu item "Diagnostic Help" often.
aJ
+2  A: 

VTune gives you a lot of detail on what the processor is doing, and sometimes I find it hard to see the wood for the trees. VTune will not report memory leaks. You'll need PurifyPlus for that, or, if you can run on a Linux box, Valgrind is good for memory leaks at a great price.

VTune shows two views. The tabular one is useful; the other, I think, is mostly there for salesmen to impress people with and is not that useful.

For a quick and cheap option I'd go with Valgrind. Valgrind also has a Cachegrind part to it. I've not used it, but I suspect it's very good too.
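To give a rough idea of the kind of cache-miss difference a cache profiler is meant to expose, here is a minimal sketch. The function names `sum_row_major` and `sum_col_major` are made up for illustration and are not part of any tool. Both compute the same sum over a matrix stored row by row, but the second walks memory with a stride, which tends to miss the cache more often on large matrices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration only.
// The matrix is stored row-major in a flat vector: element (r, c) is m[r * cols + c].

// Walks memory sequentially, which is usually cache-friendly.
long long sum_row_major(const std::vector<int>& m, std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    long long total = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            total += m[r * cols + c];
    return total;
}

// Same result, but consecutive accesses are `cols` elements apart,
// so a large matrix is likely to cause more data-cache misses.
long long sum_col_major(const std::vector<int>& m, std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    long long total = 0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            total += m[r * cols + c];
    return total;
}
```

Calling both on a large matrix from a small test program, building with `-g`, and running it under `valgrind --tool=cachegrind` should attribute more data-cache misses to the second loop. Depending on the Valgrind version, cache simulation may need to be switched on with `--cache-sim=yes`, and the exact numbers will vary with the machine and the matrix size.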

cheers, Martin.

martsbradley
+2  A: 

On Linux, try oprofile. It supports various performance counters.

On Windows, AMD's CodeAnalyst (free, unlike VTune) is worth a look. It only supports event profiling on AMD hardware though (on Intel CPUs it's just a handy timer-based profiler).

A colleague recently tried Intel Parallel Studio (beta) and rated it favourably (it found some interesting parallelism-related issues in some code).

timday
+1  A: 

I'll put in another answer for valgrind, especially the callgrind portion with the UI. It can handle multiple threads by profiling each thread for cache misses, etc. Valgrind also has a multithread error checker called helgrind, but I've never used it and don't know how good it is.

Caleb Huitt - cjhuitt
Helgrind is quite good at finding potential threading issues such as mutex lock order inconsistencies, race conditions, etc. It only works with pthreads, though, so users of other threading libraries may be out of luck. It also runs much slower than plain valgrind on my machine, so patience is key when using it!
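As a sketch of what a lock order inconsistency looks like, consider the code below. All names in it (`mutex_a`, `mutex_b`, `lock_a_then_b`, `lock_b_then_a`, `run_sequentially`) are hypothetical, and it assumes a Linux toolchain where `std::mutex` and `std::thread` are built on pthreads. The two threads run one after the other, so this particular run cannot deadlock, but the two functions take the same pair of mutexes in opposite orders.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical names, for illustration only.
std::mutex mutex_a;
std::mutex mutex_b;
int shared_counter = 0;

// Takes mutex_a first, then mutex_b.
void lock_a_then_b() {
    std::lock_guard<std::mutex> guard_a(mutex_a);
    std::lock_guard<std::mutex> guard_b(mutex_b);
    ++shared_counter;
}

// Takes the same two mutexes in the opposite order.
// If both functions ran concurrently, each thread could end up
// holding one mutex while waiting forever for the other.
void lock_b_then_a() {
    std::lock_guard<std::mutex> guard_b(mutex_b);
    std::lock_guard<std::mutex> guard_a(mutex_a);
    ++shared_counter;
}

// Runs the two threads strictly one after the other, so this run
// always finishes; only the inconsistent ordering remains to be found.
int run_sequentially() {
    shared_counter = 0;
    std::thread first(lock_a_then_b);
    first.join();
    std::thread second(lock_b_then_a);
    second.join();
    assert(shared_counter == 2);  // both threads did their increment
    return shared_counter;
}
```

Built with `-g -pthread` and run under `valgrind --tool=helgrind`, a program calling `run_sequentially()` should produce a lock order report even though nothing actually hangs. Acquiring both mutexes together with `std::lock`, or agreeing on one fixed order everywhere, is the usual fix.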
Soo Wei Tan
+1  A: 

For simple profiling, gprof is pretty good.

sean riley