Have you used any profiling tools, such as the Intel VTune analyzer?
What would you recommend for profiling a multithreaded C++ application on Linux and Windows? I am primarily interested in cache misses, memory usage, memory leaks, and CPU usage.
I use Valgrind (only on UNIX), but mainly for finding memory errors and leaks.