Some of the platforms I develop on don't have profiling tools. I'm looking for suggestions/techniques that you have personally used to identify hotspots without the use of a profiler.

The target language is C++.

I am interested in what you have personally used.

+3  A: 

In essence, if a profiling tool is not available, you emulate what a profiler would have done. You insert counters into functions you think are interesting and count how many times they're called, and potentially with what size/sort of arguments.

If you have access to any timers on your platform, you can start/stop them at the beginning/end of said functions to get execution-time information as well, if it isn't clear from the code. This gives you the biggest bang for your buck in complex code, since there will usually be too many functions to instrument them all. Instead, you can obtain the time spent in particular sections of code by dedicating a timer to each one.

These two techniques in tandem form an iterative approach: first find the broad section of code that consumes the majority of your cycles using timers, then instrument individual functions at a finer granularity to home in on the problem.

Matt J
+2  A: 

If the run is sufficiently long in duration (e.g. a minute or more), I run the software in a debugger, then break a few times and see where the debugger stops. This gives a very rough idea of what the software is up to (e.g. if you break 10 times and every break lands in the same place, that tells you something interesting!). Very rough and ready, but it doesn't require any tools, instrumentation, etc.

Celestial M Weasel
+5  A: 

I've found the following quite useful:

#ifdef PROFILING
// Note: timeGetTime() is Windows-specific (declared in <mmsystem.h>,
// pulled in by <windows.h>; link with winmm.lib).
# define PROFILE_CALL(x) do{ \
    const DWORD t1 = timeGetTime(); \
    x; \
    const DWORD t2 = timeGetTime(); \
    std::cout << "Call to '" << #x << "' took " << (t2 - t1) << " ms.\n"; \
  }while(false)
#else
# define PROFILE_CALL(x) x
#endif

Which can be used in the calling function as such:

PROFILE_CALL(renderSlow(world));
int r = 0;
PROFILE_CALL(r = readPacketSize());
Andreas Magnusson
I like this. One can define it as x; for normal operation.
EvilTeach
Yes exactly, I do that but I just forgot to add it to my answer. Thanks for the reminder.
Andreas Magnusson
+6  A: 

No joke: In addition to dumping timings to std::cout and other text/data oriented approaches I also use the Beep() function. There's something about hearing the gap of silence between two "Beep" checkpoints that makes a different kind of impression.

It's like the difference between looking at written sheet music and actually HEARING the music. It's like the difference between reading rgb(255,0,0) and seeing fire-engine red.

So, right now, I have a client/server app and with Beeps of different frequencies, marking where the client sends the message, where the server starts its reply, finishes its reply, where reply first enters the client, etc, I can very naturally get a feel for where the time is spent.

Corey Trager
What an excellent idea.
Preet Sangha
I use this technique to mark when constructors and destructors are called.
da_code_monkey
Back in my TRS-80 Model 1 days, I had a friend who wrote a realtime Star Trek game. You would put a radio by the keyboard and discover that the code had some special loops in it that caused sound effects to come over the radio.
EvilTeach
+1  A: 

I would use the 80/20 rule and put timers around hotspots or interesting call paths. You should have a general idea of where the bottlenecks will be (or at least a majority of the execution paths), and use the appropriate platform-dependent high-resolution timer (QueryPerformanceCounter, gettimeofday, etc.).

I usually don't bother with anything at startup or shutdown (unless needed) and will have well-defined "choke points", usually message passing or some sort of algorithmic calculation. I've generally found that message sinks/sources (sinks more so), queues, mutexes, and just plain mess-ups (algorithms, loops) account for most of the latency in an execution path.

PiNoYBoY82
I agree with the 80/20 rule: the problem is that human beings are usually bad at guessing where the bottlenecks are... yet the 80/20 rule can still be applied :)
Nicola Bonelli
+2  A: 

I'm not sure what platforms you had in mind, but on embedded microcontrollers, it's sometimes helpful to twiddle a spare digital output line and measure the pulse width using an oscilloscope, counter/timer, or logic analyzer.

bk1e
+1  A: 

Are you using Visual Studio?

Then you can use the /Gh and /GH switches. Here's an example involving stack inspection.

These flags allow you, on a file-by-file basis, to register undecorated hook functions (_penter and _pexit) that are called every time a function is entered and/or left at runtime.

You can then record all kinds of profiling information, not just timing: stack dumps, calling address, return address, etc. This is important because you may want to know that "function X used Y time under function Z", and not just the total time spent in function X.

kervin