I have an application that runs on an embedded processor (ARM), and I'd like to profile it to get an idea of where it's using system resources such as CPU, memory, and I/O. The application runs on top of Linux, so I'm assuming there are a number of profiling tools available. Does anyone have any suggestions?

Thanks!

edit: I should also add the version of Linux we're using is somewhat old (2.6.18). Unfortunately I don't have a lot of control over that right now.

+2  A: 

If your Linux is not very limited, then you may find gprof and valgrind useful.

bobah
To use these, I'd have to port them over to the ARM, wouldn't I?
kidjan
Valgrind is, in my experience, unusable on embedded devices; it's just too slow. gprof, on the other hand, works great. gprof is a tool that is provided alongside GCC itself, so if you've got GCC working, you've got gprof. It will produce a stats file which you can later analyze on a PC with full GUI thingies.
Gianni
A: 

On a related note, the C++ working group did a technical report on the performance cost of various C++ language features. For example, they analyze the cost of a dynamic_cast one or two levels deep. The report is here: http://www.open-std.org/jtc1/sc22/wg21/docs/TR18015.pdf and it might give you some insight into where the pain points in your embedded application might be.
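If you want a rough number for your own target rather than the report's figures, a minimal sketch along these lines may help. The class names (`Base`, `Mid`, `Leaf`) and the helpers `count_leaves` and `ns_per_cast` are made up for illustration, and the result will vary with compiler, flags, and hardware, so treat it as an order-of-magnitude check only.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical hierarchy: Leaf sits two levels below Base.
struct Base { virtual ~Base() = default; };
struct Mid : Base {};
struct Leaf : Mid {};

// Counts the objects whose dynamic type is Leaf (or derived from it),
// performing one dynamic_cast per element.
std::size_t count_leaves(const std::vector<Base*>& objs) {
    std::size_t n = 0;
    for (Base* b : objs) {
        if (dynamic_cast<Leaf*>(b) != nullptr) {
            ++n;
        }
    }
    return n;
}

// Average wall-clock nanoseconds per element over `iterations` passes.
// The figure includes loop overhead, so it is an upper bound on the cast cost.
double ns_per_cast(const std::vector<Base*>& objs, int iterations) {
    assert(iterations > 0 && !objs.empty());
    volatile std::size_t sink = 0;  // keeps the optimizer from dropping the loop
    const auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        sink = sink + count_leaves(objs);
    }
    const auto t1 = std::chrono::steady_clock::now();
    const double total_ns =
        std::chrono::duration<double, std::nano>(t1 - t0).count();
    return total_ns / (static_cast<double>(iterations) * objs.size());
}
```

Calling `ns_per_cast(objects, 100000)` on the device, with a vector holding a mix of `Base`, `Mid`, and `Leaf` pointers, gives a per-element cost you can compare against the rest of your hot path.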

sashang
+2  A: 

As bobah said, gprof and valgrind are useful. You might also want to try OProfile. If your application is in C++ (as indicated by the tags), you might want to consider disabling exceptions (if your compiler lets you) and avoiding dynamic casts, as mentioned above by sashang. See also Embedded C++.

George
Thanks, OProfile was the ticket. It's very nice for embedded profiling.
kidjan
A: 

gprof may disappoint you.

Assuming the program you are testing is big enough to be useful, then chances are the call tree could be pruned, so the best opportunities for optimization are function/method calls that you can remove or avoid. That link shows a good way to find them.

Many people approach this as sort of a hierarchical sleuthing process of measuring times. Or you can simply catch it in the act, which is what I do.
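To make "catching it in the act" concrete: once you have a handful of stack samples, however you collected them, the summary step is small. The sketch below assumes each sample is just a list of function names from outermost to innermost frame; `StackSample` and `inclusive_percent` are names invented here, not part of any tool. A function is counted at most once per sample, so recursion does not inflate its share.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// One captured call stack, outermost frame first (hypothetical representation).
using StackSample = std::vector<std::string>;

// For each function, returns the percentage of samples in which it appears
// anywhere on the stack. Duplicate frames within one sample count once.
std::map<std::string, double> inclusive_percent(
        const std::vector<StackSample>& samples) {
    std::map<std::string, double> percent;
    if (samples.empty()) {
        return percent;
    }
    for (const StackSample& sample : samples) {
        // De-duplicate frames so a recursive function is counted once.
        const std::set<std::string> unique_frames(sample.begin(), sample.end());
        for (const std::string& fn : unique_frames) {
            percent[fn] += 1.0;
        }
    }
    for (auto& entry : percent) {
        entry.second = 100.0 * entry.second / samples.size();
    }
    return percent;
}
```

A function that shows up on a large fraction of samples is responsible for roughly that fraction of wall-clock time, which is the figure to look at when deciding which calls to remove or avoid.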

Mike Dunlavey
Yeah, interrupting a program running on an embedded processor... is not easy. So while the method you outline might work well on a desktop with a full-blown debugger attached, it's problematic for embedded development. And OProfile basically uses the same method you're proposing, except in a much more consistent and useful manner.
kidjan
@kidjan: I'm not familiar with OProfile. Does it take stack samples? On wall-clock time (i.e. even when blocked)? Does it summarize percent at the line-of-code level, not just functions? Does it ignore recursion? Then it's getting there. (That's what Zoom and LTProf do.) I've had to use an ICE box on an embedded processor to do this. I'll grant you it may not be easy, but I think nothing finds you the problems like actually studying individual samples of the stack.
Mike Dunlavey
It does all those things. And pausing an embedded processor in a real-time program (we're doing live video streaming, among other real-time-sensitive things, so "Heisenberg" applies) isn't just "not easy," it's "not possible." Tools like OProfile are the only option.
kidjan
@kidjan: Then it sounds like you don't have a problem. That's good.
Mike Dunlavey