tags:

views:

413

answers:

3

I try to profile a simple c prog using valgrind:

[zsun@nel6005001 ~]$ valgrind --tool=memcheck ./fl.out
==2238== Memcheck, a memory error detector
==2238== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==2238== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==2238== Command: ./fl.out
==2238==
==2238==
==2238== HEAP SUMMARY:
==2238== in use at exit: 1,168 bytes in 1 blocks
==2238== total heap usage: 1 allocs, 0 frees, 1,168 bytes allocated
==2238==
==2238== LEAK SUMMARY:
==2238== definitely lost: 0 bytes in 0 blocks
==2238== indirectly lost: 0 bytes in 0 blocks
==2238== possibly lost: 0 bytes in 0 blocks
==2238== still reachable: 1,168 bytes in 1 blocks
==2238== suppressed: 0 bytes in 0 blocks
==2238== Rerun with --leak-check=full to see details of leaked memory
==2238==
==2238== For counts of detected and suppressed errors, rerun with: -v
==2238== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 12 from 8)
Profiling timer expired

The c code I am trying to profile is the following:

void forloop(void){
    int fac=1;
    int count=5;
    int i,k;

    for (i = 1; i <= count; i++){
        for(k=1;k<=count;k++){
            fac = fac * i;
        }
    }
}

"Profiling timer expired" shows up, what does it mean? How to solve this problem? thx!

A: 

You are not going to be able to compute 10000! like that. You will need some sort of bignum implementation for computing factorials. This is because int is "usually" 4 bytes long which means that "usually" it can hold 2^32 - 1 (signed int, 2^31) - 13! is more than that. Even if you used an unsigned long ("usually" 8 bytes) you'd overflow by the time you reached 21!.

As for what it "profiling timer expired" means - it means valgrind received the signal SIGPROF: http://en.wikipedia.org/wiki/SIGPROF (probably means your program took too long).

laura
ok, if i change the code like this:int main(void){ int fac=1; int count=10; int k; for(k=1;k<=count;k++) { fac = fac * k; } return 0;}i type the commands:[zsun@nel6005001 ~]$ gcc -g -pg -o fl.out forAndWhileLoop.c[zsun@nel6005001 ~]$ valgrind --tool=massif ./fl.out==2639== Massif, a heap profiler==2639== Copyright (C) 2003-2009, and GNU GPL'd, by Nicholas Nethercote==2639== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info==2639== Command: ./fl.out==2639== ==2639== Profiling timer expired,,same problem. why?
martin
+2  A: 

The problem is that you are using valgrind on a program compiled with -pg. You cannot use valgrind and gprof together. The valgrind manual suggests using OProfile if you are on Linux and need to profile the actual emulation of the program under valgrind.

mark4o
if i use oprofile, then what's the use of valgrind?
martin
valgrind memcheck detects memory errors, gprof and oprofile help to find performance issues. You cannot use gprof and valgrind at the same time, and it wouldn't be that useful unless you are working on the performance of valgrind itself or one of its tools. If you want to check for memory errors and also want to find performance issues in your own program then run it once under valgrind and then compile it with -pg and run that one under gprof.
mark4o
A: 

By the way, this isn't computing factorial.

If you're really trying to find out where the time goes, you could try stackshots. I put an infinite loop around your code and took 10 of them. Here's the code:

 6: void forloop(void){ 
 7:   int fac=1; 
 8:   int count=5; 
 9:   int i,k; 
10:
11:   for (i = 1; i <= count; i++){ 
12:       for(k=1;k<=count;k++){ 
13:           fac = fac * i; 
14:       } 
15:   } 
16: } 
17:
18: int main(int argc, char* argv[])
19: {
20: int i;
21: for (;;){
22:     forloop();
23: }
24: return 0;
25: }

And here are the stackshots, re-ordered with the most frequent at the top:

forloop() line 12
main() line 23

forloop() line 12 + 21 bytes
main() line 23

forloop() line 12 + 21 bytes
main() line 23

forloop() line 12 + 9 bytes
main() line 23

forloop() line 13 + 7 bytes
main() line 23

forloop() line 13 + 3 bytes
main() line 23

forloop() line 6 + 22 bytes
main() line 23

forloop() line 14
main() line 23

forloop() line 7
main() line 23

forloop() line 11 + 9 bytes
main() line 23

What does this tell you? It says that line 12 consumes about 40% of the time, and line 13 consumes about 20% of the time. It also tells you that line 23 consumes nearly 100% of the time.

That means unrolling the loop at line 12 might potentially give you a speedup factor of 100/(100-40) = 100/60 = 1.67x approximately. Of course there are other ways to speed up this code as well, such as by eliminating the inner loop, if you're really trying to compute factorial.

I'm just pointing this out because it's a bone-simple way to do profiling.

Mike Dunlavey