Path profiling is interesting as a theoretical problem. gprof is also interesting, because it deals in call graphs, cyclical subgraphs, and such. There are nice algorithms for manipulating this information and propagating measurements throughout a structure.
All of which might tempt you to think these tools work for finding general performance problems (though they never say they do).
However, suppose your program hangs. How do you find the problem?
What I do is get it into the infinite loop, and then interrupt (pause) it to see what it's doing. I look at the code on each level of the call stack, because I know the loop is somewhere on the stack. If it's not obvious, I just step it along until I see it repeating itself, and then I know where the problem is. I suspect almost anyone would do that.
In fact, if you stop the program while it's taking too long and examine its state several times, you can not only find infinite loops, but almost any problem where the program runs longer than you would like.
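As a rough sketch of that idea (not one of the tools named here, and no substitute for pausing in a real debugger), the following Python takes a handful of call-stack snapshots of a running thread and reports how often each function was on the stack. It assumes CPython, since it relies on `sys._current_frames()`; the names `sample_stacks`, `fraction_on_stack`, `slow_part`, `workload`, and `run_demo` are made up for illustration.

```python
import collections
import sys
import threading
import time
import traceback


def sample_stacks(thread_id, n_samples=10, interval=0.02):
    """Take up to n_samples snapshots of one thread's call stack."""
    snapshots = []
    for _ in range(n_samples):
        time.sleep(interval)  # let the program run, then "pause" and look
        frame = sys._current_frames().get(thread_id)  # CPython-specific
        if frame is None:  # the thread has already finished
            break
        # extract_stack lists entries from the outermost call to the innermost
        stack = traceback.extract_stack(frame)
        snapshots.append([(entry.name, entry.lineno) for entry in stack])
    return snapshots


def fraction_on_stack(snapshots):
    """Map each function name to the fraction of snapshots it appears in."""
    if not snapshots:
        return {}
    counts = collections.Counter()
    for snapshot in snapshots:
        # Count a function once per snapshot, even if it recurses
        for name in {name for name, _ in snapshot}:
            counts[name] += 1
    return {name: count / len(snapshots) for name, count in counts.items()}


def slow_part():
    """Hypothetical hot spot: pure busy work."""
    total = 0
    for i in range(200_000):
        total += i * i
    return total


def workload(stop):
    """Hypothetical program that 'runs longer than you would like'."""
    while not stop.is_set():
        slow_part()


def run_demo():
    stop = threading.Event()
    worker = threading.Thread(target=workload, args=(stop,), daemon=True)
    worker.start()
    try:
        snapshots = sample_stacks(worker.ident)
    finally:
        stop.set()
        worker.join()
    return fraction_on_stack(snapshots)


if __name__ == "__main__":
    for name, fraction in sorted(run_demo().items(), key=lambda kv: -kv[1]):
        print(f"{fraction:5.0%}  {name}")
```

A function that shows up in most of the snapshots is responsible for roughly that share of the run time, so it is the place to look. The snapshots also keep line numbers, which is where the real insight tends to come from; the summary by function name is only the crudest way to read them.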
There are profiler tools based on this concept, such as Zoom and LTProf, but for my money nothing gives as much insight as thoroughly understanding representative snapshots.
You won't find good references on this technique because (oddly) not many people are aware of it, and it's too simple to publish.
There's considerably more to say on the subject.
Actually, FWIW, I "published" an article on it, but it was only reviewed by an editor, and I don't think anyone's actually read it: Dunlavey, “Performance tuning with instruction-level cost derived from call-stack sampling”, ACM SIGPLAN Notices 42, 8 (August, 2007), pp. 4-8.