As Jon Skeet wrote above, there are two strategies: instrumentation and sampling.
Instrumentation can be done manually or automatically. In the manual case, the developer inserts code to track the start and end of a region of interest, for example a simple "StartTimer" and "EndTimer" pair. Some profiler tools can do this automatically as well; for that, the profiler does a static analysis of the code, i.e. it parses the code and identifies important checkpoints such as the start and end of particular methods. This is easiest with languages that support reflection (e.g. any .NET language), since through reflection the profiler can rebuild the entire source code tree (along with call graphs).
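To make the manual case concrete, here is a minimal sketch in Python (the language is only for illustration); the `timed` helper is invented for this sketch, not part of any profiler API:

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label):
    """Hand-rolled 'StartTimer'/'EndTimer' pair around a region of interest."""
    start = time.perf_counter()                     # StartTimer
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start       # EndTimer
        print(f"{label}: {elapsed * 1000:.2f} ms")

# Instrument only the region you care about:
with timed("build_squares"):
    data = [i * i for i in range(1_000_000)]
```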
Sampling is done by the profiler itself, which looks at the binary code while it runs. The profiler can also use techniques such as hooks, or trapping Windows events/messages, for the purpose of profiling.
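For intuition only, here is a toy sampler in Python that periodically records which function the main thread is in. A real sampling profiler works on the compiled binary and native call stacks, but the principle is the same; `sample_main_thread` and `busy_work` are names invented for this sketch:

```python
import collections
import sys
import threading
import time

def sample_main_thread(interval, duration):
    """Every `interval` seconds, record which function the main thread is executing."""
    counts = collections.Counter()
    main_id = threading.main_thread().ident
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        frame = sys._current_frames().get(main_id)
        if frame is not None:
            counts[frame.f_code.co_name] += 1   # one sample attributed to that function
        time.sleep(interval)                    # the interval sets the sampling frequency
    return counts

def busy_work():
    total = 0
    for i in range(20_000_000):
        total += i % 7
    return total

sampler = threading.Thread(target=lambda: print(sample_main_thread(0.01, 1.0)))
sampler.start()
busy_work()
sampler.join()
```

Functions that show up in most of the samples are where the time goes, and nothing in the measured code had to be modified.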
Both instrumentation and sampling have their own overhead. How much depends on the setup; for example, if the sampling frequency is set too high, the profiling itself can contribute significantly to the numbers being reported.
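You can see instrumentation overhead directly by timing a small function with and without a per-call timing wrapper; this is only a rough experiment, assuming the wrapper below is representative:

```python
import time
import timeit

def work():
    return sum(i * i for i in range(100))

def instrumented_work():
    t0 = time.perf_counter()                 # the per-call timing is the instrumentation cost
    result = work()
    _elapsed = time.perf_counter() - t0
    return result

plain = timeit.timeit(work, number=100_000)
wrapped = timeit.timeit(instrumented_work, number=100_000)
print(f"plain: {plain:.3f}s  instrumented: {wrapped:.3f}s  overhead: {wrapped - plain:.3f}s")
```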
Instrumentation vs. sampling:
It is not that one approach is better than the other; both have their place.
The best approach is to start with a sampling-based profiler and look at the whole system. That is, run the sampler and watch the system-wide resource usage: memory, disk, network, CPU.
From that output, identify the resources that are getting choked.
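A rough way to take that system-wide snapshot from code, assuming the third-party psutil package is installed (a full sampling profiler or the OS's own monitors give a much richer view):

```python
import psutil

# One coarse snapshot of system-wide resource usage.
cpu = psutil.cpu_percent(interval=1)        # % CPU over a 1-second window
mem = psutil.virtual_memory().percent       # % physical memory in use
disk = psutil.disk_io_counters()            # cumulative disk reads/writes
net = psutil.net_io_counters()              # cumulative bytes sent/received

print(f"CPU {cpu}%  memory {mem}%")
print(f"disk read {disk.read_bytes} B, written {disk.write_bytes} B")
print(f"net sent {net.bytes_sent} B, received {net.bytes_recv} B")
```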
With that information you can now add instrumentation to your code to pinpoint the culprit. For example, if memory is the most heavily used resource, it helps to instrument your memory-allocation-related code. Note that with instrumentation you are really concentrating on one particular area of your code.
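For the memory case, Python's standard tracemalloc module is one way to instrument just the suspect area; `suspected_hotspot` is a hypothetical stand-in for the code identified by the sampler:

```python
import tracemalloc

def suspected_hotspot():
    # Stand-in for the allocation-heavy code identified at the system level.
    return [str(i) * 10 for i in range(100_000)]

tracemalloc.start()                                  # begin tracking allocations
data = suspected_hotspot()
snapshot = tracemalloc.take_snapshot()
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"current: {current / 1024:.0f} KiB, peak: {peak / 1024:.0f} KiB")
for stat in snapshot.statistics("lineno")[:3]:       # top allocation sites by line
    print(stat)
```

Because tracking starts and stops around the suspect region only, the rest of the program runs at full speed.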