The quick summary is that you will probably see the wait in Present(), but it really depends on what the Present() call actually does.
Generally, unless you explicitly ask to be notified when the GPU has finished, you might end up waiting at the (random to you) point where the driver's input buffer fills up. Think of the GPU driver and card as a very long pipeline: you put work in at one end and, after a while, it comes out at the display. You might be able to put several frames' worth of commands into the pipeline before it fills up. The card could be spending a lot of time drawing primitives, but you might see the CPU waiting at a point several frames later.
If your Present() call contains the equivalent of glFinish(), that entire pipeline must drain before that call can return. So, the CPU will wait there.
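To make the pipeline picture concrete, here is a small toy model in Python. It is only a sketch under assumptions of my own: the function name simulate and its parameters (cpu_ms, gpu_ms, depth, finish) are made up for illustration and do not correspond to any real driver or API. It treats the driver as a queue that holds at most depth unfinished frames, and optionally makes every frame end with a glFinish()-style wait.

```python
def simulate(frames, cpu_ms, gpu_ms, depth, finish=False):
    """Toy model: return how long the CPU waits on the GPU for each frame (ms).

    frames  -- number of frames to submit
    cpu_ms  -- CPU time needed to issue one frame's commands
    gpu_ms  -- GPU time needed to execute one frame's commands
    depth   -- how many unfinished frames the (hypothetical) driver queue holds
    finish  -- if True, each frame ends with a glFinish()-style full drain
    """
    gpu_done = []   # time at which the GPU finishes each frame
    waits = []      # CPU wait time observed for each frame
    cpu_time = 0.0
    for i in range(frames):
        cpu_time += cpu_ms          # CPU builds the commands for frame i
        before = cpu_time
        # The queue is full until frame (i - depth) has left the pipeline.
        if i >= depth:
            cpu_time = max(cpu_time, gpu_done[i - depth])
        # The GPU starts frame i once it is submitted and the previous one is done.
        start = max(cpu_time, gpu_done[-1] if gpu_done else 0.0)
        gpu_done.append(start + gpu_ms)
        # A finish-style call blocks until the GPU has drained this frame too.
        if finish:
            cpu_time = gpu_done[-1]
        waits.append(cpu_time - before)
    return waits


if __name__ == "__main__":
    print("buffered:", simulate(6, cpu_ms=1, gpu_ms=10, depth=3))
    print("finish:  ", simulate(3, cpu_ms=1, gpu_ms=10, depth=3, finish=True))
```

With buffering, the first few frames show no wait at all even though the GPU is the bottleneck; the stall only shows up once the queue is full, several frames after the expensive work was submitted. With finish=True, every frame pays the full GPU time at the point of the blocking call.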
I hope the following breakdown of a typical frame is helpful:
Clear ();
Causes all the pixels in the current buffer to change color, so the GPU is doing
work. Look up your GPU's clear rate (pixels/sec) to see how long this should
take.
SetBuffers ();
SetTexture ();
The driver may do some work here, but generally it wants to wait until you
actually do drawing to use this new data. In any event, the GPU doesn't do
much here.
DrawPrimitives ();
Now here is where the GPU should be doing most of the work. Depending on the
primitive size, you'll be limited by vertices/sec or pixels/sec. Or, if you have
an expensive shader, you'll be limited by shader instructions/sec.
However, you may not see this as the place the CPU is waiting. The driver
may buffer the commands for you, and the CPU may be able to continue on.
Present ();
At this point, the GPU work is minimal. It just changes a pointer to start displaying from a different buffer.
However, this is probably where the CPU appears to be waiting on the GPU. Depending on your API, "Present()" may include something like glFlush() or glFinish(). If it includes the equivalent of glFinish(), then you'll likely wait here.
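To sanity-check the Clear() and DrawPrimitives() costs described above, you can do a rough back-of-the-envelope estimate from your card's published rates. The sketch below is only that: the function names are hypothetical, the rates in the usage example are placeholders rather than figures for any real GPU, and it assumes the slower of the vertex and pixel stages bounds the draw.

```python
def clear_time_ms(width, height, clear_pixels_per_sec):
    """Estimated time to clear a width x height buffer, in milliseconds."""
    return width * height / clear_pixels_per_sec * 1000.0


def draw_time_ms(vertices, pixels, vertices_per_sec, pixels_per_sec):
    """Estimated draw time in milliseconds.

    Assumes the draw is limited by whichever stage is slower:
    vertex processing or pixel fill.
    """
    vertex_sec = vertices / vertices_per_sec
    pixel_sec = pixels / pixels_per_sec
    return max(vertex_sec, pixel_sec) * 1000.0


if __name__ == "__main__":
    # Placeholder rates: substitute the numbers for your own card.
    print("clear:", clear_time_ms(1920, 1080, 1e9), "ms")
    print("draw: ", draw_time_ms(1_000_000, 4_000_000, 1e9, 1e9), "ms")
```

If the time you measure per frame is far above these estimates, the CPU is probably not waiting on that call's own work but on the pipeline draining work queued earlier.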