esp is, as you say, the top of the stack.
ebp is usually set to esp at the start of the function. Local variables are accessed by subtracting a constant offset from ebp, and parameters by adding one. The common x86 calling conventions treat ebp as callee-saved, so it is preserved across function calls. The memory that ebp points to holds the previous frame's base pointer, which is what lets a debugger walk the stack and view other frames' local variables.
Most function prologs look something like:
push ebp ; Preserve current frame pointer
mov ebp, esp ; Create new frame pointer pointing to current stack top
sub esp, 20 ; allocate 20 bytes worth of locals on stack.
Then later in the function you may have code like this (presuming both local variables are 4 bytes):
mov [ebp-4], eax ; Store eax in first local
mov ebx, [ebp - 8] ; Load ebx from second local
FPO (frame pointer omission) is an optimization you can enable that eliminates the frame pointer setup: ebp becomes just another general-purpose register and locals are accessed directly off esp. This makes debugging a bit more difficult, since the debugger can no longer follow the chain of saved frame pointers to reach the stack frames of earlier function calls.
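To make the saved-ebp chain concrete, here is a rough sketch in C that models the stack as a small array and replays the prolog for three nested calls. Everything in it is an assumption for illustration: the array-backed stack, the helper names (push, load, store, prolog, count_frames, demo_frame_chain), the made-up return addresses, and the use of 0 as the "no more frames" marker. It is not how a real debugger is implemented, only the idea behind it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a tiny stack of 32-bit words that grows downward.
   esp and ebp hold byte offsets into stack_mem, not real addresses. */
enum { STACK_BYTES = 256 };
static uint32_t stack_mem[STACK_BYTES / 4];
static uint32_t esp = STACK_BYTES;  /* empty stack: esp is just past the end */
static uint32_t ebp = 0;            /* 0 marks the outermost frame in this model */

static void push(uint32_t v) { esp -= 4; stack_mem[esp / 4] = v; }
static uint32_t load(uint32_t addr) { return stack_mem[addr / 4]; }
static void store(uint32_t addr, uint32_t v) { stack_mem[addr / 4] = v; }

/* push ebp / mov ebp, esp / sub esp, locals_bytes */
static void prolog(uint32_t locals_bytes) {
    push(ebp);
    ebp = esp;
    esp -= locals_bytes;
}

/* Follow the chain of saved frame pointers: [ebp] holds the caller's ebp. */
static int count_frames(void) {
    int n = 0;
    for (uint32_t fp = ebp; fp != 0; fp = load(fp)) {
        n++;
    }
    return n;
}

/* Simulate three nested calls and return how many frames the walk finds. */
static int demo_frame_chain(void) {
    esp = STACK_BYTES;
    ebp = 0;
    for (uint32_t depth = 1; depth <= 3; depth++) {
        push(0x1000 + depth);   /* made-up return address pushed by call */
        prolog(20);             /* 20 bytes of locals, as in the prolog above */
        store(ebp - 4, depth);  /* like mov [ebp-4], eax: first local */
    }
    return count_frames();
}
```

In this model, demo_frame_chain() finds all three frames, and load(load(ebp) - 4) reads the caller's first local through the saved frame pointer. With FPO there is no saved ebp to follow, so this kind of walk has nothing to chain through.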
EDIT:
For your updated question, the two missing entries in the stack are the saved frame pointer and the return address:
var_C= dword ptr -0Ch
var_8= dword ptr -8
var_4= dword ptr -4
savedFramePointer= dword ptr 0
return address= dword ptr 4
hInstance= dword ptr 8h
PrevInstance= dword ptr 0Ch
hlpCmdLine= dword ptr 10h
nShowCmd= dword ptr 14h
This is because the flow of the function call is:
- Push parameters (hInstance, etc.)
- Call function, which pushes return address
- Push ebp
- Allocate space for locals
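The four steps above can be replayed in a small C model to check that the offsets in the listing come out as shown. This is only a sketch under stated assumptions: the stack is a plain array, esp and ebp are byte offsets into it, the parameter values 1 to 4 and the return address 0x401000 are made up, and the parameters are assumed to be pushed last-to-first (as in stdcall and cdecl). The names simulate_call, push, and load are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a small downward-growing stack of 32-bit words.
   esp and ebp are byte offsets into stack_mem, not real addresses. */
enum { STACK_BYTES = 128 };
static uint32_t stack_mem[STACK_BYTES / 4];
static uint32_t esp, ebp;

static void push(uint32_t v) { esp -= 4; stack_mem[esp / 4] = v; }
static uint32_t load(uint32_t addr) { return stack_mem[addr / 4]; }

/* Replays the call flow with made-up values; afterwards ebp is the frame base. */
static void simulate_call(void) {
    esp = STACK_BYTES;
    ebp = 0xBEEF;        /* stand-in for the caller's frame pointer */

    /* 1. Push parameters, last one first (assumed stdcall/cdecl order) */
    push(4);             /* nShowCmd     */
    push(3);             /* hlpCmdLine   */
    push(2);             /* PrevInstance */
    push(1);             /* hInstance    */

    /* 2. call pushes the return address */
    push(0x401000);

    /* 3. push ebp, then mov ebp, esp */
    push(ebp);
    ebp = esp;

    /* 4. sub esp, 0Ch: room for var_4, var_8 and var_C */
    esp -= 0x0C;
}
```

After simulate_call(), load(ebp) is the saved frame pointer, load(ebp + 4) is the return address, and load(ebp + 8) through load(ebp + 0x14) are the four parameters, matching the offsets listed above. The three locals occupy ebp - 4 down to ebp - 0Ch, which is where esp ends up.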