There's plenty going on beyond the stages already stated (loading the PE, enumerating the dlls it depends on, calling their entry points and calling the exe entry point).
Probably the earliest action is the OS and the processor collaborating to create a new address space infrastructure (I think essentially a dedicated TLB). The OS initializes a Process Environment Block, with process-wide data (e.g., Process ID and environment variables). It initializes a Thread Environment Block, with thread-specific data (thread id, SEH root handlers, etc). Once an appropriate address is selected for every dll and it is loaded there, re-addressing of exported functions happens by the windows loader. (very briefly - at compile time, the dll cannot know the address at which it will be loaded by every consumer exe. the actual call addresses of its functions is thus determined only at load time). There are initializations of memory pages shared between processes - for windows messages, for example - and I think some initialization of disk-paging structures. there's PLENTY more going on. The main windows component involved is indeed the windows loader, but the kernel and executive are involved. Finally, the exe entry point is called - it is by default BaseProcessStart.
Typically a lot of preparation happens still after that, above the OS level, depending on used frameworks (canonical ones being CRT for native code and the CLR for managed): the framework must initialize its own internal structures to be able to deliver services to the application - memory management, exception handling, I/O, you name it.
A great place to go for such in depth discussions is Windows Internals. You can also dig a bit deeper in forums like SO, but you have to be able to break it down into more focused bits. As phrased, this is really too much for an SO post.