In order to understand it in detail, you'll have to take a course in operating systems (or at least buy a good book on the subject), because it actually involves quite a few subsystems.
Basically, however, it comes down to how thread state is managed. A thread is in one of a few different states at any one time: sleeping, ready, or running (there are usually more, but those are all we need for this discussion). A thread in the "running" state is actually executing code right now. A thread in the "sleeping" state is not running, and the scheduler will skip over it when deciding what to run next. A thread in the "ready" state is not currently running either, but once another thread goes to sleep or its timeslice runs out, the scheduler is free to pick that thread and move it into the "running" state.
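To make that concrete, here's a minimal sketch of how a toy scheduler might see those states. The names (thread_state, thread, pick_next) are just illustrative, not a real kernel API:

```c
#include <stddef.h>
#include <stdio.h>

typedef enum { SLEEPING, READY, RUNNING } thread_state;

typedef struct {
    int id;
    thread_state state;
} thread;

/* The scheduler skips over sleeping threads and picks the first ready one. */
static thread *pick_next(thread *threads, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (threads[i].state == READY)
            return &threads[i];
    }
    return NULL; /* nothing is runnable right now */
}

int main(void) {
    thread threads[] = { {1, SLEEPING}, {2, READY}, {3, SLEEPING} };
    thread *next = pick_next(threads, 3);
    if (next != NULL) {
        next->state = RUNNING;
        printf("scheduling thread %d\n", next->id);
    }
    return 0;
}
```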
So basically, when you call "wait" on a mutex object, the OS checks whether the mutex is already owned by another thread. If it is, it sets the current thread's state to "sleeping" and also marks the thread as waiting on that particular mutex.
When the thread that owns the mutex is finished with it, the OS loops through all of the threads that were waiting on it and sets them to "ready". The next time the scheduler comes around, it sees a "ready" thread and puts it in the "running" state. That thread starts running and checks whether it can get a lock on the mutex again. This time nobody owns it, so it can continue on its merry way.
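Here's a very rough sketch of those two steps (the wait and the release/wake-up), reusing the same toy thread struct as above. Again, toy_mutex, mutex_wait, and mutex_release are made-up names, and a real OS does all of this atomically inside the kernel, but the shape of it is the same:

```c
#include <stdio.h>

typedef enum { SLEEPING, READY, RUNNING } thread_state;

typedef struct {
    int id;
    thread_state state;
} thread;

/* A toy mutex: who owns it, plus the threads waiting on it. */
typedef struct {
    thread *owner;        /* NULL if nobody holds the mutex */
    thread *waiters[8];   /* no bounds checking, it's a toy */
    int     n_waiters;
} toy_mutex;

/* The "wait" step: if somebody else owns the mutex, the caller is put
 * to sleep and recorded as waiting on this particular mutex. */
static void mutex_wait(toy_mutex *m, thread *t) {
    if (m->owner == NULL) {
        m->owner = t;             /* uncontended: just take it   */
    } else {
        t->state = SLEEPING;      /* blocked: skip me, scheduler */
        m->waiters[m->n_waiters++] = t;
    }
}

/* The "release" step: everyone waiting on the mutex is made ready,
 * so the scheduler can pick one of them to run and retry the lock. */
static void mutex_release(toy_mutex *m) {
    m->owner = NULL;
    for (int i = 0; i < m->n_waiters; i++)
        m->waiters[i]->state = READY;
    m->n_waiters = 0;
}

int main(void) {
    thread a = {1, RUNNING}, b = {2, RUNNING}; /* imagine two CPUs */
    toy_mutex m = {0};

    mutex_wait(&m, &a);   /* a takes the mutex               */
    mutex_wait(&m, &b);   /* b goes to sleep, waiting on it  */
    printf("b is %s\n", b.state == SLEEPING ? "sleeping" : "not sleeping");

    mutex_release(&m);    /* a is done: b becomes ready again */
    printf("b is %s\n", b.state == READY ? "ready" : "not ready");

    /* the scheduler later picks b (it's ready) and b retries the lock;
     * nobody owns the mutex now, so it succeeds */
    b.state = RUNNING;
    mutex_wait(&m, &b);
    printf("b owns the mutex: %s\n", m.owner == &b ? "yes" : "no");
    return 0;
}
```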
In reality, it's a lot more complicated than that, and a lot of effort goes into making the system as efficient as possible: for example, avoiding waking a thread only to have it go straight back to sleep, or avoiding having a thread starve on a mutex that lots of other threads are waiting on, and so on.
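From the application's point of view, though, all of that machinery hides behind a plain lock/unlock pair. For example, with POSIX threads (compile with -pthread), whichever thread loses the race simply blocks inside pthread_mutex_lock until the other one unlocks:

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    const char *name = arg;
    printf("%s: waiting for the mutex\n", name);
    pthread_mutex_lock(&lock);   /* may put this thread to sleep */
    printf("%s: got it\n", name);
    sleep(1);                    /* pretend to do some work      */
    pthread_mutex_unlock(&lock); /* wakes up anyone waiting      */
    printf("%s: released it\n", name);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, "thread 1");
    pthread_create(&t2, NULL, worker, "thread 2");
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}
```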