% man fork
RETURN VALUES
Upon successful completion, fork() returns a value of 0 to the child
process and returns the process ID of the child process to the parent
process. Otherwise, a value of -1 is returned to the parent process, no
child process is created, and the global variable errno is set to indi-
cate the error.
Inside the fork system call, the kernel duplicates the entire process. The call then returns once in each of the two processes. Because the parent and child are now separate processes, each with its own stack and saved register state, the kernel can give each one a different return value.
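From user space, the two return values can be observed with a minimal sketch like the one below. The helper name fork_demo and the exit status 42 are made up for illustration; only fork(), getpid(), waitpid() and _exit() are standard calls.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the man page): forks once and, in the
 * parent, returns the child's exit status, or -1 on failure.
 */
int fork_demo(void)
{
    fflush(stdout);            /* keep buffered output from being duplicated */

    pid_t pid = fork();
    if (pid == -1) {
        perror("fork");        /* no child was created; errno says why */
        return -1;
    }

    if (pid == 0) {
        /* Child: fork() returned 0 here. */
        printf("child:  fork() returned 0, my pid is %ld\n", (long)getpid());
        fflush(stdout);
        _exit(42);             /* arbitrary status chosen for the example */
    }

    /* Parent: fork() returned the child's PID here. */
    assert(pid > 0);
    printf("parent: fork() returned %ld\n", (long)pid);

    int status;
    if (waitpid(pid, &status, 0) == -1) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling fork_demo() from main prints one line from each process. The order of the two lines is not guaranteed, since either process may run first.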
If you really want to know how it works at a low level, you can always check the source! The code is a bit confusing if you're not used to reading kernel code, but the inline comments give a pretty good hint as to what's going on.
The most interesting part of the source, and the one that answers your question most directly, is at the very end of the fork() definition itself:
if (error == 0) {
        td->td_retval[0] = p2->p_pid;
        td->td_retval[1] = 0;
}
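"td" is the calling thread in the parent process, and td_retval is a two-element array holding that thread's system call return values (a system call can hand back up to two values), not a list of return values for different threads. If error (returned from fork1, the "real" forking function) is 0 (no error), the parent's return value, td_retval[0], is set to the PID of p2 (the new process), and the second slot, td_retval[1], is set to 0. The child's return value of 0 is not set in this snippet: the child gets its own thread structure when the process is duplicated, and as far as I can tell its saved registers are arranged elsewhere in the fork path so that it sees 0 when it starts running.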
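One way to picture the idea is a toy user-space model, sketched below. This is not kernel code: struct toy_thread and toy_fork are invented for illustration and only mimic the notion that each process carries its own slot for the value it will see when the call returns.

```c
#include <assert.h>

/* Toy stand-in for a kernel thread. NOT the real struct thread. */
struct toy_thread {
    int  pid;         /* process this thread belongs to */
    long retval[2];   /* values handed back when the "syscall" returns */
};

/*
 * Toy fork: duplicates the parent into the child, then gives each copy
 * its own return value. Returns 0 to signal success, like error == 0.
 */
int toy_fork(struct toy_thread *parent, struct toy_thread *child, int new_pid)
{
    *child = *parent;              /* "the entire process is duplicated" */
    child->pid = new_pid;

    child->retval[0] = 0;          /* the child will see fork() == 0 */
    child->retval[1] = 0;

    parent->retval[0] = new_pid;   /* the parent will see the child's PID */
    parent->retval[1] = 0;
    return 0;
}
```

After toy_fork() returns, the two structures are identical except for pid and retval[0], which is the whole trick: one call goes in, and each copy comes out with its own result.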