I'll try to sort out the confusion as best I can. I will explain the concepts the way they are traditionally defined. The problem is that people mix up the meanings of many of these concepts, and a lot of confusion arises from that.
Whenever we have a piece of code that modifies a bit of memory (say, a variable) that is shared between different processes or threads, we have a critical section. If we don't take care to synchronize this bit of code properly, we will get bugs. One example of a critical section is a producer adding an element to a shared container of some sort.
One way to synchronize critical sections is to enforce mutual exclusion. Mutual exclusion means that only one process or thread at a time can execute the critical section and gain access to the shared piece of memory. Note that mutual exclusion is not a mechanism in itself; it is a principle that we can enforce by different means. Some people refer to locks and binary semaphores as mutexes, but that mixes the concepts in a way that leads to confusion.
A binary semaphore is one way to enforce mutual exclusion. Whenever a process wants to enter the critical section, it acquires the semaphore. This operation will block if another process is holding the semaphore at that moment. Hence we have mutual exclusion. Once the process is done with the critical section, it releases the semaphore, letting another process in. In this way we can achieve mutual exclusion with a binary semaphore, but it is by no means the only possible application of a binary semaphore.
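To make the acquire/release pattern concrete, here is a minimal sketch in Python. It assumes threads rather than processes, and the names `counter`, `sem` and `worker` are made up for illustration. The semaphore starts at 1, so at most one thread can be inside the critical section at a time.

```python
import threading

counter = 0                       # the shared piece of memory
sem = threading.Semaphore(1)      # binary semaphore: initial value 1

def worker(iterations):
    global counter
    for _ in range(iterations):
        sem.acquire()             # blocks while another thread holds the semaphore
        try:
            counter += 1          # critical section: read-modify-write on shared state
        finally:
            sem.release()         # let the next waiting thread in

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000, since every increment happened under mutual exclusion
```

Without the acquire/release pair, the increments from different threads could interleave and some updates could be lost.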
Semaphores are nice for producer-consumer problems because their value can be any natural number, not just 0 or 1 as with binary semaphores. This is very useful because you can let the value of a semaphore represent the number of available elements. If the number of elements goes down to zero, a consumer that tries to acquire the semaphore will automatically block until a producer adds a new element.
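Here is a rough sketch of that idea in Python, under a few assumptions of my own: one producer thread, one consumer thread, and an unbounded buffer. The names `items`, `guard`, `buffer` and `consumed` are hypothetical. The counting semaphore `items` holds the number of available elements, and a separate binary semaphore `guard` enforces mutual exclusion on the container itself.

```python
import threading
from collections import deque

buffer = deque()                  # the shared container
items = threading.Semaphore(0)    # counting semaphore: number of available elements
guard = threading.Semaphore(1)    # binary semaphore protecting the container
consumed = []                     # only the consumer thread writes to this

def producer(n):
    for i in range(n):
        with guard:               # critical section: add an element
            buffer.append(i)
        items.release()           # one more element is available

def consumer(n):
    for _ in range(n):
        items.acquire()           # blocks while the count is zero
        with guard:               # critical section: remove an element
            value = buffer.popleft()
        consumed.append(value)

p = threading.Thread(target=producer, args=(10,))
c = threading.Thread(target=consumer, args=(10,))
c.start()                         # the consumer may start first; it simply waits
p.start()
p.join()
c.join()

print(consumed)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

A bounded buffer would typically add a second counting semaphore for the number of free slots, so that the producer blocks when the buffer is full.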
I realize the explanation of the producer-consumer problem is a bit brief, and I encourage you to look at solutions which use semaphores and compare them to solutions which use other synchronization constructs, such as monitors or message passing. I've found that to be very illuminating.