views:

1316

answers:

8

Throughout the resources I've read about multithreading, mutex is more often used and discussed compared to a semaphore. My question is when do you use a semaphore over a mutex? I don't see semaphores in Boost thread. Does that mean semaphores no longer used much these days?

As far as I've understand, semaphores allow a resource to be shared by several threads. This is only possible if those threads are only reading the resource but not writing. Is this correct?

+5  A: 

To control access to a limited number of resources being shared by multiple threads (either inter- or intra-process).

In our application, we had a very heavy resource and that we did not want to allocate one for each of the M worker threads. Since a worker thread needed the resource for just one small part of their job, we rarely were using more then a couple of the resources simultaneously.

So, we allocated N of those resources and put them behind a semaphore initialized to N. When more then N threads were trying to use the resource, they would just block until one was available.

R Samuel Klatchko
But that resource was only used for reading is that right?
jasonline
The intended use of a semaphore is to signal between threads... and a semaphore cannot solve a multiple identical resource problem on its own, here is the same article I provided in my answer to explain why that's the case: http://www.netrino.com/node/202
Lirik
@R Samuel: I'm interested to know what is that resource?
jasonline
@jasonline - an instance of the Autonomy/Verity text extraction engine. It's a library that can be used to extract the basic text from a PDF, Word, etc. file.
R Samuel Klatchko
@Lirik - yes, you need something more then just a semaphore, but the semaphore provides an key property. When the resources being guarded are identical and a thread wants to block until any single resource is available, the semaphore allows you to efficiently do that.
R Samuel Klatchko
@Samuel unfortunately the key is not unique, so if we're talking about a key to a hotel room... the key will be able to open ANY room, even a room which already has occupants.
Lirik
@Lirik: 60 guests waiting to sleep in one of 20 hotel rooms do of course wait for one of 20 different keys. However without using a semaphore you would need to either have all 60 guests call `WaitForMultipleObjects()` on an array of 20 mutexes for the keys (won't scale), or create some other complicated solution. There is no OS primitive that will solve this problem on its own.
mghie
@mghie I guess my point was that the semaphore does not give you a way to define ownership... the key itself has to define ownership in some way (i.e. it can only open room 205). But the same thing can be achieved with a blocking queue (although not an OS primitive): the key is taken out of the queue and then returned to the queue... threads are limited by the number of keys circulating in the queue.
Lirik
@Lirik: A blocking queue is certainly a great alternative. But we're no longer speaking about mutex vs. semaphore now.
mghie
@Lirik - as I said in my previous comment you need more then a semaphore. The semaphore isn't the room key, it's the reservation for a room. First you get your reservation (via the semaphore), and then you get your room key (via some other technique).
R Samuel Klatchko
@Lirik - one last thought on a semaphore based solution vs a blocking queue solution. Both of them require two synchronization primitives (my solution requires one counting semaphore for efficient reservations and one mutex to control access to the individual resources - a blocking queue requires a mutex and a condition variable/event to implement). In the end, they both are pretty similar.
R Samuel Klatchko
+5  A: 

The typical use case for a mutex (allowing only one thread access to a resource at any time) is far more common than the typical uses if a semaphore. But a semaphore is actually the more general concept: A mutex is (almost) a special case of a semaphore.

Typical applications would be: You don't want to create more than (e.g.) 5 database connections. No matter how many worker threads there are, they have to share these 5 connections. Or, if you run on a N-core machine, you might want to make sure that certain CPU/memory-intensive tasks don't run in more than N threads at the same time (because that would only reduce throughput due to context switches and cache thrashing effects). You might even want to limit the number of parallel CPU/memory intensive tasks to N-1, so the rest of the system doesn't starve. Or imagine a certain task needs a lot of memory, so running more than N instances of that task at the same time would lead to paging. You could use a semaphore here, to make sure that no more than N instances of this particular task run at the same time.

EDIT/PS: From your question "This is only possible if those threads are only reading the resource but not writing. Is this correct?" and your comment, it seems to me as if you're thinking of a resource as a variable or a stream, that can be read or written and that can only be written to by one thread at a time. Don't. This is misleading in this context.

Think of resources like "water". You can use water to wash your dishes. I can use water to wash my dishes at the same time. We don't need any kind of synchronization for that, because there is enough water for both of us. We don't necessarily use the same water. (And you can't "read" or "write" water.) But the total amount of water is finite. So it's not possible for any number of parties to wash their dishes at the same time. This kind of synchronization is done with a semaphore. Only usually not with water but with other finite resources like memory, disk space, IO throughput or CPU cores.

nikie
But if you use semaphore to allow N instances of that task, does it mean that those N instances are already synchronized like what a mutex does? Like for example, if those tasks are for writing to a common resource, does the semaphore internally handle this such that there will be no race conditions between those N instances/tasks?
jasonline
@jasonline A semaphore does not internally handle race conditions between tasks, if that was the case then everybody would be using semaphores and nobody would have trouble with multithreading :)... a semaphore can do MANY things, a semaphore can even do the work of a mutex ^^ hehe. Check out my answer for more details.
Lirik
@nikie: From your latest update about the example on water... what then should a resource be for a semaphore to be used?
jasonline
@jasonline: for example suppose you know that you have 2 CPU cores, and you have 27 parallel tasks to run. If you run more than 2 at once, they'll time-share OK through the scheduler and perhaps finish roughly together. If you create a semaphore with a count of 2, and have each thread wait on the semaphore before starting and post after finishing, then 2 tasks at a time will run, and whatever runs first will finish as soon as possible. That might be more useful, depending on the task. You've used a semaphore to control access to a resource, but cores aren't a "read-only" resource.
Steve Jessop
@Steve: Thanks for your example. So you're saying to run two threads at a time and control them through a semaphore. Is my understanding correct?
jasonline
Yes, that's right. The reason you only want 2 threads at a time is nothing to do with a read-only vs read-write resource, just that in this hypothetical example, you want to get *something* finished ASAP.
Steve Jessop
A: 

As far as I understand semaphores is a strongly IPC-related term these days. It still means protected variable many processes can modify, but among processes and this feature is supported by OS.

Usually, we don't need a variable and a simple mutex cover all our requirements. If we still need a variable, probably, we code it ourselves - "variable + mutex" to get more control.

Resume: we don't use semaphores in multithreading because usually use mutex for simplicity and control, and we use semaphores for IPC because it's OS-supported and an official name for processes synchronization mechanism.

I think you misunderstood the term. You can use mutexes to replace semaphores or events, but generally you shouldn't. Understand the primitives and use them where appropriate.
nikie
+1  A: 

I feel like there is no simple way to REALLY answer your question without disregarding some important information about semaphores. People have written many books about semaphores, so any one or two paragraph answer is a disservice. A popular book is The Little Book of Semaphores... for those who don't like big books :).

Here is a decent lengthy article which goes into a LOT of the details on how semaphores are used and how they're intended to be used.

Update:
Dan pointed out some mistakes in my examples, I'll leave it with the references which offer MUCH better explanations than mine :).

Here are the references showing the RIGHT ways one should use a semaphore:
1. IBM Article
2. University of Chicago Class Lecture
3. The Netrino article I originally posted.
4. The "sell tickets" paper + code.

Lirik
Ok, I'll read it first...
jasonline
+1 for the article link
zebrabox
@Lirik: I read the article. But can you give one example of a resource in an application where semaphore is used to control the number of instances it shares? I still don't understand what or how the resource can be shared and whether a race condition will occur.
jasonline
@jasonline as other people have pointed out: a semaphore does not protect against race conditions... it is COMPLETELY different from a mutex. A semaphore does not tell you who owns the resource or who should actually access it, it only tells you that there are available instances of that resource (keys to hotel rooms, ticket counters, etc.).
Lirik
@Lirik: The netrino article is good. I disagree with the rest of your answer. There is a simple way to explain the difference: a mutex is used to protect a critical section, a semaphore is used to signal from one task to another that an event has occurred or a condition has been met. Your code example uses a semaphore where a mutex should be used.
Dan
Also, your ticket-counter solution is flawed. Say three clerks are working and no others have showed up to work, then two clerks leave. You say when a clerk leaves "it will signal via the semaphore which ticket counter is being vacated." A semaphore doesn't store this info, it would be in a separate variable. Better to use a queue/stack to store the list of empty ticket counters, in case there is more than one. When a clerk leaves, its ticket counter number goes on the queue. When a clerk arrives, it gets a counter from the queue, or blocks (condition variable) if the queue is empty.
Dan
@Dan Good point! I'll just leave the references... they explain it MUCH better than I.
Lirik
@Lirik: Yes, thanks. I got your point.
jasonline
+1  A: 

As taken from this article:

A mutex allows inter-process synchronisation to occur. If you instantiate a mutex with a name (as in the code above), the mutex becomes system-wide. This is really useful if you're sharing the same library between many different applications and need to block access to a critical section of code that is accessing resources that can't be shared.

Finally, the Semaphore class. Let's say you have a method that is really CPU intensive, and also makes use of resources that you need to control access to (using Mutexes :)). You've also determined that a maximum of five calls to the method is about all your machine can hanlde without making it unresponsive. Your best solution here is to make use of the Semaphore class which allows you to limit a certain number of threads' access to a resource.

Kane
Good example of how a semaphore can be used in a way that is not "signaling from one thread to another".
Dan
@Dan: IMO it is still "signalling from one thread to another". You create the semaphore with 5 slots. Posting the semaphore says "I have just freed up one of the slots". Waiting on the semaphore consumes a slot if possible, otherwise the thread waits until it is "signalled". There's a lot you can do with a 1-bit message, but the point about semaphores is that's *all they do*. Mutexes not only do this, they also have an owner thread, which means the scheduler can do things for mutexes which it cannot do for locks implemented with semaphores, or this 5-slot processing throttle.
Steve Jessop
I say "signalled" in inverted commas, because of course signals are a particular thing, commonly (but not necessarily) used for IPC, with a more specific meaning than just the general English word. But in the general English word, a semaphore just tracks how many signals/messages/slots/posts have been sent and how many have been consumed, and delivers them to consumers when available.
Steve Jessop
@Steve: you're right, it is still "signaling". What struck me as different is that it wasn't one specific thread signaling to another specific thread, such as "the Built-in-test thread detected a fault condition and gave the Danger semaphore, which the Shutdown-reactor thread had been blocking on." That's how I normally see semaphores used. Anything more complicated than that, especially multiple-producer/consumer, usually requires more sophisticated IPC (condition variable, blocking-queue, mutex-protected object). I found this example novel because you can get away with just a semaphore.
Dan
A: 

From what I learned about semaphores and mutex's in college, semaphore are more theoretical objects while mutex's are one implementation of semaphores. Taking that into account, semaphores are more flexible.

Mutex's are highly implementation dependent. They have been optimized for their binary locking purpose. The normal use case of a mutex is a binary semaphore.

In general, when trying to write bug-free multithreaded code, simplicity helps. Mutex's are used more because their simplicity helps avoid complex deadlock scenarios that arise from using semaphores.

Kousha
+2  A: 

Boost.Thread has mutexes and condition variables. Purely in terms of functionality, semaphores are therefore redundant[*], although I don't know if that's why they're omitted.

Semaphores are a more basic primitive, simpler, and possibly implemented to be faster, but don't have priority-inversion avoidance. They're arguably harder to use than condition variables, because they require the client code to ensure that the number of posts "matches" the number of waits in some appropriate way. With condition variables it's easy to tolerate spurious posts, because nobody actually does anything without checking the condition.

Read vs. write resources is a red herring IMO, it has nothing to do with the difference between a mutex and a semaphore. If you use a counting semaphore, you could have a situation where multiple threads are concurrently accessing the same resource, in which case it would presumably have to be read-only access. In that situation, you might be able to use shared_mutex from Boost.Thread instead. But semaphores aren't "for" protecting resources in the way mutexes are, they're "for" sending a signal from one thread to another. It's possible to use them to control access to a resource.

That doesn't mean that all uses of semaphores must relate to read-only resources. For example, you can use a binary semaphore to protect a read/write resource. Might not be a good idea, though, since a mutex often gives you better scheduling behaviour.

[*] Here's roughly how you implement a counting semaphore using a mutex and a condition variable. To implement a shared semaphore of course you need a shared mutex/condvar:

struct sem {
    mutex m;
    condvar cv;
    unsigned int count;
};

sem_init(s, value)
    mutex_init(s.m);
    condvar_init(s.cv);
    count = value;

sem_wait(s)
    mutex_lock(s.m);
    while (s.count <= 0) {
        condvar_wait(s.cv, s.m);
    }
    --s.count;
    mutex_unlock(s.m);

sem_post(s)
    mutex_lock(s.m);
    ++s.count;
    condvar_broadcast(s.cv)
    mutex_unlock(s.m);

Therefore, anything you can do with semaphores, you can do with mutexes and condition variables. Not necessarily by actually implementing a semaphore, though.

Steve Jessop
@Steve: Yes, I can understand the mutexes and condition variables from Boost. I'm just not sure how I could relate it with semaphores or how I could implement mutexes/condition variables to function like a semaphore (?)
jasonline
@Steve: Your statement: "If you use a counting semaphore, you could have a situation where multiple threads are concurrently accessing the same resource, in which case it would presumably have to be read-only access." You're using the semaphore just to limit the number of threads accessing it right? And if you're not using a semaphore, you don't really need a mutex also because it's only read access and it's safe for multiple threads to access the resource. Correct?
jasonline
@jasonline: To the first question, I've edited. To the second question yes, read-only vs. read-write access usually is entirely about how many threads are accessing the resource at once: at most 1, or more than that, because concurrent read-only access to resources is usually safe whereas concurrent access involving a write isn't.
Steve Jessop
+3  A: 

The essence of the difference between a mutex and a semaphore has to do with the concept of ownership. When a mutex is taken, we think of that thread as owning the mutex and that same thread must later release the mutex back to release the resource.

For a semaphore, think of taking the semaphore as consuming the resource, but not actually taking ownership of it. This is generally referred to as the semaphore being "empty" rather than owned by a thread. The feature of the semaphore is then that a different thread can "fill" the semaphore back to "full" state.

Therefore, mutexes are usually used for the concurrency protection of resources (ie: MUTual EXlusion) while semaphores are used for signaling between threads (like semaphore flags signaling between ships). A mutex by itself can't really be used for signaling, but semaphores can. So, selecting one over the other depends on what you are trying to do.

See another one of my answers here for more discussion on a related topic covering the distinction between recursive and non-recursive mutexes.

Tall Jeff