Windows provides a number of objects useful for synchronising threads, such as events (with SetEvent and WaitForSingleObject), mutexes and critical sections.

Personally I have always used them, especially critical sections, since I'm pretty certain they incur very little overhead unless already locked. However, looking at a number of libraries, such as Boost, people tend to go to a lot of trouble to implement their own locks using the interlocked functions on Windows.

I can understand why people would write lock-less queues and such, since that's a specialised case, but is there any reason why people choose to implement their own versions of the basic synchronisation objects?

+4  A: 

One particular reason I can think of is portability. Windows locks are just fine on their own, but they are not portable to other platforms. A library which wishes to be portable must implement its own lock to guarantee the same semantics across platforms.

JaredPar
So basically you're saying the Boost Windows code using the interlocked functions is designed to work identically to pthreads on other platforms?
Fire Lancer
@Fire Lancer, yes, I'm **speculating** that is the case.
JaredPar
+1  A: 

Writing locking code for a library is useful if that library is meant to be cross-platform. Users of the library can use the library's locking functionality and not have to care about the underlying platform implementation. Assuming the library has versions for all the platforms being targeted, it's one less bit of code that has to be ported.

Herms
JaredPar has posted the same answer.
P.K
+2  A: 
  1. In many libraries (e.g. Boost) you need to write cross-platform code, so using WaitForSingleObject and SetEvent directly is a no-go. Also, there are common idioms, like monitors and condition variables, that the Win32 API lacks (but they can be implemented using these basic primitives).
  2. Some lock-free data structures, like atomic counters, are very useful. For example, boost::shared_ptr uses them to make reference counting thread safe without the overhead of a critical section, and most compilers (not MSVC) use atomic counters to implement a thread-safe copy-on-write std::string.
  3. Some things, like queues, can be implemented very efficiently in a thread-safe way without locks at all, which may give a significant performance boost in certain applications.
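
To make point 2 concrete, here is a minimal sketch of an atomic reference counter. It is not Boost's actual implementation: the class name `RefCount` and its methods are hypothetical, and standard C++11 `std::atomic` stands in for the Win32 interlocked functions.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical reference counter for illustration only.
// std::atomic replaces the platform-specific interlocked calls.
class RefCount {
public:
    RefCount() : count_(1) {}  // the creator holds the first reference

    // Taking another reference needs atomicity but no ordering guarantees.
    void acquire() { count_.fetch_add(1, std::memory_order_relaxed); }

    // Drops one reference. Returns true if the caller dropped the last
    // one and is therefore responsible for destroying the shared object.
    bool release() {
        long previous = count_.fetch_sub(1, std::memory_order_acq_rel);
        assert(previous > 0);  // releasing more often than acquiring is a bug
        return previous == 1;
    }

    long use_count() const { return count_.load(std::memory_order_relaxed); }

private:
    std::atomic<long> count_;
};
```

No lock is taken anywhere: each operation is a single atomic read-modify-write, which is why a counter like this is cheaper than guarding a plain integer with a critical section.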
Artyom
+14  A: 

Libraries aren't implementing their own locks. That is pretty much impossible to do without OS support.

What they are doing is simply wrapping the OS-provided locking mechanisms.

Boost does it for a couple of reasons:

  • They're able to provide a much better designed locking API, taking advantage of C++ features. The Windows API is C only, and not very well-designed C, at that.
  • They are able to offer a degree of portability. The same Boost API can be used if you run your application on a Linux machine or on a Mac. Windows' own API is obviously Windows-specific.
  • The Windows-provided mechanisms have a glaring disadvantage: They require you to include windows.h, which you may want to avoid for a large number of reasons, not least its extreme macro abuse polluting the global namespace.
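
As a rough sketch of what "wrapping the OS-provided locking mechanisms" can look like, the following puts a small C++ interface over a Win32 critical section or a pthread mutex. The names `Mutex` and `ScopedLock` are made up for this example and are not Boost's classes.

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>
#else
#include <pthread.h>
#endif

// Hypothetical wrapper: one C++ interface, two OS back ends.
class Mutex {
public:
    Mutex() {
#ifdef _WIN32
        InitializeCriticalSection(&handle_);
#else
        int rc = pthread_mutex_init(&handle_, nullptr);
        assert(rc == 0);
        (void)rc;  // silence unused-variable warnings when NDEBUG is set
#endif
    }

    ~Mutex() {
#ifdef _WIN32
        DeleteCriticalSection(&handle_);
#else
        pthread_mutex_destroy(&handle_);
#endif
    }

    void lock() {
#ifdef _WIN32
        EnterCriticalSection(&handle_);
#else
        pthread_mutex_lock(&handle_);
#endif
    }

    void unlock() {
#ifdef _WIN32
        LeaveCriticalSection(&handle_);
#else
        pthread_mutex_unlock(&handle_);
#endif
    }

    Mutex(const Mutex&) = delete;
    Mutex& operator=(const Mutex&) = delete;

private:
#ifdef _WIN32
    CRITICAL_SECTION handle_;
#else
    pthread_mutex_t handle_;
#endif
};

// RAII guard: locks in the constructor, unlocks in the destructor,
// so the lock is released even if the guarded code throws.
class ScopedLock {
public:
    explicit ScopedLock(Mutex& m) : mutex_(m) { mutex_.lock(); }
    ~ScopedLock() { mutex_.unlock(); }

    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;

private:
    Mutex& mutex_;
};
```

Calling code only ever sees `Mutex` and `ScopedLock`. To also keep windows.h out of user code, a real library would move the platform-specific parts into a source file instead of a header; that step is left out here for brevity.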
jalf
Looking at boost\thread\win32\basic_timed_mutex.hpp I don't see any use of the Win32 threading objects; it all seems to be done with the interlocked functions, which AFAIK use special atomic CPU instructions anyway.
Fire Lancer
The interlocked functions are the basis for some Win32 synchronization objects too; that's all you really need for, say, mutexes or selects.
Blindy
The interlocked methods are part of the Win32 API as well though. They're no more portable than the high-level Win32 synchronization objects.
jalf
+2  A: 

There may occasionally be good reasons for implementing your own locks that don't use the Windows OS synchronization objects. But doing so is a "sharp stick." It's easy to poke yourself in the foot.

Here's an example: If you know that you are running the same number of threads as there are hardware contexts, and if the latency of waking up one of those threads which is waiting for a lock is very important to you, you might choose a spin lock implemented completely in user space. If the waiting thread is the only thread spinning on the lock, the latency of transferring the lock from the thread that owns it to the waiting thread is just the latency of moving the cache line to the owner thread and back to the waiting thread -- orders of magnitude faster than the latency of signaling a thread with an OS lock under the same circumstances.

But the scenarios where you want to do this are pretty narrow. As soon as you start having more software threads than hardware threads, you'll likely regret it. In that scenario, you could spend entire OS scheduling quanta doing nothing but spinning on your spin lock. And, if you care about power, spin locks are bad because they prevent the processor from going into a low-power state.
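
For reference, a user-space spin lock of the kind described above can be sketched in a few lines. This is an illustration under assumptions, not a production lock: the class name `SpinLock` is hypothetical, and standard C++11 `std::atomic_flag` is used in place of the Win32 interlocked functions.

```cpp
#include <atomic>

// Hypothetical spin lock that lives entirely in user space.
// A waiting thread never sleeps in the kernel; it burns CPU until
// the flag becomes free, which is exactly the trade-off discussed above.
class SpinLock {
public:
    void lock() {
        // test_and_set returns the previous value: true means another
        // thread already holds the lock, so keep spinning.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            // busy-wait
        }
    }

    // Single attempt; returns true if the lock was acquired.
    bool try_lock() {
        return !flag_.test_and_set(std::memory_order_acquire);
    }

    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};
```

Note that nothing here yields to the scheduler, so with more runnable threads than hardware contexts a waiter can spin for its whole time slice while the owner is not even running.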

I'm not sure I buy the portability argument. Portable libraries often have an OS portability layer that abstracts the different OS APIs for synchronization. If you're dealing with locks, a pthread_mutex can be made semantically the same as a Windows Mutex or Critical Section under an abstraction layer. There are some exceptions here, but for most people this is true. If you're dealing with Windows Events or POSIX condition variables, well, those are tougher to abstract. (Vista did introduce POSIX-style condition variables, but not many Windows software developers are in a position to require Vista...)

boiler96