Question How can I make sure my application is thread-safe? Are there any common practices, testing methods, things to avoid, things to look for?

Background I'm currently developing a server application that performs a number of background tasks in different threads and communicates with clients using Indy (using another bunch of automatically generated threads for the communication). Since the application should be highly available, a program crash is a very bad thing and I want to make sure that the application is thread-safe. No matter what, from time to time I discover a piece of code that throws an exception that never occurred before and in most cases I realize that it is some kind of synchronization bug, where I forgot to synchronize my objects properly. Hence my question concerning best practices, testing of thread-safety and things like that.

mghie: Thanks for the answer! I should perhaps be a little bit more precise. Just to be clear, I know about the principles of multithreading, I use synchronization (monitors) throughout my program and I know how to differentiate threading problems from other implementation problems. But nevertheless, I keep forgetting to add proper synchronization from time to time. Just to give an example, I used the RTL sort function in my code. Looked something like

FKeyList.Sort (CompareKeysFunc);

Turns out that I had to synchronize FKeyList while sorting. It just didn't come to my mind when initially writing that simple line of code. It's these things I wanna talk about. What are the places where one easily forgets to add synchronization code? How do YOU make sure that you added sync code in all important places?

+8  A: 

You can't really test for thread-safeness. All you can do is show that your code isn't thread-safe, but if you know how to do that you already know what to do in your program to fix that particular bug. It's the bugs you don't know that are the problem, and how would you write tests for those? Apart from that threading problems are much harder to find than other problems, as the act of debugging can already alter the behaviour of the program. Things will differ from one program run to the next, from one machine to the other. Number of CPUs and CPU cores, number and kind of programs running in parallel, exact order and timing of stuff happening in the program - all of this and much more will have influence on the program behaviour. [I actually wanted to add the phase of the moon and stuff like that to this list, but you get my meaning.]

My advice is to stop seeing this as an implementation problem, and start to look at this as a program design problem. You need to learn and read all that you can find about multi-threading, whether it is written for Delphi or not. In the end you need to understand the underlying principles and apply them properly in your programming. Primitives like critical sections, mutexes, conditions and threads are something the OS provides, and most languages only wrap them in their libraries (this ignores things like green threads as provided by for example Erlang, but it's a good point of view to start out from).

I'd say start with the Wikipedia article on threads and work your way through the linked articles. I started with the book "Win32 Multithreaded Programming" by Aaron Cohen and Mike Woodring - it is out of print, but maybe you can find something similar.

Edit: Let me briefly follow up on your edited question. All access to data that is not read-only needs to be properly synchronized to be thread-safe, and sorting a list is not a read-only operation. So obviously one would need to add synchronization around all accesses to the list.
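A minimal sketch of what "synchronization around all accesses to the list" can look like, assuming the list is a TStringList guarded by a TCriticalSection. TGuardedKeyList and its methods are hypothetical names, not code from the question; the point is only that the list is private and every method, reads included, takes the same lock.

```delphi
program GuardedSortDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Classes, SyncObjs; // System.Classes, System.SyncObjs in newer Delphi versions

type
  // Hypothetical wrapper: the list itself is private, so callers cannot
  // reach it without going through a method that takes the lock.
  TGuardedKeyList = class
  private
    FLock: TCriticalSection;
    FKeys: TStringList;
  public
    constructor Create;
    destructor Destroy; override;
    procedure Add(const AKey: string);
    procedure Sort;
    function IndexOf(const AKey: string): Integer;
  end;

constructor TGuardedKeyList.Create;
begin
  inherited Create;
  FLock := TCriticalSection.Create;
  FKeys := TStringList.Create;
end;

destructor TGuardedKeyList.Destroy;
begin
  FKeys.Free;
  FLock.Free;
  inherited Destroy;
end;

procedure TGuardedKeyList.Add(const AKey: string);
begin
  FLock.Enter;
  try
    FKeys.Add(AKey);
  finally
    FLock.Leave;
  end;
end;

procedure TGuardedKeyList.Sort;
begin
  FLock.Enter;
  try
    FKeys.Sort; // sorting rearranges the list, so it counts as a write access
  finally
    FLock.Leave;
  end;
end;

function TGuardedKeyList.IndexOf(const AKey: string): Integer;
begin
  FLock.Enter;
  try
    Result := FKeys.IndexOf(AKey); // readers take the same lock as writers
  finally
    FLock.Leave;
  end;
end;

var
  Keys: TGuardedKeyList;
begin
  Keys := TGuardedKeyList.Create;
  try
    Keys.Add('charlie');
    Keys.Add('alpha');
    Keys.Add('bravo');
    Keys.Sort;
    WriteLn(Keys.IndexOf('bravo'));
  finally
    Keys.Free;
  end;
end.
```

Keeping the lock inside the wrapper means a forgotten lock becomes a compile error (there is no way to reach FKeys from outside) rather than a rare runtime crash.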

But with more and more cores in a system constant locking will limit the amount of work that can be done, so it is a good idea to look for a different way to design your program. One idea is to introduce as much read-only data as possible into your program - locking is no longer necessary, as all access is read-only.

I have found interfaces to be a very valuable aid in designing multi-threaded programs. Interfaces can be implemented to have only methods for read-only access to the internal data, and if you stick to them you can be quite sure that a lot of the potential programming errors do not occur. You can freely share them between threads, and the thread-safe reference counting will make sure that the implementing objects are properly freed when the last reference to them goes out of scope or is assigned another value.

What you do is create objects that descend from TInterfacedObject. They implement one or more interfaces which all provide only read-only access to the internals of the object, but they can also provide public methods that mutate the object state. When you create the object you keep both a variable of the object type and an interface pointer variable. That way lifetime management is easy, because the object will be deleted automatically when an exception occurs. You use the variable pointing to the object to call all methods necessary to properly set up the object. This mutates the internal state, but since this happens only in the active thread there is no potential for conflict. Once the object is properly set up you return the interface pointer to the calling code, and since there is no way to access the object afterwards except by going through the interface pointer you can be sure that only read-only access can be performed. By using this technique you can completely remove the locking inside of the object.

What if you need to change the state of the object? You don't, you create a new one by copying the data from the interface, and mutate the internal state of the new object afterwards. Finally you return the interface pointer to the new object.
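The technique described above might be sketched roughly like this. IKeyResults, TKeyResults and WithExtraKey are hypothetical names chosen for illustration; the interface exposes only getters, the mutator AddKey is reachable only through the object variable, and a "change" is done by copying into a fresh object.

```delphi
program ReadOnlyInterfaceDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Classes; // System.Classes in newer Delphi versions

type
  // Hypothetical read-only view: no method here can change the object.
  IKeyResults = interface
    function GetCount: Integer;
    function GetKey(AIndex: Integer): string;
  end;

  TKeyResults = class(TInterfacedObject, IKeyResults)
  private
    FKeys: TStringList;
  public
    constructor Create;
    destructor Destroy; override;
    // Mutator: reachable through the object variable, not the interface.
    procedure AddKey(const AKey: string);
    function GetCount: Integer;
    function GetKey(AIndex: Integer): string;
  end;

constructor TKeyResults.Create;
begin
  inherited Create;
  FKeys := TStringList.Create;
end;

destructor TKeyResults.Destroy;
begin
  FKeys.Free;
  inherited Destroy;
end;

procedure TKeyResults.AddKey(const AKey: string);
begin
  FKeys.Add(AKey);
end;

function TKeyResults.GetCount: Integer;
begin
  Result := FKeys.Count;
end;

function TKeyResults.GetKey(AIndex: Integer): string;
begin
  Result := FKeys[AIndex];
end;

// "Changing" a result set: copy through the interface into a new object,
// mutate the copy while only this thread can see it, then hand it out.
function WithExtraKey(const ASource: IKeyResults; const AKey: string): IKeyResults;
var
  Obj: TKeyResults;
  Intf: IKeyResults;
  I: Integer;
begin
  Obj := TKeyResults.Create;
  Intf := Obj; // the interface reference owns the lifetime, even on exceptions
  if ASource <> nil then
    for I := 0 to ASource.GetCount - 1 do
      Obj.AddKey(ASource.GetKey(I));
  Obj.AddKey(AKey);
  Result := Intf;
end;

var
  First, Second: IKeyResults;
begin
  First := WithExtraKey(nil, 'alpha');
  Second := WithExtraKey(First, 'beta');
  WriteLn(First.GetCount, ' ', Second.GetCount);
end.
```

Note that First is untouched by the second call, so any thread still holding it keeps seeing a consistent snapshot.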

By using this you will only need locking where you get or set such interfaces. It can even be done without locking, by using the atomic interchange functions. See this blog post by Primoz Gabrijelcic for a similar use case where an interface pointer is set.
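For the locking variant (not the lock-free one from the blog post), a sketch could look like the following. TResultsHolder is a hypothetical name, and plain IInterface stands in for whatever read-only interface the program actually shares; the lock covers nothing except reading or replacing the one reference.

```delphi
program ResultsHolderDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SyncObjs; // System.SyncObjs in newer Delphi versions

type
  // Hypothetical holder: the only shared mutable state is one interface
  // reference, and the lock is held just long enough to copy or swap it.
  TResultsHolder = class
  private
    FLock: TCriticalSection;
    FCurrent: IInterface;
  public
    constructor Create;
    destructor Destroy; override;
    function GetCurrent: IInterface;
    procedure SetCurrent(const AValue: IInterface);
  end;

constructor TResultsHolder.Create;
begin
  inherited Create;
  FLock := TCriticalSection.Create;
end;

destructor TResultsHolder.Destroy;
begin
  FCurrent := nil;
  FLock.Free;
  inherited Destroy;
end;

function TResultsHolder.GetCurrent: IInterface;
begin
  FLock.Enter;
  try
    Result := FCurrent; // takes a reference, so the object survives a later swap
  finally
    FLock.Leave;
  end;
end;

procedure TResultsHolder.SetCurrent(const AValue: IInterface);
var
  Old: IInterface;
begin
  FLock.Enter;
  try
    Old := FCurrent; // keep the old object alive until the lock is released
    FCurrent := AValue;
  finally
    FLock.Leave;
  end;
  // Old goes out of scope after this point, outside the lock.
end;

var
  Holder: TResultsHolder;
  A, B: IInterface;
begin
  Holder := TResultsHolder.Create;
  try
    A := TInterfacedObject.Create;
    B := TInterfacedObject.Create;
    Holder.SetCurrent(A);
    Holder.SetCurrent(B); // readers that already fetched A can keep using it
    WriteLn(Holder.GetCurrent = B);
  finally
    Holder.Free;
  end;
end.
```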

mghie
+1 good answer. But please see my edit to the question.
Smasher
+2  A: 

I'll second mghie's advice: thread safety is designed in. Read about it anywhere you can.

For a really low level look at how it is implemented, look for a book on the internals of a real time operating system kernel. A good example is MicroC/OS-II: The Real Time Kernel by Jean J. Labrosse, which contains the complete annotated source code to a working kernel along with discussions of why things are done the way they are.

Edit: In light of the improved question focusing on using an RTL function...

Any object that can be seen by more than one thread is a potential synchronization issue. A thread-safe object would follow a consistent pattern in every method's implementation of locking "enough" of the object's state for the duration of the method, or perhaps, narrowed to just "long enough". It is certainly the case that any read-modify-write sequence to any part of an object's state must be done atomically with respect to other threads.

The art lies in figuring out how to get useful work done without either deadlocking or creating an execution bottleneck.

As for finding such problems, testing won't be any guarantee. A problem that shows up in testing can be fixed. But it is extremely difficult to write either unit tests or regression tests for thread safety... so faced with a body of existing code your likely recourse is constant code review until the practice of thread safety becomes second nature.

RBerteig
+4  A: 

Simple: don't use shared data. Every time you access shared data you risk running into a problem (if you forget to synchronize access). Even worse, each time you access shared data you risk blocking other threads, which will hurt your parallelization.

I know this advice is not always applicable. Still, it doesn't hurt if you try to follow it as much as possible.

EDIT: Longer response to Smasher's comment. Would not fit in a comment :(

You are totally correct. That's why I like to keep a shadow copy of the main data in a read-only thread. I add a version number to the structure (one 4-aligned DWORD) and increment this version in the (lock-protected) data writer. The data reader compares the global and private versions (which can be done without locking), and only if they differ does it lock the structure, duplicate it to local storage, update the local version and unlock. Then it accesses the local copy of the structure. Works great if reading is the primary way to access the structure.
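A rough single-threaded sketch of this versioning scheme, under stated assumptions: TSharedIndex, Refresh and the TStringList payload are hypothetical, and the unlocked version check relies on an aligned 32-bit read being atomic on the target platform, as the answer assumes. In a real program the writer and each reader would run in their own threads.

```delphi
program VersionedCopyDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Classes, SyncObjs; // System.Classes, System.SyncObjs in newer Delphi versions

type
  // Hypothetical shared structure; only the writer calls Add.
  TSharedIndex = class
  private
    FLock: TCriticalSection;
    FVersion: Cardinal; // 32-bit, naturally aligned inside the instance
    FData: TStringList;
  public
    constructor Create;
    destructor Destroy; override;
    procedure Add(const AKey: string);
    procedure Refresh(ALocal: TStrings; var ALocalVersion: Cardinal);
  end;

constructor TSharedIndex.Create;
begin
  inherited Create;
  FLock := TCriticalSection.Create;
  FData := TStringList.Create;
end;

destructor TSharedIndex.Destroy;
begin
  FData.Free;
  FLock.Free;
  inherited Destroy;
end;

procedure TSharedIndex.Add(const AKey: string);
begin
  FLock.Enter;
  try
    FData.Add(AKey);
    Inc(FVersion); // bumped inside the lock, after the change
  finally
    FLock.Leave;
  end;
end;

procedure TSharedIndex.Refresh(ALocal: TStrings; var ALocalVersion: Cardinal);
begin
  // Cheap unlocked comparison; lock and copy only if the writer has moved on.
  if FVersion = ALocalVersion then
    Exit;
  FLock.Enter;
  try
    ALocal.Assign(FData);
    ALocalVersion := FVersion;
  finally
    FLock.Leave;
  end;
end;

var
  Shared: TSharedIndex;
  Local: TStringList; // the reader's private copy
  LocalVersion: Cardinal;
begin
  LocalVersion := 0;
  Shared := TSharedIndex.Create;
  Local := TStringList.Create;
  try
    Shared.Add('alpha');
    Shared.Add('beta');
    Shared.Refresh(Local, LocalVersion); // versions differ: locks and copies
    Shared.Refresh(Local, LocalVersion); // versions match: returns at once
    WriteLn(Local.Count, ' ', LocalVersion);
  finally
    Local.Free;
    Shared.Free;
  end;
end.
```

After Refresh the reader works only on Local, so it never touches the shared list outside the lock.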

gabr
I agree, but my server application performs background indexing and all clients (each by its own communication thread) query the indexing results in some form. I don't think that'll work without shared data.
Smasher
Reading shared data from multiple places is less of a problem than writing to it from multiple places. You only have to synchronize the writing place, and there is a lower probability of deadlocks.
pi
I don't think that's true. Consider the Sort example. While writing the list, I can't read it because it's not in a consistent state. So I have to synchronize the reading accesses too. Or am I wrong here?
Smasher
@Smasher: You are completely right - all access must be synchronized unless everything is read access. But keep the indexing results in a read-only object, and set that object in a thread-safe manner, then you do not need to lock inside of the results object.
mghie
@gabr: thanks for the update! But considering that the indexing structures are very large, I don't really want the clients to keep local versions. But interesting approach. I have to give it some more thought...
Smasher
@gabr: Does the technique you describe have a name? I would love to read some more about it.
Smasher
@Smasher: I have no idea. I "invented" it myself. Most probably it has a name and was described somewhere but I am not aware of that.
gabr
This is a variation on the "Copy-on-Write" pattern, which is in turn a special case of "immutable object". In their original form these patterns don't require any locking though. http://en.wikipedia.org/wiki/Immutable_object#Copy-on-write
Wim Coenen
+1  A: 

My simple answer, combined with the other answers, is:

  • Create your application/program in a thread-safe manner
  • Avoid public static variables everywhere

You can fall into this habit fairly easily, but it takes some time to get used to:

Program your logic (not the UI) in a functional programming language such as F#, or even Scheme or Haskell. Functional programming promotes thread-safe practice, and it also pushes us to always code towards purity. If you use F#, there is also a clear distinction between mutable and immutable values such as variables.

Since functions are first-class citizens in F# and Haskell, the code you write will also be more disciplined about using less mutable state.

Also, with the lazy evaluation style that can usually be found in these functional languages, you can be sure that your program is safe from side effects, and you'll also realize that if your code needs effects, you have to define them clearly. If side effects are taken into consideration, then your code will be ready to take advantage of composability between the components in your code and of multicore programming.

eriawan
Well, I don't consider choosing another programming language a best-practice in this case. That's just a bit too much.
Smasher
Looking into functional programming is a good idea in any case. You can apply some of the same principles, as side-effect-free programming is very good when dealing with multiple threads.
mghie
+2  A: 

As folks have mentioned and I think you know, being certain, in general, that your code is thread safe is impossible (I believe provably impossible but I would have to track down the theorem). Naturally, you want to make things easier than that.

What I try to do is:

  • Use a known pattern of multithreaded design: A thread pool, the actor model paradigm, the command pattern or some such approach. This way, the synchronization process happens in the same way, in a uniform way, throughout the application.
  • Limit and concentrate the points of synchronization. Write your code so you need synchronization in as few places as possible, and then keep the synchronization code in one or a few places in the code.
  • Write the synchronization code so that the logical relation between the values is clear both on entering and on exiting the guard. I use lots of asserts for this (your environment may limit this).
  • Don't ever access shared variables without guards/synchronization. Be very clear what your shared data is. (I've heard there are paradigms for guardless multithreaded programming but that would require even more research).
  • Write your code as cleanly, clearly and DRY-ly as possible.
Joe Soul-bringer
+1 great answer. Thanks for your insights!
Smasher
+1  A: 

I just wanted to add two links to this discussion that I found useful when thinking about thread-safety and the possibilities to achieve it:

A short guide to mastering thread safety

Design for thread safety

Smasher
A: 

M2C - Java Concurrency in Practice is really good.

Ash Kim