views:

398

answers:

10

I have some status data that I want to cache from a database. Any of several threads may modify the status data. After the data is modified it will be written to the database. The database writes will always be done in series by the underlying database access layer which queues database operations in a different process so I cam not concerned about race conditions for those.

Is it a problem to just modify the static data from several threads? In theory it is possible that modifications are implemented as read, modify, write but in practice I can't imagine that this is so.

My data handling class will look something like this:

class StatusCache
{
public:
    static void SetActivityStarted(bool activityStarted)
        { m_activityStarted = activityStarted; WriteToDB(); }
    static void SetActivityComplete(bool activityComplete);
        { m_activityComplete = activityComplete; WriteToDB(); }
    static void SetProcessReady(bool processReady);
        { m_processReady = processReady; WriteToDB(); }
    static void SetProcessPending(bool processPending);
        { m_processPending = processPending; WriteToDB(); }
private:
    static void WriteToDB(); // will write all the class data to the db (multiple requests will happen in series)
    static bool m_activityStarted;
    static bool m_activityComplete;
    static bool m_processReady;
    static bool m_processPending;
};

I don't want to use locks as there are already a couple of locks in this part of the app and adding more will increase the possibility of deadlocks.

It doesn't matter if there is some overlap between 2 threads in the database update, e.g.

thread 1                        thread 2                    activity started in db
SetActivityStarted(true)        SetActivityStarted(false)   
  m_activityStated = true
                                  m_activityStarted = false
                                  WriteToDB()               false
  WriteToDB()                                               false

So the db shows the status that was most recently set by the m_... = x lines. This is OK.

Is this a reasonable approach to use or is there a better way of doing it?

[Edited to state that I only care about the last status - order is unimportant]

+1  A: 

Im no c++ guy but i dont think it will be safe to write to it if you dont have some sort of synchronization..

Petoj
+7  A: 

No, it's not safe.

The code generated that does the writing to m_activityStarted and the others may be atomic, but that is not garantueed. Also, in your setters you do two things: set a boolean and make a call. That is definately not atomic.

You're better off synchronizing here using a lock of some sort.

For example, one thread may call the first function, and before that thread goes into "WriteDB()" another thread may call another function and go into WriteDB() without the first going there. Then, perhaps the status is written in the DB in the wrong order.

If you're worried about deadlocks then you should revise your whole concurrency strategy.

Dave Van den Eynde
I didn't make it clear but it doesn't matter if the status update from a thread is written to the db from another thread. at any time all the database can show is the last known status. If this is the case do I still need locks?
danio
If you're going to serialize all the calls to WriteToDB(), why not serialize all the access to the setters in the first place?
Dave Van den Eynde
This is part of a multi-process system and the serialization of DB writes is done by a different process.
danio
A: 

You may get away with it with bools, but if the static objects being changed are of types of any great complexity, terrible things will occur. My advice - if you are going to write from multiple threads, always use synchronisation objects, or you will sooner or later get bitten.

anon
What makes bools so special? Data is data. If it's not an atomic operation it doesn't matter how 'complex' the type is.
OJ
A: 

This is not a good idea. There are many variables that will affect the timing of different threads.

Without some kind of lock you will not be guaranteed to have the correct last state.

It is possible that two status updates could be written to the database out of order.

As long as the locking code is designed properly dead locks should not be an issue with a simple process like this.

Dana Holt
This part of the application is simple, but it is just a very small part of it and may be called from several different threads, each of which also interact with at least 2 other locks, hence my deadlock concerns.
danio
I agree that you should not use unneeded locks. You shouldn't try to eliminate them just for the sake of having fewer locks though. If they are needed they should be used. More detailed info on your system would be required to evaluate the need.
Dana Holt
+1  A: 

It looks like you have two issues here.

#1 is that your boolean assignment is not necessarily atomic, even though it's one call in your code. So, under the hood, you could have inconsistent state. You could look into using atomic_set(), if your threading/concurrency library supports that.

#2 is synchronization between your reading and writing. From your code sample, it looks like your WriteToDB() function writes out the state of all 4 variables. Where is WriteToDB serialized? Could you have a situation where thread1 starts WriteToDB(), which reads m_activityStarted but doesn't finish writing it to the database, then is preempted by thread2, which writes m_activityStarted all the way through. Then, thread1 resumes, and finishes writing its inconsistent state through to the database. At the very least, I think that you should have write access to the static variables locked out while you are doing the read access necessary for the database update.

mbyrne215
A: 

As others have pointed out, this is generally a really bad idea (with some caveats).

Just because you don't see a problem on your particular machine when you happen to test it doesn't prove that the algorithm works right. This is especially true for concurrent applications. Interleavings can change dramatically for example when you switch to a machine with a different number of cores.

Caveat: if all your setters are doing atomic writes and if you don't care about the timing of them, then you may be okay.

Based on what you've said, I'd think that you could just have a dirty flag that's set in the setters. A separate database writing thread would poll the dirty flag every so often and send the updates to the database. If some items need extra atomicity, their setters would need to lock a mutex. The database writing thread must always lock the mutex.

Mr Fooz
Even for a single dirty flag, you might need some form of synchronization.
Dave Van den Eynde
If (a) you don't care if a state change is delayed by one polling cycle before being sent to the database, and (b) all state changes are atomic, then you shouldn't need to synchronize.
Mr Fooz
+3  A: 

On multi CPU machines, there's no guarantee that memory writes will be seen by threads running on different CPUs in the correct order without issuing a synchronisation instruction. It's only when you issue a synch order, e.g. a mutex lock or unlock, that the each thread's view of the data is guaranteed to be consistent.

To be safe, if you want the state shared between your threads, you need to use synchronisation of some form.

seejay
+3  A: 

You never know exactly how things are implemented at the lower levels. Especially when you start dealing with multiple cores, the various cache levels, pipelined execution, etc. At least not without a lot of work, and implementations change frequently!

If you don't mutex it, eventually you will regret it!

My favorite example involves integers. This one particular system wrote its integer values in two writes. E.g. not atomic. Naturally, when the thread was interrupted between those two writes, well, you got the upper bytes from one set() call, and the lower bytes() from the other. A classic blunder. But far from the worst that can happen.

Mutexing is trivial.

You mention: I don't want to use locks as there are already a couple of locks in this part of the app and adding more will increase the possibility of deadlocks.

You'll be fine as long as you follow the golden rules:

  • Don't mix mutex lock orders. E.g. A.lock();B.lock() in one place and B.lock();A.lock(); in another. Use one order or the other!
  • Lock for the briefest possible time.
  • Don't try to use one mutex for multiple purposes. Use multiple mutexes.
  • Whenever possible use recursive or error-checking mutexes.
  • Use RAII or macros to insure unlocking.

E.g.:

#define RUN_UNDER_MUTEX_LOCK( MUTEX, STATEMENTS ) \
   do { (MUTEX).lock();  STATEMENTS;  (MUTEX).unlock(); } while ( false )

class StatusCache
{
public:
    static void SetActivityStarted(bool activityStarted)
        { RUN_UNDER_MUTEX_LOCK( mMutex, mActivityStarted = activityStarted );
          WriteToDB(); }
    static void SetActivityComplete(bool activityComplete);
        { RUN_UNDER_MUTEX_LOCK( mMutex, mActivityComplete = activityComplete );
          WriteToDB(); }
    static void SetProcessReady(bool processReady);
        { RUN_UNDER_MUTEX_LOCK( mMutex, mProcessReady = processReady );
          WriteToDB(); }
    static void SetProcessPending(bool processPending);
        { RUN_UNDER_MUTEX_LOCK( mMutex, mProcessPending = processPending );
          WriteToDB(); }

private:
    static void WriteToDB(); // read data under mMutex.lock()!

    static Mutex mMutex;
    static bool  mActivityStarted;
    static bool  mActivityComplete;
    static bool  mProcessReady;
    static bool  mProcessPending;
};
Mr.Ree
"Don't mix mutex lock orders" - that's what I'm concerned about. I will have to analyse all the locks and how they can be used. DDJ has an interesting way to enforce mutex order which I am going to investigate: http://www.ddj.com/cpp/212201754
danio
Look at my example: There's no possible way to mix mutex lock orders in there. It falls out naturally from "Lock for the briefest possible time".
Mr.Ree
+1  A: 

In theory it is possible that modifications are implemented as read, modify, write but in practice I can't imagine that this is so.

Generally it is so unless you've set up some sort of transactional memory. Variables are generally stored in RAM but modified in hardware registers, so the read isn't just for kicks. The read is necessary to copy the value out of RAM and into a place it can be modified (or even compared to another value). And while the data is being modified in the hardware register, the stale value is still in RAM in case somebody else wants to copy it into another hardware register. And while the modified data is being written back to RAM somebody else may be in the process of copying it into a hardware register.

And in C++ ints are guaranteed to take at least a byte of space. Which means it is actually possible for them to have a value other than true or false, say due to race condition where the read happens partway through a write.

On .Net there is some amount of automatic synchronization of static data and static methods. There is no such guarantee in standard C++.

If you're looking at only ints, bools, and (I think) longs, you have some options for atomic reads/writes and addition/subtraction. C++0x has something. So does Intel TBB. I believe that most operating systems also have the needed hooks to accomplish this.

Max Lybbert
Excellent points: I hadn't thought about it in terms of multiple registers for the same memory data.
danio
After I wrote this I started wondering if the read takes place since you know beforehand that the modification will be a write of constant data. So I got the assembly code output from g++ and the read takes place before the function is called (the flags are set as part of the this pointer).
Max Lybbert
+1  A: 

While you may be afraid of deadlocks, I am sure you will be extremely proud of your code to know it works perfectly.
So I would recommend you throw in the locks, you may also want to consider semaphores, a more primitive(and perhaps more versatile) type of lock.

gnomed
The main reason I am interested in trying to reduce the dependencies of locks after reading a few articles on DDJ: http://www.ddj.com/cpp/184401930 , http://www.ddj.com/cpp/204801163?pgno=2
danio
true, I always say simplicity is key. Sometimes a lockless program is simplest. But if its a complex program then locks might make it simpler than finding a way around each lock. It's a judgment call, good luck.
gnomed