I have a data class with lots of data in it (TV schedule data). The data is queried from one side and periodically updated from the other side. There are two threads: the first thread queries the data on request and the second thread updates the data at regular intervals. To avoid locking, I use two instances (copies) of the data class: the live instance and the backup instance. Initially, both instances are filled with the same data. The first thread only reads from the live instance. The second thread periodically updates both instances as follows:

  • Update the backup instance.
  • Swap the backup and live instance (i.e. the backup instance becomes the live instance).
  • Update the backup instance.
  • Both backup instance and live instance are now up-to-date.
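
For illustration, the steps above can be sketched roughly as follows. This is only a minimal sketch under assumptions: BufferedData, DoubleBuffer, Programmes and the applyUpdate callback are hypothetical names standing in for the real schedule class and update logic.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the schedule data class (the real one has many more fields).
public class BufferedData
{
    public List<string> Programmes = new List<string>();
}

public class DoubleBuffer
{
    // The reader thread only reads live; only the writer thread touches backup.
    private volatile BufferedData live = new BufferedData();
    private BufferedData backup = new BufferedData();

    public BufferedData Live { get { return live; } }

    // Called from the writer thread only. applyUpdate is a hypothetical
    // callback that brings one instance up to date.
    public void Update(Action<BufferedData> applyUpdate)
    {
        applyUpdate(backup);          // 1. update the backup instance
        BufferedData oldLive = live;
        live = backup;                // 2. swap: the backup becomes the live instance
        backup = oldLive;
        applyUpdate(backup);          // 3. update the new backup (the former live instance)
    }                                 // 4. both instances are now up to date
}
```

Note that step 3 writes to the instance that was live a moment earlier, so a reader that fetched the reference just before the swap may still be reading it while it is modified.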

My question is: how should I use the volatile keyword here?

public class data
{
  // Lots of fields here.
  // Should these fields also be declared volatile?
}

I have already made the references volatile:

public volatile data live;
public volatile data backup;
A: 

Fields should be declared volatile if you plan to modify them outside locks or without Interlocked. Here is the best article that explains volatile in depth: http://igoro.com/archive/volatile-keyword-in-c-memory-model-explained/

Andrey
A: 

To be honest, I would just lock on it. The correctness is so much easier to check, and the need for the backup is removed.
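
A minimal sketch of what the locking approach might look like, assuming a single list of programme names stands in for the real schedule data (LockedSchedule, Snapshot and Replace are hypothetical names, not anything from the question):

```csharp
using System.Collections.Generic;

public class LockedSchedule
{
    private readonly object gate = new object();
    private readonly List<string> programmes = new List<string>();

    // Reader thread: copy out what is needed while holding the lock.
    public List<string> Snapshot()
    {
        lock (gate)
        {
            return new List<string>(programmes);
        }
    }

    // Writer thread: mutate the single instance in place under the same lock.
    public void Replace(IEnumerable<string> updated)
    {
        lock (gate)
        {
            programmes.Clear();
            programmes.AddRange(updated);
        }
    }
}
```

Because every access goes through the same lock, there is only one instance to reason about and no field needs to be volatile.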

With your plan here, the fields would also have to be volatile. Consider the case otherwise:

public class Data
{
  public int SimpleInt;
}

Here we have just a single public field for simplicity; the same applies to more realistic structures. (Incidentally, capitalised class names are the more common convention in C#.)

Now consider live.SimpleInt as seen by thread A. Because live could be cached, we need to have it as volatile. However, consider that when the object is swapped with backup, and then swapped back to live, then live will have the same memory location as it did before (unless the GC has moved it). Therefore live.SimpleInt will have the same memory location as it did before, and therefore if it was not volatile, thread A may be using a cached version of live.SimpleInt.

However, if you created a new Data object, rather than swapping in and out, then the new value of live.SimpleInt will not be in the thread's cache, and it could be safely non-volatile.

It's also important to consider that the fields of the fields will have to be volatile too.

Indeed, you now need just one stored Data object. The new one is created as an object referenced by only one thread, so it can neither be damaged by nor do damage to another thread. Its creation is based on values read from live, which is also safe because the other thread is only reading: barring memoisation techniques that make "reads" really writes behind the scenes, reads cannot harm other reads, though they can be harmed by writes. The new object is altered while visible to just a single thread, so only the final write requires any concern about synchronisation. That write should indeed be safe with only volatile or a MemoryBarrier for protection, since assigning a reference is atomic and you no longer care about the old value.
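
As a rough sketch of that create-and-replace idea (ImmutableData, Publisher and Publish are hypothetical names, and a list of programme names is assumed in place of the real fields):

```csharp
using System.Collections.Generic;

// Hypothetical data class that is never modified once it has been published.
public class ImmutableData
{
    public readonly List<string> Programmes;

    public ImmutableData(IEnumerable<string> programmes)
    {
        Programmes = new List<string>(programmes);
    }
}

public class Publisher
{
    private volatile ImmutableData live = new ImmutableData(new string[0]);

    public ImmutableData Live { get { return live; } }

    // Writer thread only.
    public void Publish(string extraProgramme)
    {
        // Build the replacement while only this thread can see it.
        var next = new List<string>(live.Programmes);
        next.Add(extraProgramme);

        // The single volatile write makes the finished object visible to readers.
        live = new ImmutableData(next);
    }
}
```

A reader that still holds the old reference keeps seeing a consistent, unchanged object; it simply picks up the new one the next time it reads Live.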

Jon Hanna
A: 

I do not think you are going to get the effect you want by marking things with volatile. Consider this code.

volatile data live;

void Thread1()
{

  if (live.Field1)
  {
    Console.WriteLine(live.Field1);
  }
}

In the example above, false could be written to the console if the second thread swapped the live and backup references between the time the first thread entered the if and the time it called Console.WriteLine.
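
One possible way to avoid reading two different instances, sketched here under assumptions, is to copy the reference into a local once and use only that local. FlagData, Reader and Describe are hypothetical names, and this only pins the reference: it does not help if the writer later modifies that same instance in place.

```csharp
using System;

public class FlagData
{
    public bool Field1;
}

public class Reader
{
    public volatile FlagData live = new FlagData();

    public string Describe()
    {
        // One volatile read; every later access goes through the same instance.
        FlagData snapshot = live;
        if (snapshot.Field1)
        {
            return snapshot.Field1.ToString();
        }
        return "not set";
    }
}
```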

If that problem does not concern you, then all you really need to do is mark the live variable as volatile. You do not need to mark the individual fields in data as volatile. The reason is that volatile reads create acquire-fence memory barriers and volatile writes create release-fence memory barriers. That means that when thread 2 swaps the references, all writes to the individual fields of data must commit first, and when thread 1 wants to read the individual fields of the live instance, the live variable must be reacquired from main memory first. You do not need to mark the backup variable as volatile because it is never used by thread 1.
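
The same acquire/release pairing can be spelled out with System.Threading.Volatile instead of the keyword. This is just an illustrative sketch; FencedData, FencedHolder, Publish and Read are hypothetical names.

```csharp
using System.Threading;

public class FencedData
{
    public int Value; // deliberately not volatile
}

public class FencedHolder
{
    private FencedData live = new FencedData();

    // Writer: the write to Value is ordered before the release write to live.
    public void Publish(int value)
    {
        var next = new FencedData();
        next.Value = value;
        Volatile.Write(ref live, next);
    }

    // Reader: the acquire read of live is ordered before the read of Value.
    public int Read()
    {
        FencedData current = Volatile.Read(ref live);
        return current.Value;
    }
}
```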

The advanced threading section in Joe Albahari's ebook goes into a great deal of detail on the semantics of volatile and should explain why you only need to mark your live reference as such.

Brian Gideon