tags:

views:

113

answers:

3

I'm doing a program to handle many blocking I/O operations at a time by spawning an Agent/MailboxProcessor per operation. I've got a bunch of files I've cached in memory in a Map which I want to share among these agents. However, I've also got a FileSystemWatcher to callback whenever changes are made to the files, so that I can update the cache.

How do I make this happen without risking the cache being corrupted by multi-threaded read and write ?

It seems to me that the Map is already based on pointers to objects, so would that automatically solve my problem as I'm simply changing the pointers to the new objects as they are loaded, or is this a broken understanding of it?

Thanks

+2  A: 

It seems to me that the Map is already based on pointers to objects, so would that automatically solve my problem as I'm simply changing the pointers to the new objects as they are loaded, or is this a broken understanding of it?

I think your understanding is correct. You can just have a single mutable reference to an immutable Map. Writing a new map to the reference is atomic so there is no need to synchronize that.

Jon Harrop
Thank you, I'm glad to hear that.But, I'm a bit confused about this 'mutable reference to an immutable Map'. How would you declare that ?
CSkau
@CSkau: `let x = ref (Map.empty<string>)`
Juliet
Wonderful Juliet ! It seems I got it right then. :)
CSkau
@Jon: hmmmmm... I'm not sure about this approach. If one child thread calls `x := (!x.add("Juliet")` and another calls `x := (!x.add("Jon"))` at the same time, someone loses their changes. So you either you have a single thread dedicated to writing, or you protect writes from multiple threads with a lock. And even then, without marking a critical section with a lock or declaring your variable as volatile, some threads might have a cached pointer which prevents mutation from being visible immediately to all threads. Best to avoid mutable state altogether.
Juliet
But there is only one writer, the `FileSystemWatcher`, right?
Jon Harrop
Yes in this case there is just one writer at a time. First the initial loader, then after that only the `FileSystemWatcher`.The main concern for me is the multiple other threads that are reading from the `Map` at any point in this process.
CSkau
@CSkau: The readers can only ever see the old or new data structure either side of a write because it is atomic. They will eventually see the latest version because of cache coherence and the .NET memory model.
Jon Harrop
+1  A: 

When I've seen similar Erlang programs, the systems are set up like this:

  • You can wrap up the FileSystemWatcher with a MailboxProcessor, that way you're handling incoming updates as messages and not windows events. Your FileSystemWatcherProcess can hold a list of children who are listening, and push out updates as needed. This is basically the same thing event-based programming, only with messages and actors instead.

    Your FileSystemWatcherProcess should not need to maintain a your cache of files, it just blindly pushes out messages.

  • OR You have a master process which holds the state of the map. File SystemWatcher sends updates to the master. Each child thread holds a reference to the master, so that each time they finish processing an item or batch of items, they send a message to the master process requesting the latest Map.

Neither system requires any locking.

Juliet
A: 

Following up to Jon's answer. if you end up having multiple writers, then instead of locks you can always do CAS:

 let updateMap value =
    let mutable success = false
    while not success do
      let v = !x
      let result = Interlocked.CompareExchange(x, v.Add(value), v)
      success <- Object.ReferenceEqual(v, result)

And, if targeting only .Net 4.0 is an option for you, you shouldn't be inventing all that stuff yourself: there is a System.Collections.Concurrent.ConcurrentDictionary class which implements concurrently readable and writable dictionary already.

Mitya
In this case though I've luckily only got a single writer, plus I'm using 3.5.I was however not aware of there being a concurrent collections library. This will be greatly helpful in the future !Thanks.
CSkau