views:

708

answers:

10

A fairly basic question, but I don't see it asked anywhere.

Let's say we have a global struct (in C) like so:

struct foo {
  int written_frequently1;
  int read_only;
  int written_frequently2;
};

It seems clear to me that if we have lots of threads reading and writing, we need a semaphore (or other lock) on the written_frequently members, even for reading, since we can't be 100% sure that assignments to this struct will be atomic.

If we want lots of threads to read the read_only member, and none to write, to we need a semaphore on the struct access just for reading?

(I'm inclined to say no, because the fact that the locations immediately before and after are constantly changed shouldn't affect the read_only member, and multiple threads reading the value shouldn't interfere with each other. But I'm not sure.)


[Edit: I realize now I should have asked this question much better, in order to clarify very specifically what I meant. Naturally, I didn't really grok all of the issues involved when I first asked the question. Of course, if I comprehensively edit the question now, I will ruin all of these great answers. What I meant is more like:

struct bar {
  char written_frequently1[LONGISH_LEN];
  char read_only[LONGISH_LEN];
  char written_frequently2[LONGISH_LEN];
};

The major issue I asked about is, since this data is part of a struct, is it at all influenced by the other struct members, and might it influence them in return?

The fact that the members were ints, and therefore writes are likely atomic, is really just a red herring in this case.]

+1  A: 

If all the threads are only reading, you don't need a semaphore.

Eli Bendersky
+6  A: 

If the read_only member is actually read only, then there is no danger of the data being changed and therefore no need for synchronization. This could be data that is set up before the threads are started.

You will want synchronization for any data that can be written, regardless of the frequency.

Sam Hoice
If data is read frequently but written infrequently, you can get thread safety and good performance by using separate read locks and write locks.
Adam Rosenfield
Right. More info is available from many sources, I'm sure.http://en.wikipedia.org/wiki/Readers-writer_lock
Sam Hoice
As long as the write is atomic (ie just a single int) you don't need locks at all.
Martin York
If the LOCK is atomic (i.e a single int), you don't need mutex / futex locks at all.
Tim Post
A: 

I would hide each field behind behind a function call. The write-only fields would have a semaphore. The read-only just returns the value.

Robert
A semaphore doesn't prevent a reader from seeing a "snapshot" when a write is half-done. The semaphore means "values are changing and may be inconsistent" -- so readers need to respect them as well as writers.
Adam Liss
I didn't mean them as "write-only." Just that they were likely to be changed frequently. They would certainly have to be read. The "read-only" is really read-only.
JXG
+1  A: 

No.

In general you need semaphores to prevent concurrent access to resources (an int in this case). However, since the read_only member is read only, it won't change between/during accesses. Note that it doesn't even have to be an atomic read — if nothing changes, you're always safe.

How are you setting read_only initially?

Yogi
`read_only` is set by a single thread, once, before the rest of the threads are run.
JXG
A: 

Adding to previous answers:

  1. In this case the natural synchronization paradigm is mutual exclusion, not semaphores.
  2. I agree that you don't need any mutex on readonly variables.
  3. If the read-write part of the structure has consistency constraints, in general you will need one mutex for all of them, in order to keep the operations atomic.
Federico Ramponi
+2  A: 

"Read only" is a bit misleading, since the variable is written to at least once when it's initialized. In that case you still need a memory barrier between the initial write and subsequent reads if they're in different threads, or else they could see the uninitialized value.

sk
+5  A: 

You need a mutex to guarantee that an operation is atomic. So in this particular case, you may not need a mutex at all. Specifically, if each thread writes to one element and the write is atomic and the new value is independent of the current value of any element (including itself), there is no problem.

Example: each of several threads updates a "last_updated_by" variable that simply records the last thread that updated it. Clearly, as long as the variable itself is updated atomically, no errors will occur.


However, you do need a mutex to guarantee consistency if a thread reads or writes more than one element at a time, particularly because you mention locking an element rather than the entire structure.

Example: a thread updates the "day", "month" and "year" elements of a structure. This must happen atomically, lest another thread read the structure after the "month" increments but before the "day" wraps to 1, to avoid dates such as February 31. Note that you must honor the mutex when reading; otherwise you may read an erroneous, half-updated value.

Adam Liss
Good answer. If you don't need schronisation between threads and operation is atomic not need to lock.
Martin York
Re-reading this, I'm trying to imagine the algorithm that would generate "dates such as February 31." Hmmm .... sounded good at the time!
Adam Liss
You didn't say that it's a good algorithm. ;)
Sam Hoice
@Sam Hoice: Good point! In fact, if I reviewed that code, I would send myself to remedial calendar algorithms class.
Adam Liss
Your 'last updated by' example has a problem - the writes could be reordered if they are unsynchronized, meaning a value written to the last update field could be overwritten by an earlier value.
sk
+1  A: 

Readers need mutexes, too!

There seems to be a common misconception that mutexes are for writers only, and that readers don't need them. This is wrong, and this misconception is responsible for bugs that are extremely difficult to diagnose.

Here's why, in the form of an example.

Imagine a clock that updates every second with the code:

if (++seconds > 59) {        // Was the time hh:mm:59?
   seconds = 0;              // Wrap seconds..
   if (++minutes > 59)  {    // ..and increment minutes.  Was it hh:59:59?
     minutes = 0;            // Wrap minutes..
     if (++hours > 23)       // ..and increment hours.  Was it 23:59:59?
        hours = 0;           // Wrap hours.
    }
}

If the code is not protected by a mutex, another thread can read the hours, minutes, and seconds variables while an update is in progress. Following the code above:

[Start just before midnight] 23:59:59
[WRITER increments seconds]  23:59:60
[WRITER wraps seconds]       23:59:00
[WRITER increments minutes]  23:60:00
[WRITER wraps minutes]       23:00:00
[WRITER increments hours]    24:00:00
[WRITER wraps hours]         00:00:00

The time is invalid from the first increment until the final operation six steps later. If a reader checks the clock during this period, it will see a value that may be not only incorrect but illegal. And since your code is likely to depend on the clock without displaying the time directly, this is a classic source of "ricochet" errors that are notoriously difficult to track down.

The fix is simple.

Surround the clock-update code with a mutex, and create a reader function that also locks the mutex while it executes. Now the reader will wait until the update is complete, and the writer won't change the values mid-read.

Adam Liss
I don't think you understand what the OP was asking about. He's referring to variables that aren't written at all (after program startup, presumably). In your example, `seconds/minutes/hours` all refer to writeable variables, so of course you need synchronization primitives.
Tom Barta
I understand the OP is asking a very specific question about a special case, but I'm concerned that several of the answers imply that a reader doesn't need a mutex in _general_ -- which is incorrect and leads to particularly nasty bugs. Clearly, a quantity that never changes is thread-safe.
Adam Liss
+1  A: 

You might enjoy reading any one of these papers on practical lock free programming, or just dissecting and understanding the provided snippets.

Tim Post
A: 

Many thanks to all the great answerers (and for all the great answers).

To sum up:

If there is a read-only member of a struct (in our case, if the value is set once, long before any thread might want to read it), then threads reading this member do not need locks, mutexes, semaphores, or any other concurrency protection.

This is true even if the other members are written to frequently. The fact that the different variables are all part of the same struct makes no difference.

JXG