ansaurus

Question

Answer 1

A:

I think this general approach is valid (re-loading a cache on a background thread, not preventing access to the cache while the load is in progress by loading data to separate instances), but I'm not sure what declaring the reference as volatile really gives you.

You could just as easily have the get(id) method and the portion of loadMappingsFromDB() which overwrites the reference (not the entire load from DB, but just the re-assignment of mappings) synchronize on the same lock.

Honestly though I would look into re-using an established caching library like EhCache which has features for background loading or loading-on-startup, as these libraries have likely worked out all of it's synchronization issues a long time ago and you can go back to worrying about the logic of your app rather than the low-level safety of a home-brewed cache.

matt b 2010-06-10 16:29:00

i don't want to use synchronized blocks because they incur a performance overhead.

luke 2010-06-10 16:47:41

the overhead of a block like `synchronized (lockObject) { mappings = temp; }` is far less than you think. And using volatile has overhead as well.

matt b 2010-06-10 17:31:54

volatile only has overhead immediately after a write, because caches have been invalidated (this really depends on the underlying architecture, but that is usually how the barrier works). while synchronized, imposes the same penalty plus a number of method calls and other cheks as well as potential blocking whereas volatile will always read a value immediately.

luke 2010-06-10 19:39:20

Answer 2

A:

I will be honest, I don't quite follow all your code but the volatile key word just lets your compiler know that the variable may change outside of your code.

Richard 2010-06-10 16:32:49

Naah, not really. That may be the c/c++ definition. From jls: A field may be declared volatile, in which case the Java memory model (§17) ensures that all threads see a consistent value for the variable.

aioobe 2010-06-10 16:34:06

You are right aioobe, I come from more of a c background, more so with interfacing with hardware. So if I am reading your definition correctly, would volatile in Java insure the children threads have access to the same parent variable?

Richard 2010-06-10 16:46:10

In Java, `volatile` just means that a variable's value is checked against main memory for thread safety purposes. I'll update my answer with a link to the JLS.

Lord Torgamus 2010-06-10 17:11:48

Answer 3

A:

Actually, your proposed approach makes sense even without using 'volatile' keyword. Because reference assignment (mappings = tmp;) is an atomic operation (translates to one machine command) there's no chance of memory inconsistence:
http://java.sun.com/docs/books/tutorial/essential/concurrency/atomic.html
Reads and writes are atomic for reference variables and for most primitive variables (all types except long and double).

The idea behind 'volatile' keyword is that if you change such variable at time T, any other thread accessing it at time T + x (x > 0) will see new value. Otherwise, even after value is changed, there's still some time period during which other threads may see old value. But that doesn't seem to be a problem for you.
edit
Same idea is described in the link above.

Nikita Rybak 2010-06-10 16:53:46

It would not be "safe publication" if `mappings` weren't volatile. The reference would not be updated immediately, and the values could be stale when it was.

gustafc 2010-06-10 16:57:07

Atomic operations are _not quite guaranteed_ to be thread-safe: http://www.ibm.com/developerworks/java/library/it/it-1001art24/index.html#3

Lord Torgamus 2010-06-10 16:58:30

@gustafc If I understand poster correctly, few seconds of delay (which is unlikely already) won't be a problem here. It's not like value will be 'stale' for a year. But I agree that it's not desirable in many conditions. I just described the behaviour and it's for author to decide whether use it or not.

Nikita Rybak 2010-06-10 17:00:02

@Nikita, that's not the kind of staleness I mean. The client could see an incompletely constructed `HashMap` if the reference wasn't volatile.

gustafc 2010-06-10 17:02:10

@Lord Torgamus Nobody's talking about full thread-safety. I just point out that caveats described in your link aren't a threat here (with one assignment every couple of minutes). Having 'happens-before relationship' doesn't seem to be critical here.

Nikita Rybak 2010-06-10 17:03:27

@gustafc Sorry, but that sounds like sci-fi to me. If object is fully constructed/populated in one thread and then passed to another, how can second thread see only part of it? I'll be very interested in seeing any links confirming it. You'll really broaden my horizon in this case :)

Nikita Rybak 2010-06-10 17:07:28

Point taken, but the language you used was pretty... expansive.

Lord Torgamus 2010-06-10 17:09:35

@Lord Torgamus Not sure I understand what 'expansive' means here. If it was harsh or 'offensive', then I'm sorry, it wasn't meant to be :)

Nikita Rybak 2010-06-10 17:13:34

@Nikita, explanations are sparse, but my understanding is that different memory pages may be flushed at different times, so a reference pointing to an object on another page could point at a page which hasn't been flushed. http://www.ibm.com/developerworks/java/library/j-jtp06197.html#2.2

gustafc 2010-06-10 17:13:47

@gustafc Thanks, the link is very interesting. I thought it's logical to assume that reference is 'refreshed' completely, but apparently it's not.

Nikita Rybak 2010-06-10 17:25:46

@Nikita, no, you weren't offensive at all. It's just that you didn't mention full thread safety until you responded to my comment, and I thought someone reading your answer later might get the wrong idea.

Lord Torgamus 2010-06-10 17:38:25

@Nikita, it *does* seem logical that everything is published at once, but the only safe assumption when it comes to concurrency is that there are no safe assumptions :)

gustafc 2010-06-10 17:52:00

Answer 4

+1 A:

Does this approach make sense? and is it actually thread-safe?

It does make sense, and it is thread safe. To some extent, anyway. Some things to think about:

When updating, you let the application read old, out-dated values. Is this what you intended? It's fine for some applications, in other cases you might want to block until the cache has been updated (FutureTask makes this behavior rather easy).
When loadMappingsFromDB() kicks in, the thread originally calling get(ID) will block until the update is finished.
Several threads may call checkLoad() concurrently, which means that if the reload is slow and you have several threads calling get(ID), you may end up with craploads of concurrent updates. While the result would be the same, it would be a waste of system resources. An easy way to fix it in your current code would be having an AtomicBoolean which you check before updating:
```
private final AtomicBoolean isUpdating = new AtomicBoolean(false);
private void checkLoad()
{
    if (System.currentTimeMillis() - last_update_time <= TIMEOUT) return;
    if (!isUpdating.compareAndSet(false, true)) return; // already updating
    try {
        loadMappingsFromDB();
    } finally {
        isUpdating.set(false);
    }
}
```

gustafc 2010-06-10 16:58:39

@gustfc these are all good points in general, but they don't really matter for my case. only one thread will be calling get so only that thread would ever call checkUpdate at a time, the question was more about the safety of modifying this shared variable. Also its ok if there are 'stale' reads because we are just polling for changes so the data might have been stale for a while before we attempted an update (and that ok for my use case). But thanks for the code snippet, i haven't used AtomicBoolean before, so +1 :)

luke 2010-06-10 19:36:35

Answer 5

+2 A:

There's some misinformation in the existing answers to this question. Use of volatile actually is a good step in ensuring thread safety. See item 3 in Dispelling Java programming language myths by IBM's Peter Haggar. Haggar gives a bit of background and an example, but the nut is this:

So, how can atomic operations not be thread safe? The main point is that they may indeed be thread safe, but there is no guarantee that they are. Java threads are allowed to keep private copies of variables separate from main memory.

By using volatile, you will guarantee that threads are referring to main memory and not using private copies of variables that you don't know about or expect.

To answer your question, then: yes, your strategy is safe.

EDIT:
In response to another post, here is the JLS section about volatile fields.

Lord Torgamus 2010-06-10 17:06:16

thats a nice article, thanks

luke 2010-06-10 17:11:18

ansaurus

tags:

views:

answers:

proper use of volatile keyword

related questions