Hi,

I have a general question about publishing data and data changes across threads. Consider, for example, the following class.

public class DataRace {
  static int a = 0;

  public static void main(String[] args) {
    new MyThread().start();
    a = 1;
  }

  public static class MyThread extends Thread {
    public void run() { 
      // Access a, b.
    }
  }
}

Let's focus on main().

Consider this ordering first:

new MyThread().start();
a = 1;

Here we change the shared variable a after MyThread is started, so this may not be a thread-safe publication.

a = 1;
new MyThread().start();

However, this time the change to a is safely published to the new thread, since the Java Language Specification (JLS) guarantees that all variables that were visible to a thread A when it starts a thread B are visible to thread B, which is effectively like having an implicit synchronization in Thread.start().

new MyThread().start();
int b = 1;

In this case, when a new variable is allocated after both threads have been spawned, is there any guarantee that the new variable will be safely published to all threads? That is, if the variable b is accessed by the other thread, is it guaranteed to see its value as 1? Note that I'm not talking about any subsequent modifications to b after this (which certainly need to be synchronized), but about the first allocation done by the JVM.

Thanks,

A: 
a = 1;
new MyThread().start();

The assignment to 'a' happens before start() is called on the new thread, so the new thread will always print the value '1'.

new MyThread().start();
a = 1;

In this case, MyThread can print 1 or 0 as the value of 'a'.

Laplie
+1  A: 

I wasn't entirely sure about this one:

a = 1;
new MyThread().start();

... in that I wasn't sure there was any guarantee that the value of a would be "flushed" to shared memory before the new thread started. However, the spec explicitly talks about this. In section 17.4.4 it states:

An action that starts a thread synchronizes-with the first action in the thread it starts.

(The discussion after that basically shows that it'll be okay.)
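As a minimal sketch of what that rule gives you (the class and method names below are made up for illustration and are not from the question): the write to a comes before start() in program order, and start() synchronizes-with the first action of the new thread, so the new thread should be guaranteed to read 1.

```java
// Hypothetical demo class; not part of the original question.
class StartVisibility {
    static int a = 0; // plain, non-volatile field, as in the question

    // Writes a = 1, starts a thread that reads a, and returns what that thread saw.
    static int readInStartedThread() {
        final int[] seen = new int[1];
        a = 1; // ordered before start() by program order
        Thread t = new Thread(() -> seen[0] = a);
        t.start(); // synchronizes-with the first action in the new thread
        try {
            t.join(); // the thread's write to seen[0] is visible after join() returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(readInStartedThread()); // prints 1
    }
}
```

Swapping the two lines (start() first, then a = 1) removes that ordering, and the thread may then read either 0 or 1.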

I'm not sure what your last example (with b) is meant to be about.

Jon Skeet
When a new variable is allocated after both threads have been spawned, is there any guarantee that the new variable will be safely published to all threads? That is, if the variable b is accessed by the other thread, is it guaranteed to see its value as 1?
baskin
It really depends on the exact situation. There'd have to be something specific to guarantee both that the reading thread doesn't use a cached value and that the writing thread has published the write to the shared memory. See chapter 17 of the spec referenced above for more (brain-busting) details.
Jon Skeet
"Flushed to shared memory"? (Are you referring to the heap?) There is no flushing involved, as this has nothing to do with stream semantics. Value assignment is atomic (with a minor caveat regarding longs) in the JVM. After a=1 executes, the variable <CL>:DataRace.a has the value 1. If a thread is spawned before the statement a=1, there are no guarantees about what value of DataRace.a it will see. It is certainly possible that the thread starts and is then swapped out, the main thread executes and sets a=1, and then the other thread sees a==1. It is more likely that it will see a==0.
>> "reading thread doesn't use a cached value" There is no point in the reading thread caching b's value here, since it never existed before. Jon, consider e.g. final String msg = "blah blah"; myExecutorService.submit(new LoggingTask(msg)); where LoggingTask simply writes the msg somewhere and the executor service is not a same-thread executor. Do you mean to say that in this case I'll have to explicitly synchronize the string 'msg'?
baskin
@baskin: Imagine that the CPU had a cached value of the memory which includes the area containing b. Even if it hadn't explicitly read before, would it definitely know that that wasn't valid to cache? I'm not sure - maybe, maybe not. I suspect that submitting a new task to an executor service creates an appropriate memory barrier though.
Jon Skeet
@alphazero: I'm referring to shared memory as mentioned in the spec. I'm not talking about stream semantics - I'm talking about the JIT using registers and only writing to the heap when it needs to, and about CPU caches too. You don't need to be using streams for flushing to be relevant...
Jon Skeet
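To illustrate the executor case raised in the comments above, here is a minimal sketch. SubmitVisibility and runLoggingTask are hypothetical names, and the LoggingTask from the comment is replaced by a small Callable that just returns the string. The java.util.concurrent package documentation states that actions in a thread prior to submitting a task to an executor happen-before the task's execution begins, so under that guarantee no extra synchronization of msg should be needed.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class; stands in for the LoggingTask example from the comment.
class SubmitVisibility {
    // Hands msg to a worker thread and returns what the worker produced from it.
    static String runLoggingTask(final String msg) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // msg was assigned before submit(), so the task is guaranteed to see it.
            Callable<String> task = () -> "logged: " + msg;
            Future<String> result = pool.submit(task);
            return result.get(); // get() also makes the task's result visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runLoggingTask("blah blah")); // prints "logged: blah blah"
    }
}
```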
A: 

new MyThread().start() starts a new thread that runs separately. The declaration int b = 1 has nothing to do with the thread that you started. It continues in the main thread.

Thread safety is an issue if two threads read or mutate the same Object at the same time, or if two threads obtain locks on resources in reverse order, so that each one is waiting for the other one (this is called a deadlock).

Yishai
+1  A: 

I'm not sure what you're asking here. I think you're talking about thread-safe access to the "a" variable.

The issue is not the order of invocation but the fact that

-access to a is not threadsafe. So in an example with multiple threads updating and reading a, you won't ever be able to guarantee that the "a" you're reading is the same value as what you updated from (some other thread may have changed the value).

-in a multithreaded environment the jvm does not guarantee that the updated values for a are kept in sync. E.g.

Thread 1: a=1

Thread 2: a=2

Thread 1: print a <- may return 1

You can avoid this by declaring a "volatile".

As written there are no guarantees at all about the value of a.
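A minimal sketch of the volatile fix mentioned above, assuming a shared static field that is written after the other thread has already started (VolatilePublish and readAfterStart are hypothetical names, not from the question). A write to a volatile field happens-before every subsequent read of that field, so the reader's loop eventually observes the new value.

```java
// Hypothetical demo class; not part of the original question.
class VolatilePublish {
    static volatile int a = 0; // volatile: writes become visible to later reads

    // Starts a reader first, then writes a = 1, and returns what the reader saw.
    static int readAfterStart() {
        a = 0; // reset so the method can be called more than once
        final int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (a == 0) {
                Thread.yield(); // spin until the volatile write is observed
            }
            seen[0] = a;
        });
        reader.start();
        a = 1; // written after start(); visibility comes from volatile, not from start()
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(readAfterStart()); // prints 1
    }
}
```

Without volatile (or some other synchronization), the reader would have no guarantee of ever seeing the write, and the loop might not terminate.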

BTW, Josh Bloch's Concurrency in Practice is a great book on this subject ( and I say that not having gotten all the way through it yet ;) - it really helped me to understand just how involved threading issues can get.

Steve B.
Java Language Specification (JLS) guarantees that all variables that were visible to a thread A when it starts a thread B are visible to thread B. Thus the order of invocation would matter here?
baskin
Concurrency in Practice is by Brian Goetz et al.; Bloch wrote Effective Java, and Java Puzzlers with either Pugh or Gafter.
non sequitor
A: 

I have my doubts about this question :)

Issues regarding access to shared resources between concurrent threads of execution are a fairly well understood and investigated topic.

In general, if you have a resource that is accessed with read/write semantics by multiple threads, you will need to regulate access to that resource. (Here the resource is the static int variable DataRace.a.)

(I also +1 Steve B.'s note here that the issue here is not the order of invocations.)

A: 

If b is a local variable created within DataRace.main() as the code snippet indicates, it is not accessible to MyThread to begin with, so the question is moot.

If b is a shared variable between the main thread and the MyThread thread, it does not have the correct visibility in the absence of a proper memory barrier, so MyThread may not see the correct value promptly.

sjlee