ansaurus

Question

Answer 1

+6 A:

Yes, using independent threads will use multiple cores in a normal JVM, without you having to do any work.
Anything that is only ever read can safely be read by multiple threads. If you can make the objects in question immutable (to guarantee they'll never be changed), that's even better.
I'm not sure what sort of clustering you're considering, but you might want to look at Hadoop. Note that distributed computing distributes tasks rather than threads (normally, anyway).
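
As a rough illustration of the first two points, here is a minimal sketch in which several worker threads read the same unmodifiable list and the JVM is left to spread them over the available cores. The class name `ParallelSum` and its `sum` helper are made up for this example, and the pool size is just one reasonable choice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class, not part of any library.
class ParallelSum {
    /** Sums a read-only list by giving each worker thread its own slice. */
    static long sum(List<Integer> data, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            int chunk = (data.size() + workers - 1) / workers;
            for (int w = 0; w < workers; w++) {
                int from = Math.min(data.size(), w * chunk);
                int to = Math.min(data.size(), from + chunk);
                // Each task only reads its slice; nothing is modified.
                Callable<Long> task = () -> {
                    long s = 0;
                    for (int i = from; i < to; i++) {
                        s += data.get(i);
                    }
                    return s;
                };
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // blocks until that slice is done
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 1; i <= 1000; i++) {
            data.add(i);
        }
        List<Integer> readOnly = Collections.unmodifiableList(data);
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(sum(readOnly, cores));
    }
}
```

Nothing here pins threads to cores; the operating system scheduler decides where each pool thread runs.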

Jon Skeet 2009-10-07 17:05:10

Currently I'm cannibalizing cycles from department help room/student lab computers with BOINC (for entire problem runs, not individual threads). Looking forward to the point where I might have access to an actual cluster (or use of my app by someone who does). To clarify my question about distributed computing: is there an existing, easy-to-use framework that converts threads to tasks, makes an entire cluster look like a single JVM, or something similar?

Carl 2009-10-07 17:11:58

Removed my post as it duplicates this one. However, you should still read up on the Java memory model, as it's very important for understanding how data is transmitted between threads. http://java.sun.com/docs/books/jls/third%5Fedition/html/memory.html

reccles 2009-10-07 17:31:42

@Carl: I don't know of anything which will convert threads to tasks - quite often you'd use different techniques. Have a look at Hadoop though.

Jon Skeet 2009-10-07 17:33:43

I've had a look at Hadoop before, some ways back; have to take a second glance.

Carl 2009-10-07 17:35:05

Answer 2

+4 A:

Multi-core Usage

Java runtimes conventionally schedule threads to run concurrently on all available processors and cores. I think it's possible to restrict this, but it would take extra work; by default, there is no restriction.

Immutable Objects

For read-only objects, declare their member fields as final, which will ensure that they are assigned when the object is created and never changed. If a field is not final, even if it is never changed after construction, there can be some "visibility" issues in a multi-threaded program. This could result in the assignments made by one thread never becoming visible to another.
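
A minimal sketch of such a read-only class might look like the following. The name `RunParameters` and its fields are invented for illustration; the point is only that every field is final and the list is copied and wrapped so it cannot change afterwards.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical immutable value class shared by worker threads.
final class RunParameters {
    private final String name;
    private final List<Integer> values;

    RunParameters(String name, List<Integer> values) {
        this.name = name;
        // Defensive copy: later changes to the caller's list cannot leak in.
        this.values = Collections.unmodifiableList(new ArrayList<>(values));
    }

    String name() {
        return name;
    }

    List<Integer> values() {
        return values;
    }
}
```

Note that final only protects the reference. If a final field points at a mutable collection that something else keeps modifying, the object is still not safe to share, which is why the constructor copies the list.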

Any mutable fields that are accessed by multiple threads should be declared volatile, be protected by synchronization, or use some other concurrency mechanism to ensure that changes are consistent and visible among threads.
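
To make this concrete, here is a small sketch of two of those options for a shared counter: a synchronized method and an `AtomicInteger`. The `Counters` class and `runDemo` helper are assumptions made for the example. A plain `volatile int` with `count++` is deliberately not shown as a solution, because volatile gives visibility but does not make the read-modify-write step atomic.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class showing two ways to share a mutable counter.
class Counters {
    private final AtomicInteger atomic = new AtomicInteger();
    private int guarded; // only accessed while holding the lock on "this"

    void incrementAtomic() {
        atomic.incrementAndGet();
    }

    synchronized void incrementGuarded() {
        guarded++;
    }

    int atomicValue() {
        return atomic.get();
    }

    synchronized int guardedValue() {
        return guarded;
    }

    /** Starts the given number of threads, each incrementing both counters. */
    static Counters runDemo(int threads, int perThread) {
        Counters c = new Counters();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    c.incrementAtomic();
                    c.incrementGuarded();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait so the final values are complete
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return c;
    }
}
```

With either mechanism, both counters end up equal to threads times perThread; with an unprotected int, updates could be lost.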

Distributed Computing

The most widely used framework for distributed processing of this nature in Java is called Hadoop. It uses a paradigm called map-reduce.
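
For readers unfamiliar with the paradigm, the following is a toy, single-JVM sketch of the map-reduce idea using plain Java streams. It is not Hadoop code and uses none of Hadoop's API; the `WordCount` class is invented here purely to show the shape of a map step followed by a reduce step.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical in-process illustration of map-reduce; not Hadoop.
class WordCount {
    static Map<String, Long> count(List<String> lines) {
        return lines.parallelStream()
            // "map" step: each line is split into words independently
            .flatMap(line -> Arrays.stream(line.trim().split("\\s+")))
            .filter(word -> !word.isEmpty())
            // "reduce" step: occurrences of the same word are combined
            .collect(Collectors.groupingBy(word -> word, TreeMap::new, Collectors.counting()));
    }
}
```

In a real distributed framework the map and reduce steps run as separate tasks on different machines, with the framework moving the intermediate data between them.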

Native Code Integration

Integrating with other languages is unlikely to be worthwhile. Because of its adaptive bytecode-to-native compiler, Java is already extremely fast on a wide range of computing tasks. It would be wrong to assume that another language is faster without actual testing. Also, integrating with "native" code using JNI is extremely tedious, error-prone, and complicated; using simpler interfaces like JNA is very slow and would quickly erase any performance gains.

erickson 2009-10-07 17:21:31

can you elaborate on "visibility" issues? I have a few immutable collections which I pass to each thread when it starts; none of the results of any given thread need to be seen by another, and they make no modifications to the collections (nor does anything else).

Carl 2009-10-07 17:33:43

This should be OK. The problem arises if a thread changes any of these values: the change may not be "visible" to other threads. It can remain in the writing thread's private cache and never be written back to shared memory. This has to do with the happens-before rules in the Java memory model.

reccles 2009-10-07 17:37:36

It depends on the memory model - I can't remember what the acquire/release semantics are on thread creation, but I believe there are none, meaning you can initialize your object before the thread starts but the thread itself will never see it. Please refer to my answer for a more complete explanation.

Vitali 2009-10-07 17:42:42

Volatile is insufficient for thread safety unless you can guarantee only a single thread will modify the volatile variable.

Vitali 2009-10-07 17:43:24

The happens-before rules on volatile will force a flush of memory before the started thread runs. The thread-safety issue with volatile is that there are no guarantees of atomicity.

reccles 2009-10-07 18:01:02

@Vitali: A call to start on a thread happens-before any action in the started thread [(ref)](http://java.sun.com/javase/6/docs/api/java/util/concurrent/package-summary.html). In other words, a newly started thread will see all changes that were made before the thread was started.

Peter Štibraný 2010-07-25 19:13:56
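
The start() guarantee described in the comment above can be sketched as follows. The class `StartVisibility` is hypothetical, and the shared field is deliberately neither final nor volatile to show that the ordering comes from start() and join() alone.

```java
// Hypothetical example of the Thread.start() happens-before rule.
class StartVisibility {
    // Deliberately neither final nor volatile.
    private static int[] table;

    static int sumSeenByWorker() {
        table = new int[] {1, 2, 3}; // written before start()
        int[] result = new int[1];
        Thread worker = new Thread(() -> {
            // start() happens-before this code, so the array above is visible.
            int s = 0;
            for (int v : table) {
                s += v;
            }
            result[0] = s;
        });
        worker.start();
        try {
            // join() gives the matching guarantee in the other direction:
            // the worker's write to result is visible after it returns.
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }
}
```

This only covers data written before start(). Anything modified after the worker is running still needs volatile, synchronization, or another concurrency mechanism.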

Answer 3

+1 A:

As some people have said, the answers are:

Threads on cores - Yes. Java has had support for native threads for a long time. Most OSes have provided kernel threads which automagically get scheduled to any CPUs you have (implementation performance may vary by OS).
The simple answer is that it will be safe in general. The more complex answer is that you have to ensure that your object is actually created & initialized before any threads can access it. This is solved in one of two ways:
- Let the class loader solve the problem for you using a Singleton (and lazy class loading):
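```
public class MyImmutableObject
{
    // The holder class is not loaded until getInstance() is first called;
    // class initialization then publishes the instance safely to all threads.
    private static class MyImmutableObjectInstance {
        private static final MyImmutableObject instance = new MyImmutableObject();
    }

    // Must be static so callers can obtain the instance without already having one.
    public static MyImmutableObject getInstance() {
        return MyImmutableObjectInstance.instance;
    }
}
```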
- Explicitly using acquire/release semantics to ensure a consistent memory model:
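```
public class SharedObjectExample
{
    // Plain field: it is published through the volatile flag below.
    static MyImmutableObject foo = null;
    static volatile boolean objectReady = false;

    // initializer thread:
    static void initializerThread() {
        // create & initialize object for use by multiple threads
        // (initialize() stands for whatever setup the object needs)
        foo = new MyImmutableObject();
        foo.initialize();

        // release barrier: volatile write, after all writes to foo
        objectReady = true;

        // start worker threads only after this point
        new Thread(new Worker()).start();
    }

    static class Worker implements Runnable {
        public void run() {
            // acquire barrier: volatile read, before any read of foo
            if (!objectReady)
                throw new IllegalStateException("Memory model violation");

            // start using immutable object foo
        }
    }
}
```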
I don't recall off the top of my head how you can exploit the memory model of Java to perform the latter case. I believe, if I remember correctly, that a write to a volatile variable is equivalent to a release barrier, while a read from a volatile variable is equivalent to an acquire barrier. Also, the reason for making the boolean volatile as opposed to the object is that access of a volatile variable is more expensive due to the memory model constraints - thus, the boolean allows you to enforce the memory model & then the object access can be done much faster within the thread.
As mentioned, there are all sorts of RPC mechanisms. There's also RMI, which is a native approach for running code on remote targets. There are also frameworks like Hadoop which offer a more complete solution and might be more appropriate.
For calling native code, it's pretty ugly - Sun really discourages use by making JNI an ugly complicated mess, but it is possible. I know that there was at least one commercial Java framework for loading & executing native dynamic libraries without needing to worry about JNI (not sure if there are any free or OSS projects).

Good luck.

Vitali 2009-10-07 17:40:29


Java Multi-Threading Beginner Questions
