I'm writing a Java program which uses a lot of CPU because of the nature of what it does. However, lots of it can run in parallel. When I run it, it only seems to use one CPU until it needs more, and then it uses another CPU. Is there anything I can do in Java to force different threads to run on different cores/CPUs?
The easiest thing to do is break your program into multiple processes. The OS will allocate them across the cores.
Somewhat harder is to break your program into multiple threads and trust the JVM to allocate them properly. This is -- generally -- what people do to make use of available hardware.
Edit
How can a multi-processing program be "easier"? Here's a step in a pipeline.
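import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;

public class SomeStep {
    public static void main(String[] args) throws IOException {
        // System.in/System.out are byte streams, so wrap them in character streams first.
        BufferedReader stdin = new BufferedReader(new InputStreamReader(System.in));
        BufferedWriter stdout = new BufferedWriter(new OutputStreamWriter(System.out));
        String line = stdin.readLine();
        while (line != null) {
            // process line, writing to stdout
            line = stdin.readLine();
        }
        stdout.flush(); // push buffered output to the next step in the pipe
    }
}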
Each step in the pipeline is similarly structured: only a few lines of overhead around whatever processing is included.
This may not be the absolute most efficient. But it's very easy.
The overall structure of your concurrent processes is not a JVM problem. It's an OS problem, so use the shell.
java -cp pipline.jar FirstStep | java -cp pipline.jar SomeStep | java -cp pipline.jar LastStep
The only thing left is to work out some serialization for your data objects in the pipeline.
Standard Serialization works well. Read http://java.sun.com/developer/technicalArticles/Programming/serialization/ for hints on how to serialize. You can replace the BufferedReader and BufferedWriter with ObjectInputStream and ObjectOutputStream to accomplish this.
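As a rough sketch of that swap, a pipeline step reading and writing serialized objects could look something like the following. The class name ObjectStep and the process() helper are made up for illustration, and it assumes the upstream step writes its output with an ObjectOutputStream as well.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;

public class ObjectStep {
    // Hypothetical per-record work; a real step would do its own processing here.
    static Object process(Object record) {
        return String.valueOf(record).toUpperCase();
    }

    // Reads serialized objects from 'in' until end of stream and writes processed ones to 'out'.
    static void run(InputStream in, OutputStream out) throws IOException, ClassNotFoundException {
        ObjectOutputStream oos = new ObjectOutputStream(out);
        oos.flush(); // send the stream header now so the next step can open its ObjectInputStream
        ObjectInputStream ois = new ObjectInputStream(in);
        try {
            while (true) {
                oos.writeObject(process(ois.readObject()));
            }
        } catch (EOFException endOfInput) {
            // readObject() reports end of stream with EOFException; nothing left to read.
        }
        oos.flush();
    }

    public static void main(String[] args) throws Exception {
        run(System.in, System.out);
    }
}
```

Keeping the loop in run() rather than main() means the step can be exercised with in-memory streams before wiring it into a shell pipeline.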
There are two basic ways to multi-thread in Java. Each logical task you create with these methods should run on a fresh core when needed and available.
Method one: define a Runnable, or subclass Thread (a Thread can also take a Runnable in its constructor), and start it running with the Thread.start() method. It will execute on whatever core the OS gives it -- generally the least loaded one.
Tutorial: Defining and Starting Threads
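A minimal sketch of method one, assuming Java 8+ for the lambda syntax. The class name StartThreads and the runAll() helper are just for illustration.

```java
public class StartThreads {
    // Starts one thread per Runnable, then waits for all of them to finish.
    static void runAll(Runnable... jobs) throws InterruptedException {
        Thread[] threads = new Thread[jobs.length];
        for (int i = 0; i < jobs.length; i++) {
            threads[i] = new Thread(jobs[i], "worker-" + i);
            threads[i].start(); // start() returns immediately; run() executes on the new thread
        }
        for (Thread t : threads) {
            t.join(); // after join(), the thread's writes are visible to the caller
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final long[] sums = new long[2]; // each thread writes only its own slot
        runAll(
            () -> { for (int i = 0; i < 1_000_000; i++) sums[0] += i; },
            () -> { for (int i = 0; i < 1_000_000; i++) sums[1] += i; });
        System.out.println(sums[0] + " " + sums[1]);
    }
}
```

Which core each thread lands on is still up to the OS scheduler; the code only makes the work available to run in parallel.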
Method two: define objects implementing the Runnable (if they don't return values) or Callable (if they do) interface, which contain your processing code. Pass these as tasks to an ExecutorService from the java.util.concurrent package. The java.util.concurrent.Executors class has a bunch of methods to create standard, useful kinds of ExecutorServices. Link to Executors tutorial.
From personal experience, the Executors fixed & cached thread pools are very good, although you'll want to tweak thread counts. Runtime.getRuntime().availableProcessors() can be used at run-time to count available cores. You'll need to shut down thread pools when your application is done, otherwise the application won't exit because the ThreadPool threads stay running.
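Here is one way the sizing and shutdown advice might look in practice. It is a sketch, not the only way to do it: PoolLifecycle and sumTo() are invented names, the 10-second timeout is arbitrary, and it assumes Java 8+ for the lambda.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class PoolLifecycle {
    // Runs one CPU-bound task on a pool sized to the core count and returns its result.
    static long sumTo(long n) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        try {
            Future<Long> f = pool.submit(() -> {
                long total = 0;
                for (long i = 1; i <= n; i++) total += i;
                return total;
            });
            return f.get(); // blocks until the task has finished
        } finally {
            pool.shutdown(); // stop accepting work; without this the JVM will not exit
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                pool.shutdownNow(); // interrupt anything still running after the timeout
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumTo(100));
    }
}
```

In a real application you would normally create the pool once and reuse it for many tasks, shutting it down only when the program is finished.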
Getting good multicore performance is sometimes tricky, and full of gotchas:
- Disk I/O can slow down a LOT when run in parallel, particularly on a single spinning disk. As a rule of thumb, only one thread should do disk read/write at a time.
- Synchronization of objects provides safety to multi-threaded operations, but slows down work.
- If tasks are too trivial (small work bits, execute fast) the overhead of managing them in an ExecutorService costs more than you gain from multiple cores.
- Creating new Thread objects is slow. The ExecutorServices will try to re-use existing threads if possible.
- All sorts of crazy stuff can happen when multiple threads work on something. Keep your system simple and try to make tasks logically distinct and non-interacting.
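To illustrate a couple of the points above (coarse-grained tasks, no shared mutable state, no synchronization inside the hot loop), here is a sketch that sums an array in independent chunks. ChunkedSum and parallelSum() are hypothetical names, and it assumes Java 8+.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ChunkedSum {
    // Splits 'data' into 'chunks' ranges (chunks >= 1); each task sums only its own range.
    static long parallelSum(int[] data, int chunks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(chunks);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            int chunkSize = (data.length + chunks - 1) / chunks; // round up
            for (int c = 0; c < chunks; c++) {
                final int from = Math.min(data.length, c * chunkSize);
                final int to = Math.min(data.length, from + chunkSize);
                tasks.add(() -> {
                    long local = 0; // task-local accumulator, so no locking is needed
                    for (int i = from; i < to; i++) local += data[i];
                    return local;
                });
            }
            long total = 0;
            for (Future<Long> f : pool.invokeAll(tasks)) {
                total += f.get(); // results are combined on the calling thread only
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int[] data = new int[1_000_000];
        java.util.Arrays.fill(data, 1);
        System.out.println(parallelSum(data, Runtime.getRuntime().availableProcessors()));
    }
}
```

The tasks only read the shared array and never write to anything another task touches, which is what keeps them "logically distinct and non-interacting".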
One other problem: controlling work is hard! A good practice is to have one manager thread that creates and submits tasks, and then a couple of worker threads with work queues (using an ExecutorService).
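One possible shape for that manager/worker split is sketched below, using ExecutorCompletionService so the manager can pick up results as workers finish them. The class name Manager and the squaring "job" are placeholders, and it assumes Java 8+.

```java
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class Manager {
    // The calling thread acts as the manager: it submits jobs, then collects results.
    static long runJobs(int jobs, int workers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers); // workers share the pool's queue
        CompletionService<Long> done = new ExecutorCompletionService<>(pool);
        try {
            for (int j = 1; j <= jobs; j++) {
                final long n = j;
                done.submit(() -> n * n); // placeholder job: square the job number
            }
            long total = 0;
            for (int j = 0; j < jobs; j++) {
                total += done.take().get(); // take() blocks until some job has completed
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runJobs(4, 2));
    }
}
```

Only the manager touches the running total, so the workers never need to coordinate with each other.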
I'm just touching on key points here -- multithreaded programming is considered one of the hardest programming subjects by many experts. It's non-intuitive, complex, and the abstractions are often weak.
Edit -- Example using ExecutorService:
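import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public abstract class TaskThreader {
    // One task: runs the three steps in order on a single input.
    class DoStuff implements Callable<Object> {
        private Object in;

        public DoStuff(Object input) {
            in = input;
        }

        public Object call() {
            in = doStep1(in);
            in = doStep2(in);
            in = doStep3(in);
            return in;
        }
    }

    public abstract Object doStep1(Object input);
    public abstract Object doStep2(Object input);
    public abstract Object doStep3(Object input);
    // Placeholder for whatever you do with each finished result.
    public abstract void write(Object result);

    // Call this on a concrete subclass, e.g. from its main(), with your own inputs.
    public void runAll(List<Object> inputs) throws Exception {
        ExecutorService exec = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        List<Callable<Object>> tasks = new ArrayList<Callable<Object>>();
        for (Object input : inputs) {
            tasks.add(new DoStuff(input));
        }
        List<Future<Object>> results = exec.invokeAll(tasks); // blocks until every task is done
        exec.shutdown();
        for (Future<Object> f : results) {
            write(f.get());
        }
    }
}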
You should write your program to do its work in the form of a lot of Callables handed to an ExecutorService and executed with invokeAll(...).
You can then choose a suitable implementation at runtime from the Executors class. A suggestion would be to call Executors.newFixedThreadPool() with a number roughly corresponding to the number of CPU cores to keep busy.
When I run it, it only seems to use one CPU until it needs more then it uses another CPU - is there anything I can do in Java to force different threads to run on different cores/CPUs?
I interpret this part of your question as meaning that you have already addressed the problem of making your application multi-thread capable. And despite that, it doesn't immediately start using multiple cores.
The answer to "is there any way to force ..." is (AFAIK) not directly. Your JVM and/or the host OS decide how many 'native' threads to use, and how those threads are mapped to physical processors. You do have some options for tuning. For example, I found this page which talks about how to tune Java threading on Solaris. And this page talks about other things that can slow down a multi-threaded application.
There is no way to set CPU affinity in Java. http://bugs.sun.com/bugdatabase/view%5Fbug.do?bug%5Fid=4234402
If you have to do it, use JNI to create native threads and set their affinity.
"It's do-able in <100 lines of very simple code"
I think this issue is related to the Java Parallel Processing Framework (JPPF). Using this you can run different jobs on different processors.
First, you should prove to yourself that your program would run faster on multiple cores. Many operating systems put effort into running program threads on the same core whenever possible.
Running on the same core has many advantages. The CPU cache is hot, meaning that data for that program is loaded into the CPU. The lock/monitor/synchronization objects are in CPU cache which means that other CPUs do not need to do cache synchronization operations across the bus (expensive!).
One thing that can very easily make your program run on the same CPU all the time is over-use of locks and shared memory. Your threads should not talk to each other. The less often your threads use the same objects in the same memory, the more often they will run on different CPUs. The more often they use the same memory, the more often they must block waiting for the other thread.
Whenever the OS sees one thread block for another thread, it will run that thread on the same CPU whenever it can. It reduces the amount of memory that moves over the inter-CPU bus. That is what I guess is causing what you see in your program.
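One rough way to check the first point of this answer (whether more cores actually help) is to time the same batch of independent CPU-bound tasks on one thread and then on several. This is only a sketch: SpeedCheck, burn() and the task counts are invented for illustration, it assumes Java 8+, and a serious measurement would need JIT warm-up and repeated runs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SpeedCheck {
    // Stand-in for real CPU-bound work; touches no shared state.
    static long burn(int iterations) {
        long x = 0;
        for (int i = 0; i < iterations; i++) x += (i * 31L) ^ (x >>> 3);
        return x;
    }

    // Runs 'tasks' copies of burn() on a pool of 'threads' threads; returns elapsed milliseconds.
    static long timeMillis(int threads, int tasks, int iterations) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> work = new ArrayList<>();
            for (int t = 0; t < tasks; t++) work.add(() -> burn(iterations));
            long start = System.nanoTime();
            for (Future<Long> f : pool.invokeAll(work)) f.get();
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("1 thread:  " + timeMillis(1, 16, 50_000_000) + " ms");
        System.out.println(cores + " threads: " + timeMillis(cores, 16, 50_000_000) + " ms");
    }
}
```

If the multi-threaded run is not noticeably faster for your real workload, the bottleneck is probably shared state, locking or I/O rather than thread placement.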
JVM performance tuning has been mentioned before in http://stackoverflow.com/questions/2867278/why-does-this-java-code-not-utilize-all-cpu-cores/2867731#2867731. Note that this only applies to the JVM, so your application must already be using threads (and more or less "correctly" at that):
http://ch.sun.com/sunnews/events/2009/apr/adworkshop/pdf/5-1-Java-Performance.pdf