I am working on a scientific application that has readily separable parts that can proceed in parallel. So I've written each of those parts to run as an independent thread, though not for what appears to be the standard reason for separating things into threads (i.e., not to avoid blocking some quit command or the like).
A few questions:
Does this actually buy me anything on standard multi-core desktops - i.e., will the threads actually run on the separate cores if I have a current JVM, or do I have to do something else?
I have a few objects which are read (though never written) by all the threads. Are there potential problems with that? If so, what are the solutions?
For actual clusters, can you recommend frameworks to distribute the threads to the various nodes so that I don't have to manage that myself (if such frameworks exist)? CLARIFICATION: by this, I mean something that automatically converts threads into tasks for individual nodes, or makes the entire cluster look like a single JVM (i.e., so it could send threads to whatever processors it can access), or anything along those lines. Basically, I want something that implements the parallelization in a useful way on a cluster, given that I've already built it into the algorithm, with minimal job husbandry on my part.
Bonus: Most of the evaluation consists of set comparisons (e.g., union, intersection, contains), with some mapping from keys to get the pertinent sets. I have some limited experience with FORTRAN, C, and C++ (a semester of scientific computing for the first, and HS AP classes 10 years ago for the other two). What sort of speed or ease-of-parallelization gains might I find if I tied my Java front-end to an algorithmic back-end in one of those languages, and how much of a headache would implementing those operations in those languages be at my level of experience?
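To make the questions concrete, here is a minimal sketch of the kind of structure described above. It is not the actual application code: the class name `SetComparisonSketch`, the methods `buildIndex`, `intersectionSize`, and `runAll`, and the sample keys and values are all hypothetical. It assumes Java 9 or later (for `Map.of` and `Set.of`), uses a fixed thread pool in place of hand-managed threads, and treats the shared key-to-set mapping as built once and only read afterwards.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SetComparisonSketch {

    // Hypothetical shared lookup from keys to sets. Map.of and Set.of return
    // unmodifiable collections, so no thread can write to them by accident.
    static Map<String, Set<Integer>> buildIndex() {
        return Map.of(
                "a", Set.of(1, 2, 3, 4),
                "b", Set.of(3, 4, 5),
                "c", Set.of(4, 6));
    }

    // One independent unit of work: the size of the intersection of two keyed sets.
    // The shared sets are only read; the mutation happens on a private copy.
    static int intersectionSize(Map<String, Set<Integer>> index, String left, String right) {
        Set<Integer> result = new HashSet<>(index.get(left));
        result.retainAll(index.get(right));
        return result.size();
    }

    // Submits one task per pair of keys and collects the results in submission order.
    static List<Integer> runAll(Map<String, Set<Integer>> index, List<String[]> pairs)
            throws InterruptedException, ExecutionException {
        // Assumption: one worker per available processor is a reasonable starting point.
        int workers = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String[] pair : pairs) {
                Callable<Integer> task = () -> intersectionSize(index, pair[0], pair[1]);
                futures.add(pool.submit(task));
            }
            List<Integer> sizes = new ArrayList<>();
            for (Future<Integer> future : futures) {
                sizes.add(future.get()); // blocks until that task has finished
            }
            return sizes;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String[]> pairs = List.of(
                new String[] {"a", "b"},
                new String[] {"b", "c"},
                new String[] {"a", "c"});
        System.out.println(runAll(buildIndex(), pairs)); // prints [2, 1, 1]
    }
}
```

Each task copies one shared set into a private `HashSet` before calling `retainAll`, so the shared data is never modified after construction. Whether that copy is affordable for large sets is a separate question that this sketch does not address.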