Hello guys,
I need help designing a system for analyzing and visualizing 3D seismic tomography data. I've just graduated, I don't have a lot of experience, and this task just landed on me.
Here is the setup:
- 3 supercomputing clusters. Analysis is currently controlled by a bash script that coordinates 5 executables (each being a stage in the analysis algorithm). The output of one executable is the input to the next. The files are several hundred MB and can take a few hours to produce. (A rough Java sketch of this chaining follows the list.)
- Someone recommended we write a "Control Server" to submit jobs from the clients to the clusters.
- The clients need to be able to display 3D visualizations as the output from each stage in the analysis (the executables above) becomes available.
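To make the pipeline concrete, here is a minimal sketch of what the bash script does today, rewritten with Java's ProcessBuilder. The executable names (stage1 ... stage5) and file names are placeholders, not our real ones:

```java
import java.io.IOException;

// Minimal sketch of the existing pipeline: each stage reads the previous
// stage's output file and produces the next one. Executable names and
// file paths are hypothetical.
public class PipelineSketch {
    public static void main(String[] args) throws IOException, InterruptedException {
        String input = "job42_input.dat";
        for (int stage = 1; stage <= 5; stage++) {
            String output = "job42_stage" + stage + ".dat";
            Process p = new ProcessBuilder("./stage" + stage, input, output)
                    .inheritIO()
                    .start();
            if (p.waitFor() != 0) {
                throw new IOException("stage" + stage + " failed");
            }
            // This is the point where the cluster would tell the CS that
            // this job's stage output is ready for visualization.
            input = output; // output of one stage feeds the next
        }
    }
}
```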
I thought of using a Java Servlet as the Control Server (CS) to accept client requests for jobs. The clients would send a small input file and some parameters to the CS, and the CS would assign the job an ID.
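As a rough sketch of what I mean (the endpoint, ID scheme, and storage are all made up):

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical CS endpoint: the client POSTs the small input file plus
// parameters, the CS assigns a job ID and (not shown) forwards the job
// to one of the clusters.
public class SubmitJobServlet extends HttpServlet {
    private final AtomicLong nextJobId = new AtomicLong(1);

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        long jobId = nextJobId.getAndIncrement();
        // Read the uploaded input file from req.getInputStream() and stash
        // it somewhere a cluster can reach (shared FS, scp, HTTP pull, ...).
        // Omitted here.
        resp.setContentType("text/plain");
        resp.getWriter().println(jobId); // client keeps this ID for polling
    }
}
```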
The CS would then somehow start the process on the clusters (RMI? HTTP? Another servlet? Some custom server listening on a socket?). When each stage of the computation on the clusters is done, it would somehow notify the CS that a specific file is ready. (Again, what technology is appropriate for contacting the CS?)
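One simple option I can picture for the notification is plain HTTP: the bash script hits a CS URL after each stage (e.g. with curl). A sketch of the receiving side, with invented URL and parameter names:

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical callback endpoint the clusters hit when a stage finishes, e.g.
// curl "http://cs-host/stageDone?jobId=42&stage=3&url=http://cluster1/files/job42_stage3.dat"
public class StageDoneServlet extends HttpServlet {
    // "jobId:stage" -> download URL of the finished output file
    static final Map<String, String> readyFiles = new ConcurrentHashMap<>();

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        String key = req.getParameter("jobId") + ":" + req.getParameter("stage");
        readyFiles.put(key, req.getParameter("url"));
        resp.setStatus(HttpServletResponse.SC_NO_CONTENT);
    }
}
```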
Then, when a client wants a specific output file for visualization, it would ask the CS for the file's address on the clusters and download it directly from the cluster.
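The client side of that hand-off might look like the following. The CS endpoint and the assumption that the cluster serves files over HTTP are both mine, not decided:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical client flow: ask the CS where the stage output lives,
// then stream the (hundreds of MB) file straight from the cluster.
public class FetchStageOutput {
    public static void main(String[] args) throws IOException {
        URL locate = new URL("http://cs-host/fileLocation?jobId=42&stage=3");
        String fileUrl;
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(locate.openStream()))) {
            fileUrl = in.readLine(); // CS replies with the cluster-side URL
        }
        Path dest = Paths.get("job42_stage3.dat");
        try (InputStream data = new URL(fileUrl).openStream()) {
            Files.copy(data, dest, StandardCopyOption.REPLACE_EXISTING);
        }
        System.out.println("Downloaded to " + dest);
    }
}
```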
I was thinking of using a Java Applet for the client so I could use the OpenGL API (through a binding like JOGL) for the 3D visualization.
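If I go that route, I assume the scaffolding would look roughly like this JOGL sketch (shown in a JFrame rather than an applet just to keep it short; nothing seismic-specific yet):

```java
import com.jogamp.opengl.GL2;
import com.jogamp.opengl.GLAutoDrawable;
import com.jogamp.opengl.GLEventListener;
import com.jogamp.opengl.awt.GLCanvas;
import javax.swing.JFrame;

// Bare-bones JOGL scaffolding: the display() callback is where the
// tomography volume rendering would go. JOGL is my assumption; any
// Java OpenGL binding would be structured similarly.
public class VolumeViewer implements GLEventListener {
    public static void main(String[] args) {
        GLCanvas canvas = new GLCanvas();
        canvas.addGLEventListener(new VolumeViewer());
        JFrame frame = new JFrame("Seismic volume");
        frame.add(canvas);
        frame.setSize(800, 600);
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        frame.setVisible(true);
    }

    @Override public void init(GLAutoDrawable d) {
        d.getGL().getGL2().glClearColor(0f, 0f, 0f, 1f);
    }

    @Override public void display(GLAutoDrawable d) {
        GL2 gl = d.getGL().getGL2();
        gl.glClear(GL2.GL_COLOR_BUFFER_BIT | GL2.GL_DEPTH_BUFFER_BIT);
        // ... draw slices / isosurfaces of the downloaded stage output here
    }

    @Override public void reshape(GLAutoDrawable d, int x, int y, int w, int h) { }
    @Override public void dispose(GLAutoDrawable d) { }
}
```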
I'm pretty lost, and the solution above sounds pretty clumsy to me. What technologies would be better suited? Should I use HTTP to transfer the data? Should I code my own servers with the socket API and use Java I/O streams? How do the clusters communicate back to the CS that a file is ready after a few hours? Do I even want to use Java Servlets?