Dear friends,

I need advice on best practices for a distributed system architecture. I need to use Java and MS SQL Server 2008. The main program flow is as follows:

  1. New data is loaded into the database, usually from an external process or via ETL.
  2. A main process should periodically check the database for new data.
  3. Once new data is detected, it should be dispatched to one or more data processors.
  4. The system may have different data processors, running on different machines.
  5. Some of the data processors are stateful and others are stateless.
  6. Once data is processed, a result is submitted to a results queue.
  7. The main process collects the results from the queue and writes them to a table in the database.

I need to be able to support scalability, as the data processing load is expected to grow. The system is required to have an administration console, and it must be possible to start and stop data processors remotely. The system will have an administrator interface (I thought of FLEX with Web Services) and a data analysis web interface (I guess I will use FLEX).

Everything will run on Windows.

I thought of using Tomcat for the web services. I am not sure about the main process: should it be a standalone application, or can it be a thread that never exits inside the application server?

I would appreciate any thoughts, help, or links. Thanks.

A: 

Splitting the application logically and defining exact technologies will help. Since you are considering a distributed architecture, an SOA approach should be considered.

  1. Monitoring changes in the database. I am not sure whether the database server provides some sort of callback handler, but if it doesn't, a simple DBChangeMonitor module can be written in Java using a ThreadPoolExecutor (a ScheduledThreadPoolExecutor suits periodic polling). This module can be made a separate service in itself, which allows it to be managed (configuration and lifecycle) through the SOA container. The threads in this module are responsible for:

    • Polling the database for changes at a defined, configurable interval.
    • Publishing each change to a Topic using JMS. ActiveMQ is a trusted JMS implementation and can be used in this case.
  2. Data Processors. Make this module a service as well. This gives it lifecycle methods such as start and stop. The responsibilities of this module are:

    • Listening for the data change event using JMS. Implement a JMS MessageListener using the same JMS provider.
    • Processing the data change.
    • Publishing the processed data on a separate Queue. A Queue is preferred over a Topic here because there aren't multiple modules waiting for this data.
  3. Result Processor. This module is responsible for:

    • Listening for data processed by different DataProcessors.
    • Writing it to database.
  4. SOA Container. The *Spring* (http://www.springsource.org/) container can be used to hold all the services (DatabaseMonitor, DataProcessors, ResultProcessor). Spring provides a lot of out-of-the-box technologies for such use cases. A management console for viewing the running services and configuring their state can easily be incorporated into it.

  5. Web Container. *Apache Tomcat* (http://tomcat.apache.org/) can be used as the web container for the application.
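
To make the monitor and processor modules a bit more concrete, here is a rough sketch of the polling and start/stop parts only. It is not a full implementation: plain in-memory BlockingQueues stand in for the JMS Topic and Queue so the example runs without a broker, and the names DbChangeMonitor, DataProcessor, newRows and logic are made up for illustration. In a real system newRows would wrap a JDBC query for unprocessed rows, and the queues would be replaced by JMS producers and listeners.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;
import java.util.function.Supplier;

/** Polls a source for new rows at a fixed interval and hands each one on. */
class DbChangeMonitor {
    private final Supplier<List<String>> newRows; // hypothetical: would wrap a JDBC query
    private final BlockingQueue<String> changes;  // stand-in for the JMS Topic
    private final long intervalMillis;
    private ScheduledExecutorService scheduler;

    DbChangeMonitor(Supplier<List<String>> newRows, BlockingQueue<String> changes,
                    long intervalMillis) {
        this.newRows = newRows;
        this.changes = changes;
        this.intervalMillis = intervalMillis;
    }

    synchronized void start() {
        if (scheduler != null) return; // already running
        scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(this::pollOnce, 0, intervalMillis,
                TimeUnit.MILLISECONDS);
    }

    synchronized void stop() {
        if (scheduler == null) return;
        scheduler.shutdownNow();
        scheduler = null;
    }

    void pollOnce() {
        try {
            for (String row : newRows.get()) {
                changes.add(row);
            }
        } catch (RuntimeException e) {
            // An uncaught exception would cancel all later scheduled runs.
            System.err.println("poll failed: " + e);
        }
    }
}

/** Takes changes, applies the processing logic, and publishes results. */
class DataProcessor {
    private final BlockingQueue<String> changes;  // stand-in for the JMS Topic subscription
    private final BlockingQueue<String> results;  // stand-in for the JMS results Queue
    private final Function<String, String> logic; // hypothetical processing step
    private Thread worker;

    DataProcessor(BlockingQueue<String> changes, BlockingQueue<String> results,
                  Function<String, String> logic) {
        this.changes = changes;
        this.results = results;
        this.logic = logic;
    }

    synchronized void start() {
        if (worker != null) return; // already running
        worker = new Thread(this::run, "data-processor");
        worker.start();
    }

    synchronized void stop() throws InterruptedException {
        if (worker == null) return;
        worker.interrupt(); // unblocks take() so the loop can exit
        worker.join();
        worker = null;
    }

    private void run() {
        try {
            while (true) {
                results.put(logic.apply(changes.take()));
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop was requested
        }
    }
}
```

The start() and stop() methods are the hooks an administration console or the container would call. The Result Processor would follow the same pattern as DataProcessor, taking from the results queue and writing to the database.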

Hope this sheds a little light on how a solution can be designed for your problem.

Tushar Tarkas