I've got a doozy of a problem here. I'm aiming to build a framework to allow for the integration of different traffic simulation models. This integration is based upon the sharing of link connectivities, link costs, and vehicles between simulations.

To make a distributed simulation, I plan to have a 'coordinator' (star topology). All participating simulations simply register with it, and talk only to the coordinator. The coordinator then coordinates the execution of various tasks between each simulation.

A quick example of a distribution problem: one simulation is 'in charge' of certain objects, such as a road, while another is 'in charge' of other roads. These roads are interconnected, so the simulations need to be synchronised and must be able to exchange data and invoke methods remotely.

I've had a look at RMI and think it may be suited to this task, since it would save me from having to design my own over-the-wire signalling protocol.

Is this sane? The issue here is that simulation participants need to centralize some of their data storage in the 'coordinator' to ensure explicit synchronisation between simulations. Furthermore, some simulations may require components or methods from other simulations (hence the idea of using RMI).

My basic approach is to have the 'coordinator' run a giant RMI registry. And every simulation simply looks up everything in the registry, ensuring that the correct objects are used at each step.
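To make the registry idea concrete, here is a minimal sketch of what it could look like with plain java.rmi. The LinkCostStore interface, the "linkCosts" binding name and the use of port 1099 on localhost are all assumptions made for illustration; in a real setup the coordinator and each simulation would run in separate JVMs.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class RmiCoordinatorSketch {

    /** Hypothetical remote view of the link costs held by the coordinator. */
    public interface LinkCostStore extends Remote {
        void putCost(String linkId, double cost) throws RemoteException;
        double getCost(String linkId) throws RemoteException;
    }

    /** Coordinator-side implementation: a thread-safe map behind the remote interface. */
    static class InMemoryLinkCostStore implements LinkCostStore {
        private final Map<String, Double> costs = new ConcurrentHashMap<>();

        public void putCost(String linkId, double cost) {
            costs.put(linkId, cost);
        }

        public double getCost(String linkId) {
            // NaN signals "no cost published yet" in this sketch.
            return costs.getOrDefault(linkId, Double.NaN);
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumption: everything runs on one machine, so advertise the loopback address.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");

        // Coordinator side: export the object and bind its stub in a registry.
        InMemoryLinkCostStore impl = new InMemoryLinkCostStore();
        LinkCostStore stub = (LinkCostStore) UnicastRemoteObject.exportObject(impl, 0);
        Registry registry = LocateRegistry.createRegistry(1099);
        registry.rebind("linkCosts", stub);

        // Simulation side (normally a different JVM): look the object up and use it.
        Registry remote = LocateRegistry.getRegistry("127.0.0.1", 1099);
        LinkCostStore store = (LinkCostStore) remote.lookup("linkCosts");
        store.putCost("road-42", 12.5);
        System.out.println("road-42 cost = " + store.getCost("road-42"));

        // Unexport both objects so the JVM can exit.
        UnicastRemoteObject.unexportObject(impl, true);
        UnicastRemoteObject.unexportObject(registry, true);
    }
}
```

Every call on store is a network round trip to the coordinator, which is worth keeping in mind when deciding how fine-grained the remote interfaces should be.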

Anyone have any tips for heading down this path?

A: 

Take a look at JINI, it might be of some use to you.

Dev er dev
Why do people vote down without a comment?
Mystic
+5  A: 

Is this sane? IMHO no. And I'll tell you why. But first I'll add the disclaimer that this is a complicated topic so any answer has to be viewed as barely scratching the surface.

First, instead of repeating myself, I'll point you to a summary of Java grid/cluster technologies that I wrote a while ago. It's a mostly complete list.

The star topology is "natural" for a "naive" (I don't mean that in a bad way) implementation because point-to-point is simple and centralizing key controller logic is also simple. It is, however, not fault-tolerant. It introduces scalability problems and a single bottleneck. It also introduces communication inefficiencies (namely, the points communicate via a two-step process through the center).

What you really want for this is probably a cluster (rather than a data/compute grid) solution and I'd suggest you look at Terracotta. Ideally you'd look at Oracle Coherence but it's no doubt expensive (compared to free). It is a fantastic product though.

These two products can be used a number of ways but the core of both is to treat a cache like a distributed map. You put things in, you take things out and you fire off code that alters the cache. Coherence (with which I'm more familiar) in this regard scales fantastically well. These are more "server"-based products though, for a true cluster.

If you're looking at a more distributed model then perhaps you should be looking at more of an SOA-based approach.
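As a rough illustration of the "cache as a map" idea, the sketch below programs against java.util.concurrent.ConcurrentMap. It is only a single-JVM stand-in: the SharedLinkCosts class and its method names are made up for this example, and a clustered product would supply its own map implementation in place of ConcurrentHashMap.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class SharedLinkCosts {
    // Assumption: a local map stands in for the clustered cache.
    private final ConcurrentMap<String, Double> costs = new ConcurrentHashMap<>();

    /** Put something in: one simulation publishes the cost of a link it owns. */
    void publish(String linkId, double cost) {
        costs.put(linkId, cost);
    }

    /** Take something out: another simulation reads that cost. */
    double read(String linkId) {
        // Unknown links are treated as impassable in this sketch.
        return costs.getOrDefault(linkId, Double.POSITIVE_INFINITY);
    }

    /** Fire off code that alters the entry: an atomic per-key update. */
    double addDelay(String linkId, double delay) {
        return costs.merge(linkId, delay, Double::sum);
    }

    public static void main(String[] args) {
        SharedLinkCosts shared = new SharedLinkCosts();
        shared.publish("road-1", 10.0);
        shared.addDelay("road-1", 2.5);
        System.out.println(shared.read("road-1")); // prints 12.5
    }
}
```

The point of the pattern is that participants never talk to each other directly; they only read and update shared entries, and the update function is applied atomically per key.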

cletus
cletus, it seems to me that the star topology you describe sounds an awful lot like an ESB, so all your criticisms apply. Would you agree?
duffymo
A: 

Well, Jini, or more specifically JavaSpaces, is a good place to start for a simple approach to the problem. JavaSpaces lets you implement a master-worker model, where your master (the coordinator in your case) writes tasks to the JavaSpace, and the workers query for and process those tasks, writing the results back for the master. Since your problem is not embarrassingly parallel, and your workers need to synchronize and exchange data, this will add some complexity to your solution.

Using JavaSpaces will add a whole lot more abstraction to your implementation than using plain RMI (which is used by the Jini framework internally as the default "wire protocol").

Have a look at this article from Sun for an intro.

And Jan Newmarch's Jini Tutorial is a pretty good place to start learning Jini.
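The master-worker flow described above can be sketched without any Jini dependency. In this sketch two in-memory BlockingQueues stand in for the space (one for tasks, one for results); that substitution, the Task and Result types and the placeholder cost formula are all assumptions for illustration and not the JavaSpaces API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class MasterWorkerSketch {
    /** Hypothetical unit of work: recompute the cost of one link. */
    static final class Task {
        final String linkId;
        final int vehicles;
        Task(String linkId, int vehicles) { this.linkId = linkId; this.vehicles = vehicles; }
    }

    /** Hypothetical result written back for the master. */
    static final class Result {
        final String linkId;
        final double cost;
        Result(String linkId, double cost) { this.linkId = linkId; this.cost = cost; }
    }

    // Sentinel that tells a worker to stop.
    static final Task STOP = new Task("", -1);

    static Map<String, Double> run(List<Task> tasks, int workers) throws InterruptedException {
        BlockingQueue<Task> taskSpace = new LinkedBlockingQueue<>();
        BlockingQueue<Result> resultSpace = new LinkedBlockingQueue<>();

        List<Thread> pool = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            Thread worker = new Thread(() -> {
                try {
                    // Workers take tasks, process them and write results back.
                    for (Task t = taskSpace.take(); t != STOP; t = taskSpace.take()) {
                        double cost = 1.0 + 0.5 * t.vehicles; // placeholder cost model
                        resultSpace.put(new Result(t.linkId, cost));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            pool.add(worker);
        }

        // Master: write all tasks, then collect exactly one result per task.
        for (Task t : tasks) taskSpace.put(t);
        Map<String, Double> costs = new TreeMap<>();
        for (int i = 0; i < tasks.size(); i++) {
            Result r = resultSpace.take();
            costs.put(r.linkId, r.cost);
        }

        // Shut the workers down with one sentinel each.
        for (int i = 0; i < workers; i++) taskSpace.put(STOP);
        for (Thread worker : pool) worker.join();
        return costs;
    }

    public static void main(String[] args) throws InterruptedException {
        List<Task> tasks = List.of(new Task("road-a", 2), new Task("road-b", 4));
        System.out.println(run(tasks, 2)); // prints {road-a=2.0, road-b=3.0}
    }
}
```

Note that the workers here never talk to each other. Tasks that depend on a neighbouring road's state would need an extra synchronisation step between rounds, which is the added complexity mentioned above.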

Mystic
If you're going to go Javaspaces, personally I would only go for GigaSpaces http://www.gigaspaces.com/ (commercial though).
cletus
+1 on GigaSpaces. I thought Jini was a wonderful idea when I first heard about it at JavaOne in 1999, but I don't see that it's ever gotten traction. Explanations might be high licensing costs, proprietary wire protocol, etc. Feels dead as a doornail. I'm surprised by the recommendation.
duffymo
Jini is conceptually clean and simple, you can use any wire protocol you like, and it's open and free for commercial use. I'm not sure what you mean by dead, but as for popularity, I think it was a bad marketing strategy from Sun when they showcased Jini as a technology for (small) devices.
Mystic
"dead" == "I'm not aware of anybody in my small world talking or thinking about or using Jini." Admittedly my world is limited.
duffymo
Well ok :), but calling it dead is a little inappropriate. Being a lightweight framework for service-oriented computing, it can only be dead if service-oriented computing is dead, because that's what Jini does. If one were to call Jini dead, RMI would have been dead long before.
Mystic
Jini as I knew it implied all Java participants. The advantage that I see with SOA, via either HTTP or XML, is that the protocols are open and endpoints can be written in either Java or .NET.
duffymo
I think that is still true. But implementing a .NET service/client should be no biggy. Quite true, Jini is SOA and does not mandate what protocols are used. In fact, you could implement Web Services on top of Jini.
Mystic
A: 

GridGain is a good alternative. They have a map/reduce implementation with "direct API support for split and aggregation" and "distributed task session". You can browse their examples and see if some of them fit your needs.

Kind Regards

marcospereira
+5  A: 

You may want to check out Hazelcast also. Hazelcast is an open source transactional, distributed/partitioned implementation of queue, topic, map, set, list, lock and executor service. It is super easy to work with; just add hazelcast.jar into your classpath and start coding. Almost no configuration is required.

If you are interested in executing your Runnable, Callable tasks in a distributed fashion, then please check out Distributed Executor Service documentation at http://code.google.com/docreader/#p=hazelcast
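For a feel of what "executing your Callable tasks in a distributed fashion" means in code, here is a sketch written purely against java.util.concurrent. It assumes the distributed executor is used the same way as a standard ExecutorService and that tasks must be Serializable to travel between nodes; a local thread pool stands in for the cluster, and the LinkCostTask class is hypothetical. No Hazelcast API is called.

```java
import java.io.Serializable;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical task: travel time over one link, in minutes. */
class LinkCostTask implements Callable<Double>, Serializable {
    private static final long serialVersionUID = 1L;

    private final double lengthKm;
    private final double speedKmh;

    LinkCostTask(double lengthKm, double speedKmh) {
        this.lengthKm = lengthKm;
        this.speedKmh = speedKmh;
    }

    @Override
    public Double call() {
        return lengthKm * 60.0 / speedKmh;
    }

    /** Submits the task to whichever executor is supplied and waits for the answer. */
    static double submitAndWait(ExecutorService executor, LinkCostTask task) throws Exception {
        Future<Double> future = executor.submit(task);
        return future.get();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: a local pool stands in for the distributed executor.
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            System.out.println(submitAndWait(executor, new LinkCostTask(3.0, 60.0))); // prints 3.0
        } finally {
            executor.shutdown();
        }
    }
}
```

Because the task only depends on its own fields, it does not matter which node ends up running it, which is the property that makes this style easy to distribute.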

Hazelcast is released under Apache license and enterprise grade support is also available.

Talip Ozturk
I never knew about Hazelcast. Was amazed with its simplicity and elegance. Wonder why you never got an up-vote. +1
Mystic
+1  A: 

Have you considered using a message queue approach? You could use JMS to communicate/coordinate tasks and results among a set of servers/nodes. You could even use Amazon's SQS (Simple Queue Service: aws.amazon.com/sqs) and have your servers running on EC2 to allow you to scale up and down as required.

Just my 2 cents.

simonlord
Yeah, I'd agree that a message queue approach seems to be the most sound (perhaps an ESB-ish solution), but the return on the time spent setting up the infrastructure is rather low. The project I'm working on is really a prototype.
Alex Lim
+1  A: 

Have a look at http://www.terracotta.org/

It's a distributed Java VM, so it has the advantage that a clustered application looks no different from a standard Java application.

I have used it in applications and the speed is very impressive so far.

Paul

Paul Whelan
A: 

Just as an addition to the other answers, which as far as I have seen all focus on grid and cloud computing: you should note that simulation models have one unique characteristic, namely simulation time.

When running distributed simulation models in parallel and keeping them synchronized, I see two options:

  • When each simulation model has its own simulation clock and event list then these should be synchronized over the network.
  • Alternatively there could be a single simulation clock and event list which will "tick the time" for all distributed (sub) models.

The first option has been extensively researched for the High Level Architecture (HLA); see for example http://en.wikipedia.org/wiki/IEEE_1516 as a starter.

However, the second option seems simpler to me, and has less overhead.
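A minimal sketch of that second option, assuming a coordinator that owns the only clock and event list and hands each event to the sub-model it belongs to. The SubModel interface, the vehicle hand-off scenario and the fixed 5.0 time units of travel are invented for illustration; in a distributed setup the handle call would be a remote invocation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

class SingleClockSketch {
    /** Hypothetical contract for a participating sub-model. */
    interface SubModel {
        void handle(double now, String vehicle, Coordinator clock);
    }

    static final class Event implements Comparable<Event> {
        final double time;
        final long seq; // keeps events with equal times in insertion order
        final SubModel target;
        final String vehicle;

        Event(double time, long seq, SubModel target, String vehicle) {
            this.time = time;
            this.seq = seq;
            this.target = target;
            this.vehicle = vehicle;
        }

        @Override
        public int compareTo(Event other) {
            int byTime = Double.compare(time, other.time);
            return byTime != 0 ? byTime : Long.compare(seq, other.seq);
        }
    }

    /** Owns the single simulation clock and the single event list. */
    static final class Coordinator {
        private final PriorityQueue<Event> events = new PriorityQueue<>();
        private double now;
        private long nextSeq;

        void schedule(double time, SubModel target, String vehicle) {
            if (time < now) throw new IllegalArgumentException("cannot schedule in the past");
            events.add(new Event(time, nextSeq++, target, vehicle));
        }

        void runUntil(double endTime) {
            while (!events.isEmpty() && events.peek().time <= endTime) {
                Event e = events.poll();
                now = e.time; // the clock only ever moves forward
                e.target.handle(now, e.vehicle, this);
            }
        }
    }

    static List<String> demo() {
        List<String> log = new ArrayList<>();
        Coordinator clock = new Coordinator();
        SubModel roadB = (now, vehicle, c) -> log.add(now + " " + vehicle + " enters road B");
        SubModel roadA = (now, vehicle, c) -> {
            log.add(now + " " + vehicle + " enters road A");
            c.schedule(now + 5.0, roadB, vehicle); // hand the vehicle off to the other model
        };
        clock.schedule(0.0, roadA, "car-1");
        clock.schedule(2.0, roadA, "car-2");
        clock.runUntil(100.0);
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Running it prints the four events in time order (0.0, 2.0, 5.0, 7.0), even though the two road models never see each other's clocks. The price is that every event goes through the coordinator, so sub-models cannot advance independently.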

Roy