We have a large high-performance software system which consists of multiple interacting Java processes (not EJBs). Each process can be on the same machine or on a different machine.

Certain events are generated in one process, and are then propagated in different ways to other processes for further processing and so on.

For benchmarking purposes, we need to log when each event passes through a "checkpoint" and eventually combine these logs into a timeline showing how each event propagated through the system and with what latency (process switching and IPC add latency, of course, which is OK).

The problem, of course, is clock synchronization. So here are my questions:

1) If all processes are on the same machine, is System.currentTimeMillis() guaranteed to be accurate at the time of the call? Is there some bound on its error?

2) If some processes may be on different machines, is there an off-the-shelf solution (that is also free or open-source) for clock synchronization? I am preferably looking for a solution that may bypass the operating system (Windows or Linux) and work straight from Java. I am also ideally looking for something that can operate at microsecond accuracy. I've thought about NTP, but I'm not sure if it's available via Java rather than through the OS, and I am not sure about its complexity.

3) Is there a way to determine the margin of error in using NTP in a particular configuration (or of any solution I end up using) so that I can give a margin of error on our calculation of the latency?

Thanks!

+3  A: 

With distributed programming, clock synchronisation is often not enough. You might want to build a logical time framework (such as Lamport clocks, vector clocks, or the Singhal-Kshemkalyani method; there are many more ways to keep causality consistent across machines). Which one you choose depends on the application and the causality required between events.

Clocks are synchronised so that events can be placed in the right sequential order. There are ways to do this other than keeping the system clocks synchronised, which is quite tricky unless the machines share a common physical clock.
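To make the logical-time idea concrete, here is a minimal sketch of a Lamport clock in Java. The class and method names (LamportClock, tick, onReceive) are made up for illustration and do not come from any library; it assumes each process owns one instance and attaches the stamp to every message it sends.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Minimal Lamport logical clock; one instance per process (illustrative names). */
final class LamportClock {
    private final AtomicLong counter = new AtomicLong(0);

    /** Call for a local event or just before sending; returns the stamp to attach. */
    long tick() {
        return counter.incrementAndGet();
    }

    /** Call when a message arrives carrying the sender's stamp. */
    long onReceive(long remoteStamp) {
        // Jump past whatever the sender had seen, then count the receive event.
        return counter.updateAndGet(local -> Math.max(local, remoteStamp) + 1);
    }

    /** Current value without advancing the clock. */
    long current() {
        return counter.get();
    }

    public static void main(String[] args) {
        LamportClock processA = new LamportClock();
        LamportClock processB = new LamportClock();
        long sent = processA.tick();              // event generated in process A
        long received = processB.onReceive(sent); // checkpoint reached in process B
        System.out.println("sent=" + sent + " received=" + received);
    }
}
```

Note that a logical clock like this only gives a consistent ordering of checkpoints; it says nothing about elapsed wall-clock time, so it would complement physical timestamps for latency measurement rather than replace them.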

In terms of NTP error margin, there are solutions:

My recommendation:

Read: Distributed Computing: Principles, Algorithms and Systems

Especially: Chapter 3, logical time

Edit

Further to Cheeso's post, I found

http://www.uniforum.org/publications/ufm/apr96/opengroup.html

http://sourceforge.net/projects/freedce

There may be DCE Java bindings out there.

Aiden Bell
I remember the book from college. Our project right now is looking for something quick and reasonably accurate; we can't be the first people to need something like this. I was wondering if there is an off-the-shelf implementation.
Uri
Not that I know of, but with the JVM there will be a lot of latency. The program environment isn't tightly coupled and can stall (on GC and so on). I'm not a Java expert, however.
Aiden Bell
A: 

The old DCE ("Distributed Computing Environment") used to have a distributed time sync solution with all those capabilities. It was called DTS. The admin could configure the set of machines to sync, and the latency or uncertainty was calculated and available. If any machine got out of sync, its clock was slowly adjusted until it was in sync again. There was a guarantee that time on any machine would never be adjusted backward (in violation of basic physics). The network needed at least one NTP input in order to stay synced with "the real world".
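The "adjust slowly, never go backward" behaviour can be sketched in Java roughly as follows. This is not DTS's actual algorithm, just an illustration: SlewingClock, adjust, and nowMicros are hypothetical names, and for simplicity the correction is absorbed a bounded amount per read, whereas a real implementation would slew per unit of elapsed time.

```java
import java.util.function.LongSupplier;

/** Sketch of a clock that absorbs corrections gradually and never runs backward. */
final class SlewingClock {
    private final LongSupplier rawMicros;  // underlying local time source (assumed, injected)
    private final long maxSlewPerRead;     // largest correction applied per read, in microseconds
    private long pendingCorrection = 0;    // correction still waiting to be absorbed
    private long appliedCorrection = 0;    // correction absorbed so far
    private long lastReturned = Long.MIN_VALUE;

    SlewingClock(LongSupplier rawMicros, long maxSlewPerRead) {
        this.rawMicros = rawMicros;
        this.maxSlewPerRead = maxSlewPerRead;
    }

    /** Record a measured offset (reference minus local) to be absorbed gradually. */
    synchronized void adjust(long offsetMicros) {
        pendingCorrection += offsetMicros;
    }

    /** Corrected time; guaranteed not to be less than any earlier return value. */
    synchronized long nowMicros() {
        // Absorb at most maxSlewPerRead of the outstanding correction on this read.
        long step = Math.max(-maxSlewPerRead, Math.min(maxSlewPerRead, pendingCorrection));
        pendingCorrection -= step;
        appliedCorrection += step;
        long candidate = rawMicros.getAsLong() + appliedCorrection;
        // Hold the clock still rather than stepping backward.
        lastReturned = Math.max(lastReturned, candidate);
        return lastReturned;
    }
}
```

In a real process the raw source might be something like `() -> System.currentTimeMillis() * 1000`, with adjust() fed from whatever sync mechanism is in use; both choices are assumptions here.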

I don't know what happened to that time synch stuff, or the DCE code in general.

Seems to me you don't need a solution "in Java". You need to sync the clocks of a set of distributed machines. The Java app is just the thing that runs on the machines.

Cheeso
My understanding was that most OSs use a synchronization method with a high margin of error (about 10-100 ms), which is acceptable for most purposes. I was thinking about doing it through Java since I may be able to use a service with higher accuracy.
Uri
@Cheeso - DCE was ahead of its time :)
Aiden Bell
@Uri - yes, I see you want a higher-accuracy solution. I just mean to say that it may be possible to sync independently of any Java code. It may be possible to factor the time sync part out of the Java application completely, so that a design assumption in the Java app is that "time is synchronized". But that design choice is yours to make!
Cheeso
@Aiden - I know! We just keep re-inventing the same stuff.
Cheeso
@Cheeso - Same old story though.
Aiden Bell
+1  A: 

I'd really just use NTP. It's pretty accurate even over the internet, and on a LAN it should be even better. According to Wikipedia[1],

NTPv4 can usually maintain time to within 10 milliseconds (1/100 s) over the public Internet, and can achieve accuracies of 200 microseconds (1/5000 s) or better in local area networks under ideal conditions.

so it may be good enough for your needs if your conditions are "ideal" enough. NTP has been around long enough that pretty much everything works with it. I don't see any reason to do this through Java rather than the OS. If the OS is synced up, so will be Java.

[1] Wikipedia: Network Time Protocol
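On the question of the margin of error: NTP derives its clock offset and round-trip delay from four timestamps per request/response exchange, and the offset estimate can be wrong by at most half the round-trip delay (ignoring clock drift during the exchange). Here is a minimal sketch of that arithmetic in Java; the class name NtpSample and the choice of microseconds are assumptions for illustration, not part of any NTP library.

```java
/** Offset and round-trip delay from one NTP-style request/response exchange. */
final class NtpSample {
    final long offsetMicros; // estimated (server clock - client clock)
    final long delayMicros;  // round-trip network delay, excluding server processing time

    /**
     * @param t0 client clock when the request was sent
     * @param t1 server clock when the request was received
     * @param t2 server clock when the reply was sent
     * @param t3 client clock when the reply was received
     */
    NtpSample(long t0, long t1, long t2, long t3) {
        this.offsetMicros = ((t1 - t0) + (t2 - t3)) / 2;
        this.delayMicros = (t3 - t0) - (t2 - t1);
    }

    /** The true offset lies within offsetMicros +/- this value. */
    long maxErrorMicros() {
        return delayMicros / 2;
    }

    public static void main(String[] args) {
        // Made-up timestamps: server is ahead of the client, 80 us round trip.
        NtpSample sample = new NtpSample(1000, 1500, 1520, 1100);
        System.out.println("offset=" + sample.offsetMicros
                + "us, delay=" + sample.delayMicros
                + "us, max error=" + sample.maxErrorMicros() + "us");
    }
}
```

When comparing checkpoint timestamps from two machines, the error bounds of both machines' offsets add up, so the latency figure would carry the sum of the two as its margin.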

Adam Jaskiewicz
A: 

I encountered this thread after trying something on my own (should have searched first!) http://snippets.dzone.com/posts/show/11345 - may be a good method, may be bad, but it's distributed (serverless) which is nice.

Benjamin