Here's the setup... Your system is receiving a stream of data that contains discrete messages (usually 32-128 bytes per message). As part of your processing pipeline, each message passes through two physically separate applications, which exchange the data using a low-latency approach (such as messaging over UDP, or RDMA), and is finally delivered to a client via the same mechanism.

Assuming you can inject yourself at any level, including wire protocol analysis, what tools and/or techniques would you use to measure the latency of your system? As part of this, I'm assuming that every message that is delivered to the system results in a corresponding (though not equivalent) message being pushed through the system and delivered to the client.

The only tool that I've seen on the market like this is TS-Associates TipOff. I'm sure that with the right access you could probably measure the same information using a wire analysis tool (à la Wireshark) and the right dissectors, but is this the right approach, or are there any commodity solutions that I can use?

+3  A: 

Your last paragraph describes the way it typically needs to be done. The usual suspects in this field (at least as far as I know for market data (Wall Street) latency) are:

  • TSA (TS Associates)
  • Correlix
  • Corvil
  • Napatech (hardware capture devices)
  • Endace (hardware capture devices)

There was another badly run company that recently burned through their VC money (4 million?).

For data that is processed (let's say at a direct exchange feed or RMDS or other server that changes the protocol) into different formats you need to be able to parse the payloads to correlate the messages. It can be challenging since sometimes data vendors do not expose the message definitions.

I think there are hardware devices that will inject payload information with timestamps in it so the client can see these. Of course, as another poster pointed out - the question of time is very important. All the devices and clients have to have the same reference point for time. It has to be accurate...

The last time I spoke with TSA, an installation with 4 observation points was on the order of $150k. I suspect that the others listed above are similar in price.

The hardware cards listed above start around $2k (for a bare bones card) and go up (significantly) from there.

To do it in software you'd need to have clients using pcap (or something similar) and look at the payloads and try to match them up. In some cases it is difficult to get this to be deterministic - especially at the start of a "session" or if messages are missing from one pipe. Usually after some threshold if you don't match something, you just drop it.
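The matching step described above can be sketched roughly as follows. This is only a minimal illustration, assuming both capture points are timestamped against the same clock and that a correlation key can be pulled out of each payload; `match_captures`, `key_of`, and `max_wait_ns` are hypothetical names, not part of pcap or any vendor tool.

```python
import heapq
from collections import OrderedDict

def match_captures(ingress, egress, key_of, max_wait_ns):
    """Pair ingress and egress captures by key and compute latencies.

    ingress, egress: lists of (timestamp_ns, payload), each sorted by time.
    key_of: hypothetical function extracting a correlation key from a payload
            (keys are assumed unique while a message is in flight).
    Returns (latencies, dropped): matched (key, latency_ns) pairs and the keys
    that were given up on after waiting longer than max_wait_ns.
    """
    pending = OrderedDict()  # key -> ingress timestamp, oldest first
    latencies, dropped = [], []
    # Replay both captures as one time-ordered stream (0 = ingress, 1 = egress).
    events = heapq.merge(
        ((ts, 0, payload) for ts, payload in ingress),
        ((ts, 1, payload) for ts, payload in egress),
    )
    for ts, side, payload in events:
        # Drop anything that has waited past the threshold.
        while pending:
            oldest_key, t_in = next(iter(pending.items()))
            if ts - t_in <= max_wait_ns:
                break
            pending.popitem(last=False)
            dropped.append(oldest_key)
        key = key_of(payload)
        if side == 0:
            pending[key] = ts
        elif key in pending:
            latencies.append((key, ts - pending.pop(key)))
        # Egress messages with no known ingress (e.g. session start) are ignored.
    dropped.extend(pending)  # still unmatched when the captures end
    return latencies, dropped
```

In practice the hard part is `key_of`: since the outgoing message is not byte-identical to the incoming one, the key has to come from fields that survive the transformation (a sequence number, an order ID, and so on).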

EDIT: DISCLAIMER: I am also part of the venture now and should disclose that.

Tim
++ TipOff works well once tuned to the specifics. You can do it yourself with raw captures, but their hardware makes it much easier to get the data and to timestamp it effectively. Once you get through the initial phase, having something doing it automatically is wonderful.
ShuggyCoUk
A: 

The problem with doing this is much the same as measuring "speed" in space: You have to ask latency relative to what? If you try measuring it on the wire, you'll miss any extra latency in the switching, or in the protocol stack on the receiving side. You can't really measure it end-to-end, as the computers will have two different clocks which it is almost impossible to reconcile w/o introducing small errors (and they drift from each other!)

The only approach that really has any hope is measuring round-trip latency, assuming you have messages that come back from one end acknowledging receipt. UDP doesn't have ACKs in the stack, so they'd have to be coded into the application somewhere. What you do is use something like the x86's high-resolution timer to measure the time between a message being sent and its response appearing.
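A rough sketch of that round-trip measurement, assuming the far end echoes each datagram back at the application level. The echo server here is just a stand-in for that acknowledgement, and `start_echo_server` and `measure_rtt_ns` are made-up names for illustration. Python's `time.perf_counter_ns()` plays the role of the high-resolution timer; both timestamps come from the same clock, so no synchronization is needed.

```python
import socket
import threading
import time

def start_echo_server(host="127.0.0.1"):
    """Stand-in for the remote application: echoes every datagram back."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    srv.bind((host, 0))  # port 0: let the OS pick a free port

    def loop():
        while True:
            try:
                data, peer = srv.recvfrom(2048)
            except OSError:
                return  # socket was closed
            srv.sendto(data, peer)

    threading.Thread(target=loop, daemon=True).start()
    return srv

def measure_rtt_ns(addr, count=100, timeout_s=1.0):
    """Send `count` sequence-numbered datagrams; return round-trip times in ns."""
    samples = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as cli:
        cli.settimeout(timeout_s)
        for seq in range(count):
            payload = seq.to_bytes(8, "big")
            t0 = time.perf_counter_ns()
            cli.sendto(payload, addr)
            try:
                # Skip stale replies until the one for this sequence number arrives.
                while cli.recvfrom(2048)[0] != payload:
                    pass
            except socket.timeout:
                continue  # lost datagram: no sample for this one
            samples.append(time.perf_counter_ns() - t0)
    return samples

if __name__ == "__main__":
    server = start_echo_server()
    rtts = measure_rtt_ns(server.getsockname(), count=1000)
    print(f"min {min(rtts)} ns, median {sorted(rtts)[len(rtts) // 2]} ns")
```

Note that this includes the time spent in both protocol stacks and in the echoing application, which is exactly the "relative to what?" caveat above.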

T.E.D.
I think he wants latency across two points. This is nice to know since if that value changes then it is something that is NOT related to the speed of light - it is related to some bottleneck in the transport.
Tim
I don't understand what you mean when you say the only approach that has hope is round-trip latency. Can you elaborate?
Tim
Sorry Tim. Sometimes I talk like I'm talking to my co-workers, who are working on the same stuff as me and would know what I'm referring to. I added a sentence at the end that might clear it up a bit.
T.E.D.
Agree with both of you, but as you can probably guess I'm dealing with systems that are delivering data one way. Attempting to do rtt to deal with skew and latency is bad enough, but when the timings are in microseconds the best I can begin to do is to monitor the delta of latency to understand if we're getting better or worse over time and under loading conditions. As for measurements, we already use high resolution timers to measure time, but measuring from 2 reference points is subject to clock skew. And measuring from 1 point is subject to transmission loss. Good comments both of you.
Ajaxx
A: 

A recent paper might be of some use (and would also be much cheaper than hardware-based solutions). There are also ways of accounting for clock skew fairly accurately. The last time I seriously looked into one-way latency measurement research (a couple of years ago), the most accurate technique was a linear programming algorithm by Sue Moon (with reference code conveniently available here). Without some rather modern linear programming techniques, though, it's fairly impractical to run as an online algorithm; it's best to just record timestamps periodically throughout the day without doing any calculations, and then run the LP algorithm on the accumulated data afterwards. A few other techniques were quick enough to be done online (including the seminal paper by Vern Paxson), but they were all much less accurate.
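To give a feel for the offline approach, here is a brute-force sketch of the general lower-bounding-line idea behind LP-based skew removal: find the line that stays at or below every measured delay while hugging the points as closely as possible, and read the relative skew off its slope. This is not the reference code mentioned above, and `fit_lower_bound_line` is a hypothetical name. It tries every pair of points, so it is only suitable for small batches.

```python
from itertools import combinations

def fit_lower_bound_line(times, delays, eps=1e-9):
    """Fit delay ~= slope * time + intercept from below.

    times:  send timestamps (sender clock)
    delays: measured one-way delays (receiver clock minus sender clock)
    Returns (slope, intercept); slope estimates the relative clock skew.
    With two unknowns, an optimal line can be taken to pass through two of
    the points, so every pair is tested as a candidate.
    """
    points = list(zip(times, delays))
    best = None
    for (t1, d1), (t2, d2) in combinations(points, 2):
        if t1 == t2:
            continue  # vertical line, not a valid candidate
        slope = (d2 - d1) / (t2 - t1)
        intercept = d1 - slope * t1
        residuals = [d - (slope * t + intercept) for t, d in points]
        if min(residuals) < -eps:
            continue  # line rises above some measurement: infeasible
        cost = sum(residuals)  # total gap between the points and the line
        if best is None or cost < best[0]:
            best = (cost, slope, intercept)
    if best is None:
        raise ValueError("need at least two samples with distinct timestamps")
    return best[1], best[2]
```

Subtracting `slope * t` from each measured delay then leaves delays that are comparable over the whole recording, up to an unknown constant offset.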

strangelydim
A: 

If a few more bytes per message isn't overkill for you, I'd recommend just stamping each message at the source with a full timestamp (64 bits) and, on every hop, adding entry/leave timestamp deltas (one byte per stamp). By analyzing a bidirectional flow you can figure out the clock skew between boxes, and then you'll have full real-time latency info for your own consideration or for publishing to monitoring tools.
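One possible wire layout for this, as a sketch only: a 64-bit origin timestamp, a one-byte hop count, then one byte per hop. Here each hop byte is assumed to hold the time the message spent inside that hop (leave minus entry) in microseconds, saturated at 255; the layout, the unit, and the function names are all assumptions made for illustration.

```python
import struct
import time

HEADER = struct.Struct("!QB")  # origin timestamp (ns), number of hop bytes

def stamp_at_source(payload, now_ns=None):
    """Prefix the payload with the origin timestamp and an empty hop list."""
    now_ns = time.time_ns() if now_ns is None else now_ns
    return HEADER.pack(now_ns, 0) + payload

def add_hop(msg, entry_ns, leave_ns, unit_ns=1000):
    """Record how long this hop held the message, as one saturating byte."""
    origin_ns, hops = HEADER.unpack_from(msg)
    delta = min(255, max(0, (leave_ns - entry_ns) // unit_ns))
    body = msg[HEADER.size:]
    return (HEADER.pack(origin_ns, hops + 1)
            + body[:hops] + bytes([delta]) + body[hops:])

def parse(msg):
    """Return (origin_ns, list of per-hop deltas, original payload)."""
    origin_ns, hops = HEADER.unpack_from(msg)
    body = msg[HEADER.size:]
    return origin_ns, list(body[:hops]), body[hops:]
```

Because each delta is taken from a single box's clock, the per-hop numbers need no clock synchronization; only comparing the origin timestamp with the receiver's clock does.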

bobah
Many times in this type of environment you don't have control of the content of the messages - meaning you can't just insert information into them. Some exchanges put timestamps into the messages, but I am not sure you can count on that. Note also that there is then a dependency on accurate clock syncing. Also - "...analyzing a bidirectional flow..." is not trivial I think.
Tim
"Analyzing a bidirectional flow" can be part of a built-in heartbeat. If you can't modify a message but can reliably identify it within a stream, you could probably use snoop/tcpdump at each hop to generate dumps and then postprocess the dumps to identify matching messages and calculate timing deltas.
bobah
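For the heartbeat idea, one common way to turn a bidirectional exchange into a clock-offset estimate is the four-timestamp calculation used by NTP-style protocols. The sketch below assumes the forward and return paths have roughly equal delay, which is the main source of error; `offset_and_delay` and `best_offset` are hypothetical helper names.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Four-timestamp estimate from one heartbeat exchange.

    t1: request sent      (clock A)
    t2: request received  (clock B)
    t3: reply sent        (clock B)
    t4: reply received    (clock A)
    Returns (offset, delay): offset is clock B minus clock A, delay is the
    round-trip time on the wire (time spent inside B is excluded).
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay

def best_offset(samples):
    """Pick the offset from the exchange with the smallest round-trip delay,
    on the assumption that it suffered the least queueing."""
    return min((offset_and_delay(*s) for s in samples), key=lambda od: od[1])[0]
```

This gives the offset at one moment; tracking how it changes across heartbeats is what reveals the drift between the two boxes.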