Does anyone know how sites with a real-time feed of a lot of data work? I'm thinking of something like a stock site, which can show you quotes in real time (well, mostly with a 20-minute delay, but as I understand it that is still a real-time feed, just shifted by 20 minutes).

I would imagine they have thousands of pieces of data delivered to them every second, something like MSFT 25.00 +.23 VOL 12000 ???? for each stock that changed during some interval.

So, is there just a constant stream of small pushes going on? Or do you think a site pulls from wherever the real data lives with a query along the lines of "give me all changes from 12:23:45 CST to now"?

I ask because at work we might end up needing real-time information like this at our application's fingertips, and it won't make sense to hit our third-party provider over and over again every second...

+1  A: 

Sites like Twitter feed data to certain approved sites in real time via XMPP.

ceejayoz
+4  A: 

Generally there is a client/server protocol defined between the two parties. At the company I work for, the connection is kept open at all times.

Here is some information on real-time data feeds to go with your stock example: NYSE, NASDAQ.

It is common for data providers to also offer FTP sites with (delayed) batched data. One that comes to mind is the NWS EMWIN.
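
To make the "connection maintained at all times" idea concrete, here is a rough Python sketch of the client side: it reads newline-delimited updates from an open stream and parses each one as it arrives. The message layout is only a guess based on the example in the question (MSFT 25.00 +.23 VOL 12000), and the names Tick, parse_tick and read_ticks are made up for illustration. Real exchange feeds define their own protocols.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class Tick(NamedTuple):
    symbol: str
    price: float
    change: float
    volume: int


def parse_tick(line: str) -> Tick:
    # Assumed (hypothetical) layout: "MSFT 25.00 +.23 VOL 12000"
    symbol, price, change, _label, volume = line.split()
    return Tick(symbol, float(price), float(change), int(volume))


def read_ticks(stream: TextIO) -> Iterator[Tick]:
    # Iterating a stream blocks until the next line arrives, so on a
    # long-lived connection this loop simply runs for as long as the
    # provider keeps the connection open.
    for line in stream:
        line = line.strip()
        if line:  # skip blank keep-alive lines
            yield parse_tick(line)


# An in-memory stream stands in for the open connection here.
feed = io.StringIO("MSFT 25.00 +.23 VOL 12000\nIBM 120.50 -1.10 VOL 3400\n")
ticks = list(read_ticks(feed))
```

With a real provider the stream could be something like socket.create_connection((host, port)).makefile("r"), where host and port are whatever the provider gives you, and the loop body stays the same.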

cwhite
+1  A: 

In the broadest terms, a push model is going to be the best way of achieving "real time" transfer, particularly if you're talking about a large amount of data.

However, a purely push model always leaves you with the problem of how to recover from missed data.

Depending on the nature of your data, that may not be a problem (think of video delivery as an analogue, where the amount of data is huge but there is enough redundancy to recover from missing data). And if you have any control over the data, you may be able to build some redundancy in. For example, on every change event you can send absolute values rather than changes, or send both the previous value and the new value.
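
As a rough illustration of the absolute-values idea, here is a small Python sketch. It assumes each pushed message also carries a sequence number, which is my addition and not something every feed provides; the QuoteBook class and its fields are hypothetical names. Because each message carries the absolute price, the stored value for a symbol is correct again as soon as the next update for it arrives, and the recorded gaps tell you what you could ask the provider to resend.

```python
from dataclasses import dataclass, field


@dataclass
class QuoteBook:
    last_seq: int = 0
    prices: dict = field(default_factory=dict)
    gaps: list = field(default_factory=list)

    def apply(self, seq: int, symbol: str, price: float) -> None:
        if seq <= self.last_seq:
            # Duplicate or late message: ignore it so an old price
            # never overwrites a newer one.
            return
        if seq != self.last_seq + 1:
            # One or more messages were missed; remember the range so
            # it can be re-requested if the provider supports that.
            self.gaps.append((self.last_seq + 1, seq - 1))
        # The message carries the absolute price, not a delta, so no
        # knowledge of the missed messages is needed to store it.
        self.prices[symbol] = price
        self.last_seq = seq


book = QuoteBook()
book.apply(1, "MSFT", 25.00)
book.apply(2, "MSFT", 25.10)
book.apply(5, "MSFT", 25.23)  # messages 3 and 4 never arrived
```

After this runs, book.prices["MSFT"] holds the latest price and book.gaps holds the missed range (3, 4). A symbol that changed only in the missed messages stays stale until its next update or until the gap is re-requested, which is the part a pure push model cannot fix on its own.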

Andrew Edgecombe
A: 

I've done this by first attempting to retrieve the stock quote from the source, then falling back to a timestamped on-disk cache of the quote when the main source fails or times out.
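
Not my production code, just a minimal Python sketch of that approach. The get_quote function, the fetch callable, and the one-JSON-file-per-symbol cache layout are all assumptions made up for this example; fetch stands in for whatever call reaches your real provider.

```python
import json
import os
import tempfile
import time


def get_quote(symbol, fetch, cache_dir):
    """Return (price, fetched_at, is_live) for the given symbol."""
    path = os.path.join(cache_dir, f"{symbol}.json")
    try:
        price = fetch(symbol)
    except OSError:
        # OSError covers timeouts and connection failures. Fall back
        # to the last quote written to disk; if no cache file exists
        # yet, the FileNotFoundError propagates to the caller.
        with open(path) as f:
            cached = json.load(f)
        return cached["price"], cached["fetched_at"], False
    fetched_at = time.time()
    with open(path, "w") as f:
        # Store the timestamp next to the price so callers can judge
        # how stale a cached quote is.
        json.dump({"price": price, "fetched_at": fetched_at}, f)
    return price, fetched_at, True


def flaky_fetch(symbol):
    # Stand-in for a provider call that times out.
    raise TimeoutError("provider timed out")


with tempfile.TemporaryDirectory() as cache_dir:
    live = get_quote("MSFT", lambda s: 25.00, cache_dir)
    stale = get_quote("MSFT", flaky_fetch, cache_dir)
```

The third element of the result tells the caller whether the quote is live or came from the cache, and the timestamp lets you decide when a cached quote is too old to show.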

Chris Ballance