I am currently working on exposing data from a legacy system over the web. I have a (legacy) server application that sends and receives data over UDP. The software uses UDP to send sequential updates to a given set of variables in (near) real time (updates every 5-10 ms). Thus, I do not need to capture all UDP data -- it is sufficient to retrieve the latest update.

In order to expose this data over the web, I am considering building a lightweight web server that reads/writes UDP data and exposes this data over HTTP.

As I am experienced with Python, I am considering using it.

The question is the following: how can I (continuously) read data from UDP and send snapshots of it over TCP/HTTP on-demand with Python? So basically, I am trying to build a kind of "UDP2HTTP" adapter to interface with the legacy app so that I wouldn't need to touch the legacy code.

A solution that is WSGI compliant would be much preferred. Of course any tips are very welcome and MUCH appreciated!

+4  A: 

Twisted would be very suitable here. It supports many protocols (UDP, HTTP), and its asynchronous nature makes it possible to stream UDP data directly to HTTP without shooting yourself in the foot with (blocking) threading code. It also supports WSGI.

Ivo van der Wijk
+2  A: 

The software uses UDP to send sequential updates to a given set of variables in (near) real-time (updates every 5-10 ms). thus, I do not need to capture all UDP data -- it is sufficient that the latest update is retrieved

What you must do is this.

Step 1.

Build a Python app that collects the UDP data and caches it into a file. Create the file using XML, CSV or JSON notation.

This runs independently as some kind of daemon. This is your listener or collector.

Write the file to a directory from which it can be trivially downloaded by Apache or some other web server. Choose names and directory paths wisely and you're done.

Done.
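A minimal sketch of such a collector, using only the standard library. The wire format (three signed 32-bit integers in network byte order), the port and the file name are assumptions made for illustration; the real legacy format is not given here. The snapshot is written to a temporary file and then renamed into place, which on POSIX filesystems should keep the web server from ever seeing a half-written file.

```python
import json
import os
import socket
import struct

# Assumed wire format: three signed 32-bit integers, network byte order.
PACKET = struct.Struct("!iii")


def write_snapshot(path, values):
    """Write values as JSON to path, replacing the previous snapshot."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as tmp:
        json.dump({"values": list(values)}, tmp)
    # Rename over the old file so readers see either the old or the new
    # snapshot, never a partial one.
    os.replace(tmp_path, path)


def collect(sock, path, max_packets=None):
    """Read datagrams from sock and keep only the latest one in path.

    max_packets is a hypothetical knob for testing; None means run forever.
    """
    handled = 0
    while max_packets is None or handled < max_packets:
        datagram, _addr = sock.recvfrom(4096)
        handled += 1
        try:
            values = PACKET.unpack(datagram)
        except struct.error:
            continue  # malformed datagram: ignore it
        write_snapshot(path, values)


# Example use (hypothetical port and file name):
#     listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#     listener.bind(("127.0.0.1", 9000))
#     collect(listener, "/var/www/legacy/latest.json")
```

Because only the latest update matters, the file is simply overwritten on every datagram; nothing accumulates on disk.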

If you want fancier results, you can do more. You don't need to, since you're already done.

Step 2.

Build a web application that allows someone to request this data being accumulated by the UDP listener or collector.

Use a web framework like Django for this. Write as little as possible. Django can serve flat files created by your listener.

You're done. Again.
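For a sense of how little code "serve the flat file" needs, here is a sketch of the same idea as a bare WSGI callable using the standard library. Note that this swaps in plain WSGI for Django, so it is not the Django setup described above; the snapshot path and the port in the usage comment are hypothetical.

```python
SNAPSHOT_PATH = "latest.json"  # hypothetical file written by the collector


def make_app(snapshot_path=SNAPSHOT_PATH):
    """Return a WSGI app that serves the current contents of snapshot_path."""

    def app(environ, start_response):
        try:
            with open(snapshot_path, "rb") as snapshot:
                body = snapshot.read()
            status = "200 OK"
        except OSError:
            # The collector has not written anything yet (or the path is wrong).
            body = b'{"error": "no snapshot available"}'
            status = "503 Service Unavailable"
        start_response(status, [
            ("Content-Type", "application/json"),
            ("Content-Length", str(len(body))),
            ("Cache-Control", "no-store"),  # clients should always re-fetch
        ])
        return [body]

    return app


# To try it on a hypothetical port 8080:
#     from wsgiref.simple_server import make_server
#     make_server("127.0.0.1", 8080, make_app()).serve_forever()
```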

Some folks think relational databases are important. If so, you can do this. Even though you're already done.

Step 3.

Modify your data collection to create a database that the Django ORM can query. This requires some learning and some adjusting to get a tidy, simple ORM model.

Then write your final Django application to serve the UDP data being collected by your listener and loaded into your Django database.
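As a rough illustration of the collector side of Step 3, here is a sketch that stores updates in SQLite through the standard `sqlite3` module. The table name, the column names and the three-integer payload are all assumptions for illustration; in a real Django project the table would more likely be defined by a model, and the ORM would do the querying.

```python
import sqlite3
import time


def open_store(db_path):
    """Open (and if needed create) the hypothetical legacy_update table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS legacy_update ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "received_at REAL NOT NULL, "
        "val1 INTEGER NOT NULL, "
        "val2 INTEGER NOT NULL, "
        "val3 INTEGER NOT NULL)"
    )
    conn.commit()
    return conn


def store_update(conn, values, received_at=None):
    """Insert one update; values is assumed to hold exactly three integers."""
    if received_at is None:
        received_at = time.time()
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO legacy_update (received_at, val1, val2, val3) "
            "VALUES (?, ?, ?, ?)",
            (received_at, *values),
        )


def latest_update(conn):
    """Return the most recent (val1, val2, val3) row, or None if empty."""
    return conn.execute(
        "SELECT val1, val2, val3 FROM legacy_update ORDER BY id DESC LIMIT 1"
    ).fetchone()
```

At one update every 5-10 ms the table grows quickly, so a real collector would probably prune old rows or store only every Nth update.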

S.Lott
Thank you for your elaborate answer. But no need to be rude, mate. I am very well aware I am dealing with a complex problem, thank you. I agree with you that over a longer timespan persistence, caching, etc. become issues. Right now, however, I merely need a proof of concept. At this point any working code, even ugly and hacky, that shows me how to read snapshots of UDP data and send them over HTTP is sufficient.
jsalonen
Please note as well that I edited my question to be more specific!
jsalonen
Sorry it appears rude. The question -- as originally posed -- was too vague and too full of holes to be taken seriously at face value. Without **details** and **specifics** the implied approach could not possibly work. Please avoid vague questions without **details** or **specifics**.
S.Lott
Thank you for your revised answer, it is much appreciated! I know my initial question wasn't a diamond. If you feel like specifics or details are lacking, please consider asking for them instead of flaming (I know I would be tempted) -- as demonstrated, I am happy to provide them when asked.
jsalonen
Also, I think this approach is very plausible. One point: using the filesystem to store UDP data sounds all right. However, I am concerned that I would run into some file locking problems. But yeah, you are right, it might be sufficient for an initial prototype (and later on we could switch to a database).
jsalonen
Sorry it appeared like flame. Your perceptions are your own. If details are lacking, you get bad answers. The choices are yours to fix the question or complain. You elected to fix. Cheers. "However, I am concerned that I would run into some file locking problems"? What kind? You wouldn't be locking anything, would you? Why lock anything? What **specific** concern about locking are you worried about?
S.Lott
Good question! I was concerned about how filesystems handle concurrent reads and writes, but was able to figure out the answer quickly with some googling: http://stackoverflow.com/questions/2751734/how-do-filesystems-handle-concurrent-read-write - this pretty much covers my concern.
jsalonen
I am choosing this answer as my accepted answer since it provides the best heuristics for building what I need at the level of detail that was provided in my question. I might need to use Twisted in the future, but for now filesystem access should be just fine. THANK YOU!
jsalonen
@jsalonen: Do not make this more complex than it is. It's actually trivial. Organizations like the US National Weather Service do this constantly with monstrous data sets. The data sets and analyses are assigned simple names and tossed into directories for download. This kind of "download the latest" access is what ftpd (and httpd) were designed to provide, without any additional "application" software.
S.Lott
+1  A: 

Here's a quick "proof of concept" app using the Twisted framework. It assumes that the legacy UDP service is listening on localhost:8000 and will start sending UDP data in response to a datagram containing "Send me data", and that each datagram carries three 32-bit integers. The app responds to an HTTP "GET /" on port 2080.

You could start this with twistd -noy example.py:

example.py

from twisted.internet import protocol, defer
from twisted.application import service
from twisted.python import log
from twisted.web import resource, server as webserver

import struct

class legacyProtocol(protocol.DatagramProtocol):
    def startProtocol(self):
        self.transport.connect(self.service.legacyHost,self.service.legacyPort)
        # Datagram payloads are bytes; the b prefix is required on Python 3.
        self.sendMessage(b"Send me data")
    def stopProtocol(self):
        # Assume the transport is closed, do any tidying that you need to.
        return
    def datagramReceived(self,datagram,addr):
        # Inspect the datagram payload, do sanity checking.
        try:
            val1, val2, val3 = struct.unpack("!iii",datagram)
        except struct.error:
            # Problem unpacking the data: log the failure and ignore the datagram.
            log.err()
            return
        self.service.update_data(val1,val2,val3)
    def sendMessage(self,message):
        self.transport.write(message)

class legacyValues(resource.Resource):
    def __init__(self,service):
        resource.Resource.__init__(self)
        self.service=service
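        # Child path segments are bytes on Python 3.
        self.putChild(b"",self)
    def render_GET(self,request):
        data = "\n".join(["<li>%s</li>" % x for x in self.service.get_data()])
        page = """<html><head><title>Legacy Data</title></head>
            <body><h1>Data</h1><ul>
            %s
            </ul></body></html>""" % (data,)
        # twisted.web expects render methods to return bytes.
        return page.encode("utf-8")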

class protocolGatewayService(service.Service):
    def __init__(self,legacyHost,legacyPort):
        self.legacyHost = legacyHost
        self.legacyPort = legacyPort
        self.udpListeningPort = None
        self.httpListeningPort = None
        self.lproto = None
        self.reactor = None
        self.data = [1,2,3]
    def startService(self):
        # called by application handling
        service.Service.startService(self)
        if not self.reactor:
            from twisted.internet import reactor
            self.reactor = reactor
        self.reactor.callWhenRunning(self.startStuff)
    def stopService(self):
        # called by application handling
        service.Service.stopService(self)
        defers = []
        if self.udpListeningPort:
            # stopListening() is the IListeningPort method for closing a port
            defers.append(defer.maybeDeferred(self.udpListeningPort.stopListening))
        if self.httpListeningPort:
            defers.append(defer.maybeDeferred(self.httpListeningPort.stopListening))
        return defer.DeferredList(defers)
    def startStuff(self):
        # UDP legacy stuff
        proto = legacyProtocol()
        proto.service = self
        self.udpListeningPort = self.reactor.listenUDP(0,proto)
        # Website
        factory = webserver.Site(legacyValues(self))
        self.httpListeningPort = self.reactor.listenTCP(2080,factory)
    def update_data(self,*args):
        self.data[:] = args
    def get_data(self):
        return self.data

application = service.Application('LegacyGateway')
services = service.IServiceCollection(application)
s = protocolGatewayService('127.0.0.1',8000)
s.setServiceParent(services)
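To try the gateway without the real legacy system, a stand-in for the legacy service could look roughly like the sketch below. It follows the assumptions stated at the top of this answer (a "Send me data" request, replies of three 32-bit integers packed as "!iii"); the function name, the update count and the values sent are made up for illustration.

```python
import socket
import struct
import time


def serve_once(sock, updates=3, interval=0.005):
    """Wait for one request, then send `updates` datagrams to the requester.

    Returns the number of datagrams sent (0 if the request was not recognised).
    """
    request, client = sock.recvfrom(1024)
    if request != b"Send me data":
        return 0
    for n in range(1, updates + 1):
        # Three signed 32-bit integers in network byte order, matching "!iii".
        sock.sendto(struct.pack("!iii", n, n * 2, n * 3), client)
        time.sleep(interval)  # roughly one update every 5 ms
    return updates


# Example use, on the port the gateway above expects:
#     legacy = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#     legacy.bind(("127.0.0.1", 8000))
#     serve_once(legacy, updates=1000)
```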

Afterthought

This isn't a WSGI design. The idea would be to run this program daemonized, with its HTTP port bound to a local IP, and have Apache or similar proxy requests to it. It could be refactored for WSGI. It was quicker to knock up this way and easier to debug.
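One possible shape for a WSGI variant, sketched with the standard library only rather than Twisted: a background thread keeps the latest values in memory and a WSGI callable renders them on demand. This is not the code above refactored; the class and function names are hypothetical, the "!iii" payload is the same assumption as before, and the "Send me data" handshake is left out.

```python
import socket
import struct
import threading


class LatestValues:
    """Thread-safe holder for the most recent update."""

    def __init__(self, initial=(0, 0, 0)):
        self._lock = threading.Lock()
        self._values = tuple(initial)

    def set(self, values):
        with self._lock:
            self._values = tuple(values)

    def get(self):
        with self._lock:
            return self._values


def listen(sock, store, stop_event):
    """Copy each well-formed datagram from sock into store until stopped."""
    sock.settimeout(0.5)  # wake up regularly to check stop_event
    while not stop_event.is_set():
        try:
            datagram = sock.recv(4096)
            store.set(struct.unpack("!iii", datagram))
        except socket.timeout:
            continue  # no data yet: check stop_event and try again
        except struct.error:
            continue  # malformed datagram: ignore it


def make_app(store):
    """Return a WSGI app that renders the latest values, one per line."""

    def app(environ, start_response):
        body = "\n".join(str(v) for v in store.get()).encode("ascii")
        start_response("200 OK", [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    return app
```

One caveat with this design: under a WSGI server that runs several worker processes, each process would need its own listener thread and socket, so a single-process server (or the file-based approach) may be simpler.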

MattH
Thank you VERY MUCH for an answer with actual code. I have stumbled across Twisted many times without actually knowing how hard it is to use. This gives me proof-of-concept code that I can pick apart and use as a starting point.
jsalonen
You're very welcome. I think one of the drawbacks of Twisted is its steep learning curve: there can be a high barrier of effort before you have enough of a working concept to begin iterative improvements.
MattH