views:

49

answers:

2

I have a Python application which sends 556 bytes of data across the network at a rate of 50 Hz. The binary data is generated using struct.pack() which returns a string, which is subsequently written to a UDP socket.

As well as transmitting this data, I would like to save this data to file as space-efficiently as possible, including a timestamp for each message, so that I can replay the data at a later time. What would be the best way of doing this using Python?

I have mulled over using a logging object, but have not yet found out whether Python can read in log files so that I can replay the data. Also, I don't know whether the logging object can handle binary data.

Any tips would be much appreciated! Although Wireshark would be an option, I'd rather store the data using my application so that I can automatically start new data files each time I run the program.

+4  A: 

Python's logging system is intended to process human-readable strings, and it's intended to be easy to enable or disable depending on whether it's you (the developer) or someone else running your program. Don't use it for something that your application always needs to output.

The simplest way to store the data is to just write the same 556-byte string that you send over the socket out to a file. If you want to have timestamps, you could precede each 556-byte message with the time of sending, converted to an integer, and packed into 4 or 8 bytes using struct.pack(). The exact method would depend on your specific requirements, e.g. how precise you need the time to be, and whether you need absolute time or just relative to some reference point.

David Zaslavsky
+1: just write ... to a file. Perfect. Simple. Scalable.
S.Lott
Many thanks for the response, David. I ended up saving each string, prefixed by the value returned by [time.clock()](http://docs.python.org/library/time.html#time.clock) and the message length as a float and unsigned short respectively. That should work fine.
Ham
+1  A: 

One possibility for a compact timestamp for replay purposes...: set the time as a floating point number of seconds since the epoch with time.time(), multiply by 50 since you said you're repeating this 50 times a second (the resulting unit, one fiftieth of a second, is sometimes called "a jiffy"), truncate to int, subtract from the similar int count of jiffies since the epoch that you measured at the start of your program, and struct.pack the result into an unsigned int with the number of bytes you need to represent the intended duration -- for example, with 2 bytes for this timestamp, you could represent runs of about 1200 seconds (20 minutes), but if you plan longer runs you'd need 4 bytes (3 bytes is just too unwieldy IMHO;-).

Not all operating systems have time.time() returning decent precision, so you may need more devious means if you need to run on such unfortunately limited OSs. (That's VERY os-dependent, of course). What OSs do you need to support...?

Anyway...: for even more compactness, use a slightly higher multiplier than 50 (say 10000) for more accuracy, and store, each time, the difference wrt the previous timestamp -- since that difference should not be much different from a jiffy (if I understand your spec correctly) that should be about 200 or so of these "tenth-thousands of a second" and you can store a single unsigned byte (and have no limit wrt the duration of runs you're storing for future replay). This depends even more on accurate returns from time.time() of course.

If your 556-byte binary data is highly compressible, it will be worth your while to use gzip to store the stream of timestamp-then-data in compressed form; this is best assessed empirically on your actual data, though.

Alex Martelli
@Alex, I only need to support Windows XP for the moment. Thanks for the info, though, I think storing the result of time.clock() with each data packet should be exact enough for my needs. Your idea of compressing the data is a good one too. It turns out the data is HIGHLY compressible, with 7zip compressing the data to 5% of its original size!
Ham