Hi,

I need to write a small tool that parses a textual input and generates some binary encoded data. I would prefer to stay away from C and the like, in favour of a higher level, (optionally) safer, more expressive and faster to develop language.

My language of choice for this kind of task is usually Python, but in this case dealing with raw binary data can be problematic if one isn't very careful about numbers being promoted to bignums, sign extension and such.
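To make the concern concrete, here is a minimal sketch of the usual workaround in plain Python: mask every intermediate result back to a fixed width and do the sign extension by hand. The helper names `to_u32` and `to_s32` are made up for illustration, and the 32-bit width is just an assumed example.

```python
import struct

MASK32 = 0xFFFFFFFF  # assumed word size for this example: 32 bits


def to_u32(value):
    """Wrap an arbitrary Python int to an unsigned 32-bit value."""
    return value & MASK32


def to_s32(value):
    """Interpret the low 32 bits of value as a two's-complement signed int."""
    value &= MASK32
    # If the sign bit is set, subtract 2**32 to get the negative value.
    return value - 0x100000000 if value & 0x80000000 else value


# Python ints never overflow, so the wrap has to be explicit.
overflowed = 0xFFFFFFFF + 1      # grows silently to 33 bits
wrapped = to_u32(overflowed)     # back in 32-bit range

# Masking first also keeps struct.pack from rejecting negative input for "I".
packed = struct.pack("<I", to_u32(-1))
```

Forgetting the mask in a single place is enough to get either a `struct.error` or silently wrong output, which is the kind of carefulness I would like the language to handle for me.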

Ideally I would like to have records with named bitfields that are portable to be serialised in a consistent manner.

(I know that there's a strong point in doing it in a language I already master, although it isn't optimal, but I think this could be a good opportunity to learn something new).

Thanks.

+4  A: 

Strangely enough, I think Erlang might fit the bill. Setting aside its parallel facilities (unless you want to use them), it has native support for working with strings of bits very easily. Look up the term bit syntax in the documentation.

High Performance Mark
Great idea, I was thinking about learning Erlang anyway, so this might be perfect :-D BTW, what about parsing, does it have decent string handling or parser generators?
fortran
To quote from Joe Armstrong's book (invaluable if you are going to learn Erlang): 'Strictly speaking, there are no strings in Erlang. Strings are really just lists of integers.' From this I conclude that it does have decent string handling, but you may decide otherwise. As for parser generators, I haven't a clue.
High Performance Mark
+3  A: 

I second the vote for Erlang; despite its oddities, it has excellent support for bit-level control of binary data. (As it must; it's a telecoms language.) Another language worth looking into is PADS, which is a more special-purpose language (also from the telecoms industry) designed for high-speed processing of ad hoc data. I believe PADS supports binary data, but I can't swear to it.

Norman Ramsey
+2  A: 

If you wanted to stay in Python an option is the bitstring module, which takes away most of the pain of dealing with binary data.

It's pretty straightforward to construct and parse arbitrary binary structures, so might be worth a look if Erlang doesn't work out for you!
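For a feel of what a record with named bitfields involves, here is a rough standard-library-only sketch. It does not use the bitstring API; it is a hand-rolled stand-in, and the field layout (`version`, `flags`, `length`) and the function names `pack_record` and `unpack_record` are hypothetical, chosen only for illustration. Fields are laid out most significant first and the result is emitted big-endian, so the bytes are the same on any platform.

```python
# Hypothetical layout: (field name, width in bits), most significant field first.
FIELDS = (("version", 4), ("flags", 4), ("length", 16))


def pack_record(values, fields=FIELDS):
    """Pack a dict of named fields into big-endian bytes."""
    total_bits = sum(width for _, width in fields)
    if total_bits % 8:
        raise ValueError("record must be a whole number of bytes")
    word = 0
    for name, width in fields:
        value = values[name]
        # Reject anything that does not fit, instead of truncating silently.
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word.to_bytes(total_bits // 8, "big")


def unpack_record(data, fields=FIELDS):
    """Inverse of pack_record: split bytes back into named fields."""
    total_bits = sum(width for _, width in fields)
    if len(data) * 8 != total_bits:
        raise ValueError("data length does not match the record layout")
    word = int.from_bytes(data, "big")
    result = {}
    shift = total_bits
    for name, width in fields:
        shift -= width
        result[name] = (word >> shift) & ((1 << width) - 1)
    return result


record = {"version": 2, "flags": 0b0101, "length": 300}
encoded = pack_record(record)
decoded = unpack_record(encoded)
```

The explicit range check is the design choice that matters here: an out-of-range value raises an error rather than spilling into a neighbouring field.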

Scott Griffiths