Hi, I am working on a project that contains two servers: one written in Python, the other in C. To maximize the capacity of the servers, we defined a proprietary binary protocol through which the two talk to each other.

The protocol is defined in a C header file in the form of C structs. Usually, I use Vim substitutions to convert this file into Python code, but that means I have to redo the conversion by hand every time the protocol is modified.

So I believe a parser that can read the C header file would be a better choice. However, there are at least a dozen Python parser generators, so I don't know which one is most suitable for this particular task.

Any suggestions? Thanks a lot.


EDIT:

Of course, I am not asking anyone to write the code for me.

The code is already finished. I converted the header file into Python code in a form that construct, a Python library for parsing binary data, can recognize.

I am also not looking for an existing C parser. I am asking this question because a book I am reading discusses parser generators briefly, and that inspired me to learn how to use a real one.


EDIT Again:

When we designed the system, I suggested using Google Protocol Buffers, ZeroC ICE, or some other multi-language network programming middleware to eliminate the task of implementing a protocol.

However, not every programmer can read English documentation or is willing to try new things, especially when they have plenty of experience doing it the old way, which is simple but a little clumsy.

A: 

I would personally use PLY:

http://www.dabeaz.com/ply/

And there is already a C parser written with PLY:

http://code.google.com/p/pycparser/

FogleBird
A: 

A C struct is unlikely to be portable enough to send between machines. Differences in endianness, word size, and compiler all change the way the structure is mapped to bytes.

It would be better to use a properly portable binary format that is designed for communications.
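As a small illustration of the point, Python's standard `struct` module shows how the same value and the same field list map to different bytes depending on byte order and alignment. The value and the one-byte-plus-four-byte layout are arbitrary examples, not taken from the protocol in the question.

```python
import struct

value = 0x01020304

# The same 32-bit integer, serialized with two explicit byte orders.
little = struct.pack("<I", value)   # little-endian
big = struct.pack(">I", value)      # big-endian (network order)
print(little.hex())  # 04030201
print(big.hex())     # 01020304

# With an explicit byte order, struct uses standard sizes and no padding:
# one unsigned char followed by one unsigned int is 1 + 4 bytes.
packed_size = struct.calcsize("<BI")
print(packed_size)   # 5

# Native mode ("@") follows the local C compiler's sizes and alignment,
# so this number depends on the platform and is not printed as a fact.
native_size = struct.calcsize("@BI")
print(native_size)
```

On many common platforms the native size is larger than 5 because the compiler pads the `int` to a 4-byte boundary, but that is precisely the part that cannot be relied on across machines.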

Douglas Leeder
My colleagues are well trained to write code to handle endian translation, word size padding, etc.
ablmf
Then the binary format, with its defined endianness and padding, is what you should be coding against. The C struct is just a temporary representation of that.
Douglas Leeder
+1  A: 

If I were doing this, I would use IDL as the structure definition language. The main problem you will have with doing C structs is that C has pointers, particularly char* for strings. Using IDL restricts the data types and imposes some semantics.

Then you can do whatever you want. Most parser generators are going to have IDL as a sample grammar.

hughdbrown
+1  A: 

An alternative that might feel a bit over-ambitious at first, but might also serve you very well in the long term, is:

  • Redefine the protocol in some higher-level language, for instance some custom XML
  • Generate both the C struct definitions and any required Python versions from the same source.
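A minimal sketch of the two steps above, using a plain Python list as the single source instead of custom XML (a swapped-in simplification). The `LOGIN_REQUEST` fields, the helper names `emit_c_struct` and `make_python_codec`, and the choice of network byte order are all hypothetical.

```python
import struct

# Hypothetical protocol description: (field name, C type, struct-module code).
LOGIN_REQUEST = [
    ("version", "uint16_t", "H"),
    ("user_id", "uint32_t", "I"),
    ("flags", "uint8_t", "B"),
]

def emit_c_struct(name, fields):
    """Generate the text of a C struct definition from the description."""
    lines = ["struct %s {" % name]
    lines += ["    %s %s;" % (ctype, fname) for fname, ctype, _ in fields]
    lines.append("};")
    return "\n".join(lines)

def make_python_codec(fields, byte_order="!"):
    """Build encode/decode functions from the same description."""
    packer = struct.Struct(byte_order + "".join(code for _, _, code in fields))
    names = [fname for fname, _, _ in fields]

    def encode(**values):
        return packer.pack(*(values[n] for n in names))

    def decode(data):
        return dict(zip(names, packer.unpack(data)))

    return encode, decode

c_text = emit_c_struct("login_request", LOGIN_REQUEST)
encode, decode = make_python_codec(LOGIN_REQUEST)
wire = encode(version=1, user_id=2, flags=3)
print(c_text)
print(wire.hex(), decode(wire))
```

Note that the Python side here writes an unpadded, big-endian layout, so the generated C struct would still need explicit serialization code (or compiler-specific packing plus byte swapping) to match it on the wire.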
unwind
That's exactly what Protocol Buffers do.
Javier