
I have lots of different structs containing enum members that I have to transmit via TCP/IP. The communication endpoints are on different operating systems (Windows XP and Linux), meaning different compilers (gcc 4.x.x and MSVC 2008), but both program parts share the same header files with the type declarations.

For performance reasons, the structures should be transmitted directly (see the code sample below), without expensive serialization or streaming of the individual members.

So the question is how to ensure that both compilers use the same internal memory representation for the enumeration members (i.e. both use 32-bit unsigned integers). Or if there is a better way to solve this problem...

//type and enum declaration
typedef enum 
{
    A        = 1,
    B        = 2,
    C        = 3
} eParameter;


typedef enum
{
    READY    = 400,
    RUNNING  = 401,
    BLOCKED  = 402,
    FINISHED = 403
} eState;


#pragma pack(push,1)
    typedef struct
    {
        eParameter mParameter;
        eState mState;
        int32_t miSomeValue;
        uint8_t miAnotherValue;
        ...
    } tStateMessage;
#pragma pack(pop)


//... send via socket
tStateMessage msg;
send(iSocketFD, (void*)(&msg), sizeof(tStateMessage), 0);

//... receive message on the other side
tStateMessage msg_received;
recv(iSocketFD, (void*)(&msg_received), sizeof(tStateMessage), 0);

Additionally...

  • Since both endpoints are little-endian machines, endianness is not a problem here.
  • And the pack #pragma solves alignment issues satisfactorily.
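
That last assumption can also be backed up at compile time. A minimal sketch (the STATIC_ASSERT macro and the type copies here are illustrative, not part of the shared headers):

```c
#include <stdint.h>

/* Copies of the shared-header enums, for illustration only */
typedef enum { A = 1, B = 2, C = 3 } eParameter;
typedef enum { READY = 400, RUNNING = 401, BLOCKED = 402, FINISHED = 403 } eState;

/* Negative-array-size trick: compilation fails if cond is false.
   C11 compilers could use _Static_assert instead. */
#define STATIC_ASSERT(cond, name) typedef char name[(cond) ? 1 : -1]

/* Both compilers must agree that the enums occupy 4 bytes */
STATIC_ASSERT(sizeof(eParameter) == 4, eParameter_is_4_bytes);
STATIC_ASSERT(sizeof(eState) == 4, eState_is_4_bytes);
```

If either compiler picks a different representation, the build breaks on that side instead of silently corrupting messages.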

Thx for your answers, Axel

A: 

I'll answer your question pragmatically because you've chosen a relatively risky path after weighing the performance gains against the possible downsides (at least I hope you have!).

If portability and robustness against future changes to those compilers have also been considered, then an empirical approach is the best guard against problems.

  1. Ensure you are using initializers for the enums (your examples do this) in all cases.
  2. Do empirical testing to see how the data is interpreted on the receiving side.
  3. Record the version numbers of the build tools on both sides and archive them with the source code. Preferably archive the tools as well.
  4. Document everything you did so unforeseen maintenance in the future is not handicapped.
  5. Pray for the best! ;-)
Amardeep
Uh, thx for tips 3 and 4! You're right, the version numbers of the build tools might come in handy for maintenance! (I'm already taking care of 1, 2 and 5.)
Axel
You also need to standardize on a size of the data, and cast it to/from an integer of that size for transport.
nategoose
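
nategoose's suggestion might be sketched like this (the names are made up; the point is only that the wire format uses a fixed-width integer, not whatever each compiler picked for the enum):

```c
#include <stdint.h>

typedef enum { READY = 400, RUNNING = 401, BLOCKED = 402, FINISHED = 403 } eState;

/* The wire format commits to a fixed-width integer regardless of
   how either compiler chooses to represent the enum internally. */
typedef struct {
    uint32_t mState;   /* holds an eState value */
} tWireMessage;

static void pack_state(tWireMessage *out, eState s)
{
    out->mState = (uint32_t)s;      /* explicit width on the way out */
}

static eState unpack_state(const tWireMessage *in)
{
    return (eState)in->mState;      /* and back again on receipt */
}
```

The casts cost nothing at runtime, and the struct's layout is now pinned down by uint32_t rather than by compiler-specific enum rules.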
A: 

I would advise you to use one of the serialization libraries specially designed for such problems, like:

  • Protocol Buffers
  • Apache Thrift
  • Apache Avro

What you will get is maximum platform portability, an easy way of changing the interface and the type of the messages transmitted, plus a lot of other useful features.

Note that only Avro has an officially supported C API. For Thrift and Protocol Buffers you either make a thin C wrapper over the C++ API or use one of the C APIs, like protobuf-c.

the_void
A: 

This is premature optimization. You have made two costly assumptions without measurements.

The first assumption is that this part of the code is a performance bottleneck in the first place. Is it? Very unlikely. If one is going to make assumptions about performance, then the safe assumption is that the network speed will be the bottleneck, not the code which sends and receives the network messages. This alone should prevent you from ever considering the second assumption.

The second assumption is that serializing the struct portably will be noticeably slower than writing the raw bits of the struct. This assumption is nearly always false.

Skeptical? Measure it! :)
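
A measurement along those lines might look like the following (an illustrative micro-benchmark, not code from the question; the struct is a simplified copy of the one above):

```c
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Simplified copy of the message, with the enums already widened */
typedef struct {
    uint32_t mParameter;
    uint32_t mState;
    int32_t  miSomeValue;
    uint8_t  miAnotherValue;
} tStateMessage;

/* Field-by-field serialization into a byte buffer: the supposedly
   "expensive" path the question wants to avoid. Returns bytes written. */
static size_t serialize(const tStateMessage *m, uint8_t *buf)
{
    size_t off = 0;
    memcpy(buf + off, &m->mParameter,     4); off += 4;
    memcpy(buf + off, &m->mState,         4); off += 4;
    memcpy(buf + off, &m->miSomeValue,    4); off += 4;
    memcpy(buf + off, &m->miAnotherValue, 1); off += 1;
    return off;
}

/* Time many serializations; if the cost is negligible next to network
   latency, the raw-struct shortcut buys nothing. */
static double time_serialization(long iterations)
{
    tStateMessage msg = { 1, 400, 42, 7 };
    uint8_t buf[sizeof msg];
    volatile uint8_t sink = 0;   /* keeps the optimizer honest */
    clock_t t0 = clock();
    long i;
    for (i = 0; i < iterations; i++) {
        serialize(&msg, buf);
        sink ^= buf[0];
    }
    (void)sink;
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

On most hardware a few memcpy calls per message take nanoseconds; compare that against the round-trip time of the socket before calling it a bottleneck.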

John
* Yes, speed is an important issue here; I forgot to mention the real-time environment in my question. * But you're right, I probably should measure the serialization overhead.
Axel
A: 

It is strongly recommended to serialize the data in some way, or at least to include an indicator of the hardware architecture. Even if you use the same compiler, you can have problems with internal data representations (little endian vs. big endian, etc.).
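
One lightweight form of such an indicator is a byte-order mark in the message header. A sketch (the constant and function names are made up for illustration):

```c
#include <stdint.h>

/* A fixed marker sent at the start of every message lets the receiver
   detect whether the sender shares its byte order. The value is
   arbitrary but asymmetric, so its byte-swapped form is unmistakable. */
#define BYTE_ORDER_MARK 0x01020304u

static int sender_matches_local_order(uint32_t received_mark)
{
    if (received_mark == BYTE_ORDER_MARK)
        return 1;   /* same endianness: payload bytes usable as-is */
    /* Receiving 0x04030201 here would mean the sender's order is
       reversed and every multi-byte field needs swapping. */
    return 0;
}
```

Even if both current endpoints are little-endian, the mark costs four bytes and future-proofs the protocol against a big-endian peer.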

Juliano
A: 

If you don't want to go through serialization, one method I've seen used is to eschew enums and simply use 32-bit unsigned ints and #defines to emulate them. You trade away some type safety for some assurances about the data format.
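
That trade might be sketched as follows (names are illustrative; the point is that the wire layout no longer depends on either compiler's enum representation):

```c
#include <stdint.h>

/* Enum-free variant: the state values become plain macros and the
   field becomes a fixed-width integer on both ends. */
#define STATE_READY    400u
#define STATE_RUNNING  401u
#define STATE_BLOCKED  402u
#define STATE_FINISHED 403u

typedef struct {
    uint32_t mState;   /* always exactly 4 bytes, on any compiler */
} tStateMessage;
```

The compiler will no longer catch an assignment of an unrelated integer to mState, which is exactly the type safety being given up.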

Otherwise, you are relying on behaviour that isn't guaranteed by the language specification to be implemented the same way on all your compilers. If you aren't worried about general portability and just want to ensure the same effect on two compilers, it should be possible, through trial and error and a lot of testing, to get the two to do the same thing. I believe the C99 spec allows an enum to be stored internally in a type the size of int or smaller, but not larger than int. So one thing I've seen done to supposedly hint the compiler in the right direction is:

#include <limits.h>   /* for INT_MAX */

typedef enum
{
    READY    = 400,
    RUNNING  = 401,
    BLOCKED  = 402,
    FINISHED = 403,
    MAX      = INT_MAX
} eState;

This should limit the compiler's choices for how to store the enum. Note that compilers can still deviate from the standard, though: I know gcc, for example, has a non-standard extension that allows 64-bit enums when the enumerator values require them.

Also, check out: http://stackoverflow.com/questions/366017/what-is-the-size-of-an-enum-in-c

bdk
* Trading away the type safety would be an option; however, it's something I'd prefer not to give up. * And thx for the dummy "MAX" entry idea!
Axel