hey,

let's say we have a binary protocol, with fields in network byte order (big endian).

struct msg1
{
    int32 a;
    int16 b;
    uint32 c;
};

If, instead of copying the network buffer into my msg1 and then using the "networkToHost" functions to read msg1,

I rearrange/reverse msg1 to

struct msg1
{
    uint32 c;
    int16 b;
    int32 a;
};

and simply do a reverse copy from the network buffer to create msg1, then there is no need for the networkToHost functions. This approach doesn't work on big-endian machines, but for me that is not a problem. Apart from that, is there any other drawback that I'm missing?

thanks

P.S. For the above we enforce strict alignment (#pragma pack(1), etc.)
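
To make it concrete, something like the following sketch is what I have in mind (the typedefs and the decode function name are just for illustration; it only works on a little-endian host):

#include <algorithm>   // std::reverse_copy
#include <cstdint>

// Stand-ins for the type names used above.
typedef std::int16_t  int16;
typedef std::int32_t  int32;
typedef std::uint32_t uint32;

#pragma pack(push, 1)
struct msg1                     // fields declared in reverse of the wire order
{
    uint32 c;
    int16 b;
    int32 a;
};
#pragma pack(pop)

// Byte-reverse the whole wire image into the reversed struct (little endian only).
inline msg1 decode_msg1_reversed(const unsigned char *buf)   // buf holds sizeof(msg1) wire bytes
{
    msg1 m;
    std::reverse_copy(buf, buf + sizeof(msg1),
                      reinterpret_cast<unsigned char *>(&m));
    return m;
}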

+6  A: 

Are you sure this is required? More than likely, your network traffic is going to be your bottleneck, rather than CPU speed.

rlbond
+2  A: 

Depending on how your compiler packs the bytes inside a struct, the 16-bit number in the middle might not end up in the right place. It might be stored in a 32-bit field and when you reverse the bytes it will "vanish".

Seriously, tricks like this may seem cute when you write them but in the long term they simply aren't worth it.

edit

You added the "pack 1" information so the bug goes away but the thing about "cute tricks" still stands - not worth it. Write a function to reverse 32-bit and 16-bit numbers.

inline void reverse(int16 &n)
{
  // Swap the two bytes; go through an unsigned type so the shifts are well defined.
  uint32 u = static_cast<uint32>(n) & 0xffffu;
  n = static_cast<int16>(((u << 8) | (u >> 8)) & 0xffffu);
}

inline void reverse(int32 &n)
{
  // Reverse all four bytes.
  uint32 u = static_cast<uint32>(n);
  n = static_cast<int32>((u >> 24) | ((u >> 8) & 0xff00u) |
                         ((u << 8) & 0xff0000u) | (u << 24));
}
Jimmy J
thanks for your replies guys. Let me explain a bit more: 1) the trick is only for incoming msgs, read the msg and construct something useful for my application; 2) the reason I did that wasn't performance, I just hate having getters that would do conversion to host order
Don't write a function. Use ntohs, ntohl, htons, and htonl. They already exist.
jmucchiello
@joe_mucchiello, Maybe these overrides can call ntohs and ntohl. I would personally call them "networkToHost" or something, and have them return a value instead of operating on a reference.
strager
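
@strager's suggestion might look something like the following sketch: thin overloads that call ntohs/ntohl and return by value (the names are illustrative; <arpa/inet.h> is the POSIX header, Windows uses <winsock2.h>):

#include <arpa/inet.h>   // ntohs, ntohl
#include <cstdint>

// Return-by-value wrappers so call sites read naturally.
inline std::uint16_t networkToHost(std::uint16_t n) { return ntohs(n); }
inline std::uint32_t networkToHost(std::uint32_t n) { return ntohl(n); }
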
+5  A: 

Agree with @rlbond -

This has great potential to be very confusing to developers, since they'll have to work to keep these two semantically identical structures separate.

Given that network latency is on the order of 10,000,000x greater than the time the CPU would take to do the conversion, I'd just keep them the same.

codekaizen
+17  A: 

Apart from that, is there any other drawback that I miss?

I'm afraid you've misunderstood the nature of endian conversion problems. "Big endian" doesn't mean your fields are laid out in reverse, so that a

struct msg1_bigendian
{
    int32 a;
    int16 b;
    uint32 c;
};

on a big endian architecture is equivalent to a

struct msg1_littleendian 
{
   uint32 c;
   int16 b;
   int32 a;
};

on a little endian architecture. Rather, it means that the byte-order within each field is reversed. Let's assume:

a = 0x1000000a;
b = 0xb;
c = 0xc;

On a big-endian architecture, this will be laid out as:

10 00 00 0a
00 0b
00 00 00 0c

The high-order (most significant) byte comes first.

On a little-endian machine, this will be laid out as:

0a 00 00 10
0b 00
0c 00 00 00

The lowest order byte comes first, the highest order last.

Serialize them and overlay the serialized form of the messages on top of each other, and you will discover the incompatibility:

10 00 00 0a  00 0b  00 00 00 0c (big endian)
0a 00 00 10  0b 00  0c 00 00 00 (little endian)

   int32 a  int16 b   uint32 c

Note that this isn't simply a case of the fields running in reverse. Your proposal would result in a little-endian machine misinterpreting the big-endian representation as:

a = 0xc000000; b = 0xb00; c = 0xa000010;

Certainly not what was transmitted!

You really do have to convert every individual field to network byte order and back again, for every field transmitted.
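
For example, receive-side decoding with per-field conversion might look like this sketch (packed struct, fixed-width types, and the POSIX <arpa/inet.h> header assumed; the decode function name is made up):

#include <arpa/inet.h>   // ntohl, ntohs
#include <cstdint>
#include <cstring>

#pragma pack(push, 1)
struct msg1
{
    std::int32_t a;
    std::int16_t b;
    std::uint32_t c;
};
#pragma pack(pop)

inline msg1 decode_msg1(const unsigned char *buf)   // buf holds the 10 wire bytes
{
    msg1 m;
    std::memcpy(&m, buf, sizeof m);   // raw copy; fields are still big endian
    m.a = static_cast<std::int32_t>(ntohl(static_cast<std::uint32_t>(m.a)));
    m.b = static_cast<std::int16_t>(ntohs(static_cast<std::uint16_t>(m.b)));
    m.c = ntohl(m.c);
    return m;
}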

UPDATE:

Ok, I understand what you are trying to do now. You want to define the struct in reverse, then memcpy from the end of the byte string to the beginning (reverse copy) and reverse the byte order that way. In which case I would say, yes, this is a hack, and yes, it makes your code un-portable, and yes, it isn't worth it. Converting between byte orders is not, in fact, a very expensive operation and it is far easier to deal with than reversing the layout of every structure.

A: 

thanks for your replies guys.

let me explain a bit more:

1) The trick is only for incoming msgs: read the msg and construct something useful for my application.

2) The reason I did that wasn't performance; I just hate having getters that would do the conversion to host order.

I know it's confusing, but the protocol isn't going to change often, and there will be comments in the code explaining it.

Don't answer your own question to add details; edit the question instead.
jmucchiello
A: 

Please do the programmers that come after you a favor and write explicit conversions to and from a sequence of bytes in some buffer. Trickery with structures will lead you straight into endianness and alignment hell (been there).
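
A sketch of that kind of explicit decoding: read each field byte by byte from the buffer, so neither host endianness nor struct padding matters (the function names are illustrative only):

#include <cstdint>

inline std::uint32_t read_u32_be(const unsigned char *p)
{
    // Assemble a 32-bit value from four big-endian bytes.
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8)  |  std::uint32_t(p[3]);
}

inline std::uint16_t read_u16_be(const unsigned char *p)
{
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}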

starblue
+1  A: 

Unless you can demonstrate that there is a significant performance penalty, you should use the same code to transfer data onto and off the network regardless of the endianness of the machine. As an optimization, on platforms where the network byte order is the same as the hardware byte order, you can use tricks, but keep alignment requirements and the like in mind.

In the example, many machines (especially, as it happens, big-endian ones) will require a 2-byte pad between the end of the int16 member and the following uint32 member. So, although you can read into a 10-byte buffer, you cannot treat that buffer as an image of the structure - which will be 12 bytes on most platforms.
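
To make the padding point concrete, a quick sizeof comparison (the 12-byte figure assumes a typical ABI where 32-bit integers are 4-byte aligned; it is not guaranteed by the standard):

#include <cstdint>
#include <cstdio>

struct msg1_natural             // natural alignment: padding usually follows b
{
    std::int32_t a;
    std::int16_t b;
    std::uint32_t c;
};

#pragma pack(push, 1)
struct msg1_packed              // packed: exactly the 10 wire bytes
{
    std::int32_t a;
    std::int16_t b;
    std::uint32_t c;
};
#pragma pack(pop)

int main()
{
    std::printf("%zu %zu\n", sizeof(msg1_natural), sizeof(msg1_packed));   // typically prints "12 10"
    return 0;
}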

Jonathan Leffler
+1  A: 

As you say, this is not portable to big-endian machines. That is an absolute dealbreaker if you ever expect your code to be used outside of the x86 world. Do the rest of us a favor and just use the ntoh/hton routines, or you'll probably find yourself featured on thedailywtf someday.

HUAGHAGUAH