If you are against text serialization and really want a struct, then do what most network protocols do: apply "host to network" to every multi-byte field when sending and "network to host" when receiving. The idea is that every sender, whatever its own endianness, always translates to network byte order, which is big-endian. Every receiver then translates from network order to its own (on a big-endian host that is a no-op).
There are already APIs for this: ntohs (network to host short, for 16-bit fields) and ntohl (network to host long, for 32-bit fields), and of course their counterparts htons and htonl. On POSIX systems they are declared in <arpa/inet.h>.
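To make the effect of these calls concrete, here is a minimal sketch. The helper name wire_bytes_u32 is made up for illustration, and the header assumes a POSIX system (on Windows the same functions come from <winsock2.h>). It shows that after htonl the most significant byte comes first in memory, whatever the host is.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX header; assumption about the platform) */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the network-order form of value into out[0..3].
   out[0] always receives the most significant byte, on any host. */
void wire_bytes_u32(uint32_t value, unsigned char out[4])
{
    uint32_t net = htonl(value);    /* host order -> network (big-endian) order */
    memcpy(out, &net, sizeof net);  /* look at the bytes exactly as they sit in memory */
}
```

For example, passing 0x11223344 fills the array with 0x11, 0x22, 0x33, 0x44 in that order, and ntohl(htonl(x)) always gives back x.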
So for a struct like this:
typedef struct
{
    unsigned char  stuff1;
    unsigned char  stuff2;
    unsigned short stuff3;
    unsigned int   stuff4;
} tFoo;
The sending code would do something like:
tFoo to_send;
to_send.stuff1 = local_stuff1;
to_send.stuff2 = local_stuff2;
to_send.stuff3 = htons(local_stuff3);
to_send.stuff4 = htonl(local_stuff4);
The receiving code would do something like:
local_stuff1 = from.stuff1;          /* single bytes need no conversion */
local_stuff2 = from.stuff2;
local_stuff3 = ntohs(from.stuff3);
local_stuff4 = ntohl(from.stuff4);
Note that the packing/alignment of the struct matters. You have to be wary of alignment, which isn't always the same from compiler to compiler, or even for the same compiler on different CPU architectures. The sizes of the datatypes themselves can change too (an int may not be the same size from arch to arch). I tried to demonstrate that a bit by putting the two 8-bit chars first, followed by a 16-bit and a 32-bit field, so that every field is naturally aligned and the total is 8 bytes. You have to be certain, when you port to a different compiler/arch, that you are indeed getting the correct packing and size.
For that reason most people choose serialization, which is probably why most people have answered with that. It is the least error-prone.
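One way to be certain is to let the compiler check the layout for you. The sketch below assumes a C11 compiler (for static_assert), and the expected sizes and offsets are the ones described above; if a port breaks any of them, the build fails instead of the protocol.

```c
#include <assert.h>  /* static_assert (C11) */
#include <limits.h>  /* CHAR_BIT */
#include <stddef.h>  /* offsetof */

typedef struct
{
    unsigned char  stuff1;
    unsigned char  stuff2;
    unsigned short stuff3;
    unsigned int   stuff4;
} tFoo;

/* Compile-time checks of the layout the wire format relies on. */
static_assert(CHAR_BIT == 8,                "bytes must be 8 bits");
static_assert(sizeof(unsigned short) == 2,  "stuff3 must be 16 bits");
static_assert(sizeof(unsigned int) == 4,    "stuff4 must be 32 bits");
static_assert(offsetof(tFoo, stuff2) == 1,  "unexpected padding before stuff2");
static_assert(offsetof(tFoo, stuff3) == 2,  "unexpected padding before stuff3");
static_assert(offsetof(tFoo, stuff4) == 4,  "unexpected padding before stuff4");
static_assert(sizeof(tFoo) == 8,            "tFoo must be exactly 8 bytes");
```

Using the fixed-width types from <stdint.h> (uint8_t, uint16_t, uint32_t) removes the size uncertainty, but the offset and total-size checks are still worth keeping because padding remains up to the compiler.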
Good luck.