I am porting a C application to an ARM platform. The application also runs on an x86 processor and must remain backward compatible.

I am now having some issues with variable alignment. I have read the gcc manual for __attribute__((aligned(4),packed)). I interpret it as saying that the start of the struct is aligned to a 4-byte boundary and the inside remains untouched because of the packed attribute.

Originally I had this, but occasionally it gets placed off the 4-byte boundary:

typedef struct
{
    unsigned int code;
    unsigned int length;
    unsigned int seq;
    unsigned int request;
    unsigned char nonce[16];
    unsigned short crc;
} __attribute__((packed)) CHALLENGE;

So I changed it to this:

typedef struct
{
    unsigned int code;
    unsigned int length;
    unsigned int seq;
    unsigned int request;
    unsigned char nonce[16];
    unsigned short crc;
} __attribute__((aligned(4),packed)) CHALLENGE;

The understanding I stated earlier seems to be incorrect: the struct is now aligned to a 4-byte boundary, and the inside data is now also aligned to a 4-byte boundary, but because of the endianness, the size of the struct has increased from 42 to 44 bytes. This size is critical, as we have other applications that depend on the struct being 42 bytes.

Could someone describe to me how to perform the operation that I require? Any help is much appreciated.

A: 

I would guess that the problem is that 42 isn't divisible by 4, and so they get out of alignment if you put several of these structs back to back (e.g. allocate memory for several of them, determining the size with sizeof). Having the size as 44 forces the alignment in these cases as you requested. However, if the internal offset of each struct member remains the same, you can treat the 44 byte struct as though it was 42 bytes (as long as you take care to align any following data at the correct boundary).

One trick to try might be putting both of these structs inside a single union type and using only the 42-byte version from within each such union.

Arkku
Note that this "back to back" allocation happens automatically in arrays, which is why the size of the type *must* include those padding bytes to maintain alignment. You can't change array layout with any tricks, and I would not suggest using them anyway.
Roger Pate
+4  A: 

If you're depending on sizeof(yourstruct) being 42 bytes, you're about to be bitten by a world of non-portable assumptions. You haven't said what this is for, but it seems likely that the endianness of the struct contents matters as well, so you may also have a mismatch with the x86 there too.

In this situation I think the only sure-fire way to cope is to use unsigned char[42] in the parts where it matters. Start by writing a precise specification of exactly what fields are where in this 42-byte block, and what endian, then use that definition to write some code to translate between that and a struct you can interact with. The code will likely be either all-at-once serialisation code (aka marshalling), or a bunch of getters and setters.

crazyscot
Ooh, hello Crazyscot. I upvoted your answer before I noticed who you were :-)
Vicky
While I agree with everything else, I'm not sure why you recommend using a char array.
Roger Pate
@Roger: I'm presuming that the OP needs to hold the struct in-memory in the mandated form as well as in a form they can more easily manipulate - unless you're making some other point which I've missed?
crazyscot
The data struct is essentially a data packet; just before sending, I ensure htonl/htons are used on the relevant members. I think that marshalling will be the right option. I will look at how easy it is to implement, as there are about 100 structs that are similar. Thank you very much for your reply.
David Ashmore
@Mumbles: If you can use C++ instead of C, you can get it done by writing just a tiny bit of code for each struct (similar to how boost::serialize works). Otherwise (or even in C++, depending), I'd generate the code for your structs so you can use the same input file to generate the serialization functions and always know they're in sync.
Roger Pate
@Roger: Unfortunately this part of the code has to stay in C. I think I am going to have to create serialization functions for my structs. Thank you very much.
David Ashmore
@Mumbles: With 100 structs you'd be well advised to automate that process. At a previous workplace we had a very powerful Perl-based code generator which did this for us. It was fiendishly complex, but it produced output for multiple languages and allowed structs to contain other structs...
crazyscot
+2  A: 

This is one reason why reading whole structs at once, instead of member by member, fails and should be avoided.

In this case, packing plus aligning at 4 means there will be two bytes of padding. This happens because the size must be compatible for storing the type in an array with all items still aligned at 4.

I imagine you have something like:

read(fd, &obj, sizeof obj);

Because you don't want to read those 2 padding bytes which belong to different data, you have to specify the size explicitly:

read(fd, &obj, 42);

Which you can keep maintainable:

typedef struct {
  //...
  enum { read_size = 42 };
} __attribute__((aligned(4),packed)) CHALLENGE;

// ...

read(fd, &obj, obj.read_size);

Or, if you can't use some features of C++ in your C:

typedef struct {
  //...
} __attribute__((aligned(4),packed)) CHALLENGE;
enum { CHALLENGE_read_size = 42 };

// ...

read(fd, &obj, CHALLENGE_read_size);

At the next refactoring opportunity, I would strongly suggest you start reading each member individually, which can easily be encapsulated within a function.

Roger Pate
+2  A: 

What is your true goal?

If it's to deal with data that's in a file or on the wire in a particular format, what you should do is write some marshaling/serialization routines that move the data between the compiler struct, which represents how you want to deal with the data inside the program, and a char array, which represents how the data looks on the wire or in the file.

Then the marshaling routines are the only part that needs to be handled carefully and that might need platform-specific code. And you can write some nice-n-nasty unit tests to ensure that the marshaled data gets to and from the struct properly no matter what platform you might have to port to today and in the future.

Michael Burr
+1. Exactly the point I briefly touched on.
Roger Pate
The goal of this struct is to be a network packet. I very much like the idea of having an internal structure that is aligned by the compiler so that it fits correctly, and then only constructing this packet as and when needed.
David Ashmore
A: 

As I am using Linux, I have found that running echo 3 > /proc/cpu/alignment makes the kernel issue a warning and fix up the misaligned access. This is a workaround, but it is very helpful for locating where the structures are misaligned.

David Ashmore