This has been driving me nuts for days. I can't get an array to align to 16 if I declare it as static.

Any help greatly appreciated.

Revised Version:

#include <stdio.h>
#include <assert.h>

#define MAX_INPUTS 250

int main()
{
float input[MAX_INPUTS] __attribute__ ((__aligned__(16)));
printf("Address of input: %p\n", input);

printf("Assert1: %x\n", ( ((int) (input))      )        );
printf("Assert2: %x\n", ( ((int) (input)) % 16 )        );
printf("Assert3: %x\n", ( ((int) (input)) % 16 ) == 0   );

assert (     ( ((int) (input))      )        );  
assert (     ( ((int) (input)) % 16 )        );  /* Fails */
assert (     ( ((int) (input)) % 16 ) == 0   );  /* Passes */

return 0;
}

The output is:

Address of input: 0022FB70
Assert1: 22fb70
Assert2: 0
Assert3: 1
Assertion failed: ( ((int) (input)) % 16 ), file aligntest.c, line 16

As one would expect, Assert 2 fails because the address ends in 0. However, with:

static float input[MAX_INPUTS] __attribute__ ((__aligned__(16)));

the output is:

Address of input: 00404028
Assert1: 404028
Assert2: 8
Assert3: 1
Assertion failed: ( ((int) (input)) % 16 ), file aligntest.c, line 16

Assert 2 still fails, although the result is non-zero. When Assert2 is commented out, Assert3 passes (with or without the static declaration) and the program terminates normally.

I'm using MinGw gcc 4.4.0 on an Intel Core 2 Duo, running XP Pro.

A: 
DigitalRoss
+1  A: 

From the GCC documentation for the aligned variable attribute:

Note that the effectiveness of aligned attributes may be limited by inherent limitations in your linker. On many systems, the linker is only able to arrange for variables to be aligned up to a certain maximum alignment. (For some linkers, the maximum supported alignment may be very very small.) If your linker is only able to align variables up to a maximum of 8 byte alignment, then specifying aligned(16) in an __attribute__ will still only provide you with 8 byte alignment. See your linker documentation for further information.

I know why the assert isn't happening, it's because the expression is true - not sure exactly why the expression is true, but you should break it down in cases like this. Add these to your debug statements:

printf("Assert1: %x\n", ( ((int) (input))));
printf("Assert2: %x\n", ( ((int) (input)) % 16 ));
printf("Assert3: %x\n", ( ((int) (input)) % 16 ) == 0);

and show us the results.

Also check which gcc version you're running - 4.3.1 and earlier appear to have a problem with alignment. Cygwin appears to have both gcc3 and gcc4 packages, assuming that's what you're using - if not, still check the version.

Update 1: Actually I think @Falaina has nailed it in a comment below. Here's a plausible explanation.

GCC is figuring out from the source code that input is indeed (supposed to be) aligned to 16 bytes. It's smart enough to drop the asserts altogether and to just print out 1 for the printfs.

However, at the link stage, the linker (not as capable as GCC) is not able to guarantee alignment to 16 bytes, instead opting for 8 (see my quote above). By then it's too late to get the asserts and the non-optimized printfs back into the code. So the actual executable will not assert (since the asserts have been taken out) and it will print the optimized 1 rather than calculating it at runtime.

The reason the volatile fixes it in @pmg's answer is because GCC will not optimize the expressions that contain volatile components. It leaves the asserts in and calculates the print arguments at runtime properly.

If this turns out to be the case, this is without a doubt one of the more devious problems I've seen. I hesitate to call it a bug since both gcc and ld are acting as advertised - it's the combination of factors that are screwing things up.
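To make the volatile point concrete, here is a minimal sketch of a run-time alignment check. The helper name is_aligned_at_runtime is hypothetical (it is not from the question or from @pmg's code), and the sketch assumes a C99 compiler that provides uintptr_t. Because the address is read back from a volatile object, the compiler should not be able to fold the remainder into a constant based on the aligned attribute.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of the original code.
   The address is stored in a volatile object and read back, so the
   compiler has to compute the remainder at run time instead of
   assuming the alignment promised by __attribute__((aligned)). */
int is_aligned_at_runtime(const void *ptr, uintptr_t align) {
  volatile uintptr_t addr = (uintptr_t)ptr;
  assert(align != 0);            /* a zero alignment makes no sense */
  return (addr % align) == 0;
}
```

With something like this, assert(is_aligned_at_runtime(input, 16)) should fail for the static array that lands on an address ending in 8, which is the behavior the original asserts were meant to have.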

paxdiablo
Assert1: 404028, Assert2: 8. I'm using MinGW, and have just now upgraded from gcc (GCC) 4.3.0 20080305 (alpha-testing) mingw-20080502 to gcc (GCC) 4.4.0, but to no avail.
Ian Shaw
You get 8 for assert2 and 1 for the original assert still? Please confirm because that's dead wrong. Please add assert 3 and recheck.
paxdiablo
Assert2 fails, even though the printf reports 8.
Address of input: 00404028
Assert1: 404028
Assert2: 8
Assert3: 1
Assertion failed: ( ((int) (input)) % 16 ), file aligntest.c, line 25
Assert3 passes (when I comment out Assert2).
Ian Shaw
Just a suggestion. Are you compiling with optimizations? It's possible the compiler is trying to be clever and going "Hey, this variable is aligned to 16 bytes, obviously the address % 16 is 0" and replacing all your checks with 1. Just a thought.
Falaina
You've nailed it, I believe, @Falaina. I'd be happy for you to copy my update 1 into your own answer and get it accepted since it was your insight that made it click. @pmg deserves a few votes too, since that answer supplied another important piece of the puzzle.
paxdiablo
Ah, that's not necessary, plus you provided a very helpful breakdown. If this is indeed what is occurring, I think I would consider it a bug. GCC (well, its devs) can certainly make the argument "I asked for 16 byte alignment, so I can assume it for the rest of the compilation", but that's certainly less conservative than I'd like my compiler to be, especially in regard to something like assertions.
Falaina
You may as well take the rep. With all my editing of my answer, it's gone community wiki anyway, I'm already at my 200 daily cap, and I don't need any more rep - I'm actually taking a sabbatical now that I've hit 50K but we'll see how long that lasts :-)
paxdiablo
Haha, I'm about to go out so I'll pass, but I'd urge @pmg to edit your breakdown into his answer then, since he did quite a bit of work in figuring out what worked and what didn't.
Falaina
Thank you, Pax, for editing my answer :)
pmg
A: 

The conversion from a pointer (to an array of float) to int is not necessarily meaningful. If the pointer is larger than an int, you lose some information, which may be the low-order bits of the pointer.

Try this:

#include <stdio.h>
#include <string.h>

void print_pointer(void *ptr) {
  unsigned char data[sizeof (void*)];
  size_t k;

  memmove(data, &ptr, sizeof (void*));
  printf("ptr: ");
  for (k=0; k<sizeof (void*); k++) {
    printf(" %02x", data[k]);
  }
  puts("");
}

Do the same for an int

void print_int(int value) { /* ... */ }

and compare your findings.
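As a complement to the byte dump above, the following is a minimal sketch of an alignment check that avoids the cast to int altogether. The helper name misalignment is hypothetical, and the sketch assumes a C99 compiler that provides uintptr_t in <stdint.h>, an integer type wide enough to hold a pointer value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the original answer.
   Returns the remainder of the address modulo `align`. The address
   goes through uintptr_t rather than int, so no pointer bits are
   dropped when a pointer is wider than an int. 0 means aligned. */
size_t misalignment(const void *ptr, size_t align) {
  assert(align != 0);            /* a zero alignment makes no sense */
  return (size_t)((uintptr_t)ptr % align);
}
```

For example, misalignment(input, 16) would be expected to give 8 for the address 00404028 reported in the question. On a 32-bit MinGW target, int and pointers have the same size, so the truncation concern is unlikely to be the cause there, but the uintptr_t form stays correct on 64-bit targets as well.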

pmg
I get the same output for both functions. ptr: 28 40 40 00
Ian Shaw
+2  A: 

On my machine at work (Windows Vista, MinGW gcc 4.3.2) your code didn't produce any assembler for the asserts at any optimization level!

To get the asserts to be generated I had to come up with a volatile int variable and compile with -O0 flag.

int main(void) {
  float input[MAX_INPUTS] __attribute__ ((__aligned__(16)));
  static float input_static[MAX_INPUTS] __attribute__ ((__aligned__(16)));
  volatile int addr_as_int;

  printf("Address of input: %p\n", &input);
  addr_as_int = (int)input;
  print_pointer(input);
  print_int(addr_as_int);
  printf("normal int: %08x; int%%16: %02x\n", addr_as_int, addr_as_int%16);
  printf("Assert: %d\n", (addr_as_int % 16) == 0);
  assert((addr_as_int % 16) == 0); /* Passes */

  printf("Address of input_static: %p\n", &input_static);
  addr_as_int = (int)input_static;
  print_pointer(input_static);
  print_int(addr_as_int);
  printf("static int: %08x; int%%16: %02x\n", addr_as_int, (addr_as_int)%16);
  printf("Assert: %d\n", (addr_as_int % 16) == 0);
  assert((addr_as_int % 16) == 0); /* Does not Pass */

  return 0;
}

I have no idea why the compiler chose to remove the asserts from the object file. A quick Google search didn't reveal anything interesting.

Update 1 (added by @Pax at suggestion of @Falaina - we all suggest you accept this one if it turns out to be the case):

Actually I think @Falaina has nailed it in a comment to @Pax's answer:

Just a suggestion. Are you compiling with optimizations? It's possible the compiler is trying to be clever and going "Hey, this variable is aligned to 16 bytes, obviously the address % 16 is 0" and replacing all your checks with 1. Just a thought.

Here's the explanation. GCC is figuring out from the source code that input is indeed (supposed to be) aligned to 16 bytes. It's smart enough to drop the asserts altogether and to just print out 1 for the printfs.

However, at the link stage, the linker is not able to guarantee alignment to 16 bytes, instead opting for 8 because (from @Pax):

Note that the effectiveness of aligned attributes may be limited by inherent limitations in your linker. On many systems, the linker is only able to arrange for variables to be aligned up to a certain maximum alignment. (For some linkers, the maximum supported alignment may be very very small.) If your linker is only able to align variables up to a maximum of 8 byte alignment, then specifying aligned(16) in an __attribute__ will still only provide you with 8 byte alignment. See your linker documentation for further information.

By then it's too late to get the asserts and the non-optimized printfs back into the code. So the actual executable will not assert (since the asserts have been taken out) and it will print the optimized 1 rather than calculating it at runtime.

The reason the volatile fixes it in my answer is because GCC will not optimize the expressions that contain volatile components. It leaves the asserts in and calculates the printf arguments at runtime properly.


You can manually align your array if you don't mind declaring it a little bit larger than strictly necessary:

#include <assert.h>
#include <stdio.h>

#define MAX_INPUTS 250

void *force_align(void *base, size_t s, int align) {
  size_t x;
  int k = 0;
  x = (size_t)base;
  while ((k < align / (int)s) && (x % align)) {
    k++;
    x += s;
  }
  /* still misaligned after skipping up to align/s elements: give up */
  if (x % align) return NULL;
#if 0
  printf("%d elements 'discarded'\n", k);
#endif
  return (void*)((size_t)base + k*s);
}

int main(void) {
  #define ALIGNMENT_REQ 16
  #define EXTRA_ALIGN_REQ (ALIGNMENT_REQ / sizeof (float))
  static float misaligned_input[MAX_INPUTS + EXTRA_ALIGN_REQ]
        __attribute__ ((__aligned__(ALIGNMENT_REQ)));
  float *input;

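  /* manual alignment, check for NULL; the assignment is kept outside
     assert() so that it still happens when NDEBUG is defined */
  input = force_align(misaligned_input, sizeof *input, ALIGNMENT_REQ);
  assert(input != NULL);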

  printf("Address of misaligned input: %p\n", misaligned_input);
  printf("Address of input: %p\n", input);
  printf("Assert1: %x\n", ( ((int) (input))                 )      );
  printf("Assert2: %x\n", ( ((int) (input)) % ALIGNMENT_REQ )      );
  printf("Assert3: %x\n", ( ((int) (input)) % ALIGNMENT_REQ ) == 0 );
  assert ( ( ((int) (input))                 )      );
#if 0
  assert ( ( ((int) (input)) % ALIGNMENT_REQ )      );  /* Fails */
#endif
  assert ( ( ((int) (input)) % ALIGNMENT_REQ ) == 0 );  /* Passes */

  return 0;
}
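
A shorter alternative to the loop in force_align is to round the address up with a bit mask. This is only a sketch of a different technique, not what the code above does: the name align_up is hypothetical, it assumes the alignment is a power of two, and it relies on the conversion between pointers and uintptr_t round-tripping in the usual way, which holds on common flat-memory targets but is implementation-defined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the original answer.
   Rounds `base` up to the next multiple of `align`, which must be a
   power of two. The caller has to over-allocate the buffer by at
   least align - 1 bytes so the returned pointer stays inside it. */
void *align_up(void *base, size_t align) {
  uintptr_t addr = (uintptr_t)base;
  uintptr_t mask = (uintptr_t)align - 1;
  assert(align != 0 && (align & (align - 1)) == 0);
  return (void *)((addr + mask) & ~mask);
}
```

With the declarations from main above, this would be used as input = align_up(misaligned_input, ALIGNMENT_REQ); the EXTRA_ALIGN_REQ extra floats already provide the 16 bytes of slack it needs.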
pmg
@pmg, between this answer and a comment by @Falaina to my answer, you guys have nailed it. The reason you need the volatile - without it, gcc assumes it is aligned(16) and optimizes away the asserts (and just prints 1). But the linker doesn't honor that although it's too late to change the object file by then. See my update to your answer for the gory details.
paxdiablo
Well, I wasn't compiling with any optimization for my test program, but I'll certainly be wanting to, so this makes sense. I find it hard to believe that the linker will align auto variables, but not static. At least, not deliberately. I tried searching to find out what alignment it supports, but struck out. That leaves me wondering whether there are some system-specific compilation settings I'm missing. I'll try to see what happens when compiling on another PC.
Ian Shaw
@Ian Shaw - The linker isn't in charge of alignment for auto variables. The compiler can generate assembly that guarantees alignment for stack variables (for example, if it bitwise-ANDs the stack pointer with -16 before allocating the stack space, alignment is guaranteed). The compiler cannot guarantee the same for globals (and statics are implemented as globals at the end of the day); the best it can do is tell the linker it wants a certain alignment for the variable.
Falaina
Also, I don't know if you've tried yet Ian, but you should definitely attempt to rerun your program with volatile and report back with the results. If it fails correctly at the assertion, then the issue is solved (though you may be out of luck with aligning your variables :(
Falaina
Alternatively you could (1) have an array of structs: "typedef struct {float input; float rubbish;} inp_str; inp_str input[250];" and use input[].input to ensure the inputs are aligned; or (2) make your array input[MAX_INPUTS*2] and only use the even indexes. But now we're getting into serious kludge territory :-)
paxdiablo
@Falaina - The output of this version works as expected: the assertion fails when the input address % 16 is non-zero. This works for any level of optimisation, the only difference being the address reported for "input". This use of "volatile" resolves the assertion issue, but, sadly, not the main issue of getting an aligned array, as you note.
Ian Shaw
@Pax - I'm not sure your suggestions will work for my application. I'll be wanting to load contiguous blocks of four floats into a 128-bit SSE register, and I think that will not be possible if I pad out the structure with unused elements.
Ian Shaw
My colleague has compiled the same file on his PC, and got aligned inputs. We are both using the same linker: GNU ld (GNU Binutils) 2.19.1. I am starting to think it must be something about my system.
Ian Shaw