Can someone provide an example where casting a pointer from one type to another fails due to misalignment?

In the comments to this answer, bothie states that doing something like

char * foo = ...;
int bar = *(int *)foo;

might lead to errors even on x86 if alignment-checking is enabled.

I tried to produce an error condition after setting the alignment-check flag via set $ps |= (1<<18) in GDB, but nothing happened.

What does a working (i.e. non-working ;)) example look like?


None of the code snippets from the answers fail on my system - I'll try it with a different compiler version and on a different pc later.

Btw, my own test code looked like this (now also using asm to set AC flag and unaligned read and write):

#include <assert.h>

int main(void)
{
    #ifndef NOASM
    __asm__(
        "pushf\n"
        "orl $(1<<18),(%esp)\n"
        "popf\n"
    );
    #endif

    volatile unsigned char foo[] = { 1, 2, 3, 4, 5, 6 };
    volatile unsigned int bar = 0;

    bar = *(int *)(foo + 1);
    assert(bar == 0x05040302);

    bar = *(int *)(foo + 2);
    assert(bar == 0x06050403);

    *(int *)(foo + 1) = 0xf1f2f3f4;
    assert(foo[1] == 0xf4 && foo[2] == 0xf3 && foo[3] == 0xf2 &&
        foo[4] == 0xf1);

    return 0;
}

The assertions pass without problems, even though the generated code definitely contains the unaligned accesses mov -0x17(%ebp), %edx and movl $0xf1f2f3f4,-0x17(%ebp).


So will setting AC trigger a SIGBUS or not? I couldn't get it to work on my Intel dual core laptop under Windows XP with any of the GCC versions I tested (MinGW-3.4.5, MinGW-4.3.0, Cygwin-3.4.4), whereas codelogic and Jonathan Leffler mentioned failures on x86...

+2  A: 
char *foo = "....";
foo++;
int *bar = (int *)foo;

The compiler would put foo on a word boundary, and then when you increment it, it's at word+1, which is invalid for an int pointer.

Paul Tomblin
It appears that Core2Duo in 64 bit mode doesn't produce this error.
Paul Tomblin
On x86 it's perfectly valid to access misaligned data as long as you've not set the flag to give an exception on unaligned data. It's just potentially slower.
John Burton
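Whether a pointer such as the incremented foo above really is misaligned for int can be checked by testing its address. The following is a minimal sketch, assuming C11 (`_Alignof`, `_Alignas`) and a platform where `_Alignof(int)` is greater than 1. The helper names `is_aligned` and `bumped_pointer_is_misaligned` are hypothetical, and the pointer-to-`uintptr_t` conversion is implementation-defined, although it behaves as expected on mainstream platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the address p is a multiple of align,
   0 otherwise. align must be a non-zero power of two. */
int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Returns 1 if a char pointer that starts on an int boundary is no longer
   int-aligned after being incremented by one byte.
   Assumes _Alignof(int) > 1, which holds on x86. */
int bumped_pointer_is_misaligned(void)
{
    _Alignas(int) char buf[2 * sizeof(int)] = {0};
    char *foo = buf;   /* aligned for int by construction */
    foo++;             /* now one byte past an int boundary */
    return is_aligned(buf, _Alignof(int)) && !is_aligned(foo, _Alignof(int));
}
```

This only reports misalignment. Whether dereferencing such a pointer as int actually faults still depends on the CPU and its flags, as discussed in the rest of the thread.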
+2  A: 

char *foo is probably aligned to int boundaries. Try this:

int bar = *(int *)(foo + 1);
jdigital
That is exactly what I did - did you check this? What happens that is not supposed to happen?
Christoph
+1  A: 
#include <stdio.h>

int main(int argc, char **argv)
{
  char c[] = "a";

  printf("%d\n", *(int*)(c));
}

This gives me a SIGBUS after setting set $ps |= (1<<18) in gdb, which apparently is thrown when address alignment is incorrect (amongst other reasons).

EDIT: It's fairly easy to raise SIGBUS:

int main(int argc, char **argv)
{
    /* EDIT: enable AC check */
    asm("pushf; "
        "orl $(1<<18), (%esp); "
        "popf;");

    char c[] = "1234567";
    char d[] = "12345678";
    return 0;
}

Looking at main's disassembly in gdb:

Dump of assembler code for function main:
....
0x08048406 <main+34>:   mov    0x8048510,%eax
0x0804840b <main+39>:   mov    0x8048514,%edx
0x08048411 <main+45>:   mov    %eax,-0x10(%ebp)
0x08048414 <main+48>:   mov    %edx,-0xc(%ebp)
0x08048417 <main+51>:   movl   $0x34333231,-0x19(%ebp)   <== BAM! SIGBUS
0x0804841e <main+58>:   movl   $0x38373635,-0x15(%ebp)
0x08048425 <main+65>:   movb   $0x0,-0x11(%ebp)

Anyhow, Christoph, your test program fails under Linux, raising a SIGBUS as it should. It's probably a Windows thing?


You can enable the Alignment Check bit in code using this snippet:

/* enable AC check */
asm("pushf; "
    "orl $(1<<18), (%esp); "
    "popf;");

Also, ensure that the flag was indeed set:

unsigned int flags;
asm("pushf; "
    "movl (%%esp), %0; "
    "popf; " : "=r"(flags));
fprintf(stderr, "%u\n", flags & (1u << 18));
codelogic
I fail to reproduce this with gcc (GCC) 3.4.5 (mingw-vista special r3), I'll try later with a different version...
Christoph
This was compiled with gcc 4.3.2 running kernel 2.6.27 on an Intel Core 2 Duo.
codelogic
Also, it doesn't throw a SIGBUS without setting the flag you mentioned.
codelogic
The reason it might be failing is that it's attempting to dereference an int from 2 byte char[], although like I said, it doesn't happen when the flag isn't set.
codelogic
perhaps it's a weird windows thing - neither gcc3.4.5 nor gcc4.3.0 produces failing code for me...
Christoph
also, I wanted the code to fail due to misalignment, so you should provide enough chars to not fail because of illegal data access...
Christoph
I have no idea why you think that "fairly easy SIGBUS" program would cause a SIGBUS. There is absolutely nothing wrong with it.
Paul Tomblin
Paul, it'll only raise SIGBUS if you set the AC flag like Christoph mentioned in the question, while debugging in gdb.
codelogic
but the program is totally valid. it's a bug in something else then. you don't read unaligned data there
Johannes Schaub - litb
@litb: Yes, the program is valid. However the instruction "movl $0x34333231,-0x19(%ebp)" is performing an assignment of the value 0x34333231 to the address starting at (%ebp - 0x19), which in this case is not aligned, hence the exception.
codelogic
Added an ASM snippet that should enable alignment check without requiring gdb.
codelogic
Is it possible that there's a bug in your code reading the flag?
Christoph
In old architectures (SPARC, Vax) cast of an unaligned char to int would produce an immediate core dump. But those compilers would align a char[] to a word boundary. It looks like gcc on Intel doesn't any more, possibly because it doesn't normally cause a core dump.
Paul Tomblin
@Christoph: not sure, but it seems to be doing the right thing. Prints '262144' only if AC is enabled, otherwise prints 0.
codelogic
@codelogic: there's a pair of parentheses missing - it should read `movl (%%esp), %0`
Christoph
Oops, my test program was already using (%%esp), which is why I was getting expected results. I must have fixed it after updating my answer.
codelogic
+13  A: 

The situations are uncommon where unaligned access will cause problems on an x86 (beyond having the memory access take longer). Here are some of the ones I've heard about:

  1. You might not count this as an x86 issue, but SSE operations must deal with properly aligned data or performance is severely impacted (though it does not produce an exception, which is what I thought occurred). Apparently Intel fixed this to make the performance impact less of an issue starting with 'Penryn' CPUs - it's still not as efficient as properly aligned data, but the impact is much less severe. Thanks to kquinn for pointing this out.

  2. interlocked operations must operate on aligned data to ensure they are atomic on multiprocessor systems (see https://blogs.msdn.com/oldnewthing/archive/2004/08/30/222631.aspx)

  3. and another possibility discussed by Raymond Chen is when dealing with devices that have hardware banked memory (admittedly an oddball situation) - https://blogs.msdn.com/oldnewthing/archive/2004/08/27/221486.aspx

  4. I recall (but don't have a reference for - so I'm not sure about this one) similar problems with unaligned accesses that straddle page boundaries that also involve a page fault. I'll see if I can dig up a reference for this.

And I learned something new when looking into this question (I was wondering about the "$ps |= (1<<18)" GDB command that was mentioned in a couple places). I didn't realize that x86 CPUs (starting with the 486 it seems) have the ability to cause an exception when a misaligned access is performed.

From Jeffrey Richter's "Programming Applications for Windows, 4th Ed":

Let's take a closer look at how the x86 CPU handles data alignment. The x86 CPU contains a special bit flag in its EFLAGS register called the AC (alignment check) flag. By default, this flag is set to zero when the CPU first receives power. When this flag is zero, the CPU automatically does whatever it has to in order to successfully access misaligned data values. However, if this flag is set to 1, the CPU issues an INT 17H interrupt whenever there is an attempt to access misaligned data. The x86 version of Windows 2000 and Windows 98 never alters this CPU flag bit. Therefore, you will never see a data misalignment exception occur in an application when it is running on an x86 processor.

This was news to me.

Of course the big problem with misaligned accesses is that when you eventually go to compile the code for a non-x86/x64 processor you end up having to track down and fix a whole bunch of stuff, since virtually all other 32-bit or larger processors are sensitive to alignment issues.
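One common way to sidestep this porting problem is to copy the bytes with memcpy instead of casting the pointer. The sketch below is not part of the original answer; the names `load_u32`, `store_u32` and `unaligned_roundtrip_check` are hypothetical. It assumes only standard C99 headers. Compilers typically turn a fixed-size memcpy like this into a single move on x86, while still generating safe code on alignment-sensitive targets.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address with no alignment requirement.
   The result is in the host's byte order. */
uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Write a 32-bit value to an address with no alignment requirement. */
void store_u32(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}

/* Self-check loosely modeled on the test program in the question:
   write and read back at an odd offset, and verify that the
   neighbouring bytes are left untouched. */
void unaligned_roundtrip_check(void)
{
    unsigned char foo[] = { 1, 2, 3, 4, 5, 6 };

    store_u32(foo + 1, UINT32_C(0xf1f2f3f4));
    assert(load_u32(foo + 1) == UINT32_C(0xf1f2f3f4));
    assert(foo[0] == 1 && foo[5] == 6);
}
```

Because the value is copied byte for byte, this works regardless of the AC flag or the target's alignment rules. Byte order is a separate portability concern and is not handled here.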

Michael Burr
"SSE operations must deal with aligned data" is not necessarily true anymore. On recent Intel CPUs (Penryn and newer, I think), the "aligned" and "unaligned" SSE ops actually do the same thing, which happens to be a bit slower if the access is unaligned.
kquinn
Unaligned SSE reads are actually *a lot* slower.
Crashworks
Yeah, on pre-Penryn CPUs unaligned SSE reads will kill performance. Supposedly on Penryn, though (I don't have one to benchmark), the CPU will just do two aligned reads and use the Penryn "Super Shuffle Engine" to piece them back together into the requested unaligned read, so they're not so slow.
kquinn
Thanks - I've added this SSE information to the article.
Michael Burr
I just benched it -- Penryn unaligned SSE ops still have about three times the latency of an aligned read, although their throughput is better than it was in previous cores. This is consistent with the behavior you describe (and with the way that VMX handles it, eg load load shuffle).
Crashworks
Just curious - any idea what the hit was pre-Penryn? A wild-ass-guess is fine by me, since I'm just curious.
Michael Burr
7x, if my memory serves me correctly.
Crashworks
Minor correction to "interlocked operations must operate on aligned data to ensure they are atomic on multiprocessor systems". Interlocked operations will work on unaligned data on x86; they just happen to have edge cases that are *MUCH* slower, but your code shouldn't crash. FWIW, you don't have to use full alignment for interlocked operations on some PowerPCs either (for example, a certain game system made by Microsoft will handle 64-bit interlocks that are only 32-bit aligned just fine).
Adisak
@Adisak - thanks for the clarification on that point. Definitely an area where if you're depending on that behavior, you'd better know what's what.
Michael Burr
+1  A: 

There is an additional condition, not mentioned, for EFLAGS.AC to actually take effect: CR0.AM must also be set. The extra bit exists to prevent INT 17h from tripping on older OSes predating the 486 that have no handler for this exception. Unfortunately, Windows does not set it by default; you need to write a kernel-mode driver to set it.

Yuhong Bao
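The conditions discussed in this thread can be summarized as a small predicate. This is only a model of the rule, not code that touches the real registers. The function name `alignment_check_active` is hypothetical, and the requirement that the access happen at privilege level 3 is an addition from the x86 architecture documentation, not something stated in the answers above. Both AC in EFLAGS and AM in CR0 are bit 18, which is why the flag test earlier in the thread prints 262144.

```c
#include <assert.h>
#include <stdint.h>

#define EFLAGS_AC (UINT32_C(1) << 18)  /* alignment-check flag in EFLAGS */
#define CR0_AM    (UINT32_C(1) << 18)  /* alignment-mask bit in CR0 */

/* Hypothetical model: returns 1 if a misaligned access would raise the
   alignment-check exception for the given register values and current
   privilege level (0..3), 0 otherwise. */
int alignment_check_active(uint32_t cr0, uint32_t eflags, unsigned cpl)
{
    assert(cpl <= 3);
    return (cr0 & CR0_AM) != 0      /* the OS must enable the mask bit  */
        && (eflags & EFLAGS_AC) != 0 /* the program must set the flag    */
        && cpl == 3;                 /* only user-mode accesses checked  */
}
```

Under this model, setting AC from user code (as in the pushf/popf snippets above) has no visible effect when the OS leaves CR0.AM clear, which would be consistent with the Windows results reported in the question.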