Hi,

I've been looking at Linux ELF executables on x86, mostly using IDA but also gdb. One thing I've noticed is that functions always start at word-aligned addresses. Does anybody know the reason for that? I am not aware of any requirement for x86 instructions to start at aligned addresses. And it cannot be due to page alignment, because a page boundary can still fall anywhere within the function.

I would appreciate any insight at all.

Thanks.

+1  A: 

For some architectures, the alignment of the data dictates the amount of data that can be copied per operation. For example, copying 32 bits from address 0x4000 might take one 32-bit move operation, whereas copying 32 bits from 0x4001 might take four 8-bit move operations. Furthermore, using the 32-bit move instruction on the misaligned address might trigger a hardware exception. The exception handler then copies 8 bits at a time, which is slower than copying from an aligned address.

Edit:

This applies to all data, not just data that will be executed. So function entry points are aligned along with switch targets, string constants, globals, and other data.
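The aligned versus misaligned access described above can be sketched in C. This is only an illustration: the helper names `is_aligned` and `load_u32` are made up for this example, and whether a misaligned 32-bit access is slow, split into byte accesses, or trapped depends on the architecture.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if p is a multiple of `alignment`.
   `alignment` must be a power of two. */
static int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Hypothetical helper: reads 32 bits from any address.
   memcpy has no alignment requirement, so the compiler is free to emit
   a single 32-bit load where the target allows it, or several narrower
   loads where it does not. Dereferencing a misaligned uint32_t pointer
   directly would be undefined behaviour in C. */
static uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

With a 4-byte-aligned buffer `buf`, `is_aligned(buf, 4)` holds and `is_aligned(buf + 1, 4)` does not, yet `load_u32` works for both addresses.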

drawnonward
This is true, but the question was for code, not data!
Didier Trosset
But code is data and vice versa: It is all just ones and zeros in the end.
graham.reeds
+4  A: 

You are right, instructions do not need to be aligned. On x86 processors, instructions use a variable-length encoding of 1 to at most 15 bytes.

But instructions are fetched from a cache organized in lines, usually of 64 bytes, and some parts of the execution pipeline operate faster when code is suitably aligned: decoding, loops, branch prediction, etc.
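To make the cache-line argument concrete, here is a rough sketch in C that counts how many lines a block of code touches. The 64-byte `LINE_SIZE` is an assumption that matches the figure above, and `lines_spanned` and `align_up` are names invented for this example, not part of any toolchain.

```c
#include <assert.h>
#include <stdint.h>

#define LINE_SIZE 64u /* assumed cache-line size in bytes */

/* Number of cache lines touched by `len` bytes starting at `start`. */
static unsigned lines_spanned(uintptr_t start, unsigned len)
{
    assert(len > 0);
    return (unsigned)((start + len - 1) / LINE_SIZE - start / LINE_SIZE + 1);
}

/* Rounds `addr` up to the next multiple of `alignment`,
   which must be a power of two. */
static uintptr_t align_up(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr + alignment - 1) & ~(alignment - 1);
}
```

For instance, a 16-byte function prologue placed at 0x103A straddles two lines, while the same bytes moved to `align_up(0x103A, 16)`, that is 0x1040, fit in a single line.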

The best source of information on this is Agner Fog's optimization manuals: http://www.agner.org/optimize/

Eric Bainville