What is the difference between aligned and unaligned memory access?

I work on a TMS320C64x DSP and want to use its intrinsic functions (C functions that map to assembly instructions). Among them are:

ushort & _amem2(void *ptr);
ushort & _mem2(void *ptr);

Here _amem2 performs an aligned 2-byte access and _mem2 performs an unaligned one.

When should I use which?

+2  A: 

_mem2 is more general: it works whether or not ptr is aligned. _amem2 is stricter: it requires that ptr be aligned (though it is presumably slightly more efficient). So use _mem2 unless you can guarantee that ptr is always aligned.

Laurence Gonsalves
How can I guarantee that my data is aligned? I would appreciate it if you could provide a specific, simple example.
Can Bal
It depends on where you're getting ptr from. For example, if ptr came from a call to malloc and your system guarantees that the result is aligned then ptr will be aligned. You'll need to check the docs for your malloc to see if it guarantees alignment. If ptr is the address of a field in a struct then it depends on whether the whole struct is aligned, and if your compiler is "packing" or aligning the fields of structs. You'll need to consult your compiler docs to know for sure. Until you're sure, just use _mem2, as it will work whether ptr is aligned or not.
Laurence Gonsalves
You can assume any object created by C or C++ is properly aligned unless you specifically did something to prevent it (like declaring unusual packing). But objects created in assembly code may not be.
Max Lybbert
+4  A: 

Many computer architectures organize memory in "words" of several bytes each. For example, the Intel 32-bit architecture uses words of 32 bits, that is, 4 bytes each. Memory is addressed at the single-byte level, however, so an address can be "aligned", meaning it falls on a word boundary, or "unaligned", meaning it doesn't.

On some architectures, certain memory operations are slower on unaligned addresses, or are not allowed at all.

So, if you know your addresses are properly aligned, you can use _amem2() for speed. Otherwise, you should use _mem2().
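A minimal sketch of that rule in portable C follows. The helper names is_aligned_2 and read_u16 are hypothetical, and since the TI intrinsics exist only in the TI toolchain, a plain pointer dereference stands in for _amem2 and memcpy stands in for _mem2.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if p is a multiple of 2 (lowest bit clear). */
static int is_aligned_2(const void *p)
{
    return ((uintptr_t)p & 1u) == 0;
}

/* Hypothetical helper: read 16 bits from p in native byte order. */
static uint16_t read_u16(const void *p)
{
    uint16_t v;
    if (is_aligned_2(p)) {
        /* Aligned path: stand-in for _amem2(p). */
        v = *(const uint16_t *)p;
    } else {
        /* Unaligned path: stand-in for _mem2(p); memcpy is safe at any address. */
        memcpy(&v, p, sizeof v);
    }
    return v;
}
```

In real code the alignment test would normally be done once, outside the hot loop, because checking on every access may cost more than the aligned load saves.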

Avi
+1  A: 

Many processors have alignment restrictions on memory access. Unaligned access either generates an exception interrupt (e.g. ARM), or is just slower (e.g. x86).

_mem2 is probably implemented by fetching two bytes and combining them into a 16-bit ushort with bitwise shift and OR operations.

_amem2 probably just reads the 16-bit ushort from the specified ptr.

I don't know the TMS320C64x specifically, but I'd guess it requires 16-bit alignment for 16-bit memory accesses. So you can always use _mem2, at a performance cost, and use _amem2 when you can guarantee that ptr is an even address.
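To make the byte-by-byte idea concrete, here is a sketch in portable C. It only illustrates the guess above and is not the actual TI implementation; the name load_u16_bytewise is hypothetical, and little-endian byte order is assumed.

```c
#include <stdint.h>

/* Hypothetical helper: build a 16-bit value from two single-byte loads.
   Assumes little-endian order (low byte at the lower address).
   Byte loads have no alignment requirement, so ptr may be odd. */
static uint16_t load_u16_bytewise(const void *ptr)
{
    const unsigned char *b = (const unsigned char *)ptr;
    return (uint16_t)(b[0] | ((unsigned)b[1] << 8));
}
```

The two loads plus the shift and OR are where the extra cost relative to a single aligned 16-bit load would come from.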

laalto
+3  A: 
Doug
+1  A: 

Aligned addresses are those which are multiples of the access size in question.

  • An access of a 4-byte word at an address that is a multiple of 4 is aligned
  • An access of 4 bytes at address 3 (say) is unaligned
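The definition above can be written as a small check. This is a sketch only: the name is_aligned is hypothetical, and the mask trick assumes the access size is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if addr is a multiple of size.
   size must be a power of two (1, 2, 4, 8, ...), so that
   size - 1 is a mask of the low bits that must all be zero. */
static int is_aligned(uintptr_t addr, uintptr_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (addr & (size - 1)) == 0;
}
```

With this helper, is_aligned(8, 4) is nonzero and is_aligned(3, 4) is zero, matching the two cases in the list.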

It is very likely that _mem2, which also has to work for unaligned accesses, does extra work to handle the alignment correctly. This means that _mem2 is likely to be costlier than _amem2.

So, when you need performance (particularly when you know that the access latency is high) it would be prudent to identify when you can use the aligned access. The _amem2 exists for this very purpose -- to give you performance when you know the access is aligned.

When it comes to 2 byte accesses, identifying aligned operations is very simple.
If all the addresses accessed by the operation are even (that is, their LSB is zero), you have 2-byte alignment. This is easy to check:

/* address is assumed to be an integer type, e.g. (uintptr_t)ptr */
if (address & 1) {
    /* odd address: not 2-byte aligned */
} else {
    /* even address: aligned to 2 bytes */
}
nik