From jmorecfg.h:

#define PACK_TWO_PIXELS(l,r)   ((r<<16) | l)
#define PACK_NEED_ALIGNMENT(ptr) (((int)(ptr))&3)
#define WRITE_TWO_PIXELS(addr, pixels) do {     \
         ((INT16*)(addr))[0] = (pixels);        \
         ((INT16*)(addr))[1] = (pixels)>>16;    \
    } while(0)
#define WRITE_TWO_ALIGNED_PIXELS(addr, pixels)  ((*(INT32*)(addr)) = pixels)

Can someone explain the difference between WRITE_TWO_PIXELS and WRITE_TWO_ALIGNED_PIXELS? If pixels is a stack-allocated uint32_t and (addr & 3) == 0, shouldn't they be equivalent?

Thanks.

+2  A: 

In the two macros, the only alignment that matters is the alignment of addr. As written in the question, they are equivalent if addr is 32-bit aligned (meaning its low two bits are zero), but only if the target architecture is also little-endian.

On a big-endian machine, the upper 16 bits of pixels must be written to ((INT16*)(addr))[0] and the lower 16 bits to ((INT16*)(addr))[1] for them to be equivalent.
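
As a minimal sketch of that point, the two 16-bit stores can be ordered according to the host's byte order so that the result always matches a single 32-bit store. The names host_is_little_endian and write_two_pixels_any_endian are hypothetical and do not come from libjpeg, and memcpy stands in for the pointer casts used by the macros.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first_byte;
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Hypothetical endian-aware counterpart of WRITE_TWO_PIXELS: two 16-bit
 * stores, ordered so the four bytes in memory match one 32-bit store of
 * `pixels` on either kind of host. */
static void write_two_pixels_any_endian(void *addr, uint32_t pixels)
{
    const uint16_t low  = (uint16_t)(pixels & 0xFFFFu);
    const uint16_t high = (uint16_t)(pixels >> 16);
    unsigned char *dst = (unsigned char *)addr;

    if (host_is_little_endian()) {
        memcpy(dst,     &low,  sizeof low);   /* low half at the lower address  */
        memcpy(dst + 2, &high, sizeof high);
    } else {
        memcpy(dst,     &high, sizeof high);  /* high half at the lower address */
        memcpy(dst + 2, &low,  sizeof low);
    }
}
```

A real port would more likely pick the order at compile time from an endianness define than test it at run time; the run-time probe just keeps the sketch self-contained.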

Without checking my copy of the libjpeg source code, I'd guess that these definitions are either expected to be modified as part of porting the library, or they are already guarded by a declaration of endianness.

If addr is not 32-bit aligned, then the WRITE_TWO_ALIGNED_PIXELS macro might cause an exception to be thrown on architectures where unaligned access is not permitted. On some architectures unaligned access is permitted but is much more expensive than two smaller aligned accesses, and on others it costs about the same as an aligned access.
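
The way the alignment test and the two write paths fit together might be sketched as below. The function names needs_alignment and store_two_pixels are hypothetical, not libjpeg's. The sketch assumes addr is at least 16-bit aligned and points into memory with no declared type (for example, from malloc), and it uses uintptr_t because the (int) cast in the quoted macro can truncate a pointer on 64-bit targets.

```c
#include <stdint.h>

/* Nonzero when ptr is not on a 4-byte boundary (same test as
 * PACK_NEED_ALIGNMENT, but without narrowing the pointer to int). */
static int needs_alignment(const void *ptr)
{
    return (int)((uintptr_t)ptr & 3u);
}

/* Hypothetical dispatcher: one 32-bit store when addr is 4-byte aligned,
 * otherwise two 16-bit stores, as WRITE_TWO_PIXELS does.
 * Assumes addr is at least 2-byte aligned. */
static void store_two_pixels(void *addr, uint32_t pixels)
{
    if (needs_alignment(addr)) {
        uint16_t *half = (uint16_t *)addr;
        half[0] = (uint16_t)pixels;          /* low 16 bits  */
        half[1] = (uint16_t)(pixels >> 16);  /* high 16 bits */
    } else {
        *(uint32_t *)addr = pixels;
    }
}
```

On a little-endian host both paths leave the same four bytes in memory; on a big-endian host they differ unless the order of the two halves is swapped.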

The two macros exist as a reminder to the author of the library to think about alignment, and to standardize the approach to handling misaligned access so that it can be optimized out when building for platforms where it doesn't matter.

RBerteig
"might cause an exception to be thrown" - even better, apparently on some ARM chips unaligned access doesn't cause a hardware exception, it just silently reads/writes the wrong value. I've only ever used ARMs where it traps, though.
Steve Jessop
@Steve Jessop, coincidentally, I just helped a coworker debug something where one byte of a packet was zero when it shouldn't have been. It turned out to be caused by a 16-bit write to an adjacent odd address, which apparently was silently rounded down to the aligned address, trashing the adjacent field. This is in an ARM core in a proprietary SoC. As far as we know, there were no exceptions thrown, and there is evidence that a read got back the value written. The moral is, be careful about misaligned access on any platform, it is *always* a portability issue.
RBerteig
+2  A: 

WRITE_TWO_PIXELS and WRITE_TWO_ALIGNED_PIXELS are equivalent on little-endian machines but not on big-endian architectures.

[Example edited: thanks to Steve Jessop]

Let pixels = 0x0A0B0C0D.

On a big-endian machine, WRITE_TWO_PIXELS writes the bytes as follows:

---------------------
| 0B | 0A | 0D | 0C |
---------------------
  3    2    1    0          <--- addr

whereas WRITE_TWO_ALIGNED_PIXELS writes them as follows:

---------------------
| 0D | 0C | 0B | 0A |
---------------------
  3    2    1    0          <--- addr
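
The two layouts can be reproduced on any host with a small sketch that spells out big-endian stores byte by byte. All function names here are hypothetical, and the output arrays are indexed from addr + 0 upward, so they read left to right where the diagrams read right to left.

```c
#include <stdint.h>

/* Store a 16-bit value the way a big-endian CPU would: high byte first. */
static void store16_be(unsigned char *dst, uint16_t v)
{
    dst[0] = (unsigned char)(v >> 8);
    dst[1] = (unsigned char)(v & 0xFFu);
}

/* Store a 32-bit value the way a big-endian CPU would. */
static void store32_be(unsigned char *dst, uint32_t v)
{
    dst[0] = (unsigned char)(v >> 24);
    dst[1] = (unsigned char)((v >> 16) & 0xFFu);
    dst[2] = (unsigned char)((v >> 8) & 0xFFu);
    dst[3] = (unsigned char)(v & 0xFFu);
}

/* Bytes WRITE_TWO_PIXELS would leave in memory on a big-endian target. */
static void two_pixels_big_endian(unsigned char out[4], uint32_t pixels)
{
    store16_be(out,     (uint16_t)pixels);          /* [0] = low half  */
    store16_be(out + 2, (uint16_t)(pixels >> 16));  /* [1] = high half */
}

/* Bytes WRITE_TWO_ALIGNED_PIXELS would leave on a big-endian target. */
static void aligned_pixels_big_endian(unsigned char out[4], uint32_t pixels)
{
    store32_be(out, pixels);
}
```

For pixels = 0x0A0B0C0D, the first function fills out[0..3] with 0C 0D 0A 0B and the second with 0A 0B 0C 0D.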
Donotalo
Shouldn't the first one be BADC? `(int16)pixels`, which is `0x0C0D`, at the lower address (which you have on the right), and `(int16)(pixels>>16)` at the higher address (which you have on the left). I may be wrong - having the lower addresses on the right is doing my head in ;-)
Steve Jessop
I think you're right. Endianness is such a confusing thing! I'll update my answer.
Donotalo