I have some SIMD code in AltiVec that processes 32-bit integer values in parallel. In some cases I want to load the integers as little-endian, in other cases as big-endian (note: this choice is independent of the native CPU endianness; it depends on which algorithm is running). Doing the actual byte swap is very easy using AltiVec's permute operations, as documented by Apple.
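For reference, here is roughly the kind of swap I mean. This is only a sketch: the names `SWAP32_MASK`, `swap32x4_scalar` and `swap32x4_altivec` are made up for this post, the scalar function is just a portable model of what a single-source permute selects, and the AltiVec variant assumes a GCC-style vector initializer and has only been thought through for big-endian mode.

```c
#include <stdint.h>
#include <string.h>

#ifdef __ALTIVEC__
#include <altivec.h>
#endif

/* Permute control that reverses the bytes inside each 32-bit lane:
   byte i of the result is taken from byte SWAP32_MASK[i] of the input. */
static const uint8_t SWAP32_MASK[16] = {
    3, 2, 1, 0,   7, 6, 5, 4,   11, 10, 9, 8,   15, 14, 13, 12
};

/* Portable model of the permute: works on a plain 16-byte buffer.
   A temporary is used so that out and in may be the same buffer. */
static void swap32x4_scalar(uint8_t out[16], const uint8_t in[16])
{
    uint8_t tmp[16];
    for (int i = 0; i < 16; i++)
        tmp[i] = in[SWAP32_MASK[i]];
    memcpy(out, tmp, 16);
}

#ifdef __ALTIVEC__
/* AltiVec variant, compiled only when targeting AltiVec.
   Assumption: GCC-style brace initializer for the mask; older
   compilers may want the parenthesized literal form instead. */
static vector unsigned char swap32x4_altivec(vector unsigned char v)
{
    const vector unsigned char mask =
        { 3, 2, 1, 0,  7, 6, 5, 4,  11, 10, 9, 8,  15, 14, 13, 12 };
    return vec_perm(v, v, mask);
}
#endif
```

Since the mask is its own inverse, applying the swap twice gives back the original bytes, so the same routine serves for both loads and stores.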

The part I'm worried about is that PowerPC allows either big- or little-endian operation, so I don't know whether I need to byte swap on little-endian loads/stores or on big-endian loads/stores. (Currently my code always swaps for little-endian memory ops and never swaps for big-endian ones, which works fine on the 970 I'm currently using, since of course it's running big-endian.)
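To make the decision concrete, this is the logic I would expect to need, shown on scalar loads so it can be checked on any machine. It is a sketch under an assumption I have not verified: that a PowerPC in little-endian mode behaves like an ordinary little-endian host for these loads. Whether AltiVec loads actually follow that model is exactly what I am asking. The helper names (`host_is_little_endian`, `needs_swap`, `load32`) are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the running CPU stores the low-order byte first. */
static int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* A swap is needed exactly when the requested byte order
   differs from the host's byte order. */
static int needs_swap(int want_little_endian)
{
    return (want_little_endian != 0) != host_is_little_endian();
}

/* Plain shift-and-mask byte reversal of one 32-bit value. */
static uint32_t bswap32(uint32_t x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (x << 24);
}

/* Load a 32-bit value that is stored at p in the requested byte order.
   Assumption: in the vector code, needs_swap() would instead select
   whether the permute is applied after the vector load. */
static uint32_t load32(const uint8_t *p, int want_little_endian)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return needs_swap(want_little_endian) ? bswap32(v) : v;
}
```

With this, the bytes `11 22 33 44` read as little-endian give `0x44332211` and as big-endian give `0x11223344` on either kind of host; in the real code the check would be hoisted out of the inner loop or resolved at compile time.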

From what I can find, PPCs in little-endian mode are relatively rare, but they do exist, and ideally I'd like to have my code work correctly and quickly regardless of mode.

Is there a way of handling big and little endian loads to AltiVec registers regardless of CPU endianness? Are there other issues related to this I should know about? Wikipedia has the (uncited, naturally) statement:

"AltiVec operations, despite being 128-bit, are treated as if they were 64-bit. This allows for compatibility with little-endian motherboards that were designed prior to AltiVec."

which makes me think there may be other nastiness specific to AltiVec in little-endian mode.