ansaurus

Question

How do I reorder vector data using ARM Neon intrinsics?

Answer 1

+1 A:

It looks like you should be able to use the VTRN instruction (e.g. vtrnq_u32) for this. See page 6 of this tutorial.

Paul R 2010-04-11 07:23:26

@Paul: vtrnq_u32 does not help. What I actually need is something like VTRN.64, but sadly there is no VTRN.64 instruction or intrinsic.

goldenmean 2010-04-11 07:40:10

@goldenmean: sorry - I see what you mean now - NEON seems to be short of general purpose permute/shuffle operations.
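For reference, here is a rough sketch of what vtrnq_u32 does, to show why it doesn't fit: it transposes 32-bit lanes in pairs, not the 64-bit halves. The helper name trn32 and the scalar fallback are illustrative assumptions, not from the thread; the fallback is only there so the snippet also builds on a non-NEON machine.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: applies a 32-bit transpose (VTRN.32) to a and b in place.
   Result: a = {a0, b0, a2, b2}, b = {a1, b1, a3, b3}. */
static void trn32(uint32_t a[4], uint32_t b[4])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    uint32x4x2_t t = vtrnq_u32(vld1q_u32(a), vld1q_u32(b));
    vst1q_u32(a, t.val[0]);
    vst1q_u32(b, t.val[1]);
#else
    /* Scalar model of the same lane movement (assumed equivalent). */
    uint32_t r0[4] = { a[0], b[0], a[2], b[2] };
    uint32_t r1[4] = { a[1], b[1], a[3], b[3] };
    for (int i = 0; i < 4; i++) {
        a[i] = r0[i];
        b[i] = r1[i];
    }
#endif
}
```

With a = {0, 1, 2, 3} and b = {4, 5, 6, 7} the lanes end up interleaved element by element, whereas a 64-bit transpose would keep {0, 1} and {4, 5} together as pairs.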

Paul R 2010-04-11 07:46:02

Answer 2

+1 A:

How about something like this:

  int32x4_t q0, q1;  /* input vectors, assumed to be loaded elsewhere */

  /* split into 64 bit vectors */
  int32x2_t q0_hi = vget_high_s32 (q0);
  int32x2_t q1_hi = vget_high_s32 (q1);
  int32x2_t q0_lo = vget_low_s32 (q0);
  int32x2_t q1_lo = vget_low_s32 (q1);

  /* recombine into 128 bit vectors */
  q0 = vcombine_s32 (q0_lo, q1_lo);
  q1 = vcombine_s32 (q0_hi, q1_hi);

In theory this should compile to just two move instructions, because vget_high and vget_low merely reinterpret a 128-bit Q register as two 64-bit D registers. vcombine, on the other hand, compiles to one or two moves, depending on register allocation.

Oh - and the order of the integers in the output could be exactly the wrong way around. If so just swap the arguments to vcombine_s32.
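To check the lane order, something along these lines could be used. It is only a sketch: the helper name swap_inner_halves, the load/store through plain arrays, and the scalar fallback for non-NEON builds are my assumptions for illustration.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: exchanges the high half of a with the low half of b.
   Result: a = {a0, a1, b0, b1}, b = {a2, a3, b2, b3}. */
static void swap_inner_halves(int32_t a[4], int32_t b[4])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    int32x4_t q0 = vld1q_s32(a);
    int32x4_t q1 = vld1q_s32(b);

    /* vcombine_s32(low, high): the first argument becomes the low half. */
    int32x4_t r0 = vcombine_s32(vget_low_s32(q0), vget_low_s32(q1));
    int32x4_t r1 = vcombine_s32(vget_high_s32(q0), vget_high_s32(q1));

    vst1q_s32(a, r0);
    vst1q_s32(b, r1);
#else
    /* Scalar model of the same data movement (assumed equivalent). */
    int32_t a2 = a[2], a3 = a[3];
    a[2] = b[0];
    a[3] = b[1];
    b[0] = a2;
    b[1] = a3;
#endif
}
```

Calling it with a = {0, 1, 2, 3} and b = {4, 5, 6, 7} should leave a = {0, 1, 4, 5} and b = {2, 3, 6, 7}; if the opposite order is wanted, swap the vcombine_s32 arguments as noted above.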

Nils Pipenbrinck 2010-04-17 04:08:53

Answer 3

A:

Remember that each Q register is made up of two D registers; for instance, the low part of q0 is d0 and the high part is d1. So in fact this operation is just swapping d0 and d3 (or d1 and d2; it is not entirely clear from your data presentation). There is even a swap instruction (VSWP) that does it in one instruction!

Disclaimer: I don't know Neon intrinsics (I directly code in assembly), though I'd be surprised if this couldn't be done using intrinsics.
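The register aliasing can be pictured with a small plain-C model. This is only a toy illustration under my own assumptions (the struct and function names are made up, and no NEON instruction is involved): four 64-bit slots stand in for d0..d3, with q0 = d0:d1 and q1 = d2:d3.

```c
#include <stdint.h>

/* Toy model (hypothetical): d[0..3] stand for D registers d0..d3.
   q0 is the pair d0:d1 and q1 is the pair d2:d3. */
typedef struct {
    uint64_t d[4];
} neon_regs_model;

/* Models the effect of swapping d1 and d2: the high half of q0 and the
   low half of q1 trade places. */
static void swap_d1_d2(neon_regs_model *r)
{
    uint64_t tmp = r->d[1];
    r->d[1] = r->d[2];
    r->d[2] = tmp;
}
```

Swapping d[0] and d[3] instead would model the other reading of the question.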

Pierre Lebeaupin 2010-05-12 13:03:01
