Hi,

I have a C function with 4 pointers, each of which points to a different location in a large 2D array of floats.

Because ARM assembly functions can only be passed 4 parameters (r0 - r3), I don't understand how to pass the pointer to my return value, which would become the 5th parameter to my assembly function.

So, to work around this, I thought of putting all 4 pointers into an array of pointers. That leaves 3 more free registers, one of which I can use to pass a pointer to my return value as well.

But I don't know how to extract the four individual pointers from my array of pointers inside the assembly function. My attempts so far have failed.

Here is a sample of what I'm trying to do.

Program

#include <stdio.h>
#include <arm_neon.h>

void  _my_arm_asm(float32_t **, float32x4_t *);

float32_t data_array[100][100];

int main(void)
{
       float32_t *ptr1, *ptr2, *ptr3, *ptr4;

        ptr1 = data_array[value] + (some value);
        ptr2 = data_array[value] + (some other value);
        ptr3 = data_array[value] + (some other value);
        ptr4 = data_array[value] + (some other value);

       float32_t *array_pointers[4];
       array_pointers[0] = ptr1;
       array_pointers[1] = ptr2;
       array_pointers[2] = ptr3;
       array_pointers[3] = ptr4;

       float32x4_t result;

       _my_arm_asm(array_pointers, &result);

        ....
        ....
        ....
       return 0;


}



.text
    .global _my_arm_asm

_my_arm_asm:
            #r0: Pointer to my array of pointers
            #r1: Pointer to my result

        push   {r4-r11, lr}

        # How to access the array of pointers?

        # I previously tried this, is this the right way to do it?

        # mov r4, #0
        # vld4.32 {d0, d1, d2, d3}, [r0, r4]
        # add r4, r4, #1
        # vld4.32 {d4, d5, d6, d7}, [r0, r4] 
        # add r4, r4, #1
        # vld4.32 {d8, d9, d10, d11}, [r0, r4] 
        # add r4, r4, #1
        # vld4.32 {d12, d13, d14, d15}, [r0, r4] 


        ....
        ....
        ....

        pop    {r4-r11, pc}
+3  A: 

The fifth and further parameters (assuming int-sized parameters) are passed on the stack. That is, on entry to the function the fifth parameter is accessible as [SP], the sixth as [SP,#4], and so on. These offsets shift once the function pushes registers: after the push {r4-r11, lr} in your routine (9 registers, 36 bytes), the fifth parameter is at [SP,#36]. Read the Procedure Call Standard for the ARM Architecture for the detailed explanations.
That said, you don't have to use assembly to make use of NEON. Check out the NEON intrinsics, which let you do all of these operations in plain C code.
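
To illustrate the intrinsics route, here is a minimal sketch. The function name sum4_rows and the operation (adding four floats from each of the four pointers) are assumptions made up for illustration, since the question does not say what the routine computes. The scalar fallback is only there so the sketch also builds on targets without NEON.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper (not from the question): reads four floats from each
 * of the four source pointers and writes the four element-wise sums to
 * result. All five pointers must reference at least four floats. */
void sum4_rows(const float *p1, const float *p2,
               const float *p3, const float *p4, float *result)
{
    assert(p1 != NULL && p2 != NULL && p3 != NULL && p4 != NULL);
    assert(result != NULL);

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Load four consecutive floats from each pointer into a q register. */
    float32x4_t a = vld1q_f32(p1);
    float32x4_t b = vld1q_f32(p2);
    float32x4_t c = vld1q_f32(p3);
    float32x4_t d = vld1q_f32(p4);

    /* (a + b) + (c + d), lane by lane. */
    float32x4_t s = vaddq_f32(vaddq_f32(a, b), vaddq_f32(c, d));

    /* Store the four sums through the result pointer. */
    vst1q_f32(result, s);
#else
    /* Scalar fallback with the same summation order. */
    for (size_t i = 0; i < 4; ++i)
        result[i] = (p1[i] + p2[i]) + (p3[i] + p4[i]);
#endif
}
```

Because this is an ordinary C function, the compiler handles the five-argument calling convention and the register allocation itself.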

Igor Skochinsky
+3  A: 

In general, if more than 4 arguments are passed to a function, the excess arguments are passed on the stack.

The ARM EABI specifies how compilers should pass arguments to functions (it also specifies which registers a caller can expect to be unchanged across the function call). Your assembly routine can use the same techniques (and probably should unless you have a good reason not to). If nothing else, that'll mean that your assembly function can be easily called from C.

Chapter 5 (The Base Procedure Call Standard) of the "Procedure Call Standard for the ARM Architecture" should have the exact details. It's pretty complex on the face of it (because there's a lot of detail on alignment, argument size, etc.), but I think for your purposes it boils down to this: the 5th argument to the function gets pushed onto the stack.
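
On the C side, nothing special is needed to pass the fifth argument. A minimal sketch, assuming a hypothetical five-parameter routine named my_arm_asm5 (the name, the parameter order, and the plain-C stand-in body are all made up for illustration and are not from the question):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical five-parameter version of the routine in the question.
 * Under the ARM procedure call standard the first four pointers arrive
 * in r0-r3 and `result` is read from the stack ([sp] on entry). */
void my_arm_asm5(const float *p1, const float *p2,
                 const float *p3, const float *p4, float *result);

/* Plain-C stand-in so this sketch links without an assembly file. A real
 * build would drop this definition and assemble the .s file instead.
 * It just copies the first float behind each pointer into result[0..3]. */
void my_arm_asm5(const float *p1, const float *p2,
                 const float *p3, const float *p4, float *result)
{
    assert(result != NULL);
    result[0] = *p1;
    result[1] = *p2;
    result[2] = *p3;
    result[3] = *p4;
}

/* Caller: an ordinary call. The compiler places the fifth argument on
 * the stack for us; no array of pointers is required. */
void call_with_five_args(float data[4][4], float out[4])
{
    my_arm_asm5(&data[0][0], &data[1][1], &data[2][2], &data[3][3], out);
}
```

The stand-in exists only so the call can be exercised; the point is the prototype and the call site.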

Of course, as you suggest in your question, you could avoid all that by packing your 4 pointers into a structure and passing a pointer to the struct. In your assembly routine you simply load that struct pointer into a register and use it to load, in turn, the pointers you really need.
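
A minimal sketch of the C side of the struct approach. The struct name asm_args, its field names, and the consume_args stand-in are assumptions for illustration only. The static assertions check that each member sits one pointer width after the previous one, which is what makes fixed offsets (0, 4, 8, 12, 16 with 4-byte pointers on 32-bit ARM) usable from assembly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical argument block: four source pointers plus the result
 * pointer, so a single register (r0) is enough to reach everything. */
struct asm_args {
    const float *ptr1;
    const float *ptr2;
    const float *ptr3;
    const float *ptr4;
    float *result;
};

/* Members are laid out back to back, one pointer width apart. */
_Static_assert(offsetof(struct asm_args, ptr1) == 0, "ptr1 offset");
_Static_assert(offsetof(struct asm_args, ptr2) == 1 * sizeof(float *), "ptr2 offset");
_Static_assert(offsetof(struct asm_args, ptr3) == 2 * sizeof(float *), "ptr3 offset");
_Static_assert(offsetof(struct asm_args, ptr4) == 3 * sizeof(float *), "ptr4 offset");
_Static_assert(offsetof(struct asm_args, result) == 4 * sizeof(float *), "result offset");

/* Plain-C stand-in for the assembly routine: it follows each pointer in
 * the block, the way the routine would after loading it at a fixed
 * offset, and copies one float per pointer into the result. */
void consume_args(const struct asm_args *args)
{
    assert(args != NULL && args->result != NULL);
    args->result[0] = *args->ptr1;
    args->result[1] = *args->ptr2;
    args->result[2] = *args->ptr3;
    args->result[3] = *args->ptr4;
}
```

A struct has one advantage over a bare array of pointers: the members can have different types, so the result pointer can travel in the same block.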

I think that the ARM assembly might look something like:

                  // r0 has the 1st parameter
ldr r4, [r0]      // get array_pointers[0] into r4
// ...

ldr r5, [r0, #4]  // get array_pointers[1] into r5
// ...

ldr r6, [r0, #8]  // get array_pointers[2] into r6
// ...

ldr r7, [r0, #12] // get array_pointers[3] into r7

You could also use a 'load multiple' instruction to get all 4 pointers in one shot, but I'm not sure what your register usage requirements/restrictions might be.

Michael Burr
Michael, yes, I did what you suggested: I packed all the pointers into a structure, passed it to the assembly function, and could access them using ldr instructions.
vikramtheone