ansaurus

Question

C and C++: Array element access pointer vs int

Answer 1

+2 A:

I prefer using myarray[i] since it is clearer and the compiler has an easier time compiling it to optimized code.

When using pointers it is more complex for the compiler to optimize this code since it's harder to know exactly what you're doing with the pointer.

brickner 2010-05-18 12:10:40

I'm building an interpreter and it is about access speed for variables and constant values, idk if that changes the answer

begun 2010-05-18 12:15:12

Your interpreter would still be compiled using a C/C++ compiler. That compiler would have an easier time compiling standard code that accesses elements by index with the [] operator than code that uses pointer arithmetic. The compiler won't perform worse, since it can easily rewrite your code to work with pointers if that is better. It would also have an easier time doing loop unrolling and other optimizations.

brickner 2010-05-18 12:19:03

I don't perform **any** pointer arithmetic in the interpretation phase. All pointers are calculated during the parsing step and remain unchanged then.

begun 2010-05-18 12:21:06

-1: The performance is worse only at compile time, not at runtime. Is this relevant?

Simon 2010-05-18 12:21:51

You never increment a pointer? Are the addresses always constant?

brickner 2010-05-18 12:22:42

The performance can be worse at run time.

brickner 2010-05-18 12:23:14

Yes. The addresses are determined during the parsing step, where variable names are assigned slots in the variable table. Then the variable names are replaced by pointers pointing to the right location in the variable table. The location never changes.

begun 2010-05-18 12:23:58
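
The scheme described in the comment above could be sketched roughly as follows. This is only an illustration under assumptions: `var_table`, `Operand`, `resolve`, `load_via_pointer` and `load_via_index` are hypothetical names, not taken from the asker's interpreter. It shows one operand bound at parse time both as a pointer and as an index.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SIZE 16            /* hypothetical table size */

/* Hypothetical variable table of the interpreter. */
static int var_table[TABLE_SIZE];

/* An operand resolved once by the parser, kept in both forms for comparison. */
typedef struct {
    int   *ptr;    /* address of the slot, computed at parse time */
    size_t index;  /* slot number, to be used as var_table[index] */
} Operand;

/* Parse-time step: bind an operand to a slot of the table. */
static Operand resolve(size_t slot)
{
    Operand op;
    assert(slot < TABLE_SIZE);
    op.ptr = &var_table[slot];
    op.index = slot;
    return op;
}

/* Run-time step, pointer form: a single dereference. */
static int load_via_pointer(const Operand *op)
{
    return *op->ptr;
}

/* Run-time step, index form: base address plus scaled index, then a dereference. */
static int load_via_index(const Operand *op)
{
    return var_table[op->index];
}
```

Both loads return the same value. Whether the index form costs anything measurable depends on the compiler and the target; on x86 the scaled addition is typically folded into the addressing mode of the load instruction.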

@brickner, pointer arithmetic and `[]` are the same, complexity-wise, both at compile-time and runtime. `*(a + i)` is identical to `a[i]` (and `i[a]`, for that matter).

Marcelo Cantos 2010-05-18 12:25:39
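
The equivalence stated in the comment above can be checked directly. A minimal sketch; `same_element` is a hypothetical helper name used only for this illustration.

```c
#include <assert.h>

/* Returns 1 if all three spellings refer to the same element of a. */
static int same_element(const int *a, int i)
{
    assert(a);
    /* a[i] is defined as *(a + i); since addition commutes, i[a] is legal too. */
    return &a[i] == a + i      /* subscript vs. pointer arithmetic */
        && &i[a] == a + i      /* the reversed spelling */
        && a[i] == *(a + i);   /* same value through either form */
}
```

Since the language defines the subscript in terms of pointer addition, any difference in generated code comes from how the surrounding loop is written, not from the choice between the two spellings.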

So performance-wise it doesn't matter whether you use pointers or access by index. It does matter for readability and maintainability.

brickner 2010-05-18 12:25:53

@Marcelo Cantos, if you do a loop on i and increment the pointer by one at each iteration (ptr++) or access using [i], incrementing the pointer is more complex for the compiler.

brickner 2010-05-18 12:27:03

@brickner: If [] is pointer arithmetic, doesn't it have to be slower than pointers that never change?

begun 2010-05-18 12:27:27

@begun, No, because it is still the same number of operations in assembly.

brickner 2010-05-18 12:29:15

@brickner, `ptr++` can be trivially implemented via a single opcode. How is that more complex?

Marcelo Cantos 2010-05-18 12:44:34

@Marcelo Cantos, think about complex code with pointers you want to do loop unrolling for. It's harder to understand the number of iterations the loop has if you use pointer arithmetic.

brickner 2010-05-18 13:14:27

@brickner, loop unrolling is predicated on the loop variable alone and will be completely unaffected by the pointer increment, which will get unrolled along with whatever other code is inside the body. In any event, we're seriously splitting hairs here.

Marcelo Cantos 2010-05-18 13:19:26

Answer 2

A:

Yes. With a pointer, the address does not have to be calculated from the start address of the array; the element is accessed directly. So you get a small performance improvement if you save the address in a pointer. But the compiler will usually optimize the code and use the pointer in both cases (if you have static arrays).

For dynamic arrays (created with new), the pointer will give you better performance, as the compiler cannot optimize array accesses at compile time.

Simon 2010-05-18 12:11:00

Unjustified -1. Compiler cannot optimize things known at runtime.

Simon 2010-05-18 12:16:20

-1: simply not true, in general. You can't make sweeping generalisations like this without any evidence. If you take the time to benchmark some example code with different CPUs and different compilers you will find that in many cases the converse of your assertion is true.

Paul R 2010-05-18 12:16:35

@Downvoter: Please state what is wrong with this answer so I know what not to do...

begun 2010-05-18 12:17:13

So accessing an array by index is faster than, or as fast as, accessing it by pointer, even for dynamic arrays whose size is known at runtime?

Simon 2010-05-18 12:18:21

MSalters 2010-05-18 12:22:43

Answer 3

A:

Yes. Storing a pointer to myarray[i] will perform better (if used on a large scale).

Why?

It will save you an addition and maybe a multiplication (or a shift).

Many compilers may optimize that for you in the case of static memory allocation. If you are using dynamic memory allocation, the compiler will not optimize it, because the allocation happens at runtime!

Betamoo 2010-05-18 12:11:24

-1: this is an overly broad and mostly inaccurate generalisation

Paul R 2010-05-18 12:14:59

Answer 4

+3 A:

It will probably make no difference at all. The compiler will usually be smart enough to know when you are using an expression more than once and create a temporary itself, if appropriate.

Marcelo Cantos 2010-05-18 12:12:27
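
To make the point of this answer concrete, here is a sketch with the same operation written twice: once repeating the subscript expression, once with the temporary written by hand. The function names `clamp_indexed` and `clamp_cached` are hypothetical and chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Repeats values[i] several times and leaves the reuse to the compiler. */
static int clamp_indexed(int *values, size_t i, int lo, int hi)
{
    assert(lo <= hi);
    if (values[i] < lo) values[i] = lo;
    if (values[i] > hi) values[i] = hi;
    return values[i];
}

/* Hand-written version of the temporary an optimizer would likely create. */
static int clamp_cached(int *values, size_t i, int lo, int hi)
{
    int *slot = &values[i];   /* address computed once */
    assert(lo <= hi);
    if (*slot < lo) *slot = lo;
    if (*slot > hi) *slot = hi;
    return *slot;
}
```

The two functions behave identically. With optimization enabled, a compiler will often emit the same or very similar code for both, but that is something to confirm in the generated assembly rather than assume.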

Answer 5

+3 A:

Compilers can do surprising optimizations; the only way to know is to read the generated assembly code.

With GCC, use -S, with -masm=intel for Intel syntax.

With VC++, use /FA (IIRC).

You should also enable optimizations: -O2 or -O3 with GCC, and /O2 with VC++.

Bastien Léonard 2010-05-18 12:12:51
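
A small file that could be fed to the commands from this answer, as a sketch. The file name `sum.c` and the function names are assumptions made for the example; the `-DNDEBUG`, `/DNDEBUG` and `/c` options are additions that strip the `assert` calls and skip linking, so that only the loops show up in the listing.

```c
/* sum.c (hypothetical file name)
 *
 * GCC:   gcc -O2 -DNDEBUG -S -masm=intel sum.c     (writes sum.s)
 * VC++:  cl /O2 /DNDEBUG /FA /c sum.c              (writes sum.asm)
 */
#include <assert.h>
#include <stddef.h>

/* Sums the array using a[i]. */
long sum_index(const int *a, size_t n)
{
    long total = 0;
    size_t i;
    assert(a != NULL);
    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Sums the array by stepping a pointer from a to a + n. */
long sum_pointer(const int *a, size_t n)
{
    long total = 0;
    const int *p;
    const int *end;
    assert(a != NULL);
    end = a + n;
    for (p = a; p != end; p++)
        total += *p;
    return total;
}
```

Comparing the two function bodies in the resulting listing shows what a particular compiler, at a particular optimization level, makes of each form.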

Answer 6

+1 A:

There should not be much difference, but by using indexing you avoid all sorts of pitfalls that the compiler's optimizer is prone to (aliasing being the most important one), and thus I'd say the indexing case should be easier for the compiler to handle. This doesn't mean that you should take care of the aforementioned things before the loop, but pointers in a loop generally just add to the complexity.

nj 2010-05-18 12:26:40
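
A sketch of the aliasing pitfall mentioned in this answer, with hypothetical function names. Note that aliasing is a property of what the pointers may refer to, so it can arise with subscripts as well as with explicit pointer arithmetic; the example uses subscripts on purpose.

```c
#include <assert.h>
#include <stddef.h>

/* dst might overlap *factor, so the compiler generally has to assume that
 * a store to dst[i] can change *factor and reload it on every iteration. */
static void scale_may_alias(int *dst, const int *factor, size_t n)
{
    size_t i;
    assert(dst && factor);
    for (i = 0; i < n; i++)
        dst[i] *= *factor;
}

/* Reading the value into a local before the loop removes the question:
 * no store in the loop can modify f. */
static void scale_local_copy(int *dst, const int *factor, size_t n)
{
    size_t i;
    int f;
    assert(dst && factor);
    f = *factor;
    for (i = 0; i < n; i++)
        dst[i] *= f;
}
```

When `factor` really does point into `dst`, the two functions give different results, which is exactly why the compiler cannot hoist the load on its own in the first version.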

Answer 7

A:

There will be no substantial difference. Premature optimization is the root of all evil: get a profiler before looking at micro-optimizations like this. Also, myarray[i] is more portable to custom types, such as a std::vector.

DeadMG 2010-05-18 12:36:47

Answer 8

A:

Okay, so your question is: what's faster?

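#include <stdio.h>

int main(int argc, char **argv)
{
  int array[20];

  array[0] = 0;
  array[1] = 1;

  int *value_1 = &array[1];      /* address stored in a pointer */
  printf("%d", *value_1);        /* access through the stored pointer */
  printf("%d", array[1]);        /* access by index */
  printf("%d", *(array + 1));    /* access by pointer arithmetic */
  return 0;
}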

Like someone else already pointed out, compilers can do clever optimizations. Of course this depends on where an expression is used, but normally you shouldn't care about such subtle differences. Any of your assumptions can be proven wrong by the compiler. Today you shouldn't need to care about such differences.

For example, the above code produces the following (snippet only):

mov     [ebp+var_54], 1 #store 1
lea     eax, [ebp+var_58] # load the address of array[0]
add     eax, 4 # add 4 (size of int)
mov     [ebp+var_5C], eax
mov     eax, [ebp+var_5C]
mov     eax, [eax]
mov     [esp+88h+var_84], eax
mov     [esp+88h+var_88], offset unk_403000 # points to %d
call    printf
mov     eax, [ebp+var_54]
mov     [esp+88h+var_84], eax
mov     [esp+88h+var_88], offset unk_403000
call    printf
mov     eax, [ebp+var_54]
mov     [esp+88h+var_84], eax
mov     [esp+88h+var_88], offset unk_403000
call    printf

evilpie 2010-05-18 12:38:32

Answer 9

+5 A:

For this code:

int main() {
    int a[100], b[100];
    int * p = b;
    for ( unsigned int i = 0; i < 100; i++ ) {
        a[i] = i;
        *p++ = i;
    }
    return a[1] + b[2]; 
}

when built with -O3 optimisation in g++, the statement:

a[i] = i;

produced the assembly output:

mov    %eax,(%ecx,%eax,4)

and this statement:

*p++ = i;

produced:

mov    %eax,(%edx,%eax,4)

So in this case there was no difference between the two. However, this is not and cannot be a general rule - the optimiser might well generate completely different code for even a slightly different input.

anon 2010-05-18 12:39:24

Answer 10

A:

Short answer: the only way to know for sure is to code up both versions and compare performance. I would personally be surprised if there were a measurable difference unless you were doing a lot of array accesses in a really tight loop. If this is something that happens once or twice over the lifetime of the program, or depends on user input, it's not worth worrying about.

Remember that the expression a[i] is evaluated as *(a+i), which is an addition plus a dereference, whereas *p is just a dereference. Depending on how the code is structured, though, it may not make a difference. Assume the following:

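int a[N]; // for any arbitrary N > 1; assume it has been filled with values
int *p = a;
size_t i;

for (i = 0; i < N; i++)
  printf("a[%zu] = %d\n", i, a[i]);  // %zu is the conversion for size_t

for (i = 0; i < N; i++) {
  // Read p before the call: using p and p++ as two arguments of the
  // same call is unsequenced and therefore undefined behavior.
  void *addr = (void*) p;
  printf("*(%p) = %d\n", addr, *p++);
}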

Now we're comparing a[i] to *p++, which is a dereference plus a postincrement (in addition to the i++ in the loop control); that may turn out to be a more expensive operation than the array subscript. Not to mention we've introduced another variable that's not strictly necessary; we're trading a little space for what may or may not be an improvement in speed. It really depends on the compiler, the structure of the code, optimization settings, OS, and CPU.

Worry about correctness first, then worry about readability/maintainability, then worry about safety/reliability, then worry about performance. Unless you're failing to meet a hard performance requirement, focus on making your intent clear and easy to understand. It doesn't matter how fast your code is if it gives you the wrong answer or performs the wrong action, or if it crashes horribly at the first hint of bad input, or if you can't fix bugs or add new features without breaking something.

John Bode 2010-05-18 13:35:35
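
The advice to code up both versions and compare them could look like the sketch below. It is a rough harness under assumptions, not a rigorous benchmark: `sum_index`, `sum_pointer` and `time_sum` are hypothetical names, and `clock()` from `<time.h>` only measures processor time at a coarse resolution.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

typedef long (*sum_fn)(const int *, size_t);

/* Version 1: access by index. */
static long sum_index(const int *a, size_t n)
{
    long total = 0;
    size_t i;
    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Version 2: access through an incremented pointer. */
static long sum_pointer(const int *a, size_t n)
{
    long total = 0;
    const int *p = a;
    const int *end = a + n;
    for (; p != end; p++)
        total += *p;
    return total;
}

/* Calls fn `repeats` times and returns the CPU seconds used.
 * The last result is stored in *out so the caller can check correctness. */
static double time_sum(sum_fn fn, const int *a, size_t n, int repeats, long *out)
{
    /* The volatile store discourages the optimizer from dropping the calls;
     * it is not a guarantee, so the assembly is still worth a look. */
    volatile long sink = 0;
    clock_t start, stop;
    int r;
    assert(a && out && repeats > 0);
    start = clock();
    for (r = 0; r < repeats; r++)
        sink = fn(a, n);
    stop = clock();
    *out = sink;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling `time_sum(sum_index, data, n, repeats, &result)` and `time_sum(sum_pointer, data, n, repeats, &result)` on the same large array, with optimizations enabled and over several runs, gives a first impression of whether there is any difference on a given compiler and CPU.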
