views:

200

answers:

4

Hi everyone, I was just wondering if this is expected behavior in C++. The code below runs in around 0.001 ms:

for (int l = 0; l < 100000; l++) {
    int total = 0;
    for (int i = 0; i < num_elements; i++) {
        total += i;
    }
}

However, if the results are written to an array, the execution time shoots up to 15 ms:

int *values = (int*)malloc(sizeof(int) * 100000);
for (int l = 0; l < 100000; l++) {
    int total = 0;
    for (unsigned int i = 0; i < num_elements; i++) {
        total += i;
    }
    values[l] = total;
}

I can appreciate that writing to the array takes time, but is a jump from 0.001 ms to 15 ms proportionate to the cost of the writes?

Cheers everyone

+10  A: 

It looks like the compiler is optimizing that loop out entirely in the first case.

Since total is never used afterwards, the net effect of the loop is a no-op, so the compiler just removes it.

Anon.
Aha! I thought as much. Is there a way to disable this optimization for benchmarking, or is that it?
Laurence Dawson
It largely depends on what compiler and IDE you're using. Check the help file/man page for what you need to do to disable optimizations.
Anon.
Try using total after the loop, or compile with -O0 if you are using gcc.
Tristan Su
Print it out at the end, or else turn this into a function and return the value from the function
ux9i
@ux9i - Printing comes with its own performance problems, since I/O operations are horrendously slow. It's a poor way to benchmark things.
Chris Lutz
Duh, don't time the printing part. The printing is there only to stop the optimization of the timed part, by providing a "data sink".
MSalters
+1  A: 

I would suspect that what you are seeing is an effect of virtual memory and possibly paging. The malloc call is going to allocate a decent-sized chunk of memory that is probably represented by a number of virtual pages, and each page is mapped into process memory separately (typically faulted in on its first write).

You may also be measuring the cost of calling malloc depending on how you timed the loop. In either case, the performance is going to be very sensitive to compiler optimization options, threading options, compiler versions, runtime versions, and just about anything else. You cannot safely assume that the cost is linear with the size of the allocation. The only thing that you can do is measure it and figure out how to best optimize once it has been proven to be a problem.

D.Shawley
+9  A: 

The first example can be implemented using just CPU registers, which can be accessed billions of times per second. The second example uses so much memory that it certainly overflows the L1 and possibly the L2 cache (depending on the CPU model), which will be slower. Still, 15 ms / 100,000 iterations comes out to 150 ns per iteration, and each iteration also runs the entire inner summing loop, so that's not slow.

MSalters
+1 for pointing out CPU registers. In the second case, the code is stepping through memory, writing bytes (pulling in pages of memory, forcing caches to flush, then writing only a single value). In the former case, it can easily be pure L1. This answer deserves more upvotes.
anon
great answer, thanks
Laurence Dawson
+2  A: 

It's very simple. In the first case you have just 3 variables, which can easily be stored in GPRs (general-purpose registers). That doesn't mean they are there all the time, but they are probably in the L1 cache, which means they can be accessed very fast.

In the second case you have more than 100k variables and need about 400 kB to store them. That is definitely too much for the registers and the L1 cache. In the best case they could all be in the L2 cache, but probably not all of them will be. If something is not in a register, L1, or L2 (I assume your processor doesn't have an L3), you need to fetch it from RAM, and that takes much more time.

Tomek Tarczynski