views: 283
answers: 4

I am coding a breakout clone. I had one version in which the structures were only one level deep. That version runs at 70 fps.

For more clarity in the code I decided it should have more abstraction, so I created more structs. Most of the time the structures are now two or three levels deep. This version runs at 30 fps.

Since there are some other differences besides the structures, I ask you: can using a lot of structs in C slow down the code significantly?

For example on the second version, I am using:

struct Breakout
{
   Ball ball;
   Paddle paddle;
   Level* levels;
};

struct Level
{
   Bricks* bricks;
};

So I am frequently accessing, for example, breakout.levels[level_in_play].bricks[i].visible. Could this be a possible cause?

Thanks.

+7  A: 

Doing a lot of pointer dereferences can be a performance hit. When you split a big struct up into smaller structs, two things happen:

  1. Accessing a member of a sub-struct requires an additional pointer dereference and memory fetch, which is slightly slower, and
  2. You can reduce the locality of reference, which causes more cache misses and page faults and can drastically reduce performance.

The loss of locality of reference is probably what is biting you here. If possible, try to allocate related structs in the same malloc block, which increases the likelihood that they will be cached together.
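
As a rough sketch of the "same malloc block" idea against structs like the ones in the question (the Brick fields and the create_level function here are invented for illustration, not taken from the asker's code):

#include <stdlib.h>

typedef struct { int visible; float x, y; } Brick;   /* hypothetical brick layout */

typedef struct
{
   size_t brick_count;
   Brick* bricks;
} Level;

/* One allocation holds the Level header and its bricks, so the header and the
   brick array sit next to each other in memory and tend to share cache lines. */
Level* create_level(size_t brick_count)
{
   Level* level = malloc(sizeof(Level) + brick_count * sizeof(Brick));
   if (level == NULL)
      return NULL;
   level->brick_count = brick_count;
   level->bricks = (Brick*)(level + 1);   /* bricks start right after the header */
   return level;
}

A C99 flexible array member achieves the same thing; the point is just that the bricks a level iterates over end up contiguous with the level that owns them.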

Daniel Pryden
Add in some profiling, and this is spot on.
John Percival Hackworth
Have you ever really found the dereference of a pointer to be the cause of a performance problem? Sure it will cause a tiny, almost infinitesimal amount of extra work, but I don't think pointer dereferencing will ever become a problem unless you are performing it a gazillion times for every other operation. Profiling would be the only way to avoid premature pessimization.
Allbite
@Allbite: Apparently I wasn't clear enough. It's not the dereference that can cause the performance hit (at least, not likely -- I'm sure there are edge cases where it matters), it's the fact that you're potentially putting closely related pieces of data at different places in memory. Of course profiling the application is important, but locality of reference problems can be hard to spot with a profiler running, since the profiler is already affecting performance and locality of reference just by being attached.
Daniel Pryden
+4  A: 

In the first place, it's easy and tempting to guess what the problem is. The sneaky thing about guesses is that they are sometimes right. But why guess, when you can find out for drop-dead sure what's taking the time? I recommend this approach.

That said, here's my guess. malloc and free, if you single-step through them at the assembly language level, are probably doing a lot more than you thought. I only allocate memory for structures if I know I will not be doing it at particularly high frequency. If I must allocate/deallocate them dynamically, at high frequency, it helps to have a free list of used copies, so I can just grab them off the list rather than going to malloc all the time.
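
A minimal sketch of such a free list, assuming a hypothetical Particle struct (any struct that gets allocated and freed at high frequency works the same way):

#include <stdlib.h>

typedef struct Particle
{
   float x, y, dx, dy;
   struct Particle* next_free;   /* link used only while the object sits on the free list */
} Particle;

static Particle* free_list = NULL;

Particle* particle_alloc(void)
{
   if (free_list != NULL)               /* reuse a previously freed copy if one exists */
   {
      Particle* p = free_list;
      free_list = p->next_free;
      return p;
   }
   return malloc(sizeof(Particle));     /* only fall back to malloc when the list is empty */
}

void particle_free(Particle* p)
{
   p->next_free = free_list;            /* push onto the free list instead of calling free() */
   free_list = p;
}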

Nevertheless, take some stackshots. Chances are you can fix a series of problems and make it a whole lot faster.

Mike Dunlavey
+3  A: 

Adding extra layers of dereferencing adds a small (very small) amount of overhead. The reason is that each -> the compiler sees means an extra memory lookup and offset. For instance, c->b->a requires the compiler to load the pointer c, dereference it and offset to b, load that pointer, offset to a, and finally load a itself: a chain of dependent memory loads. Doing c.b.a just computes a compile-time offset from the address of c and does a single direct load of a. That is one load versus three.
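
For concreteness, the difference between nesting by pointer and nesting by value (types invented for the example):

struct A { int value; };
struct B { struct A* a; };   /* b->a->value follows two pointers before reaching the int */
struct C { struct A a; };    /* c.a.value is a single load at a compile-time offset      */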

Unless this type of work is being done a ton in small, tight loops, it won't amount to squat for time. If you are doing this in heavy inner loops though (and your compiler isn't helping you), then it could add up. For those cases, consider caching the lowest level struct pointer and working from there.
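
Applied to the access pattern from the question, that caching might look like this (brick_count and draw_brick are hypothetical stand-ins):

/* Hoist the innermost pointer out of the loop instead of re-walking
   breakout.levels[level_in_play].bricks on every iteration. */
Bricks* bricks = breakout.levels[level_in_play].bricks;
for (int i = 0; i < brick_count; i++)
{
   if (bricks[i].visible)
      draw_brick(&bricks[i]);
}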

That said, anytime you bring up performance, step one is to profile. Without a profile, you are guessing. You have made an assertion that struct dereferencing is the root of your performance problem, but without an up-to-date and valid profile (on a release build) you are guessing and probably wasting time.

Michael Dorgan
+1 - In my experience it is unlikely although not impossible that structs are causing a big slowdown. Profile.
Steve Rowe
+1  A: 

Not "lots of structs" in an of itself, other that the potential for a greater number of memory accesses. "lots of indirection" however is more likely to be a cause. You have to consider how many memory accesses are generated in order to get to the actual data. Data proximity and size may also affect caching, but that is much harder to analyse.

Also, since you mentioned in a comment that you are performing dynamic memory allocation: the time taken to find and allocate a block is non-deterministic and variable. If you are repeatedly allocating and freeing blocks during execution of the algorithm (rather than pre-allocating at initialisation, for example), this can cause both degradation and variability in performance.
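
For the breakout case, pre-allocating at initialisation could look something like this (level_count, bricks_per_level and the missing error handling are simplifications; during play only fields such as visible change, so the frame loop never calls malloc or free):

#include <stdlib.h>

/* Allocate every level's bricks once, up front, instead of during gameplay. */
void load_levels(struct Breakout* game, int level_count, int bricks_per_level)
{
   game->levels = malloc(level_count * sizeof(Level));
   for (int l = 0; l < level_count; l++)
      game->levels[l].bricks = malloc(bricks_per_level * sizeof(Bricks));
}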

If you have a profiling tool, profile the code to see where the performance hit occurs.

Clifford