While profiling my app with PIX, I noticed that in DX10 mode the GPU spends most of its time idle, waiting for a resource that is not available. Because of this, it always runs in lockstep with the CPU: if the CPU is processing frame X, the GPU is also processing frame X.

Some notes:

1) The app is GPU limited (the CPU is basically idle, at 20% CPU usage in the heaviest scene).

My questions are:

1) How should I interpret these results? In PIX, every frame on the GPU side shows 2-3 small red bars (which, as far as I know, mean "resource unavailable") followed by a medium/big gray bar (which means GPU idle). The CPU side shows some operations, a big empty bar, and then some other operations (is it waiting for something?).

Another note: when the GPU is idle, the CPU is generally working. (The reverse is obviously not true.)

2) What calls can make a resource become unavailable?

Is a Map with DISCARD considered a blocking call?
A query to get the DESC of an object?
Does sharing a shader Effect count as contention?
What others?

A typical frame consists of:

41 DrawPrimitives/DrawIndexedPrimitives calls (most objects are instanced)
7 or 8 locks on a vertex buffer with discard
9 pixel shader/vertex shader changes
1 SetRenderTarget

Thanks!

P.S. Screenshot of PIX:

http://img191.imageshack.us/img191/6800/42594100.jpg

If I use a single draw call with the same GPU load (for example, a particle engine with x particles or an instanced object) instead of the full game, I get a full blue bar and the GPU is correctly 2-3 frames behind the CPU.

EDIT: I'm focusing more and more on the Effect framework, which is probably the cause of this problem. I share one effect between several objects to save memory and creation time. Is it safe to assume this causes no contention?

+1  A: 

What comes to mind with the provided information:

  • Do you use double buffering with vsync? Maybe the CPU and GPU are both waiting for the backbuffer to become available. Try triple buffering or immediate presentation.
  • Have you tried locking your vertex buffer with a NOOVERWRITE circular strategy instead of DISCARD 8 times? Maybe there is too much memory pressure for the GPU to allocate a new buffer for each discard. Also, some hardware doesn't allow discarding the same vertex buffer more than X times before it gets to render its stuff.
  • Since you are sharing the same effect, are the parameters also shared?
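
To make the NOOVERWRITE circular strategy from the second point concrete, here is a minimal sketch of just the bookkeeping, with no Direct3D calls so it compiles anywhere. `MapMode`, `MapRequest` and `RingCursor` are hypothetical names invented for illustration. The idea is that `MapMode::NoOverwrite` would translate to `D3D10_MAP_WRITE_NO_OVERWRITE` and `MapMode::Discard` to `D3D10_MAP_WRITE_DISCARD` when calling `ID3D10Buffer::Map` on one large dynamic vertex buffer. It also assumes each draw reads only the region reserved for it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for D3D10_MAP_WRITE_NO_OVERWRITE and
// D3D10_MAP_WRITE_DISCARD, so the sketch needs no Direct3D headers.
enum class MapMode { NoOverwrite, Discard };

// What the caller should pass to Map(), and where to start writing.
struct MapRequest {
    MapMode mode;
    std::size_t offset;  // byte offset into the dynamic buffer
};

// Tracks the write position inside one large dynamic vertex buffer.
class RingCursor {
public:
    explicit RingCursor(std::size_t capacityBytes)
        : capacity_(capacityBytes), head_(0) {}

    // Reserve `bytes` for the next batch of vertices.
    MapRequest reserve(std::size_t bytes) {
        assert(bytes > 0 && bytes <= capacity_);
        if (head_ + bytes > capacity_) {
            // Out of room: wrap to the start and discard once, so the
            // driver can hand back fresh memory instead of stalling.
            head_ = bytes;
            return {MapMode::Discard, 0};
        }
        // Still room: append after data the GPU may still be reading.
        const std::size_t offset = head_;
        head_ += bytes;
        return {MapMode::NoOverwrite, offset};
    }

private:
    std::size_t capacity_;
    std::size_t head_;
};
```

With a buffer sized for several frames of vertices, most of the 7 or 8 per-frame locks become no-overwrite appends, and a discard only happens when the cursor wraps.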
Coincoin
1) Tried all types: immediate/double/triple buffering. No change.
2) I discard a buffer once per frame at most (there are 8 buffers).
3) Hmm... nope, the parameters are separate, as they belong to separate logical classes.
feal87