When I set glStencilFunc(GL_NEVER, ...), effectively disabling all drawing, and then run my [shader-bound] program, I get no performance increase over letting the fragment shader run. I thought the stencil test happened before the fragment program. Is that not the case, or at least not guaranteed? Replacing the fragment shader with one that simply writes a constant to gl_FragColor does result in a higher FPS.

+1  A: 

Take a look at the following outline of the DX10 pipeline. It shows the stencil test running before the pixel shader:

http://3.bp.blogspot.com/_2YU3pmPHKN4/Sz_0vqlzrBI/AAAAAAAAAcg/CpDXxOB-r3U/s1600-h/D3D10CheatSheet.jpg

and the same is true in DX11:

http://4.bp.blogspot.com/_2YU3pmPHKN4/S1KhDSPmotI/AAAAAAAAAcw/d38b4oA_DxM/s1600-h/DX11.JPG

I don't know if this is mandated by the OpenGL spec, but it would be detrimental for an implementation not to do the stencil test before running the fragment program.

Andreas Brinck
According to this page: http://www.opengl.org/wiki/Rendering_Pipeline_Overview#Pipeline, the stencil and depth tests can run before the fragment shader if the shader does not modify the depth. However, that does not appear to be happening in my case (I am not writing to gl_FragDepth in the fragment shader).
david
+1  A: 

It's actually a bit of both. Per-fragment operations should happen after the fragment program, as you can see in this OpenGL ES 2.0 pipeline diagram. However, many modern graphics cards have an early z test that discards fragments before the shader runs, as long as you don't write to the depth in the fragment shader.

Here is a paper from AMD/ATI that talks about such tests. I remember reading that the spec allows early tests as long as doing them before the shader produces the same result as doing them after, which is why you wouldn't want to modify the depth or discard a fragment in the shader. This thread on the OpenGL forums has some interesting discussion about it.
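
To make the "same result" condition concrete, here is a toy software model in C. It is only a sketch: Fragment, Pixel, shade, depth_test and rasterize are hypothetical names made up for this illustration, not part of OpenGL or any driver. It assumes a shader that neither changes the depth nor discards, and a GL_LESS-style comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of one pixel; not a real GPU or OpenGL API. */
typedef struct {
    float depth;   /* interpolated fragment depth */
    int   color;   /* input to the toy shader */
} Fragment;

typedef struct {
    float depth;        /* value stored in the depth buffer */
    int   color;        /* value stored in the color buffer */
    int   shader_runs;  /* how many times the toy shader executed */
} Pixel;

/* Toy fragment shader: leaves the depth alone and never discards. */
static int shade(const Fragment *f, Pixel *p) {
    p->shader_runs++;
    return f->color * 2;
}

/* GL_LESS-style comparison against the stored depth. */
static bool depth_test(const Fragment *f, const Pixel *p) {
    return f->depth < p->depth;
}

/* Process fragments in order. With early == true, a fragment that fails
 * the depth test is rejected before the shader runs. The test after the
 * shader is kept in both modes; for a survivor it simply passes again. */
static Pixel rasterize(const Fragment *frags, size_t n, bool early) {
    assert(frags != NULL || n == 0);
    Pixel p = { 1.0f, 0, 0 };  /* depth cleared to 1.0 */
    for (size_t i = 0; i < n; i++) {
        const Fragment *f = &frags[i];
        if (early && !depth_test(f, &p))
            continue;          /* rejected before shading */
        int c = shade(f, &p);
        if (depth_test(f, &p)) {
            p.depth = f->depth;
            p.color = c;
        }
    }
    return p;
}
```

Feeding fragments with depths 0.5, 0.8, 0.3 and 0.9 through both modes leaves the same depth and color in the pixel, but the early mode runs the shader only for the two fragments that pass. If shade() could change the depth or kill the fragment, the two orders would no longer be guaranteed to agree, which is the restriction described above.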

Firas Assaad
+1  A: 

In addition to fragment depth modification, there are a few other things that can prevent the depth/stencil test from happening before the fragment shader. If z-writes are enabled, any way of aborting the fragment in the shader will also do this, such as alpha test or the discard shader instruction.

If the GPU wants to do the stencil/z test in the same operation as the z/stencil write, it has to wait until after the fragment shader executes so that it knows the fragment is allowed to write to the z-buffer. This may vary between different cards though. At least it should be easy to tell if it's your current problem.
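
As a rough checklist, the conditions mentioned in this thread can be written as a small C predicate. This is a sketch only: DrawState, early_test_likely and check_examples are hypothetical names, and real drivers apply their own heuristics that may differ between cards.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of the state that matters here; not an OpenGL type. */
typedef struct {
    bool shader_writes_depth;   /* shader assigns gl_FragDepth */
    bool shader_uses_discard;   /* shader contains a discard instruction */
    bool alpha_test_enabled;    /* fixed-function alpha test is on */
    bool depth_stencil_writes;  /* z or stencil writes are enabled */
} DrawState;

/* Returns true when none of the conditions named in this thread rules out
 * testing before the fragment shader. This is an assumption-laden guess,
 * not a guarantee about any particular GPU. */
static bool early_test_likely(const DrawState *s) {
    if (s->shader_writes_depth)
        return false;  /* final depth is unknown until the shader has run */
    bool can_kill = s->shader_uses_discard || s->alpha_test_enabled;
    if (can_kill && s->depth_stencil_writes)
        return false;  /* the write must wait for the kill decision */
    return true;
}

/* A few example states and the answer the predicate gives for each. */
static void check_examples(void) {
    DrawState plain        = { false, false, false, true  };
    DrawState writes_depth = { true,  false, false, true  };
    DrawState discard_zw   = { false, true,  false, true  };
    DrawState discard_no_w = { false, true,  false, false };
    assert(early_test_likely(&plain));
    assert(!early_test_likely(&writes_depth));
    assert(!early_test_likely(&discard_zw));
    assert(early_test_likely(&discard_no_w));
}
```

To check whether this is the problem, temporarily remove any discard from the shader, or disable alpha test and z/stencil writes, and see whether the frame rate changes.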

Alan