views: 135
answers: 3

I'm using OpenGL ES 1.1 to render a large view in an iPhone app. I have a "screenshot"/"save" function, which basically creates a new GL context offscreen, and then takes exactly the same geometry and renders it to the offscreen context. This produces the expected result.
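
Roughly, the offscreen setup looks like the sketch below (simplified; the pixel format, the 512x512 size, and the variable names are stand-ins for what the real code does), using the GL_OES_framebuffer_object entry points available under ES 1.1:

    #import <OpenGLES/EAGL.h>
    #import <OpenGLES/ES1/gl.h>
    #import <OpenGLES/ES1/glext.h>

    // Simplified sketch of the offscreen target; format and size are stand-ins.
    EAGLContext *offscreenContext =
        [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES1];
    [EAGLContext setCurrentContext:offscreenContext];

    GLuint framebuffer = 0, colorRenderbuffer = 0;
    GLint width = 512, height = 512;

    glGenFramebuffersOES(1, &framebuffer);
    glBindFramebufferOES(GL_FRAMEBUFFER_OES, framebuffer);

    glGenRenderbuffersOES(1, &colorRenderbuffer);
    glBindRenderbufferOES(GL_RENDERBUFFER_OES, colorRenderbuffer);
    glRenderbufferStorageOES(GL_RENDERBUFFER_OES, GL_RGBA8_OES, width, height);
    glFramebufferRenderbufferOES(GL_FRAMEBUFFER_OES, GL_COLOR_ATTACHMENT0_OES,
                                 GL_RENDERBUFFER_OES, colorRenderbuffer);

    // ...then exactly the same state setup and draw calls as the onscreen path...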

Yet for reasons I don't understand, the amount of time (measured with CFAbsoluteTimeGetCurrent before and after) that the actual draw calls take when sending to the offscreen buffer is more than an order of magnitude longer than when drawing to the main framebuffer that backs an actual UIView. All of the GL state is the same for both, and the geometry list is the same, and the sequence of calls to draw is the same.

Note that there happens to be a LOT of geometry here-- the order-of-magnitude difference is clearly measurable and repeatable. Also note that I'm not timing the glReadPixels call, which is the thing that I believe actually pulls data back from the GPU. This is just a measure of the time spent in e.g. glDrawArrays.
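
The measurement itself is nothing fancy; it's essentially the pattern below wrapped around the draw loop (a sketch, not the literal code-- batchCount, batchVertices, and batchVertexCounts are hypothetical names for my real geometry list):

    #import <Foundation/Foundation.h>
    #import <OpenGLES/ES1/gl.h>

    // Sketch of the timing; only the draw calls sit inside the timed region.
    // glReadPixels is never called here.
    CFAbsoluteTime start = CFAbsoluteTimeGetCurrent();

    glEnableClientState(GL_VERTEX_ARRAY);
    for (NSUInteger i = 0; i < batchCount; i++) {   // hypothetical geometry batches
        glVertexPointer(2, GL_FLOAT, 0, batchVertices[i]);
        glDrawArrays(GL_TRIANGLE_STRIP, 0, batchVertexCounts[i]);
    }

    CFAbsoluteTime elapsed = CFAbsoluteTimeGetCurrent() - start;
    NSLog(@"draw calls took %f s", elapsed);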

I've tried:

  • Rendering that geometry to the screen again just after doing the offscreen render: the screen draw is just as fast as before.
  • Rendering the offscreen pass twice in a row-- both runs show the same slow draw speed.

Is this an inherent limitation of offscreen buffers? Or might I be missing something fundamental here?

Thanks for your insight/explanation!

A: 

Could offscreen rendering be forcing the GPU to flush all of its normal state, do your render, flush the offscreen context, and then reload all the normal state back in from CPU memory? That could take a lot longer than any rendering that uses data and framebuffers which stay completely on the GPU.

hotpaw2
Thanks for the thought. I made an edit to the above after testing it with the offscreen draw sandwiched between two screen draws-- still fast for both screen draws, crawls offscreen.
quixoto
How about multiple offscreen draws back-to-back? (Render N copies into the same framebuffer without saving.) Are they all equally slow?
hotpaw2
Yes, two back to back are equally slow. And I'm even omitting the `glReadPixels` call, which I believe would be the thing that actually causes the bytes to come back to the CPU. This is literally just the time spent in the draw calls.
quixoto
@hotpaw2: updated question with these findings also. thanks.
quixoto
A: 

I'm not an expert on this, but as I understand it, graphics accelerators are designed to send data out to the screen, so the normal path is Code ---vertices---> Accelerator ---rendered-image---> Screen. In your case you are flushing the framebuffer back into main memory, which might be hitting a bandwidth bottleneck in the memory controller or somewhere else along that path.

Novikov
Thanks for the idea. I don't think this is likely, because the gap in performance is SO great here. The memory bandwidth isn't the limiting factor.
quixoto
@quixoto, GPU memory bandwidth isn't the issue, and neither is CPU memory bandwidth: both are high. The issue is the bandwidth between CPU memory and GPU memory, which by all reports appears to be maybe an order of magnitude slower (there might be some slow software process or hardware conversion in the middle). If it's a hardware bottleneck, there's no way around it, since there's no other way to get pixels from here to there.
hotpaw2
@hotpaw2: Thoughts on how I could more effectively diagnose this?
quixoto
+1  A: 

Your best bet is probably to sample both your offscreen rendering and your window-system rendering, each running in a tight loop, with the CPU Sampler in Instruments, and compare the results to see what differences there are.

Also, could you be a bit more clear about what exactly you mean by “render the offscreen thing twice in a row?” You mentioned at the beginning of the question that you “create a new GL context offscreen”—do you mean a new framebuffer and renderbuffer, or a completely new EAGLContext? Depending on how many new resources and objects you’re creating in order to do your offscreen rendering, the driver may need to do a lot of work to set up these resources the first time you use them in a draw call. If you’re just screenshotting the same content you were putting onscreen, you shouldn’t even need to do any of this—it should be sufficient to call glReadPixels before -[EAGLContext presentRenderbuffer:], since the backbuffer contents will still be defined at that point.

Pivot
@Pivot: Your discussion on these topics is consistently useful and welcome. I'm creating a whole new context, renderbuffer, and framebuffer for the offscreen render. It's not a "pure" screenshot: the background is a different color in the screenshot (actually simpler-- less geometry), so I need to render it offscreen. I assumed I needed to create a whole new context, etc, to do this; could it be done safely with the same context and a fresh renderbuffer and/or framebuffer?
quixoto
You definitely don’t need to create a new context for this. If the dimensions of your screenshot match the dimensions of your onscreen view, you don’t need a new framebuffer or renderbuffer, either, since the results of your drawing don’t appear onscreen unless you explicitly swap. Just render to your regular view, call glReadPixels instead of -[EAGLContext presentRenderbuffer:], and be sure to clear your buffers completely before you start drawing onscreen again if you want to make sure nothing is left over.
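In rough outline (drawScene, viewFramebuffer, and the backing dimensions below are placeholders for your own drawing code and buffers):

    // Render the screenshot content into the normal onscreen framebuffer,
    // then read it back instead of presenting it.
    glBindFramebufferOES(GL_FRAMEBUFFER_OES, viewFramebuffer);
    drawScene();   // same geometry, screenshot background

    NSMutableData *pixels =
        [NSMutableData dataWithLength:backingWidth * backingHeight * 4];
    glReadPixels(0, 0, backingWidth, backingHeight,
                 GL_RGBA, GL_UNSIGNED_BYTE, [pixels mutableBytes]);
    // ...wrap the bytes in a CGImage/UIImage as needed...

    // Clear completely so nothing from the screenshot leaks into the next onscreen frame.
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);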
Pivot
Thanks for all the great info and direction. Haven't nailed this one totally down yet, but this is the right start.
quixoto