I have an OpenGL ES game on the iPhone. My framerate is pretty sucky, ~20fps. Using the Xcode OpenGL ES performance tool on an iPhone 3G, it shows:

Renderer Utilization: 95% to 99%

Tiler Utilization: ~27%

I am drawing a lot of pretty large images with a lot of blending. If I reduce the number of images drawn, framerates go from ~20 to ~40, though the performance tool results stay about the same (renderer still maxed). I think I'm being limited by the fill rate of the iPhone 3G, but I'm not sure.

My questions are: How can I determine with more granularity where the bottleneck is? That is my biggest problem, I just don't know what is taking all the time. If it is fillrate, is there anything I can do to improve it besides just drawing less?

I am using texture atlases. I have tried to minimize image binds, though it isn't always possible (drawing order, not everything fits on one 1024x1024 texture, etc). Every frame I do 10 image binds. This seems pretty reasonable, but I could be mistaken.

I'm using vertex arrays and glDrawArrays. I don't really have a lot of geometry. I can try to be more precise if needed. Each image is 2 triangles and I try to batch things where possible, though often (maybe half the time) images are drawn with individual glDrawArrays calls. Besides the images, I have ~60 triangles worth of geometry being rendered in ~6 glDrawArrays calls. I often glTranslate before calling glDrawArrays.
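
One way to fold the individually drawn images into a single call is to apply each image's translation on the CPU and append the resulting quad to one shared vertex array. The following is only a minimal sketch under assumptions: the `Sprite` struct and `build_sprite_batch` are hypothetical names, the sprites are assumed to be flat 2D quads that share one atlas texture and one matrix, and the layout is two `GL_TRIANGLES` triangles per image.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sprite record (not from the original post). */
typedef struct {
    float x, y;            /* offset that would otherwise go to glTranslatef */
    float w, h;            /* quad size */
    float u0, v0, u1, v1;  /* texture atlas coordinates */
} Sprite;

enum { VERTS_PER_SPRITE = 6 };  /* two triangles per image */

/*
 * Fills `positions` (x, y pairs) and `texcoords` (u, v pairs) with
 * pre-translated vertices for `count` sprites. Both output arrays must hold
 * at least count * VERTS_PER_SPRITE * 2 floats.
 * Returns the number of vertices written.
 */
size_t build_sprite_batch(const Sprite *sprites, size_t count,
                          float *positions, float *texcoords)
{
    assert(sprites != NULL && positions != NULL && texcoords != NULL);

    size_t v = 0;
    for (size_t i = 0; i < count; ++i) {
        const Sprite *s = &sprites[i];
        const float x0 = s->x, y0 = s->y;
        const float x1 = s->x + s->w, y1 = s->y + s->h;

        /* Triangle 1: (x0,y0) (x1,y0) (x0,y1); triangle 2: (x1,y0) (x1,y1) (x0,y1). */
        const float px[VERTS_PER_SPRITE] = { x0, x1, x0, x1, x1, x0 };
        const float py[VERTS_PER_SPRITE] = { y0, y0, y1, y0, y1, y1 };
        const float tu[VERTS_PER_SPRITE] = { s->u0, s->u1, s->u0, s->u1, s->u1, s->u0 };
        const float tv[VERTS_PER_SPRITE] = { s->v0, s->v0, s->v1, s->v0, s->v1, s->v1 };

        for (int k = 0; k < VERTS_PER_SPRITE; ++k, ++v) {
            positions[2 * v]     = px[k];
            positions[2 * v + 1] = py[k];
            texcoords[2 * v]     = tu[k];
            texcoords[2 * v + 1] = tv[k];
        }
    }
    return v;
}
```

The filled arrays could then be handed to glVertexPointer and glTexCoordPointer (2 floats per vertex) and drawn with a single glDrawArrays(GL_TRIANGLES, 0, n) call. This mainly saves CPU time in the driver, so it may not help much if the renderer is the real limit.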

Would it improve the framerate to switch to VBOs? I don't think it is a huge amount of geometry, but maybe it is faster for other reasons?

Are there certain things to watch out for that could reduce performance? E.g., should I avoid glTranslate, glColor4f, etc.?

I'm using glScissor in 3 places per frame. Each use consists of 2 glScissor calls, one to set it up, and one to reset it to what it was. I don't know if there is much of a performance impact here.

If I used PVRTC would it be able to render faster? Currently all my images are GL_RGBA. I don't have memory issues.

One of my fullscreen textures is 256x256. Would it be better to use 480x320 so the phone doesn't have to do any scaling? Is there any other general performance advice for texture sizes?

Here is a rough idea of what I'm drawing, in this order:

1) Switch to perspective matrix.
2) Draw a full screen background image.
3) Draw a full screen image with translucency (this one has a scrolling texture).
4) Draw a few sprites.
5) Switch to ortho matrix.
6) Draw a few sprites.
7) Switch to perspective matrix.
8) Draw sprites and some other textured geometry.
9) Switch to ortho matrix.
10) Draw a few sprites (e.g., game HUD).

Steps 1-6 draw a bunch of background stuff. 8 draws most of the game content. 10 draws the HUD.

As you can see, there are many layers, some of them full screen and some of the sprites are pretty large (1/4 of the screen). The layers use translucency, so I have to draw them in back-to-front order. This is further complicated by needing to draw various layers in ortho and others in perspective.

I will gladly provide additional information if requested. Thanks in advance for any performance tips or general advice on my problem!

Edit:

I added some logging to see how many glDrawArrays calls I am doing, and with how much data. I do about 20 glDrawArrays calls per frame. Usually 1 to 6 of these have about 40 vertices each. The rest of the calls are usually just 2 vertices (one image). I'm just using glVertexPointer and glTexCoordPointer.
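
For reference, per-frame logging of this kind can be as simple as a counter struct that is updated next to every glDrawArrays call. This is a rough sketch only: `FrameStats` and its functions are hypothetical names, and the "small call" threshold is an arbitrary assumption used to separate single-image draws from batched ones.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical per-frame counters (not from the original post). */
typedef struct {
    unsigned draw_calls;      /* number of glDrawArrays calls this frame */
    unsigned total_vertices;  /* sum of the `count` arguments */
    unsigned small_calls;     /* calls at or below the small-call threshold */
} FrameStats;

/* Call at the start of every frame. */
void frame_stats_reset(FrameStats *s)
{
    assert(s != NULL);
    s->draw_calls = 0;
    s->total_vertices = 0;
    s->small_calls = 0;
}

/* Call right next to each glDrawArrays(mode, first, vertex_count). */
void frame_stats_record(FrameStats *s, unsigned vertex_count,
                        unsigned small_threshold)
{
    assert(s != NULL);
    s->draw_calls += 1;
    s->total_vertices += vertex_count;
    if (vertex_count <= small_threshold) {
        s->small_calls += 1;
    }
}

/* Call at the end of the frame to dump the counters. */
void frame_stats_print(const FrameStats *s, FILE *out)
{
    assert(s != NULL && out != NULL);
    fprintf(out, "draws=%u verts=%u small=%u\n",
            s->draw_calls, s->total_vertices, s->small_calls);
}
```

A high share of small calls suggests there is still room for batching, although that only reduces CPU overhead and not the number of pixels filled.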

+1  A: 

The biggest performance killer on the iPhone platform is the number of draw calls and state changes. If you're doing more than 20ish draw calls or state changes, you're going to run into a performance wall.

Batching and texture atlases are your friend.

Yann Ramin
Hi, thanks. I edited the end of my question with additional information about the number of calls I'm making. It seems to be at about the upper end of the reasonable range you mentioned. I will look into batching more of the individual images, but I don't have high hopes that it can be improved further.
NateS
@NateS: If you think you're getting fill rate limited, try PVRTC.
Yann Ramin
In this case, that probably isn't the best advice. Reducing the number of draw calls is done to reduce the amount of CPU time spent within the OpenGL implementation. Given that the renderer is fully utilized, draw call batching will likely not improve this case. Instruments or Shark can be used to profile the CPU usage, and determine how much CPU time is unused, and how much is used within the OpenGL driver.
Frogblast
After much more optimization to reduce draw calls, I gained ~6fps, which is significant. This is surprising because I didn't expect much gain since I felt I was fill limited.
NateS
+1  A: 

Look to Apple's "Best Practices for Working with Texture Data" and "Best Practices for Working with Vertex Data" sections of the OpenGL ES Programming Guide for iPhone OS. They highly recommend (as do others) that you use PVRTC for compressing your textures, because they can offer an 8:1 or 16:1 compression ratio over your standard uncompressed textures. Aside from mipmapping, you seem to be doing the other recommended optimization of using a texture atlas.
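
To put the 8:1 and 16:1 figures in concrete terms, here is a small sketch that computes the storage for one mip level in a few formats. The enum and function names are made up for illustration. The PVRTC sizes follow the formula given in the IMG_texture_compression_pvrtc extension, and the ratios are relative to 32-bit GL_RGBA data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical format tags for illustration only. */
typedef enum {
    FMT_RGBA8888,    /* 32 bits per pixel, uncompressed */
    FMT_16BIT,       /* 16 bits per pixel, e.g. RGB565 or RGBA4444 */
    FMT_PVRTC_4BPP,  /* 4 bits per pixel */
    FMT_PVRTC_2BPP   /* 2 bits per pixel */
} TexFormat;

static size_t max_sz(size_t a, size_t b) { return a > b ? a : b; }

/* Bytes needed for a single mip level of the given size. */
size_t texture_bytes(TexFormat fmt, size_t width, size_t height)
{
    assert(width > 0 && height > 0);
    switch (fmt) {
    case FMT_RGBA8888:
        return width * height * 4;
    case FMT_16BIT:
        return width * height * 2;
    case FMT_PVRTC_4BPP:
        /* PVRTC has a minimum block footprint of 8x8 at 4bpp. */
        return (max_sz(width, 8) * max_sz(height, 8) * 4 + 7) / 8;
    case FMT_PVRTC_2BPP:
        /* ...and 16x8 at 2bpp. */
        return (max_sz(width, 16) * max_sz(height, 8) * 2 + 7) / 8;
    }
    return 0;
}
```

For a 1024x1024 atlas this works out to 4 MiB as RGBA8888, 2 MiB in a 16-bit format, 512 KiB as 4bpp PVRTC and 256 KiB as 2bpp PVRTC. Fewer bytes per texel means less memory traffic when sampling, which is where a speedup would come from.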

You do not appear to be geometry-limited, because (as I discovered in this question) the Tiler Utilization statistic seems to indicate how much of a bottleneck is being caused by geometry size, and yours is only ~27%. However, the iPhone 3G S (and third-generation iPod touch and iPad) support hardware-accelerated VBOs, so you might give those a shot and see how they affect performance. They might not have as much of an effect as compressing textures would, but they're not hard to implement.

Brad Larson
+3  A: 

Given that the Renderer Utilization is basically at 100%, that indicates that the bottleneck is filling, texturing, and blending pixels. Techniques intended to optimize vertex processing (VBOs and vertex formats) or CPU usage (draw call batching) will likely not help, as they will not speed up pixel processing.

Your best bet is to reduce the number of pixels that you are filling, and also look at different texture formats that make better use of the very limited memory bandwidth available on the first generation devices. Prefer the use of PVRTC textures wherever possible, and 16bit uncompressed textures otherwise.
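
A back-of-the-envelope way to see how many pixels are being filled is to add up the screen coverage of every blended layer. The sketch below assumes a 480x320 screen, and the coverage fractions passed in are up to the caller to estimate; the function names are hypothetical and nothing here is measured data.

```c
#include <assert.h>
#include <stddef.h>

enum { SCREEN_W = 480, SCREEN_H = 320 };  /* assumed iPhone 3G resolution */

/*
 * coverage[i] is the fraction of the screen that layer i touches
 * (1.0 = full screen, 0.25 = a quarter-screen sprite).
 * Returns the number of pixels written per frame, counting overdraw.
 */
double pixels_per_frame(const double *coverage, size_t layers)
{
    assert(coverage != NULL || layers == 0);
    double total = 0.0;
    for (size_t i = 0; i < layers; ++i) {
        assert(coverage[i] >= 0.0);
        total += coverage[i] * SCREEN_W * SCREEN_H;
    }
    return total;
}

/* Average number of times each screen pixel is written per frame. */
double overdraw_factor(const double *coverage, size_t layers)
{
    return pixels_per_frame(coverage, layers) / ((double)SCREEN_W * SCREEN_H);
}

/* Pixels written per second at the given frame rate. */
double pixels_per_second(const double *coverage, size_t layers, double fps)
{
    return pixels_per_frame(coverage, layers) * fps;
}
```

As an illustration, two full-screen layers plus two quarter-screen sprites give an overdraw factor of 2.5, or 384,000 blended pixels per frame. Shrinking or removing a layer lowers that number directly, which batching and VBOs do not.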

Frogblast
Thanks. Switching the full screen textures to PVR gained ~4fps, which is nice. I knew they used less memory, but wasn't sure if there would be any performance gain. Also, I changed the images I could to GL_NEAREST and gained another 2-3fps.
NateS
Update: using CADisplayLink also helped a bit.
NateS
A: 

I was in a similar situation (porting a 2D adventure game to the iPad). My 3GS version was running more or less locked at 60fps; putting it on the iPad dropped it (and my jaw) to 20fps.

Turns out ONE of the little gotchas involved is that the PVR cards hate GL_ALPHA_TEST; on the PC it actually has a slight positive effect (especially on older Intel chips), but it's death on fillrate on the iPhone. Changing that to

 glDisable(GL_ALPHA_TEST);

gave me an immediate 100% boost in FPS (up to 40 FPS). Not bad for one line of code.. :)

Allan

Allan Simonsen