If glReadPixels is too slow for you, then glCopyTexImage2D and glCopyTexSubImage2D aren’t going to be a whole lot faster. On platforms with support for framebuffer objects, like iOS, the recommended (i.e. faster) way to get GPU-rendered image data into a texture is to use that texture as the color attachment for a framebuffer object and render directly into it. That said, if you still want to pursue this method, here’s what you need to do to fix it:
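For reference, here is a rough sketch of the framebuffer-object route on OpenGL ES 1.1, using the entry points from the OES_framebuffer_object extension. The helper name make_texture_framebuffer is made up for illustration, and the code assumes you have already included your platform's GL ES headers (on iOS, <OpenGLES/ES1/gl.h> and <OpenGLES/ES1/glext.h>), that a GL context is current, and that tex already has storage allocated with glTexImage2D.

```c
#include <stddef.h>

#ifdef GL_FRAMEBUFFER_OES   /* compiled only if the GL ES headers were included above */

/* Hypothetical helper: returns a framebuffer object whose color attachment
 * is `tex`, or 0 if the framebuffer is not complete. */
static GLuint make_texture_framebuffer(GLuint tex)
{
    GLuint fbo = 0;

    glGenFramebuffersOES(1, &fbo);
    glBindFramebufferOES(GL_FRAMEBUFFER_OES, fbo);

    /* Attach mipmap level 0 of the texture as the color buffer. */
    glFramebufferTexture2DOES(GL_FRAMEBUFFER_OES, GL_COLOR_ATTACHMENT0_OES,
                              GL_TEXTURE_2D, tex, 0);

    if (glCheckFramebufferStatusOES(GL_FRAMEBUFFER_OES)
            != GL_FRAMEBUFFER_COMPLETE_OES) {
        glDeleteFramebuffersOES(1, &fbo);
        return 0;
    }
    return fbo;
}

#endif /* GL_FRAMEBUFFER_OES */
```

While the returned framebuffer is bound, anything you draw lands in the texture, so no copy is needed afterwards. Remember to rebind your on-screen framebuffer before presenting.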
First, you’re passing bad arguments to glCopyTexImage2D. The third argument, internalformat, should probably be GL_RGBA instead of 0. If you had called glGetError after calling glCopyTexImage2D, you would probably have gotten GL_INVALID_OPERATION. See the OpenGL ES 1.1 man pages for glCopyTexImage2D and glCopyTexSubImage2D.
Second, as you’ve already observed, glCopyTexImage2D requires its width and height arguments to be powers of two as well. The correct way to deal with this is to allocate a power-of-two texture image at least as large as the region you want, using glTexImage2D (you can pass NULL for pixels here), then use glCopyTexSubImage2D to copy your framebuffer contents into a subrectangle of it. Note that glCopyTexSubImage2D doesn’t take an internalformat argument: because it’s updating a subrectangle of an existing texture, it uses that texture’s format.
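A minimal sketch of that allocate-then-copy approach follows. The names next_pow2, CopyPlan, plan_copy and copy_framebuffer_to_texture are made up for illustration. The GL part assumes a current context and that your platform's GL ES header was included before this code; it also copies from framebuffer origin (0, 0), which may not be what you want.

```c
#include <stddef.h>

/* Smallest power of two >= n (returns 1 for n == 0).
 * Assumes n <= 2^31, which is far beyond any texture size. */
static unsigned next_pow2(unsigned n)
{
    unsigned p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

typedef struct {
    unsigned tex_w, tex_h; /* power-of-two size to allocate */
    float max_s, max_t;    /* texture coordinates of the copied region's far corner */
} CopyPlan;

/* Work out the allocation size for a w x h copy, plus the texture
 * coordinates that cover only the copied part of the texture. */
static CopyPlan plan_copy(unsigned w, unsigned h)
{
    CopyPlan p;
    p.tex_w = next_pow2(w);
    p.tex_h = next_pow2(h);
    p.max_s = (float)w / (float)p.tex_w;
    p.max_t = (float)h / (float)p.tex_h;
    return p;
}

#ifdef GL_TEXTURE_2D   /* compiled only if a GL ES header was included above */

/* Allocates a power-of-two RGBA texture image for `tex`, copies the w x h
 * framebuffer region at (0, 0) into its corner, and returns glGetError(). */
static GLenum copy_framebuffer_to_texture(GLuint tex, unsigned w, unsigned h)
{
    CopyPlan p = plan_copy(w, h);

    glBindTexture(GL_TEXTURE_2D, tex);

    /* NULL pixels: allocate storage without uploading any data. */
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA,
                 (GLsizei)p.tex_w, (GLsizei)p.tex_h, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, NULL);

    /* Args: target, level, xoffset, yoffset, x, y, width, height. */
    glCopyTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 0, 0,
                        (GLsizei)w, (GLsizei)h);

    return glGetError();
}

#endif /* GL_TEXTURE_2D */
```

For a 320×480 framebuffer this allocates a 512×512 texture, and you would draw with texture coordinates running from 0 to max_s and 0 to max_t rather than 0 to 1. In real code you would call glTexImage2D once up front and only repeat the glCopyTexSubImage2D call each frame.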
For the record, glGetTexImage doesn’t exist in OpenGL ES 1.1 or 2.0, which is why you’re getting an implicit declaration.