I've been trying to set up some sort of geometry batching for a week or so, but there isn't a lot of information online about how other people have implemented it. Basically, I just want to 'catch' every draw call, sort the corresponding meshes by texture, and then draw all meshes that share a texture in one go.
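To make the "sort by texture" part concrete, here is a minimal sketch in plain C++. The names `DrawCall`, `textureId`, `meshId`, `sortByTexture`, and `countBatches` are made up for illustration and are not part of my batcher or of any library; the integer ids stand in for real GL texture names and mesh handles.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record of one deferred ("caught") draw call.
struct DrawCall {
    std::uint32_t textureId; // stand-in for a GL texture name
    int meshId;              // stand-in for whatever identifies the mesh
};

// Reorders the deferred calls so that calls sharing a texture are adjacent.
// stable_sort keeps the submission order within each texture group.
inline void sortByTexture(std::vector<DrawCall>& calls) {
    std::stable_sort(calls.begin(), calls.end(),
                     [](const DrawCall& a, const DrawCall& b) {
                         return a.textureId < b.textureId;
                     });
}

// Counts how many texture binds a sorted list needs:
// one per run of equal texture ids.
inline int countBatches(const std::vector<DrawCall>& sorted) {
    int batches = 0;
    for (std::size_t i = 0; i < sorted.size(); ++i) {
        // Precondition: the list has already been sorted by texture.
        assert(i == 0 || sorted[i - 1].textureId <= sorted[i].textureId);
        if (i == 0 || sorted[i].textureId != sorted[i - 1].textureId) {
            ++batches;
        }
    }
    return batches;
}
```

With four calls using textures 7, 3, 7, 3, sorting leaves two runs, so only two texture binds are needed instead of four.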
What I've been doing is going through each vertex in the mesh, transforming it by the Model-View matrix (to put it into eye space), and then storing that vertex in a larger array to await later rendering. This works, but it runs terribly slowly, as it seems like I'm doing in software what OpenGL would be doing in hardware (all the matrix transforms). Is there some other way to do batching that doesn't require you to do the transforms by hand? Can I say "hey, GL, here are a bunch of vertices and here's how they should be transformed" and then send it on its merry way?
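Stripped of the .pod specifics, the CPU-side approach described above looks roughly like the sketch below. It is a simplification under stated assumptions: `Vec3`, `Mat4`, `transformPoint`, and `Batch` are hypothetical names, the matrix is column-major as in OpenGL, and only positions are handled (no normals, UVs, or skinning).

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

// 4x4 matrix stored column-major, the layout OpenGL expects.
using Mat4 = std::array<float, 16>;

// Transforms a point (implicit w = 1) by m, ignoring the projective row.
inline Vec3 transformPoint(const Mat4& m, const Vec3& p) {
    return { m[0] * p.x + m[4] * p.y + m[8]  * p.z + m[12],
             m[1] * p.x + m[5] * p.y + m[9]  * p.z + m[13],
             m[2] * p.x + m[6] * p.y + m[10] * p.z + m[14] };
}

// Hypothetical per-texture batch: one shared vertex array and index array.
struct Batch {
    std::vector<Vec3> vertices;
    std::vector<std::uint16_t> indices;

    // Pre-transforms a mesh on the CPU and appends it to the batch.
    // Indices are rebased so they point at the appended vertices.
    void append(const std::vector<Vec3>& meshVerts,
                const std::vector<std::uint16_t>& meshIndices,
                const Mat4& modelView) {
        // 16-bit indices can address at most 65536 vertices per batch.
        assert(vertices.size() + meshVerts.size() <= 65536);
        const std::uint16_t base = static_cast<std::uint16_t>(vertices.size());
        for (const Vec3& v : meshVerts) {
            vertices.push_back(transformPoint(modelView, v)); // the slow part
        }
        for (std::uint16_t i : meshIndices) {
            indices.push_back(static_cast<std::uint16_t>(base + i));
        }
    }
};
```

Once every mesh for a texture has been appended, the whole batch can be submitted with a single `glDrawElements` call using `GL_UNSIGNED_SHORT` indices. The per-vertex matrix multiply in `append` is the cost I'm trying to avoid.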
I should mention I'm doing this on iPhone, so I'm bound by OpenGL ES and the limited hardware. I've watched the Stanford ngmoco presentation on optimizations, and I've been following this guide to model my own texture batcher.
Here's an example of what I'm doing. This is for skinned meshes... I use PowerVR's .pod format which exports an interleaved array of the vertex information.
TextureBatcher * tb = [TextureBatcher getSharedTextureBatcher];
// The next line gives me the indices of the verts used by this batch
GLushort * indices = (GLushort*) (mesh.sFaces.pData + (uint) &((GLushort *)0)[3 * mesh.sBoneBatches.pnBatchOffset[batchNum]]);
[tb addIndices:indices Count:i32Tris * 3];
NSMutableSet * alreadyVisitedIndices = [NSMutableSet setWithCapacity:i32Tris*3];
for(int i = 0; i < i32Tris*3; i++){
    if([alreadyVisitedIndices containsObject:[NSNumber numberWithInt:indices[i]]]){
        continue;
    } else {
        GLfloat * verts = (GLfloat*)(mesh.pInterleaved + (uint)mesh.sVertex.pData + (indices[i]*mesh.sVertex.nStride));
        GLfloat * normals = (GLfloat*)(mesh.pInterleaved + (uint)mesh.sNormals.pData + (indices[i]*mesh.sNormals.nStride));
        GLfloat * uvs = (GLfloat*)(mesh.pInterleaved + (uint)mesh.psUVW[uvSet].pData + (indices[i]*mesh.psUVW[uvSet].nStride));
        PVRTVec4 vert = PVRTVec4(verts[0], verts[1], verts[2], 1);
        PVRTVec4 normal = PVRTVec4(normals[0], normals[1], normals[2], 0); // w=0 to skip translation
        GLfloat u = uvs[0];
        GLfloat v = uvs[1];
        PVRTVec4 newVert = PVRTVec4(0.f);
        PVRTVec4 newNormal = PVRTVec4(0.f);
        if(bSkinning){
            for(int j = 0; j < mesh.sBoneIdx.n; j++){
                PVRTMat4 mat = boneMatrices[(mesh.pInterleaved + (uint)mesh.sBoneIdx.pData + (indices[i]*mesh.sBoneIdx.nStride))[j]];
                GLfloat weight = ((GLfloat*)(mesh.pInterleaved + (uint)mesh.sBoneWeight.pData + (indices[i]*mesh.sBoneWeight.nStride)))[j];
                PVRTMat4 weightedMatrix = mat * weight;
                newVert += weightedMatrix * vert;
                // TODO this should use the inverse transpose but whatever it works.
                newNormal += weightedMatrix * normal;
            }
            [tb addVertex: newVert Normal: newNormal U: u V: v];
        }
        else {
            [tb addVertex: (mModelView * vert) Normal: (mModelView * normal) U: u V: v];
        }
        [alreadyVisitedIndices addObject:[NSNumber numberWithInt:indices[i]]];
    }