The most practical approach seems to be to ignore most of the OpenGL functionality that is not directly applicable (or is slow, or not hardware accelerated, or is no longer a good match for the hardware).
OOP or not, to render a scene these are the types and entities you usually have:
Geometry (meshes). Most often this is an array of vertices and an array of indices (i.e. three indices per triangle, aka a "triangle list"). A vertex can be in some arbitrary format (e.g. only a float3 position; a float3 position + float3 normal; a float3 position + float3 normal + float2 texcoord; and so on). So to define a piece of geometry you need to:
- define its vertex format (could be a bitmask, an enum from a list of formats, ...),
- have an array of vertices, with their components interleaved ("interleaved arrays"),
- have an array of triangles.
If you're in OOP land, you could call this class a Mesh.
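A minimal sketch of such a Mesh, assuming a bitmask vertex format and 32-bit indices. The names VertexFormat, floatsPerVertex and makeTriangle are made up for illustration, not taken from any library:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical vertex-format bitmask: each bit marks one component as present.
enum VertexFormat : std::uint32_t {
    kPosition = 1u << 0,  // float3
    kNormal   = 1u << 1,  // float3
    kTexCoord = 1u << 2,  // float2
};

// Number of floats that one interleaved vertex occupies for a given format.
inline std::size_t floatsPerVertex(std::uint32_t format) {
    std::size_t n = 0;
    if (format & kPosition) n += 3;
    if (format & kNormal)   n += 3;
    if (format & kTexCoord) n += 2;
    return n;
}

struct Mesh {
    std::uint32_t format = kPosition;
    std::vector<float> vertices;         // interleaved vertex components
    std::vector<std::uint32_t> indices;  // triangle list: three indices per triangle

    std::size_t vertexCount() const {
        const std::size_t stride = floatsPerVertex(format);
        return stride ? vertices.size() / stride : 0;
    }
    std::size_t triangleCount() const { return indices.size() / 3; }
};

// Example: a single triangle with position + texcoord per vertex.
inline Mesh makeTriangle() {
    Mesh m;
    m.format = kPosition | kTexCoord;
    m.vertices = {
        // x,  y,   z,   u,   v
        0.f, 0.f, 0.f, 0.f, 0.f,
        1.f, 0.f, 0.f, 1.f, 0.f,
        0.f, 1.f, 0.f, 0.f, 1.f,
    };
    m.indices = {0, 1, 2};
    return m;
}
```

Note that nothing here touches OpenGL; the mesh is just data.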
Materials - things that define how some piece of geometry is rendered. In the simplest case, this could be the color of the object, for example. Or whether lighting should be applied. Or whether the object should be alpha-blended. Or a texture (or a list of textures) to use. Or a vertex/fragment shader to use. And so on, the possibilities are endless. Start by putting the things you need into materials. In OOP land that class could be called (surprise!) a Material.
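As a rough sketch, a Material can start out as a plain struct holding just the properties listed above. The field names here are assumptions, and textures and shaders are referenced by name only to keep the example free of any graphics API:

```cpp
#include <string>
#include <vector>

struct Color {
    float r = 1.f, g = 1.f, b = 1.f, a = 1.f;
};

// Hypothetical Material: only describes *how* to render, issues no API calls.
struct Material {
    Color color;                        // base color of the object
    bool lit = true;                    // whether lighting should be applied
    bool alphaBlended = false;          // whether the object is alpha-blended
    std::vector<std::string> textures;  // textures to use, by name (assumption)
    std::string shader;                 // vertex/fragment shader name; empty = none
};

// Example: a semi-transparent, unlit material.
inline Material makeGlass() {
    Material m;
    m.color.r = 0.8f;
    m.color.g = 0.9f;
    m.color.b = 1.0f;
    m.color.a = 0.3f;
    m.lit = false;
    m.alphaBlended = true;
    return m;
}
```

Add fields only as you actually need them.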
Scene - you have pieces of geometry and a collection of materials; now it is time to define what is in the scene. In a simple case, each object in the scene could be defined by:
- What geometry it uses (pointer to Mesh),
- How it should be rendered (pointer to Material),
- Where it is located. This could be a 4x4 transformation matrix, or a 4x3 transformation matrix, or a vector (position), quaternion (orientation) and another vector (scale).
Let's call this a Node in OOP land.
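A minimal sketch of a Node, assuming the 4x4 matrix option stored as 16 floats in column-major order (the layout OpenGL expects). Mat4, identity and translation are illustrative helpers, not part of OpenGL:

```cpp
#include <array>

struct Mesh;      // geometry, defined elsewhere
struct Material;  // render settings, defined elsewhere

// 4x4 matrix, column-major: element (row, col) lives at index col * 4 + row.
using Mat4 = std::array<float, 16>;

inline Mat4 identity() {
    Mat4 m{};  // all zeros
    m[0] = m[5] = m[10] = m[15] = 1.f;
    return m;
}

inline Mat4 translation(float x, float y, float z) {
    Mat4 m = identity();
    m[12] = x;  // the fourth column holds the translation
    m[13] = y;
    m[14] = z;
    return m;
}

struct Node {
    const Mesh* mesh = nullptr;          // what geometry it uses (not owned)
    const Material* material = nullptr;  // how it should be rendered (not owned)
    Mat4 transform = identity();         // where it is located
};
```

Many nodes can share the same Mesh and Material, which is why they are held by pointer.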
Camera. Well, a camera is nothing more than "where it is placed" (again, a 4x4 or 4x3 matrix, or a position and orientation), plus some projection parameters (field of view, aspect ratio, ...).
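A sketch of a Camera along these lines, assuming a column-major 4x4 placement matrix and a perspective projection built the same way gluPerspective documents it. The struct and field names are made up for illustration:

```cpp
#include <array>
#include <cmath>

using Mat4 = std::array<float, 16>;  // column-major 4x4 matrix

inline Mat4 identityMatrix() {
    Mat4 m{};
    m[0] = m[5] = m[10] = m[15] = 1.f;
    return m;
}

struct Camera {
    Mat4 world = identityMatrix();   // where the camera is placed
    float fovYRadians = 1.0471976f;  // vertical field of view (about 60 degrees)
    float aspect = 4.f / 3.f;        // width / height
    float zNear = 0.1f;
    float zFar = 100.f;

    // Perspective projection matrix, same formula as gluPerspective.
    Mat4 projection() const {
        const float f = 1.f / std::tan(fovYRadians * 0.5f);
        Mat4 p{};
        p[0]  = f / aspect;
        p[5]  = f;
        p[10] = (zFar + zNear) / (zNear - zFar);
        p[11] = -1.f;
        p[14] = 2.f * zFar * zNear / (zNear - zFar);
        return p;
    }
};
```

The view matrix used for rendering is the inverse of the camera's world matrix; computing that inverse is left out here.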
So basically that's it! You have a scene which is a bunch of Nodes which reference Meshes and Materials, and you have a Camera that defines where a viewer is.
Now, where to put the actual OpenGL calls is purely a design question. I'd say, don't put OpenGL calls into the Node, Mesh or Material classes. Instead, make something like an OpenGLRenderer that can traverse the scene and issue all the calls. Or, even better, make something that traverses the scene independently of OpenGL, and put the lower-level calls into an OpenGL-dependent class.
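One possible shape for that split, as a sketch: an API-independent traversal talks to a small backend interface, and an OpenGL-specific class would implement that interface. Everything here (RenderBackend, renderScene, LoggingBackend, the stripped-down Mesh/Material/Node) is hypothetical, and the logging backend stands in for a real OpenGL one so the example runs without a GL context:

```cpp
#include <array>
#include <string>
#include <vector>

using Mat4 = std::array<float, 16>;

// Stripped-down stand-ins for the real classes.
struct Mesh     { std::string name; };
struct Material { std::string name; };
struct Node {
    const Mesh* mesh;
    const Material* material;
    Mat4 transform;
};

// Low-level interface; an OpenGL-dependent class would implement it.
class RenderBackend {
public:
    virtual ~RenderBackend() = default;
    virtual void setMaterial(const Material& material) = 0;
    virtual void setTransform(const Mat4& transform) = 0;
    virtual void drawMesh(const Mesh& mesh) = 0;
};

// Scene traversal that knows nothing about OpenGL.
inline void renderScene(const std::vector<Node>& scene, RenderBackend& backend) {
    for (const Node& node : scene) {
        if (!node.mesh || !node.material) continue;  // nothing to draw
        backend.setMaterial(*node.material);
        backend.setTransform(node.transform);
        backend.drawMesh(*node.mesh);
    }
}

// Stand-in backend that records calls instead of issuing OpenGL commands.
class LoggingBackend : public RenderBackend {
public:
    std::vector<std::string> log;
    void setMaterial(const Material& m) override { log.push_back("material " + m.name); }
    void setTransform(const Mat4&) override { log.push_back("transform"); }
    void drawMesh(const Mesh& m) override { log.push_back("draw " + m.name); }
};
```

Swapping in a different backend (another API, or a test double like the one above) then does not touch the scene code at all.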
So yes, all of the above is pretty much platform independent. Going this way, you'll find that glRotate, glTranslate, gluLookAt and friends are quite useless. You have all the matrices already, so just pass them to OpenGL. This is how most real code in games and applications works anyway.
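To illustrate, here is a sketch of building a model matrix yourself instead of calling glTranslate/glRotate. The helper names are made up; the matrices are column-major, so the resulting 16 floats could be handed directly to glLoadMatrixf (fixed function) or glUniformMatrix4fv with transpose set to GL_FALSE (shaders):

```cpp
#include <array>
#include <cmath>

using Mat4 = std::array<float, 16>;  // column-major: (row, col) at index col * 4 + row

inline Mat4 identity() {
    Mat4 m{};
    m[0] = m[5] = m[10] = m[15] = 1.f;
    return m;
}

inline Mat4 translation(float x, float y, float z) {
    Mat4 m = identity();
    m[12] = x;
    m[13] = y;
    m[14] = z;
    return m;
}

// Rotation around the Y axis, matching the matrix glRotate uses for axis (0, 1, 0).
inline Mat4 rotationY(float radians) {
    const float c = std::cos(radians);
    const float s = std::sin(radians);
    Mat4 m = identity();
    m[0] = c;   m[8]  = s;
    m[2] = -s;  m[10] = c;
    return m;
}

// Matrix product a * b.
inline Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int col = 0; col < 4; ++col)
        for (int row = 0; row < 4; ++row)
            for (int k = 0; k < 4; ++k)
                r[col * 4 + row] += a[k * 4 + row] * b[col * 4 + k];
    return r;
}

// Model matrix: rotate first, then translate.
inline Mat4 modelMatrix(float x, float y, float z, float yawRadians) {
    return multiply(translation(x, y, z), rotationY(yawRadians));
}
```

In a renderer, `modelMatrix(...).data()` is the pointer you would pass to OpenGL.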
Of course the above can be complicated by more complex requirements. Particularly, Materials can be quite complex. Meshes usually need to support lots of different vertex formats (e.g. packed normals for efficiency). Scene Nodes might need to be organized in a hierarchy (this one can be easy - just add parent/children pointers to the node). Skinned meshes and animations in general add complexity. And so on.
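For the hierarchy case, a minimal sketch with parent/children pointers might look like this. It assumes column-major 4x4 matrices and non-owning pointers (the nodes must outlive the hierarchy); the names are illustrative:

```cpp
#include <array>
#include <vector>

using Mat4 = std::array<float, 16>;  // column-major: (row, col) at index col * 4 + row

inline Mat4 identity() {
    Mat4 m{};
    m[0] = m[5] = m[10] = m[15] = 1.f;
    return m;
}

inline Mat4 translation(float x, float y, float z) {
    Mat4 m = identity();
    m[12] = x;
    m[13] = y;
    m[14] = z;
    return m;
}

// Matrix product a * b.
inline Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int col = 0; col < 4; ++col)
        for (int row = 0; row < 4; ++row)
            for (int k = 0; k < 4; ++k)
                r[col * 4 + row] += a[k * 4 + row] * b[col * 4 + k];
    return r;
}

struct Node {
    Node* parent = nullptr;        // non-owning
    std::vector<Node*> children;   // non-owning
    Mat4 local = identity();       // transform relative to the parent

    void addChild(Node& child) {
        child.parent = this;
        children.push_back(&child);
    }

    // World transform: the parent's world transform applied after the local one.
    Mat4 world() const {
        return parent ? multiply(parent->world(), local) : local;
    }
};
```

Recomputing world() on every call is the simplest option; caching it per frame is a common refinement once the scene grows.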
But the main idea is simple: there is Geometry, there are Materials, there are objects in the scene. Then some small piece of code is able to render them.
In the OpenGL case, setting up meshes would most likely create/activate/modify VBOs. Before any node is rendered, its matrices would need to be set. And setting up a Material would touch most of the remaining OpenGL state (blending, texturing, lighting, combiners, shaders, ...).