views: 145

answers: 5

Hi guys,

I've been working on a point cloud player that should ideally be able to visualize terrain points from a lidar capture and display them sequentially at around 30 fps. However, I seem to have hit a wall caused by PCI-e I/O.

What I need to do for every frame is load a large point cloud stored in memory, compute a color map based on height (I'm using something akin to MATLAB's jet map), then transfer the data to the GPU. This works fine on captures with fewer than one million points. At about 2 million points, however, it starts dropping below 30 frames per second. I realize this is a lot of data (2 million points per frame * [3 position floats + 3 color floats] * 4 bytes per float * 30 frames per second = around 1.34 GiB per second).

My rendering code looks something like this right now:

glPointSize(ptSize);
glEnableClientState(GL_VERTEX_ARRAY);
if(colorflag) {
    glEnableClientState(GL_COLOR_ARRAY);
} else {
    glDisableClientState(GL_COLOR_ARRAY);
    glColor3f(1, 1, 1);
}

// Upload positions (re-specified every frame)
glBindBuffer(GL_ARRAY_BUFFER, vbobj[VERT_OBJ]);
glBufferData(GL_ARRAY_BUFFER, cloudSize, vertData, GL_STREAM_DRAW);
glVertexPointer(3, GL_FLOAT, 0, 0);

// Upload colors (re-specified every frame)
glBindBuffer(GL_ARRAY_BUFFER, vbobj[COLOR_OBJ]);
glBufferData(GL_ARRAY_BUFFER, cloudSize, colorData, GL_STREAM_DRAW);
glColorPointer(3, GL_FLOAT, 0, 0);

glDrawArrays(GL_POINTS, 0, numPoints);

glDisableClientState(GL_VERTEX_ARRAY);
glDisableClientState(GL_COLOR_ARRAY);
glBindBuffer(GL_ARRAY_BUFFER, 0);

The vertData and colorData pointers change every frame.

What I would like is to play back at a minimum of 30 frames per second, even when later using larger point clouds that might reach 7 million points per frame. Is this even possible? Or would it be easier to grid the points, construct a heightmap, and somehow display that instead? I'm still pretty new to 3D programming, so any advice would be appreciated.

Thanks in advance

+4  A: 

I know nothing about OpenGL, but wouldn't data compression be a natural workaround here? Isn't there support for integer types or 16-bit floats? And what about color representations other than 3 floats per point?
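
For what it's worth, here is a minimal sketch of what that might look like with the same client-state API the question already uses. The quantized arrays (shortVerts, byteColors) and the wrapper function are my own placeholders, not code from the post:

#include <GL/gl.h>

// Hypothetical sketch: positions quantized to GLshort (6 bytes instead of 12)
// and colors stored as GLubyte (3 bytes instead of 12), cutting the per-point
// upload from 24 bytes to 9.
void drawQuantizedCloud(GLuint vertVbo, GLuint colorVbo,
                        const GLshort *shortVerts,   // 3 shorts per point
                        const GLubyte *byteColors,   // 3 bytes per point
                        GLsizei numPoints)
{
    glEnableClientState(GL_VERTEX_ARRAY);
    glEnableClientState(GL_COLOR_ARRAY);

    glBindBuffer(GL_ARRAY_BUFFER, vertVbo);
    glBufferData(GL_ARRAY_BUFFER, numPoints * 3 * sizeof(GLshort), shortVerts, GL_STREAM_DRAW);
    glVertexPointer(3, GL_SHORT, 0, 0);

    glBindBuffer(GL_ARRAY_BUFFER, colorVbo);
    glBufferData(GL_ARRAY_BUFFER, numPoints * 3 * sizeof(GLubyte), byteColors, GL_STREAM_DRAW);
    glColorPointer(3, GL_UNSIGNED_BYTE, 0, 0);

    glDrawArrays(GL_POINTS, 0, numPoints);

    glDisableClientState(GL_COLOR_ARRAY);
    glDisableClientState(GL_VERTEX_ARRAY);
    glBindBuffer(GL_ARRAY_BUFFER, 0);
}

Note that GL_SHORT positions are not normalized, so the quantization scale/offset has to be undone on the modelview matrix (e.g. with glTranslatef/glScalef), whereas GL_UNSIGNED_BYTE colors (used here rather than GL_BYTE, since color data is normally unsigned) are mapped to [0, 1] automatically.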

Shelwien
Data compression sounds good, but I don't know if I can decompress on the GPU side with OpenGL. I'll try indexing the colors, thanks.
Xzhsh
Well, even real compression might be possible with GPGPU, but I just meant that glVertexPointer apparently supports GL_SHORT and glColorPointer supports GL_BYTE. Do you really need float precision there? Although of course indexing the colors is even better.
Shelwien
+5  A: 

If you can, implement the color map with a 1D texture. You'll only need 1 texture coordinate instead of 3 color components, and it will make the vertices 128-bit aligned too.

EDIT: You just need to create a texture from your colormap and use glTexCoordPointer instead of glColorPointer (and swap the per-vertex color values for texture coordinates in the [0, 1] range, of course). Here's a linearly interpolated 6-texel colormap:

// Create texture
GLuint texture;
glGenTextures(1, &texture);
glBindTexture(GL_TEXTURE_1D, texture);
glTexParameteri(GL_TEXTURE_1D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE); // keep the ends of the map from blending into each other
glTexParameteri(GL_TEXTURE_1D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_1D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);

// Load textureData
GLubyte colorData[] = {
    0xff, 0x00, 0x00,
    0xff, 0xff, 0x00,
    0x00, 0xff, 0x00,
    0x00, 0xff, 0xff,
    0x00, 0x00, 0xff,
    0xff, 0x00, 0xff
};
glTexImage1D(GL_TEXTURE_1D, 0, GL_RGB, 6, 0, GL_RGB, GL_UNSIGNED_BYTE, colorData);
glEnable(GL_TEXTURE_1D);
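
To connect this with the rendering loop from the question, here is a rough sketch of the draw side (not from the original answer); texCoordData is assumed to hold one float per point in the [0, 1] range, e.g. (height - minHeight) / (maxHeight - minHeight):

// Draw using the 1D colormap: one texture coordinate per point
// replaces the three color floats.
glEnable(GL_TEXTURE_1D);
glBindTexture(GL_TEXTURE_1D, texture);
glColor3f(1, 1, 1);   // with the default GL_MODULATE env, white shows the texel color unchanged

glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);

glBindBuffer(GL_ARRAY_BUFFER, vbobj[VERT_OBJ]);
glBufferData(GL_ARRAY_BUFFER, numPoints * 3 * sizeof(GLfloat), vertData, GL_STREAM_DRAW);
glVertexPointer(3, GL_FLOAT, 0, 0);

glBindBuffer(GL_ARRAY_BUFFER, vbobj[COLOR_OBJ]);   // now holds 1 float per point instead of 3
glBufferData(GL_ARRAY_BUFFER, numPoints * sizeof(GLfloat), texCoordData, GL_STREAM_DRAW);
glTexCoordPointer(1, GL_FLOAT, 0, 0);

glDrawArrays(GL_POINTS, 0, numPoints);

glDisableClientState(GL_TEXTURE_COORD_ARRAY);
glDisableClientState(GL_VERTEX_ARRAY);
glBindBuffer(GL_ARRAY_BUFFER, 0);

This also brings each point down to 16 bytes (3 position floats + 1 texture coordinate), which is the 128-bit alignment mentioned above.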
Ivan Baldin
Thanks for the answer. I'm not really sure how to use 1D textures as a colormap; is there something I can read to learn? Thanks in advance (sorry, I'm a bit of an OpenGL noob :D)
Xzhsh
Thanks again for the help
Xzhsh
+1  A: 

If you're willing to deal with the latency, you can double- (or more!) buffer your VBOs, transferring geometry into one buffer while rendering from another:

while(true)
    {
    draw_vbo( cur_vbo_id );
    generate_new_geometry();
    load_vbo( nxt_vbo_id );
    swap( cur_vbo_id, nxt_vbo_id );
    }
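
In concrete GL terms, that loop might look something like the sketch below. The two buffer objects in vboIds and the fillNextFrame() helper are placeholders of mine, not from the answer, and GL_VERTEX_ARRAY is assumed to be enabled as in the question:

static GLuint vboIds[2];   // created with glGenBuffers(2, vboIds) at startup
static int cur = 0;

void playOneFrame(GLsizei numPoints)
{
    // Draw from the buffer that was filled on the previous frame
    glBindBuffer(GL_ARRAY_BUFFER, vboIds[cur]);
    glVertexPointer(3, GL_FLOAT, 0, 0);
    glDrawArrays(GL_POINTS, 0, numPoints);

    // Meanwhile, stream the next frame's points into the other buffer
    int nxt = 1 - cur;
    const GLfloat *nextVerts = fillNextFrame();   // placeholder for "generate_new_geometry"
    glBindBuffer(GL_ARRAY_BUFFER, vboIds[nxt]);
    glBufferData(GL_ARRAY_BUFFER, numPoints * 3 * sizeof(GLfloat), nextVerts, GL_STREAM_DRAW);

    cur = nxt;   // swap for the next frame
}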

EDIT: You might also try interleaving your vertex attributes instead of using one VBO per component.
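
For the interleaving suggestion, a rough sketch of what the question's two uploads might collapse into; the Vertex struct and interleavedData are my own names, not from the post, and the client states are assumed to be enabled as before:

#include <stddef.h>   // offsetof

typedef struct {
    GLfloat x, y, z;   // position
    GLfloat r, g, b;   // color
} Vertex;              // 24 bytes, the same 6 floats per point as the question

// One buffer and one glBufferData call per frame instead of two;
// stride-based pointers pick the position and color out of each Vertex.
glBindBuffer(GL_ARRAY_BUFFER, vbobj[VERT_OBJ]);
glBufferData(GL_ARRAY_BUFFER, numPoints * sizeof(Vertex), interleavedData, GL_STREAM_DRAW);
glVertexPointer(3, GL_FLOAT, sizeof(Vertex), (const GLvoid *)offsetof(Vertex, x));
glColorPointer(3, GL_FLOAT, sizeof(Vertex), (const GLvoid *)offsetof(Vertex, r));
glDrawArrays(GL_POINTS, 0, numPoints);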

genpfault
Latency isn't an issue, but I'm not sure I see how double buffering VBOs will help speed up transfer times. Is there some overhead per VBO? And thanks, I'll try interleaving tomorrow
Xzhsh
A: 

You say it's I/O bound. That implies you've profiled it and seen it spending 50% or more of its time waiting for I/O.

If so, that's what you've got to concentrate on, not the graphics.

If not, then some of the other answers sound good to me. Regardless, profile, don't guess. This is the method I use.

Mike Dunlavey
By I/O bound he means I/O on the PCI-Express bus of the GPU, not HDD or network. You're still right, though.
Calvin1602
I'm fairly sure it's I/O bound because I'm just playing the same frame over and over for testing. It speeds up to 60 fps if I just comment out the memcpy :\.
Xzhsh
@Calvin1602: Thanks for the correction, so edited.
Mike Dunlavey
@Xzhsh wtf? I don't understand anymore. Which memcpy are you talking about?
Calvin1602
Err, I'm sorry, I meant the glBufferData call. What I did to test this: I loaded the same frame data from memory in an initialization function, then continually re-uploaded that one frame every draw call. If I upload the frame data only once, it can go up to 100 fps, but if I upload it every frame (which I'll need to do once I have more data), the fps goes down to something like 10.
Xzhsh
@Xzhsh: So frame time goes from 10 ms to 100 ms. That says that in the slow case 90% of the time is going into that glBufferData call. What I would do is pause the program (with Ctrl-C) and, with 90% probability, you will catch it in the act of spending that time, and you will see exactly why. Once you know exactly why, you may very well get an idea of how to make it faster.
Mike Dunlavey
@Xzhsh: If you haven't heard of that technique, it doesn't surprise me, you being in the home of gprof. Anyway, here's more on the subject: http://stackoverflow.com/questions/1777556/alternatives-to-gprof/1779343#1779343
Mike Dunlavey
A: 

Some pointers:

  • store as much data as possible on the graphics card and load only what is really needed (pretty obvious; a small sketch of this follows the list)
  • use LOD levels in trees (kd-trees or octrees) and precompute as much as possible up front
  • compression on disk is useful too, to help overcome I/O bottlenecks
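
To make the first point concrete, here is a small sketch (the function names are placeholders, not from the answer): data uploaded once with GL_STATIC_DRAW stays resident on the card, so drawing it again later moves nothing over the bus.

// Upload a cloud once; GL_STATIC_DRAW hints that it will be drawn many
// times without being re-specified, so the driver can keep it in VRAM.
void uploadCloudOnce(GLuint vbo, const GLfloat *vertData, GLsizei numPoints)
{
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, numPoints * 3 * sizeof(GLfloat), vertData, GL_STATIC_DRAW);
}

// Draw the resident cloud: no glBufferData here, so nothing crosses the PCI-e bus.
void drawResidentCloud(GLuint vbo, GLsizei numPoints)
{
    glEnableClientState(GL_VERTEX_ARRAY);
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glVertexPointer(3, GL_FLOAT, 0, 0);
    glDrawArrays(GL_POINTS, 0, numPoints);
    glDisableClientState(GL_VERTEX_ARRAY);
    glBindBuffer(GL_ARRAY_BUFFER, 0);
}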
Florian