VBO performance problem with multiple threads



lalakis
04-30-2009, 05:05 AM
Hello everybody,

I have around 1000 VBOs of 4000 vertices each.

If I have one thread that updates all the VBOs and then draws them all, I get 42 fps on an 8800 (50 fps on a Quadro 5600).

If I have two threads, one that only updates the VBOs and another that only draws them, I get 32 fps (39 on the Quadro).

Method 1 (fast):

Thread1:
for vbo in all vbos {
    Bind vbo
    if (vbo needs update) {
        Update vbo
    }
    Draw vbo
}

Method 2 (slow):

Thread1:
for vbo in all vbos {
    if (vbo needs update) {
        Bind vbo
        Update vbo
    }
}

Thread2:
for vbo in all vbos {
    if (vbo is updated) {
        Bind vbo
        Draw vbo
    }
}

Any possible explanation for this performance drop?

Thanks in advance

Lord crc
04-30-2009, 07:42 AM
AFAIK, sharing contexts isn't a performance win. Instead, you should bind and map the VBO in thread 1, hand the pointer to thread 2, which updates it, and then signal back to thread 1, which unmaps and eventually draws it. This way thread 2 knows nothing about OpenGL and is only concerned with filling data into a buffer.

In addition you should probably keep a pool of available VBOs, so that you can still draw from the "current" ones while the secondary thread is filling the new ones. Once it's done, swap the VBOs and put the old one back into the pool.
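
A minimal sketch of the map-and-hand-off idea above, not a definitive implementation. It runs without a GL context by assumption: an ordinary std::vector stands in for the memory that glMapBuffer would return, and the names MappedRegion, fillVertices and updateThroughWorker are hypothetical. It shows only the ownership pattern: the GL thread maps and unmaps, and the worker only ever sees a raw pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Plain view of writable memory. In a real renderer `data` would be the
// pointer returned by glMapBuffer on the GL thread (assumption: here it
// points into an ordinary std::vector so the sketch runs without a context).
struct MappedRegion {
    float*      data;
    std::size_t count;
};

// Worker side: fills vertex data and never touches OpenGL.
void fillVertices(MappedRegion region, float value) {
    assert(region.data != nullptr);
    for (std::size_t i = 0; i < region.count; ++i) {
        region.data[i] = value;
    }
}

// GL-thread side: "map", hand the pointer to a worker, wait for the signal,
// then "unmap" and draw.
void updateThroughWorker(std::vector<float>& fakeMappedStorage, float value) {
    assert(!fakeMappedStorage.empty());
    // Stand-in for glBindBuffer + glMapBuffer.
    MappedRegion region{fakeMappedStorage.data(), fakeMappedStorage.size()};

    // Hand the pointer to the worker thread.
    std::future<void> filled =
        std::async(std::launch::async, fillVertices, region, value);

    // The GL thread is free to draw other, already-filled VBOs here.

    filled.get();  // the worker's "signal back"
    // Stand-in for glUnmapBuffer followed by the draw call.
}
```

In a real program the wait on the future would normally be deferred until the VBO is actually needed for drawing, so the GL thread does not stall right after the hand-off.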

LangFox
04-30-2009, 08:25 PM
There is an article in either Game Programming Gems 7 or ShaderX 7 that says you can use three buffers for multithreaded rendering.

Data thread: find a buffer that isn't being rendered and isn't the most recently filled one, and feed it data.

Render thread: find the most recently filled buffer and render it.

Then the two threads will not affect each other.
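
The slot selection described above could be sketched roughly as follows. This is an illustration under assumptions, not the code from the article: it covers one data thread and one render thread, tracks only slot indices (the three actual VBOs would be stored alongside), and the class and method names are hypothetical.

```cpp
#include <cassert>
#include <mutex>

// Bookkeeping for three buffers shared by one data thread and one render
// thread. Only slot indices are tracked here.
class TripleBufferSlots {
public:
    // Data thread: pick a slot that is neither being rendered nor the most
    // recently filled one. With three slots, one is always free.
    int beginWrite() {
        std::lock_guard<std::mutex> lock(mutex_);
        for (int slot = 0; slot < 3; ++slot) {
            if (slot != rendering_ && slot != lastFilled_) {
                return slot;
            }
        }
        return -1;  // not reachable: at most two slots are excluded
    }

    // Data thread: publish the slot that was just filled.
    void endWrite(int slot) {
        assert(slot >= 0 && slot < 3);
        std::lock_guard<std::mutex> lock(mutex_);
        lastFilled_ = slot;
    }

    // Render thread: take the most recently filled slot, or -1 if nothing
    // has been filled yet.
    int beginRender() {
        std::lock_guard<std::mutex> lock(mutex_);
        rendering_ = lastFilled_;
        return rendering_;
    }

private:
    std::mutex mutex_;
    int rendering_  = -1;  // slot currently owned by the render thread
    int lastFilled_ = -1;  // slot most recently completed by the data thread
};
```

The lock is held only while indices are compared and swapped, so neither thread waits on the other's fill or draw work.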

lalakis
05-04-2009, 02:55 AM
I have tried this, but I don't gain much. My update time is small compared to the Map/Unmap time.

I will run some tests again with this method. My main goal is to reduce the time the draw thread waits for data to be put into the VBOs (without reducing draw performance).

I also tried the new GL_NV_vertex_buffer_unified_memory, which improves performance, but again the two-thread update has lower performance...

Lord crc
05-04-2009, 06:01 AM
If you fill the entire buffer, you should call BufferData with a null pointer first, to indicate that you're going to overwrite the entire thing. This way the driver doesn't have to synchronize things when you call Map.
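
The call order for this is sketched below, as an illustration only. So that it runs without a GL context, the three GL calls sit behind a hypothetical FakeBufferBackend; in a real program orphan would forward to glBufferData(GL_ARRAY_BUFFER, bytes, NULL, usage), mapForWrite to glMapBuffer(GL_ARRAY_BUFFER, GL_WRITE_ONLY) and unmap to glUnmapBuffer(GL_ARRAY_BUFFER), with the VBO already bound. The usage hint (for example GL_STREAM_DRAW) is an assumption, not something stated in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical in-memory stand-in for the GL calls, so the sketch runs
// without a context. See the text above for the real calls each one mimics.
struct FakeBufferBackend {
    std::vector<unsigned char> storage;
    int orphanCount = 0;

    // Mimics glBufferData with a null data pointer: fresh storage, old
    // contents discarded.
    void orphan(std::size_t bytes) {
        storage.assign(bytes, 0);
        ++orphanCount;
    }
    // Mimics glMapBuffer for writing.
    void* mapForWrite() { return storage.empty() ? nullptr : storage.data(); }
    // Mimics glUnmapBuffer, which reports success as a boolean.
    bool unmap() { return true; }
};

// Rewrites the whole buffer: orphan first, then map, copy and unmap.
template <class Backend>
bool refillWholeBuffer(Backend& backend, const void* src, std::size_t bytes) {
    assert(src != nullptr && bytes > 0);
    backend.orphan(bytes);              // tell the driver the old data is dead
    void* dst = backend.mapForWrite();  // should no longer need to wait for pending draws
    if (dst == nullptr) {
        return false;
    }
    std::memcpy(dst, src, bytes);
    return backend.unmap();
}
```

This only pays off when the entire buffer is rewritten, since the old contents are thrown away by the first call.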