PBO for async glGetTexImage, random slowdowns

I’m trying to find a way to procedurally generate a texture and read it back to the cpu asynchronously (not slowing down rendering).

I’m using an FBO to render to the texture; so far so good. Then I create a PBO, issue the glGetTexImage call into it, do some cpu work for a while, and later map the PBO buffer and read it back.

It all works… except that every once in a while (a couple of times per second) the call to glGetTexImage seems to block the cpu, when the PBO spec says it should return immediately.

Here’s the relevant part of the code:


if (m_dataPBO == 0)
  glGenBuffers(1, &m_dataPBO);

glBindBuffer(GL_PIXEL_PACK_BUFFER_ARB, m_dataPBO);
bindFBOtexture(GL_TEXTURE_2D, m_obj);

/// this call never takes any time
glBufferData(GL_PIXEL_PACK_BUFFER_ARB, m_width * m_height * m_buffer->getBytesPerPixel(), NULL, GL_STREAM_READ);

/// this call usually takes little time (0.2-0.3 ms) but occasionally takes a lot of time (30-40 ms) despite using a PBO
glGetTexImage(GL_TEXTURE_2D, 0, baseFormat, baseType, GL_BUFFER_OFFSET(0));

glBindBuffer(GL_PIXEL_PACK_BUFFER_ARB, 0);

doCPUWork(..)

/// mapping and reading back to the cpu
glBindBuffer(GL_PIXEL_PACK_BUFFER_ARB, m_dataPBO);
TUByte *src = (TUByte *)glMapBuffer(GL_PIXEL_PACK_BUFFER_ARB, GL_READ_ONLY);
if (src != NULL)  /// glMapBuffer returns NULL if the mapping failed
{
  MMemcpy(m_buffer->lock(), src, m_buffer->getBytesPerPixel() * m_width * m_height);
  glUnmapBuffer(GL_PIXEL_PACK_BUFFER_ARB);
  m_buffer->unlock();
}
glBindBuffer(GL_PIXEL_PACK_BUFFER_ARB, 0);

I see this behavior both on an ATI Radeon X1950 XT and a NV GF 8800 GTX.

Any ideas on what could explain this behavior, and how to fix it? Thanks…

Y.
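For anyone who wants to reproduce the timings quoted in the code comments of the first post, here is a minimal sketch of a wall-clock helper. The names `cpuMillis` and `StallCounter` are hypothetical, not from the original code, and the 5 ms threshold in the usage note is an arbitrary assumption. It only measures how long the CPU is held inside a call, which is the quantity the question is about; it says nothing about when the GPU actually finishes.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Wall-clock time, in milliseconds, that the calling thread spends inside f().
template <class F>
double cpuMillis(F&& f) {
  const auto t0 = std::chrono::steady_clock::now();
  f();
  const auto t1 = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

// Hypothetical helper: counts samples that exceed a threshold and tracks the
// worst one, so occasional stalls stand out from the usual fast calls.
struct StallCounter {
  double thresholdMs;
  std::size_t samples = 0;
  std::size_t stalls = 0;
  double worstMs = 0.0;

  explicit StallCounter(double threshold) : thresholdMs(threshold) {
    assert(threshold > 0.0);
  }

  void add(double ms) {
    ++samples;
    if (ms > thresholdMs) ++stalls;
    if (ms > worstMs) worstMs = ms;
  }
};
```

Usage would look something like `counter.add(cpuMillis([&] { glGetTexImage(GL_TEXTURE_2D, 0, baseFormat, baseType, GL_BUFFER_OFFSET(0)); }));` with a `StallCounter counter(5.0);` kept across frames.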

Use FBO and glReadPixels. It works fine.

Without a PBO? Then I’ll get a constant 20-40 ms slowdown each time this happens. The framerate won’t be smooth anymore.

No… use an FBO and glReadPixels with a PBO. Before glReadPixels, call glReadBuffer(GL_COLOR_ATTACHMENT0).
So…

  1. bind fbo
  2. render
  3. bind PBO
  4. glReadBuffer(GL_COLOR_ATTACHMENT0)
  5. glReadPixels
  6. unbind PBO
  7. unbind FBO

do some CPU work

  1. bind PBO
  2. map buffer
  3. copy data
  4. unmap buffer
  5. unbind PBO.
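
The two step lists above can be sketched roughly as follows. This is an illustration, not yooyo's actual code. So that it compiles without a GL context, the `gl*` functions here are logging stand-ins declared with the same names and argument order as the real entry points; in a real program you would delete that stub section and include your GL headers instead. The names `startReadback` and `finishReadback`, the RGBA/unsigned-byte format, and the assumption that the PBO was already sized with glBufferData are mine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// ---- Logging stand-ins (ASSUMPTION: replace with your real GL headers) ----
typedef unsigned int GLenum;
typedef unsigned int GLuint;
typedef int GLint;
typedef int GLsizei;
const GLenum GL_FRAMEBUFFER = 0x8D40, GL_COLOR_ATTACHMENT0 = 0x8CE0;
const GLenum GL_PIXEL_PACK_BUFFER = 0x88EB, GL_READ_ONLY = 0x88B8;
const GLenum GL_RGBA = 0x1908, GL_UNSIGNED_BYTE = 0x1401;

static std::vector<std::string> g_calls;         // order in which calls were issued
static std::vector<unsigned char> g_pboStorage;  // fake PBO memory

void glBindFramebuffer(GLenum, GLuint fbo) { g_calls.push_back(fbo ? "bind FBO" : "unbind FBO"); }
void glBindBuffer(GLenum, GLuint buf) { g_calls.push_back(buf ? "bind PBO" : "unbind PBO"); }
void glReadBuffer(GLenum) { g_calls.push_back("glReadBuffer"); }
void glReadPixels(GLint, GLint, GLsizei w, GLsizei h, GLenum, GLenum, void*) {
  g_pboStorage.assign(std::size_t(w) * h * 4, 0x7F);  // pretend the GPU wrote pixels
  g_calls.push_back("glReadPixels");
}
void* glMapBuffer(GLenum, GLenum) { g_calls.push_back("map"); return g_pboStorage.data(); }
unsigned char glUnmapBuffer(GLenum) { g_calls.push_back("unmap"); return 1; }
// ---- End of stand-ins ------------------------------------------------------

// First list (steps 1-7): queue the readback. With a pack PBO bound, the last
// argument of glReadPixels is a byte offset into the PBO, not a client pointer.
// Assumes the PBO storage was allocated earlier with glBufferData.
void startReadback(GLuint fbo, GLuint pbo, GLsizei w, GLsizei h) {
  glBindFramebuffer(GL_FRAMEBUFFER, fbo);                        // 1. bind FBO
  /* 2. render into the FBO here */
  glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);                       // 3. bind PBO
  glReadBuffer(GL_COLOR_ATTACHMENT0);                            // 4. pick the source
  glReadPixels(0, 0, w, h, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);  // 5. offset 0 in PBO
  glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);                         // 6. unbind PBO
  glBindFramebuffer(GL_FRAMEBUFFER, 0);                          // 7. unbind FBO
}

// Second list (steps 1-5): after the CPU work, map the PBO and copy the pixels.
bool finishReadback(GLuint pbo, unsigned char* dst, std::size_t bytes) {
  assert(dst != nullptr);
  glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);                       // 1. bind PBO
  void* src = glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY);   // 2. map buffer
  const bool ok = (src != nullptr);
  if (ok) {
    std::memcpy(dst, src, bytes);                                // 3. copy data
    glUnmapBuffer(GL_PIXEL_PACK_BUFFER);                         // 4. unmap buffer
  }
  glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);                         // 5. unbind PBO
  return ok;
}
```

The sketch uses the core names; on drivers of that era the same calls are exposed with EXT/ARB suffixes (glBindFramebufferEXT, GL_COLOR_ATTACHMENT0_EXT, GL_PIXEL_PACK_BUFFER_ARB as in the first post). The CPU work goes between the two functions, so the transfer has time to complete before the map.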

Works like a charm on NVidia… I don’t know how it behaves on ATI.
btw… Be aware of older NV driver versions… they have some magic limit of 46MB, after which the next map buffer takes 20ms. After that everything gets back to normal, until the next 46MB has been transferred.
Newer drivers don’t suffer from this “feature”.
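
To get a feel for how often such a limit would be hit, here is a back-of-the-envelope sketch. The helper names are hypothetical, and reading “46MB” as 46 * 1024 * 1024 bytes is an assumption; the post does not say which unit was meant, nor exactly how the driver counted transferred bytes.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTION: "46MB" interpreted as 46 MiB.
const std::uint64_t kStallLimitBytes = 46ull * 1024 * 1024;

// Bytes moved by one readback of a width x height image.
inline std::uint64_t readbackBytes(std::uint64_t width, std::uint64_t height,
                                   std::uint64_t bytesPerPixel) {
  return width * height * bytesPerPixel;
}

// Number of whole readbacks that fit before the running total reaches the limit.
inline std::uint64_t readbacksPerStall(std::uint64_t bytesPerReadback,
                                       std::uint64_t limitBytes) {
  assert(bytesPerReadback > 0);
  return limitBytes / bytesPerReadback;
}
```

For example, a 512x512 texture at 4 bytes per pixel is 1 MiB per readback, so under these assumptions the slow map would show up about once every 46 readbacks.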

Interesting. Thanks for the hints, my NV driver is quite recent, like one month old, so I don’t think it is the problem. But I will try the glReadPixels idea.

How did you discover this 46MB magic limit? Did NVidia release any explanation for it?

Y.

Well, someone on this forum noticed it, then I made a test app and confirmed the issue. Later, someone from NVidia checked that thread and posted the comment “it will be fixed in next driver release”… and it’s fixed!

I don’t remember the names, but thanks to NVidia for reading this forum and fixing bugs.

I think this may be related to my problem. I tried running the latest beta driver (177.13) and the release driver (173.14.12) for 64-bit Linux, but I still have the problem.

http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=246014#Post246014

Here’s the thread yooyo was talking about:

http://www.opengl.org/discussion_boards/…8139#Post238139

Can someone from Nvidia tell me if the fix that was mentioned is in the 64 bit Linux drivers yet?

Thanks.

The October 7, 2008 Nvidia 64 bit Linux driver fixed my issue…