Seeking maximum performance with PBOs

Greetings OpenGL Jedi,

I’m wondering which one is fastest for updating a pixel buffer: glMapBuffer or glBufferData?

Init:


UInt *l_pixelBufferId = in_texture->GetPixelBufferId();

/// Create both PBOs once. A NULL data pointer only reserves memory space.
for (Int i = 0; i < 2; ++i)
{
	if (l_pixelBufferId[i] == 0)
	{
		glGenBuffers(1, &l_pixelBufferId[i]);
		glBindBuffer(GL_PIXEL_UNPACK_BUFFER, l_pixelBufferId[i]);
		glBufferData(GL_PIXEL_UNPACK_BUFFER, l_image->GetSize(), NULL, GL_STREAM_DRAW);
		glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
	}
}

Update 1:


static Int l_index = 0;

/// Swap roles on every call: l_index feeds the texture, l_nextIndex receives the new frame.
l_index = (l_index + 1) % 2;
Int l_nextIndex = (l_index + 1) % 2;

UInt *l_pixelBufferId = in_texture->GetPixelBufferId();

/// With a PBO bound, the last argument of glTexSubImage2D is a byte offset into that buffer.
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, l_pixelBufferId[l_index]);

glTexSubImage2D(l_dimension, in_level, in_x, in_y, l_image->GetWidth(), l_image->GetHeight(), l_image->GetImageFormat(), l_image->GetDataType(), 0);

/// Fill the other PBO with the newest frame; it is used for the texture on the next call.
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, l_pixelBufferId[l_nextIndex]);
glBufferData(GL_PIXEL_UNPACK_BUFFER, l_image->GetSize(), l_image->GetPixels(), GL_STREAM_DRAW);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);

Update 2:


glBufferData(GL_PIXEL_UNPACK_BUFFER, l_image->GetSize(), NULL, GL_STREAM_DRAW);

UChar *l_buffer = (UChar*) glMapBuffer(GL_PIXEL_UNPACK_BUFFER, GL_WRITE_ONLY);

if (l_buffer)
{
	l_buffer = l_image->GetPixels(); /// Not working and that is why I started this thread.
	glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER); /// Release pointer to mapping buffer.
}

Would we gain anything by using glMapBuffer in this case?

Update 2:

“l_buffer” is a pointer. You set it to the value returned by glMapBuffer. Then you overwrite the pointer value with the one from “l_image->GetPixels()”. This only changes where the pointer points; it does not change the contents of the mapped buffer.

Actually, l_image->GetPixels() returns movie data that is being updated elsewhere by another thread (not necessarily between glMapBuffer and glUnmapBuffer). That’s one reason why I’m wondering if glMapBuffer is of any use in this particular case.

Your “Update 2” case lacks a memcpy from the memory pointed to by l_image->GetPixels() to the mapped buffer (i.e. memory pointed to by l_buffer).
You can’t tell the driver that the new buffer content is now at some location determined by your application by assigning to l_buffer, which I think is what you are trying to do.
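
To make the difference concrete, here is a minimal sketch in plain C++ with no OpenGL calls at all. FakeMappedBuffer is a hypothetical stand-in for the memory that glMapBuffer returns, and uploadByAssignment / uploadByCopy are illustrative names that do not appear in the original code. The first function mirrors what “Update 2” does; the second adds the missing memcpy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for driver-owned PBO memory. A real PBO is not a
// std::vector; this only models "memory the application does not choose".
struct FakeMappedBuffer {
    std::vector<unsigned char> storage;
    explicit FakeMappedBuffer(std::size_t size) : storage(size, 0) {}
    unsigned char* map() { return storage.data(); }  // plays the role of glMapBuffer
};

// Mirrors "Update 2": only the local pointer variable is changed.
inline void uploadByAssignment(FakeMappedBuffer& pbo, unsigned char* pixels) {
    unsigned char* mapped = pbo.map();
    mapped = pixels;  // rebinds the local pointer; pbo.storage is untouched
    (void)mapped;     // silence the unused-value warning
}

// The fix: copy the bytes into the memory the mapped pointer refers to.
inline void uploadByCopy(FakeMappedBuffer& pbo, const unsigned char* pixels,
                         std::size_t size) {
    assert(size <= pbo.storage.size());
    unsigned char* mapped = pbo.map();
    if (mapped) {
        std::memcpy(mapped, pixels, size);
    }
}
```

In the real code the equivalent change would be a memcpy of l_image->GetSize() bytes from l_image->GetPixels() into l_buffer, placed between glMapBuffer and glUnmapBuffer.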

If you want to avoid the copy, the video decoder should write directly into a mapped buffer. You would probably need multiple buffers to avoid slowdowns, though.

Yes, I thought the whole point of using PBOs was precisely not to copy the data. Being forced to write into the buffer at a precise time makes things much more complicated, because updating the data doesn’t mean it is ready to be rendered. It would have to update and then render in sequence… that is not very efficient, or I’m missing something. Update 1 seems faster than a plain glTexSubImage2D, but to play video at 1080p I need all the extra horsepower I can get… I think what I’m asking is: what would be the ultimate setup to render video, and would I get any benefit from using glMapBuffer?

Mapping a buffer is useful if the code you use to produce the image can write that image data directly into the mapped pointer. If not, you may as well just use glBufferSubData.

I think what I’m asking is: what would be the ultimate setup to render video, and would I get any benefit from using glMapBuffer?

I’m not knowledgeable enough to suggest the best setup, but if you use multiple buffers, you should be able to have some mapped to be filled by the decoder thread while others are unmapped for display. This adds a little bit of latency, but perhaps it is small enough to not be a problem for you.
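
As a rough sketch of the bookkeeping such a multi-buffer scheme might need, the class below tracks which buffer slots are free, being filled by the decoder, ready to display, or currently displayed. Everything here (PboRing, SlotState, the method names) is hypothetical and not taken from the thread, and it contains no OpenGL calls: the assumption is that the render thread, which owns the GL context, maps and unmaps the PBOs and only hands the mapped pointer to the decoder thread.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical slot states for a pool of N pixel buffers.
enum class SlotState { Free, Filling, Ready, Displaying };

template <std::size_t N>
class PboRing {
public:
    // Pick a free slot for the decoder to fill. Returns -1 if none is free.
    int beginFill() {
        std::lock_guard<std::mutex> lock(mutex_);
        for (std::size_t i = 0; i < N; ++i) {
            if (state_[i] == SlotState::Free) {
                state_[i] = SlotState::Filling;
                return static_cast<int>(i);
            }
        }
        return -1;
    }

    // The decoder finished writing this slot; queue it for display.
    void endFill(int slot) {
        std::lock_guard<std::mutex> lock(mutex_);
        assert(state_[static_cast<std::size_t>(slot)] == SlotState::Filling);
        state_[static_cast<std::size_t>(slot)] = SlotState::Ready;
        ready_.push_back(slot);
    }

    // Take the oldest ready slot for display. Returns -1 if none is ready.
    int acquireReady() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (ready_.empty()) {
            return -1;
        }
        int slot = ready_.front();
        ready_.pop_front();
        state_[static_cast<std::size_t>(slot)] = SlotState::Displaying;
        return slot;
    }

    // The texture upload from this slot has been issued; recycle it.
    void release(int slot) {
        std::lock_guard<std::mutex> lock(mutex_);
        assert(state_[static_cast<std::size_t>(slot)] == SlotState::Displaying);
        state_[static_cast<std::size_t>(slot)] = SlotState::Free;
    }

private:
    std::mutex mutex_;
    std::array<SlotState, N> state_{};  // value-initialized, so every slot starts Free
    std::deque<int> ready_;             // ready slots in the order they were filled
};
```

With something like PboRing<3>, the decoder could be writing one frame while another waits in the ready queue and a third is being uploaded, which is where the small extra latency mentioned above would come from.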