PBO and glTexSubImage2D -> slow

Hi, I’m trying to copy a brush onto a texture using glTexSubImage2D and a PBO. It’s really slow. The driver is 6629 under Linux on an NV40. Here is the code:

init:
glGenBuffers(1, &_texture_buffer);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER_EXT, _texture_buffer);
glBufferData(GL_PIXEL_UNPACK_BUFFER_EXT, BRUSH_SIZE, _brush, GL_STATIC_DRAW);

rendering:
glBindBuffer(GL_PIXEL_UNPACK_BUFFER_EXT, _texture_buffer);
glTexSubImage2D(GL_TEXTURE_2D, 0, offset_x, offset_y, 8, 8, GL_RGBA, GL_UNSIGNED_INT_8_8_8_8_REV, 0);

If the texture is small then it’s fast, but if the texture is big it’s really slow. I have the feeling that there is a copy down the bus.

tnx, marco

You need to use GL_BGRA for fast PBO texture uploads on NVIDIA GPUs.
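
A minimal sketch of what that change could look like for the code above, assuming the brush is stored as packed 32-bit pixels in the layout the original GL_RGBA / GL_UNSIGNED_INT_8_8_8_8_REV call reads (0xAABBGGRR). The helper name rgba_to_bgra_rev is hypothetical and not part of OpenGL; it only swaps red and blue so the same data can be uploaded as GL_BGRA without changing colors.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: swap the red and blue channels of packed pixels.
 * Input:  0xAABBGGRR (layout read by GL_RGBA + GL_UNSIGNED_INT_8_8_8_8_REV)
 * Output: 0xAARRGGBB (layout read by GL_BGRA + GL_UNSIGNED_INT_8_8_8_8_REV) */
void rgba_to_bgra_rev(uint32_t *pixels, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        uint32_t p = pixels[i];
        pixels[i] = (p & 0xFF00FF00u)          /* keep alpha and green    */
                  | ((p & 0x00FF0000u) >> 16)  /* blue moves to bits 0..7 */
                  | ((p & 0x000000FFu) << 16); /* red moves to bits 16..23 */
    }
}

/* Assumed usage: run once on _brush before glBufferData(), then upload with
 *   glTexSubImage2D(GL_TEXTURE_2D, 0, offset_x, offset_y, 8, 8,
 *                   GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, 0);
 */
```

Since the brush is uploaded to the buffer only once at init, the swap costs nothing per frame. If the brush is already authored in BGRA order, only the format argument of glTexSubImage2D needs to change.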

I don’t know much about PBO and how it relates to the NVIDIA Pixel Data Range (PDR) extension, but I do have experience with PDR on Linux. So maybe the following will help with PBO, or maybe you could use PDR instead.
What I found was this:

AGP has to be set up and working properly (of course)
The host memory where the glTexSubImage gets the data should be allocated with glXAllocateMemoryNV()
The allocated memory should be “enabled” with the glPixelDataRangeNV() function
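And you have to enable it with glEnableClientState(GL_WRITE_PIXEL_DATA_RANGE_NV) (the write range is the one that covers texture uploads; GL_READ_PIXEL_DATA_RANGE_NV is for glReadPixels)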

And use the GL_BGRA_EXT format.

After doing all that, I went from about 130 MB/sec to about 700-800 MB/sec.

If you want source code that does all this, just ask.

HTH
-Steve

tnx for the help. GL_BGRA was solution number one. I now use the RenderTexture class from Mark Harris and it’s really fast. But it’s good to know that GL_BGRA is faster. But why? I have worked on SGI machines and AFAIK they don’t have this problem. The PBO code is working really well. I hope PBOs will become an ARB extension or go into the core.