PBO in Catalyst 7.1!



gybe
01-10-2007, 02:45 PM
I was so surprised today when I saw the extension GL_ARB_pixel_buffer_object in the new Catalyst driver. I quickly wrote a simple test application that does an asynchronous glReadPixels. I added some error checks to make sure no error is reported, and I measure the time the read takes.

Here is my code:

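// Drain all pending GL work so that only the read itself is timed.
glFinish();
glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB, m_unImageBuffer);
QueryPerformanceCounter((LARGE_INTEGER *)&lnStart);
// With a pack buffer bound, the last argument is a byte offset into that buffer, not a client pointer.
glReadPixels(0, 0, m_unWindowWidth, m_unWindowHeight, GL_RGBA, GL_UNSIGNED_BYTE, BUFFER_OFFSET(0));
QueryPerformanceCounter((LARGE_INTEGER *)&lnEnd);
glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB, 0);

// Elapsed seconds, assuming m_nFrequency holds the QueryPerformanceFrequency value.
// The tick difference is taken in integers before converting to double.
m_fReadPixelTime = (float)((double)(lnEnd - lnStart) / (double)m_nFrequency);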


For a 512x512 back buffer it takes 8 ms to do the read. I get similar performance without the PBO. Am I doing anything wrong? I'm not sure the read is actually asynchronous. I know they only just released it, but the ARB extension was approved in December 2004, so it's not as if they didn't have time to get it right. Should we expect ATI to fix it, or is it a hardware limitation? I'm using an X1900 XT.

gybe
01-10-2007, 05:48 PM
I just ran the same test application on a GeForce 6600, and the glReadPixels call takes 1.5 ms.

Nicolas Lelong
01-10-2007, 10:03 PM
Finally, PBO on ATI :)

Did you try glReadPixels with GL_BGRA instead of GL_RGBA? From what I remember, it would be more likely to be optimized...

just my 2 cents ...

gybe
01-11-2007, 04:42 AM
Originally posted by Nicolas Lelong:
Finally, PBO on ATI :)

Did you try glReadPixels with GL_BGRA instead of GL_RGBA? From what I remember, it would be more likely to be optimized...

just my 2 cents ...

Yeah, I also tried BGRA, since that is what they read in the extension spec, but I get the same performance.

Ozo
01-12-2007, 06:48 PM
That's sad, because Asynchronous ReadPixel is a very interesting OpenGL function. I was recently looking into it, hoping that it would be available soon. From what I can see, I'll have to keep waiting ...

Is there any other way of doing asynchronous ReadPixel with an ATI driver?

ATI: any hope that you will fix this in a future release?

Ozo.

Zengar
01-12-2007, 11:48 PM
You can still use it, can't you? PBO is not hard to implement even if the hardware doesn't support it (which is clearly not the case for ATI). But even if they fail to deliver a well-working implementation, that is still not a reason to avoid the extension. ATI just has to keep up. On the other hand, first implementations tend to be slow.

Ozo
01-13-2007, 05:05 AM
I can still use it, but the ReadPixel won't be done asynchronously. I need the ReadPixel to be processed in the background (some DMA transfer with almost no CPU impact) while the draw thread continues to process and send commands to the GPU.

Maybe Gybe can quickly do more tests on ATI: is 8 ms a fixed cost, or does the ReadPixel time scale with the size of the surface you read (e.g. 1024x1024)?

Ozo

gybe
01-13-2007, 08:17 AM
Originally posted by Ozo:
Maybe Gybe can quickly do more tests on ATI: is 8 ms a fixed cost, or does the ReadPixel time scale with the size of the surface you read (e.g. 1024x1024)?

It's not a fixed cost; it scales with the size of the surface to transfer. At 1600x1200 it takes more than 15 ms.


Originally posted by Zengar:
You can still use it, can't you? PBO is not hard to implement even if the hardware doesn't support it (which is clearly not the case for ATI). But even if they fail to deliver a well-working implementation, that is still not a reason to avoid the extension. ATI just has to keep up. On the other hand, first implementations tend to be slow.

Transfers are not asynchronous. We can just do a normal readpixel and get the same performance. Actually, in my case the PBO readpixel is 1 ms slower than a normal readpixel (maybe because of the video memory to video memory transfer).

I just hope it's possible to do asynchronous ReadPixel with an ATI GPU. I hope somebody from ATI can confirm that.
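
For reference, the timings reported so far can be turned into effective readback throughput. The following is only a back-of-the-envelope sketch, not part of anyone's test application: the function name readback_mb_per_s is hypothetical, and it assumes 4 bytes per pixel (GL_RGBA with GL_UNSIGNED_BYTE, as in the code above) and 1 MB = 10^6 bytes.

```c
/* Effective readback throughput in MB/s (1 MB = 1e6 bytes) for a
 * width x height surface at 4 bytes per pixel, read in `ms` milliseconds.
 * Hypothetical helper for illustration; the inputs are the timings
 * reported in this thread. */
static double readback_mb_per_s(unsigned width, unsigned height, double ms)
{
    double bytes = (double)width * (double)height * 4.0; /* RGBA8 */
    double seconds = ms / 1000.0;
    return bytes / seconds / 1e6;
}
```

With the numbers above, 512x512 in 8 ms (X1900 XT) works out to roughly 131 MB/s, and 512x512 in 1.5 ms (GeForce 6600) to roughly 699 MB/s. Since 1600x1200 takes more than 15 ms, that case is at most 512 MB/s. If the call were really asynchronous, the measured time should not track the surface size like this, because the call would return once the transfer is queued.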

Overmind
01-13-2007, 08:30 AM
I find it hard to believe that there is any hardware on which asynchronous transfer is impossible.

It should be possible to implement it purely in the driver without any special hardware features. Perhaps it just takes some time to get it right; these asynchronous things can get pretty complicated ;)