PBO + FBO performance



dukey
03-12-2009, 08:07 AM
I have an FBO that I am rendering to offscreen.

I want to read the contents of this FBO into main memory so I can use it.

I've tried ..

glGetTexImage <- SLOW

then

glReadBuffer(GL_COLOR_ATTACHMENT0_EXT);
glReadPixels()

which is slow.

Then i've tried creating a PBO

something like this


glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB,10000);

glPushAttrib (GL_PIXEL_MODE_BIT);
glReadBuffer (GL_COLOR_ATTACHMENT0_EXT);
glReadPixels (0,0,textureWidth,textureHeight,GL_BGR_EXT,GL_UNSIGNED_BYTE,BUFFER_OFFSET(0));
pfn_glPopAttrib ();

//aviCapture->captureFrame(buffer);

glUnmapBufferARB(GL_PIXEL_PACK_BUFFER_ARB);
glBindBufferARB (GL_PIXEL_PACK_BUFFER_ARB, 0);

Anyway, that works, but it's exactly the same speed as just using glReadPixels with glReadBuffer(GL_COLOR_ATTACHMENT0_EXT); in some cases it might actually be slower.

My program normally runs at about 75fps. With glReadPixels and a PBO bound I get about 18fps. Using glReadPixels the conventional way I also get 18fps.

What am I doing wrong? I appear to be getting no fps improvement at all.

I am using Vista + a quadro 3400/400 card which is something like an nvidia 6800 card.

yooyo
03-12-2009, 08:51 AM
A normal glReadPixels call is blocking, but when you use glReadPixels with a PBO bound it is a nonblocking call. What you have to do is create two PBOs: once per frame, copy the data from the first PBO to sysmem (or the codec) and use the second PBO to start glReadPixels. Then swap the PBOs. In the next frame do the same.

You will get the framebuffer data one frame late, but it will not stall your CPU.
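
The two-buffer scheme described above comes down to a small piece of bookkeeping: which PBO receives this frame's glReadPixels and which one holds the previous frame. A minimal sketch in C, assuming the buffer names 10000 and 10001 used elsewhere in this thread. The PboPair type and the pbo_* helpers are hypothetical names, not part of OpenGL, and the GL calls appear only as comments so the logic can be followed without a context:

```c
#include <assert.h>

/* Hypothetical helper type (not part of OpenGL): tracks two pack PBOs. */
typedef struct {
    unsigned int ids[2];   /* buffer object names of the two PBOs */
    int write_index;       /* which PBO receives this frame's glReadPixels */
    int frames_issued;     /* how many readbacks have been started so far */
} PboPair;

void pbo_init(PboPair *p, unsigned int id0, unsigned int id1)
{
    assert(p != 0);
    p->ids[0] = id0;
    p->ids[1] = id1;
    p->write_index = 0;
    p->frames_issued = 0;
}

/* PBO to bind before calling glReadPixels this frame. */
unsigned int pbo_write_target(const PboPair *p)
{
    return p->ids[p->write_index];
}

/* PBO that holds the previous frame's pixels and can be mapped now. */
unsigned int pbo_read_source(const PboPair *p)
{
    return p->ids[1 - p->write_index];
}

/* False on the very first frame, when nothing has been read back yet. */
int pbo_has_previous(const PboPair *p)
{
    return p->frames_issued > 0;
}

/* Call once at the end of each frame to swap the two roles. */
void pbo_advance(PboPair *p)
{
    p->write_index = 1 - p->write_index;
    p->frames_issued++;
}

/* Per-frame order (GL calls shown as comments only):
 *   glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB, pbo_write_target(&p));
 *   glReadPixels(..., BUFFER_OFFSET(0));      // starts the copy into the PBO
 *   if (pbo_has_previous(&p)) {
 *       glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB, pbo_read_source(&p));
 *       ptr = glMapBufferARB(GL_PIXEL_PACK_BUFFER_ARB, GL_READ_ONLY);
 *       ...consume the previous frame's pixels...
 *       glUnmapBufferARB(GL_PIXEL_PACK_BUFFER_ARB);
 *   }
 *   glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB, 0);
 *   pbo_advance(&p);
 */
```

Mapping the buffer that was filled a frame earlier, rather than the one just handed to glReadPixels, is what gives the driver time to finish the transfer. Whether that actually avoids a stall still depends on the driver and on the pixel format requested.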

dukey
03-12-2009, 04:47 PM
hmm
i tried this


if(primaryBuffer) {
    pfn_glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB,10000);
    buffer = (UCHAR*) pfn_glMapBufferARB(GL_PIXEL_PACK_BUFFER_ARB, GL_READ_ONLY);

    aviCapture->captureFrame(buffer);

    pfn_glUnmapBufferARB(GL_PIXEL_PACK_BUFFER_ARB);

    pfn_glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB,10001);
    pfn_glReadPixels(0,0,textureWidth,textureHeight,GL_BGR_EXT,GL_UNSIGNED_BYTE,BUFFER_OFFSET(0));
    pfn_glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB,0);

    primaryBuffer = false;
}
else {
    pfn_glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB,10001);
    buffer = (UCHAR*) pfn_glMapBufferARB(GL_PIXEL_PACK_BUFFER_ARB, GL_READ_ONLY);

    aviCapture->captureFrame(buffer);

    pfn_glUnmapBufferARB(GL_PIXEL_PACK_BUFFER_ARB);

    pfn_glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB,10000);
    pfn_glReadPixels(0,0,textureWidth,textureHeight,GL_BGR_EXT,GL_UNSIGNED_BYTE,BUFFER_OFFSET(0));
    pfn_glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB,0);

    primaryBuffer = true;
}

But I still only get 18fps. No speed increase at all over just using glReadPixels :( Without the calls above I get 75fps.

ZbuffeR
03-12-2009, 05:48 PM
75, 18 fps ? You are vsynced, not good for benchmarking...
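
One way to see why vsynced numbers mislead: assuming double buffering with vsync on, a finished frame can only be shown on a refresh boundary, so the measured rate snaps to the refresh rate divided by a whole number. A rough sketch of that model, assuming a 75 Hz display (a guess based on the 75 fps figure, not something stated in the thread); vsynced_fps is a hypothetical helper:

```c
/* Frame rate observed under vsync with double buffering (assumed model):
 * every frame occupies a whole number of refresh intervals. */
double vsynced_fps(double refresh_hz, double frame_ms)
{
    double period_ms = 1000.0 / refresh_hz;       /* one refresh interval */
    int intervals = (int)(frame_ms / period_ms);  /* intervals fully used */

    if (intervals * period_ms < frame_ms)
        intervals++;                              /* wait for the next refresh */
    if (intervals < 1)
        intervals = 1;                            /* never faster than the display */

    return refresh_hz / intervals;
}
```

Under this model any frame that takes between three and four refresh intervals (40 ms to about 53 ms at 75 Hz) is reported as 75/4 = 18.75 fps, so two readback paths with quite different costs can show the same number. Benchmarking with vsync disabled gives comparable figures.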

yooyo
03-12-2009, 06:29 PM
Try to bench w/o the aviCapture->captureFrame(buffer) calls.
Do you have any GL errors? Call glGetError before and after the readback.

And finally.. you say it's an NV4x based GPU. What about the chipset? Is it Intel or SIS, VIA? Is it AGP or PCI-X?

dukey
03-12-2009, 07:18 PM
its PCI-E
intel chipset

and i was benching without the avicapture calls

Lord crc
03-12-2009, 08:26 PM
No expert in this, but some wild thoughts:

- I see you're using GL_BGR_EXT, however isn't it likely that the framebuffer has an alpha channel (even though you didn't ask for one)? Also, is that the native format for the framebuffer (and not GL_RGBA)? Afaik, in either case the driver would have to convert the texture before presenting it to you. I'd try different pixel formats and see if speed improves.

- Are you bandwidth limited, ie if you decrease the size of the ReadPixel call, does the speed increase?
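
Both questions can be put into numbers by working out how many bytes one glReadPixels call has to move. A small sketch in C, assuming GL_UNSIGNED_BYTE components and default pack state apart from GL_PACK_ALIGNMENT (no GL_PACK_ROW_LENGTH or skip values); packed_row_bytes and readback_bytes are hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>

/* Bytes per row written by glReadPixels for byte-sized components:
 * each row is padded up to a multiple of GL_PACK_ALIGNMENT (default 4). */
size_t packed_row_bytes(size_t width, size_t bytes_per_pixel,
                        size_t pack_alignment)
{
    size_t raw;

    /* GL only accepts these four alignment values. */
    assert(pack_alignment == 1 || pack_alignment == 2 ||
           pack_alignment == 4 || pack_alignment == 8);

    raw = width * bytes_per_pixel;
    return (raw + pack_alignment - 1) / pack_alignment * pack_alignment;
}

/* Total bytes transferred by one full-frame readback. */
size_t readback_bytes(size_t width, size_t height, size_t bytes_per_pixel,
                      size_t pack_alignment)
{
    return packed_row_bytes(width, bytes_per_pixel, pack_alignment) * height;
}
```

For a hypothetical 1024x768 target, a 3-byte format such as GL_BGR needs 2,359,296 bytes per frame and a 4-byte format such as GL_BGRA needs 3,145,728. Multiplying by the frame rate gives the transfer rate to compare against what the card sustains. The 4-byte format moves more data, so it only helps if it lets the driver skip a conversion step.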

dukey
03-13-2009, 04:55 AM
For my FBO I only requested an RGB format, but I could try requesting RGBA. And yes, if the resolution is lower the frame rate increases.

yooyo
03-13-2009, 05:32 AM
Very strange.. did you try that on another machine? Do you have an example that reproduces the problem, so I can test it here? Do you use dualview mode?

dukey
03-13-2009, 05:54 AM
I don't have any other machines to test it on currently.
I do use dualview, I'll try disabling that. Worth a shot.

I'm thinking perhaps my gfx card is just too old.

Brolingstanz
03-13-2009, 06:14 AM
sm4 hardware can be had for around 50 clams...

dukey
03-15-2009, 07:09 PM
one of the nvidia SDK demos is a PBO texture performance demo
http://developer.download.nvidia.com/SDK/9.5/Samples/DEMOS/OpenGL/TexturePerformancePBO.zip

Anyway ... on my Quadro card, when downloading textures there's very little difference in speed between straight glReadPixels and glReadPixels with a PBO (<5%). Is that normal? There is a difference when uploading textures to the gfx card with multi PBO, but the performance difference is maybe 10-15%. Not huge.

I was expecting somewhat more.

yooyo
03-16-2009, 09:51 AM
This is something related to your hw setup.. chipset, driver or gfx card. Try to borrow a proper gfx card, or try your code on another machine.

dukey
03-16-2009, 11:02 AM
what sort of performance difference do you get on your h/w with the above demo ?

Don't Disturb
03-16-2009, 11:37 AM
On my 8800GT, download rates are:
glTexSubImage: 1162
PBO: 1760
Multi PBO: 1625

Don't Disturb
03-16-2009, 11:42 AM
For readback, it's actually slightly slower using PBO:
glReadPixels: 1190
PBO Readback: 1045

dukey
03-16-2009, 12:04 PM
with the default program settings (ie PBO source)

my readback rates are

94 with readpixels
94 with readpixels + PBO

which is somewhat surprising. What's more surprising is your card has 10x the readback speed of mine :D