Pixel Buffer Objects are very slow no?



ManOfSpace
05-23-2004, 07:22 PM
Is it just me, or are they very slow for 2d rendering of any sort?

I have an image 'pipeline' set up that lets me switch between regular glDrawPixels (i.e. from a system memory bank) and glDrawPixels from a bound pixel buffer object.

Firstly, glDrawPixels is abnormally slow for a G5, but going by what I've read, I never expected it to be fast. But pixel buffer objects not only fail to speed things up, they slow things down by at least 50%.

I really don't want to go the texture route, as this means wasting video memory to make images conform to texture sizes on older cards.

Are pbuffers helpful for speeding up 2D? It's the only area I've left untouched; after pixel buffer objects I figured it was pointless... is it?
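
Editor's note: the concern about padding images up to texture sizes can be put into numbers. The sketch below assumes the older cards in question require power-of-two texture dimensions; the function names and the 100x60 example size are hypothetical and not from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest power of two that is >= n. Valid for 1 <= n <= 2^31. */
unsigned next_pow2(unsigned n)
{
    assert(n >= 1 && n <= 0x80000000u);
    unsigned p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* Bytes of padding when a w x h image is stored in a texture whose
   sides are rounded up to the next power of two. */
size_t pot_padding_bytes(unsigned w, unsigned h, unsigned bytes_per_pixel)
{
    size_t used = (size_t)w * h * bytes_per_pixel;
    size_t allocated = (size_t)next_pow2(w) * next_pow2(h) * bytes_per_pixel;
    return allocated - used;
}
```

Under that assumption, a 128x128 image wastes nothing because both sides are already powers of two, while a 100x60 image with 4 bytes per pixel is padded to 128x64 and wastes 8768 bytes.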

yooyo
05-24-2004, 04:42 AM
You have to use accelerated formats (BGR or BGRA instead of RGB or RGBA).

See this page: http://developer.nvidia.com/object/General_FAQ.html#p1

ManOfSpace
05-24-2004, 05:35 AM
Hm, thanks for the tip, but it doesn't make any difference. I can't render 25 128x128 images per frame using pixel buffers without crippling it to around 2fps. I get around 28fps using regular system memory.

Definitely not my card, as it can throw around thousands of images in languages like Blitz, which use DirectX 7 for their 2D. Is it just a limitation of GL that means it'll never match DirectX for 2D speed, even using pixel buffers?

yooyo
05-24-2004, 07:37 AM
Can you post some code...

yooyo

ManOfSpace
05-24-2004, 10:06 AM
I use this to create the buffer, uploading an image already loaded into system memory.
glGenBuffersObj(1,tempBank)
out\pBufId=PeekInt(tempBank,0)

glBindBufferObj(GL_PIXEL_PACK_BUFFER_ARB,tempBank)
glBufferDataObj(GL_PIXEL_PACK_BUFFER_ARB,BankSize(out\rgba),out\rgba,GL_STATIC_DRAW_ARB)
glBindBufferObj(GL_PIXEL_PACK_BUFFER_ARB,0)

and then at render time I use,
glBindBufferObj(GL_PIXEL_UNPACK_BUFFER_ARB,img\pBufId)
glDrawPixels2 img\w,img\h,GL_BGRA_EXT ,GL_UNSIGNED_BYTE,0
glBindBufferObj(GL_PIXEL_UNPACK_BUFFER_ARB,0)

to render the image. This runs at around 2fps. Even system-memory images run at only 28fps using BGRA (which is still pretty awful).

Korval
05-24-2004, 10:58 AM
PBO is not meant to be a 2D rendering system; use textured quads for that. PBO is meant to allow for async pixel transfer operations.

yooyo
05-24-2004, 11:12 AM
Korval is right. Instead of glDrawPixels, use glTexSubImage2D and upload the frame to a texture.
I suppose you have a non-power-of-two video file, so you can use NV_texture_rectangle or EXT_texture_rectangle to avoid wasting memory.

Remember that when you use PBO or PDR, all image data transfers are async. This means that after the glTexSubImage2D call, the driver initiates a DMA transfer and returns immediately.

When you try to copy a new frame into the PBO, the previous frame may not have been uploaded yet, and overwriting the PBO data can damage your texture.

yooyo
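
Editor's note: one common way to keep the CPU away from a buffer that may still be in transfer is to alternate between two pixel buffers. The sketch below shows only the bookkeeping, with no OpenGL calls; the PboRing type and its functions are hypothetical names, and the choice of two buffers is an assumption. The comments indicate where the real calls would go.

```c
#include <assert.h>

#define PBO_COUNT 2 /* assumed: one buffer being filled, one being uploaded */

/* Hypothetical bookkeeping for alternating between pixel buffers. */
typedef struct {
    unsigned ids[PBO_COUNT]; /* buffer names, e.g. from glGenBuffersARB */
    unsigned frame;          /* frames submitted so far */
} PboRing;

/* Buffer to bind as the unpack buffer before glTexSubImage2D this frame.
   It holds the pixels the CPU wrote during the previous frame. */
unsigned pbo_ring_upload_id(const PboRing *r)
{
    return r->ids[r->frame % PBO_COUNT];
}

/* Buffer the CPU may fill with the next frame's pixels this frame.
   It is never the one returned by pbo_ring_upload_id for the same frame. */
unsigned pbo_ring_fill_id(const PboRing *r)
{
    return r->ids[(r->frame + 1) % PBO_COUNT];
}

/* Call once per frame after both the upload and the fill were issued. */
void pbo_ring_advance(PboRing *r)
{
    assert(PBO_COUNT >= 2);
    r->frame++;
}
```

With two buffers, each one gets a full frame between being handed to the driver and being written again. That does not guarantee the transfer has finished, so this lowers the risk of touching in-flight data rather than removing it.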

Claytonious
05-26-2004, 07:45 AM
You're still going to find that glTexSubImage2D() on a textured quad is FAR slower than what you're accustomed to doing with DirectX. DirectX allows you to blit 2D pixels directly to video memory, which is very nice. For some unknown reason, OpenGL has always had slow implementations of glDrawPixels, which I can't understand. So you have to settle for the hack of using a textured quad, which means you have to waste lots of time calling glTexSubImage2D just to render 2D images on your display. Using glTexSubImage2D with textured quads will be a lot faster than glDrawPixels, but it will be nothing like the speed you get in DirectX when blitting directly to video memory.

This is the one area where OpenGL is clearly inferior to DirectX and it appears that no one cares to fix it (by making glDrawPixels fast, for example.)

yooyo
05-26-2004, 08:11 AM
I really don't care about DX capabilities. Using OpenGL and PDR I can upload up to ~1.8GB/sec on AGP8x systems and NV hardware.

yooyo
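
Editor's note: for scale, the figures quoted in this thread can be compared with a little arithmetic. The sketch below assumes 4 bytes per pixel (BGRA with unsigned bytes) and reads 1.8GB/sec as 1.8e9 bytes per second; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes moved per frame when drawing `count` images of w x h pixels. */
size_t bytes_per_frame(size_t count, size_t w, size_t h, size_t bytes_per_pixel)
{
    return count * w * h * bytes_per_pixel;
}

/* Upper bound on frames per second if transfer rate were the only limit. */
double max_fps_at_rate(double bytes_per_second, size_t frame_bytes)
{
    assert(frame_bytes > 0);
    return bytes_per_second / (double)frame_bytes;
}
```

Under those assumptions, 25 images of 128x128 come to 1,638,400 bytes per frame, so 2fps is about 3.3MB/sec and 28fps is about 46MB/sec. A 1.8e9 bytes/sec transfer rate would allow roughly 1098 such frames per second, which suggests the slowdown reported earlier is not a raw bus bandwidth limit.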

ToolChest
05-26-2004, 10:05 AM
Originally posted by Claytonious:
You're still going to find that glTexSubImage2D() on a textured quad is FAR slower than what you're accustomed to doing with DirectX...

Ok, I'll bite...

Based on information I've read here, it's my understanding that framebuffer reads/writes are slow due to synchronous transport over the AGP bus causing pipeline stalls. Is this correct?

If so, how would the graphics API have any substantial effect on the problem, or is this BS?

l_belev
05-29-2004, 06:39 AM
Originally posted by Claytonious:
... For some unknown reason, OpenGL has always had slow implementations of glDrawPixels, which I can't understand. ...

glDrawPixels is not slow at all if you carefully set up the renderer state in the right manner beforehand. The problem is that glDrawPixels is not a plain blit to the framebuffer. It generates fragments which pass through the entire fragment pipeline (texturing, the depth/alpha/etc. tests and so on) in exactly the same way as the fragments generated by rasterizing points, lines and polygons. This is according to the OpenGL spec, but most hardware out there isn't capable of such an operation, so most commonly the drivers do it in software. But if you turn off all the per-fragment operations and texturing (including any other exotic stuff such as fragment programs), then the operation becomes a simple blit, which the driver can accelerate, and you get really decent speeds.
It was a while ago that I tested this on a GF2. I worked out the conditions for the operation to be accelerated on the GF2, but I lost the list. As I said, it included no texturing and no tests. When the conditions are met, the speed is no less than with DirectX or any other way of doing the same blit.
Lucho