White Paper about Vertex Buffer Objects at nvidia website



Zak McKrakem
11-04-2003, 11:57 AM
Just for your info. I have seen that there is a new paper about VBO @ developer.nvidia.com

Elixer
11-04-2003, 01:49 PM
What, no direct link? Argh! ;)

SirKnight
11-04-2003, 02:19 PM
Originally posted by Elixer:
What, no direct link? Argh! ;)



LET'S GET HIM! ;)

-SirKnight

SirKnight
11-04-2003, 02:24 PM
Actually I just looked over this paper and it seems to be pretty good. I'd like to see papers like this come out much sooner though. Like about the time VBO was first introduced. The presentations at that time confused the crap outa me and I never quite "got it" at first. Having this paper at that time would have got me going on VBO a lot sooner and quicker.


-SirKnight

Ostsol
11-04-2003, 02:52 PM
*shrugs* For the basics I found that the spec was extremely helpful -- much, much more so than the specs for -any- other extension. Those examples at the bottom showed me everything I needed to know to immediately replace my implementation of ATI_vertex_array_object.

Elixer
11-04-2003, 08:13 PM
Whew... took me this long to find the article. ;)

Interesting note about PBO, and the fact that this document is still labeled 'Nvidia confidential'. Where has this document been hiding? :)

davepermen
11-04-2003, 10:02 PM
yeah, i love to see the pbo information.. it essentially explains them, too..

can't wait :D

knackered
11-04-2003, 10:53 PM
Why dave? what are you planning to do with the pbo extension that gets you so excited?

davepermen
11-05-2003, 12:34 AM
playing with it.. what else? :D

and i'm especially waiting for the vendor independent port of an app of a friend. and he needs PBO to do this. and i can't wait to see his work running on my hw..

Christian Schüler
11-05-2003, 01:43 AM
Originally posted by Zak McKrakem:
Just for your info. I have seen that there is a new paper about VBO @ developer.nvidia.com

This is some cool info. I didn't expect glVertexPointer() to be the sinner, performance wise.

BTW, VBO rocks. I've been using it since the first time it was exposed in a driver.

Hampel
11-05-2003, 11:58 PM
About the mentioned GL_PIXEL_PACK_BUFFER and GL_PIXEL_UNPACK_BUFFER as new targets for PBO: are both really necessary? I think, a target GL_PIXEL_BUFFER with a usage pattern of *_DRAW or *_READ should do the same thing?

Hampel

Cyranose
11-06-2003, 12:35 AM
Originally posted by cschueler:
This is some cool info. I didn't expect glVertexPointer() to be the sinner, performance wise.


It's not too surprising, but I'm glad they're being direct about it. OTOH, it goes against the claim that VBO mitigates the memory management problem. Expensive binding still pushes us to batch many objects into common buffers instead of having many independent VBOs. So yeah, we can allocate a dozen VBOs instead of one big [partitioned] VAR chunk, but not thousands of VBOs like some have suggested.

I'll have to test this, but it also seems to me after reading this that if Map/Unmap potentially demotes a buffer back to system memory, I may be better off keeping my parallel system RAM and AGP/Video copies of vertex data (i.e., for efficient arbitrary reads and faster draws). I'd hoped that specifying COPY (READ and DRAW) would do this dual-buffer trick internally, but I don't get the sense that's the case. Or am I reading too much into this?

Avi
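A minimal sketch of the "batch many objects into common buffers" idea from the post above. It only does the offset bookkeeping for one shared buffer object; `SharedBuffer`, `shared_buffer_alloc` and `SUBALLOC_FAILED` are hypothetical names, not part of OpenGL or of the whitepaper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: hands out byte ranges inside one shared buffer object. */
typedef struct {
    size_t capacity;  /* total size of the shared buffer, in bytes */
    size_t used;      /* bytes handed out so far */
} SharedBuffer;

#define SUBALLOC_FAILED ((size_t)-1)

/* Reserve `size` bytes aligned to `align` (a power of two).
   Returns the byte offset inside the shared buffer, or SUBALLOC_FAILED
   if the request does not fit. */
static size_t shared_buffer_alloc(SharedBuffer *sb, size_t size, size_t align)
{
    assert(sb != NULL);
    assert(align != 0 && (align & (align - 1)) == 0);

    size_t start = (sb->used + align - 1) & ~(align - 1);
    if (start > sb->capacity || size > sb->capacity - start)
        return SUBALLOC_FAILED;

    sb->used = start + size;
    return start;
}
```

With the shared buffer bound, the returned offset would be what goes into the pointer argument of glVertexPointer (cast to a pointer), so several meshes can share one bind.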

knackered
11-06-2003, 02:40 AM
I'll have to test this, but it also seems to me after reading this that if Map/Unmap potentially demotes a buffer back to system memory
Yes, I'm a little worried about this too for the same reason - I keep an application copy of the vertex data in system memory too - for arbitrary reading and enabling the app to make changes to a single copy of the vertex data while the various renderers copy from that common store.
I think I'd have preferred it if, when the map call is made, we could specify a system memory pointer for the buffer to 'unmap' (copy) from once the unmap call is made - instead of being given a portion of memory to copy into.
Wouldn't that potentially eliminate a redundant copy?

Christian Schüler
11-06-2003, 03:03 AM
> but it also seems to me after reading
> this that if Map/Unmap potentially
> demotes a buffer back to system memory,
> I may be better off keeping my parallel
> system RAM and AGP/Video copies

For me, Map/Unmap together with STREAM_DRAW works as advertised. I can hit the transform limit with 16 bytes / vertex over AGP 4x, interleaving CPU calculation and drawing calls.

for( ... )
{
    glBufferData( ..., 0, STREAM_DRAW );  // null data pointer: ask for fresh storage
    pointer = glMapBuffer( ... );
    CPU_Calculation( pointer );           // write the new vertices through the mapping
    glUnmapBuffer();

    glDrawElements( ... );
}

It's the most simple scheme to use. Better even than the DirectX DISCARD/NOOVERWRITE.
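A rough sketch of one iteration of the loop above, written so it runs without a GL context. The `fake_*` functions and `FakeBuffer` are hypothetical stand-ins that only record the call order; in a real renderer they would be glBufferData (with a null data pointer), glMapBuffer, glUnmapBuffer and glDrawElements. The one detail added here is the return value check: glUnmapBuffer can return GL_FALSE when the data store contents were lost, in which case the draw is skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a buffer object; not a GL type. */
typedef struct {
    char  trace[16];    /* order of calls, one letter each */
    float storage[64];  /* pretend data store */
    int   unmap_ok;     /* value the fake unmap returns */
} FakeBuffer;

static void note(FakeBuffer *b, char c)
{
    size_t n = strlen(b->trace);
    if (n + 1 < sizeof b->trace) { b->trace[n] = c; b->trace[n + 1] = '\0'; }
}

static void  fake_orphan(FakeBuffer *b) { note(b, 'O'); }                     /* glBufferData */
static void *fake_map(FakeBuffer *b)    { note(b, 'M'); return b->storage; }  /* glMapBuffer */
static int   fake_unmap(FakeBuffer *b)  { note(b, 'U'); return b->unmap_ok; } /* glUnmapBuffer */
static void  fake_draw(FakeBuffer *b)   { note(b, 'D'); }                     /* glDrawElements */

/* One loop iteration: orphan, map, fill, unmap, draw.
   Returns 1 if the draw was issued, 0 if it was skipped. */
static int stream_once(FakeBuffer *b, size_t count, float value)
{
    assert(b != NULL && count <= 64);

    fake_orphan(b);
    float *dst = fake_map(b);
    if (dst == NULL)
        return 0;                /* mapping failed */

    for (size_t i = 0; i < count; ++i)
        dst[i] = value;          /* stands in for CPU_Calculation(pointer) */

    if (!fake_unmap(b))
        return 0;                /* store contents were lost: skip this draw */

    fake_draw(b);
    return 1;
}
```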

cass
11-06-2003, 06:35 AM
Originally posted by Hampel:
About the mentioned GL_PIXEL_PACK_BUFFER and GL_PIXEL_UNPACK_BUFFER as new targets for PBO: are both really necessary? I think, a target GL_PIXEL_BUFFER with a usage pattern of *_DRAW or *_READ should do the same thing?

Hampel

These are just separate binding points. They have nothing to do with the usage hint.

It's true that you don't "need" separate binding points, but they make the design cleaner, IMO.

NitroGL
11-06-2003, 08:36 AM
Originally posted by knackered:
Why dave? what are you planning to do with the pbo extension that gets you so excited?

Perhaps that it's a DECENT replacement for the crappy render texture extension we have now? Gah, I *HATE* WGL_ARB_render_texture.

Adrian
11-06-2003, 09:15 AM
From the whitepaper "Use glDrawRangeElements instead of glDrawElements"

Hmm, I use glMultiDrawElementsEXT so I could really do with a glMultiDrawRangeElementsEXT :)
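For reference, glDrawRangeElements needs the smallest and largest index used by the draw call. A small sketch of computing them, assuming 16-bit indices; `IndexRange` and `index_range` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest and largest vertex index referenced by an index list. */
typedef struct {
    unsigned short start;
    unsigned short end;
} IndexRange;

/* Scan the index list once and return its [start, end] range. */
static IndexRange index_range(const unsigned short *indices, size_t count)
{
    assert(indices != NULL && count > 0);

    IndexRange r = { indices[0], indices[0] };
    for (size_t i = 1; i < count; ++i) {
        if (indices[i] < r.start) r.start = indices[i];
        if (indices[i] > r.end)   r.end   = indices[i];
    }
    return r;
}
```

The result would feed the `start` and `end` arguments of glDrawRangeElements(mode, start, end, count, type, indices). For static index lists it is worth computing the range once at load time instead of per draw.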

Korval
11-06-2003, 10:56 AM
I think I'd have prefered it if, when the map call is made, we could specify a system memory pointer for the buffer to 'unmap' (copy) from once the unmap call is made - instead of being given a portion of memory to copy into.

Isn't that what glBufferSubData does? It takes a pointer and copies that into the VBO.

My concern stems from my current use of VBO's. I do a lot of streaming from the hard disk. The behavior that I would like to see is for the VBO's to remain in video memory at all times as I frequently (maybe every 5-20 seconds) upload new information to them. When I do an upload, it will be to VBO's that are not in actual rendering use.

I don't think STREAM_DRAW is appropriate for this circumstance. And DYNAMIC_DRAW, on nVidia hardware, seems to want to use AGP memory rather than video. The only thing that's left seems to be STATIC_DRAW, but will this cause problems (as the driver expects the data to be uploaded only once)?

Korval
11-06-2003, 10:58 AM
Perhaps that it's a DECENT replacement for the crappy render texture extension we have now? Gah, I *HATE* WGL_ARB_render_texture.

How would PBO be any faster than copying to a texture? And, therefore, how would this be a decent alternative to ARB_render_texture, which (at least on ATi cards) is faster?

Cyranose
11-06-2003, 11:15 AM
Originally posted by cschueler:
For me, Map/Unmap together with STREAM_DRAW works as advertised. I can hit the transform limit with 16 bytes / vertex over AGP 4x, interleaving CPU calculation and drawing calls.

for( ... )
{
    glBufferData( ..., 0, STREAM_DRAW );
    pointer = glMapBuffer( ... );
    CPU_Calculation( pointer );
    glUnmapBuffer();

    glDrawElements( ... );
}


Interesting. I'm guessing your access flags to glMapBuffer are WRITE_ONLY, which wouldn't need to force a demotion. You might see what happens if you change that to READ_WRITE, even without actually reading the memory.

The question for me is whether the whole buffer gets permanently demoted, or whether there now exist two parallel copies, one readable and one in faster memory. Of course, that implies that Unmap does a copy back to the fast buffer when you're done editing.

I guess the other thing to test is glBufferSubData: if there are N small changes to a buffer, does it pay to do N glBufferSubData calls or just one big call for the whole buffer, and where is the cutoff N? Anyone tried that?

Avi
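One way to set up the N-small-uploads versus one-big-upload experiment is to plan the uploads from a list of dirty ranges. This is a sketch only: `Range`, `plan_uploads`, `max_gap` and `max_calls` are hypothetical, and the right cutoff values are exactly what would have to be measured per driver.

```c
#include <assert.h>
#include <stddef.h>

/* A dirty byte range inside the application's copy of the vertex data. */
typedef struct {
    size_t offset;
    size_t size;
} Range;

/* Merge sorted, non-overlapping dirty ranges in place when the gap between
   neighbours is at most `max_gap` bytes. If more than `max_calls` ranges
   remain afterwards, collapse the plan into a single whole-buffer range.
   Returns the number of ranges left at the front of `r`. */
static size_t plan_uploads(Range *r, size_t n, size_t buffer_size,
                           size_t max_gap, size_t max_calls)
{
    assert(r != NULL || n == 0);
    if (n == 0)
        return 0;

    size_t out = 0;
    for (size_t i = 1; i < n; ++i) {
        size_t end = r[out].offset + r[out].size;
        assert(r[i].offset >= end);  /* input must be sorted and disjoint */
        if (r[i].offset - end <= max_gap)
            r[out].size = r[i].offset + r[i].size - r[out].offset;
        else
            r[++out] = r[i];
    }

    size_t count = out + 1;
    if (count > max_calls) {
        r[0].offset = 0;
        r[0].size = buffer_size;
        return 1;
    }
    return count;
}
```

Each remaining range would become one glBufferSubData(target, offset, size, data) call, with `data` pointing at the same offset in the system-memory copy. Timing the draw loop while varying `max_calls` gives an estimate of the cutoff.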

Tom Nuydens
11-06-2003, 12:10 PM
Originally posted by Korval:
The behavior that I would like to see is for the VBO's to remain in video memory at all times as I frequently (maybe every 5-20 seconds) upload new information to them. When I do an upload, it will be to VBO's that are not in actual rendering use. (SNIP) The only thing that's left seems to be STATIC_DRAW, but will this cause problems (as the driver expects the data to be uploaded only once)?

STATIC_DRAW is just a hint. The code still has to work even if you update the buffer seven times per frame, although the updates might be slower than when you create your VBO as STREAM or DYNAMIC.

If however you're only going to update once every 5-20 seconds and while you're not using the VBO, that's potentially hundreds of frames in between updates. I think that qualifies as "static enough" :)

-- Tom
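The "static enough" reasoning can be written down as a small heuristic. This is a sketch under assumptions: the thresholds (2 and 100 draws per update) are made up for illustration, and `pick_draw_hint` is a hypothetical helper. Only the token values are real; they match GL_STREAM_DRAW, GL_STATIC_DRAW and GL_DYNAMIC_DRAW. Since the usage is only a hint, a wrong pick should cost speed, not correctness.

```c
#include <assert.h>

/* Token values of GL_STREAM_DRAW, GL_STATIC_DRAW and GL_DYNAMIC_DRAW,
   repeated here so the sketch compiles without GL headers. */
enum {
    HINT_STREAM_DRAW  = 0x88E0,
    HINT_STATIC_DRAW  = 0x88E4,
    HINT_DYNAMIC_DRAW = 0x88E8
};

/* Pick a usage hint from how many draws happen between two updates.
   The thresholds are illustrative assumptions, not driver facts. */
static int pick_draw_hint(double draws_per_update)
{
    assert(draws_per_update >= 0.0);

    if (draws_per_update <= 2.0)
        return HINT_STREAM_DRAW;   /* rewritten almost every time it is drawn */
    if (draws_per_update >= 100.0)
        return HINT_STATIC_DRAW;   /* hundreds of frames between updates */
    return HINT_DYNAMIC_DRAW;      /* updated repeatedly, drawn many times */
}
```

An update every 5 seconds at 60 frames per second is about 300 draws per update, which lands in the static case here.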

NitroGL
11-06-2003, 01:40 PM
Originally posted by Korval:
How would PBO be any faster than copying to a texture? And, therefore, how would this be a decent alternative to ARB_render_texture, which (at least on ATi cards) is faster?

It's not about speed, it's about flexibility. The current render texture extension is a freakin' mess, and the PBO extension would "fix" any platform issues (at least I would hope so).

Humus
11-06-2003, 02:52 PM
You don't happen to be thinking of super_buffers rather than PBO?

Zengar
11-06-2003, 10:38 PM
Originally posted by Korval:
How would PBO be any faster than copying to a texture? And, therefore, how would this be a decent alternative to ARB_render_texture, which (at least on ATi cards) is faster?

Hm... context change penalty?

OldMan
11-10-2003, 01:50 AM
Originally posted by knackered:

I'll have to test this, but it also seems to me after reading this that if Map/Unmap potentially demotes a buffer back to system memory
Yes, I'm a little worried about this too for the same reason - I keep an application copy of the vertex data in system memory too - for arbitrary reading and enabling the app to make changes to a single copy of the vertex data while the various renderers copy from that common store.
I think I'd have preferred it if, when the map call is made, we could specify a system memory pointer for the buffer to 'unmap' (copy) from once the unmap call is made - instead of being given a portion of memory to copy into.
Wouldn't that potentially eliminate a redundant copy?

It is a little more complicated than that. As an operating system developer I am inclined to say that the mapping is probably not a copy. If the mapping made a copy to a pointer you give, it would not be called mapping.

I believe it is a two-level addressing system, like logical->physical memory mapping.
Or like the VIA virtual interfaces for high-speed network interfaces (Myrinet and so on). In fact, if anyone wants to take a look, I think VIA buffers are pretty much the same thing.

I may be wrong (my impressions are those of an OS developer), but that would justify the name (mapping) and would provide much better performance.