Thread: VBOs strangely slow?

  1. #41
    Senior Member OpenGL Guru
    Join Date
    May 2009
    Posts
    4,948

    Re: VBOs strangely slow?

    Imagine you have more data than will fit on a GPU and you can't display a "Loading" screen as the character moves--_very_ quickly.
    So you are streaming. Then it isn't a static buffer, is it?

    In order to do a streaming world, you have to have some memory set aside for doing streaming into. And since you're streaming to the GPU, this would include buffer objects.

    These buffer objects, just like the streaming space in main memory, are not currently in use. They're not currently being rendered from. So there's no need to orphan them. Just upload data to them, and when you need them, display them. If you need more time, then extend the boundaries of the streaming blocks.

    Even across a PCIe bus, you can expect roughly 1GB/sec transfer speeds. So for a GPU with about 1GB of memory, you can replace its entire contents in approximately 1 second.

    So just make sure that you pad your streaming time by, say, 0.5 seconds. If you are streaming X segments from disk, and it takes on average 1.5 seconds to get that data from disk, make sure that your application has a 2 second window between the time it kicks off the streaming and the time it starts using the data.
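
    The padding arithmetic above can be sketched as a small helper. This is only an illustration: the function name and parameters are hypothetical, and adding an explicit upload-time term (bytes divided by an assumed bus rate) is an assumption on top of the post, which pads only the disk time.

```c
#include <assert.h>

/* Hypothetical helper: seconds to leave between requesting a streaming
 * block and first rendering from it.
 *   avg_disk_s  - measured average time to read the block from disk
 *   bytes       - size of the data uploaded into the buffer object
 *   bus_bytes_s - assumed sustained upload rate (ballpark: 1e9 bytes/sec)
 *   pad_s       - safety margin (the post suggests about 0.5 seconds)
 * The upload term is an assumption added here for illustration. */
double stream_lead_time_s(double avg_disk_s, double bytes,
                          double bus_bytes_s, double pad_s)
{
    assert(bus_bytes_s > 0.0);
    double upload_s = bytes / bus_bytes_s;
    return avg_disk_s + upload_s + pad_s;
}
```

    With 1.5 seconds of disk time, a 0.5 second pad, and a negligible upload, this gives the 2 second window from the example above. Larger uploads push the window out accordingly.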

    I think I can figure out when nothing is ready.
    How? If you think stalls are being created from a buffer object not being finished uploading, how do you know that a smaller buffer is finished uploading?

  2. #42
    Intern Contributor
    Join Date
    May 2008
    Posts
    94

    Re: VBOs strangely slow?

    Thanks Alfonse. That's a good way of looking at it. I'll have to figure out how to page ahead for my application.

  3. #43
    Senior Member OpenGL Guru Dark Photon
    Join Date
    Oct 2004
    Location
    Druidia
    Posts
    3,124

    Re: VBOs strangely slow?

    Quote Originally Posted by Rob Barris
    So you don't have to juggle "multiple VBO's", you just need to keep blasting away at the one VBO while letting the driver swap in new chunks of storage as needed.

    Client never needs to fence, or check GPU progress, or block on map.

    Write&draw, write&draw, repeat til VBO full, orphan and rewind cursor, repeat. CPU gets to drop off all of its data and draw requests and potentially go on to do other tasks without a care as to how many orphaned buffers (storage blocks) wind up in flight or how fast the GPU is retiring them.
    Client arrays for VBOs. Slick. Thanks for the detailed write-up!
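
    For reference, the "write&draw ... orphan and rewind cursor" loop quoted above comes down to a little cursor bookkeeping. The sketch below shows only that bookkeeping. `StreamCursor` and `stream_reserve` are hypothetical names, the GL call appears only in a comment, and the `GL_STREAM_DRAW` usage hint is an assumption rather than something stated in the quote.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for one streaming VBO of fixed size. */
typedef struct {
    size_t   capacity; /* fixed size chosen for the VBO, in bytes */
    size_t   cursor;   /* next free byte offset in the current storage */
    unsigned orphans;  /* how many times the storage has been orphaned */
} StreamCursor;

/* Reserve `size` bytes and return the offset to write at and draw from.
 * If the remaining space is too small, orphan the storage and rewind. */
size_t stream_reserve(StreamCursor *s, size_t size)
{
    assert(size <= s->capacity);
    if (s->capacity - s->cursor < size) {
        /* In a real renderer this is where the orphaning call would go,
         * e.g. (usage hint is an assumption):
         *   glBufferData(GL_ARRAY_BUFFER, capacity, NULL, GL_STREAM_DRAW);
         * The driver can then hand out fresh storage while the GPU
         * finishes drawing from the old block. */
        s->cursor = 0;
        s->orphans++;
    }
    size_t offset = s->cursor;
    s->cursor += size;
    return offset;
}
```

    Each batch would call `stream_reserve`, write its vertices at the returned offset, and issue its draw from there. The client never waits on the GPU in this scheme.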

  4. #44
    Member Regular Contributor
    Join Date
    Nov 2003
    Location
    Germany
    Posts
    293

    Re: VBOs strangely slow?

    hi,
    just so I understand correctly:

    orphaning a buffer means glBufferData(.., NULL); or mapping with invalidation?

  5. #45
    Senior Member OpenGL Guru Dark Photon
    Join Date
    Oct 2004
    Location
    Druidia
    Posts
    3,124

    Re: VBOs strangely slow?

    Quote Originally Posted by Chris Lux
    orphaning a buffer means glBufferData(.., NULL); or mapping with invalidation?
    Yes. Per Rob in previous post:
    Quote Originally Posted by Rob Barris
    ...orphan current storage by doing a new glBufferData using the fixed size chosen for the VBO, and a NULL pointer.

  6. #46
    Member Regular Contributor
    Join Date
    Apr 2006
    Location
    Irvine CA
    Posts
    299

    Re: VBOs strangely slow?

    There are two kinds of invalidation that MapBufferRange can do, and they have very different purposes.

    One is tied to MAP_INVALIDATE_BUFFER_BIT in the access parameter to MapBufferRange. This essentially means "orphan". So in the usage I described, you could set this bit when you go back to offset 0 and get the same effect as BufferData(NULL).

    The other is a bit more subtle, and it is tied to MAP_INVALIDATE_RANGE_BIT. This may seem a bit redundant, but it is important. It explicitly tells the driver up front "the range I am mapping - it does not need to contain valid data that I can read" - it is a signal to the driver that it is free to replace every single byte in that range with whatever is in your CPU-visible mapped buffer area upon unmap (or explicit flush).

    The freedom this provides to the driver, if you have also set the WRITE bit but not the READ bit, is that it can hand back a pointer to completely uninitialized scratch memory - which may well be driver allocated for write-through uncached access etc. By opting into invalidation of the range, you eliminate any need for the driver to put a copy of valid data in that range prior to returning the pointer. If an implementor wanted to keep system-memory images of buffers to a minimum, this would let that driver provide scratchpad memory for maps using these bits (write + invalidate-range) - and then transfer those bits to the final destination later, perhaps via DMA.

    Restated more simply, think of MAP_INVALIDATE_RANGE_BIT as a "promise to write the whole range, nothing but the range, and never read from the range" bit.
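
    As a rough illustration of the two invalidation bits, here is a sketch of choosing the access flags for a map call in the streaming pattern described above. The helper `stream_map_access` is hypothetical. The bit values are copied from the standard OpenGL headers so that the snippet compiles on its own, and a real application would likely combine these with other flags depending on how it synchronizes.

```c
#include <assert.h>
#include <stddef.h>

/* glMapBufferRange access bits, values as in the OpenGL headers,
 * repeated here so the sketch compiles without GL headers. */
#define GL_MAP_READ_BIT              0x0001
#define GL_MAP_WRITE_BIT             0x0002
#define GL_MAP_INVALIDATE_RANGE_BIT  0x0004
#define GL_MAP_INVALIDATE_BUFFER_BIT 0x0008

/* Hypothetical helper: pick the access flags for mapping a range that
 * starts at `offset` in the streaming VBO. */
unsigned stream_map_access(size_t offset)
{
    unsigned access = GL_MAP_WRITE_BIT;
    if (offset == 0) {
        /* Back at the start: invalidate the whole buffer ("orphan"). */
        access |= GL_MAP_INVALIDATE_BUFFER_BIT;
    } else {
        /* Appending: promise to overwrite the whole range, never read it. */
        access |= GL_MAP_INVALIDATE_RANGE_BIT;
    }
    /* Write-only, as the post describes: the READ bit is never set. */
    assert((access & GL_MAP_READ_BIT) == 0);
    return access;
}
```

    The returned value would be passed as the access argument of glMapBufferRange, together with the offset and length of the range being written.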

  7. #47
    Junior Member Newbie
    Join Date
    Jul 2010
    Posts
    1

    Re: VBOs strangely slow?

    Great info in this topic, thanks everyone! Quick question: on OpenGL implementations without MapBufferRange support (e.g. OpenGL ES), are there any good alternative ways to implement the dynamic vertex ring buffer that Rob suggested? It sounds like glBufferSubData has some pitfalls in the general case (and with the style of workload described). Is it best to stick with standard non-VBO vertex arrays in this case? Thanks!
