Part of the Khronos Group
OpenGL.org


Thread: Bindless Stuff

  1. #51
    Junior Member Regular Contributor
    Join Date
    Sep 2003
    Location
    Ireland
    Posts
    136

    Re: Bindless Stuff

    Quote Originally Posted by Alfonse Reinheart
    Then if task1 + task2 take a long time and task3 + task4 can be completed while task1 + task2 are still running
    GPUs are required to do things in order. Thus, task3 is guaranteed to complete after task2.
    Is it more accurate to say that GPUs are required to do things in order only if necessary? So task 3 could complete before task 2 and task 1, provided it didn't depend on the state of any of the resources used by them?

  2. #52
    Senior Member OpenGL Guru
    Join Date
    May 2009
    Posts
    4,793

    Re: Bindless Stuff

    Is it more accurate to say that GPUs are required to do things in order if necessary?
    Not in any terms that the user is allowed to see. The GPU cannot report completion of task 3 before tasks 1 and 2 complete. So even if it does reorder things, you are forcibly insulated from the effects of that.
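
    To make that concrete, here is a toy CPU-side model of the rule, not a description of how any real driver or GPU works. The class and method names are made up for illustration. Tasks may finish internally in any order, but the only thing the user can observe is the longest finished prefix of the submission order.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy model: internal execution may be reordered, but
// completion is only ever reported in submission order.
class InOrderCompletion {
public:
    // Submit a task; returns its index in submission order.
    std::size_t submit() {
        finished_.push_back(false);
        return finished_.size() - 1;
    }

    // The (imaginary) hardware finishes a task, possibly out of order.
    void finish(std::size_t task) { finished_[task] = true; }

    // Number of tasks the user can observe as complete: the longest
    // prefix of the submission order in which every task has finished.
    std::size_t reportedComplete() {
        while (reported_ < finished_.size() && finished_[reported_]) {
            ++reported_;
        }
        return reported_;
    }

    // True only once this task and every earlier task have finished.
    bool isReportedComplete(std::size_t task) {
        return task < reportedComplete();
    }

private:
    std::vector<bool> finished_;
    std::size_t reported_ = 0;
};
```

    If the third task finishes first in this model, isReportedComplete still returns false for it until the first two have finished as well, which is the insulation described above.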

  3. #53
    Dark Photon
    Senior Member OpenGL Guru
    Join Date
    Oct 2004
    Location
    Druidia
    Posts
    2,892

    Re: Bindless Stuff

    In the spirit of Rob's sharing, I'll throw this out there.

    It's interesting to apply Rob's "streaming VBO" technique to static geometry too, and then take advantage of temporal coherence. That is, if you've uploaded a batch before and you haven't orphaned its buffer yet, then... yep, you guessed it. You don't need to upload it again. Just launch the batch again from the old location.

    In the ideal case (static/near-static scene, static/near-static viewpoint), you end up with perf that's pretty darn close to NVIDIA display lists (or bindless preloaded VBOs). Now that is sweet! Worst case, it's about client arrays perf, which isn't shabby. After all, you gotta get the data to the GPU at least once (though with GL4, you allegedly could use ARB_copy_buffer plus a background thread to accelerate that).

    You can think of all kinds of ways to improve upon this to maximize perf (maximize cache "hits").
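
    A minimal sketch of the bookkeeping behind the reuse idea above, assuming a single stream buffer that is filled front to back and orphaned when it runs out of room. It is CPU-side only: the GL calls a real renderer would make are mentioned in comments, and StreamCache, BatchLocation, acquire and the batch ids are hypothetical names, not anything from Rob's code. Alignment and batches larger than the buffer are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Where a batch was last written in the stream buffer.
struct BatchLocation {
    std::size_t offset;        // byte offset inside the stream buffer
    std::uint64_t generation;  // buffer incarnation it was written into
};

// Hypothetical cache over a streaming VBO of fixed capacity.
class StreamCache {
public:
    explicit StreamCache(std::size_t capacity) : capacity_(capacity) {}

    // Returns the byte offset to draw the batch from. needsUpload is set
    // to true when the caller must first copy the batch data to that
    // offset (e.g. via glMapBufferRange on the stream buffer).
    std::size_t acquire(std::uint32_t batchId, std::size_t size,
                        bool& needsUpload) {
        assert(size <= capacity_);  // assumption: a batch always fits

        auto it = cache_.find(batchId);
        if (it != cache_.end() && it->second.generation == generation_) {
            // Cache hit: the data is still in the live buffer, so just
            // draw from the old location again.
            needsUpload = false;
            ++hits_;
            return it->second.offset;
        }

        if (cursor_ + size > capacity_) {
            // Out of room: orphan the buffer (in GL, glBufferData with a
            // null data pointer) and start over. Every cached location
            // now points into the old storage, so forget them all.
            ++generation_;
            ++orphans_;
            cursor_ = 0;
            cache_.clear();
        }

        // Cache miss: append the batch at the write cursor.
        const std::size_t offset = cursor_;
        cursor_ += size;
        cache_[batchId] = BatchLocation{offset, generation_};
        needsUpload = true;
        return offset;
    }

    std::size_t hits() const { return hits_; }
    std::size_t orphans() const { return orphans_; }

private:
    std::size_t capacity_;
    std::size_t cursor_ = 0;
    std::uint64_t generation_ = 0;
    std::size_t hits_ = 0;
    std::size_t orphans_ = 0;
    std::unordered_map<std::uint32_t, BatchLocation> cache_;
};
```

    Per frame, you would call acquire for each visible batch, upload only when needsUpload comes back true, and draw from the returned offset. With a static scene and viewpoint nearly every call is a hit; with constant churn every call is a miss and you are back to plain streaming.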
