Thread: Official feedback on OpenGL 4.1 thread

  1. #71
    Super Moderator Frequent Contributor Groovounet's Avatar
    Join Date
    Jul 2004
    Posts
    934

    Re: direct_state_access_memory?

    An OpenGL 4.1 review: http://www.g-truc.net/post-0320.html

  2. #72
    Junior Member Newbie
    Join Date
    Jan 2009
    Location
    Poland
    Posts
    26

    Re: direct_state_access_memory?

    I have used these new features such as separable shader objects, pipeline objects and direct state access for some time already and I have to say the API is sooo much cleaner now, also it's very easy to plug it into an OOP wrapper now. GJ everyone involved!

  3. #73
    Advanced Member Frequent Contributor
    Join Date
    Dec 2007
    Location
    Hungary
    Posts
    985

    Re: direct_state_access_memory?

    I don't know whether anybody has already mentioned it, but we cannot talk about OpenGL 4.1 as a superset of DX11 until we get the following extensions in core:

    GL_EXT_shader_image_load_store
    GL_EXT_shader_atomic_counters

    These are needed, for example, to implement DX11 linked list buffers.
    They also come in handy for creating virtually any dynamic data structure on the GPU.

    Btw, enabling RW textures also requires the spec to define how depth testing works. Currently the GL officially places the depth test after the fragment shader but allows implementations to run it before the shader (early Z).
    This wasn't a problem previously, as the only outputs of a shader were the fragment color and depth. Now that even a discarded fragment can write to buffers, we need a way to specify whether early Z is enabled or disabled.
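
    To make the concern concrete, here is a minimal CPU-side sketch of why the position of the depth test becomes observable once shaders can write to memory. It is a model, not driver behaviour: the `Fragment` struct, the `countSideEffectWrites` function and the "less" depth comparison are all assumptions made for illustration.

```cpp
#include <cassert>
#include <vector>

struct Fragment { float depth; };

// Counts how many fragments perform a side-effect write (standing in for an
// image store) when the depth test runs before or after the shader.
// Assumption of this sketch: the depth comparison is "less".
int countSideEffectWrites(const std::vector<Fragment>& frags,
                          float storedDepth, bool earlyTest) {
    assert(storedDepth >= 0.0f && storedDepth <= 1.0f);
    int writes = 0;
    for (const Fragment& f : frags) {
        bool passes = f.depth < storedDepth;
        if (earlyTest && !passes) continue; // culled before the shader runs
        ++writes;                           // shader body: the store happens here
        // With a late test the fragment may still be rejected after this
        // point, but its store has already been performed.
    }
    return writes;
}
```

    With three fragments at depths 0.2, 0.8 and 0.5 against a stored depth of 0.6, the early-test path performs two writes and the late-test path performs three, so the two orderings are no longer interchangeable.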
    Disclaimer: This is my personal profile. Whatever I write here is my personal opinion and none of my statements or speculations are anyhow related to my employer and as such should not be treated as accurate or valid and in no case should those be considered to represent the opinions of my employer.
    Technical Blog: http://www.rastergrid.com/blog/

  4. #74
    Super Moderator Frequent Contributor Groovounet's Avatar
    Join Date
    Jul 2004
    Posts
    934

    Re: direct_state_access_memory?

    For linked list it's GL_NV_shader_buffer_store

  5. #75
    Junior Member Newbie
    Join Date
    Oct 2010
    Posts
    1

    Re: direct_state_access_memory?

    GL_NV_shader_buffer_store is great, but it does not have all the functionality of its DX11 equivalent, in particular append/consume buffers.
    They can be emulated with atomics on a buffer in global memory, but I am not sure how efficient that is compared to the DX11 implementation.
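
    The emulation mentioned above can be sketched on the CPU as fixed storage plus one atomic counter. This is only a model of the idea: the `AppendBuffer` class and its methods are hypothetical names, and the bounded reservation uses a compare-exchange loop so the counter never leaves the valid range, whereas a shader would more likely use a plain atomic add and size the buffer for the worst case.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU-side model of an append/consume buffer: storage plus an atomic counter.
class AppendBuffer {
public:
    explicit AppendBuffer(std::size_t capacity) : data_(capacity), count_(0) {
        assert(capacity > 0);
    }

    // Append: atomically reserve the next free slot, then write into it.
    bool append(std::uint32_t value) {
        std::uint32_t slot = count_.load();
        do {
            if (slot >= data_.size()) return false; // buffer is full
        } while (!count_.compare_exchange_weak(slot, slot + 1));
        data_[slot] = value;
        return true;
    }

    // Consume: atomically release the last used slot and read it back.
    // Assumes appends and consumes happen in separate passes.
    bool consume(std::uint32_t& out) {
        std::uint32_t n = count_.load();
        do {
            if (n == 0) return false; // nothing left to consume
        } while (!count_.compare_exchange_weak(n, n - 1));
        out = data_[n - 1];
        return true;
    }

private:
    std::vector<std::uint32_t> data_;
    std::atomic<std::uint32_t> count_;
};
```

    The cost question raised in the post is exactly the contention on that single counter, which this sketch does not measure.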

  6. #76
    Advanced Member Frequent Contributor
    Join Date
    Dec 2007
    Location
    Hungary
    Posts
    985

    Re: direct_state_access_memory?

    Quote Originally Posted by Groovounet
    For linked list it's GL_NV_shader_buffer_store
    Actually GL_EXT_shader_image_load_store is much more than GL_NV_shader_buffer_store + GL_NV_shader_buffer_load together.
    It provides RW buffers, atomic counters and explicit early/late depth/stencil tests.

    Off-topic, but besides these, what is also very interesting to me is GL_AMD_conservative_depth. It allows the driver to sometimes perform early Z even when the shader changes the Z value.

  7. #77
    Super Moderator Frequent Contributor Groovounet's Avatar
    Join Date
    Jul 2004
    Posts
    934

    Re: direct_state_access_memory?

    It provides RW images but nothing about buffers.
    Well, I guess the linked list could be implemented with image1D, and actually I wonder about the difference between the two performance-wise.

    Also GL_AMD_conservative_depth early-z is actually part of GL_EXT_shader_image_load_store and takes the form:
    layout(early_fragment_tests) in;

  8. #78
    Senior Member OpenGL Guru
    Join Date
    May 2009
    Posts
    4,948

    Re: direct_state_access_memory?

    Quote Originally Posted by Groovounet
    Also GL_AMD_conservative_depth early-z is actually part of GL_EXT_shader_image_load_store and take the form:
    layout(early_fragment_tests) in;
    That's not the same thing. They solve two different problems.

    The EXT_shader_image_load_store version commands the implementation to put the depth test before the fragment shader.

    AMD_conservative_depth is all about retaining the behavior of early depth test transparently. With conservative depth, you're effectively making a contract: you're changing the depth, but you are doing so in such a way that some depth tests will always work.

    With AMD_conservative_depth, you're making a contract with OpenGL, not telling OpenGL what to do. The spec specifically states that if you violate the contract, the results of rendering are undefined.

    Which is better depends on how much control you want over these kinds of things vs. how much freedom the implementation should have to make your stuff faster. For example, if you force the implementation to do early-depth test, you can do some crazy things like writing depth values that would have failed the depth test had the test actually happened with those values. This is a full-fledged feature that you would use to create a special effect.

    At the same time, forcing early-depth test does nothing for optimizing the case where you're biasing a fragment's depth smaller than the actual depth value. You have to do the depth test last; you can't force early-depth. However, if the hardware is capable of doing both early and late depth tests, then an implementation of AMD_conservative_depth would allow this with the "depth_less" version. The early depth test could cull some fragments that are obvious failures, while the late depth test could catch a few fragments here and there. Most of the fragments missed by the early test will actually be drawn, so little performance (outside of the double depth test) is lost.

    This is purely an optimization; the meaning of the depth test is preserved in these cases.
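
    The contract described above can be modelled in a few lines. This is a sketch under stated assumptions, not the extension's wording: the enum values mirror the `depth_any`, `depth_greater`, `depth_less` and `depth_unchanged` qualifiers, only two depth comparisons are modelled, and the pairing of qualifier and comparison in `earlyRejectIsSafe` is my own reasoning about when an early cull cannot change the result.

```cpp
#include <cassert>

enum class DepthHint { Any, Greater, Less, Unchanged }; // models the layout qualifiers
enum class DepthFunc { Less, Greater };                 // subset of depth comparisons

bool depthTest(DepthFunc func, float fragDepth, float storedDepth) {
    return func == DepthFunc::Less ? fragDepth < storedDepth
                                   : fragDepth > storedDepth;
}

// An early reject on the interpolated depth stays correct only if the shader
// promises to move the depth further in the failing direction, or not at all.
bool earlyRejectIsSafe(DepthHint hint, DepthFunc func) {
    if (hint == DepthHint::Unchanged) return true;
    if (func == DepthFunc::Less) return hint == DepthHint::Greater;
    return hint == DepthHint::Less;
}

// Returns true if the fragment survives. shaderDepth is the value the shader
// writes; violating the contract is undefined, modelled here as an assert.
bool resolveFragment(DepthHint hint, DepthFunc func, float interpolatedDepth,
                     float shaderDepth, float storedDepth) {
    assert(hint != DepthHint::Greater || shaderDepth >= interpolatedDepth);
    assert(hint != DepthHint::Less || shaderDepth <= interpolatedDepth);
    assert(hint != DepthHint::Unchanged || shaderDepth == interpolatedDepth);

    // Early test: culls the obvious failures before the shader would run.
    if (earlyRejectIsSafe(hint, func) &&
        !depthTest(func, interpolatedDepth, storedDepth)) {
        return false;
    }
    // Late test: catches the remaining fragments using the written depth.
    return depthTest(func, shaderDepth, storedDepth);
}
```

    As long as the shader honours the contract, `resolveFragment` returns the same answer a late-only test would, which is the sense in which the feature is purely an optimization.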

  9. #79
    Advanced Member Frequent Contributor
    Join Date
    Dec 2007
    Location
    Hungary
    Posts
    985

    Re: direct_state_access_memory?

    Agree with Alfonse Reinheart. GL_AMD_conservative_depth does not specify whether to use an early or late depth test; rather, it tells the GL what it can assume about the depth modification done in the shader, so that it can provide optimum performance by using the early depth test as often as possible.

    Just think about the following. GPUs usually use a hierarchical Z buffer to reject multiple fragments at once during the depth test. This is usually set up in one direction, e.g. the next mip in the hierarchy always contains the maximum of the corresponding depth values from the previous mip level (GPUs nowadays often use a two-direction hierarchical Z buffer). In such a case, if you specify "depth_greater" then you may get early Z even though you modify the depth value by increasing it in the fragment shader.
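
    A minimal sketch of the one-direction hierarchy described here, assuming a max-based reduction over 2x2 blocks and a "less" depth comparison. The function names and the row-major buffer layout are illustrative assumptions, and real hardware layouts are not public in this detail.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Builds one coarser level of a max-based hierarchical Z buffer: each coarse
// texel stores the maximum depth of the 2x2 block beneath it.
std::vector<float> buildMaxLevel(const std::vector<float>& fine,
                                 int width, int height) {
    assert(width % 2 == 0 && height % 2 == 0);
    assert(fine.size() == static_cast<std::size_t>(width) * height);
    const int cw = width / 2, ch = height / 2;
    std::vector<float> coarse(static_cast<std::size_t>(cw) * ch);
    for (int y = 0; y < ch; ++y) {
        for (int x = 0; x < cw; ++x) {
            float m = fine[(2 * y) * width + 2 * x];
            m = std::max(m, fine[(2 * y) * width + 2 * x + 1]);
            m = std::max(m, fine[(2 * y + 1) * width + 2 * x]);
            m = std::max(m, fine[(2 * y + 1) * width + 2 * x + 1]);
            coarse[y * cw + x] = m;
        }
    }
    return coarse;
}

// With a "less" comparison, a whole block can be rejected when the smallest
// incoming depth is not below the block's maximum stored depth. A shader that
// only increases depth (depth_greater) cannot turn that rejection into a pass.
bool blockRejected(float minIncomingDepth, float blockMaxDepth) {
    return minIncomingDepth >= blockMaxDepth;
}
```

    For a 4x2 depth buffer with rows {0.2, 0.4, 0.9, 0.7} and {0.3, 0.1, 0.8, 0.6}, the coarse level is {0.4, 0.9}; geometry whose nearest depth is 0.5 is rejected for the first block in one comparison and must be tested per fragment in the second.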

    About GPU linked lists: they need only a 2D RW image, a RW buffer and atomic counters (e.g. in the case of order-independent transparency). There is not much difference between images and buffers, so a 1D image will work as well and as fast as a buffer on today's hardware. Think about it: RW images are effectively buffers. You don't have filtering and the like, and you can use only exact texel fetches, as with buffer textures. So there is no real difference.
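
    The three resources named above map onto a small CPU-side model: a per-pixel head-pointer array (the 2D RW image), a node pool (the RW buffer) and one atomic counter. This is a sketch of the usual per-pixel linked list construction as I understand it; the `FragmentLists` class, the `Node` layout and the end-of-list sentinel are assumptions, not anything specified in the thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kEndOfList = 0xFFFFFFFFu;

struct Node {
    float depth;
    std::uint32_t color;
    std::uint32_t next; // index of the next node, or kEndOfList
};

class FragmentLists {
public:
    FragmentLists(int width, int height, std::size_t maxNodes)
        : width_(width), height_(height),
          heads_(static_cast<std::size_t>(width) * height),
          nodes_(maxNodes), counter_(0) {
        for (auto& h : heads_) h.store(kEndOfList); // "clear" the head image
    }

    // What a fragment shader invocation would do for one fragment.
    bool insert(int x, int y, float depth, std::uint32_t color) {
        assert(x >= 0 && x < width_ && y >= 0 && y < height_);
        std::uint32_t index = counter_.fetch_add(1);  // atomic counter
        if (index >= nodes_.size()) return false;     // node pool exhausted
        // Atomic exchange on the head pointer: the new node becomes the head
        // and links to whatever was the head before.
        std::uint32_t previous = heads_[y * width_ + x].exchange(index);
        nodes_[index] = Node{depth, color, previous};
        return true;
    }

    // Resolve pass: walk the list of one pixel, newest fragment first.
    std::vector<Node> collect(int x, int y) const {
        std::vector<Node> out;
        for (std::uint32_t i = heads_[y * width_ + x].load();
             i != kEndOfList; i = nodes_[i].next) {
            out.push_back(nodes_[i]);
        }
        return out;
    }

private:
    int width_, height_;
    std::vector<std::atomic<std::uint32_t>> heads_;
    std::vector<Node> nodes_;
    std::atomic<std::uint32_t> counter_;
};
```

    Nothing in the model depends on the node pool being a buffer rather than a 1D image, which is consistent with the point that the two are interchangeable for this use.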

  10. #80
    Advanced Member Frequent Contributor
    Join Date
    Dec 2007
    Location
    Hungary
    Posts
    985

    Re: direct_state_access_memory?

    I have another thing that I would like to be included in OpenGL 4.2, namely:

    GL_ARB_draw_indirect2

    Such an extension does not exist yet; however, it would be good to have one that includes the following:

    1. First instance ID field. This is already mentioned in GL_ARB_draw_indirect, and it would define the first element ID to use from an instanced array.
    2. MultiDrawArraysIndirect and MultiDrawElementsIndirect. The original extension says there would be little to no use for such functionality. However, if hardware were able to fetch the data transparently, this would allow drawing loads of individual objects with a single draw call, enabling fully GPU-based scene management. In order to introduce this properly, I would also recommend the introduction of indirect buffer objects, as well as a VertexAttribPointer-like function to define the stride of the data in the indirect buffer.
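
    To illustrate the proposal, here is a sketch of how a multi-draw entry point might walk an indirect buffer with a caller-supplied stride. Everything in it is hypothetical: the `DrawArraysCommand` layout takes the four-field arrays command of GL_ARB_draw_indirect and assumes its reserved field is reused as the first instance ID, and `decodeCommands` merely stands in for what the GPU or driver would do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical command layout with a first-instance field in the last slot.
struct DrawArraysCommand {
    std::uint32_t count;         // number of vertices
    std::uint32_t primCount;     // number of instances
    std::uint32_t first;         // first vertex
    std::uint32_t firstInstance; // assumed: first element of instanced arrays
};

// Reads drawCount commands spaced `stride` bytes apart, starting at `offset`.
// Assumption: a stride of 0 means the commands are tightly packed.
std::vector<DrawArraysCommand> decodeCommands(
        const std::vector<unsigned char>& indirectBuffer,
        std::size_t offset, std::size_t drawCount, std::size_t stride) {
    if (stride == 0) stride = sizeof(DrawArraysCommand);
    assert(stride >= sizeof(DrawArraysCommand));
    std::vector<DrawArraysCommand> commands;
    for (std::size_t i = 0; i < drawCount; ++i) {
        std::size_t at = offset + i * stride;
        assert(at + sizeof(DrawArraysCommand) <= indirectBuffer.size());
        DrawArraysCommand cmd;
        std::memcpy(&cmd, indirectBuffer.data() + at, sizeof cmd);
        commands.push_back(cmd); // a real implementation would issue the draw
    }
    return commands;
}
```

    The stride lets an application interleave its own per-object data with the draw commands, which is the same role the stride parameter plays for vertex attributes.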

    What do you think about it?