Part of the Khronos Group
OpenGL.org


Thread: sampler-Variables in Uniform-Blocks

  1. #11
    Junior Member Regular Contributor
    Join Date
    Nov 2012
    Location
    Bremen, Germany
    Posts
    149
I still do not get the point. There is a finite set of API commands that change the data inside a uniform block, and a finite set of commands that change buffer data. One could easily prevent writes to a buffer bound to a uniform block containing opaques, or deal with those cases - be it with a loss of performance for any readbacks that may be required.

  2. #12
    Advanced Member Frequent Contributor
    Join Date
    Dec 2007
    Location
    Hungary
    Posts
    985
    Once again:

    1. At the time you modify your buffer data (e.g. using glBufferSubData or through a mapped pointer), the driver doesn't know which uniform block you'll bind your buffer to, thus it cannot know which buffer addresses will actually represent a sampler.
    2. A sampler variable is an opaque type. You do set it to the texture unit index using glUniform*, but a texture unit index is an API concept; the actual data that a sampler variable holds can vary from hardware to hardware, and it might be way bigger than something that fits into a 32-bit integer.
    3. What happens if the buffer is written by the GPU using image stores, shader storage block writes or transform feedback, and then is immediately used as a uniform buffer afterwards? How could you "patch" the sampler values then? Should the CPU wait for the first pass to finish on the GPU, parse the buffer, patch it, and then start the second step on the GPU? That would be horribly inefficient.

  3. #13
    Advanced Member Frequent Contributor
    Join Date
    Apr 2009
    Posts
    590
    Quote Originally Posted by hlewin View Post
    I still do not get the point. There is a finite set of API commands that change the data inside a uniform block, and a finite set of commands that change buffer data. One could easily prevent writes to a buffer bound to a uniform block containing opaques, or deal with those cases - be it with a loss of performance for any readbacks that may be required.
    I think the main point you have missed is the fundamental point I made: what a sampler is and how a GPU accesses it is completely determined by the GPU. What I think you see is this: a sampler uniform is an integer, and that integer holds which texture unit to use; what texture to use and how it is sampled is "stored" by the texture unit. That is the interface that GL exposes, but it may or may not be at all what happens inside an implementation. The only guarantee is that when one calls glUniform1i, passing the uniform location of a sampler, the GL implementation will make sure that the data used comes from whatever texture is bound to the named unit, filtered according to that texture unit's state. Internally, beyond the driver tracking it, likely nothing happens until the actual draw call. When a draw call is finally issued, a GL implementation likely then looks at what is bound to the named texture unit and sets the GPU state for all of those goodies. The best analogy I can give you is this:

    On CPU (not GPU), there is an array, indexed by texture unit, storing what data and how to filter that data.
    On CPU (not GPU), as part of program state, each sampler stores an index into that array.
    On CPU (not GPU), a GL implementation then looks at that index, and sets GPU state by the values of that array. In addition, it likely also does additional work to make sure the data for the texture is resident in VRAM.
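The three-step model above can be sketched in plain C. All names and fields here are illustrative stand-ins, not any real driver's internals:

```c
#include <assert.h>

/* Hypothetical CPU-side state a GL implementation might keep. */
typedef struct {
    unsigned texture_name;  /* which texture object is attached to this unit */
    int      filter_mode;   /* how that texture is to be sampled */
} TextureUnit;

#define MAX_UNITS 16
static TextureUnit units[MAX_UNITS];  /* indexed by texture unit, lives on the CPU */
static int sampler_uniform = 0;       /* the sampler "value": just an index */

/* glUniform1i analogue: nothing happens on the GPU, only the index changes. */
static void set_sampler(int unit) { sampler_uniform = unit; }

/* At draw time the implementation resolves the index and programs the GPU
 * from whatever the unit holds *now*, not what it held at set_sampler time. */
static TextureUnit resolve_at_draw(void) { return units[sampler_uniform]; }
```

The point of the sketch: the array and the index both live on the CPU, and the GPU only ever sees the fully resolved state at draw time.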

    What you are thinking, I think, is that the array is stored on the GPU and the GPU architecture is flexible enough to look at that array. That may or may not be the case at all. Indeed, even the NVIDIA extension requires one to make sure, by hand, that the data is resident; beyond that it just wants a 64-bit address to be happy.

    Let's go one wild, ugly step further. Suppose that a GPU's architecture does NOT have a dedicated, discrete piece of hardware to do filtering. Suppose the filtering is instead baked into the assembly of the shader. Such a GPU is still fine for GL, since the driver would then, for each shader, store a map of compiled variants keyed by texture filtering state. This might sound wild, but it is not totally far-fetched: as a related (though not identical) example, the NVIDIA Tegra 2 adds additional shader code to a fragment shader based upon blending state.

  4. #14
    Junior Member Regular Contributor
    Join Date
    Nov 2012
    Location
    Bremen, Germany
    Posts
    149
    I do not think I've missed the point. I just don't care about implementation issues that come up when making decisions about how I think the spec should look.
    All that ever comes up in reply is implementation issues that read like:
    "You cannot do that when calling glBufferData; you would have to do it when calling glBindBuffer as well. So it is impossible." I don't write the specs. I do not implement OpenGL. I make suggestions.
    The argument is simple:
    - Samplers can be set by a number, as texture units are enumerable -> glUniform1i
    - Data buffers can be watched for changes: one can enumerate the possibilities that can change buffer contents.
    - Buffer bindings are well defined: it is known where which part of the buffer is bound.
    From this it follows that an integer representing a sampler can be fished out of the buffer either when its data changes or when it gets bound.
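A toy model of the interception this argument assumes might look like the following. This is pure C with hypothetical names (ToyBuffer, the hook itself); real GL exposes no such intercept, which is exactly the point under dispute:

```c
#include <assert.h>
#include <string.h>

/* Toy buffer plus the binding bookkeeping the proposal relies on. */
typedef struct {
    unsigned char data[64];
    int    bound_block_has_sampler;  /* known only once a block binding exists */
    size_t sampler_offset;           /* where the sampler integer lives */
} ToyBuffer;

static int extracted_unit = -1;      /* the texture unit "fished out" of the buffer */

/* Hypothetical write intercept: every update re-extracts the sampler index.
 * Note it can only work if the block binding is already established. */
static void toy_buffer_sub_data(ToyBuffer *b, size_t off,
                                const void *src, size_t n) {
    memcpy(b->data + off, src, n);
    if (b->bound_block_has_sampler)
        memcpy(&extracted_unit, b->data + b->sampler_offset,
               sizeof extracted_unit);
}
```

The sketch also exposes the counter-argument made elsewhere in the thread: the `bound_block_has_sampler` flag is unknowable at upload time, and GPU-side writes never pass through such a hook at all.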

    As for performance issues: an API does not need to prevent conditions in which an operation would be slow by making the operation impossible. If a buffer bound to a uniform block is written to, that is bad practice anyway. Who would bind a uniform-block-bound buffer containing opaques to an image unit and render to it expecting full performance? That might be possible; that might be impossible; I do not know the hardware details. One could even prevent using the same buffer in incompatible contexts if needed. That would result in an error when binding the buffer to a uniform block with opaques while it is also bound as backing memory for an image, as a pixel (un)pack buffer, and so on. Those are details I'm not concerned with. What I'm concerned with in making this suggestion is that I think the API is missing something that makes it less practicable to use in certain use cases.

  5. #15
    Senior Member OpenGL Guru
    Join Date
    May 2009
    Posts
    4,948
    Buffer bindings are well defined: it is known where which part of the buffer is bound.
    Consider this:

    Code :
    GLuint uniformBuffer;
    glGenBuffers(1, &uniformBuffer);
    glBindBuffer(GL_UNIFORM_BUFFER, uniformBuffer);
     
    GLuint bufferData = 5; // Use texture image unit 5.
    glBufferData(GL_UNIFORM_BUFFER, sizeof(bufferData), &bufferData, GL_STATIC_DRAW);

    This is the only thing OpenGL sees. How is OpenGL to know, at the time glBufferData is called, which uniform block this buffer is going to be used with? There is no glCompileShader, glLinkProgram, glUseProgram, or similar function in this code. This code is perfectly legal to call before any shaders have been compiled. There is no uniform block yet. So how does OpenGL know, from this code alone, that this 4-byte block should be interpreted as a sampler uniform?

    OpenGL simply has no way of knowing that any particular upload of data to a buffer object is destined for any particular uniform block. And without that knowledge, OpenGL cannot determine at the time data is uploaded what is and is not opaque.

    If a buffer bound to a uniform-block is written to, that is a bad practice.
    That's the part you don't understand: a buffer is never bound to a uniform block. The association between a uniform block and a buffer object is implicit. It's done by separate state, one part in the context, and one part in the program. Without both, OpenGL has no idea how a buffer object will be used. And until you actually render, OpenGL can't be sure that any particular buffer binding state is not merely temporary.

    And there's nothing in the API that requires the use of a program when uploading data to a buffer object. Without that knowledge, there is no way to understand what particular data means.

    an api does not Need to prevent conditions in which an Operation would be slow by making the Operation impossible.
    If you can do something, then that something should be reasonably fast. And if it's not possible to make something reasonably fast, then the user shouldn't be able to do it at all. That's good API design.

    To do otherwise creates performance traps for the user, where simple and obvious uses of the API are terribly slow without any warning. That makes the API harder to use for no real benefit; users have to have some arbitrary knowledge, outside of what a function does, to know what is the proper way to use the API.

    OpenGL already has too many of these performance traps as it is (especially around buffer objects); it doesn't need more of them.

    Those are details I'm not concerned with. What I'm concerned with in making this suggestion is that I think the API is missing something that makes it less practicable to use in certain use cases.
    In short: you want the feature, and you don't care if it's actually possible to implement, or what the performance implications of implementing it will be, or how it will affect the useability of the API. Fortunately, the ARB does care about these things, which is why it doesn't exist and won't in the near future. At least, not this way.

    As I said earlier on, it's best that you not change opaque uniform settings to begin with. You should set them once, and leave them that way; this is how most code is written. It's easier to bind a texture to the right texture image unit than to change which unit is used by a shader. So even if this were done, only a fraction of users would actually need it. Bindless texturing is mostly about eliminating the glBindTexture overhead, as well as potentially determining which texture to use in the shader itself. So even users of that aren't using it for the reasons you're talking about.

  6. #16
    Junior Member Regular Contributor
    Join Date
    Nov 2012
    Location
    Bremen, Germany
    Posts
    149
    That's the part you don't understand: a buffer is never bound to a uniform block. The association between a uniform block and a buffer object is implicit.
    As I understand the spec, block bindings say: get the data from that buffer if needed. So if a block binding gets established after the first code fragment, it can be checked whether the block contains uniforms that require special handling, that is, the opaque types. This takes place before a shader can use the data contained in the block. The other way around it is the same: if a block binding has been established and glBufferData() is called to update the data, that can be checked - once again, before any shader uses the data.
    You are concerned with the internal data type of the opaque, I guess. That is INTERNAL data and could be moved to a special memory location that the user does not even know exists.
    Thought of as
    Code :
    block {
        sampler2D s; // an integer at the C interface; not used by the GPU at all
    };
    // hidden block of data: for example the NVIDIA 64-bit unsigneds
    So if it gets requested that the data for the block should be pulled from a buffer, everything stays the way it is, except that the hardware state gets updated if necessary and the hidden block gets updated with data.
    I could call glGetBufferSubData(GL_UNIFORM_BUFFER, offsetof(block, s), sizeof(s), &s); glUniform1i(hidden_block_location, s); whenever I establish a block binding or update the buffer data.

    EDIT:
    Fortunately, the ARB does care about these things,
    You seem to be quite familiar with the ARB. Do you have a seat?

    EDIT:
    If you can do something, then that something should be reasonably fast. And if it's not possible to make something reasonably fast, then the user shouldn't be able to do it at all. That's good API design.
    That is API design that expects that the user does not know what he is doing.
    If uniform blocks with opaques needed to be handled differently, then that could be done.
    Describing that would take one or two extra lines in the spec.

    It could be done as follows: "If a block binding is established, or the contents of a buffer bound to a uniform block are updated, any objects with opaque data types are made effective instantly, not just when a shader reads data from the block. Buffers bound to uniform blocks containing opaque types cannot be bound to <whatever-can-be-modified-by-shaders> at the same time."
    Last edited by hlewin; 01-30-2013 at 02:11 PM.

  7. #17
    Senior Member OpenGL Guru
    Join Date
    May 2009
    Posts
    4,948
    OK, I'm going to say this one more time, and then I'm done:

    There is no such thing as a "buffer bound to a uniform block." Buffers are not bound to uniform blocks. They are bound to the context, to indices in the GL_UNIFORM_BUFFER binding point. Programs are bound to the context. The association between bound buffer objects and program uniform blocks is implicit. Uniform blocks in programs reference an index in the GL_UNIFORM_BUFFER binding point.

    Therefore, the only time OpenGL knows when a buffer object is unquestionably to be used for a specific uniform block is when you render. And never before that point. Therefore:

    1: There is no way to detect when this happens at data upload time. You can't catch opaque indices and convert them into something else at the time the user uploads data to the buffer. So your glBufferData intercept stuff is out.

    2: Detecting this at render time means either storing the buffer in CPU-accessible memory or doing costly GPU-CPU readbacks when you render. Either way, you're losing performance. Guaranteed.

    In short, there is no way to make this anywhere nearly as fast as just using glUniform1i. So what exactly is the point?

    Or, to put it another way, what is the compelling use case for this feature besides "I want to do it?"

    You seem to be quite familiar with the ARB. Do you have a seat?
    No, but I saw what they dropped with GL 3.1. And I've seen what they added. And, generally speaking, the modern ARB doesn't add APIs that can be misused easily.

    That is API design that expects that the user does not know what he is doing.
    No, this API design expects that the user doesn't have magical, unspecified knowledge of what happens to be fast and what happens to be slow.

  8. #18
    Junior Member Regular Contributor
    Join Date
    Nov 2012
    Location
    Bremen, Germany
    Posts
    149
    In short, there is no way to make this anywhere nearly as fast as just using glUniform1i. So what exactly is the point?
    The point is a use case where the performance hit due to a readback would be negligible - one where ease of use and genericity have priority, for example when initializing all state variables of a shader.
    One can define uniform blocks, do a little preprocessing, and get a C structure that can be used to mirror a uniform block.
    Everything works fine until you get to the opaque types. That means one cannot define one data block per shader, mirror its data, and simply upload the whole block whenever the shader/program gets bound. The same goes for the offset alignment requirements of glBindBufferRange. They make things nearly unusable. Consider the following:
    Code :
    uniform VertexShaderVariables {
    //...
    };
    uniform FragmentShaderVariables {
    //...
    };
    Each of those can be mirrored with C structures. But when one then tries to
    Code :
    struct ProgramVariables {
       struct VertexShaderVariables vsVars;
       struct FragmentShaderVariables fsVars;
    };
    one cannot simply create one buffer and use glBindBufferRange to map the sub-structures to uniform blocks, just because of the alignment requirements. That sucks. One could easily live with the fact that readbacks can occur and so on, which would lead to the conclusion that updating particular variables is faster than replacing whole program states. The offset alignment requirement should be hidden from the user. The implementation should split the buffer as needed, should that be necessary. The user should not have to care about such things.
    Were the API to define things that way, maybe the hardware vendors would organize things in a way that reduces any performance hits. The difference between hardware and software is not that big when it comes to adapting to present needs.
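For concreteness, the rounding the current API forces on the user is just an align-up against a queried limit. A minimal sketch, assuming a typical alignment of 256 bytes; real code queries GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT with glGetIntegerv instead of hard-coding it:

```c
#include <assert.h>
#include <stddef.h>

/* Round an offset up to the next multiple of the queried alignment.
 * The alignment is guaranteed to be a power of two, so a mask works. */
static size_t align_up(size_t offset, size_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}
```

With a 256-byte alignment, a 40-byte vertex-stage block forces the fragment-stage block to start at offset 256 rather than 40 - the wasted padding being complained about above.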

  9. #19
    Junior Member Regular Contributor
    Join Date
    Nov 2012
    Location
    Bremen, Germany
    Posts
    149
    Or, to put it another way, what is the compelling use case for this feature besides "I want to do it?"
    I had to look this up. Of course there is none. I'm a user of OpenGL, not an implementor. But were I to decide which API to use for my next projects, I'd take a very close look at D3D as an alternative because of such things - and that despite the fact that I have a Linux background, which means preferring to write things that are easily portable. Of course that decision would be made in half-knowledge, as I guess the pitfalls and limitations would only crop up during implementation. But since D3D is not an open standard like OpenGL, I guess it does not lag behind, because an open spec may only contain common ground - that is, things that all vendors see as "no problem". If that means that directly rendering to program-variable blocks is impossible, I have no problem with that. I don't like extensions from particular vendors, as the functionality is then missing on other hardware. With D3D, hardware not supporting such stuff disappears from the market because things get slow through software emulation. At least I guess so.

  10. #20
    Senior Member OpenGL Guru
    Join Date
    May 2009
    Posts
    4,948
    Everything works fine until you get to the opaque types. That means one cannot define one data block per shader, mirror its data, and simply upload the whole block whenever the shader/program gets bound.
    Then stop pretending that opaque types are uniforms like any others. Don't set them every time a shader is bound. Set them once, during initialization. Leave them set to those values.

    Then you can change all the uniforms you want on a per-object basis without incident. You shouldn't need to be changing texture image units and such.

    That's exactly why we have layout(binding) syntax; so that we can set these things in the shader and not have to ever set them in our code.

    Your mistake is wanting to set these uniforms at all.

    one cannot simply create one buffer and use glBindBufferRange to map the sub-structures to uniform blocks, just because of the alignment requirements.
    Sure you can; you just can't do it that way. You can have each block's data in the same buffer, but you can't do it by putting them all in one struct. You have to manually put them into a buffer.

    Just because you can't do it the way you want doesn't mean it can't be done.

    One could write some generic code that takes an arbitrary boost::tuple of structs and creates or updates a buffer object based on them. C++11 makes this much easier with variadic templates, though Boost.Fusion makes it possible on pre-variadic compilers. It's not too difficult to do; just time-consuming to write.
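In plain C, the same idea minus the template machinery is a staging area with each block copied at an aligned offset. A sketch with made-up block contents and a hard-coded 256-byte alignment (real code queries GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT):

```c
#include <assert.h>
#include <string.h>

/* Illustrative stand-ins for the mirrored uniform-block structs. */
typedef struct { float mvp[16]; } VertexShaderVariables;   /* 64 bytes */
typedef struct { float color[4]; } FragmentShaderVariables;

#define UBO_ALIGN 256  /* stand-in for the queried alignment limit */

static size_t pad(size_t n) {
    return (n + UBO_ALIGN - 1) & ~(size_t)(UBO_ALIGN - 1);
}

/* Copy both blocks into one staging area at aligned offsets. The result
 * would be uploaded with a single glBufferData, then exposed with two
 * glBindBufferRange calls at offsets 0 and pad(sizeof *vs). */
static size_t pack(unsigned char *dst,
                   const VertexShaderVariables *vs,
                   const FragmentShaderVariables *fs) {
    size_t fs_off = pad(sizeof *vs);
    memcpy(dst, vs, sizeof *vs);
    memcpy(dst + fs_off, fs, sizeof *fs);
    return fs_off;  /* offset of the fragment block within the buffer */
}
```

So the data cannot live in one contiguous `struct ProgramVariables`, but one buffer still suffices; the cost is the padding between blocks and the manual copies.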

    The offset alignment requirement should be hidden from the user. The implementation should split the buffer as needed should that be necessary. The user should not have to care about such things.
    The reason the offset alignment is exposed is to ensure maximum performance. It gives implementations the freedom to do things the fastest way possible, which requires imposing upon users that they do things a certain way. What you want makes things slower.
