Part of the Khronos Group
OpenGL.org


Thread: unbound sampler declaration and GLSL shader code

  1. #21
    Senior Member OpenGL Pro
    Join Date
    Apr 2010
    Location
    Germany
    Posts
    1,128
    Quote Originally Posted by fred_em
    Even if my condition is "if (gl_FragCoord.x > 10000)" which is not a dynamic-uniform (as you said), I don't see any reason why the GPU would choose to evaluate the wrong block of code.
    You got that twisted. Conditional statements which are dynamically uniform and evaluate to false will not trigger execution of the code. Since your condition isn't dynamically uniform, both branches are potentially executed, as GClements pointed out, and if the executed code then triggers undefined behavior which the implementation doesn't handle gracefully, you get into trouble.

    Here's why your condition isn't dynamically uniform:

    - it's not a constant expression, so the trivial case is out
    - gl_FragCoord is not a uniform but a built-in fragment shader input:

    Code :
    in vec4 gl_FragCoord;

    The GLSL Spec states:

    A fragment-shader expression is dynamically uniform if all fragments evaluating it get the same resulting value.
    It should be obvious that this is generally (if ever) not the case for an expression involving gl_FragCoord, even though your particular condition will always evaluate to false.
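
    To make the distinction concrete, here is a minimal fragment shader sketch contrasting the two kinds of condition. The name isBound is taken from the discussion in this thread; tex, uv, and fragColor are hypothetical names chosen for illustration. textureLod() is used only so that no implicit derivatives are needed inside the branches.

    ```glsl
    #version 420 core

    uniform int isBound;      // set by the application, same for the whole draw call
    uniform sampler2D tex;    // hypothetical sampler that may be left unbound

    in vec2 uv;               // hypothetical interpolated texture coordinate
    out vec4 fragColor;

    void main()
    {
        fragColor = vec4(0.0);

        // Dynamically uniform: the condition depends only on a uniform, so
        // every fragment evaluating it gets the same result.
        if (isBound != 0)
            fragColor += textureLod(tex, uv, 0.0);

        // Not dynamically uniform in general: gl_FragCoord is a per-fragment
        // input, so different fragments may disagree on this condition, even
        // if in practice no fragment ever has x > 10000.
        if (gl_FragCoord.x > 10000.0)
            fragColor += textureLod(tex, uv, 0.0);
    }
    ```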
    Last edited by thokra; 06-26-2013 at 09:43 AM.

  2. #22
    Junior Member Regular Contributor
    Join Date
    Jul 2010
    Posts
    132
    Quote Originally Posted by thokra View Post
    You got that twisted. Conditional statements which are dynamically uniform and evaluate to false will not trigger execution of the code.
    I didn't get it twisted. My problem came from the samplers not being bound, regardless of whether I was using dynamically evaluated uniforms or something else (I was in fact using a dynamically evaluated 'isBound' uniform to prove that the problem came from the unbound samplers). That was my original problem and question; let's not forget that.

    Now, the discussion slipped towards GPU execution with your reply:

    Quote Originally Posted by thokra
    The cause is that under certain conditions, all branches of a conditional statement are evaluated, no matter what
    and the one from GClements. I am learning something new, as I didn't know the GPU could execute code in the wrong code path. However, I can't take that for granted just because you say so. Can you prove it to me, or explain the rationale for why the GPU would execute the wrong code path? Especially since it doesn't have a branch prediction unit?

    Quote Originally Posted by thokra View Post
    ..., both branches are potentially executed, as GClements pointed out, and if the executed code then triggers undefined behavior which the implementation doesn't handle gracefully, you get into trouble.
    How can the GPU execute both branches? How is this possible?

    From http://docs.nvidia.com/cuda/pdf/ptx_isa_3.1.pdf:

    "If threads of a warp diverge via a data-dependent conditional branch, the warp serially executes each branch path taken, disabling threads that are not on that path, and when all paths complete, the threads converge back to the same execution path"

    As I understand it, the code is not executed when it shouldn't be. Fortunately not! imageStore() calls had better not be executed!
    Last edited by fred_em; 06-27-2013 at 12:10 AM.

  3. #23
    Member Regular Contributor malexander's Avatar
    Join Date
    Aug 2009
    Location
    Ontario
    Posts
    319
    The issue is that OpenGL just defines the GLSL specification, and not the underlying implementation. So it's quite possible that leaving a sampler unbound will work on some platforms and not others. Even driver updates could potentially change the behaviour of your program if the compiler changes significantly enough (I have had this happen in other cases).

    For example, one implementation might recompile the shader to optimize out uniform conditionals. Another might serialize the cases, and a third might run both cases for all pixels/vertices but only commit results for the ones that passed the conditional test. All are valid compiler strategies. You just don't know the exact details of how the shader is compiled and executed, so it's possible that bad things might happen if you leave a sampler unbound.
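
    The first strategy can also be applied by hand, which sidesteps the question entirely. A sketch, assuming a hypothetical HAS_TEXTURE macro that the application defines right after the #version line before compiling each variant; tex, uv, and fragColor are likewise illustrative names:

    ```glsl
    #version 420 core

    // HAS_TEXTURE is a hypothetical macro; the application would compile one
    // program with it set to 1 and another with it set to 0.
    #ifndef HAS_TEXTURE
    #define HAS_TEXTURE 0
    #endif

    #if HAS_TEXTURE
    uniform sampler2D tex;
    #endif

    in vec2 uv;
    out vec4 fragColor;

    void main()
    {
    #if HAS_TEXTURE
        // Variant used only when a texture is actually bound.
        fragColor = texture(tex, uv);
    #else
        // Variant without any sampler access at all.
        fragColor = vec4(1.0);
    #endif
    }
    ```

    With HAS_TEXTURE set to 0 the sampler is not even declared, so nothing in the compiled program can fetch from an unbound texture unit, whichever strategy the driver uses for branches.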

  4. #24
    Member Regular Contributor
    Join Date
    Jun 2013
    Posts
    491
    Quote Originally Posted by fred_em View Post
    I am learning something new, as I didn't know the GPU could execute code in the wrong code path. However, I can't take that for granted just because you say so. Can you prove it to me, or explain the rationale for why the GPU would execute the wrong code path? Especially since it doesn't have a branch prediction unit?
    The rationale is that GPUs typically use a "Single Instruction, Multiple Data" (SIMD) architecture (similar to e.g. MMX or SSE, or std::valarray in C++). Each (non-uniform) variable in a shader is actually an array of (typically 32 or 64) such variables. Each operation (addition, multiplication, etc) is performed on entire arrays, element-wise. GLSL models this as multiple shader invocations running in parallel.

    Any condition may be true for some elements and false for others. To implement branching, the GPU evaluates both branches but any side-effects (e.g. assignment, imageStore()) are limited to elements for which the condition is true.

    Quote Originally Posted by fred_em View Post
    How can the GPU execute both branches? How is this possible?

    From http://docs.nvidia.com/cuda/pdf/ptx_isa_3.1.pdf:

    "If threads of a warp diverge via a data-dependent conditional branch, the warp serially executes each branch path taken, disabling threads that are not on that path, and when all paths complete, the threads converge back to the same execution path"

    As I understand it, the code is not executed when it shouldn't be. Fortunately not! imageStore() calls had better not be executed!
    Both branches are executed. "Threads" on the wrong path are "disabled", i.e. side-effects (such as assignments or imageStore()s) are ignored. However, while stores are side-effects, loads aren't. In the wrong branch, the result of any load will ultimately be ignored. But if the load triggers an error, that may propagate for the remainder of the shader's execution.
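
    As a rough illustration of that load/store asymmetry, the sketch below writes out by hand what a predicating implementation might effectively run for a branch containing a fetch and an imageStore(). This is not what any particular driver generates; isBound comes from this thread, while tex, outImage, uv, and laneEnabled are hypothetical names.

    ```glsl
    #version 420 core

    layout(rgba8) uniform image2D outImage;   // hypothetical output image
    uniform sampler2D tex;                    // hypothetical, possibly unbound
    uniform int isBound;

    in vec2 uv;
    out vec4 fragColor;

    void main()
    {
        // What the author wrote:
        //
        //     if (isBound != 0) {
        //         vec4 c = textureLod(tex, uv, 0.0);
        //         imageStore(outImage, ivec2(gl_FragCoord.xy), c);
        //     }
        //
        // What a predicating implementation might effectively execute:
        bool laneEnabled = (isBound != 0);

        // The load has no side effects, so it may be issued for every
        // invocation, including those for which the condition is false.
        vec4 c = textureLod(tex, uv, 0.0);

        // The store is a side effect, so it is masked per invocation.
        if (laneEnabled)
            imageStore(outImage, ivec2(gl_FragCoord.xy), c);

        fragColor = vec4(0.0);
    }
    ```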

    The problem is that a condition which involves only uniforms shouldn't be treated as "data-dependent". However, older (GLSL 3.x) hardware treats all conditions in the same manner, data-dependent or not. Newer (GLSL 4.x) hardware can perform a "real" branch in cases where a condition has the same value for all elements.

  5. #25
    Junior Member Regular Contributor
    Join Date
    Jul 2010
    Posts
    132
    Quote Originally Posted by GClements View Post
    Both branches are executed.
    How? My isBound uniform value is invariably 0, under all circumstances. No thread is trying to access the wrong branch.

    Quote Originally Posted by GClements View Post
    "Threads" on the wrong path are "disabled"
    *Which ones*? What are these threads you are talking about? Again, my isBound uniform value is invariably 0, under all circumstances.

    If I have 30 fragments to process in a triangle, the GPU will create 1 warp of 32 threads and block off the last two threads (32-2=30 fragments to process). The last two threads will still see that isBound=0 and as a result, they won't take the wrong path.

    Quote Originally Posted by GClements View Post
    , i.e. side-effects (such as assignments or imageStore()s) are ignored. However, while stores are side-effects, loads aren't. In the wrong branch, the result of any load will ultimately be ignored. But if the load triggers an error, that may propagate for the remainder of the shader's execution.

    The problem is that a condition which involves only uniforms shouldn't be treated as "data-dependent". However, older (GLSL 3.x) hardware treats all conditions in the same manner, data-dependent or not. Newer (GLSL 4.x) hardware can perform a "real" branch in cases where a condition has the same value for all elements.
    I got that part earlier, believe me. But that does not answer the two questions, above.
    Last edited by fred_em; 06-27-2013 at 10:45 AM.

  6. #26
    Advanced Member Frequent Contributor
    Join Date
    Apr 2010
    Posts
    754
    Quote Originally Posted by fred_em
    How? My isBound uniform value is invariably 0, under all circumstances. No thread is trying to access the wrong branch.
    Yes, but the hardware simply executes both branches and disables side effects on the not-taken one. As far as I understand it, the reason is that there is only a single instruction pointer for a whole warp, so individual threads cannot have their own control flow. Now, in your case something could detect that all threads in the warp actually branch the same way and that there really is no need to execute the not-taken branch at all, but that is purely an optimization, not a correctness issue (at least for the case where both branches are error free).

    On a more practical note: Unless a high profile game developer runs into the same problem, I would not hold my breath for vendors to fix it, even if it is a legitimate bug.

  7. #27
    Junior Member Regular Contributor
    Join Date
    Jul 2010
    Posts
    132
    Quote Originally Posted by carsten neumann View Post
    Yes, but the hardware simply executes both branches and disables side effects on the not-taken one. As far as I understand it, the reason is that there is only a single instruction pointer for a whole warp, so individual threads cannot have their own control flow. Now, in your case something could detect that all threads in the warp actually branch the same way and that there really is no need to execute the not-taken branch at all, but that is purely an optimization, not a correctness issue (at least for the case where both branches are error free).
    Given that all threads always branch the same way, how come the GPU tries to execute both branches?

    You know what... don't feel obliged to answer here; I am on the brink of concluding that I am too tired. Also, the bold words are not there to annoy anybody, really.

    Quote Originally Posted by carsten neumann View Post
    On a more practical note: Unless a high profile game developer runs into the same problem, I would not hold my breath for vendors to fix it, even if it is a legitimate bug.
    If samplers must be bound at all times I am OK with it. I wouldn't call it a bug, just a lack of precision in the spec.
    Last edited by fred_em; 06-28-2013 at 08:16 AM.

  8. #28
    Senior Member OpenGL Pro
    Join Date
    Jan 2012
    Location
    Australia
    Posts
    1,117
    Quote Originally Posted by fred_em
    If samplers must be bound at all times I am OK with it.

    This thread started with a shader freeze which you thought was caused by an inactive path that would access an unbound sampler if executed.

    I run lots of shaders with unattached samplers when I know that the shader logic will not try to fetch from the sampler. The shaders do not freeze. Also, even if I do access the sampler, I just get junk colours.
    This is on both nVidia and AMD drivers.
