prideout
10-15-2011, 08:42 PM
Some questions about the barrier() instruction:
(1) It implies that the control shader invocations within a given patch do not necessarily share the same program counter, which goes against my intuition. Given the 32-vertex limit on patch size, I assumed that each patch is processed in a single warp.
(2) The only reason you'd need to synchronize threads is if the shader had read/write access to a shared memory space. The "patch" qualifier can be applied only to "out" variables, not to temporaries. Wouldn't it be useful to apply "patch" to temporaries?
(3) What's the equivalent to barrier() in D3D hull shaders?
(4) When I tried to write a highly efficient control shader that makes use of patch-level shared memory (i.e., patch out) and barrier(), I ran into driver issues with both major vendors. Has anyone out there had any better luck than me?
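A minimal sketch of the pattern described in (4), assuming a 3-vertex triangle patch. The names vPosition, tcPosition, edgeLength and uTargetEdgeLength are hypothetical and chosen only for illustration; this is not the exact shader that hit the driver issues.

```glsl
#version 400 core

// Assumption: 3-vertex patches, one control shader invocation per output vertex.
layout(vertices = 3) out;

in vec3 vPosition[];     // hypothetical per-vertex input from the vertex shader
out vec3 tcPosition[];   // per-vertex output, one slot per invocation

// Per-patch output used as scratch space visible to every invocation.
// Each invocation writes only its own element, so no two invocations
// write different values to the same location.
patch out float edgeLength[3];

uniform float uTargetEdgeLength;  // hypothetical tuning parameter

void main()
{
    int id   = gl_InvocationID;
    int next = (id + 1) % 3;

    tcPosition[id] = vPosition[id];
    edgeLength[id] = distance(vPosition[id], vPosition[next]);

    // barrier() is called from main(), outside any flow control.
    barrier();

    // After the barrier, outputs written by the other invocations can be read.
    if (id == 0) {
        // Outer level i is assumed to control the edge opposite vertex i.
        gl_TessLevelOuter[0] = max(1.0, edgeLength[1] / uTargetEdgeLength);
        gl_TessLevelOuter[1] = max(1.0, edgeLength[2] / uTargetEdgeLength);
        gl_TessLevelOuter[2] = max(1.0, edgeLength[0] / uTargetEdgeLength);
        gl_TessLevelInner[0] = max(gl_TessLevelOuter[0],
                               max(gl_TessLevelOuter[1], gl_TessLevelOuter[2]));
    }
}
```

Here the "patch out" array stands in for the shared memory mentioned in (2): each invocation computes one edge length in parallel, and a single invocation combines them after the barrier.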