Name INTEL_fragment_shader_ordering Name Strings GL_INTEL_fragment_shader_ordering Contact Slawomir Grajewski, Intel (slawomir.grajewski 'at' intel.com) Contributors Tim Foley, Intel Brent Insko, Intel Tomasz Janczak, Intel Marco Salvi, Intel Larry Seiler, Intel Tomasz Poniecki, Intel Status Draft. Version Last Modified Date: 08/29/2013 INTEL Revision: 3 Number OpenGL Extension #441 Dependencies This extension is written against the OpenGL 4.4 (Core Profile) Specification. This extension is written against version 4.40 (revision 6) of the OpenGL Shading Language Specification. OpenGL 4.2 or ARB_shader_image_load_store is required;GLSL 4.20 is required. Overview Graphics devices may execute in parallel fragment shaders referring to the same window xy coordinates. Framebuffer writes are guaranteed to be processed in primitive rasterization order, but there is no order guarantee for other instructions and image or buffer object accesses in particular. The extension introduces a new GLSL built-in function, beginFragmentShaderOrderingINTEL(), which blocks execution of a fragment shader invocation until invocations from previous primitives that map to the same xy window coordinates (and same sample when per-sample shading is active) complete their execution. All memory transactions from previous fragment shader invocations are made visible to the fragment shader invocation that called beginFragmentShaderOrderingINTEL() when the function returns. Including the following line in a shader can be used to control the language features described in this extension: #extension GL_INTEL_fragment_shader_ordering : where is as specified in section 3.3. New preprocessor #defines are added to the OpenGL Shading Language: #define GL_INTEL_fragment_shader_ordering 1 New Procedures and Functions None. New Tokens None. Additions to the OpenGL 4.4 (Core Profile) Specification Modify section 2.11.13 Shader Memory Access (add new text in last bullet on p. 143) The exception is when the built-in function beginFragmentShaderOrderingINTEL() is used. This function creates a region in a fragment shader in which memory operations complete and are made visible in rasterization order for fragments that overlap in xy window coordinates. (add new paragraph after second paragraph on p. 144) The built-in function beginFragmentShaderOrderingINTEL() may be used to provide ordering of reads and writes performed by fragment shader invocations that overlap in xy window coordinates. Calling beginFragmentShaderOrderingINTEL() guarantees that any memory transactions issued by shader invocations from previous primitives, mapped to same xy window coordinates (and same sample when per-sample shading is active), complete and are visible to the shader invocation that called beginFragmentShaderOrderingINTEL(). Modifications to the OpenGL Shading Language Specification, Version 4.40 (revision 6) Modify Section 8.17 Shader Memory Control Functions (add to the table, p. 178) Syntax ------ void beginFragmentShaderOrderingINTEL() This function is available only in fragment shaders. Description ----------- The beginFragmentShaderOrderingINTEL() function blocks fragment shader execution until completion of all shader invocations from previous primitives that map to the same window xy coordinates (and same sample if appropriate). All memory transactions from previous fragment shader invocations mapped to same xy window coordinates are made visible to the current fragment shader invocation when this function returns. When per-fragment shading is active, the ordering guarantee applies regardless of whether previous and current invocations cover overlapping or disjoint sets of samples under same fragment of window xy coordinates. When per-sample shading is active, the ordering guarantee applies to shader invocations mapped to the same sample number under same window xy coordinates. This function has no effect on shader execution for fragments with non-overlapping window xy coordinates. The beginFragmentShaderOrderingINTEL() function can be called under control flow, including non-uniform control flow. Synchronization and ordering guarantees are valid only if the conditional branch is taken during execution. It is valid for a fragment shader to dynamically execute multiple calls to beginFragmentShaderOrderingINTEL(). In such cases, the first executed call will block as described above, but subsequent calls will never block. Note there is no explicit built-in function to signal the end of the region that should be ordered. Instead, the region that will be ordered logically extends to the end of fragment shader execution. GLSL code example ----------------- layout(binding = 0, rgba8) uniform image2D image; vec4 main() { ... compute output color if (color.w > 0) // potential non-uniform control flow { beginFragmentShaderOrderingINTEL(); ... read/modify/write image // ordered access guaranteed } ... no ordering guarantees (as varying branch might not be taken) beginFragmentShaderOrderingINTEL(); ... update image again // ordered access guaranteed } Errors None. New State None. New Implementation Dependent State None. Issues None. Revision History Rev. Date Author Changes ---- -------- -------- ----------------------------------------- 1 07/30/13 S.Grajewski Initial revision. 2 08/26/13 T.Janczak removed per-image synchronization 3 08/29/13 T.Janczak incorporate internal review feedback