Name

    NV_shader_storage_buffer_object

Name Strings

    GL_NV_shader_storage_buffer_object

Contact

    Pat Brown, NVIDIA Corporation (pbrown 'at' nvidia.com)

Status

    Complete

Version

    Last Modified Date:         February 11, 2014
    NVIDIA Revision:            2

Number

    422

Dependencies

    OpenGL 4.0 (Core or Compatibility Profile) is required.

    This extension is written against the OpenGL 4.2 Specification
    (Compatibility Profile).

    NV_gpu_program4 and NV_gpu_program5 are required.

    ARB_shader_storage_buffer_object is required.

    This specification interacts with NV_shader_atomic_float.

Overview

    This extension provides assembly language support for shader storage
    buffers (from the ARB_shader_storage_buffer_object extension) for all
    program types supported by NV_gpu_program5, including compute programs
    added by the NV_compute_program5 extension.

    Assembly programs using this extension can read and write to the memory
    of buffer objects bound to the binding points provided by
    ARB_shader_storage_buffer_object.

New Procedures and Functions

    None.

New Tokens

    None.

Additions to Chapter 2 of the OpenGL 4.2 (Compatibility Profile)
Specification (OpenGL Operation)

    (All modifications are relative to Section 2.X, GPU Programs, from the
    NV_gpu_program4 specification, as modified by NV_gpu_program5.)

    Modify Section 2.X.2, Program Grammar

    (add the following grammar rules to the NV_gpu_program5 base grammar for
    shader storage buffers)

      ::= | |
      ::= "ATOMB" "," ","
      ::= "LDB" ","
      ::= "STB" ","
      ::=
      ::= "STORAGE" | "STORAGE"
      ::= "="
      ::= "=" "{" "}"
      ::= | ","
      ::=
      ::=
      ::= "program" "." "storagelen" | "program" "." "storagelen"
      ::= "program" "." "storagelen"
      ::= |
      ::=
      ::= | |
      ::= "program" "." "storage" | "program" "." "storage"

    Modify Section 2.X.3.3, Program Parameters

    Shader Storage Buffer Property Bindings

      Binding                    Components  Underlying State
      -------------------------  ----------  -------------------------------
      program.storagelen[a]      (x,-,-,-)   program storage buffer a,
                                             variable-size array length
      program.storagelen[a..b]   (x,-,-,-)   program storage buffer a..b,
                                             variable-size array length

    If a program parameter binding matches "program.storagelen[a]", the "x"
    component of the program parameter variable is filled with the unsized
    array length associated with the shader storage buffer binding point
    <a>.  If the program is generated internally by the implementation when
    compiling a GLSL shader, this length is the number of elements that can
    be stored in an unsized array at the end of the associated GLSL shader
    storage block.  If there is no shader block associated with shader
    storage buffer binding point <a>, or if the associated block does not
    end with an unsized array, the "x" component will hold the integer value
    zero.  If the program is an assembly program specified via
    ProgramStringARB or LoadProgramNV, the "x" component will hold the
    integer value zero.  The "y", "z", and "w" components of the variable
    are undefined.

    Additionally, for program parameter array bindings,
    "program.storagelen[a..b]" is equivalent to specifying unsized array
    lengths for storage buffer binding points <a> through <b>, in order.  A
    program using any of these bindings will fail to load if <a> is greater
    than <b>.

    Add New Section 2.X.3.Y, Shader Storage Buffers, after Section 2.X.3.6,
    Program Parameter Buffers

    Shader storage buffers are arrays of basic machine units from which data
    can be read or written using the LDB and STB instructions.  Shader
    storage buffers also support atomic memory operations using the ATOMB
    instruction.  The GL provides an implementation-dependent number of
    shader storage buffer binding points to which buffer objects can be
    attached.

    Shader storage buffer contents can be changed by updating the contents
    of a bound buffer object, by changing the buffer object attached to a
    binding point, or by using the ATOMB or STB instructions in a shader to
    modify the contents of buffer object memory.

    Shader storage buffer bindings are established by calling
    BindBufferBase, BindBufferOffsetEXT, or BindBufferRange with a target of
    SHADER_STORAGE_BUFFER, as documented in the
    ARB_shader_storage_buffer_object extension.  The number of shader
    storage buffer binding points is given by the value of the constant
    MAX_SHADER_STORAGE_BUFFER_BINDINGS.  There is a limit on the maximum
    number of basic machine units in a buffer object that can be accessed
    using any single shader storage buffer binding point, given by the
    implementation-dependent constant MAX_SHADER_STORAGE_BLOCK_SIZE.  Buffer
    objects larger than this size may be used, but the results of accessing
    portions of the buffer object beyond the limit are undefined.

    Shader storage buffer variables may only be used as operands in the
    ATOMB, LDB, and STB instructions; they may not be used as results or
    operands in general instructions.  Shader storage buffer variables must
    be declared explicitly via the grammar rule for the "STORAGE" statement.
    Shader storage buffer bindings cannot be used directly in executable
    instructions.  Shader storage buffer variables may be declared as
    arrays, but all bindings assigned to the array must use the same binding
    point(s) and must increase consecutively.

    In explicit shader storage variable declarations, the bindings in Table
    X.2 starting with "program.storage[a..b]" may be used, indicating that
    the variable spans multiple buffer binding points.  Such variables must
    be accessed as arrays, with the first index specifying an offset into
    the range of buffer object binding points.  A buffer index of zero
    identifies binding point <a>; an index of <b>-<a>-1 identifies binding
    point <b>.  If such a variable is declared as an array, a second index
    must be provided to identify the individual array element.  A program
    will fail to compile if such bindings are used when <a> or <b> is
    negative or greater than or equal to the number of buffer binding points
    supported for the program type, or if <a> is greater than <b>.

      Binding                        Components  Underlying State
      -----------------------------  ----------  --------------------------
      program.storage[a][c]          (x,x,x,x)   shader storage buffer a,
                                                 element c
      program.storage[a][c..d]       (x,x,x,x)   shader storage buffer a,
                                                 elements c through d
      program.storage[a]             (x,x,x,x)   shader storage buffer a,
                                                 all elements
      program.storage[a..b][c]       (x,x,x,x)   shader storage buffers a
                                                 through b, element c
      program.storage[a..b][c..d]    (x,x,x,x)   shader storage buffers a
                                                 through b, elements c
                                                 through d
      program.storage[a..b]          (x,x,x,x)   shader storage buffers a
                                                 through b, all elements

      Table X.2:  Shader Storage Bindings.  <a> and <b> indicate buffer
      numbers, <c> and <d> indicate individual elements.

    If a shader storage buffer binding matches "program.storage[a][c]", the
    shader storage buffer variable is associated with element <c> of the
    buffer object bound to binding point <a>.  If no buffer object is bound
    to binding point <a>, or the bound buffer object is not large enough to
    hold an element <c>, reads from the binding return zero and writes to
    the binding have no effect.  The binding point <a> must be a nonnegative
    integer constant.

    For shader storage array declarations, "program.storage[a][c..d]" is
    equivalent to specifying elements <c> through <d> of the buffer object
    bound to binding point <a>, in order.

    For shader storage array declarations, "program.storage[a]" is
    equivalent to specifying the entire buffer -- elements 0 through <n>-1,
    where <n> is either the size of the array (if declared) or the
    implementation-dependent maximum shader storage buffer object size limit
    (if no size is declared).
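
The out-of-range rule for "program.storage[a][c]" bindings (reads return zero, writes have no effect) can be modeled on the CPU. The following is only a sketch under stated assumptions: the StorageBinding type and the storage_load and storage_store helpers are hypothetical names that are not part of this extension or of any GL API, and elements are assumed to be 32-bit unsigned integers purely for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one shader storage buffer binding point.
 * data == NULL stands for "no buffer object is bound". */
typedef struct {
    uint32_t *data;   /* backing store of the bound buffer object */
    size_t    count;  /* number of 32-bit elements (assumed element size) */
} StorageBinding;

/* Read element c: returns zero if nothing is bound or c is out of range. */
static uint32_t storage_load(const StorageBinding *binding, size_t c)
{
    if (binding->data == NULL || c >= binding->count) {
        return 0;
    }
    return binding->data[c];
}

/* Write element c: silently ignored if nothing is bound or c is out of
 * range, so memory outside the buffer is never touched. */
static void storage_store(StorageBinding *binding, size_t c, uint32_t value)
{
    if (binding->data == NULL || c >= binding->count) {
        return;
    }
    binding->data[c] = value;
}
```

In this model, loading element 4 of a four-element buffer yields zero, and storing to it leaves the buffer unchanged.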

    When bindings beginning with "program.storage[a..b]" are used in a
    variable declaration, they behave identically to the corresponding
    bindings beginning with "program.storage[a]", except that the variable
    is filled with a separate set of values for each buffer binding point
    from <a> to <b> inclusive.

    Modify Section 2.X.4, Program Execution Environment

    (add to the opcode table)

                  Modifiers
      Instruction F I C S H D  Out  Inputs    Description
      ----------- - - - - - -  ---  --------  ------------------------------
      ATOMB       - - X - - -  s    v,su      atomic transaction to storage
                                              buffer
      LDB         - - X X - F  v    su        load from storage buffer
      STB         - - - - - -  -    v,su      store to storage buffer

    Modify Section 2.X.4.5, Program Memory Access, from NV_gpu_program5

    (modify first paragraph) Programs may load from or store to buffer
    object memory via the ATOM (atomic global memory operation), ATOMB
    (atomic storage buffer memory operation), LDB (load from storage
    buffer), LDC (load constant), LOAD (global load), STB (store to storage
    buffer), and STORE (global store) instructions.

    (Add to "Section 2.X.6, Program Options" of the NV_gpu_program4
    extension, as extended by NV_gpu_program5:)

    + Shader Storage Buffer Operations (NV_shader_storage_buffer)

    If a program (of any type, including compute programs) specifies the
    "NV_shader_storage_buffer" option, it may use the "ATOMB", "LDB", and
    "STB" opcodes to perform atomic memory operations, loads, and stores to
    shader storage buffers, and "STORAGE" to declare shader storage buffer
    variables.  If the option is not specified, a program will fail to load
    if it contains "ATOMB", "LDB", or "STB" opcodes, or "STORAGE"
    declarations.

    (add the following subsection to section 2.X.8, Program Instruction
    Set.)

    Section 2.X.8.Z, ATOMB:  Atomic Memory Operation (Storage Buffer Memory)

    The ATOMB instruction performs an atomic memory operation by reading
    from shader storage buffer memory specified by the second operand,
    computing a new value based on the value read from memory and the first
    (vector) operand, and then writing the result back to the same memory
    address.  The memory transaction is atomic, guaranteeing that no other
    write to the memory accessed will occur between the time it is read and
    written by the ATOMB instruction.  The result of the ATOMB instruction
    is the scalar value read from memory.  The second operand used for the
    ATOMB instruction must correspond to a shader storage variable declared
    using the "STORAGE" statement; a program will fail to load if any other
    type of operand is used for the second operand of an ATOMB instruction.

    The ATOMB instruction has two required instruction modifiers.  The
    atomic modifier specifies the type of operation to be performed.  The
    storage modifier specifies the size and data type of the operand read
    from memory and the base data type of the operation used to compute the
    value to be written to memory.

      atomic     storage
      modifier   modifiers            operation
      --------   ------------------   --------------------------------------
      ADD        U32, S32, U64, F32   compute a sum
      MIN        U32, S32             compute minimum
      MAX        U32, S32             compute maximum
      IWRAP      U32                  increment memory, wrapping at operand
      DWRAP      U32                  decrement memory, wrapping at operand
      AND        U32, S32             compute bit-wise AND
      OR         U32, S32             compute bit-wise OR
      XOR        U32, S32             compute bit-wise XOR
      EXCH       U32, S32, U64, F32   exchange memory with operand
      CSWAP      U32, S32, U64        compare-and-swap

      Table X.Y, Supported atomic and storage modifiers for the ATOMB
      instruction.

    Not all storage modifiers are supported by ATOMB, and the set of
    modifiers allowed for any given instruction depends on the atomic
    modifier specified.  Table X.Y enumerates the set of atomic modifiers
    supported by the ATOMB instruction, and the storage modifiers allowed
    for each.

      tmp0 = VectorLoad(op0);
      result = BufferMemoryLoad(op1, storageModifier);
      switch (atomicModifier) {
      case ADD:
        writeval = tmp0.x + result;
        break;
      case MIN:
        writeval = min(tmp0.x, result);
        break;
      case MAX:
        writeval = max(tmp0.x, result);
        break;
      case IWRAP:
        writeval = (result >= tmp0.x) ? 0 : result+1;
        break;
      case DWRAP:
        writeval = (result == 0 || result > tmp0.x) ? tmp0.x : result-1;
        break;
      case AND:
        writeval = tmp0.x & result;
        break;
      case OR:
        writeval = tmp0.x | result;
        break;
      case XOR:
        writeval = tmp0.x ^ result;
        break;
      case EXCH:
        writeval = tmp0.x;  // exchange memory with operand
        break;
      case CSWAP:
        if (result == tmp0.x) {
          writeval = tmp0.y;
        } else {
          return result;  // no memory store
        }
        break;
      }
      BufferMemoryStore(op1, writeval, storageModifier);

    ATOMB performs a scalar atomic operation.  The <y>, <z>, and <w>
    components of the result vector are undefined.

    ATOMB supports no base data type modifiers, but requires exactly one
    storage modifier.  The base data types of the result vector and of the
    first (vector) operand are derived from the storage modifier.

    Section 2.X.8.Z, LDB:  Load from Storage Buffer Memory

    The LDB instruction generates a result vector by fetching data from
    shader storage buffer memory identified by the first operand, as
    described in Section 2.X.4.5.  The single operand for the LDB
    instruction must correspond to a shader storage buffer variable declared
    using the "STORAGE" statement; a program will fail to load if any other
    type of operand is used in an LDB instruction.

      result = BufferMemoryLoad(op0, storageModifier);

    LDB supports no base data type modifiers, but requires exactly one
    storage modifier.  The base data type of the result vector is derived
    from the storage modifier.
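
The ATOMB write-back rules given in the pseudocode earlier in this section can be exercised on the CPU with a small single-threaded model. This is a hedged sketch, not the implementation: it covers only the U32 storage modifier, it does not model atomicity, and the AtomicOp enum and atomb_u32 function are hypothetical names introduced for illustration. The EXCH case writes the operand to memory, which is an assumption based on the "exchange memory with operand" entry in Table X.Y.

```c
#include <stdint.h>

/* Hypothetical enum mirroring the atomic modifiers listed in Table X.Y. */
typedef enum {
    OP_ADD, OP_MIN, OP_MAX, OP_IWRAP, OP_DWRAP,
    OP_AND, OP_OR, OP_XOR, OP_EXCH, OP_CSWAP
} AtomicOp;

/* Single-threaded model of ATOMB with the U32 storage modifier.
 * mem is the addressed storage buffer word, x and y stand in for the
 * .x and .y components of the vector operand.  Returns the value read
 * from memory, as the instruction's scalar result. */
static uint32_t atomb_u32(uint32_t *mem, AtomicOp op, uint32_t x, uint32_t y)
{
    uint32_t result = *mem;      /* value read from memory */
    uint32_t writeval = result;  /* overwritten by every case below */

    switch (op) {
    case OP_ADD:   writeval = x + result;                 break;
    case OP_MIN:   writeval = (x < result) ? x : result;  break;
    case OP_MAX:   writeval = (x > result) ? x : result;  break;
    case OP_IWRAP: writeval = (result >= x) ? 0 : result + 1;  break;
    case OP_DWRAP:
        writeval = (result == 0 || result > x) ? x : result - 1;
        break;
    case OP_AND:   writeval = x & result;  break;
    case OP_OR:    writeval = x | result;  break;
    case OP_XOR:   writeval = x ^ result;  break;
    case OP_EXCH:  writeval = x;           break;  /* assumed, see lead-in */
    case OP_CSWAP:
        if (result != x) {
            return result;       /* comparison failed: no memory store */
        }
        writeval = y;
        break;
    }

    *mem = writeval;
    return result;
}
```

For instance, with a memory word holding 3, atomb_u32(&word, OP_IWRAP, 3, 0) returns 3 and leaves 0 in memory, because the counter wraps once it reaches the operand.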

    Section 2.X.8.Z, STB:  Store to Storage Buffer Memory

    The STB instruction writes the contents of the first vector operand to
    shader storage buffer memory identified by the second operand, as
    described in Section 2.X.4.5.  This instruction generates no result.
    The second operand for the STB instruction must correspond to a shader
    storage buffer variable declared using the "STORAGE" statement; a
    program will fail to load if any other type of operand is used in an STB
    instruction.

      tmp0 = VectorLoad(op0);
      BufferMemoryStore(op1, tmp0, storageModifier);

    STB supports no base data type modifiers, but requires exactly one
    storage modifier.  The base data type of the vector components of the
    first operand is derived from the storage modifier.

Additions to Chapter 3 of the OpenGL 4.2 (Compatibility Profile)
Specification (Rasterization)

    None.

Additions to Chapter 4 of the OpenGL 4.2 (Compatibility Profile)
Specification (Per-Fragment Operations and the Frame Buffer)

    None.

Additions to Chapter 5 of the OpenGL 4.2 (Compatibility Profile)
Specification (Special Functions)

    None.

Additions to Chapter 6 of the OpenGL 4.2 (Compatibility Profile)
Specification (State and State Requests)

    None.

Additions to the AGL/GLX/WGL Specifications

    None.

GLX Protocol

    None.

Dependencies on NV_shader_atomic_float

    If NV_shader_atomic_float is not supported, the ADD and EXCH atomic
    operations in the ATOMB instruction do not support the "F32" storage
    modifier.

Errors

    None.

New State

    None, but shares buffer bindings and other state with the
    ARB_shader_storage_buffer_object extension.

New Implementation Dependent State

    None, but shares implementation-dependent state with the
    ARB_shader_storage_buffer_object extension.

Issues

    None.

Revision History

    Revision 2, February 11, 2014 (pbrown)
      - Fix typos in the descriptions of the "program.storage" bindings.

    Revision 1, May 9, 2012 (pbrown)
      - Internal spec development.