Difference between revisions of "Image Load Store"

From OpenGL.org
(Memory coherency: First pass at memory coherency.)
(Image variables)
(5 intermediate revisions by the same user not shown)
  
 
'''Image load/store''' is the ability of [[Shader]]s to more-or-less arbitrarily read from and write to images.
 
 +
 +
{{stub}}
  
 
== Overview ==
 
 
This can allow for a number of powerful features, including relatively cheap order-independent transparency.
 
  
If you think that this is a great feature, remember that there is no such thing as a free lunch. The cost of using image load/store is in [[#Memory coherency|user-specified memory coherency.]] By using image load/store, you take up the responsibility to manage what OpenGL would manage for you using regular texture reads/[[Framebuffer Object|FBO]] writes.
+
If you think that this is a great feature, remember that there is no such thing as a free lunch. The cost of using image load/store is that all of its write operations are [[Memory Model#Incoherent memory access|not automatically coherent]]. By using image load/store, you take on the responsibility for managing what OpenGL would otherwise manage for you with regular texture reads/[[Framebuffer Object|FBO]] writes.
  
 
== Image variables ==
 
  
=== Formats and compatibility ===
+
Image variables are variables that have one of the following {{code|image}} types. The image types are based on the type of the source [[Texture]] for the image. Not all texture types have a separate image type. Image variables must be declared with the [[Uniform (GLSL)|{{code|uniform}} storage qualifier]].
  
== Basic load store ==
+
Like samplers, image variables represent either floating-point, signed integer, or unsigned integer [[Image Format]]s. The prefix used for the image variable name denotes which, using standard GLSL conventions. No prefix means floating-point, a prefix of {{code|i}} means signed integer, and {{code|u}} means unsigned integer.
  
== Atomic operations ==
+
For the sake of clarity, when you see a ''g'' preceding "image" in an image name, it represents any of the 3 possible prefixes. The image variables are:
  
== Memory coherency ==
+
{| class="wikitable"
 +
! Image Type
 +
! Corresponding Texture Type
 +
|-
 +
| {{code|''g''image1D}}
 +
| {{enum|GL_TEXTURE_1D}}
 +
|-
 +
| {{code|''g''image2D}}
 +
| {{enum|GL_TEXTURE_2D}}
 +
|-
 +
| {{code|''g''image3D}}
 +
| {{enum|GL_TEXTURE_3D}}
 +
|-
 +
| {{code|''g''imageCube}}
 +
| [[Cubemap Texture|{{enum|GL_TEXTURE_CUBE_MAP}}]]
 +
|-
 +
| {{code|''g''image2DRect}}
 +
| [[Rectangle Texture|{{enum|GL_TEXTURE_RECTANGLE}}]]
 +
|-
 +
| {{code|''g''image1DArray}}
 +
| [[Array Texture|{{enum|GL_TEXTURE_1D_ARRAY}}]]
 +
|-
 +
| {{code|''g''image2DArray}}
 +
| [[Array Texture|{{enum|GL_TEXTURE_2D_ARRAY}}]]
 +
|-
 +
| {{code|''g''imageCubeArray}}
 +
| [[Cubemap Texture#Cubemap array textures|{{enum|GL_TEXTURE_CUBE_MAP_ARRAY}}]] (requires GL 4.0 or {{extref|texture_cube_map_array}})
 +
|-
 +
| {{code|''g''imageBuffer}}
 +
| [[Buffer Texture|{{enum|GL_TEXTURE_BUFFER}}]]
 +
|-
 +
| {{code|''g''image2DMS}}
 +
| [[Multisample Texture|{{enum|GL_TEXTURE_2D_MULTISAMPLE}}]]
 +
|-
 +
| {{code|''g''image2DMSArray}}
 +
| {{enum|GL_TEXTURE_2D_MULTISAMPLE_ARRAY}}
 +
|-
 +
|}
 +
There are no "shadow" variants.
  
Rendering is a [[Synchronization|very asynchronous process, but the OpenGL specification defines all of these operations to happen sequentially]]. Therefore, hardware and drivers jump through a few hoops in order to ensure sanity for the user.
+
=== Memory qualifiers ===
  
For example, uploading data to an image will often be deferred to whenever the driver wants to get around to it. If you then render with that texture, the rendering operation itself will be deferred to when the texture upload is done. Similarly, if you write to images bound to a [[Framebuffer Object]] in one rendering operation, you can immediately issue a rendering operation that reads from those images (and writes to some ''other'' images, of course). The OpenGL implementation will automatically delay the second rendering command until the first has completed and flush internal caches so that texture reads will see the written data properly. And so forth.
+
Image variables can be declared with a [[Type_Qualifier_(GLSL)#Memory_qualifiers|number of qualifiers]] that have different meanings for how the variable is accessed.
  
This is brought up here because, by using image load/store, you are ''signing away your right to all of this.'' You must now manage this all yourself.
+
{{memory qualifiers}}
  
When you write to an image via an image store operation, any subsequent reads from almost anywhere are ''not'' guaranteed to see them. And by "almost anywhere", this includes:
+
== Formats ==
  
* Reads from the texture via {{apifunc|glGetTexImage}}, or if it is bound to an FBO, {{apifunc|glReadPixels}}.
+
Image variables can have a format [[Type_Qualifier_(GLSL)#Layout_qualifiers|layout qualifier]]. It defines the image format used for reading from that variable. Image variables used for reads or atomic operations ''must'' have a format qualifier; only variables declared with {{code|writeonly}} may omit it.
* Texture reads via [[Sampler (GLSL)|samplers]].
+
* If the image was a buffer texture, ''any'' form of reading from that buffer, such as using it for a [[Vertex Buffer Object]] or [[Uniform Buffer Object]].
+
  
In short, you get almost nothing. Everything is asynchronous, and OpenGL will not protect you from this fact. All of these can be alleviated, but [[#Ensuring visibility|only ''specifically'' at the request of the user]]. It will not happen automatically.
+
The format qualifier specifies how data read from the image looks in OpenGL. It allows the image's actual [[Image Format]] to ''differ'' from the format qualifier in the shader. This allows the GLSL shader to reinterpret the meaning of the data in the image. Writes can also be reinterpreted in this manner, but only if a format is specified. If no format is specified, then the writes will be interpreted as if the format matched the image's actual image format.
  
=== Guarantees ===
 
  
Despite the above, there are some protections that OpenGL provides. What follows is a list of things that the specification does require image load/store operations to guarantee about when data will be accessible.
 
  
First, within a single shader invocation, if you write something to an image variable, it will always be visible to that variable for reading. You need not do anything special to make this happen. However, it is possible that, between writing and reading, another invocation may have stomped on that value.
+
=== Compatibility ===
  
Second, if a shader invocation is being executed, then the shader invocations necessary to execute it must have taken place. For example, in a fragment shader, you can assume that the vertex shaders to compute the vertices for the primitive being rasterized have completed. So you may read any image values written by them.
+
The valid image formats
  
{{warning|This only applies to the shader invocations ''directly responsible'' for this shader invocation. Being in a fragment shader does not mean that ''all'' vertex shaders in a rendering command have completed. Only the ones needed for this particular fragment shader invocation have been executed.}}
+
== Images in the context ==
  
{{note|Geometry shaders have a caveat here. A GS may write multiple vertices and primitives. Therefore, you may only assume that the GS executed ''just far enough'' to write enough vertices needed to render the fragment shader's primitive.}}
+
Associating an image variable in GLSL with an image works much like associating [[Sampler (GLSL) #Binding textures to samplers|samplers with textures]].
  
Third, sometimes a fragment shader is executed for the sole purpose of computing derivatives for other shaders. All image store and atomic operations will be ignored by that invocation.
+
== Basic load store ==
  
=== Invocation order and count ===
+
== Atomic operations ==
  
One problem with the above is what defines "subsequent invocations". OpenGL allows implementations a ''lot'' of leeway on the ordering of shader invocations, as well as the number of invocations. Here is a list of the rules:
+
== Memory coherency ==
 +
{{main|Memory Model#Incoherent memory access}}
  
# You may not assume that a vertex shader will be executed only once for every vertex you pass it. It may be executed multiple times for the same vertex. In indexed rendering scenarios, it is very possible for re-used indices to not execute the vertex shader a second or third time.
+
Writes and atomic operations via image variables are not automatically coherent. Therefore, you must take explicit steps, such as issuing memory barriers, to ensure that writes have completed and are visible before you read those values.
 
+
# The same applies to tessellation evaluation shaders.
+
 
+
# The number of fragment shader invocations generated from rasterizing a primitive depends on the pixel ownership test, whether early depth test is enabled, and whether the rendering is to a multisample buffer. When not using per-sample shading, the number of fragment shader invocations is undefined within a pixel area, but it must be between 1 and the number of samples in the buffer.
+
 
+
# Invocations of the same shader stage may be executed in ''any'' order. Even within the same draw call. This includes fragment shaders; writes to the framebuffer are ordered, but the actual fragment shader execution is not.
+
 
+
# Outside of the above case of a shader which depends on the outputs of another, invocations between stages may be executed in any order. This ''includes'' invocations launched by different rendering commands. While it is technically unlikely that two vertex shaders from different rendering operations could be running at the same time, it is also very ''possible'', so OpenGL provides no guarantees.
+
 
+
=== Ensuring visibility ===
+
 
+
The term "visibility" represents when someone can safely access the value written to an image from a shader invocation. There are two tools to ensure visibility; they are used to ensure visibility from two different contexts. There is the {{code|coherent}} qualifier and there is the {{apifunc|glMemoryBarrier}} function.
+
 
+
{{code|coherent}} is used on image variables, such that writes to {{code|coherent}} qualified variables will be read correctly by {{code|coherent}} qualified variables in another invocation. Note that this requires the {{code|coherent}} qualifier on ''both'' the writer and the reader; if one of them doesn't have it, then nothing is guaranteed.
+
 
+
Note that {{code|coherent}} does not ''ignore'' all of the prior rules. In order for a write to become visible to an invocation, it must first ''have happened''. Therefore, {{code|coherent}} can only really work if you know that the writing invocation has executed. As stated above, if invocation B's existence depends on output from invocation A, then you know that the write has happened.
+
 
+
There are other times you can know that a write has happened. In [[Compute Shader]]s, the {{code|barrier}} function ensures that all other invocations in a work group have reached that point in the computation. This works for [[Tessellation Control Shader]]s as well, for all of the invocations in a patch. So you know that all invocations in a work group/patch have reached that point, so all prior writes have been written. You still need the {{code|coherent}} qualifier on both the reading and writing variable, but it works.
+
 
+
{{code|coherent}} alone is not enough however. You also need to use a memory barrier, to effectively let OpenGL know that you're finished writing a batch of things and want to make them visible to someone else. The functions for this are of the form {{code|memoryBarrier*}} (no relation to the {{apifunc|glMemoryBarrier}} API function). This is a small suite of functions, which represent different barrier cases. For image load/store, {{code|memoryBarrierImage}} is used to order image writes. Or you can use {{code|memoryBarrier}} to order all of these special writes.
+
 
+
Note that {{code|memoryBarrierImage}} requires GL 4.3/{{extref|compute_shader}}.
+
 
+
Atomic operations are always effectively {{code|coherent}}, due to their atomic nature (nothing can interfere with the read/modify/write operation). Memory barriers can still be employed if you wish to ensure the ordering between two separate atomic operations, but it is not necessary.
+
 
+
{{code|coherent}} is ''only'' useful in cases of shader-to-shader reading/writing where you can enforce invocation order. If you want to establish visibility between two different rendering commands, you must use a much more powerful mechanism:
+
 
+
void {{apifunc|glMemoryBarrier}}(GLbitfield {{param|barriers}});
+
 
+
This function is a way of ensuring the visibility of image load/store operations with a wide variety of OpenGL operations, as listed on the documentation page. The thing to keep in mind about the various bits in the bitfield is this: they represent the operation you want to be able to do ''after'' making the image load/store visible. This is the operation you want to ''see'' the load/store results.
+
 
+
For example, if you do some image load/store operations on a texture, and then want to read it back onto the CPU, you would use the {{enum|GL_TEXTURE_UPDATE_BARRIER_BIT}}. If you did image load/store to a buffer, and then want to use it for vertex array data, you would use {{enum|GL_VERTEX_ATTRIB_ARRAY_BARRIER_BIT}}. That's the idea.
+
 
+
Note that if you want image load/store operations from one command to be visible to image load/store operations from another command, you use {{enum|GL_SHADER_IMAGE_ACCESS_BARRIER_BIT}}.
+
 
+
=== Guidelines and usecases ===
+
 
+
Here are some basic use cases and how to synchronize them properly.
+
 
+
; Read-only image variables
+
: If a shader only reads images, then it does not need any form of synchronization for visibility. Even if you modify objects via OpenGL commands ({{apifunc|glTexSubImage2D}}, for example) or whatever, OpenGL requires that image reads remain properly synchronized.
+
; {{code|barrier}} invocation write/read
+
: Use {{code|coherent}} if you use a mechanism like {{code|barrier}} to synchronize between invocations.
+
; Dependent invocation write/read
+
: If you have one invocation which is dependent on another (the vertex shaders used to generate a primitive used for a fragment shader), then you need to use {{code|coherent}} on the variables and invoke a {{code|memoryBarrier/Image}} as appropriate after you finish writing to the images of interest.
+
; Shader image write/read between rendering commands
+
: There is no need for {{code|coherent}} here. Just use {{apifunc|glMemoryBarrier|(GL_SHADER_IMAGE_ACCESS_BARRIER_BIT)}} between the two rendering/dispatch commands.
+
; Shader image writes, read by other OpenGL operations
+
: Again, {{code|coherent}} is not necessary. You must use a {{apifunc|glMemoryBarrier}} appropriate to the operation of interest.
+
  
 
== Limitations ==
 
 
{{stub}}
 
  
 
[[Category:OpenGL Shading Language]]
 
 
[[Category:Shaders]]
 

Revision as of 23:04, 24 November 2012

Image Load Store
Core in version 4.5
Core since version 4.2
Core ARB extension ARB_shader_image_load_store
EXT extension EXT_shader_image_load_store

Image load/store is the ability of Shaders to more-or-less arbitrarily read from and write to images.

Overview

The idea with image load/store is that the user can bind one of the images in a Texture to a number of image binding points (which are separate from texture image units). Shaders can read information from these images and write information to them, in ways that they cannot with textures.

This can allow for a number of powerful features, including relatively cheap order-independent transparency.

If you think that this is a great feature, remember that there is no such thing as a free lunch. The cost of using image load/store is that all of its write operations are not automatically coherent. By using image load/store, you take on the responsibility for managing what OpenGL would otherwise manage for you with regular texture reads/FBO writes.

Image variables

Image variables are variables that have one of the following image types. The image types are based on the type of the source Texture for the image. Not all texture types have a separate image type. Image variables must be declared with the uniform storage qualifier.

Like samplers, image variables represent either floating-point, signed integer, or unsigned integer Image Formats. The prefix used for the image variable name denotes which, using standard GLSL conventions. No prefix means floating-point, a prefix of i means signed integer, and u means unsigned integer.

For the sake of clarity, when you see a g preceding "image" in an image name, it represents any of the 3 possible prefixes. The image variables are:

Image Type         Corresponding Texture Type
gimage1D           GL_TEXTURE_1D
gimage2D           GL_TEXTURE_2D
gimage3D           GL_TEXTURE_3D
gimageCube         GL_TEXTURE_CUBE_MAP
gimage2DRect       GL_TEXTURE_RECTANGLE
gimage1DArray      GL_TEXTURE_1D_ARRAY
gimage2DArray      GL_TEXTURE_2D_ARRAY
gimageCubeArray    GL_TEXTURE_CUBE_MAP_ARRAY (requires GL 4.0 or ARB_texture_cube_map_array)
gimageBuffer       GL_TEXTURE_BUFFER
gimage2DMS         GL_TEXTURE_2D_MULTISAMPLE
gimage2DMSArray    GL_TEXTURE_2D_MULTISAMPLE_ARRAY

There are no "shadow" variants.
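
As a rough illustration of the naming scheme, the following fragment shader sketch declares one image variable for each prefix. The variable names and the format qualifiers (rgba8, r32i, r32ui) are assumptions chosen for the example and do not come from this page.

```glsl
#version 420 core

// No prefix: floating-point image (normalized formats such as rgba8 count as floating-point).
layout(rgba8) uniform image2D colorImage;

// "i" prefix: signed integer image backed by a buffer texture.
layout(r32i) uniform iimageBuffer signedValues;

// "u" prefix: unsigned integer image backed by a 2D array texture.
layout(r32ui) uniform uimage2DArray layerIds;

void main()
{
    // Declarations only; nothing is read or written in this sketch.
}
```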

Memory qualifiers

Image variables can be declared with a number of qualifiers that have different meanings for how the variable is accessed.

coherent
    Normally, the compiler is free to assume that this shader invocation is the only invocation that modifies values read through this variable. It also can freely assume that other shader invocations may not see values written through this variable.
    Using this qualifier is required to allow dependent shader invocations to communicate with one another, as it enforces the coherency of memory accesses. Using this requires the appropriate memory barriers to be executed, so that visibility can be achieved.
    When communicating between shader invocations for different rendering commands, glMemoryBarrier should be used instead of this qualifier.
volatile
    The compiler normally is free to assume that values accessed through variables will only change after memory barriers or other synchronization. With this qualifier, the compiler assumes that the contents of the storage represented by the variable could be changed at any time.
restrict
    Normally, the compiler must assume that you could access the same image/buffer object through separate variables in the same shader. Therefore, if you write to one variable, and read from a second, the compiler assumes that it is possible that you could be reading the value you just wrote. With this qualifier, you are telling the compiler that this particular variable is the only variable that can modify the memory visible through that variable within this shader invocation (other shader stages don't count here). This allows the compiler to optimize reads/writes better.
    You should use this wherever possible.
readonly
    Normally, the compiler allows you to read and write from variables as you wish. If you use this, the variable can only be used for reading operations.
writeonly
    Normally, the compiler allows you to read and write from variables as you wish. If you use this, the variable can only be used for writing operations (atomic writes are forbidden because they also count as reads).
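
The sketch below shows how these qualifiers might be combined in a fragment shader. All variable names are hypothetical, and the use of restrict assumes that the application never binds the same image to both srcImage and dstImage.

```glsl
#version 420 core

// Read-only source; restrict asserts that no other variable in this shader aliases it.
layout(rgba8) readonly restrict uniform image2D srcImage;

// Write-only destination; also assumed not to alias any other image variable.
layout(rgba8) writeonly restrict uniform image2D dstImage;

// Counters that other invocations also touch, so accesses are marked coherent.
layout(r32ui) coherent uniform uimage2D hitCounters;

void main()
{
    ivec2 pixel = ivec2(gl_FragCoord.xy);

    // Loads are legal on srcImage, stores are legal on dstImage.
    vec4 color = imageLoad(srcImage, pixel);
    imageStore(dstImage, pixel, color);

    // Atomics read and write, so the variable is neither readonly nor writeonly.
    imageAtomicAdd(hitCounters, pixel, 1u);
}
```

Calling imageStore on srcImage or imageLoad on dstImage in this sketch would be a compile-time error.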

Formats

Image variables can have a format layout qualifier. It defines the image format used for reading from that variable. Image variables used for reads or atomic operations must have a format qualifier; only variables declared with writeonly may omit it.

The format qualifier specifies how data read from the image looks in OpenGL. It allows the image's actual Image Format to differ from the format qualifier in the shader. This allows the GLSL shader to reinterpret the meaning of the data in the image. Writes can also be reinterpreted in this manner, but only if a format is specified. If no format is specified, then the writes will be interpreted as if the format matched the image's actual image format.
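
As a hedged example of such reinterpretation, the sketch below assumes the bound image was created with an RGBA8 format while the shader declares it as r32ui, so each texel is read as a single 32-bit unsigned integer. The variable names are hypothetical, and the assumption that the red channel sits in the lowest byte depends on the platform's byte order.

```glsl
#version 420 core

// Declared as r32ui even though the underlying image is assumed to be RGBA8;
// both formats are 32 bits per texel.
layout(r32ui) readonly uniform uimage2D packedPixels;

out vec4 fragColor;

void main()
{
    ivec2 pixel = ivec2(gl_FragCoord.xy);

    // Read the raw 32-bit pattern of the texel.
    uint bits = imageLoad(packedPixels, pixel).r;

    // Unpack four normalized 8-bit values, lowest byte first.
    fragColor = unpackUnorm4x8(bits);
}
```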


Compatibility

The valid image formats

Images in the context

Associating an image variable in GLSL with an image works much like associating samplers with textures.
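
A minimal sketch of the shader side of that association, assuming GLSL 4.20 or later, where the binding layout qualifier can assign an image unit directly in the shader. The unit number 1 and the variable name are hypothetical; on the API side, a texture level would be attached to the same unit with glBindImageTexture.

```glsl
#version 420 core

// Image unit 1 is an arbitrary choice for this example.
layout(binding = 1, rgba8) writeonly uniform image2D outputImage;

void main()
{
    // Write opaque red to the texel under this fragment.
    imageStore(outputImage, ivec2(gl_FragCoord.xy), vec4(1.0, 0.0, 0.0, 1.0));
}
```

Without the binding qualifier, the unit would instead be set from the application by assigning an integer to the uniform, as is done for samplers.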

Basic load store

Atomic operations

Memory coherency

Writes and atomic operations via image variables are not automatically coherent. Therefore, you must take explicit steps, such as issuing memory barriers, to ensure that writes have completed and are visible before you read those values.
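
One way this can look in practice is sketched below for a compute shader, assuming GL 4.3. Each invocation writes a value, then reads the value written by a neighbour in the same work group. The image names, binding units, and work group size are hypothetical, and each buffer image is assumed to hold at least one texel per invocation.

```glsl
#version 430 core

layout(local_size_x = 64) in;

// coherent is needed because other invocations read what this one writes.
layout(binding = 0, r32f) coherent uniform imageBuffer scratch;
layout(binding = 1, r32f) writeonly uniform imageBuffer result;

void main()
{
    int local = int(gl_LocalInvocationID.x);
    int groupBase = int(gl_WorkGroupID.x * gl_WorkGroupSize.x);

    // Step 1: every invocation writes its own slot.
    imageStore(scratch, groupBase + local, vec4(float(local)));

    // Step 2: make the image write visible, then wait for the whole work group.
    memoryBarrierImage();
    barrier();

    // Step 3: read the slot written by the next invocation in the group.
    int neighbour = groupBase + (local + 1) % int(gl_WorkGroupSize.x);
    float value = imageLoad(scratch, neighbour).r;

    // The result goes to a separate image so that no slot is overwritten while being read.
    imageStore(result, groupBase + local, vec4(value));
}
```

This pattern only covers invocations within a single work group. Making the writes visible to a later rendering or dispatch command would instead call for glMemoryBarrier on the API side.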

Limitations