Image Load Store
Image load/store is the ability of Shaders to more-or-less arbitrarily read from and write to images.
- 1 Overview
- 2 Image variables
- 3 Image operations
- 4 Image stores and discard
- 5 Images in the context
- 6 Format conversion
- 7 Memory coherency
- 8 Limitations
The idea with image load/store is that the user can bind one of the images in a Texture to a number of image binding points (which are separate from texture image units). Shaders can read information from these images and write information to them, in ways that they cannot with textures.
This can allow for a number of powerful features, including relatively cheap order-independent transparency.
If you think that this is a great feature, remember that there is no such thing as a free lunch. The cost of using image load/store is that none of its write operations are automatically coherent. By using image load/store, you take on the responsibility for managing what OpenGL would normally manage for you when using regular texture reads/FBO writes.
Image variables in GLSL are variables that have one of the following image types. The image types are based on the type of the source Texture for the image. Not all texture types have a corresponding image type. Image variables must be declared with the uniform storage qualifier (or as function parameter inputs).
Like samplers, image variables represent either floating-point, signed integer, or unsigned integer Image Formats. The prefix used for the image variable name denotes which, using standard GLSL conventions. No prefix means floating-point, a prefix of i means signed integer, and u means unsigned integer.
For the sake of clarity, when you see a g preceding "image" in an image name, it represents any of the 3 possible prefixes. The image variables are:
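|Image Type||Corresponding Texture Type|
|gimage1D||GL_TEXTURE_1D; single layer from GL_TEXTURE_1D_ARRAY|
|gimage2D||GL_TEXTURE_2D; single layer from: GL_TEXTURE_2D_ARRAY, GL_TEXTURE_CUBE_MAP, GL_TEXTURE_CUBE_MAP_ARRAY, GL_TEXTURE_3D|
|gimage3D||GL_TEXTURE_3D|
|gimageCube||GL_TEXTURE_CUBE_MAP|
|gimage2DRect||GL_TEXTURE_RECTANGLE|
|gimage1DArray||GL_TEXTURE_1D_ARRAY|
|gimage2DArray||GL_TEXTURE_2D_ARRAY|
|gimageCubeArray||GL_TEXTURE_CUBE_MAP_ARRAY (requires GL 4.0 or ARB_texture_cube_map_array)|
|gimageBuffer||GL_TEXTURE_BUFFER|
|gimage2DMS||GL_TEXTURE_2D_MULTISAMPLE; single layer from: GL_TEXTURE_2D_MULTISAMPLE_ARRAY|
|gimage2DMSArray||GL_TEXTURE_2D_MULTISAMPLE_ARRAY|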
There are no "shadow" variants.
You will notice several "single layer from" entries in the above table. It is possible to bind a specific layer from certain texture types to an image. When you do so, you must use a different image variable compared to the source texture's actual type, as shown above.
Image variables can be declared with a number of qualifiers that have different meanings for how the variable is accessed.
- coherent: Normally, the compiler is free to assume that this shader invocation is the only invocation that modifies values read through this variable. It can also freely assume that other shader invocations may not see values written through this variable.
  - Using this qualifier is required to allow dependent shader invocations to communicate with one another, as it enforces the coherency of memory accesses. Using this requires the appropriate memory barriers to be executed, so that visibility can be achieved.
  - When communicating between shader invocations for different rendering commands, glMemoryBarrier should be used instead of this qualifier.
- volatile: The compiler normally is free to assume that values accessed through variables will only change after memory barriers or other synchronization. With this qualifier, the compiler assumes that the contents of the storage represented by the variable could be changed at any time.
- restrict: Normally, the compiler must assume that you could access the same image/buffer object through separate variables in the same shader. Therefore, if you write to one variable and read from a second, the compiler assumes that you could be reading the value you just wrote. With this qualifier, you are telling the compiler that this particular variable is the only variable that can modify the memory visible through that variable within this shader invocation (other shader stages don't count here). This allows the compiler to optimize reads/writes better.
  - You should use this wherever possible.
- readonly: Normally, the compiler allows you to read and write from variables as you wish. If you use this, the variable can only be used for reading operations.
- writeonly: Normally, the compiler allows you to read and write from variables as you wish. If you use this, the variable can only be used for writing operations (atomic writes are forbidden because they also count as reads).
Multiple qualifiers can be used, but they must make sense together (a variable cannot be both readonly and writeonly). You are encouraged to use restrict whenever possible.
Image variables can be declared with a format qualifier; this specifies the format for any read operations done on the image. Therefore, a format qualifier is required if you do not declare the variable with the writeonly memory qualifier. Write-only variables cannot be used in any reading operations; this includes calling load and atomic (read/modify/write) functions. So if you want to read from an image, you must declare the format.
The format defines how the shader interprets the bits of data that it reads from the image. It also defines how it converts the data passed for write operations when it writes it into the image. This allows the actual Image Format of the image to differ between what the shader sees and what is stored in the image, sometimes substantially.
The formats are divided into three categories, representing the three types of image variables:
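- Floating-point layout image formats: rgba32f, rgba16f, rg32f, rg16f, r11f_g11f_b10f, r32f, r16f, rgba16, rgb10_a2, rgba8, rg16, rg8, r16, r8, rgba16_snorm, rgba8_snorm, rg16_snorm, rg8_snorm, r16_snorm, r8_snorm
- Signed integer layout image formats: rgba32i, rgba16i, rgba8i, rg32i, rg16i, rg8i, r32i, r16i, r8i
- Unsigned integer layout image formats: rgba32ui, rgba16ui, rgb10_a2ui, rgba8ui, rg32ui, rg16ui, rg8ui, r32ui, r16ui, r8ui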
GLSL provides a number of functions for accessing images through image variables.
Image operations take "image coordinates", which specify where in an image an access should take place. Image coordinates differ from texture coordinates in that they are always signed integers in texel space.
Each image variable has a specific dimensionality for its image coordinates, which matches the dimensionality of the underlying image. An image1D variable takes an int as a coordinate. image2DArray takes an ivec3; the third component is the array layer.
Cube maps (and cube map arrays) are accessed very differently from texture accesses. The image coordinate for both cube maps and cube map arrays is an ivec3. The coordinate is not a direction; it is a texel coordinate within the space of the cube. The third component is the face index, as it is usually defined. Image variables effectively treat cube maps as simply a form of array texture; cube map arrays are just bigger arrays, as their third component is the layer-face index.
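As a rough illustration of the layer-face index, the following C sketch converts between a (layer, face) pair and the third component of the image coordinate. The helper names are hypothetical, and the sketch assumes faces are numbered 0 through 5 in the usual OpenGL order (+X, -X, +Y, -Y, +Z, -Z).

```c
#include <assert.h>

/* Hypothetical helpers: model the third component of a cube map
 * (array) image coordinate. Assumes faces are numbered 0..5 in the
 * usual OpenGL order (+X, -X, +Y, -Y, +Z, -Z). */
enum { FACES_PER_CUBE = 6 };

/* Layer-face index for a cube map array. For a plain cube map,
 * pass layer = 0 and the result is just the face index. */
static int layer_face_index(int layer, int face)
{
    assert(layer >= 0);
    assert(face >= 0 && face < FACES_PER_CUBE);
    return layer * FACES_PER_CUBE + face;
}

/* Inverse mapping: recover the array layer and the face. */
static int layer_of(int layer_face) { return layer_face / FACES_PER_CUBE; }
static int face_of(int layer_face)  { return layer_face % FACES_PER_CUBE; }
```

Under these assumptions, face 3 of layer 2 in a cube map array sits at third-component value 15.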
When accessing multisample textures, the accessing function has another parameter, an int that defines the sample index to read from or write to.
Let us collectively refer to the image coordinate parameters as "IMAGE_COORD". When you see this in a function definition below, this means that the function takes an image coordinate, which may include a sample index parameter if it is a multisample image.
|Core in version||4.5|
|Core since version||4.3|
|Core ARB extension||ARB_shader_image_size|
The size of the image for an image variable can be queried with this function:
ivec imageSize(gimage image);
The size of the returned vector will be the size of the image coordinate, except in the case of cube maps. For cube maps, the size will be ivec2; the third dimension would always be 6, so it is not returned. Cube map arrays will return ivec3, with the third component being the number of layer-faces.
Accessing any texels outside of this size results in invalid accesses, as defined below.
Image load functions read a specific location from the image into the shader.
Reading from an image is done with this function:
gvec4 imageLoad(gimage image, IMAGE_COORD);
Image store operations write a value to a specific location in the image.
Writing to an image is done with this function:
void imageStore(gimage image, IMAGE_COORD, gvec4 data);
Atomic operations perform read/modify/write operations on a location in an image. These operations are guaranteed to be "atomic": if two shaders issue the same atomic operation on the same location in the same image, one will go first, followed by the other.
Consider a shader that reads from a location, adds 1 to it, and then writes to it. It is theoretically possible for two such shaders to read from and write to the same location in the same image at the same time. Because of the way memory accesses are handled, it is entirely possible that the sequence of events works like this:
- Shader A reads the image value, say 0.
- Shader B reads the image value, also 0.
- Shader B adds 1 to its local value of 0, becoming 1.
- Shader B writes its local value to the image. The image now has 1.
- Shader A adds 1 to its local value of 0, becoming 1.
- Shader A writes its local value to the image. The image now has 1.
Atomic operations prevent this possibility entirely. Each shader's independent atomic operation will fully complete before the next one starts.
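The lost update described above can be replayed deterministically on the CPU. The following C sketch is only a model: a plain integer stands in for the image location, and the function names are hypothetical. It steps through the listed interleaving and contrasts it with fully serialized read/modify/write operations.

```c
#include <assert.h>
#include <stdint.h>

/* CPU-side model only: 'texel' stands in for one location of an image. */

/* Replays the interleaving listed above: both invocations read
 * before either one writes, so one increment is lost. */
static uint32_t racy_increment_twice(uint32_t texel)
{
    uint32_t a_local = texel;   /* Shader A reads */
    uint32_t b_local = texel;   /* Shader B reads the same value */
    b_local += 1;               /* B adds 1 locally */
    texel = b_local;            /* B writes */
    a_local += 1;               /* A adds 1 to its stale copy */
    texel = a_local;            /* A writes, overwriting B's result */
    return texel;
}

/* Models atomic behaviour: each read/modify/write completes
 * before the next one starts. */
static uint32_t atomic_increment_twice(uint32_t texel)
{
    texel += 1;  /* one invocation's whole operation */
    texel += 1;  /* the other invocation's whole operation */
    return texel;
}

/* Usage: starting from 0, the racy version ends at 1, the atomic one at 2. */
static void demo_lost_update(void)
{
    assert(racy_increment_twice(0) == 1);
    assert(atomic_increment_twice(0) == 2);
}
```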
The return value of all atomic operations is the original value before modifications. The value written will be the modified value.
Atomic operations to any texel that is outside of the boundaries of the bound image will return 0 and do nothing.
There are some severe limitations on image atomic operations. First, atomics can only be used on integer images, either signed or unsigned. Second, they can only be used on images with the GL_R32I/r32i or GL_R32UI/r32ui formats.
It is very possible to use format conversions to make image operations work with other formats. But the GLSL code logic will be operating on 32-bit integers.
Below, the term "gint" means either int or uint, as is appropriate for the gimage type.
Atomic set value
The value at the location in an image can be directly set via this function:
gint imageAtomicExchange(gimage image, IMAGE_COORD, gint data);
This function is called "exchange" because it effectively exchanges data with the value in the image. The return value of all atomic functions is the original value from the image, so it is exchanging data with that value.
Atomic conditional set
A very powerful operation is the conditional modify operation. This operation will write a new value only if the current value in the image is equal to the given value.
gint imageAtomicCompSwap(gimage image, IMAGE_COORD, gint compare, gint data);
If the current value of image at the image coordinate is exactly equal to compare, then data will be stored into the image at that location. Otherwise, the location will retain the original value.
While the function doesn't provide a direct way to tell if it actually wrote the value, it always returns the original value. So if you need to know, you can test the return value against compare yourself.
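The return-value test described above can be sketched on the CPU with C11 atomics. This is a model of the semantics, not GLSL: the function names are hypothetical, and a single atomic integer stands in for the image location.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* CPU-side model of the compare-and-swap semantics described above.
 * Returns the original value, whether or not the store happened. */
static uint32_t model_comp_swap(_Atomic uint32_t *texel,
                                uint32_t compare, uint32_t data)
{
    assert(texel);
    uint32_t expected = compare;
    /* On failure, C11 writes the value actually found into 'expected';
     * on success 'expected' still equals 'compare', the original value. */
    atomic_compare_exchange_strong(texel, &expected, data);
    return expected;
}

/* The caller detects success by comparing the result with 'compare'. */
static bool swap_succeeded(uint32_t returned, uint32_t compare)
{
    return returned == compare;
}
```

The same comparison works in a shader: if the value returned by the atomic equals compare, the write took place.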
GLSL provides a single atomic math operation, addition:
gint imageAtomicAdd(gimage image, IMAGE_COORD, gint data);
This will read the value from the image, add data to it, and then write it to the image.
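A CPU-side model may help make the return value and the out-of-bounds rule concrete. The sketch below uses C11 atomics on a small fixed-size array standing in for an r32ui image; the type and function names are hypothetical and the 4x4 size is arbitrary.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical CPU-side stand-in for a small 2D r32ui image. */
enum { IMG_W = 4, IMG_H = 4 };
typedef struct { _Atomic uint32_t texels[IMG_H][IMG_W]; } ModelImage;

/* Models the documented behaviour of an atomic add: returns the
 * original value, stores the modified one, and for coordinates
 * outside the image returns 0 without touching memory. */
static uint32_t model_atomic_add(ModelImage *img, int x, int y, uint32_t data)
{
    assert(img);
    if (x < 0 || x >= IMG_W || y < 0 || y >= IMG_H)
        return 0;
    return atomic_fetch_add(&img->texels[y][x], data);
}
```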
If you need subtraction, simply negate data. This will only work directly with signed integers. However, because GLSL 4.30 mandates two's complement, you can get the same effect with unsigned integers, since the int(uint) and uint(int) conversion constructors preserve the bit pattern.
So if you need unsigned integer subtraction, you can do this:
imageAtomicAdd(image, IMAGE_COORD, uint(-int(data)));
If the result would be negative, you will get a very large positive value back instead, because the arithmetic is unsigned.
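The same arithmetic can be checked on the CPU. In this C sketch the casts play the role of GLSL's int(uint) and uint(int) constructors; the function name is hypothetical, and the model computes the new value rather than performing a real atomic operation.

```c
#include <assert.h>
#include <stdint.h>

/* Models "add the negated value" on an unsigned texel. Converting a
 * negative int32_t to uint32_t is defined in C as reduction modulo 2^32,
 * which matches the two's complement bit pattern. */
static uint32_t unsigned_subtract_via_add(uint32_t texel, uint32_t amount)
{
    assert(amount <= (uint32_t)INT32_MAX);  /* keep the negation in range */
    uint32_t negated = (uint32_t)(-(int32_t)amount);
    return texel + negated;                 /* wraps modulo 2^32 */
}
```

Subtracting 3 from 10 this way gives 7, while subtracting 10 from 3 wraps around to 4294967289.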
GLSL provides 3 atomic bitwise operations: and, or, and xor:
gint imageAtomicAnd(gimage image, IMAGE_COORD, gint data);
gint imageAtomicOr(gimage image, IMAGE_COORD, gint data);
gint imageAtomicXor(gimage image, IMAGE_COORD, gint data);
These read the value from the image, perform the operation with data, and write the result back to the image.
GLSL provides a pair of functions that ensure that the value in the image is no larger than the given value or no smaller than it.
gint imageAtomicMin(gimage image, IMAGE_COORD, gint data);
gint imageAtomicMax(gimage image, IMAGE_COORD, gint data);
The "Min" function stores the smaller of the current image value and data, ensuring that the image value is no larger than data. The "Max" function stores the larger of the two, ensuring that the image value is no smaller than data.
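The semantics can be modelled on the CPU as follows: an atomic minimum leaves min(original, data) in the location and returns the original value, and an atomic maximum does the same with max. C11 has no fetch-min primitive, so this sketch (with hypothetical function names) uses a compare-exchange loop; it illustrates the behaviour and makes no claim about how a GPU implements it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* CPU-side model of an atomic minimum: after the call the stored value
 * is min(original, data); the original value is returned. */
static uint32_t model_atomic_min(_Atomic uint32_t *texel, uint32_t data)
{
    assert(texel);
    uint32_t seen = atomic_load(texel);
    while (data < seen &&
           !atomic_compare_exchange_weak(texel, &seen, data)) {
        /* 'seen' was refreshed by the failed exchange; retry. */
    }
    return seen;
}

/* Same idea for the maximum: the stored value becomes max(original, data). */
static uint32_t model_atomic_max(_Atomic uint32_t *texel, uint32_t data)
{
    assert(texel);
    uint32_t seen = atomic_load(texel);
    while (data > seen &&
           !atomic_compare_exchange_weak(texel, &seen, data)) {
        /* 'seen' was refreshed by the failed exchange; retry. */
    }
    return seen;
}
```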
Image stores and discard
The Fragment Shader has the ability to issue a discard command. This will prevent writing any fragment values to the framebuffer. However, it will also have the effect of preventing image store and atomic operations from taking place.
Images in the context
Associating an image variable in GLSL with an image works much like associating samplers with textures.
For each shader stage, there is some number of available image units (not to be confused with texture image units). The number of image units can be queried per-stage, using GL_MAX_*_IMAGE_UNIFORMS, where * is filled in with the appropriate shader stage. Note that OpenGL 4.3 only requires Fragment Shaders and Compute Shaders to have non-zero numbers of image units; the minimum required in those cases is 8.
The total number of image units available is queried via GL_MAX_IMAGE_UNITS; this represents the total number of images you can bind at one time.
Just as with samplers, image variables reference image unit indices in the context. These are usually set with a binding layout qualifier, but they can also be set with glUniform1i or glProgramUniform1i.
After associating the image variable with its image unit, you then bind an image to the context. This is done via this function:
void glBindImageTexture(GLuint unit, GLuint texture, GLint level, GLboolean layered, GLint layer, GLenum access, GLenum format)
This binds an image from texture to the given image unit, using the given mipmap level and array layer.
Image bindings can be layered or non-layered, which is determined by layered. If layered is GL_TRUE, then texture must be an Array Texture (of some type), a Cubemap Texture, or a 3D Texture. If a layered image is being bound, then the entire mipmap level specified by level is bound.
If the image is not layered, then the user must use the layer to select which array layer will be bound. If the texture does not have array layers, then this parameter must be 0. As with other functions, if this is a cubemap array texture, then layer is the layer-face to select.
If an array or cubemap texture is bound and is not layered, then the bound image is not an array or cubemap image. So if you bind a single array layer from a GL_TEXTURE_1D_ARRAY texture, it should be used with the image1D image variable type. Similarly, layers from a 2D array texture, cubemap, 3D texture, or cubemap array should be image2D, and a layer from a 2D multisample array should use image2DMS.
The access specifies how the shader may access the image through this image unit. This can be GL_READ_ONLY, GL_WRITE_ONLY, or GL_READ_WRITE. If the shader violates this restriction, then all manner of bad things can happen, including program termination. It is a good idea to use memory qualifiers in the shader itself to catch this at shader compile-time.
The format parameter is an Image Format which defines the format that will be used for writes to the image. If a format qualifier is specified in the shader, this format must match it. The format must be compatible with the texture's image format. The format parameter may only use formats from the following table:
|Image Unit Format||Format Qualifier||Image Unit Format||Format Qualifier|
Also, note that these are the only image formats you can use for images in image load/store operations. You must use exactly these image formats and no others.
Multibind and images
|Core in version||4.5|
|Core since version||4.4|
|Core ARB extension||ARB_multi_bind|
A range of image objects can be bound to a range of image binding points with a single function call:
void glBindImageTextures(GLuint first, GLsizei count, const GLuint *textures)
first is the image binding index to start binding the array to. count is the number of indices to bind. Thus, the image binding indices that will be changed are those on the half-open range, [first, first + count).
If textures is NULL, then it works like binding the 0 image to all of the given binding indices. If not, then it will bind each texture in the array to the indices. textures must have count elements.
For each texture in the array, if the texture is 0, then it will be the equivalent of calling glBindImageTexture(first + i, 0, 0, GL_FALSE, 0, GL_READ_ONLY, GL_R8). If the texture is an actual texture object, it will be the equivalent of calling glBindImageTexture(first + i, textures[i], 0, GL_TRUE, 0, GL_READ_WRITE, lookupInternalFormat(textures[i])), where lookupInternalFormat gets the Image Format of the texture.
The texture will always be bound as a layered texture; you can't multibind a specific layer of a texture. It will always bind mipmap level 0. It will always bind it for reading and writing. And you can't do format remapping; it will always use the exact format.
There is one tool that can mitigate most of these limitations: texture views. You can create a view of an existing texture. This allows you to pick a particular mipmap and array layer/face from a texture up-front. You can then multibind this view, effectively preselecting the mipmap and layer that you want. Furthermore, view textures can already do the same kind of formatting conversion that images can do.
The only thing view textures can't save you from is the read/write binding. However, that can be set in the shader just fine. So you don't really need it.
The Image Format of the image may be different from the format specified to the image binding function and in the shader. Values read and written are converted in the following way, assuming that the formats are compatible.
The term "source format" represents the image format of whatever the source of the operation is. Similarly, the "destination format" is the image format of whatever the destination of the operation is. Therefore:
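- Read operation:
  - Source format: The image's actual format.
  - Dest format: The format specified by the image binding function glBindImageTexture.
- Write operation:
  - Source format: The format specified by the image binding function glBindImageTexture.
  - Dest format: The image's actual format.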
All operations, whether read or write, function as though they were copying data from/to images with those formats. The first step of a write operation would be taking the value provided by the shader and writing that into a texture in the source image format. Then the format conversion takes place, copying the value into the destination. Similarly, the last step of read operations is reading from the destination image into the value in the shader.
The conversion works based on memory copies using existing API functions. The source format values are read into memory as though calling glGetTexImage. The destination format values are written into their image as though they were uploaded via a call to one of the glTexSubImage functions.
Both of these functions take pixel transfer formats and types. The two effective calls will use formats and types that exactly match the source/destination image format.
For example, if the source image format was GL_RGBA8UI, then the format and type passed to glGetTexImage would be GL_RGBA_INTEGER and GL_UNSIGNED_BYTE. If the destination image format for a copy is GL_RGB10_A2 (which may or may not be compatible with GL_RGBA8UI), then the parameters would be GL_RGBA and GL_UNSIGNED_INT_2_10_10_10_REV. The full set of pixel transfer parameters can be found in OpenGL 4.5, Table 8.27, page 281.
The destination format values are written into "memory", using values pulled from the source, as though written with one of the glTexSubImage calls. These calls again use pixel transfer formats and types that exactly match the destination format.
Note that (compatibility willing) we are perfectly capable of switching between floating-point and integral formats. However, converting between GL_R32F and GL_RGBA8 is not well-defined, in terms of the endian conversion. The reason is that GL_R32F will be read using a type of GL_FLOAT, but the writing will write 4 bytes in RGBA order. The R byte will be the first byte read from the GL_FLOAT, but the endian storage of GL_FLOAT is not defined. So the first byte may be the most significant or the least significant.
For any particular platform, you could assume a specific endian. But OpenGL itself provides no guarantees. If you want some guarantees on being able to play with the bits of a 32-bit floating-point texture, convert it to GL_R32UI instead and do your bit manipulation in the shader.
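The endian dependence can be seen directly on the CPU. The following C sketch copies the bytes of a float in memory order, which mirrors a 32-bit float texel being reinterpreted as four 8-bit channels. The function names are hypothetical, and the sketch assumes a 32-bit IEEE 754 float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copies the four bytes of a float, in memory order, into out[0..3].
 * In the reinterpretation described above, out[0] would land in R
 * and out[3] in A. Assumes a 32-bit IEEE 754 float. */
static void float_as_rgba_bytes(float value, uint8_t out[4])
{
    assert(sizeof(float) == 4);
    memcpy(out, &value, 4);
}

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

For 1.0f, whose bit pattern is 0x3F800000, the byte 0x3F ends up in the last slot on a little-endian host and in the first slot on a big-endian one, which is exactly why the R channel of the converted texel is not portable.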
The image format compatibility matrix for image load/store operations is very similar to the compatibility matrix for texture views, though there are some differences. The first difference is that the list of image formats that can be used for images in load/store operations is more limited: only the formats mentioned above may be used.
Each of these formats has two properties: a size and a class. The size represents the bit-size of each texel. For example, GL_R32F has a size of 32; GL_RGBA32UI has a size of 128. The class represents the number of components and the bit-depth of each component. The class of GL_R32F is 1x32, while the class of GL_RGBA8 is 4x8.
The class for formats with oddball bitdepths (GL_R11F_G11F_B10F, for example) is the arrangement of components. So GL_R11F_G11F_B10F's class is 11/11/10, while GL_RGB10_A2UI's class is 10/10/10/2. The latter therefore has a class match with GL_RGB10_A2.
If the texture was allocated by OpenGL (it is possible for OpenCL or other interop layers to allocate textures), then the only thing that matters for compatibility is overall texel size. So it is perfectly valid to map a GL_R32F image to a GL_RGBA8UI format and back, though again endian conversions may make this unusable in platform-neutral code.
If a texture was allocated from outside of OpenGL, then how compatibility is determined may not be by texel size; it may be by class. You must use glGetTexParameter with GL_IMAGE_FORMAT_COMPATIBILITY_TYPE to detect which. It will return either GL_IMAGE_FORMAT_COMPATIBILITY_BY_SIZE or GL_IMAGE_FORMAT_COMPATIBILITY_BY_CLASS, specifying how compatibility is determined.
You can also detect this at the image format level with an image format query using the same parameter; this will be true for all (foreign) textures using that image format and texture type.
As an alternative to querying with foreign textures, you could just stick to formats that match on class. If the classes match, the sizes also match.
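The two compatibility rules can be sketched as simple comparisons. The C code below uses a small hypothetical descriptor table covering only a few formats named in the text; the struct, the table and the function names are illustrative and not part of any OpenGL API, and the "4x32" class string for GL_RGBA32UI is an assumption that follows the same naming pattern as the other entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical descriptor for a handful of formats named in the text. */
typedef struct {
    const char *name;          /* OpenGL image format name */
    int         texel_bits;    /* "size": bits per texel */
    const char *format_class;  /* "class": component layout */
} FormatInfo;

static const FormatInfo kFormats[] = {
    { "GL_R32F",        32, "1x32" },
    { "GL_RGBA8",       32, "4x8" },
    { "GL_RGBA32UI",   128, "4x32" },        /* class string assumed */
    { "GL_RGB10_A2",    32, "10/10/10/2" },
    { "GL_RGB10_A2UI",  32, "10/10/10/2" },
};

/* Looks a format up by name; returns NULL if it is not in the table. */
static const FormatInfo *find_format(const char *name)
{
    for (size_t i = 0; i < sizeof kFormats / sizeof kFormats[0]; ++i)
        if (strcmp(kFormats[i].name, name) == 0)
            return &kFormats[i];
    return NULL;
}

/* GL_IMAGE_FORMAT_COMPATIBILITY_BY_SIZE: only the texel size must match. */
static bool compatible_by_size(const FormatInfo *a, const FormatInfo *b)
{
    assert(a && b);
    return a->texel_bits == b->texel_bits;
}

/* GL_IMAGE_FORMAT_COMPATIBILITY_BY_CLASS: the classes must match. */
static bool compatible_by_class(const FormatInfo *a, const FormatInfo *b)
{
    assert(a && b);
    return strcmp(a->format_class, b->format_class) == 0;
}
```

With this table, GL_R32F and GL_RGBA8 are compatible by size but not by class, while GL_RGB10_A2 and GL_RGB10_A2UI are compatible under both rules.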
Writes and atomic operations via image variables are not automatically coherent. Therefore, you must do things to ensure that writes have occurred before you can read those values.