Image Load Store

Revision as of 19:12, 3 January 2013 by Alfonse (Talk | contribs) (Format compatibility)

Jump to: navigation, search
Image Load Store
Core in version 4.5
Core since version 4.2
Core ARB extension ARB_shader_image_load_store
EXT extension EXT_shader_image_load_store

Image load/store is the ability of Shaders to more-or-less arbitrarily read from and write to images.


The idea with image load/store is that the user can bind one of the images in a Texture to a number of image binding points (which are separate from texture image units). Shaders can read information from these images and write information to them, in ways that they cannot with textures.

This can allow for a number of powerful features, including relatively cheap order-independent transparency.

If you think that this is a great feature, remember that there is no such thing as a free lunch. The cost of using image load/store is that all of its write operations are not automatically coherent. By using image load/store, you take on the responsibility to manage what OpenGL would normally manage for you using regular texture reads/FBO writes.

Image variables

Image variables in GLSL are variables that have one of the following image​ types. The image types are based on the type of the source Texture for the image. Not all texture types have a corresponding image type. Image variables must be declared with the uniform​ storage qualifier (or as function parameter inputs).

Like samplers, image variables represent either floating-point, signed integer, or unsigned integer Image Formats. The prefix used for the image variable name denotes which, using standard GLSL conventions. No prefix means floating-point, a prefix of i​ means signed integer, and u​ means unsigned integer.

For the sake of clarity, when you see a g preceding "image" in an image name, it represents any of the 3 possible prefixes. The image variables are:

Image Type Corresponding Texture Type
gimage1D​ GL_TEXTURE_1D

single layer from

gimage2D​ GL_TEXTURE_2D

single layer from:

gimage3D​ GL_TEXTURE_3D
gimage1DArray​ GL_TEXTURE_1D_ARRAY
gimage2DArray​ GL_TEXTURE_2D_ARRAY
gimageCubeArray​ GL_TEXTURE_CUBE_MAP_ARRAY (requires GL 4.0 or ARB_texture_cube_map_array)

single layer from:


There are no "shadow" variants.

You will notice several "single layer from" entries in the above table. It is possible to bind a specific layer from certain texture types to an image. When you do so, you must use a different image variable compared to the source texture's actual type, as shown above.

Memory qualifiers

Image variables can be declared with a number of qualifiers that have different meanings for how the variable is accessed.

Normally, the compiler is free to assume that this shader invocation is the only invocation that modifies values read through this variable. It also can freely assume that other shader invocations may not see values written through this variable.
Using this qualifier is required to allow dependent shader invocations to communicate with one another, as it enforces the coherency of memory accesses. Using this requires the appropriate memory barriers to be executed, so that visibility can be achieved.
When communicating between shader invocations for different rendering commands, glMemoryBarrier should be used instead of this qualifier.
The compiler normally is free to assume that values accessed through variables will only change after memory barriers or other synchronization. With this qualifier, the compiler assumes that the contents of the storage represented by the variable could be changed at any time.
Normally, the compiler must assume that you could access the same image/buffer object separate variables in the same shader. Therefore, if you write to one variable, and read from a second, the compiler assumes that it is possible that you could be reading the value you just wrote. With this qualifier, you are telling the compiler that this particular variable is the only variable that can modify the memory visible through that variable within this shader invocation (other shader stages don't count here). This allows the compiler to optimize reads/writes better.
You should use this wherever possible.
Normally, the compiler allows you to read and write from variables as you wish. If you use this, the variable can only be used for reading operations.
Normally, the compiler allows you to read and write from variables as you wish. If you use this, the variable can only be used for writing operations (atomic writes are forbidden because they also count as reads).

Multiple qualifiers can be used, but they must make sense together (a variable cannot be both readonly​ and writeonly​). You are encouraged to use restrict​ whenever possible.

Format qualifiers

Image variables can be declared with a format qualifier; this specifies the format for any read operations done on the image. Therefore, a format qualifier is required if you do not declare the variable with the writeonly​ memory qualifier. Write-only variables cannot be used as in any reading operations; this includes calling load and atomic (read/modify/write) functions. So if you want to read from an image, you must declare the format.

The format defines how the shader interprets the bits of data that it reads from the image. It also defines how it converts the data passed for write operations when it writes it into the image. This allows the actual Image Format of the image to differ between what the shader sees and what is stored in the image, sometimes substantially.

The format are divided into three categories, representing the three types of image variables:

  • Floating-point layout image formats:
    • rgba32f
    • rgba16f
    • rg32f
    • rg16f
    • r11f_g11f_b10f
    • r32f
    • r16f
    • rgba16
    • rgb10_a2
    • rgba8
    • rg16
    • rg8
    • r16
    • r8
    • rgba16_snorm
    • rgba8_snorm
    • rg16_snorm
    • rg8_snorm
    • r16_snorm
    • r8_snorm
  • Signed integer layout image formats:
    • rgba32i
    • rgba16i
    • rgba8i
    • rg32i
    • rg16i
    • rg8i
    • r32i
    • r16i
    • r8i
  • Unsigned integer layout image formats:
    • rgba32ui
    • rgba16ui
    • rgb10_a2ui
    • rgba8ui
    • rg32ui
    • rg16ui
    • rg8ui
    • r32ui
    • r16ui
    • r8ui

Image operations

OpenGL provides a number of functions for accessing images through image variables.

Image operations have "image coordinates", which serve the purpose of specifying where in an image that an access should take place. Image coordinates are different from texture coordinates in that image coordinates are always signed integers in texel space.

Each image variable has a specific dimensionality for their image coordinates, which represents the dimensionality of the underlying image. An image1D​ variable takes an int​ as a coordinate. image2DArray​ takes an ivec3​; the third component is the array layer.

Cube maps (and cube map arrays) are accessed very differently from texture accesses. The image coordinate for cube map and cube map arrays are both ivec3​. The coordinate is not a direction; it is a texel coordinate within the space of the cube. The third component is the face index, as this is usually defined. Image variables effectively treat cube maps as simply a form of array texture; cube map arrays are just bigger arrays, as their third component is the layer-face index.

When accessing multisample textures, the accessing function has another parameter, an int​ that defines the sample index to read from or write to.

Let us collectively refer to the image coordinate parameters as "IMAGE_COORD". When you see this in a function definition below, this means that the function takes an image coordinate, which may include a sample index parameter if it is a multisample image.

Image size

Image Size
Core in version 4.5
Core since version 4.3
Core ARB extension ARB_shader_image_size

The size of the image for an image variable can be queries with this function:

 ivec imageSize(gimage image​);

The size of the returned vector will be the size of the image coordinate, except in the case of cube maps. For cube maps, the size will be ivec2​; the third dimension would always be 6, so it is not returned. Cube map arrays will return ivec3​, with the third component being the number of layer-faces.

Accessing any texels outside of this size results in invalid accesses, as defined below.

Image load

Image load functions read a specific location from the image into the shader.

Note: Load operations from any texel that is outside of the boundaries of the bound image will return all zeros.

Reading from an image is done with this function:

 gvec4 imageLoad(gimage image​, IMAGE_COORD);

The return value is the data from the image, converted as per the format specified by the format layout qualifier.

Image store

Image store operations write a value to a specific location in the image.

Note: Store operations to any texel that is outside the boundaries of the bound image will do nothing.

Writing to an image is done with this function:

 void imageStore(gimage image​, IMAGE_COORD, gvec4 data​);

The value in data​ will be written into the image at the given coordinate, using format conversion based on the format​ parameter given to glBindImageTexture.

Atomic operations

Atomic operations perform read/modify/write operations on a location in an image. These operations are guaranteed to be "atomic": if two shaders issue the same atomic operation on the same location in the same image, one will go first, followed by the other.

Consider a shader that reads from a location, adds 1 to it, and then writes to it. It is theoretically possible for two such shaders to read from and and write to the same location in the same image at the same time. Because of the way memory accesses are handled, it is entirely possible that this sequence of events works like this:

  • Shader A reads the image value, say 0.
  • Shader B reads the image value, also 0.
  • Shader B adds 1 to its local value of 0, becoming 1.
  • Shader B writes its local value to the image. The image now has 1.
  • Shader A adds 1 to its local value of 0, becoming 1.
  • Shader A writes its local value to the image. The image now has 1.

Atomic operations prevent this possibility entirely. Each shader's independent atomic operation will fully complete before the next one starts.

The return value of all atomic operations is the original value before modifications. The value written will be the modified value.

Atomic operations to any texel that is outside of the boundaries of the bound image will return 0 and do nothing.

Atomic limitations

There are some severe limitations on image atomic operations. First, atomics can only be used on integer images, either signed or unsigned. Second, they can only be used on images with the GL_R32I/r32i​ or GL_R32UI/r32ui​ formats.

It is very possible to use format conversions to make image operations work with other formats. But the GLSL code logic will be operating on 32-bit integers.

Below, the term "gint" means either int​ or uint​, as is appropriate for the gimage type.

Atomic set value

The value at the location in an image can be directly set via this function:

 gint imageAtomicExchange(gimage image​, IMAGE_COORDS, gint data​);

This function is called "exchange" because it effectively exchanges data​ with the value in the image. The return value of all atomic functions is the original value from the image, so it is exchanging data​ with that value.

Atomic conditional set

A very powerful operation is the conditional modify operation. This operation will write a new value only if the current value in the image is equal to the given value.

 gint imageAtomicCompSwap(gimage image​, IMAGE_COORDS, gint compare​, gint data​);

If the current value of image​ at the image coordinate is exactly equal to compare​, then data​ will be stored into the image at that location. Otherwise, the location will retain the original value.

While the function doesn't provide a direct way to tell if it actually wrote the value, it always returns the original value. So if you need to know, you can test the return value against compare​ yourself.

Atomic math

GLSL only provides a single math operation: additions:

 gint imageAtomicAdd(gimage image​, IMAGE_COORDS, gint data​);

This will read the value from the image, add data​ to it, and then write it to the image.

Obviously if you need subtract, simply negate data​. This will only work directly with signed integers. However, because GLSL 4.30 mandates two's complement, you can get the same effect with unsigned integers, since int(uint)​ and uint(int)​ conversion constructors will preserve the bit pattern.

So if you need unsigned integer subtraction, you can do this:

 imageAtomicAdd(..., uint(-data));

Obviously if this produces a negative value, you'll get a very positive value back instead, since it thinks you're doing unsigned math.

Atomic bitwise

GLSL provides 3 atomic bitwise operations: and, or, and xor:

 gint imageAtomicAnd(gimage image​, IMAGE_COORDS, gint data​);
 gint imageAtomicOr(gimage image​, IMAGE_COORDS, gint data​);
 gint imageAtomicXor(gimage image​, IMAGE_COORDS, gint data​);

These read the value from the image, perform the operation given data​, and writes the results back to the image.

Atomic minmax

GLSL provides a pair of functions that ensure that the value in the image is no larger than the given value or no smaller than it.

 gint imageAtomicMin(gimage image​, IMAGE_COORDS, gint data​);
 gint imageAtomicMax(gimage image​, IMAGE_COORDS, gint data​);

For the "Min" function, it will ensure that the image value is no smaller than data​. The "Max" function will ensure that the iamge value is no larger than data​.

Images in the context

The way to associate an image variable in GLSL works very similar to the way of associating samplers with textures.

For each shader stage, there is some number of available image units (not to be confused with texture image units). The number of image units can be queried per-stage, using GL_MAX_*_IMAGE_UNIFORMS, where * is filled in with the appropriate shader stage. Note that OpenGL 4.3 only requires Fragment Shaders and Compute Shaders to have non-zero numbers of image units; the minimum required in those cases is 8.

The total number of image units available is queried via GL_MAX_IMAGE_UNITS; this represents the total number of images you can bind at one time.

Just as with samplers, image variables reference image unit indices in the context. These are usually set with a binding layout qualifier, but they can also be set with glUniform1i or glProgramUniform1i.

After associating the image variable with its image unit, you then bind an image to the context. This is done via this function:

 void glBindImageTexture(GLuint unit​, GLuint texture​, GLint level​, GLboolean layered​, GLint layer​, GLenum access​, GLenum format​)

This binds an image from texture​ to the given image unit​, using the given mipmap level​ and array layer​.

Image bindings can be layered or non-layered, which is determined by layered​. If layered​ is GL_TRUE, then texture​ must be an Array Texture (of some type), a Cubemap Texture, or a 3D Texture. If a layered image is being bound, then the entire mipmap level specified by level​ is bound.

If the image is not layered, then the user must use the layer​ to select which array layer will be bound. If the texture does not have array layers, then this parameter must be 0. As with other functions, if this is a cubemap array texture, then layer​ is the layer-face to select.

If an array or cubemap texture is bound and is not layered, then the bound image is not an array or cubemap image. So if you bind a single array layer from a GL_TEXTURE_1D_ARRAY texture, it should be used with the image1D​ image variable type. Simiarly, layers from a 2D array texture, cubemap, 3D texture, or cubemap array should be image2D​, and a layer from a 2D multisample array should use image2DMS​.

The access​ specifies how the shader may access the image through this image unit. This can be GL_READ_ONLY, GL_WRITE_ONLY, or GL_READ_WRITE. If the shader violates this restriction, then all manor of bad things can happen, including program termination. It is a good idea to use memory qualifiers in the shader itself to catch this at shader compile-time.

The format​ parameter is an Image Format which defines the format that will be used for writes to the image. If a format qualifier is specified in the shader, this format must match it. The format must be [#Format compatibility|compatible with the texture's image format]]. The format​ parameter may only use formats from the following table:

Image Unit Format Format Qualifier Image Unit Format Format Qualifier
GL_RGBA32F rgba32f​ GL_RGBA32UI rgba32ui​
GL_RGBA16F rgba16f​ GL_RGBA16UI rgba16ui​
GL_RG32F rg32f​ GL_RGB10_A2UI rgb10_a2ui​
GL_RG16F rg16f​ GL_RGBA8UI rgba8ui​
GL_R11F_G11F_B10F r11f_g11f_b10f​ GL_RG32UI rg32ui​
GL_R32F r32f​ GL_RG16UI rg16ui​
GL_R16F r16f​ GL_RG8UI rg8ui​
GL_RGBA16 rgba16​ GL_R32UI r32ui​
GL_RGB10_A2 rgb10_a2​ GL_R16UI r16ui​
GL_RGBA8 rgba8​ GL_R8UI r8ui​
GL_RG16 rg16​ GL_RGBA32I rgba32i​
GL_RG8 rg8​ GL_RGBA16I rgba16i​
GL_R16 r16​ GL_RGBA8I rgba8i​
GL_R8 r8​ GL_RG32I rg32i​
GL_RGBA16_SNORM rgba16_snorm​ GL_RG16I rg16i​
GL_RGBA8_SNORM rgba8_snorm​ GL_RG8I rg8i​
GL_RG16_SNORM rg16_snorm​ GL_R32I r32i​
GL_RG8_SNORM rg8_snorm​ GL_R16I r16i​
GL_R16_SNORM r16_snorm​ GL_R8I r8i​

Also, note that these are the only image formats you can use for images in image load/store operations. You must use exactly these image formats and no others.

Format conversion

The Image Format of the image may be different from the format specified to the image binding function and in the shader. Values read and written are converted in the following way, assuming that the formats are compatible.

The term "source format" represents the image format of whatever the source of the operation is. Similarly, the "destination format" is the image format of whatever the destination of the operation is. Therefore:

All operations, whether read or write, function as though they were copying data from/to images with those formats. The first step of a write operation would be taking the value provided by the shader and writing that into a texture in the source image format. Then the format conversion takes place, copying the value into the destination. Similarly, the last step of read operations is reading from the destination image into the value in the shader.

Note: What follows is a description of how reading and writing appears to behave. Obviously, no hardware does it this way; this is just a convenient way of describing the behavior in terms of bits and bytes.

The conversion works based on memory copies using existing API functions. The source format values are read into memory as though calling glGetTexImage. The destination format values are written into their image as though they were uploaded via a call to one of the glTexSubImage​ functions.

Both of these functions take pixel transfer formats and types. The two effective calls will use format​s and type​s that exactly match the source/destination image format.

For example, if the source image format was GL_RGBA8UI, then the format​ and type​ passed to glGetTexImage would be GL_RGBA_INTEGER and GL_UNSIGNED_BYTE. If the destination image format for a copy is GL_RGB10_A2 (which may or may not be compatible with GL_RGBA8UI),

The destination format values are written into "memory", using values pulled from the source, as though written with one of the glTexSubImage​ calls. These calls again use pixel transfer formats and types that exactly match the destination format.

Note that (compatibility willing) we are perfectly capable of switching between floating-point and integral formats. However, converting between GL_R32F and GL_RGBA8 is not well-defined, in terms of the endian conversion. The reason is that GL_R32F will be read using a type of GL_FLOAT, but the writing will write 4 bytes in RGBA order. The R byte will be the first byte read from the GL_FLOAT, but the endian storage of GL_FLOAT is not defined. So the first byte may be the most significant or the least significant.

For any particular platform, you could assume a specific endian. But OpenGL itself provides no guarantees. If you want some guarantees on being able to play with the bits of a 32-bit floating-point texture, convert it to GL_R32UI instead and do your bit manipulation in the shader.

Format compatibility

The various image format compatibility matrix for image load/store operations is very similar to the compatiblity for texture views, though there are some differences. The first difference is that the list of image formats that can be used for images in load/store operations is more limited: only the formats mentioned above may be used.

Each of these formats has two properties: a size and a class. The size represents the bit-size of each texel. For example, GL_R32F has a size of 32; GL_RGBA32UI has a size of 128. The class represents the number of components and the bit-depth of each component. The class of GL_R32F is 1x32, while the class of GL_RGBA8 is 4x8.

The class for formats with oddball bitdepths (GL_R11F_G11F_B10F, for example) is the arrangement of components. So GL_R11F_G11F_B10F's class is 11/11/10, while GL_RGB10_A2UI's class is 10/10/10/2. This has a class match with GL_RGB10_A2.

If the texture was allocated by OpenGL (it is possible for OpenCL or other interop layers to allocate textures), then the only thing that matters for compatibility is overall texel size. So it is perfectly valid to map an GL_R32F image to an GL_RGBA8UI format and back, though again endian conversions may make this unusable in platform-neutral code.

If a texture was allocated from outside of OpenGL, then how compatibility is determined may not be by texel size; it may be by class. You must use glGetTexParameter with GL_IMAGE_FORMAT_COMPATIBILITY_TYPE to detect which. It will return either GL_IMAGE_FORMAT_COMPATIBILITY_BY_SIZE or GL_IMAGE_FORMAT_COMPATIBILITY_BY_CLASS, specifying how compatibility is determined.

You can also detect this at the image format level with an image format query using the same parameter; this will be true for all (foreign) textures using that image format and texture type.

As an alternative to querying with foreign textures, you could just stick to formats that match on class. If the classes match, the sizes also match.

Memory coherency

Writes and atomic operations via image variables are not automatically coherent. Therefore, you must do things to ensure that writes have occurred before you can read those values.