Hi,

I am currently trying to understand the differences between the coherent and volatile qualifiers. First, some quotes (in code tags for better formatting) from the ARB_shader_image_load_store extension doc:

Short description of coherent and volatile:

Qualifier       Meaning
------------    -------------------------------------------------
coherent        memory variable where reads and writes are coherent
                with reads and writes from other shader invocations
 
volatile        memory variable whose underlying value may be
                changed at any point during shader execution by
                some source other than the current shader invocation

Long description of coherent:

Memory accesses to image variables declared using the "coherent" storage
qualifier are performed coherently with similar accesses from other shader
invocations.  In particular, when reading a variable declared as
"coherent", the values returned will reflect the results of previously
completed writes performed by other shader invocations.  When writing a
variable declared as "coherent", the values written will be reflected in
subsequent coherent reads performed by other shader invocations.  As
described in the Section 2.20.X of the OpenGL Specification, shader memory
reads and writes complete in a largely undefined order.  The built-in
function memoryBarrier() can be used if needed to guarantee the completion
and relative ordering of memory accesses performed by a single shader
invocation.
 
When accessing memory using variables not declared as "coherent", the
memory accessed by a shader may be cached by the implementation to service
future accesses to the same address.  Memory stores may be cached in such
a way that the values written may not be visible to other shader
invocations accessing the same memory.  The implementation may cache the
values fetched by memory reads and return the same values to any shader
invocation accessing the same memory, even if the underlying memory has
been modified since the first memory read.  While variables not declared
as "coherent" may not be useful for communicating between shader
invocations, using non-coherent accesses may result in higher performance.

Long description of volatile:

Memory accesses to image variables declared using the "volatile" storage
qualifier must treat the underlying memory as though it could be read or
written at any point during shader execution by some source other than the
executing shader invocation.  When a volatile variable is read, its value
must be re-fetched from the underlying memory, even if the shader
invocation performing the read had previously fetched its value from the 
same memory.  When a volatile variable is written, its value must be
written to the underlying memory, even if the compiler can conclusively
determine that its value will be overwritten by a subsequent write.  Since
the external source reading or writing a "volatile" variable may be
another shader invocation, variables declared as "volatile" are
automatically treated as coherent.
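
To make the re-fetch rule concrete for myself, here is a minimal sketch of a fragment shader that polls a flag texel. The image name, the binding, the texel position and the loop bound of 64 are all made up for illustration, and I am assuming GLSL 4.20 syntax. The loop is bounded on purpose: as far as I can tell nothing guarantees that the invocation writing the flag makes progress while this one spins.

```glsl
#version 420 core

// Hypothetical image holding a single "ready" flag in texel (0, 0).
layout(binding = 0, r32ui) volatile uniform uimage2D flagImage;

layout(location = 0) out vec4 fragColor;

void main()
{
    uint flag = 0u;

    // Bounded polling loop. Because flagImage is volatile, every imageLoad
    // has to go back to the underlying memory; the compiler may not hoist
    // the load out of the loop and reuse the first value it fetched.
    for (int i = 0; i < 64 && flag == 0u; ++i) {
        flag = imageLoad(flagImage, ivec2(0, 0)).x;
    }

    // Green if the flag was observed as set, red otherwise.
    fragColor = (flag != 0u) ? vec4(0.0, 1.0, 0.0, 1.0)
                             : vec4(1.0, 0.0, 0.0, 1.0);
}
```

Without the volatile qualifier, my reading of the quote is that the compiler would be free to perform the load once and reuse the result for every loop iteration.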

Issues section:

(26) What sort of qualifiers should we provide relevant to memory
     referenced by image variables?
 
  RESOLVED:  We will support the qualifiers "coherent", "volatile",
  "restrict", and "const" to be used in image variable declarations.
 
  "coherent" is used to ensure that memory accesses from different shader
  invocations are cached coherently (i.e., one invocation will be able to
  observe writes from another when the other invocation's writes
  complete).  This coherence may mean the use of "coherent"-qualified
  image variables may perform more slowly than of otherwise equivalent
  unqualified variables.
 
  "volatile" behaves as in C, and may be needed if an algorithm requires
  reading image memory that may be written asynchronously by other shader
  invocations.

My understanding of their uses:

coherent:

  • only useful for dependent shader invocations (e.g. fragment shader invocations generated from a complete primitive after the vertex shader has processed its vertices)
  • memoryBarrier() goes hand in hand with this qualifier: it flushes caches/shared memory for coherent-qualified variables and determines the order of memory accesses, so you can decide when to flush (by the way: is there an implicit memoryBarrier() call at the end of the shader when it has coherent-qualified variables but never calls memoryBarrier() itself?)
  • variables not qualified as coherent might be L-cached or resident in shared memory, so threads spawned (dependently) on other SIMD processors might not observe their values directly
  • use case: e.g. in a dependent shader invocation, read values from an image that were written by an invocation in a previous shader stage (the values might still be cached, so they have to be flushed via memoryBarrier())
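
The write side of what I have in mind for coherent could be sketched as follows. The two images, their bindings and the payload value 42 are hypothetical, and I am assuming GLSL 4.20 syntax. Whether and when another invocation actually observes these writes still depends on the execution order, which the spec leaves largely undefined.

```glsl
#version 420 core

// Hypothetical images: one carries the payload, the other a "written" flag.
layout(binding = 0, r32ui) coherent uniform uimage2D dataImage;
layout(binding = 1, r32ui) coherent uniform uimage2D flagImage;

void main()
{
    ivec2 p = ivec2(gl_FragCoord.xy);

    // Write the payload first.
    imageStore(dataImage, p, uvec4(42u));

    // Make sure the payload write has completed before the flag write
    // is issued, so the two stores from this invocation stay ordered.
    memoryBarrier();

    // Signal that the payload for this texel is in place.
    imageStore(flagImage, p, uvec4(1u));
}
```

The idea is that an invocation which reads flagImage through a coherent variable and sees 1 should then also see 42 in dataImage, provided it issues its own memoryBarrier() between the two reads.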


volatile:

  • implicitly includes coherent (volatile variables are automatically treated as coherent)
  • always fetches values directly from global memory (no caching)
  • always writes values directly to global memory (no caching)
  • might be more expensive than coherent (no temporary caching or memory access optimizations allowed at all)
  • use case: e.g. atomically increment a texel in a volatile-qualified image from independent shader invocations (which might run on different SIMD processors and use shared memory for the atomic increment)
  • memoryBarrier() is only useful for preventing (compiler) reordering of memory accesses, since volatile already guarantees a direct write to global memory
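
The atomic increment use case could look roughly like this. The image name and binding are hypothetical, and I am assuming GLSL 4.20 syntax. I have declared the image volatile to match my bullet above; whether imageAtomicAdd() really needs the qualifier to be atomic across invocations is one of the things I am unsure about.

```glsl
#version 420 core

// Hypothetical r32ui image used as a per-pixel counter.
layout(binding = 0, r32ui) volatile uniform uimage2D counterImage;

void main()
{
    // Atomically add 1 to the texel under this fragment. The return value
    // is the texel's content before the addition.
    uint previous = imageAtomicAdd(counterImage, ivec2(gl_FragCoord.xy), 1u);
}
```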


Are there other important points I forgot? Or are there any errors in my understanding of the doc? Let us collect some more facts for a better understanding of these two qualifiers.