Hi all,
As you may have already guessed, my question is:
Why are imageAtomic operations only defined for single-channel data (int, uint, float)? I'm not a hardware expert, but from my naive point of view I don't see a reason for this.
To make my point a bit more clear:
I’m particularly interested in something like: imageAtomicAdd(img, iCoord, ivec4(data));
However, I do not necessarily expect ALL components to be read/written in one atomic operation. I would be totally satisfied if each component were updated by its own atomic.
Maybe giving such a function the same name as the others, imageAtomicAdd(img, iCoord, ivec4(data));, would be confusing. But an additional "component" index, like imageAtomicAdd(img, iCoord, component, data);, would already be enough.
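To illustrate the per-component idea, here is a minimal sketch of how it could be emulated with the existing single-channel atomics. It assumes a storage layout that is not in the post above: the image is bound as r32i and is four times wider than the logical image, so each logical texel occupies four adjacent texels in x. The helper imageAtomicAddComponent is hypothetical, not a GLSL built-in, and the data values are placeholders.

```glsl
#version 430
layout(local_size_x = 8, local_size_y = 8) in;

// Assumed layout: logical texel (x, y) is stored in the four r32i texels
// (4*x + 0 .. 4*x + 3, y), one per component.
layout(binding = 0, r32i) uniform iimage2D img;

// Hypothetical helper: atomically adds 'data' to one component of a
// logical texel and returns the previous value of that component.
int imageAtomicAddComponent(ivec2 coord, int component, int data)
{
    ivec2 c = ivec2(coord.x * 4 + component, coord.y);
    return imageAtomicAdd(img, c, data);
}

void main()
{
    ivec2 iCoord = ivec2(gl_GlobalInvocationID.xy);
    ivec4 data   = ivec4(1, 2, 3, 4); // placeholder values

    // Each component is updated by its own atomic; the four updates
    // together are NOT atomic as a group.
    for (int i = 0; i < 4; ++i)
        imageAtomicAddComponent(iCoord, i, data[i]);
}
```

The obvious cost is that such an image can no longer be sampled as a normal RGBA texture without an extra repacking pass.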
Since I don't know how atomics are implemented in hardware, from my point of view it would even be OK if the entire texel were locked, although I'm only accessing a single component.
-> Nobody who uses atomics should expect the best performance anyway, but maybe the API should give the user the ability to sacrifice some performance for more correctness.
Is there a way to use an additional image as a guard/mutex? Something like:
while(0 != imageAtomicCompSwap(mutexImg, texel, 0, 1)); // busy wait
// in mutex
data = imageLoad(img, texel, …);
…
imageStore(img, texel, …);
memoryBarrierImage(); // eehm, possible at all?
imageStore(mutexImg, texel, ivec4(0)); // release the mutex (imageStore takes a 4-component value)
I think the problem is how to make sure that other shader invocations see the store to img BEFORE they see the store to mutexImg that releases the mutex.
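For reference, a minimal sketch of the spin-lock pattern that is commonly suggested for this, under several assumptions of mine: both images are declared coherent, mutexImg is an r32i image cleared to 0 before the dispatch, img is rgba32i, and the "+1" update is only a placeholder for the real work. The lock is released with imageAtomicExchange instead of imageStore, and the critical section sits inside the loop body rather than after a bare busy-wait, because invocations that execute in lockstep can otherwise spin forever. GLSL does not guarantee forward progress for a loop like this, so it may still hang on some hardware or drivers.

```glsl
#version 430
layout(local_size_x = 8, local_size_y = 8) in;

layout(binding = 0, rgba32i) coherent uniform iimage2D img;
layout(binding = 1, r32i)    coherent uniform iimage2D mutexImg; // assumed cleared to 0

void main()
{
    // All invocations of a work group target the same texel here,
    // purely to create contention for the example.
    ivec2 texel = ivec2(gl_WorkGroupID.xy);

    bool done = false;
    while (!done) {
        // Try to acquire: swap 0 -> 1. The return value is the previous
        // content, so 0 means this invocation now holds the lock.
        if (imageAtomicCompSwap(mutexImg, texel, 0, 1) == 0) {
            ivec4 data = imageLoad(img, texel);
            data += ivec4(1);                // placeholder update
            imageStore(img, texel, data);

            // Order the store to img before the release below.
            memoryBarrierImage();

            // Release the lock.
            imageAtomicExchange(mutexImg, texel, 0);
            done = true;
        }
    }
}
```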
Any other suggestions?
Thanks!
Ambator