Bugs in 13.4 GL 4.3 exts: compute shader, SBOs and no_attachments



oscarbg
04-29-2013, 06:43 AM
Hi,
I have been testing the new OGL compute shader and shader storage buffer object extensions and found the following bugs (13.4 on a 7950):
(please note that all the samples I use for testing work correctly on Nvidia OGL 4.3 cards)
*using atomicMax and atomicMin on shared variables hangs the GLSL compiler; others like atomicOr are OK!

groupshared uint ldsZMax;
uint z;
atomicMax( ldsZMax, z );

*using a compute shader with the following launch size and shared array usage:
#define BLOCK_SIZE 32
layout (local_size_x = BLOCK_SIZE, local_size_y = BLOCK_SIZE) in;
shared double As[BLOCK_SIZE*BLOCK_SIZE];
shared double Bs[BLOCK_SIZE*BLOCK_SIZE];
fails to link with:
Compute shader(s) failed to link.
Compute link error: HW_UNSUPPORTED.
Compute shader not supported by hardware

Reducing BLOCK_SIZE to less than 32 seems to work. I have tested that using
layout (local_size_x = 32, local_size_y = 32) in;
on its own isn't an issue, so 32 should work: for this configuration each of the two shared arrays is 8192 bytes (sizeof(double)*32*32), so total shared mem usage is 2*8192 and is equal to reported max size (GL_MAX_COMPUTE_SHARED_MEMORY_SIZE: 32768). I verified that the issue is the shared mem usage, as using something like
(with BLOCK_SIZE=32):
shared double As[BLOCK_SIZE*BLOCK_SIZE-1];
shared double Bs[BLOCK_SIZE*BLOCK_SIZE];
seems to compile, so please fix this so that the full 32768 bytes of shared mem can be used, not only 32767 bytes.

*using SBOs in non-compute shaders (like fragment shaders) seems not to work correctly
*querying GL_MAX_COMPUTE_WORK_GROUP_COUNT and GL_MAX_COMPUTE_WORK_GROUP_SIZE, I get this error through debug_output:
glGetIntegerv parameter <pname> has an invalid enum '0x91be' (GL_INVALID_ENUM)
other new queries like GL_MAX_COMPUTE_ATOMIC_COUNTERS seem to work.

Related: although the no_attachments extension is not advertised, the new entry points are present, so I played with it using the framebuffer default parameters, and a simple test seems to work on 79xx but not on the 58xx series:
glGenFramebuffers(1,&noat);
glBindFramebuffer(GL_FRAMEBUFFER_EXT,noat);
glFramebufferParameteri(GL_FRAMEBUFFER_EXT, GL_FRAMEBUFFER_DEFAULT_WIDTH, w);
glFramebufferParameteri(GL_FRAMEBUFFER_EXT, GL_FRAMEBUFFER_DEFAULT_HEIGHT, h);
A sample using this works on the 7xxx series but not on the 5xxx series.

Alfonse Reinheart
04-29-2013, 07:43 PM
so total shared mem usage is 2*8192 and is equal to reported max size (GL_MAX_COMPUTE_SHARED_MEMORY_SIZE: 32768).

Um, 2 * 8192 is not 32768.
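
To spell out the arithmetic, here is a minimal C sketch. The helper names are made up for illustration, and it assumes a GLSL double takes 8 bytes and that the shared arrays are packed without any padding, which a real driver is not obliged to do.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: a GLSL double occupies 8 bytes of shared memory. */
#define GLSL_DOUBLE_BYTES 8u

/* Bytes taken by a shared array of `count` doubles (no padding assumed). */
static size_t shared_double_bytes(size_t count)
{
    return count * GLSL_DOUBLE_BYTES;
}

/* Returns 1 if two shared double arrays fit within `limit_bytes`, else 0. */
static int fits_in_shared(size_t as_count, size_t bs_count, size_t limit_bytes)
{
    assert(limit_bytes > 0);
    return shared_double_bytes(as_count) + shared_double_bytes(bs_count)
           <= limit_bytes;
}
```

With BLOCK_SIZE = 32, shared_double_bytes(32*32) is 8192, so As and Bs together take 16384 bytes, which is half of the reported 32768. The variant where As is one element shorter takes 16376 bytes.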

oscarbg
04-30-2013, 08:33 AM
OK, that's embarrassing. I have a high degree in maths, but my math is not good. Still, AMD's OGL implementation is worse: I thought the bug was due to requesting the exact size of shared mem, but now the bug is that I can't request even half of its size.

Alfonse Reinheart
04-30-2013, 10:25 AM
It could be because it's exactly 16384. What happens if you allocate more than that size of memory?