Hi, I have patched the excellent icare3d demo
http://blog.icare3d.org/2010/07/opengl-40-abuffer-v20-linked-lists-of.html
for AMD compatibility. The demo plus source code is here:
http://dl.dropbox.com/u/1416327/ABufferLinkedListAMD.rar
The problem, in short: the original demo used the NV_shader_buffer_load and NV_shader_buffer_store extensions even in tex mode, so I patched it to use only EXT_shader_image_load_store.
The patched demo still works OK on NVIDIA hardware, but on AMD hardware it still doesn’t work…
The shaders compile fine, but after linking it seems that no GL function works…
Also, I had to make two fixes that may point to bugs in the AMD driver, since NVIDIA doesn’t require them:
1. The layout qualifier has to be the first thing in image declarations.
This doesn’t work on AMD (it works on NVIDIA):
coherent uniform layout(size1x32) uimage2D abufferPageIdxImg;
This works:
layout(size1x32) coherent uniform uimage2D abufferPageIdxImg;
2. The shader compiler says #version has to be the first thing in the shader, even before macros.
Is this really required by the spec, i.e. shouldn’t macro preprocessing be done earlier, so that macros placed before #version work correctly as they do on NVIDIA drivers?
AMD folks monitoring these forums, please download the demo and try to fix the errors in the AMD implementation…
Summary of changes:
* Removed the “inline” keywords in the shaders, as they don’t seem to work even on NVIDIA drivers…
* Changed every float-to-int cast from (int)(a) or (int) a to int(a), which is the correct GLSL syntax…
* (AMD bug?) The layout qualifier for images has to come first, so instead of
coherent uniform layout(size1x32) uimage2D abufferPageIdxImg;
use
layout(size1x32) coherent uniform uimage2D abufferPageIdxImg;
* Set the demo to use tex mode for everything at startup, i.e.:
int pABufferUseTextures=1;
int pSharedPoolUseTextures=1;
even though the default tex mode had
int pABufferUseTextures=1;
int pSharedPoolUseTextures=0;
I have defined int pSharedPageUseTextures=1; for the remaining item (curSharedPage) that was still using NV buffer load/store…
and changed the code accordingly to use tex stores.
Some of the code changes:
setShadersGlobalMacro("CURSHAREDPAGE_USE_TEXTURES",pSharedPageUseTextures);
if(pSharedPageUseTextures)
{
    if(!curSharedPageTexID)
        glGenTextures(1, &curSharedPageTexID);
    glActiveTexture(GL_TEXTURE5);
    glBindTexture(GL_TEXTURE_BUFFER, curSharedPageTexID);
    //Associate BO storage with the texture
    glTexBuffer(GL_TEXTURE_BUFFER, GL_R32F, curSharedPageBuffID);
    glBindImageTextureEXT(5, curSharedPageTexID, 0, false, 0, GL_READ_WRITE, GL_R32UI);
    checkGLError ("curSharedPageTexID");
}
if(pSharedPageUseTextures)
{
    glProgramUniform1iEXT(prog, glGetUniformLocation(prog, "curSharedPageImg"), 5);
}
else
{
    glProgramUniformui64NV(prog, glGetUniformLocation(prog, "d_curSharedPage"), curSharedPageAddress);
}
I also removed the use of the shader buffer load and store functions, even for buffer objects associated with texture buffers; they are now called only when the extension is available:
if(havebufferext)
{
    glMakeBufferResidentNV(GL_TEXTURE_BUFFER, GL_READ_WRITE);
    glGetBufferParameterui64vNV(GL_TEXTURE_BUFFER, GL_BUFFER_GPU_ADDRESS_NV, &sharedPageListAddress);
}
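How havebufferext gets its value is not shown here. A minimal sketch of one way to compute it, assuming the space-separated extension string has already been obtained from the driver; the helper name hasExtension is hypothetical and not part of the demo. It compares whole tokens, so a name that is only a prefix of another extension name does not match by accident:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper (not from the demo): returns true if `name` appears as a
// complete, whitespace-separated token in the extension string `extensions`.
bool hasExtension(const std::string& extensions, const std::string& name) {
    std::istringstream tokens(extensions);
    std::string token;
    while (tokens >> token) {
        if (token == name) {
            return true;  // exact token match only
        }
    }
    return false;
}
```

With something like this, havebufferext could be set from hasExtension(extString, "GL_NV_shader_buffer_load"), where extString is whatever string the application queried from GL.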
Finally, normal GLUT context creation seems to be OK on AMD, so the explicit context setup is disabled:
int glutf=0;
if(glutf)
{
    //Init OpenGL 4.0 context
    glutInitContextVersion (4, 0);
    glutInitContextProfile(GLUT_CORE_PROFILE );
    glutInitContextProfile( GLUT_COMPATIBILITY_PROFILE); //needed for glutBitmapCharacter
    glutInitContextFlags (GLUT_FORWARD_COMPATIBLE ); //Can be used for compatibility with OpenGL 2.x
}
Finally, another possible AMD bug: the shader compiler says #version has to be the first thing in the shader, even before macros.
Is this really required by the spec, i.e. shouldn’t macro preprocessing be done earlier, so that macros placed before #version work correctly as they do on NVIDIA drivers?
See the code for how the macros are now put right after #version…
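For anyone who doesn’t want to dig through the archive, here is a minimal sketch of the idea, not the demo’s actual code. The function name buildShaderSource is hypothetical, and it assumes the source has at most one #version directive starting at the beginning of a line. As far as I can read the GLSL spec, #version may only be preceded by comments and white space, so keeping it first looks like the portable choice on both vendors:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not from the demo): returns the shader source with the
// #version line first, then one "#define NAME VALUE" line per macro, then the
// rest of the original source. If there is no #version line, the macros are
// simply placed in front of the source.
std::string buildShaderSource(
    const std::string& source,
    const std::vector<std::pair<std::string, int>>& macros) {
    // Look for "#version" at the very start of a line.
    std::size_t start = std::string::npos;
    std::size_t pos = 0;
    while ((pos = source.find("#version", pos)) != std::string::npos) {
        if (pos == 0 || source[pos - 1] == '\n') {
            start = pos;
            break;
        }
        pos += 8;  // length of "#version"
    }

    std::string versionLine;
    std::string body = source;
    if (start != std::string::npos) {
        std::size_t end = source.find('\n', start);
        end = (end == std::string::npos) ? source.size() : end + 1;
        versionLine = source.substr(start, end - start);
        body = source.substr(0, start) + source.substr(end);
        if (versionLine.back() != '\n') {
            versionLine += '\n';
        }
    }

    std::string result = versionLine;
    for (const auto& macro : macros) {
        result += "#define " + macro.first + " " +
                  std::to_string(macro.second) + "\n";
    }
    result += body;
    return result;
}
```

For example, a source starting with "#version 400" plus the macro CURSHAREDPAGE_USE_TEXTURES=1 comes out with the #define on the second line instead of the first.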
I forgot to update the glMemoryBarrierEXT call in the code for correct compatibility, i.e. replace the shader_buffer_load/store barrier
glMemoryBarrierEXT(GL_SHADER_GLOBAL_ACCESS_BARRIER_BIT_NV);
with ALL_BARRIER_BITS_EXT
from image_load_store…
It can probably be tuned better using TEXTURE_UPDATE_BARRIER_BIT_EXT?
For reference, the NV bit is
SHADER_GLOBAL_ACCESS_BARRIER_BIT_NV    0x00000010
and the EXT bits are:
VERTEX_ATTRIB_ARRAY_BARRIER_BIT_EXT    0x00000001
ELEMENT_ARRAY_BARRIER_BIT_EXT          0x00000002
UNIFORM_BARRIER_BIT_EXT                0x00000004
TEXTURE_FETCH_BARRIER_BIT_EXT          0x00000008
SHADER_IMAGE_ACCESS_BARRIER_BIT_EXT    0x00000020
COMMAND_BARRIER_BIT_EXT                0x00000040
PIXEL_BUFFER_BARRIER_BIT_EXT           0x00000080
TEXTURE_UPDATE_BARRIER_BIT_EXT         0x00000100
BUFFER_UPDATE_BARRIER_BIT_EXT          0x00000200
FRAMEBUFFER_BARRIER_BIT_EXT            0x00000400
TRANSFORM_FEEDBACK_BARRIER_BIT_EXT     0x00000800
ATOMIC_COUNTER_BARRIER_BIT_EXT         0x00001000
ALL_BARRIER_BITS_EXT                   0xFFFFFFFF
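Since these are plain bit flags, a narrower barrier mask is just a bitwise OR of the bits that matter. A minimal sketch using the values from the list above; the helper names narrowBarrierMask and covers are hypothetical, and whether this particular combination is actually sufficient for the demo is an untested assumption:

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the list above (names without the GL_ prefix so they
// cannot clash with real GL headers).
constexpr std::uint32_t SHADER_GLOBAL_ACCESS_BARRIER_BIT_NV = 0x00000010;
constexpr std::uint32_t SHADER_IMAGE_ACCESS_BARRIER_BIT_EXT = 0x00000020;
constexpr std::uint32_t TEXTURE_UPDATE_BARRIER_BIT_EXT      = 0x00000100;
constexpr std::uint32_t ALL_BARRIER_BITS_EXT                = 0xFFFFFFFF;

// Hypothetical narrower mask: image load/store accesses plus texture updates.
// Assumption: not verified to be enough for the A-buffer demo.
constexpr std::uint32_t narrowBarrierMask() {
    return SHADER_IMAGE_ACCESS_BARRIER_BIT_EXT | TEXTURE_UPDATE_BARRIER_BIT_EXT;
}

// Returns true if every bit set in `bit` is also set in `mask`.
constexpr bool covers(std::uint32_t mask, std::uint32_t bit) {
    return (mask & bit) == bit;
}
```

ALL_BARRIER_BITS_EXT covers every bit, including the NV one, which is why it is the safe replacement; the narrower mask deliberately leaves the NV bit out.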
Also note that the texture barrier extension probably becomes obsolete, since instead of a texture barrier one can use
glMemoryBarrierEXT(TEXTURE_UPDATE_BARRIER_BIT_EXT)