Thread: AMD conditional rendering broken

  1. #1
    Member Regular Contributor Nowhere-01's Avatar
    Join Date
    Feb 2011
    Location
    Novosibirsk
    Posts
    251

    so conditional rendering always passes if i use GL_QUERY_NO_WAIT, and stalls the GPU driver if i use GL_QUERY_WAIT as the mode parameter. i couldn't find much info about this issue, except for: http://www.g-truc.net/post-0300.html
    in my case, it behaves differently. i have an HD 6670 with Catalyst 13.1 (Core Profile Forward-Compatible Context 9.12.0.0).

    example of rendering(simplified):
    //render to occlusion query
    Code :
    void renderToOcclusion()
    {
        // only enabled, discardable objects inside the frustum get an occlusion test
        if(!isEnabled || !isInFrustum || !isDiscardable) {
            return;
        }
        glBeginQueryARB(GL_SAMPLES_PASSED_ARB, occQuery);
        modelViewMatrix = currentViewMatrix * modelMatrix;

        //RENDER
        lodAvailable = modelStorage[modelId].lodAvailable;
        glBindVertexArray(modelStorage[modelId].data[lodAvailable].vertexArrayObject);

        // off is the byte offset of the current surface inside the index buffer
        for(unsigned s = 0, off = 0; s < modelStorage[modelId].data[lodAvailable].numSurfaces; s++)
        {
            SShader[3].applyProgram();

            glUniformMatrix4fv(SShader[shaderId].shaderSet[programId].uniform_modelViewMatrix, 1, GL_FALSE, glm::value_ptr(modelViewMatrix));
            glUniformMatrix4fv(SShader[shaderId].shaderSet[programId].uniform_projectionMatrix, 1, GL_FALSE, glm::value_ptr(currentProjectionMatrix));

            glDrawElements(GL_TRIANGLES, modelStorage[modelId].data[lodAvailable].numIndices[s], GL_UNSIGNED_SHORT, BUFFER_OFFSET(off));
            off += modelStorage[modelId].data[lodAvailable].numIndices[s] * sizeof(short);
        }

        glEndQueryARB(GL_SAMPLES_PASSED_ARB);

        //debug: busy-wait until the result is available, then read it back
        unsigned numSamples = 0;
        unsigned occQueryAvailable = 0;
        while(!occQueryAvailable) {
            glGetQueryObjectuiv(occQuery, GL_QUERY_RESULT_AVAILABLE, &occQueryAvailable);
        }
        glGetQueryObjectuiv(occQuery, GL_QUERY_RESULT, &numSamples);
        LOG << numSamples << endl;   //!THAT OUTPUTS CORRECT NUMBER OF SAMPLES!
    }

    //render:
    Code :
    if(isEnabled && isInFrustum)
    {
        if(isDiscardable)
            glBeginConditionalRender(occQuery, GL_QUERY_NO_WAIT);

        modelViewMatrix = defaultViewMatrix * modelMatrix;
        normalMatrix = glm::transpose(glm::inverse(glm::mat3(modelViewMatrix)));

        //RENDER
        drawGeometry();
        if(isDiscardable)
            glEndConditionalRender();
    }

    as described above, that results in always rendering all objects with GL_QUERY_NO_WAIT, and a freeze with GL_QUERY_WAIT. but glGetQueryObjectuiv(occQuery, GL_QUERY_RESULT, &numSamples) returns correct values, and if i use its results instead of conditional rendering, i get correct occlusion. but the old occlusion query is such a pain in the ass to synchronize. i just don't want to use it anymore. i did expect AMD to still have minor problems with their OpenGL implementation, but this is ridiculous. i'm in a debugging nightmare.
    Last edited by Nowhere-01; 02-13-2013 at 10:38 AM. Reason: null

  2. #2
    Advanced Member Frequent Contributor
    Join Date
    Dec 2007
    Location
    Hungary
    Posts
    985
    If you use GL_QUERY_NO_WAIT and the draw command that you perform the occlusion query on didn't finish at the time you call BeginConditionalRender, then it should in fact perform the conditional draws. That's how it should work.

    If you use GL_QUERY_WAIT and the draw command that you perform the occlusion query on didn't finish at the time you call BeginConditionalRender, then it should wait for the result and only continue afterwards (in most cases, however, this wait happens on the GPU, not in the driver). Once again, that's how it should work.

    I guarantee you that if you put a glFinish between issuing the queries and using them with BeginConditionalRender with GL_QUERY_NO_WAIT (just as an experiment), it will not draw any occluded object.

    Once again, don't forget that with GL_QUERY_NO_WAIT the condition is effectively a no-op if the results are not available on the GPU at that time: it will simply draw everything as usual. This is the expected behavior.
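
    The two modes described in this post can be summarized as a small decision function. This is only a toy CPU-side model of the behavior, not driver code: ConditionalMode, QuerySnapshot and drawsExecute are hypothetical names, and a real implementation is merely allowed (not required) to draw unconditionally under GL_QUERY_NO_WAIT.

```cpp
#include <cassert>

// Toy model of conditional rendering. All names here are hypothetical.
enum class ConditionalMode { Wait, NoWait };

// State of the occlusion query at the moment the GPU reaches the
// conditional block.
struct QuerySnapshot {
    bool available;          // has the result landed yet?
    unsigned samplesPassed;  // final sample count of the query
};

// Returns true if the draws inside the conditional block are executed.
inline bool drawsExecute(ConditionalMode mode, const QuerySnapshot& q) {
    if (!q.available && mode == ConditionalMode::NoWait) {
        return true;  // result unknown and we refuse to wait: draw everything
    }
    // From here on the result is known, either because it was already
    // available or because Wait blocks until it is.
    assert(q.available || mode == ConditionalMode::Wait);
    return q.samplesPassed != 0;
}
```

    In this model, an occluded object (zero samples) is still drawn under NoWait whenever the result has not arrived, which matches the "always passes" symptom when the query and the conditional block are issued too close together.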
    Disclaimer: This is my personal profile. Whatever I write here is my personal opinion and none of my statements or speculations are anyhow related to my employer and as such should not be treated as accurate or valid and in no case should those be considered to represent the opinions of my employer.
    Technical Blog: http://www.rastergrid.com/blog/

  3. #3
    Member Regular Contributor Nowhere-01's Avatar
    Join Date
    Feb 2011
    Location
    Novosibirsk
    Posts
    251
    ok, it's my fault. i was too optimistic, because i used to test my application on a GF 560Ti, which performs 5 to 7 times better and where there's actually no need to wait for the occlusion query. and i thought it was because conditional rendering handles it better. i assumed that conditional rendering is using something like "last available query result" so if latest result for query object is not available yet, it uses previous available cached result... but that would be sane.

    and now some interesting results:

    putting glFinish(); after each glEndQuery(...); does nothing. it behaves exactly the same way - all objects pass or gpu stall.
    i also tried glFinish(); after the whole occlusion query pass, no result.

    i also tried to keep
    Code :
    unsigned occQueryAvailable = 0;
    while(!occQueryAvailable) {
        glGetQueryObjectuiv(occQuery, GL_QUERY_RESULT_AVAILABLE, &occQueryAvailable);
    }
    after each query. same, doesn't fix conditional rendering.

    and i think you misunderstood me. GL_QUERY_WAIT doesn't just cause a temporary freeze or slowdown; i would expect that. it causes a GPU stall long enough to crash the driver.

    and if it works like you describe, then what exactly can you achieve with it? i fail to see the point of that functionality. in common rendering it makes some cpu work unavoidable (choosing the render path, binding textures, passing uniforms), but what does it provide in exchange?
    Last edited by Nowhere-01; 02-13-2013 at 12:27 PM.

  4. #4
    Senior Member OpenGL Guru
    Join Date
    May 2009
    Posts
    4,948
    Quote Originally Posted by Nowhere-01
    that results in always rendering all objects with GL_QUERY_NO_WAIT
    I'm curious: how do you detect that it is being rendered?

    Quote Originally Posted by aqnuep
    If you use GL_QUERY_NO_WAIT and the draw command that you perform the occlusion query on didn't finish at the time you call BeginConditionalRender
    That should be "at the time the GPU executes the BeginConditionalRender part". The query shouldn't have to be finished yet.

    The general idea with conditional render with NO_WAIT is similar to PBOs; as long as you can put sufficient distance between the query and the conditional part, you can get something useful out of it. If you render them one right after the other, it'll never be useful.

    Quote Originally Posted by Nowhere-01
    i assumed that conditional rendering is using something like "last available query result" so if latest result for query object is not available yet, it uses previous available cached result... but that would be sane.
    Sane? No, that would be disastrous. There's no guarantee that the user uses the same query object for the same rendered object. Users can, and in many cases do, have circular buffers of query objects that they rotate through. Query objects do not have to be associated with a particular "object".
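
    The rotation pattern mentioned here can be sketched as a small ring of query names. This is a minimal illustration under assumptions: QueryRing and its methods are hypothetical, and the ids are plain integers standing in for names that would come from glGenQueries, so the sketch does not need a GL context.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical ring of query object names that is advanced once per frame.
class QueryRing {
public:
    explicit QueryRing(std::vector<unsigned> ids) : ids_(std::move(ids)) {
        assert(!ids_.empty());
    }

    // Query to begin this frame (would be passed to glBeginQuery).
    unsigned issueSlot() const { return ids_[frame_ % ids_.size()]; }

    // Oldest query in the ring: it was issued size()-1 frames ago, so it has
    // had the most time to complete (would be passed to
    // glBeginConditionalRender or read back with glGetQueryObjectuiv).
    unsigned consumeSlot() const { return ids_[(frame_ + 1) % ids_.size()]; }

    // True once every slot has been issued at least once; before that,
    // consumeSlot() names a query that was never begun.
    bool warmedUp() const { return frame_ + 1 >= ids_.size(); }

    void nextFrame() { ++frame_; }

private:
    std::vector<unsigned> ids_;
    std::size_t frame_ = 0;
};
```

    With three slots, the query consumed in a frame is the one issued two frames earlier, which is one way of putting distance between the query and the conditional part. It also shows why a driver could not safely fall back to a "previous result": the same query name may have measured something unrelated the last time it was used.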

    Quote Originally Posted by Nowhere-01
    and i think you misunderstood me. GL_QUERY_WAIT doesn't just cause a temporary freeze or slowdown; i would expect that. it causes a GPU stall long enough to crash the driver.
    Does this happen when you don't query the number of samples passed?

  5. #5
    Advanced Member Frequent Contributor
    Join Date
    Dec 2007
    Location
    Hungary
    Posts
    985
    Quote Originally Posted by Nowhere-01 View Post
    ok, it's my fault. i was too optimistic, because i used to test my application on a GF 560Ti, which performs 5 to 7 times better and where there's actually no need to wait for the occlusion query.
    If the query does not finish in time, the usual practice is to put more work between performing the query and using its result for conditional rendering. Also, it could depend on when the driver actually decides to submit the commands to the GPU, so there is no apples-to-apples comparison here.

    Quote Originally Posted by Nowhere-01 View Post
    putting glFinish(); after each glEndQuery(...); does nothing. it behaves exactly the same way - all objects pass or gpu stall.
    This sounds like a potential driver bug (if you did everything properly).

    Quote Originally Posted by Nowhere-01 View Post
    and i think you misunderstood me. GL_QUERY_WAIT doesn't just cause a temporary freeze or slowdown; i would expect that. it causes a GPU stall long enough to crash the driver.
    Yes, I misunderstood you. This definitely sounds like a driver bug (once again, assuming you did everything properly).

  6. #6
    Advanced Member Frequent Contributor
    Join Date
    Dec 2007
    Location
    Hungary
    Posts
    985
    Quote Originally Posted by Alfonse Reinheart View Post
    That should be "at the time the GPU executes the BeginConditionalRender part". The query shouldn't have to be finished yet.
    Yes, sorry, I used the wrong wording; I meant when BeginConditionalRender is actually processed on the GPU.

  7. #7
    Member Regular Contributor Nowhere-01's Avatar
    Join Date
    Feb 2011
    Location
    Novosibirsk
    Posts
    251
    Quote Originally Posted by Alfonse Reinheart View Post
    I'm curious: how do you detect that it is being rendered?
    wireframe mode, i also have some lens-flares and my occlusion FBO is very low-res, so small distant objects should disappear

    Quote Originally Posted by Alfonse Reinheart View Post
    The general idea with conditional render with NO_WAIT is similar to PBOs; as long as you can put sufficient distance between the query and the conditional part, you can get something useful out of it. If you render them one right after the other, it'll never be useful.
    that was my plan B. if it doesn't handle things in the magic way i expected, at least i had several passes between rendering to the queries and actually using them.

    Quote Originally Posted by Alfonse Reinheart View Post
    Does this happen when you don't query the number of samples passed?
    do you mean removing "glGetQueryObjectuiv(occQuery, GL_QUERY_RESULT, &numSamples)" or using a query with GL_ANY_SAMPLES_PASSED? either way, the answer is yes: it still stalls.

    Quote Originally Posted by aqnuep View Post
    This sounds like a potential driver bug (if you did everything properly).
    well... i just tried putting Sleep(1000); after my occlusion query pass. with GL_QUERY_WAIT it renders several frames and then stalls. and the fact that manually waiting for GL_QUERY_RESULT_AVAILABLE doesn't fix the issue (without the sleep it stalls on the 1st frame) makes it weird. i have no idea what's going on there. but i'm leaning towards the conclusion that it is my mistake, because i'm quite confused and tired now.
    Last edited by Nowhere-01; 02-13-2013 at 02:22 PM.

  8. #8
    Super Moderator Frequent Contributor Groovounet's Avatar
    Join Date
    Jul 2004
    Posts
    934
    I haven't read the entire thread, so sorry if I am off topic, but I wonder whether the issue is that you expect too much of conditional rendering.

    With conditional rendering, only Draw and Clear commands are affected, not all commands.

    I recently noticed an AMD OpenGL implementation bug where Clear commands are not affected by conditional rendering.

  9. #9
    Member Regular Contributor Nowhere-01's Avatar
    Join Date
    Feb 2011
    Location
    Novosibirsk
    Posts
    251

    no, the issue was, in short: no matter what i did, conditional rendering always passed (it didn't affect glDrawElements) if i used it with GL_QUERY_NO_WAIT, and it always stalled the GPU with an infinite wait if i used GL_QUERY_WAIT. the same code with the same scene worked if i requested the occlusion query result the normal way with glGetQueryObjectuiv.

    now i switched back to normal occlusion queries. and i'm "happy" to report that they are mostly broken on AMD cards too. but the issue is different: there is a HUGE delay. if i render about 12 objects of varying size to occlusion queries, using a 256x256 FBO with color mask disabled, i have to wait 25ms until the occlusion query is ready. this is already an awful result. i was testing it with an HD 6670 on the latest drivers (Catalyst 13.1). it seems heavily fillrate limited, because testing about 50 small billboards (lens flares for light sources) takes about 1-2ms, and rendering bounding boxes instead of objects doesn't help at all. but bear with me, i get 25ms if i do this after the occlusion query pass is finished:
    Code :
    while(!occQueryAvailable){     
        glGetQueryObjectuiv(Objects[lastObject].occQuery,GL_QUERY_RESULT_AVAILABLE, &occQueryAvailable);  
    }

    you don't get occlusion query results like that, right? you do some other stuff and then request the result. maybe it will be alright...
    so i changed my code to check on the next frame whether the occlusion query is ready. at 30 fps, it had 33 ms to finish. but it didn't. and it didn't after 2 frames. it took 3-4 frames until the occlusion query was ready. that's 105ms on average. if 25 ms would be ok for a GeForce FX5200, i don't have words to express how ridiculous 105 ms is. so it seems like occlusion queries are totally broken on AMD cards. it seems like the result gets delayed more and more as rendering commands are pushed to the pipeline. the same occlusion query algorithm takes less than 1 ms on the 560Ti with glFinish (to block until the occlusion query is ready). at this point i expect either tomatoes being thrown at me for doing something catastrophically wrong, or an AMD employee showing up and taking care of it.
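
    The "check on a later frame instead of spinning" approach described here can be sketched as a small per-object tracker. This is a sketch under assumptions, not the poster's actual code: VisibilityTracker is a hypothetical name, and the poll callback stands in for a non-blocking check such as glGetQueryObjectuiv with GL_QUERY_RESULT_AVAILABLE followed by GL_QUERY_RESULT.

```cpp
#include <cassert>
#include <functional>

// Hypothetical per-object visibility cache that never blocks on the GPU.
struct VisibilityTracker {
    bool visible = true;        // conservative default: draw until proven hidden
    bool pending = false;       // a query was issued and not yet read back
    unsigned framesWaited = 0;  // frames the current query has been outstanding

    // Call right after ending the occlusion query for this object.
    void onQueryIssued() {
        pending = true;
        framesWaited = 0;
    }

    // Call once per frame. poll must not block: it returns true and writes
    // the sample count only if the result is already available.
    bool update(const std::function<bool(unsigned&)>& poll) {
        if (pending) {
            unsigned samples = 0;
            if (poll(samples)) {
                visible = (samples != 0);
                pending = false;
            } else {
                ++framesWaited;  // keep using the last known visibility
            }
        }
        return visible;
    }
};
```

    The framesWaited counter gives a rough, frame-granular measure of query latency like the 3-4 frames reported above, without forcing a stall. The trade-off is that visibility lags behind by however many frames the result takes to arrive.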
    Last edited by Nowhere-01; 02-18-2013 at 01:56 PM.

  10. #10
    Member Regular Contributor Nowhere-01's Avatar
    Join Date
    Feb 2011
    Location
    Novosibirsk
    Posts
    251
    if someone else is also interested in this topic, a little update: http://devgurus.amd.com/message/1287564#1287564

    maybe i'll get some explanation here of what the hell is going on? maybe i am the ignorant one in this situation? but i don't get what the purpose of fixing conditional rendering is when the occlusion query is so uselessly slow. and judging by the lack of reaction to my posts about it, the AMD employee thinks it's totally ok for an occlusion query in a very simple scene with a low-resolution framebuffer to finish in 100+ ms (or 25ms if you force it with glFinish, which is not an acceptable thing to do). it takes multiple frames to finish, and that's not acceptable. i don't get how they can ignore the fact that on a proper GPU it finishes in about 1 ms.
