glDrawPixels slows down when GL_MAP_STENCIL is set to true on NVIDIA

I have written the following code:


        glPixelTransferi(GL_MAP_STENCIL,TRUE);

        long timeBeforeDrawPixels=timeGetTime();

        glDrawPixels(FINAL_WIDTH,FINAL_HEIGHT,GL_STENCIL_INDEX,GL_UNSIGNED_BYTE,pboImageReadBackRed);
        //FINAL_WIDTH and FINAL_HEIGHT are set to 1024 and 768 respectively

        long timeAfterDrawPixels=timeGetTime();
        
        printf("We found this artificial noise lasted for %ld milliseconds.\n", timeAfterDrawPixels-timeBeforeDrawPixels);
        //I got above 400 ms here on GeForce GTX 650 Ti.

        glPixelTransferi(GL_MAP_STENCIL,FALSE);

Can anyone give any hint on this issue? Is it a driver error or architecture error? Thanks in advance!:doh:

Why does no one answer? Is there anything not stated clearly?
And brave Nowhere-01, why not show your omnipresent heroism this time?

It’s been less than 12 hours, and it’s possible that few, if any, have ever done this.

I personally have never even thought to look and see if this is supported. CPU<->GPU = slow. Keep it all on the GPU when possible.

I could guess at this, but I’m not sure it would help you much. Is pboImageReadBackRed a CPU pointer or an offset into a bound PBO? If the latter, when was it last uploaded (or generated)?

Thanks a lot, Dark Photon!

At first, I also thought it might be due to the bandwidth between CPU and GPU. But after I modified the code to




        //glPixelTransferi(GL_MAP_STENCIL,TRUE);  //<------------------------- Do not map stencil
 
        long timeBeforeDrawPixels=timeGetTime();
 
        glDrawPixels(FINAL_WIDTH,FINAL_HEIGHT,GL_STENCIL_INDEX,GL_UNSIGNED_BYTE,pboImageReadBackRed);
        //FINAL_WIDTH and FINAL_HEIGHT are set to 1024 and 768 respectively
 
        long timeAfterDrawPixels=timeGetTime();
 
        printf("Now, we found this artificial noise is cancelled out, so drawing takes only %ld milliseconds.\n", timeAfterDrawPixels-timeBeforeDrawPixels);
        //now we've got only about 30 ms here on GeForce GTX 650 Ti.
 
        //glPixelTransferi(GL_MAP_STENCIL,FALSE);  //<------------------------- Do not map stencil


No delay was found at all!
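
For scale, the two timings can be converted into effective throughput. The sketch below is only arithmetic on the numbers quoted in this thread (1024 x 768 one-byte stencil indices, about 400 ms versus about 30 ms); the helper name `throughput_mib_per_sec` is made up for illustration. Note also that timeGetTime around a GL call only measures the CPU-side time of that call, so these are rough figures at best.

```c
#include <stddef.h>

/* Effective transfer rate in MiB/s for `bytes` bytes moved in
   `milliseconds` ms. Hypothetical helper, not part of any GL API. */
static double throughput_mib_per_sec(size_t bytes, double milliseconds)
{
    double mib = (double)bytes / (1024.0 * 1024.0);
    return mib / (milliseconds / 1000.0);
}
```

With 1024 * 768 = 786432 bytes, this gives roughly 1.9 MiB/s for the 400 ms case and 25 MiB/s for the 30 ms case. Both are far below the gigabytes per second a PCIe x16 link is nominally rated for, which is consistent with the guess that raw bus bandwidth is not the limiting factor in either run.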

So, this is an obvious fault, most probably in the hardware’s OpenGL driver, perhaps even intended:sick:.
And most probably others who have encountered this fault know how to fix it, or know where to download a fixed version of NVIDIA’s driver.

I know Nowhere-01 is a sly guy. He seems to know everything and is always ready to give instructions; I even think he is an employee of Nvidia. But why does he hide his head like a goddamn turtle this time:sick:!

i guess, you are still trying to blit stencil. it’s done like that:

[ul]
[li]you initialize source and destination FBO’s. both should have stencil attachment, defined like that: glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH24_STENCIL8, Width, Height); glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_STENCIL_ATTACHMENT, GL_RENDERBUFFER, renderBuffer);[/li]
[li]you render to source buffer and fill stencil with whatever you want[/li]
[li]you bind destination buffer as a draw_buffer - glBindFramebuffer(GL_DRAW_FRAMEBUFFER, bufferObject), and source buffer as read_buffer - glBindFramebuffer(GL_READ_FRAMEBUFFER, bufferObject).[/li]
[li]you blit from source to destination: glBlitFramebuffer(0, 0, srcWidth, srcHeight, 0, 0, dstWidth, dstHeight, GL_STENCIL_BUFFER_BIT, GL_NEAREST);[/li]
[li]GL_NEAREST is important, depth\stencil buffer is not directly compatible with filtering.[/li]
[li]you use stencil as usual in destination frame buffer;[/li]
[/ul]

i’ve ignored your topic because of that: https://www.opengl.org/discussion_boards/showthread.php/180986-Some-stencil-fbo-code-runs-on-ati-but-not-on-nvidia
you’ve got the solution to one of your problems, but instead of commenting it you’ve posted some unrelated message quoting yourself. and that’s not the first time you’ve completely ignored an answer.
and in this topic you’ve posted more text discussing me, than anything related to your problem. was that necessary?
no, i’m not related to nvidia or any other corporation. and i’m very far in my knowledge level from someone you can assume is an experienced professional.

i think your method is slow, because you were using legacy functionality, which is not hardware-accelerated in any way.

And most probably others who have encountered this fault know how to fix it, or know where to download a fixed version of NVIDIA’s driver.

Or perhaps there is no “fixed version” because it’s not broken. Just because it’s slow doesn’t mean it’s broken. It’s slow because MAP_STENCIL is some terrible ARB_imaging feature that’s not hardware accelerated. And it’s not going to become hardware accelerated in the near future.

Don’t use this map-stencil stuff. If you need to process the image data, process it yourself; don’t rely on the driver to do it for you. And especially don’t rely on anything in the imaging subset.
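
As a rough illustration of "process it yourself": with GL_MAP_STENCIL enabled, each stencil index is looked up in the GL_PIXEL_MAP_S_TO_S table (the one set with glPixelMap) before it is written. A minimal sketch of doing that lookup on the CPU instead is below. It assumes the source data is a CPU-side array of 8-bit indices and that the table has 256 entries; the names `remap_stencil` and `table` are made up for this example and do not come from the original code.

```c
#include <stddef.h>
#include <stdint.h>

/* Apply a 256-entry lookup table to 8-bit stencil indices in place.
   Assumption: `table` holds the same values that would otherwise be
   loaded into GL_PIXEL_MAP_S_TO_S. */
static void remap_stencil(uint8_t *pixels, size_t count,
                          const uint8_t table[256])
{
    for (size_t i = 0; i < count; ++i)
        pixels[i] = table[pixels[i]];
}
```

After remapping FINAL_WIDTH * FINAL_HEIGHT bytes this way, the same glDrawPixels call can be issued with GL_MAP_STENCIL left disabled, which is the path that took about 30 ms in the measurement above. Whether the total ends up faster on a given driver would still need to be measured.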

Well, Nowhere-01. Anyway you have shown up again!
But, this time, I think you should reread this thread carefully. The last problem on this thread http://www.opengl.org/discussion_boards/showthread.php/180986-Some-stencil-fbo-code-runs-on-ati-but-not-on-nvidia?p=1248046#post1248046 has been solved.

This question is no harder than the last one of http://www.opengl.org/discussion_boards/showthread.php/180986-Some-stencil-fbo-code-runs-on-ati-but-not-on-nvidia?p=1248046#post1248046.

And since you are familiar with nvidia, you can easily make out what is wrong here.
So, this time, why not use your agile mind and exercise your omnipresent heroism to show me your expert knowledge of opengl and nvidia?:devilish:

Best regards,

newbiecow

can you specify the problem? does blitting work for you on AMD hardware? if it doesn’t, what code exactly do you use to blit? does my method not work? and let’s stay in this topic.

Thanks a lot, dear Alfonse Reinheart. Since I’m only an OpenGL application developer and not a member of the ARB, I can’t fathom the reasons behind such an important cut. As an application developer, I just care about how the mapping of the stencil buffer can be done some other way.
I think any change to an industry standard should take compatibility into account. If a function is cut, the new version should offer an alternative for every existing implementation that depends on it.

As a veteran in this field, Alfonse Reinheart, can you give me some advice on how my task can be accomplished, if there is a way at all, without the stencil mapping of ARB_imaging? And most important of all, without loss of efficiency, of course!:tired:

Well. The last problem has already been solved. If you wish to know the origin of my question, I think the best way is to get an ATI card and write some stencil blitting code, then you will know the difference between them.

As for this thread, you can see here that Alfonse Reinheart has found the cause of it.
Can you please also help me think about whether there is an alternative way to write my code without loss of efficiency?

Best regards,

nuclear bomb

i was asking because i actually use the method of blitting described above, and i’d like to know if it doesn’t work on AMD cards and what exactly works. that’s the main reason i was so invested in this question.

to get help replacing your code, you should describe what exactly you are trying to achieve. are you trying to output the stencil buffer? for what exactly?

You will know when you try this by yourself.:disgust:

welcome to my ignore list.