Shader "return value"

Hi,

I’m looking for a way to check, on the CPU, whether a fragment has met a certain condition or not.

In other words, I would like a boolean which is true if and only if one or more fragment shader executions have done a special action.

At first glance I thought I could simply use a 1x1 texture, but I can’t write to a texture which is not a RT (can I?)

Another idea I’ve had is to use the histogram functionality: since I don’t really care about the real colors (I’m doing some weird kind of GPGPU), I could just make the fragment color red if the condition is met, and blue otherwise. But how fast is that? Aren’t there faster or simpler solutions?

Thanks !

You can render your color to the framebuffer or a render_to_texture. You can make your window or render_to_texture 1x1 if you want.
I’ve never used histogram so I can’t say if it is fast or not.

Histograms are not accelerated. However, you can write a shader that discards or accepts fragments based on the condition and then use an occlusion query.

I had already thought about occlusion queries, but that is not an option since I’m already using one for other needs. (I could possibly render the scene twice and make 2 different queries …)

V-man: Will it really work?
I want a 512x512 buffer to tell me that one or more fragments have met the condition. If I render to a 1x1 texture, the fragment shader will be executed only once, won’t it?
Moreover, shaders are supposed to write to gl_FragColor or they won’t compile. But if one shader execution “outputs” 1, and then another one outputs 0, the value will be overwritten …

Another solution I’ve been thinking about is to render the condition value into another framebuffer (512x512) and then pass it to another shader that shrinks it to a 256x256, 128x128, … 1x1 texture in several passes, each time taking the max of the four texel values.
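To make the shrinking idea concrete, here is a minimal CPU-side sketch of the logic in Python. It is only a reference model, not GPU code: the nested list stands in for the "boolean" render target, and the function names (reduce_max_2x2, any_fragment_flagged) are made up for illustration. It assumes a square buffer whose side is a power of two.

```python
def reduce_max_2x2(grid):
    """One pass: each output texel is the max of a 2x2 block of input texels."""
    half = len(grid) // 2
    return [
        [
            max(
                grid[2 * y][2 * x], grid[2 * y][2 * x + 1],
                grid[2 * y + 1][2 * x], grid[2 * y + 1][2 * x + 1],
            )
            for x in range(half)
        ]
        for y in range(half)
    ]


def any_fragment_flagged(grid):
    """Halve a square power-of-two grid until it is 1x1.

    Returns (flag, number_of_passes).
    """
    size = len(grid)
    if size == 0 or size & (size - 1) or any(len(row) != size for row in grid):
        raise ValueError("expected a square grid with a power-of-two side")
    passes = 0
    while len(grid) > 1:
        grid = reduce_max_2x2(grid)
        passes += 1
    return grid[0][0] > 0, passes


# Stand-in for the 512x512 "boolean" target: 1 where the condition was met.
flags = [[0] * 512 for _ in range(512)]
flags[300][17] = 1  # a single fragment met the condition

met, passes = any_fragment_flagged(flags)
print(met, passes)  # prints: True 9
```

A single flagged texel survives every pass because max never loses it, and a 512x512 buffer needs 9 passes (512 = 2^9) before only one texel is left to download.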

So, my three options seem to be :

  • use the histogram and parse it on the CPU
  • render my 2 complex and often huge VBOs twice, because I need 2 separate occlusion queries
  • add another “boolean” render target, shrink it on the GPU, download it to the CPU.

Do you have any idea which would be the fastest?