Shader "return value"

A. Masserann
09-25-2008, 04:46 PM

I'm looking for a way to check, on the CPU, whether a fragment has met a certain condition or not.

In other words, I would like a boolean which is true if and only if one or more fragment shader executions have done a special action.

At first glance I thought I could simply use a 1x1 texture, but I can't write to a texture which is not a render target (can I?)

Another idea I've had is to use the histogram functionality: since I don't really care about the real colors (I'm doing some weird kind of GPGPU), I could just make the fragment color red if the condition is met, and blue otherwise. But how fast is that? Aren't there faster or simpler solutions?

Thanks!

09-25-2008, 09:51 PM
You can render your color to the framebuffer or a render_to_texture. You can make your window or render_to_texture 1x1 if you want.
I've never used the histogram, so I can't say whether it is fast or not.

09-25-2008, 11:21 PM
Histograms are not accelerated. However, you can write a shader that discards or accepts fragments based on the condition and then use an occlusion query.

A. Masserann
09-27-2008, 02:16 AM
I had already thought about an occlusion query, but it is not an option since I'm already using one for other needs. (I could possibly render the scene twice and make 2 different queries...)

V-man: Will it really work?
I want a 512*512 buffer to tell me that one or more fragments have met the condition. If I render to a 1*1 texture, the fragment shader will be executed only once, won't it?
Moreover, shaders are supposed to write to gl_FragColor or they won't compile. But if one shader execution "outputs" 1, and then another one outputs 0, the earlier value will be overwritten...

Another solution I've been thinking about is to render the condition value into another framebuffer (512*512) and then pass it to another shader that shrinks it to a 256*256, 128*128, ... 1*1 texture in several passes, each time taking the max of the four texel values.
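
To make the pass count concrete, here is a rough CPU-side sketch of that shrink in Python. It is only a simulation of the idea, not GPU code: a nested list stands in for the 512*512 "boolean" render target, and the helper names (reduce_max_once, any_fragment_flagged) are made up for illustration. It assumes a square buffer whose side is a power of two.

```python
def reduce_max_once(grid):
    """One 'pass': halve a square grid, keeping the max of each 2x2 block."""
    half = len(grid) // 2
    return [
        [
            max(
                grid[2 * y][2 * x], grid[2 * y][2 * x + 1],
                grid[2 * y + 1][2 * x], grid[2 * y + 1][2 * x + 1],
            )
            for x in range(half)
        ]
        for y in range(half)
    ]


def any_fragment_flagged(grid):
    """Shrink a power-of-two square grid to 1x1; return (flag, passes)."""
    passes = 0
    while len(grid) > 1:
        grid = reduce_max_once(grid)
        passes += 1
    return grid[0][0] == 1, passes


size = 512  # stand-in for the 512*512 buffer mentioned above
flags = [[0] * size for _ in range(size)]
flags[300][17] = 1  # pretend a single fragment met the condition

print(any_fragment_flagged(flags))  # (True, 9)
```

Going from 512*512 down to 1*1 takes 9 halving passes, and only the final single value would need to be read back on the CPU.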

So, my three options seem to be:
- use the histogram and parse it on the CPU
- render my 2 __complex__ and often __huge__ VBOs twice, because I need 2 separate occlusion queries
- add another "boolean" render target, shrink it on the GPU, and download it to the CPU

Do you have any idea which would be the fastest?