shader slowdown

Hi,

I notice that as soon as I add the following line to my compute shader code, it slows down. Does anyone have an idea why?

Attempt 1:


float result = float(int(1 | (-1*int(x > threshold))));

Attempt 2:


float result = float(int(x < threshold) - int(x > threshold));

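Attempt 2 returns -1 if x (the value being tested) is greater than threshold, 0 if they are equal, and 1 if x is less than threshold. Attempt 1 returns the same values except in the equal case, where it gives 1 rather than 0, because 1 | 0 evaluates to 1.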

threshold is a uniform declared at the beginning of the shader.


uniform float threshold;

float x; 

Note: x is a float read out of an SSBO or array buffer. I just wrote it as float x here for brevity.
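For context, a minimal sketch of how such a line might sit in a compute shader is given below. The work-group size, the buffer names (InputData, OutputData, values, results), and the binding points are assumptions made for illustration; only threshold and x come from the description above.

```glsl
#version 430

// Assumed work-group size; the post does not state one.
layout(local_size_x = 64) in;

// Hypothetical SSBOs; names and binding points are illustrative only.
layout(std430, binding = 0) readonly buffer InputData  { float values[];  };
layout(std430, binding = 1) writeonly buffer OutputData { float results[]; };

uniform float threshold;

void main() {
    uint i = gl_GlobalInvocationID.x;

    // Skip invocations that fall past the end of the input buffer.
    if (i >= uint(values.length())) {
        return;
    }

    float x = values[i];

    // Attempt 2 from the post: -1.0 if x > threshold,
    // 0.0 if equal, 1.0 if x < threshold.
    float result = float(int(x < threshold) - int(x > threshold));

    results[i] = result;
}
```

Swapping the marked line for one of the other variants while keeping everything else fixed would help isolate whether that expression is what changes the timing.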

Many thanks.

Hi.

Compute shaders provide some form of memory sharing and thread synchronization.
(I assume you are using Windows and DX11.)

I had this problem on my previous computer, but it seemed to resolve itself on my new one.

Have you tried with other threshold values (just for testing)?

Good luck.
Best regards.

Actually, I am using Windows and OpenGL/GLSL shaders, not DX11. I have tried two threshold values, 0 and 0.5. It doesn't make any difference though.

[QUOTE=driver;1262578]I notice that as soon as i put in the following line in my compute shader code, it slows down. Does anyone have ideas as to why?

Attempt 2:


float result=float(int(x < threshold) - int(x > threshold));

Both of these return -1 if x (the value being tested) is greater than threshold, 0 if equal, and 1 if x is less than threshold.[/QUOTE]

Interesting!

What about:


float result = sign( threshold-x );

Actually, I have tried this too, and it doesn't help.

But if I use just float result = sign(x), it runs fast, even though that is not what I want. Is there any way the GPU optimizes this differently?

Why would


float result = sign(x);

be fast but


float result = sign(threshold - x);

not be fast?

Is this the only place that the “threshold” uniform is used?

If so, then the latter is going to pull in one more uniform into your shader (uniforms are silently removed if you don’t reference them). It’s possible that the amount of uniform space that your shader is consuming is the factor.
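One way to test this hypothesis is to build both variants from the same source and change nothing else. The sketch below is only an illustration: the USE_THRESHOLD switch, the Data buffer, its binding point, and the work-group size are all assumed names and values, not taken from the original shader.

```glsl
#version 430

// Assumed work-group size.
layout(local_size_x = 64) in;

// Hypothetical SSBO, read and written in place.
layout(std430, binding = 0) buffer Data { float values[]; };

uniform float threshold;

// Hypothetical compile-time switch: set to 0 to build the "fast" variant.
#define USE_THRESHOLD 1

void main() {
    uint i = gl_GlobalInvocationID.x;
    if (i >= uint(values.length())) {
        return;
    }

    float x = values[i];

#if USE_THRESHOLD
    // threshold is referenced here, so it remains an active uniform.
    values[i] = sign(threshold - x);
#else
    // threshold is never referenced in this variant, so the compiler
    // is free to drop it as an inactive uniform.
    values[i] = sign(x);
#endif
}
```

On the host side, glGetUniformLocation returns -1 for a uniform that is not active, which gives a quick check of whether threshold survived in each build.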

That said, define “fast” and “slow” in your use case. What frames/sec (or kernel executions/sec) are we talking about here for the “fast” and “slow” case, and how many of these operations are being executed per frame (or per kernel execution)?