
shader slowdown



driver
11-18-2014, 02:09 PM
Hi,

I notice that as soon as I add the following line to my compute shader code, it slows down. Does anyone have an idea why?


Attempt 1:


float result = float(int(1 | (-1*int(x > threshold))));


Attempt 2:


float result=float(int(x < threshold) - int(x > threshold));


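Attempt 2 returns -1 if x (the value being tested) is greater than threshold, 0 if equal, and 1 if x is less than threshold. Attempt 1 returns -1 if x is greater than threshold and 1 otherwise (it never returns 0, because 1 | 0 is 1).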

threshold is a uniform declared at the beginning of the shader.



uniform float threshold;

float x;



Note: x is a float read out of an SSBO or array buffer. I just labelled it float x here for brevity.

Many thanks.

William87
11-21-2014, 02:13 AM
Hi.

Compute shaders provide some form of memory sharing and thread synchronization.
(I assume you are using Windows and DX11.)

I had this problem on my previous computer, but it seemed to resolve itself on my new one.

Have you tried with other threshold values (just for testing)?

Good luck.
Best regards.

driver
11-21-2014, 01:46 PM
Actually, I am using Windows and OpenGL/GLSL shaders, not DX11. I have tried two threshold values, 0 and 0.5. It doesn't make any difference, though.

Dark Photon
11-21-2014, 06:11 PM
I notice that as soon as I add the following line to my compute shader code, it slows down. Does anyone have an idea why?
...
Attempt 2:


float result=float(int(x < threshold) - int(x > threshold));


Attempt 2 returns -1 if x (the value being tested) is greater than threshold, 0 if equal, and 1 if x is less than threshold.

Interesting!

What about:


float result = sign( threshold-x );

driver
11-23-2014, 01:04 PM
Actually, I have tried this too. It doesn't help.

BUT

if I use just float result = sign(x), it runs fast, even though this is not what I want. Does the GPU optimize this differently?

Why would


float result = sign(x);

be fast but


float result= sign(threshold-x);


not be fast?

Dark Photon
11-24-2014, 05:49 AM
Why would "sign(x)" be fast but "sign(threshold-x);" not be fast?

Is this the only place that the "threshold" uniform is used?

If so, then the latter is going to pull in one more uniform into your shader (uniforms are silently removed if you don't reference them). It's possible that the amount of uniform space that your shader is consuming is the factor.

That said, define "fast" and "slow" in your use case. What frames/sec (or kernel executions/sec) are we talking about here for the "fast" and "slow" case, and how many of these operations are being executed per frame (or per kernel execution)?