Hi,

I notice that as soon as I put either of the following lines into my compute shader code, it slows down. Does anyone have an idea why?

Attempt 1:
Code :
`float result = float(int(1 | (-1*int(x > threshold))));`

Attempt 2:
Code :
`float result=float(int(x < threshold) - int(x > threshold));`

Both of these return -1 if x (the value being tested) is greater than threshold and 1 if x is less than threshold. Attempt 2 returns 0 if they are equal; Attempt 1 returns 1 in that case, because `1 | 0` is 1.

threshold is a uniform declared at the beginning of the shader.

Code :
```
uniform float threshold;

float x;
```

Note: x is a float read out of an SSBO or array buffer. I just labelled it `float x` here for brevity.
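For context, here is a minimal sketch of how these lines might sit in a complete compute shader. The buffer names (`values`, `results`), the binding points, and the work-group size of 64 are assumptions for illustration; the post does not state them.

```glsl
#version 430

// Assumed work-group size; the original post does not give one.
layout(local_size_x = 64) in;

// Hypothetical buffer names and binding points.
layout(std430, binding = 0) readonly  buffer InputData  { float values[];  };
layout(std430, binding = 1) writeonly buffer OutputData { float results[]; };

uniform float threshold;

void main() {
    uint i = gl_GlobalInvocationID.x;

    // Skip invocations that fall past the end of the input buffer.
    if (i >= uint(values.length())) {
        return;
    }

    float x = values[i];

    // Attempt 2 from the post: -1 if x > threshold, 0 if equal, 1 if x < threshold.
    results[i] = float(int(x < threshold) - int(x > threshold));
}
```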

Many thanks.

2. Hi.

Compute shaders provide some form of memory sharing and thread synchronization.
(I assume you are using Windows and DX11.)

I had this problem on my previous computer, but it seemed to resolve itself on my new one.

Have you tried with other threshold values (just for testing)?

Good luck.
Best regards.

3. Actually, I am using Windows and OpenGL/GLSL shaders, not DX11. I have tried two threshold values, 0 and 0.5. It doesn't make any difference, though.

4. Originally Posted by driver
I notice that as soon as i put in the following line in my compute shader code, it slows down. Does anyone have ideas as to why?
...
Attempt 2:
Code :
`float result=float(int(x < threshold) - int(x > threshold));`

Both of these return -1 if x (the value being tested) is greater than threshold, 0 if equal, and 1 if x is less than threshold.

Interesting! The built-in `sign` gives the same result as Attempt 2:

Code glsl:
`float result = sign( threshold-x );`
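To make the comparison concrete, the three variants from this thread can be written side by side, as in the sketch below. The helper names (`compareBitwise`, `compareDiff`, `compareSign`), the buffer names, the binding points, and the work-group size are all hypothetical. The comments list the value each variant yields for ordinary (non-NaN) inputs.

```glsl
#version 430

layout(local_size_x = 64) in;  // assumed work-group size

// Hypothetical buffer names and binding points.
layout(std430, binding = 0) readonly  buffer InputData  { float values[];  };
layout(std430, binding = 1) writeonly buffer OutputData { float results[]; };

uniform float threshold;

// Attempt 1: x > t gives 1 | -1 = -1; otherwise 1 | 0 = 1.
// Note that x == t yields 1 here, not 0.
float compareBitwise(float x, float t) {
    return float(1 | (-1 * int(x > t)));
}

// Attempt 2: x > t gives -1, x == t gives 0, x < t gives 1.
float compareDiff(float x, float t) {
    return float(int(x < t) - int(x > t));
}

// Built-in sign: x > t gives -1.0, x == t gives 0.0, x < t gives 1.0.
float compareSign(float x, float t) {
    return sign(t - x);
}

void main() {
    uint i = gl_GlobalInvocationID.x;
    if (i >= uint(values.length())) {
        return;
    }
    // Swap in compareBitwise or compareDiff here to time the other variants.
    results[i] = compareSign(values[i], threshold);
}
```

Keeping the variants in one shader and changing only the call in `main` means any timing difference comes from that one expression rather than from other edits.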

5. Actually, I have tried this too, and it doesn't help.

BUT

if I use just `float result = sign(x);`, it runs fast, even though this is not what I want. Is there any way the GPU optimizes this differently?

Why would
Code :
`float result = sign(x);`
be fast but
Code :
`float result = sign(threshold - x);`

not be fast?

6. Originally Posted by driver
Why would "sign(x)" be fast but "sign(threshold-x);" not be fast?
Is this the only place that the "threshold" uniform is used?

If so, then the latter is going to pull in one more uniform into your shader (uniforms are silently removed if you don't reference them). It's possible that the amount of uniform space that your shader is consuming is the factor.

That said, define "fast" and "slow" in your use case. What frames/sec (or kernel executions/sec) are we talking about here for the "fast" and "slow" case, and how many of these operations are being executed per frame (or per kernel execution)?
