GPU vs SSE registers (floating point math)

Is it right to assume that floating point math done on a modern GPU (via GLSL) will produce results identical to SSE math on the CPU?

I want to move some colour management functionality off the CPU onto the GPU if possible, but need to guarantee the results are the same.

Many thanks

I don’t think this is the case… IMHO the precision of GPUs may be a bit lower. Why don’t you just try it out? I remember testing precision on my old GFFX some years ago and it wasn’t so great…

Thanks Zengar… I was hoping to be lazy and avoid testing it!

Obviously normal float math on a CPU (the x87 FPU, with its 80-bit internals) runs at higher precision, but I was hoping 32-bit SSE would match the GPU.

FWIW - I’m looking to use floating point textures (ARB_texture_float / ATI_texture_float), which implies GF6 or better, I believe.
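
For reference, here’s a rough sketch of how I plan to detect support, assuming the classic pre-GL-3.0 extension string query (HasExtension is just my own helper name):

```cpp
#include <GL/gl.h>
#include <cstring>

// Naive check against the GL_EXTENSIONS string (fine for the GL 1.x/2.x
// era). Note strstr can in principle match a prefix of a longer extension
// name, so a proper tokenising check is safer in production code.
bool HasExtension(const char* name)
{
    const char* all = reinterpret_cast<const char*>(glGetString(GL_EXTENSIONS));
    return all && std::strstr(all, name) != 0;
}

// With a context current:
//   bool haveFloatTex = HasExtension("GL_ARB_texture_float")
//                    || HasExtension("GL_ATI_texture_float");
```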

Maybe GFFX is too limited for this.

Regards

I’m not sure about older GPUs, but on my GeForce 8 addition and multiplication are IEEE-compliant, so they have a maximum error of 0.5 ulp. I’ve already checked this on a number of my GPGPU programs that have a CPU fallback path, and the results are the same, bit for bit. There are some differences from CPU math, however, especially with the trigonometric and other non-linear functions.
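
In case it helps, this is roughly how I compare the two paths; the two result arrays are just placeholders for the texture readback (e.g. glReadPixels) and the CPU fallback output:

```cpp
#include <cstdint>
#include <cstring>

// Bit-for-bit float comparison. Plain == would call +0.0f and -0.0f equal
// and NaN unequal to itself, so compare the raw 32-bit patterns instead.
bool BitsEqual(float a, float b)
{
    uint32_t ua, ub;
    std::memcpy(&ua, &a, sizeof ua);  // memcpy sidesteps strict aliasing
    std::memcpy(&ub, &b, sizeof ub);
    return ua == ub;
}

// gpuResult: values read back from the float texture,
// cpuResult: the CPU fallback output for the same inputs.
bool SameBits(const float* gpuResult, const float* cpuResult, int n)
{
    for (int i = 0; i < n; ++i)
        if (!BitsEqual(gpuResult[i], cpuResult[i]))
            return false;
    return true;
}
```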

N.

Actually, the GFFX supports the NV_float_buffer extension, but its use is limited compared to the ARB_texture_float / ATI_texture_float extensions.

N.

Thanks NiCo

I remember some issues with SSE and sqrt or 1.0/x computations.
With 32-bit float SSE you may get only about 4 significant digits of precision instead of about 6 with the FPU!

I decided some years ago not to use SSE in my project (there was no SSE2 at the time) because of these big differences between the FPU and SSE.

So keep an eye on this. Maybe it’s better with newer processors today…
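
If you want to see the difference yourself, a quick test along these lines (plain SSE intrinsics, nothing project-specific) shows the approximate reciprocal drifting from the exact divide:

```cpp
#include <xmmintrin.h>
#include <cstdio>

int main()
{
    __m128 x = _mm_set1_ps(3.0f);

    // rcpps: a hardware *approximation*, accurate to only ~12 bits
    // (roughly 3-4 significant decimal digits).
    __m128 approx = _mm_rcp_ps(x);

    // divps: a real IEEE single-precision divide, correctly rounded.
    __m128 exact = _mm_div_ps(_mm_set1_ps(1.0f), x);

    float a[4], e[4];
    _mm_storeu_ps(a, approx);
    _mm_storeu_ps(e, exact);
    printf("rcpps: %.9f\ndivps: %.9f\n", a[0], e[0]);
    return 0;
}
```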

Good point. sqrt and reciprocal (1/x) are the ones to watch out for. Most likely the GPU doesn’t actually calculate the value but instead looks it up in a table and interpolates between two neighbouring values.
We could say the same for SSE, but I’m fairly sure SSE and whatever GPU you have will produce different values.
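
If you need the fast approximation anyway, the usual trick is to refine it with one Newton-Raphson step; a minimal sketch, assuming SSE intrinsics:

```cpp
#include <xmmintrin.h>

// Refine the rcpps estimate r with one Newton-Raphson iteration:
// r' = r * (2 - x * r). This roughly doubles the number of correct bits,
// taking the ~12-bit estimate close to full single precision.
static inline __m128 FastReciprocal(__m128 x)
{
    __m128 r = _mm_rcp_ps(x);
    return _mm_mul_ps(r, _mm_sub_ps(_mm_set1_ps(2.0f), _mm_mul_ps(x, r)));
}
```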

Also, GPU floating point math is not fully IEEE compliant; it doesn’t handle NaN properly. This might not be a problem for you.
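
One way to probe that, assuming you can upload and read back a float texture: feed the pipeline a known NaN bit pattern and see what comes out the other side. A minimal sketch of the CPU side:

```cpp
#include <cstdint>
#include <cstring>
#include <cstdio>

int main()
{
    // A quiet NaN with a fixed bit pattern, so we know exactly what was
    // uploaded into the test texture.
    uint32_t bits = 0x7FC00000u;
    float qnan;
    std::memcpy(&qnan, &bits, sizeof qnan);

    // IEEE 754 requires NaN != NaN. Push this value through the shader,
    // read the texel back, and check whether the property (and ideally
    // the exact bit pattern) survived the round trip.
    printf("qnan != qnan on the CPU: %d\n", static_cast<int>(qnan != qnan));
    return 0;
}
```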