And this is the ONLY change in the code. The results should be the same, but only the two-line version works. Any ideas? Could it be some hardware limitation I don’t know about? The card is an ATI Radeon 9600 or 9800.
I had a similar problem with a GF6600, so I don’t think it’s a hardware limitation.
AFAIK GPUs can multiply the result of an operation by 1/4, 1/2, 2, 4 and 8 without the need for an extra instruction, but compilers seem to become confused easily when they try to make use of this feature, so it could be a compiler bug.
Actually, here 0.5 fixed it, although it didn’t matter that the value was 0.5 in particular. The problem appears to be that the results were never written to the output register.
The interesting part is that the code before the problematic line reads something like this: