I’m trying to write an image fusion program in OpenGL/GLSL. At one point I need to calculate a cumulative histogram for a 7x7 pixel area. In the fragment shader I’m trying to use this function:
float graythresh(float aArea[49])
{
    float hist[256];
    for(int i = 0; i < 49; i++)
        hist[(int)(aArea[i]*255)]+= 1.0;
    for(int i = 0; i < 256; i++)
        hist[i] /= 49.0;
    for(int i = 255; i >= 0; i--)
        for(int j = 255; j >= 0; j--)
            if(j > i)
                hist[i] += hist[j];
    return 0.0;
}
The last loop causes an application crash and a “driver stopped responding” error. I tried older drivers and tested the application on a different computer, but nothing changed.
for(int i = 255; i >= 0; i--)
    for(int j = i-1; j >= 0; j--)
        hist[i] += hist[j];
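For reference, here is a minimal CPU-side sketch in C of the cumulative histogram that the `j = i-1` loop appears to compute, using a single running sum instead of a nested loop. The names `cumulative_hist`, `AREA_SIZE` and `NUM_BINS` are made up for illustration, and the sketch assumes the input values lie in [0, 1]. It also zeroes the bins first, which the posted function does not do (`hist` is declared without an initial value).

```c
#define AREA_SIZE 49   /* 7x7 neighbourhood, as in the post */
#define NUM_BINS  256  /* one bin per 8-bit gray level */

/* Hypothetical helper: fills cdf[] with the normalized cumulative
   histogram of area[]. Values in area[] are assumed to be in [0, 1]. */
void cumulative_hist(const float area[AREA_SIZE], float cdf[NUM_BINS])
{
    /* Start from zeroed bins so the counts are well defined. */
    for (int i = 0; i < NUM_BINS; i++)
        cdf[i] = 0.0f;

    /* Count samples per gray level; the clamp is a precaution
       against inputs slightly outside [0, 1]. */
    for (int i = 0; i < AREA_SIZE; i++) {
        int bin = (int)(area[i] * 255.0f);
        if (bin < 0) bin = 0;
        if (bin > NUM_BINS - 1) bin = NUM_BINS - 1;
        cdf[bin] += 1.0f;
    }

    /* Normalize counts to fractions of the window. */
    for (int i = 0; i < NUM_BINS; i++)
        cdf[i] /= (float)AREA_SIZE;

    /* Running sum: afterwards cdf[i] holds the sum of bins 0..i. */
    for (int i = 1; i < NUM_BINS; i++)
        cdf[i] += cdf[i - 1];
}
```

The running sum needs 255 additions where the nested loop needs 32640, and it should give the same values up to floating-point rounding. The loop bodies should carry over to GLSL almost unchanged, with the constructor form `int(x)` in place of the C-style cast.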
[QUOTE=Aethanol;1238562]The last loop causes an application crash and a “driver stopped responding” error. …
for(int i = 255; i >= 0; i--)
    for(int j = i-1; j >= 0; j--)
        hist[i] += hist[j];
…(I’m using GeForce GT 420M, Windows 7)[/QUOTE]
Well, that behavior’s obviously not supposed to happen. Are you checking for GL errors? Making sure your shaders compile and link successfully? Printing the shader info/error log to your console/log file? Still shouldn’t happen, but I wonder if you’re getting any clues before it gives up the ghost.
Anyway, I can only guess you’re exceeding the fragment instruction limit of your GPU (or something), possibly due to loop unrolling or some such. Try putting this before your doubly-nested loop:
#pragma optionNV(unroll none)
This “should” disable all loop unrolling, saving a boatload of instructions, but possibly to the detriment of performance. If that works, you can play with other options to fine-tune if/how long a loop the driver will unroll:
#pragma optionNV(unroll count=#)
which should direct the compiler to only unroll the loop if it results in fewer than # instructions (replace # with a number).
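To put a rough number on the unrolling concern, here is a small C sketch that counts how many times the innermost `hist[i] += hist[j]` statement runs in each loop variant posted in this thread, next to a single running sum. The function names are made up for illustration. How the driver actually unrolls these loops is not known here, so the counts only indicate how many add statements a full unroll could generate.

```c
/* First posted loop: j scans 255..0 and adds only when j > i. */
int adds_full_scan(void)
{
    int n = 0;
    for (int i = 255; i >= 0; i--)
        for (int j = 255; j >= 0; j--)
            if (j > i)
                n++;
    return n;
}

/* Second posted loop: j starts at i-1, so every iteration adds. */
int adds_triangular(void)
{
    int n = 0;
    for (int i = 255; i >= 0; i--)
        for (int j = i - 1; j >= 0; j--)
            n++;
    return n;
}

/* Single running sum over the 256 bins, for comparison. */
int adds_running_sum(void)
{
    int n = 0;
    for (int i = 1; i < 256; i++)
        n++;
    return n;
}
```

Both nested variants perform 32640 additions, and the first one also evaluates its `if` 65536 times, while the running sum performs 255.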
Other options you can play with can be found here:
Nope, that’s not it. If I unroll the loop by hand, I can process several times more data without any issue than the amount that triggers the error when it is processed in a loop.
You can try “unroll all” in the pragma, but don’t get your hopes up, since your loop count depends on your data.
If you haven’t already, upgrade to the latest NVidia drivers and retry.
If that doesn’t fix it, isolate it into a small, stand-alone test prog you can post. If you do, others here can verify and give you ideas. And assuming it’s NVidia’s bug and not yours, NVidia is almost sure to fix it. They’re very good about addressing problems if you isolate the problem to a small stand-alone test program.
I tried various drivers and many different loops, and even unrolled the whole loop myself, but none of it helped. Finally I tried running the same program on a desktop GeForce GTX 460, and it worked. It turns out my graphics card probably didn’t have enough memory (either for the data or for the unrolled code). For now I’m using smaller images while I’m still developing.
Thank you all for your support.