Nested loop in GLSL causes driver crash.

Hello,

I’m trying to write an image fusion program in OpenGL/GLSL. At one point I need to calculate a cumulative histogram for a 7x7 pixel area. In the fragment shader I’m trying to use this function:


float graythresh(float aArea[49])
  {
  float hist[256];
  for(int i = 0; i < 49; i++)
    hist[(int)(aArea[i]*255)]+= 1.0;

  for(int i = 0; i < 256; i++)
    hist[i] /= 49.0;

  for(int i = 255; i >= 0; i--)
    for(int j = 255; j >= 0; j--)
      if(j > i)
        hist[i] += hist[j];

  return 0.0;
  }

The last loop causes the application to crash with the error “driver stopped responding”. I tried older drivers and tested the application on a different computer, but nothing changed. I also tried this version of the loop:


  for(int i = 255; i >= 0; i--)
    for(int j = i-1; j >= 0; j--)
        hist[i] += hist[j];

It gave the same error. Is it even possible to do this?

Thanks in advance; any help will be appreciated.

(I’m using a GeForce GT 420M on Windows 7.)
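
Editor's note: the second nested loop above computes a running (prefix) sum, which can also be done in one pass. The sketch below is plain C, not GLSL, and the function names and the clamping of bin indices are assumptions made for illustration. It zeroes the histogram first, because local arrays in GLSL (as in C) are not zero-initialized. For 256 bins the nested form performs 256*255/2 = 32640 additions, while the single pass performs 255.

```c
#include <stddef.h>

#define BINS 256

/* Build a normalized histogram of n samples expected to lie in [0, 1]. */
void build_histogram(const float *samples, size_t n, float hist[BINS]) {
    for (int i = 0; i < BINS; i++)
        hist[i] = 0.0f;                      /* explicit zeroing before counting */
    for (size_t k = 0; k < n; k++) {
        int bin = (int)(samples[k] * 255.0f);
        if (bin < 0) bin = 0;                /* clamping is an assumption, not in the original */
        if (bin > BINS - 1) bin = BINS - 1;
        hist[bin] += 1.0f / (float)n;
    }
}

/* Cumulative histogram written like the second nested loop in the post. */
void cumulate_nested(float hist[BINS]) {
    for (int i = BINS - 1; i >= 0; i--)
        for (int j = i - 1; j >= 0; j--)
            hist[i] += hist[j];
}

/* Same result (up to float rounding) with a single running sum. */
void cumulate_single_pass(float hist[BINS]) {
    for (int i = 1; i < BINS; i++)
        hist[i] += hist[i - 1];
}
```

The single-pass loop carries over to GLSL almost unchanged, and it leaves the compiler far less to unroll.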

[QUOTE=Aethanol;1238562]The last loop causes application crash and error “driver stopped responding”. …


  for(int i = 255; i >= 0; i--)
    for(int j = i-1; j >= 0; j--)
        hist[i] += hist[j];

…(I’m using GeForce GT 420M, windows 7)[/QUOTE]
Well, that behavior’s obviously not supposed to happen. Are you checking for GL errors? Making sure your shaders compile and link successfully? Printing the shader info/error log to your console/log file? Still shouldn’t happen, but I wonder if you’re getting any clues before it gives up the ghost.

Anyway, I can only guess you’re exceeding the fragment instruction limit of your GPU (or something), possibly due to loop unrolling or some such. Try putting this before your doubly-nested loop:


#pragma optionNV(unroll none)

This “should” disable all loop unrolling, saving a boatload of instructions, but possibly to the detriment of performance. If that works, you can play with other options to fine-tune if/how long a loop the driver will unroll:


#pragma optionNV(unroll count=#)

which should direct the compiler to only unroll the loop if it results in fewer than # instructions (replace # with a number).

Other options you can play with can be found here:

Thanks for your reply.

I tried adding this option, but it gives me compilation errors.

I changed the code to fix all the warnings, and now it looks like this:


"uniform sampler2D myTexture;"
"uniform int uWidth;"
"uniform int uHeight;"
""
"float pixH = 1.0/float(uWidth);"
"float pixV = 1.0/float(uHeight);"
""
"float graythresh(float aArea[49])"
"  {"
"  float hist[256];"
"  int v;"
"#pragma optionNV(unroll none)"
"  for(int i = 0; i < 49; i++)"
"    for(v = int(aArea[i]*255.0); v < 256; v++)"
"      hist[v]+= 0.020408;"//adding ~1/49 to all histogram bins greater than actual value
""
"  return 0.5;"//just a test, to check if the shader passed
"  }"
""
"void main (void)"
"{"
"vec4 comp = vec4(0.0, 0.0, 0.0, 0.0);"
"float area[49];"
"float sum;"
"float maxf;"
"float current;"
"for(int k = 0; k <3; k++)"
"  {"
"  sum = 0.0;"
"  maxf = 0.0;"
"  current = 0.0;"
"  for(int i = 0; i < 7; i++)"
"    {"
"    for(int j = 0; j < 7; j++)"
"      {"
"      current = texture2D(myTexture, vec2(gl_TexCoord[0].x+float(i-3)*pixH, gl_TexCoord[0].y+float(j-3)*pixV))[k];"
"      sum += current;"
"      maxf = max(maxf, abs(current));" 
"      area[i*7+j] = current;"
"      }"
"    }"
"  if(maxf > 0.0)"
"    {"
"    for(int i = 0; i < 49; i++)"
"      area[i] /= maxf;"
"    }"
"  comp[k] = graythresh(area);"
"  }"
"gl_FragColor = comp;"
"}"

Without the pragma it behaves as I described earlier; with it, I get these errors:


0(1) : error C0000: syntax error, unexpected $undefined at token "#"
0(1) : error C0000: syntax error, unexpected ')', expecting ',' or ';' at token")"
0(1) : error C0000: syntax error, unexpected ')', expecting ',' or ';' at token")"

Am I not using it right?

Edit:
The error was caused by a missing newline, but with the pragma in place, it still does not work.
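
Editor's note on the missing newline: adjacent C string literals are concatenated with no separator, so a shader written as one literal per line reaches the GLSL compiler as a single line. A preprocessor directive such as `#pragma` has to start its own line, which is consistent with the errors above all being reported on line 1. The sketch below is plain C; the shader body, the constant names and the checker function are hypothetical and only illustrate the difference.

```c
/* Broken: no "\n", so the whole shader is one line and '#' lands mid-line. */
const char *const SHADER_BROKEN =
    "float f(float x)"
    "  {"
    "#pragma optionNV(unroll none)"
    "  return x;"
    "  }";

/* Fixed: every literal ends with "\n", so the pragma sits on its own line. */
const char *const SHADER_FIXED =
    "float f(float x)\n"
    "  {\n"
    "#pragma optionNV(unroll none)\n"
    "  return x;\n"
    "  }\n";

/* Returns 1 if every '#' in src is the first non-blank character on its
   line, 0 otherwise. Hypothetical helper for checking a source string. */
int directives_start_lines(const char *src) {
    int at_line_start = 1;
    for (const char *p = src; *p != '\0'; ++p) {
        if (*p == '#' && !at_line_start)
            return 0;
        if (*p == '\n')
            at_line_start = 1;
        else if (*p != ' ' && *p != '\t')
            at_line_start = 0;
    }
    return 1;
}
```

Calling `directives_start_lines` on the two strings distinguishes them: the fixed source passes, the broken one does not.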

You are using rather slow hardware (GT 420). I think your shader takes too long to execute and the OS kills your app.
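
Editor's note: a rough way to gauge this is to count the additions the doubly-nested histogram loop would perform per frame. The sketch below is plain C with hypothetical function names; the 640x480 image size in the usage note is an assumption, since the thread does not state the real size. Windows resets the display driver when the GPU stays busy past its timeout (Timeout Detection and Recovery, about 2 seconds by default), which would match a “driver stopped responding” message.

```c
/* Inner-loop iterations of: for (i = 0; i < n; i++) for (j = 0; j < i; j++) */
long long triangular_iterations(long long n) {
    return n * (n - 1) / 2;
}

/* Rough number of additions per frame if every fragment runs the nested
   loop once per color channel. All parameters are illustrative. */
long long additions_per_frame(long long width, long long height,
                              long long channels, long long bins) {
    return width * height * channels * triangular_iterations(bins);
}
```

For example, `triangular_iterations(256)` is 32640, and `additions_per_frame(640, 480, 3, 256)` comes to about 30 billion additions for one frame under these assumptions.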

Nope, that’s not it. If I start to unroll the loop by hand, I can process several times more data without any issue, whereas the same amount would cause an error when processed in a loop.

I’ve had problems with nested loops in GLSL.
Just unroll them; problem solved.

You can try “unroll all” in the pragma, but don’t get your hopes up, since your loop count depends on your data.

If you haven’t already, upgrade to latest NVidia drivers and retry.

If that doesn’t fix it, isolate it into a small, stand-alone test prog you can post. If you do, others here can verify and give you ideas. And assuming it’s NVidia’s bug and not yours, NVidia is almost sure to fix it. They’re very good about addressing problems if you isolate the problem to a small stand-alone test program.

I tried various drivers and many different loops, and even unrolled the whole loop myself, but it didn’t help. Finally I tried running the same program on a desktop GeForce GTX 460, and it worked. It turns out my graphics card probably didn’t have enough memory (either for data or for the unrolled code). For now I’m using smaller images while still developing.
Thank you all for your support.