Iterate on previous result?

Dear all,
I have the following problem:

I am doing some image processing with OpenGL. The project has a 2D texture with dimensions 512x512. On every row, let a0, a1, a2 … a511 be the pixel values. I want to compute:

a1=a0+a1;
a2=a0+a1+a2;

a511=a0+a1+…+a511

I wonder if there is a fast way to do this. I have tried a fragment program with 4 texture fetches per pass and about 120 iterations, but it is too slow on my card (FX 5200, 2 s). I want to do it in real time. Can you give me some suggestions?

Thank you all.

Creating SATs (summed-area tables)? nvidia showed a shortcut for doing that by rendering one column at a time, reading from the same texture that is being written to (reading xpos-1, writing to xpos), and rendering as many strips as needed (512 in your case).

However, I haven't tried it, and one person who did says it didn't work for him, and the spec for render-to-texture states that this kind of rendering results in undefined behaviour… but if you are bound to nvidia HW and want to try it, go ahead.

Thanks Mazy.
Could you explain it more clearly, or point me to some links?
Thank you.

I'm not sure, but do you want something like this:
a1=a0+a1;
a2=a1+a2;
a3=a2+a3;
a4=a3+a4;

a511=a510+a511

This form is faster than
“a1=a0+a1;
a2=a0+a1+a2;

a511=a0+a1+…+a511”

No, in fact each pixel value depends on all the pixel values in the previous columns.

With this you get exactly that…
a1=a0+a1;
a2=a1+a2; (here a1 is already a0+a1)
a3=a2+a3; (and so on)

a1=a0+a1;
a2=a0+a1+a2; (here you get a0*2+a1+a2, since a1 is already a0+a1) That's not what you want, is it?

Suppose the raw pixel values in one row are (only part of it):

0 1 2 3 4 5 6 7 8 9

The result I want is:
0 1 3 6 10 15 21 28 36 45

Is that clear?
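
For checking any GPU result against the example row above, a plain CPU reference can help. The following is only a sketch of the definition as stated in the question (each output is the sum of all inputs up to and including that position). The function name `prefix_sum_reference` is made up for illustration and does not come from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Reference inclusive prefix sum, written straight from the definition:
 * out[i] = in[0] + in[1] + ... + in[i].
 * Deliberately naive (O(n^2)); meant for validation, not for speed. */
void prefix_sum_reference(const float *in, float *out, size_t n)
{
    assert((in != NULL && out != NULL) || n == 0);
    for (size_t i = 0; i < n; i++) {
        float sum = 0.0f;
        for (size_t j = 0; j <= i; j++)
            sum += in[j];
        out[i] = sum;
    }
}
```

Running it on the ten values 0 to 9 reproduces the wanted row 0 1 3 6 10 15 21 28 36 45.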

OK, the first one then.

If nvidia is right and you can render to a pbuffer while it is bound as a texture, you can do this with a simple technique.

For this example we make it 1D.

You have your array as before:
0 1 2 3 4 5 6 7 8 9 (each number symbolizes both the position and the value, OK?)

If you have that in a pbuffer that is bound both as a texture and as the render target, you can draw a small (1 pixel sized) polygon over the '1' with texture coords corresponding to '0' and blend with GL_ONE, GL_ONE. That stores the value 1 (0+1) in position (1). Directly after that, render another polygon of the same size over (2), with the texture coords pointing at (1). Since (1) now holds the value 1 and we blend additively, we get the value 3 in (2). Just continue doing that for the whole row and you're done… If you have several rows

a0 a1 a2 a3 a4
b0 b1 b2 b3 b4

then make a quad that covers both rows (a1,b1) with texture coords that match (a0,b0).

If you want to sum the other way as well (b0=b0+a0), just make another pass but draw the quads horizontally instead.
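
To make the two-direction idea concrete, here is a CPU sketch of what the two passes would compute if the same-surface trick behaves as hoped: first every pixel adds its left neighbour, then every pixel adds the one above it. The name `summed_area_table` and the row-major layout are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Builds a summed-area table in place.
 * img holds w*h floats in row-major order (an assumption of this sketch). */
void summed_area_table(float *img, size_t w, size_t h)
{
    assert(img != NULL || w * h == 0);

    /* Pass 1 (vertical strips in the GPU version): add the left neighbour. */
    for (size_t y = 0; y < h; y++)
        for (size_t x = 1; x < w; x++)
            img[y * w + x] += img[y * w + x - 1];

    /* Pass 2 (horizontal strips): add the pixel from the previous row. */
    for (size_t y = 1; y < h; y++)
        for (size_t x = 0; x < w; x++)
            img[y * w + x] += img[(y - 1) * w + x];
}
```

After both passes, each entry holds the sum of all original pixels above and to the left of it, itself included.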

Since these values will very soon go over 255 and won't fit in a 32-bit (8 bits per channel) render target, I guess you will go for floating-point render targets, and those don't have blending… Just use the texture twice instead (multitexture) and a fragment program to add them together.
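
As a rough check on how large the sums can get, the arithmetic can be sketched as below. It assumes source values of at most 255 (as the remark above implies) and a float buffer that stores 32-bit floats, which the thread does not state. Both function names are hypothetical.

```c
#include <assert.h>

/* Largest possible sum of `count` source values, each at most `max_value`. */
long worst_case_sum(long count, long max_value)
{
    assert(count >= 0 && max_value >= 0);
    return count * max_value;
}

/* A 32-bit float has a 24-bit significand, so every integer
 * up to 2^24 = 16777216 is represented exactly. */
int exact_in_float32(long value)
{
    return value >= 0 && value <= (1L << 24);
}
```

Under those assumptions a single 512-pixel row tops out at 512 * 255 = 130560, which is exact in a 32-bit float. A full two-direction table over 512x512 pixels could reach 66846720, which is beyond the range where every integer is exact.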

Attention: this is untested by me, and the spec for pbuffers says it is undefined, but as I said before, nVidia has it in one of its papers and it probably works on their HW/drivers.

Thanks Mazy. Good idea.

Originally posted by Mazy:

Since these values will very soon go over 255 and won't fit in a 32-bit render target, I guess you will go for floating-point render targets, and those don't have blending… Just use the texture twice instead (multitexture) and a fragment program to add them together.

Yes, I have been using the fp-buffer because of its full precision.
What do you mean by "just use the texture twice instead (multitexture) and a fragment program to add them together"? Is it like this?
1: render the 1st pass with multitexture and a fragment program into fp-buffer1.
2: bind fp-buffer1 to fp-buffer2, then render the 2nd pass with multitexture and a fragment program into fp-buffer2.
3: repeat 1 and 2 for all the pixels.

If so, I have some questions:
1: there will be many passes (>100) to iterate. This will kill the fps.
2: what is the function of the multitexture and the fragment program in each pass?
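
On the pass-count worry: if ping-ponging between two buffers is acceptable, a log-step scan needs far fewer passes than one per pixel. This is a different technique from the same-surface trick discussed in this thread, shown here only as a CPU sketch. In each pass every pixel adds the value `offset` pixels to its left, and `offset` doubles each time, so a 512-wide row takes 9 passes. The names `scan_pass` and `scan_log_step` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One full-row pass: dst[i] = src[i] + src[i - offset],
 * or just src[i] when there is no pixel that far to the left. */
static void scan_pass(const float *src, float *dst, size_t n, size_t offset)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (i >= offset) ? src[i] + src[i - offset] : src[i];
}

/* Inclusive prefix sum with two buffers that swap roles each pass.
 * The result ends up in data; returns the number of passes used. */
size_t scan_log_step(float *data, float *scratch, size_t n)
{
    assert(data != NULL && scratch != NULL);
    float *src = data, *dst = scratch;
    size_t passes = 0;
    for (size_t offset = 1; offset < n; offset *= 2) {
        scan_pass(src, dst, n, offset);
        float *tmp = src;   /* swap source and destination */
        src = dst;
        dst = tmp;
        passes++;
    }
    if (src != data)
        memcpy(data, src, n * sizeof *data);
    return passes;
}
```

Each pass maps naturally onto one full-size quad with two texture fetches, and it never reads from the buffer it is writing to, so it does not rely on undefined behaviour. The trade-off is more total additions than the strip approach.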

I mean, draw the quad as in the example above, but do 2 texture fetches, one from pixel-1 (the previous result) and one from the current pixel, add them and put the result in the current pixel's position.

This is not multiple passes; it's columns*rows quads that you're drawing, and your card should handle that pretty well.

Sorry, I don't understand.

First, since the fp-buffer does not support blending, drawing quads has no effect on the sum.
Second, you mean only 2 texture values are added in a fragment; what about the rest?

The whole trick is that you render to the same surface that you read from…

this = this + former (all in pixels)

So when you render the 2nd pixel, you fetch the first and second pixels in a fragment program, add them together and store the result at the 2nd pixel… Since your quads are drawn in order, the second pixel will be written before the next quad (3rd pixel) reads it, so by then the 2nd pixel is the sum of the original values of the 1st and 2nd pixels.
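
Why reading and writing the same surface matters can be seen in a small CPU simulation. It assumes, as the post does, that each strip's write is visible to the next strip's read, which is exactly the part the spec leaves undefined. Both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Strips drawn left to right into the buffer they read from:
 * each strip sees the sum already stored by the strip before it. */
void strips_same_surface(float *buf, size_t n)
{
    assert(buf != NULL || n == 0);
    for (size_t i = 1; i < n; i++)
        buf[i] = buf[i] + buf[i - 1];
}

/* The same strips, but reading an untouched source and writing elsewhere:
 * each strip only sees original values, so only neighbours get added. */
void strips_separate_buffers(const float *src, float *dst, size_t n)
{
    assert((src != NULL && dst != NULL) || n == 0);
    if (n > 0)
        dst[0] = src[0];
    for (size_t i = 1; i < n; i++)
        dst[i] = src[i] + src[i - 1];
}
```

With the row 0 to 9, the same-surface version yields the full running sum, while the separate-buffer version yields 0 1 3 5 7 9 11 13 15 17. That is why a single sweep of two-fetch quads is enough only when source and target are the same surface.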

Let us suppose you are right. Then the whole process is just like this:

glBindProgramNV(GL_FRAGMENT_PROGRAM_NV, add);
glEnable(GL_FRAGMENT_PROGRAM_NV);
glBindTexture(GL_TEXTURE_2D, texture);
glEnable(GL_TEXTURE_2D);
glBegin(GL_QUADS);

glTexCoord2f(0.0f, 0.0f);
glVertex2f(0, 599);
glTexCoord2f(1.0f, 0.0f);
glVertex2f( 511, 599);
glTexCoord2f(1.0f, 1.0f);
glVertex2f( 511, 88);
glTexCoord2f(0.0f, 1.0f);
glVertex2f(0, 88);

glEnd();
//repeat above?
glBegin(GL_QUADS);
glTexCoord2f(0.0f, 0.0f);
glVertex2f(0, 599);
glTexCoord2f(1.0f, 0.0f);
glVertex2f( 511, 599);
glTexCoord2f(1.0f, 1.0f);
glVertex2f( 511, 88);
glTexCoord2f(0.0f, 1.0f);
glVertex2f(0, 88);
glEnd();
… //perhaps 512 times

glDisable(GL_FRAGMENT_PROGRAM_NV);
glDisable(GL_TEXTURE_2D);

Is it right?

Is that quad just a pixel in size?

First of all, nvidia only supports floating-point textures/render targets with RECT textures, if I'm not misinformed, so texture coordinates should be between 0 and size-1.

So:
render the data to the pbuffer and bind it as a texture, set up an ortho mode that makes OpenGL units correspond to pixels, and then:
for (i = 1; i < width; i++) { // column 0 has no left neighbour, so start at 1
    glBegin(GL_QUADS);
    // texture unit 0 reads the previous column, unit 1 reads the current column
    glMultiTexCoord2f(GL_TEXTURE0, i-1, 0);
    glMultiTexCoord2f(GL_TEXTURE1, i, 0);
    glVertex2f(i, 0);
    glMultiTexCoord2f(GL_TEXTURE0, i, 0);
    glMultiTexCoord2f(GL_TEXTURE1, i+1, 0);
    glVertex2f(i+1, 0);
    glMultiTexCoord2f(GL_TEXTURE0, i, height);
    glMultiTexCoord2f(GL_TEXTURE1, i+1, height);
    glVertex2f(i+1, height);
    glMultiTexCoord2f(GL_TEXTURE0, i-1, height);
    glMultiTexCoord2f(GL_TEXTURE1, i, height);
    glVertex2f(i, height);
    glEnd();
}

The fragment shader should do (pseudocode):
fetch col0 = texcoord0, texture0, RECT;
fetch col1 = texcoord1, texture0, RECT;
out = col0 + col1;

This can be put in a display list or a VBO instead. And you can write a vertex program that calculates the texture coords instead of sending them yourself.
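
For the VBO route, the per-vertex data for all strips could be generated once up front. The sketch below only fills a plain array and makes no OpenGL calls. The `StripVertex` layout and the name `build_strips` are assumptions for illustration, and whether strips submitted in one batch still see each other's writes is the same undefined behaviour mentioned earlier.

```c
#include <assert.h>
#include <stddef.h>

/* One vertex of a strip: position plus two sets of RECT texture coords. */
typedef struct {
    float x, y;     /* position in pixels */
    float s0, t0;   /* unit 0: one column to the left (previous result) */
    float s1, t1;   /* unit 1: the current column */
} StripVertex;

/* Fills `out` with 4 vertices per strip for columns 1 .. width-1.
 * `out` must have room for 4 * (width - 1) entries.
 * Returns the number of vertices written. */
size_t build_strips(StripVertex *out, int width, int height)
{
    assert(out != NULL && width > 0 && height > 0);
    size_t n = 0;
    for (int i = 1; i < width; i++) {
        const float xs[4] = { (float)i, (float)(i + 1), (float)(i + 1), (float)i };
        const float ys[4] = { 0.0f, 0.0f, (float)height, (float)height };
        for (int k = 0; k < 4; k++) {
            out[n].x  = xs[k];
            out[n].y  = ys[k];
            out[n].s0 = xs[k] - 1.0f;   /* shifted one texel left */
            out[n].t0 = ys[k];
            out[n].s1 = xs[k];          /* same as the position */
            out[n].t1 = ys[k];
            n++;
        }
    }
    return n;
}
```

Since both coordinate sets follow directly from the position, this is also the calculation a vertex program would do instead of sending the coords.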

Thanks, Mazy. I will try it.

Sorry, Mazy. I get zero for all pixels. The code is:

glBindTexture(GL_TEXTURE_RECTANGLE_NV, fptexture);
glCopyTexSubImage2D(GL_TEXTURE_RECTANGLE_NV, 0,0,0,512,512, 512,512);
glReadPixels(0,0,512,512,GL_RGBA,GL_FLOAT,data);
add_bias(); //check the value
fpbuffer2.deactivate();
fpbuffer.activate();
glBindProgramARB(GL_FRAGMENT_PROGRAM_NV, simple_sum);
glEnable(GL_FRAGMENT_PROGRAM_NV);
glActiveTextureARB(GL_TEXTURE0_ARB);
glBindTexture(GL_TEXTURE_RECTANGLE_NV, fptexture);
glActiveTextureARB(GL_TEXTURE1_ARB);
glBindTexture(GL_TEXTURE_RECTANGLE_NV, fptexture);
glEnable(GL_TEXTURE_RECTANGLE_NV);
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
for(int i=0;i<512;i++)
{
glBegin(GL_QUADS);
glMultiTexCoord2fARB(GL_TEXTURE0_ARB, i-1,0);
glMultiTexCoord2fARB(GL_TEXTURE1_ARB, i,0);
glVertex2f(i,0);
glMultiTexCoord2fARB(GL_TEXTURE0_ARB, i-1,0);
glMultiTexCoord2fARB(GL_TEXTURE1_ARB, i,0);
glVertex2f(i+1,0);
glMultiTexCoord2fARB(GL_TEXTURE0_ARB, i,511);
glMultiTexCoord2fARB(GL_TEXTURE1_ARB, i+1,511);
glVertex2f(i+1,511);
glMultiTexCoord2fARB(GL_TEXTURE0_ARB, i,511);
glMultiTexCoord2fARB(GL_TEXTURE1_ARB, i+1,511);
glVertex2f(i,511);
glEnd();
}
glReadPixels(0,0,512,512,GL_RGBA,GL_FLOAT,data);
add_bias(); //check the value

simple_sum:
!!FP1.0
TEX R1,f[TEX0],TEX0,RECT;
TEX R2,f[TEX1],TEX0,RECT;
ADD o[COLR],R1,R2;
END

what is wrong with it?

Originally posted by Mazy:
first of all, nvidia only supports floating-point textures/render targets with RECT textures, if I'm not misinformed, so texture coordinates should be between 0 and size-1
Nope.
Tex coords should be in the range 0 to size (without the -1).
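
The difference between the two ranges shows up when the interpolated coordinate at each pixel centre is turned into a texel index. This is a small sketch assuming nearest filtering, clamping at the edges, and pixel centres at half-integer positions. Both function names are made up.

```c
#include <assert.h>

/* Texel picked by nearest filtering for an unnormalized (RECT) coordinate s
 * in a texture `size` texels wide, clamped to the valid range. */
int nearest_texel(float s, int size)
{
    assert(size > 0);
    int i = (int)s;
    if ((float)i > s)       /* make the truncation behave like floor */
        i--;
    if (i < 0)
        i = 0;
    if (i > size - 1)
        i = size - 1;
    return i;
}

/* Coordinate interpolated at the centre of pixel `px` for a quad that spans
 * x0..x1 on screen and carries texture coords s0..s1 along that edge. */
float interpolated_coord(int px, float x0, float x1, float s0, float s1)
{
    float center = (float)px + 0.5f;
    return s0 + (s1 - s0) * (center - x0) / (x1 - x0);
}
```

For a quad covering pixels 0 to 512 with coords 0 to 512, pixel 256 samples at 256.5 and reads texel 256. With coords 0 to 511 the same pixel samples just below 256.0 and reads texel 255, so the mapping drifts by one texel across the row.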

zeckensack: Dang!!! Well, I missed that one.

xu: black?

First of all, you need something in your buffer:
makeCurrent(pdc, pbuffercontext);
render a lot of stuff

Bind the buffer as a texture (you only need to bind it once, since you can make 2 reads from the same texture).

Then render… Don't clear before you do that… That will clear the texture you want to read from, since the render target and the source texture are the same.

What are all the copy and readpixels calls doing there?

Sorry, I forgot to say that there is a render into the float buffer fpbuffer2 before this code.

And fpbuffer.activate() does the same thing as makeCurrent(pdc, pbuffercontext);

The readpixels and add_bias() calls are used to check the pixel values. The first time there are non-zero values, but the second time only zeros.

I have used a fragment program instead of multitexture, but I still get the same zero values.
Only when I change the glTexCoord calls to this:

glTexCoord2f(0.0f, 0.0f);glVertex2f(0, 599);
glTexCoord2f(511, 0.0f);glVertex2f(511, 599);
glTexCoord2f(511, 511);glVertex2f( 511, 88);
glTexCoord2f(0.0f, 511);glVertex2f(0, 88);

are there non-zero values. So I wonder whether it is possible to use tex coords like that?

Thanks Mazy.
OK, it works.
It was a coordinate mistake. I have tested it with a simple program.
Thanks again.