Precision problem with fragment programs on GeForce FX

Hi everyone,
I am really stuck on the following problem and hope to get help from the gurus here.
Does each instruction in the following fragment programs really run in IEEE 32-bit float?
With this first program, if the input TEX0 data is big enough, no error occurs.
!!FP1.0
TEX R0, f[WPOS].xyxx, TEX0, RECT;
MUL R0, R0, p[0];
MOV o[COLR], R0;
END

But in this one, where I just add or multiply the values, the error is not tolerable. The input TEX0 data is, of course, smaller than in the program above.
!!FP1.0
TEX R0, f[WPOS].xyxx, TEX0, RECT;
MUL R0, R0, p[0];
MUL o[COLR], R0, {10000};
END

Can anyone provide some suggestions? Thanks a lot.

[This message has been edited by foollove (edited 02-02-2004).]

I’m sorry that I can’t help you, but at least you seem to be able to help me. This relates to the thread I started just before yours.

If your FPs do not work (“loading fragment program failed”), what is wrong, your FP code or my loading code? How do you load them?

Jan

Please don’t hi-jack posts.

Note that textures are typically stored with 8 bits per component, and frame buffers are typically scanned out with 8 bits per component. You can increase either, using extensions, to some extent.

What is the frame buffer into which you store that multiplied-by-10000 value? What is the error you’re talking about, i.e. what are you getting, and what would you expect to get?

Also, the spec does not guarantee IEEE 32-bit for fragment operations. ATI uses 24-bit floats (is that 16 bits mantissa? something like that). NVIDIA uses 32-bit or 16-bit, depending on what your precision hints are (I believe).
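
To put those formats side by side, here is a small C++ sketch of what the mantissa width alone implies. The 10-bit (fp16) and 23-bit (fp32) stored mantissas are standard; the 16-bit figure for the 24-bit format is only the guess from the post above, and the function names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Relative spacing of adjacent values for a format with the given number
// of stored mantissa bits (the implicit leading 1 is not counted).
double RelativeStep(int storedMantissaBits) {
    assert(storedMantissaBits > 0);
    return std::ldexp(1.0, -storedMantissaBits);
}

// Largest N such that every integer in [0, N] is exactly representable
// (ignoring exponent range limits).
double LargestContiguousInteger(int storedMantissaBits) {
    assert(storedMantissaBits > 0);
    return std::ldexp(1.0, storedMantissaBits + 1);
}
```

With these, LargestContiguousInteger(23) is 16777216 for fp32, LargestContiguousInteger(16) is 131072 under the 16-bit assumption, and LargestContiguousInteger(10) is 2048 for fp16.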

The NV_fragment_program spec (not the ARB one) says that if you don’t specify the precision of an instruction, the destination register determines the precision.

R0 is 32 bit per component and so is o[COLR].

For this line
TEX R0, f[WPOS].xyxx, TEX0, RECT;

can’t you do

TEX R0, f[WPOS], TEX0, RECT;

//////////////////////////////////////////////////////////////////////
// NV_Fragment_Program
void LoadFragmentProgramFromFile(char *ShaderFile, unsigned int &ShaderID)
{
//////////////////////////////////////////////////////////////
/// Load the fragment program from a text file
FILE *fp;
unsigned char *stringShader;

if((fp = fopen(ShaderFile,"rb")) == NULL)
{
	CString Msg;
	Msg.Format("Error to read in the Fragment Program file <%s>",ShaderFile);
	AfxMessageBox(Msg);
	return;
}
fseek(fp, 0, SEEK_END);	
int stringLength = ftell(fp);	
fseek(fp, 0, SEEK_SET);	
stringShader = new unsigned char[stringLength + 1];
fread(stringShader, 1, stringLength, fp);	
stringShader[stringLength] = '\0';	
fclose(fp);	

///////////////////////////////////////////////////////////////
glGenProgramsNV(1, &ShaderID);	
glBindProgramNV(GL_FRAGMENT_PROGRAM_NV, ShaderID);
glLoadProgramNV(GL_FRAGMENT_PROGRAM_NV, ShaderID, stringLength, stringShader);

///////////////////////////////////////////////////////////////
if(GL_INVALID_OPERATION == glGetError())
{
	// Find the error position
	GLint errPos;
	glGetIntegerv( GL_PROGRAM_ERROR_POSITION_ARB, &errPos );

	// Print implementation-dependent program
	// errors and warnings string.
	unsigned char *errString = (unsigned char*)glGetString(GL_PROGRAM_ERROR_STRING_ARB);

	CString Msg;
	Msg.Format("When Loading Fragment Program <%s>, Error at position: %d\n\nError Content:\n%s\n", ShaderFile, errPos, errString);
	AfxMessageBox(Msg);
	exit(0);
}

///////////////////////////////////////////////////////////////
delete[] stringShader;
stringShader = NULL;
return;

}
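
For anyone without MFC, the file-reading half of that loader can be sketched in portable C++. This is only an illustration: ReadProgramText is a made-up helper name, and the GL calls would stay as in the function above.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Reads the whole file into `text`; returns false if it cannot be opened.
// Binary mode keeps the byte count identical to what is on disk, which
// matters because the length is passed along with the program string.
bool ReadProgramText(const std::string& path, std::string& text) {
    assert(!path.empty());
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in) {
        return false;
    }
    std::ostringstream buffer;
    buffer << in.rdbuf();
    text = buffer.str();
    return true;
}
```

The resulting text.size() and text.data() would then go to glLoadProgramNV in place of stringLength and stringShader (with a cast to const GLubyte*).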


I use a float pbuffer for the computation, and the textures are also stored as float.
In my program, even if I give the instruction an explicit precision suffix as follows, the computation error is still there.
MULR R0, R0, p[0];

Here I do vector +, - and * vector operations on the GPU. When the elements of the vectors are small, the result is right, but when the elements are big, errors occur.

[This message has been edited by foollove (edited 02-02-2004).]

Can you give a concrete example of where multiplying numbers gives unacceptable results? What do you multiply 10000 by, and what did you expect it to give?
Is your float buffer 16-bit per component or 32-bit?

I did some tests regarding precision… On NV boards computations are carried out to about 4-7 digits after the decimal point, so you shouldn’t use numbers that are too small. Computations with large numbers go just fine.

Can you give a concrete example of where multiplying numbers gives unacceptable results? What do you multiply 10000 by, and what did you expect it to give?
Is your float buffer 16-bit per component or 32-bit?

Everything I do is in a 32-bit float environment: float texture, float pbuffer.

It’s really difficult to provide a simple example here, but I will try to make myself clear.
The input vector1 is [0, 10^6, 2*10^6, 3*10^6] and
the input vector2 is [0, 1, 2, 3]; the difference between the product of these two vectors computed on the GPU and on the CPU is zero.
But if the input vector1 is changed to [0, 10^7, 2*10^7, 3*10^7], the error sum between the CPU and GPU results is up to 128.
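
A sketch of how such an error sum could be measured on the CPU side alone, to separate 32-bit rounding from anything GPU-specific. This is not the code from the post: the name ErrorSum, the element-wise reading of "multiplication", and the double-precision reference are all assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sum of |float product - double product| over the common length of a and b.
// The double product of two floats is exact (24 + 24 bits fit in 53), so the
// sum measures only the rounding done by the single-precision multiply.
double ErrorSum(const std::vector<float>& a, const std::vector<float>& b) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const float single = a[i] * b[i];
        const double reference =
            static_cast<double>(a[i]) * static_cast<double>(b[i]);
        sum += std::fabs(static_cast<double>(single) - reference);
    }
    return sum;
}
```

For the four elements listed, every product (10^7, 4*10^7, 9*10^7) is exactly representable in a 32-bit float, so this check returns 0 for both inputs. A nonzero sum needs operands whose product does not fit in 24 significant bits, for example 16777215 * 3, which gives an error of 1.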

Why do you need to compute with such large numbers? Maybe you could find a workaround? The GeForce FX is built to provide the most “logical” computation precision. It’s a pity…

But if the input vector1 is changed to [0, 10^7, 2*10^7, 3*10^7], the error sum between the CPU and GPU results is up to 128.

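You realize that above 2^24 (16,777,216) not every integer can be represented exactly in IEEE-754 single precision? Around 2*10^7 adjacent floats are already 2 apart, and the spacing keeps growing with magnitude. You only have 24 bits (23+1) of mantissa in IEEE-754 single precision.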

Since your CPU (likely an x86) does computations in 80-bit floating-point by default, I’d say that would be the reason for the discrepancy.
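
The 24-bit mantissa limit is easy to check on the CPU. The following is a minimal sketch, assuming the host float is IEEE-754 binary32; UlpAt and IsExactInFloat are hypothetical helper names, not part of any library.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Gap between x and the next larger representable float (one ulp at x).
float UlpAt(float x) {
    assert(std::isfinite(x) && x >= 0.0f);
    return std::nextafter(x, std::numeric_limits<float>::infinity()) - x;
}

// True if the integer survives a round trip through a 32-bit float.
bool IsExactInFloat(long long value) {
    return static_cast<long long>(static_cast<float>(value)) == value;
}
```

Under that assumption the spacing is 1 at 10^7, 2 at 2*10^7 and 8 at 9*10^7. Round values such as 20000000 still convert exactly, while 20000001 and 16777217 do not.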

Thanks a lot.

Here, I just want to do some general-purpose computation on the GPU, so precision is the No. 1 problem I have to solve. The next problem is how to put the right data, stored in textures, in the right position in the form of fragments.

Is there anyone here who is familiar with this? Please join in for a deeper discussion. Thanks again.

lyq@ios.ac.cn

Here, I just want to do some general purpose computation on GPU.

Do you know this website http://www.gpgpu.org/ ? They seem to deal with exactly that field of applications.

Edit: spelling.

[This message has been edited by ZbuffeR (edited 02-05-2004).]