precision problem of fragment program in GeForceFX



foollove
02-02-2004, 05:59 PM
Hi, everyone,
I am really troubled by the following problem and hope to get help from the gurus here.
In the following fragment programs, does each instruction really run in IEEE 32-bit float?
If the input TEX0 data is big enough, no error occurs:
!!FP1.0
TEX R0, f[WPOS].xyxx, TEX0, RECT;
MUL R0, R0, p[0];
MOV o[COLR], R0;
END

But in this one, where I just add one more multiply, the error is intolerable. Of course, the input TEX0 is smaller than in the one above.
!!FP1.0
TEX R0, f[WPOS].xyxx, TEX0, RECT;
MUL R0, R0, p[0];
MUL o[COLR], R0, {10000};
END

Can anyone provide some suggestions? Thanks a lot.





JanHH
02-02-2004, 06:43 PM
I'm sorry for being unable to help you, but at least you seem to be able to help me ;). This relates to the thread I started just before yours.

If your FPs do not work ("loading fragment program failed"), what is wrong: your FP code or my loading code? How do you load them?

Jan

jwatte
02-02-2004, 07:13 PM
Please don't hijack threads.

Note that textures are typically stored with 8 bits per component of precision, and frame buffers are typically scanned out with 8 bits per component. You can increase either, to some extent, using extensions.
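
For illustration, a minimal sketch of allocating a 32-bit float texture through the NV_float_buffer and NV_texture_rectangle extensions (texId, width, height, and data are hypothetical placeholders, not from this thread):

// Sketch only: assumes NV_float_buffer and NV_texture_rectangle are
// supported; texId, width, height, and data are placeholders.
GLuint texId;
glGenTextures(1, &texId);
glBindTexture(GL_TEXTURE_RECTANGLE_NV, texId);
glTexImage2D(GL_TEXTURE_RECTANGLE_NV, 0, GL_FLOAT_RGBA32_NV,
             width, height, 0, GL_RGBA, GL_FLOAT, data);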

What is the frame buffer into which you store that multiplied-by-10000 value? And what is the error you're talking about; i.e., what are you getting, and what would you expect to get?

Also, the spec does not guarantee IEEE 32-bit precision for fragment operations. ATI uses 24-bit floats (is that a 16-bit mantissa? something like that). NVIDIA uses 32-bit or 16-bit, depending on what your precision hints are (I believe).

V-man
02-02-2004, 08:05 PM
In NV_fragment_program (not ARB_fragment_program), if you don't specify the precision of an instruction, the precision of the destination register determines it.

R0 is 32 bits per component, and so is o[COLR].

For this line:

TEX R0, f[WPOS].xyxx, TEX0, RECT;

can't you just do:

TEX R0, f[WPOS], TEX0, RECT;
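
As an aside, a minimal sketch of what per-instruction precision suffixes look like in NV_fragment_program, embedded in a C++ string the way it would be handed to glLoadProgramNV (progId and the program body are hypothetical; needs <cstring> for strlen):

// Sketch: suffix R = 32-bit float, H = 16-bit half, X = 12-bit fixed.
const char *fp =
    "!!FP1.0\n"
    "TEX R0, f[WPOS], TEX0, RECT;\n"
    "MULR R0, R0, p[0];\n"          // forced to 32-bit float
    "MOVR o[COLR], R0;\n"
    "END\n";
glLoadProgramNV(GL_FRAGMENT_PROGRAM_NV, progId,
                (GLsizei)strlen(fp), (const GLubyte *)fp);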

foollove
02-02-2004, 09:50 PM
#include <cstdio>    // fopen/fseek/ftell/fread
#include <cstdlib>   // exit()
// (also needs the MFC headers for CString/AfxMessageBox and an OpenGL
// extension header for the NV program entry points)

//////////////////////////////////////////////////////////////////////
// NV_fragment_program loader
void LoadFragmentProgramFromFile(char *ShaderFile, unsigned int &ShaderID)
{
    //////////////////////////////////////////////////////////////
    /// Load the fragment program text from a file
    FILE *fp;
    unsigned char *stringShader;

    if ((fp = fopen(ShaderFile, "rb")) == NULL)
    {
        CString Msg;
        Msg.Format("Error reading the fragment program file <%s>", ShaderFile);
        AfxMessageBox(Msg);
        return;
    }
    fseek(fp, 0, SEEK_END);
    int stringLength = ftell(fp);
    fseek(fp, 0, SEEK_SET);
    stringShader = new unsigned char[stringLength + 1];
    fread(stringShader, 1, stringLength, fp);
    stringShader[stringLength] = '\0';
    fclose(fp);

    ///////////////////////////////////////////////////////////////
    // Create the program object and load the source into it
    glGenProgramsNV(1, &ShaderID);
    glBindProgramNV(GL_FRAGMENT_PROGRAM_NV, ShaderID);
    glLoadProgramNV(GL_FRAGMENT_PROGRAM_NV, ShaderID, stringLength, stringShader);

    ///////////////////////////////////////////////////////////////
    if (GL_INVALID_OPERATION == glGetError())
    {
        // Find the error position
        GLint errPos;
        glGetIntegerv(GL_PROGRAM_ERROR_POSITION_ARB, &errPos);

        // Print the implementation-dependent program
        // errors and warnings string.
        const unsigned char *errString = glGetString(GL_PROGRAM_ERROR_STRING_ARB);

        CString Msg;
        Msg.Format("When loading fragment program <%s>, error at position %d\nError content:\n%s\n",
                   ShaderFile, errPos, errString);
        AfxMessageBox(Msg);
        delete[] stringShader;   // don't leak the source buffer on failure
        exit(0);
    }

    ///////////////////////////////////////////////////////////////
    delete[] stringShader;
    stringShader = NULL;
}
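
A hypothetical call site for the loader above (the file name is an example; the program must also be enabled before drawing):

// Hypothetical usage: load, bind, and enable the fragment program.
unsigned int fpId = 0;
LoadFragmentProgramFromFile("multiply.fp", fpId);
glBindProgramNV(GL_FRAGMENT_PROGRAM_NV, fpId);
glEnable(GL_FRAGMENT_PROGRAM_NV);
// ... draw ...
glDisable(GL_FRAGMENT_PROGRAM_NV);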


*********************************************
Here I use a float pbuffer to do the computation, and the textures are also stored as float.
In my program, even if I declare it like the following, the computation error is still there:
MULR R0, R0, p[0];

Here I do vector +/-/* vector operations on the GPU. When the elements of the vectors are small, the result is right. But if the elements are big, the error appears.


al_bob
02-03-2004, 05:36 AM
Can you give a concrete example of where multiplying numbers gives unacceptable results? What do you multiply 10000 by, and what did you expect to get?
Is your float buffer 16-bit per component or 32-bit?

Zengar
02-03-2004, 05:51 AM
I did some tests regarding precision... On NV boards, computations are carried out to about 4-7 significant digits. So you shouldn't use numbers that are too small. Computations with large numbers go just fine.
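
That matches the roughly 7 significant decimal digits a 24-bit mantissa gives you; a standalone C++ illustration (not from this thread):

#include <cstdio>

int main()
{
    // 2^24 = 16777216 is where consecutive integers stop being
    // exactly representable in a 24-bit float mantissa.
    float f = 16777216.0f;
    float g = f + 1.0f;          // rounds back down to 16777216
    printf("%.1f\n", g);         // prints 16777216.0: the +1 is lost
    return 0;
}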

foollove
02-04-2004, 05:51 AM
Can you give a concrete example of where multiplying numbers gives unacceptable results? What do you multiply 10000 by, and what did you expect to get?
Is your float buffer 16-bit per component or 32-bit?
Everything I do is in a 32-bit float environment: float textures and a float pbuffer.

It's really difficult to provide a simple example here, but I'll try to make myself clear.
If input vector1 is [0, 10^6, 2*10^6, 3*10^6] and input vector2 is [0, 1, 2, 3], the difference between the componentwise product of these two vectors on the GPU and on the CPU is zero.
But if input vector1 is changed to [0, 10^7, 2*10^7, 3*10^7], the summed error between the CPU result and the GPU result is up to 128.
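
The effect can be reproduced on the CPU by forcing every intermediate through single precision. A standalone C++ sketch (the vectors are from the post above; the 10000 scale factor from the shader earlier in the thread is an assumption):

#include <cstdio>

int main()
{
    const double v1[4] = { 0.0, 1e7, 2e7, 3e7 };
    const double v2[4] = { 0.0, 1.0, 2.0, 3.0 };
    double errSum = 0.0;
    for (int i = 0; i < 4; ++i)
    {
        // "GPU" path: every intermediate rounded to 32-bit float.
        float a = (float)v1[i];
        float b = (float)v2[i];
        float g = a * b;
        g = g * 10000.0f;
        // "CPU" path: double precision throughout.
        double c = v1[i] * v2[i] * 10000.0;
        errSum += c - (double)g;
    }
    printf("accumulated error: %g\n", errSum);
    return 0;
}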

Zengar
02-04-2004, 06:55 AM
Why do you need to compute such large numbers? Maybe you could find a workaround? The GeForceFX is constructed to provide the most "logical" computation precision. It's a pity...

al_bob
02-04-2004, 09:53 AM
But if input vector1 is changed to [0, 10^7, 2*10^7, 3*10^7], the summed error between the CPU result and the GPU result is up to 128.
You realize that 10^7 * 10000 can't be represented exactly using IEEE-754 single precision? In fact, none of the nonzero products can, once scaled by 10000. You only have 24 bits (23+1) of mantissa in IEEE-754 single precision.

Since your CPU (likely an x86) does its computations in 80-bit floating point by default, I'd say that would be the reason for the discrepancy.
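
A quick way to check exact representability is to round-trip a value through float; a standalone C++ sketch:

#include <cstdio>

int main()
{
    const double values[4] = { 1e7, 2e7, 1e11, 2e11 };
    for (int i = 0; i < 4; ++i)
    {
        float f = (float)values[i];   // round to a 24-bit mantissa
        printf("%g %s exactly representable as float\n",
               values[i], ((double)f == values[i]) ? "is" : "is NOT");
    }
    return 0;
}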

foollove
02-05-2004, 02:16 AM
Thanks a lot.

Here, I just want to do some general-purpose computation on the GPU, so precision is the number-one problem I have to solve. The next problem is putting the right data, via textures, into the right positions so that each fragment operates on the right elements; see the sketch below.
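
A sketch of that common layout trick, assuming the data lives in a width x height float texture and a projection that maps the quad one-to-one onto the viewport (all names are placeholders):

// One fragment per data element: draw a viewport-sized quad so the
// fragment program runs once for each texel of the input texture.
glViewport(0, 0, width, height);
glBegin(GL_QUADS);
    glVertex2f(-1.0f, -1.0f);
    glVertex2f( 1.0f, -1.0f);
    glVertex2f( 1.0f,  1.0f);
    glVertex2f(-1.0f,  1.0f);
glEnd();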

Is anyone here familiar with this? Come and have a deeper discussion. Thanks again.

lyq@ios.ac.cn

ZbuffeR
02-05-2004, 02:22 AM
Here, I just want to do some general-purpose computation on the GPU.
Do you know the website http://www.gpgpu.org/ ? It seems to deal with exactly that field of applications.

Edit: spelling.
