use double precision inside GL - AGAIN

I just read a post from way back in 2001 about why OpenGL should or should not support double precision. If everyone agrees that double precision may only become a reality around 2030, then remove all the double-precision calls. The fact that they exist causes a lot of people to assume that if you use these calls, you can handle large coordinates. There is a simple solution that I’ve informally discussed with a graphics card guru, and he thought the implementation would be ‘relatively’ painless. Have a ‘locale’ like Java3D, which is the ONLY call that works in double precision and is only a translation transformation. That way, OpenGL can subtract/add this offset at the appropriate places. Doing it outside of OpenGL (i.e. doing it yourself) is very error prone and intrusive.

The fact is that with the volume of data people wish to render, and with double precision being as fast as or faster than float on most CPUs, everyone now uses double-precision maths. What is the goal of additions to any software? To give more functionality to the end user and, if possible, make that task easier. The end user for OpenGL is us, the programmers. So if we don’t address/solve this issue, then every programmer has to invent a way around the problem, rather than a few vendors saying that it is too hard. If an API becomes too hard to use, users will look at other APIs.
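For concreteness, here is a minimal sketch of the kind of ‘locale’ idea I mean, done today entirely on the application side (the Locale structure and the locale* names are hypothetical, not real GL calls): keep one double-precision origin, subtract it from everything before the values are narrowed to float, and leave the rest of the pipeline untouched.

[code]
// Hypothetical application-side "locale": one double-precision origin that is
// subtracted from all world coordinates before they are handed to OpenGL as floats.
#include <GL/gl.h>

struct Locale {
    double x, y, z;   // double-precision origin of the local region
};

static Locale g_locale = { 0.0, 0.0, 0.0 };

// Set the locale, e.g. somewhere near the viewer / area of interest.
void localeSet(double x, double y, double z)
{
    g_locale.x = x; g_locale.y = y; g_locale.z = z;
}

// Submit a world-space vertex given in double precision.  The subtraction is the
// only double-precision operation; the result is small enough to survive as float.
void localeVertex3d(double wx, double wy, double wz)
{
    glVertex3f((GLfloat)(wx - g_locale.x),
               (GLfloat)(wy - g_locale.y),
               (GLfloat)(wz - g_locale.z));
}

// Place the camera: the eye position is also expressed relative to the locale,
// so the translation that reaches the driver stays near zero.
void localeTranslateEye(double ex, double ey, double ez)
{
    glTranslated(ex - g_locale.x, ey - g_locale.y, ez - g_locale.z);
}
[/code]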

OK, I’ve had my 2 cents worth.

Hmm. I haven’t seen the data that says doubles are faster than float on most CPUs. SSE seems to favor floats for one thing…

Anyway, I’m not sure how much OpenGL can or should do to totally solve the problem, but here are some more thoughts (this is after having worked on EarthViewer, which needed lots of DP support).

Requiring the ModelView stack to be DP at all levels prior to rendering might not cost too much on the driver side nowadays. That would catch some of the problems and probably be backward compatible (some apps might rely on the lack of precision, but it’s something that could probably be turned on or off).

The most common problem case I’ve seen is zooming in very close to a very “big” object with jello vertices as a result. So as long as the final modelview+projection transform is within float tolerances, it will help the jello problem somewhat, but not solve it.

Verts themselves can also be scaled way up or down or translated way out into space. What part of the driver chain or HW is supposed to catch that the modelview contains a scale or transform that causes otherwise valid FP verts to lose precision in the pipeline? Does it do a DP subtraction on all vertices on the fly? Before or after scale and rotation?

Seems like a true solution also wants verts to be generally in a +/- 1 local coordinate system and all scaling/transform done with a local DP transform instead. And I’d be much more hesitant about requiring that, even though normalizing is fairly common in practice.

Are there other places DP would need to be handled in a typical float-only HW pipeline?

Avi

Hi Cyranose,

It would be nice to have DP for the ModelView matrix stack, but there would probably be side effects in the Texture matrix stack (I think).

Given that we currently live with the precision problems in the ModelView stack, limiting the change to just one additional translation (it being the only DP calculation) has more chance of being implemented.

I have implemented this strategy with very good results (outside of OpenGL). I had to wrap the OpenGL calls of interest to me to do it. But I still get caught out, because it is a very manual process, and whenever I use new GL calls I have to spend a fair deal of time to get it right.

There would be a number of tricky bits inside OpenGL to solve. When sending the double[4] plane values to the glTexGen calls, OpenGL would have to modify these values
(that happens anyway when using EYE_LINEAR).
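To illustrate that tricky bit, here is my own application-side workaround (a sketch, not anything the driver does today), assuming vertices are submitted as v - origin as in the locale idea: an OBJECT_LINEAR texgen plane must be offset to match, because the generated coordinate is the dot product of the plane with the now-shifted object-space vertex.

[code]
// Sketch: keep GL_OBJECT_LINEAR texgen results unchanged when vertices are
// re-based by a double-precision locale origin.  Fold the offset into the
// plane's distance term:
//   dot(p.xyz, v) + p.w  ==  dot(p.xyz, v - origin) + (p.w + dot(p.xyz, origin))
#include <GL/gl.h>

void localeTexGenObjectPlane(GLenum coord,                 // GL_S, GL_T, GL_R or GL_Q
                             const GLdouble worldPlane[4], // plane in original world space
                             const GLdouble origin[3])     // double-precision locale origin
{
    GLdouble shifted[4];
    shifted[0] = worldPlane[0];
    shifted[1] = worldPlane[1];
    shifted[2] = worldPlane[2];
    shifted[3] = worldPlane[3] + worldPlane[0] * origin[0]
                               + worldPlane[1] * origin[1]
                               + worldPlane[2] * origin[2];
    glTexGendv(coord, GL_OBJECT_PLANE, shifted);   // pass the adjusted plane through the normal call
}
[/code]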

If you are moving over a very large-scale database, you would probably have to move the Locale, which can lead to textures scrolling/translating, so some thought is required there.

Certainly, if the full ModelView/Texture stack were DP, nearly all the problems would disappear?
(Except for trying to display the universe, where, as you say, we would need quad precision :( )

The folks making the graphics cards know much better than I do what really happens inside OpenGL. So if they would like to add their 5 cents’ worth, please do!!! I know that the first vendor to implement such an extension would get a case of beer from me, plus my recommendation that ALL of my customers should get that card. Hint hint.

Originally posted by narrabeenzzz:
[b]Hi Cyranose,

It would be nice to have DP for the ModelView matrix stack, but there would probably be side effects in the Texture matrix stack (I think).

Given that we currently live with the precision problems in the ModelView stack, limiting the change to just one additional translation (it being the only DP calculation) has more chance of being implemented.
[/b]

Doing one DP addition per vertex per frame? I kind of doubt HW would want to support that if it didn’t have to.

Perhaps you could ask for DP support in vertex programs and do it that way? Like an ADP (ADD DP) instruction that uses two floats for one DP slot. Somehow, that seems more likely than changing the spec to add DP support throughout the pipeline.

Hmm. Maybe it can even be done manually with two floats of vastly different scale. With 23 bits of mantissa for each and appropriate exponents and predictable round-off errors, you might be able to recover 46 bits of mantissa. I’d need to think about it more to be sure.
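A rough CPU-side sketch of that two-float idea (the standard “double-single” split; nothing here is in the GL spec): split each double into a high float and a low float whose sum reconstructs most of the original mantissa, hand both to the vertex program as separate attributes, and do the offset subtraction high part first, then low part, which is exactly the ADP-style behaviour described above.

[code]
// Sketch of the "two floats for one double" split.
// Assumption: the hi/lo pair would feed a vertex program that subtracts a
// similarly split offset, hi from hi and lo from lo, before adding the results.
#include <cstdio>

struct SplitDouble {
    float hi;   // the double rounded to float: the most significant mantissa bits
    float lo;   // the remainder: the next chunk of mantissa bits
};

SplitDouble splitDouble(double d)
{
    SplitDouble s;
    s.hi = (float)d;                   // keeps the most significant bits
    s.lo = (float)(d - (double)s.hi);  // keeps what the first rounding threw away
    return s;
}

int main()
{
    // A large coordinate that a single float cannot hold exactly.
    double bigCoord = 6378137.25;      // metres, roughly Earth-radius scale
    SplitDouble s = splitDouble(bigCoord);
    double rebuilt = (double)s.hi + (double)s.lo;
    printf("hi=%.2f lo=%.6f rebuilt=%.6f\n", s.hi, s.lo, rebuilt);
    return 0;
}
[/code]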

Incidentally, in practice, we only had to do the DP vertex translation once in a while (on significant camera motion), not every frame, but we were creating dynamic vertices.

The implications for the texture matrix are interesting. I’m not sure about the stack, but I can see texgen being an issue.

Avi

Originally posted by Cyranose:
Doing one DP addition per vertex per frame? I kind of doubt HW would want to support that if it didn’t have to.

Could this be done by the computer’s FPU?
Would that stall the GPU too much? The good thing is we are not talking about a full DP FPU. Only add or subtract. A $1 chip? Or a bit of microcode.

Perhaps you could ask for DP support in vertex programs and do it that way? Like an ADP (ADD DP) instruction that uses two floats for one DP slot. Somehow, that seems more likely than changing the spec to add DP support throughout the pipeline.

This has been suggested by someone, but texgen is left out in the cold? What other parts?

Hmm. Maybe it can even be done manually with two floats of vastly different scale. With 23 bits of mantissa for each and appropriate exponents and predictable round-off errors, you might be able to recover 46 bits of mantissa. I’d need to think about it more to be sure.

Incidentally, in practice, we only had to do the DP vertex translation once in a while (on significant camera motion), not every frame, but we were creating dynamic vertices.

Ditto.

The implications for the texture matrix are interesting. I’m not sure about the stack, but I can see texgen being an issue.

All of this is not an issue for games. They have a constrained problem. With real-world apps, people want mm accuracy over large areas. Not sure what physics people do with things on the scale of a Planck length,
which is roughly equal to 1.6 x 10^-35 m, or about 10^-20 times the size of a proton ;)

Originally posted by narrabeenzzz:
All of this is not an issue for games. They have a constrained problem. With real-world apps, people want mm accuracy over large areas. Not sure what physics people do with things on the scale of a Planck length,
which is roughly equal to 1.6 x 10^-35 m, or about 10^-20 times the size of a proton ;)

I’m not sure I agree with that. I can think of a few MMO games that will want to go from space down to a planet’s surface without the traditional canned cross-fade effect. And that needs some sort of DP work-around or you get the jello effect.

Anyway, doing the DP add on the FPU isn’t practical since the vertices will ideally reside in AGP or video mem for speed. OTOH, there’s nothing stopping you from writing your own myglFixPrecision(vertex_array, offset) function on the CPU side of things, if that’s your preference.
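Since it’s mentioned above, here is roughly what such a CPU-side helper might look like (myglFixPrecision is the hypothetical name from the post, not a real GL call): take the authoritative double-precision positions, subtract the double-precision offset once, and write the small float results into the array OpenGL actually draws from.

[code]
// Sketch of the hypothetical myglFixPrecision(vertex_array, offset) idea.
// Assumption: positions are kept in double precision on the CPU and re-based
// into a float array whenever the offset (frame of reference) changes.
#include <GL/gl.h>
#include <cstddef>

void myglFixPrecision(const double* srcXYZ,    // authoritative DP positions, 3 per vertex
                      float*        dstXYZ,    // float array handed to glVertexPointer
                      std::size_t   vertexCount,
                      const double  offset[3]) // DP frame-of-reference origin
{
    for (std::size_t i = 0; i < vertexCount; ++i) {
        dstXYZ[3 * i + 0] = (float)(srcXYZ[3 * i + 0] - offset[0]);
        dstXYZ[3 * i + 1] = (float)(srcXYZ[3 * i + 1] - offset[1]);
        dstXYZ[3 * i + 2] = (float)(srcXYZ[3 * i + 2] - offset[2]);
    }
}

// Typical use: re-run only when the offset moves far from the camera, then
//   myglFixPrecision(worldVerts, drawVerts, n, currentOrigin);
//   glVertexPointer(3, GL_FLOAT, 0, drawVerts);
[/code]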

Avi

what physics people do with things on the scale of a Planck length,
which is roughly equal to 1.6 x 10^-35 m, or about 10^-20 times the size of a proton ;)

They’re not concerned about rendering in hardware. They’re trying to solve a computation. Worst case, they use a floating-point class that provides arbitrary precision. Much of the time, doubles are good enough for them. After all, you don’t have to work in meters; you can work in nanometers.

I’m not sure I agree with that. I can think of a few MMO games that will want to go from space down to a planet’s surface without the traditional canned cross-fade effect. And that needs some sort of DP work-around or you get the jello effect.

There are a dozen really good ways to fake this without resorting to a cross-fade or double-precision.

In a general-purpose application, you might want to have doubles to “fix” the problem. But, for any particular circumstance, you can always fake it. For example, you could render things beyond a certain depth into a texture and throw that onto the background.

Originally posted by Korval:
[b] There are a dozen really good ways to fake this without resorting to a cross-fade or double-precision.

In a general-purpose application, you might want to have doubles to “fix” the problem. But, for any particular circumstance, you can always fake it. For example, you could render things beyond a certain depth into a texture and throw that onto the background. [/b]

A dozen? I’d love to hear the top four or five, or even the top two. I’m serious. I’ve worked on this issue for a long time, so I’m sure we can all learn something here.

However, in the example you gave, a form of IBR might help with the loss of Z precision across a scene, but that’s not the problem we’re addressing.

If you were to zoom straight “down” towards an earth-sized planet from space, for example, you’d see the jello effect when you got really close, even with the entire image being more or less at the same Z within a frame.

The jello effect is pretty much independent of screen-space Z, mainly because typical near/far ranges don’t result in more than a few powers of 2 difference in scale between near and far planes. Zooming down from space to an earth-sized planet zooms over 16 powers of two typically. And anything over 12 or 13 (sometimes less) is probably going to be a problem due to multiplication effects in the viewing transform.
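To put a rough number on that (my arithmetic, not from the thread): a float has about 24 bits of significand, so its spacing at Earth-radius magnitudes is half a metre, which is exactly the sub-metre wobble that shows up as jello. A quick check:

[code]
// Quick check of float spacing ("ulp") at planetary coordinate magnitudes.
// Assumption: coordinates are stored in metres from the planet centre.
#include <cmath>
#include <cstdio>

int main()
{
    float earthRadius = 6378137.0f;   // metres, roughly
    // Distance to the next representable float above earthRadius:
    float ulp = std::nextafter(earthRadius, 1.0e9f) - earthRadius;
    printf("float spacing near %.0f m is %.3f m\n", earthRadius, ulp);
    // Prints 0.500 m: vertex positions near the surface can only move in
    // half-metre steps, hence the jello once you zoom in to cm/mm scales.
    return 0;
}
[/code]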

Anyway, using doubles throughout the pipeline thankfully isn’t necessary. The general solution is to constantly shift the frame of reference such that the center of viewing (not the camera’s center, but the viewing target’s) is somewhere near 0.0 float and that’s most easily done with DP matrices and an occasional DP vertex buffer offset to re-center the verts as needed.
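For what it’s worth, a minimal sketch of that re-centering scheme as done on the application side (the names and structure here are mine; nothing below is a GL extension): keep the camera and world positions in doubles, compute the big translations in doubles so they cancel, and only narrow to float once the numbers are small.

[code]
// Sketch of application-side re-centering: all large numbers stay in doubles,
// OpenGL only ever sees small float values near the origin.
#include <GL/gl.h>

struct Vec3d { double x, y, z; };

static Vec3d g_origin = { 0.0, 0.0, 0.0 };     // DP frame of reference, near the view target

// Move the origin (and, elsewhere, re-offset the vertex buffers with a DP
// subtraction) whenever the viewing target drifts too far from it -- not every frame.
void maybeRecenter(const Vec3d& viewTarget, double threshold)
{
    double dx = viewTarget.x - g_origin.x;
    double dy = viewTarget.y - g_origin.y;
    double dz = viewTarget.z - g_origin.z;
    if (dx * dx + dy * dy + dz * dz > threshold * threshold)
        g_origin = viewTarget;                 // vertex re-centering happens here too
}

// Load a modelview translation computed in double precision.  Because both the
// eye and the vertex data are expressed relative to g_origin, the values handed
// to GL are small and survive the float pipeline intact.
void loadEyeTranslation(const Vec3d& eyeWorld)
{
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    glTranslated(-(eyeWorld.x - g_origin.x),
                 -(eyeWorld.y - g_origin.y),
                 -(eyeWorld.z - g_origin.z));  // the rotation about the eye would follow
}
[/code]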

Avi

I guess you could have a number of ways to solve the jello problem. It all comes down to how much work you want to do outside of OpenGL. I’ve also banged my head against this problem for a few years, and certainly the simplest approach is subtracting the eye or target coordinates before calling OpenGL for anything. It certainly sounds like it has been discovered a few times by different people. This is looking like a pattern to me.