How to set up projectionMatrix/worldMatrix for use with an inverted float depth buffer?

Osbios · February 9, 2015, 11:46am

So I try to get the luxury of a decent depth buffer resolution but so far only see the horrors of z fighting.

Here is what I “understand”:

First I need GL_NV_depth_buffer_float with glDepthRangedNV(-1.0, 1.0)
or
GL_ARB_clip_control with glClipControl(GL_LOWER_LEFT, GL_ZERO_TO_ONE)

(As I understand it, glDepthRange in core GL >= 4.3 is still clamped and therefore still useless for this?)

Then I use a FBO with a depth buffer of the type GL_DEPTH_COMPONENT32F

And now I also have to use an inverted projection matrix so that 0 becomes the far plane.
Including using clearDepth(1.0f) and glDepthFunc(GL_GREATER);

I tried scaling by (1.0f, 1.0f, -1.0f), swapping the near and far values, and a bunch of other things, but I never got any more precision out of it.
The only thing I noticed is that I get more precision (about 1 bit?) when I use glClipControl(GL_LOWER_LEFT, GL_ZERO_TO_ONE) in all cases, even when not using a float depth buffer.
This makes me believe that my assumption that glClipControl(GL_LOWER_LEFT, GL_ZERO_TO_ONE) does the same as glDepthRangedNV(-1.0, 1.0) is totally wrong.

For testing I use 2 simple planes with width 10, depth 1000, and a height difference of 0.1
With zNear = 0.01 and zFar = 100000.0 I get a lot of z-fighting even before a distance of 1000.0, with everything I have tried so far.

GClements · February 9, 2015, 12:18pm

There’s no reason that any of that will help significantly with Z-fighting.

A 32-bit float still only has a 23-bit significand, so it doesn’t automatically give you better precision than a 24-bit fixed-point depth buffer.

One thing that will help is to use linear depth rather than reciprocal depth. But that means having the fragment shader write gl_FragDepth, which largely precludes early-depth optimisation (the depth* layout qualifiers for gl_FragDepth added in GLSL 4.2 are only of marginal benefit here).

The problem with reciprocal depth is that it means that roughly half of your depth range is used for -Z values between the near distance and twice the near distance, and everything between twice the near distance and the far distance gets the other half. So in practice >99% of your -Z range is mapped to the depth range 0.5…1.0, which only has 23 bits of precision with either 32-bit floating-point or 24-bit fixed-point.
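To make that distribution concrete, here is a minimal sketch of the window-space depth a conventional perspective projection produces, assuming the default [0, 1] depth range and an eye-space distance d measured along -Z. The function name conventionalDepth is made up for illustration; it is not part of any API.

```cpp
#include <cassert>

// Window-space depth in [0, 1] for a conventional perspective projection
// (near plane -> 0, far plane -> 1) at eye-space distance d.
// Hypothetical helper: derived from the usual glFrustum-style matrix
// followed by the default depth-range mapping.
double conventionalDepth(double d, double zNear, double zFar)
{
    assert(zNear > 0.0 && zFar > zNear && d > 0.0);
    return zFar * (d - zNear) / (d * (zFar - zNear));
}
```

With the values from the first post (zNear = 0.01, zFar = 100000.0), conventionalDepth(0.02, ...) already comes out at roughly 0.5, and a distance of 1000 lands at about 0.99999, so nearly all of the scene shares the top sliver of the range.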

Osbios · February 9, 2015, 1:35pm

Well, the whole point of a reversed float depth buffer is to make use of the extra precision that floating-point numbers have around 0.

Source: Outerra: Maximizing Depth Buffer Range and Precision
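A quick way to see the effect on the CPU is to measure the gap between a float and its next representable neighbour. This is only a sketch of the number format, not of what any particular GPU does; spacingAbove is a name invented for this example.

```cpp
#include <cassert>
#include <cmath>

// Gap between x and the next larger representable float.
// Hypothetical helper for illustration only.
float spacingAbove(float x)
{
    assert(x >= 0.0f && std::isfinite(x));
    return std::nextafter(x, INFINITY) - x;
}
```

Near 1.0 the gap is 2^-23 (about 1.2e-7), while near 1e-6 it is on the order of 1e-13, which is why mapping distant geometry towards 0 instead of towards 1 preserves so many more distinct depth values.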

GClements · February 10, 2015, 6:02am

That would probably help, if you can ensure that precision isn’t lost in intermediate calculations. In that regard, inverting the near/far values in the calculation of the projection matrix would probably be better than glDepthRange(1.0,0.0), but still may not be sufficient. Essentially, the precision of 1-x can’t be any better than the precision of x itself.
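The 1-x point can be demonstrated with plain float arithmetic. The sketch below (the function name is made up) takes a small reversed depth value, stores it as a conventional depth 1-x in a float, and flips it back:

```cpp
#include <cassert>

// Store a reversed depth value as conventional depth (1 - x) in a float,
// then flip it back. The volatile forces the intermediate to be rounded
// to float, as it would be when written to a 32-bit depth buffer.
// Hypothetical helper for illustration only.
float roundTripThroughConventional(float reversedDepth)
{
    assert(reversedDepth >= 0.0f && reversedDepth <= 1.0f);
    volatile float conventional = 1.0f - reversedDepth;
    return 1.0f - conventional;
}
```

A value such as 1e-9f does not survive the trip: 1 - 1e-9 rounds to exactly 1.0f, so flipping it back yields 0. Values that are large relative to the float spacing near 1, such as 0.25f, come back unchanged.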

If the precision loss is occurring in the projection transformation, using double-precision for the projection matrix might help.

Osbios · February 10, 2015, 7:20am

Yes, glClipControl(GL_LOWER_LEFT, GL_ZERO_TO_ONE) combined with glDepthRange(1.0,0.0) is marginally better, but not what I was looking for.

I got a working projection matrix for glClipControl(GL_LOWER_LEFT, GL_ZERO_TO_ONE).

glClipControl(GL_LOWER_LEFT, GL_ZERO_TO_ONE);
glDepthFunc(GL_GREATER);
glClearDepth(0.0);

//Reversed Infinite Projection Matrix
const float zNear = 0.001f;
const double viewAngleVertical = 90.0 * 3.14159265358979323846 / 180.0; // vertical FOV of 90 degrees, converted to radians for tan()
const float f = 1.0 / tan(viewAngleVertical / 2.0); // 1.0 / tan == cotangent
const float aspect = float(resolutionX) / float(resolutionY);
mat4 projectionMatrix =
{
    f/aspect, 0.0f,    0.0f,  0.0f,
        0.0f,    f,    0.0f,  0.0f,
        0.0f, 0.0f,    0.0f, -1.0f,
        0.0f, 0.0f,   zNear,  0.0f  // zNear (not 2*zNear) puts the near plane at depth 1.0 under GL_ZERO_TO_ONE
};

Really good resolution! At a distance of 10000 it still works fine for differences of 0.01. And that is with a zNear value of 0.001.
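As a rough plausibility check of those numbers: for a reversed infinite projection with a [0, 1] clip range, the stored depth at eye-space distance d is zNear / d. The sketch below assumes exactly that model and ignores any error introduced by the GPU's interpolators; reversedInfiniteDepth is a name invented for this example.

```cpp
#include <cassert>

// Depth stored in a 32-bit float buffer for a reversed infinite projection
// under GL_ZERO_TO_ONE: near plane -> 1, infinity -> 0.
// Hypothetical helper; computed in double and rounded once to float.
float reversedInfiniteDepth(double d, double zNear)
{
    assert(zNear > 0.0 && d >= zNear);
    return static_cast<float>(zNear / d);
}
```

With zNear = 0.001, a surface at 10000 gets a depth of about 1e-7, where neighbouring floats are roughly 7e-15 apart. Moving the surface by 0.01 changes the depth by about 1e-13, so the two surfaces end up more than ten representable values apart.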

But I did not get a working projection matrix for glDepthRangedNV(-1.0, 1.0) so far. Would be a nice fallback, even if I think there isn’t much stuff out there supporting NV_depth_buffer_float but not GL_ARB_clip_control.

Interestingly, the radeonsi/Mesa drivers lose 1 bit of precision somewhere in all cases compared to Catalyst on Win7, but that is another matter.

Alfonse_Reinheart · February 10, 2015, 8:11am

[QUOTE=Osbios] Would be a nice fallback, even if I think there isn’t much stuff out there supporting NV_depth_buffer_float but not GL_ARB_clip_control. [/QUOTE]

It’s not a very good fallback, since no version of MacOSX supports either one. Also, clip control is brand new, so it’s not as widely supported as you might think.

Osbios · February 10, 2015, 8:43am

Well depending on how glDepthRange(-1.0, 1.0) actually acts in GL Core >= 4.3, you also could use that. Sure that also is not supported by MacOS right now, but that is not my target at the moment.
Also ARB_clip_control looks really easy to implement and is mainly designed to simplify porting d3d code to GL. So I expect them to implement it in the near future.

Not using this kind of depth precision is out of the question. I tasted blood now!

Alfonse_Reinheart · February 10, 2015, 9:01am

[QUOTE=Osbios] Well depending on how glDepthRange(-1.0, 1.0) actually acts in GL Core >= 4.3, you also could use that. [/QUOTE]

Nonsense. The change you’re referring to doesn’t mean what you think it does.

In 4.2, they removed the GLclampf type from the API. That does not mean that those parameters suddenly became unclamped. They simply removed the implicit clamping based on the typename (which they did so that the spec would have to be explicit about any parameter clamping). The specification clearly states that the parameters to glDepthRange (and all its variations) will be clamped to the [0, 1] range.

Admittedly, they only “clearly stated” this as of OpenGL 4.3. But it’s there now.

[QUOTE=Osbios] Also ARB_clip_control looks really easy to implement and is mainly designed to simplify porting d3d code to GL. So I expect them to implement it in the near future. [/QUOTE]

I wouldn’t hold my breath. MacOSX has always been slow to add OpenGL features. After all, OpenGL is at version 4.5, while MacOSX only supports version 4.1, which is four years old. And it’s not like 4.1 was state-of-the-art when MacOSX 10.9 came out; OpenGL was at 4.4 when 10.9 was released.

That’s not to say that it won’t be there. But Apple only does significant OpenGL upgrades when they release a new OS version. And they already shipped 10.10 without significant OpenGL upgrades. So your best bet will be 10.11.

Osbios · March 1, 2015, 7:51am

To make this more useful for future google archeologists here is my solution.

The projection matrix is the same for both variants:

With GL_ARB_clip_control:


glDepthFunc(GL_GREATER); //default would be GL_LESS
glClearDepth(0.0);       //default would be 1.0

glClipControl(GL_LOWER_LEFT, GL_ZERO_TO_ONE);

const float zNear = 0.001f;
const float viewAngleVertical = glm::radians(90.0f); // vertical FOV; tan() expects radians
const float f = 1.0f / tan(viewAngleVertical / 2.0f); // 1.0 / tan(X) == cotangent(X)
const float aspect = float(resolutionX) / float(resolutionY);

//infinite Perspective matrix reversed
glm::mat4 projectionMatrix = {
  f/aspect, 0.0f,  0.0f,  0.0f,
      0.0f,    f,  0.0f,  0.0f,
      0.0f, 0.0f,  0.0f, -1.0f,
      0.0f, 0.0f, zNear,  0.0f
};

With GL_NV_depth_buffer_float:

With Catalyst you must use GL_DEPTH_COMPONENT32F_NV or GL_DEPTH32F_STENCIL8_NV as the texture format!
If you use GL_DEPTH_COMPONENT32F or GL_DEPTH32F_STENCIL8, then glDepthRangedNV will be ignored. (Or clamped, not sure.)

I don’t know how NV drivers handle this. Would be nice if somebody could test and report back.


glDepthFunc(GL_GREATER); //default would be GL_LESS
glClearDepth(0.0);       //default would be 1.0

glDepthRangedNV(-1.0, 1.0);

const float zNear = 0.001f;
const float viewAngleVertical = glm::radians(90.0f); // vertical FOV; tan() expects radians
const float f = 1.0f / tan(viewAngleVertical / 2.0f); // 1.0 / tan(X) == cotangent(X)
const float aspect = float(resolutionX) / float(resolutionY);

//infinite Perspective matrix reversed
glm::mat4 projectionMatrix = {
  f/aspect, 0.0f,  0.0f,  0.0f,
      0.0f,    f,  0.0f,  0.0f,
      0.0f, 0.0f,  0.0f, -1.0f,
      0.0f, 0.0f, zNear,  0.0f
};

Just for reference, this is how you would set it up without a reversed depth buffer.
Of course this comes with terrible depth precision.


glDepthFunc(GL_LESS); //this is the default value of OpenGL
glClearDepth(1.0);    //also the default

const float zNear = 0.001f;
const float viewAngleVertical = glm::radians(90.0f); // vertical FOV; tan() expects radians
const float f = 1.0f / tan(viewAngleVertical / 2.0f); // 1.0 / tan(X) == cotangent(X)
const float aspect = float(resolutionX) / float(resolutionY);

//infinite Perspective matrix
glm::mat4 projectionMatrix = {
  f/aspect, 0.0f,        0.0f,  0.0f,
      0.0f,    f,        0.0f,  0.0f,
      0.0f, 0.0f,       -1.0f, -1.0f,
      0.0f, 0.0f, -zNear*2.0f,  0.0f
};

macarter · March 24, 2015, 11:02am

[QUOTE=Osbios;1264627] …I don’t know how NV drivers handle this. Would be nice if somebody could test and report back…
[/QUOTE]

I tested an NVIDIA GTX 680 with driver version 347.52. Both glDepthRangedNV(-1, 1) and glClipControl(GL_LOWER_LEFT, GL_ZERO_TO_ONE) with glDepthRange(0, 1) seem to do exactly the same thing. There was no need to use the GL_DEPTH_COMPONENT32F_NV format from the GL_NV_depth_buffer_float extension for my FBO rendering in either case; GL_DEPTH_COMPONENT32F worked well.
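One way to see why the two setups should agree is to write out the mapping from normalized device depth to window depth for each clip convention, as given in the OpenGL and ARB_clip_control specifications. The function names below are invented for this sketch; n and f are the depth range values.

```cpp
#include <cassert>

// Window depth under the default GL_NEGATIVE_ONE_TO_ONE convention,
// with depth range [n, f] as set by glDepthRange or glDepthRangedNV.
double windowDepthNegOneToOne(double zNdc, double n, double f)
{
    assert(zNdc >= -1.0 && zNdc <= 1.0);
    return 0.5 * (f - n) * zNdc + 0.5 * (n + f);
}

// Window depth under GL_ZERO_TO_ONE (ARB_clip_control).
double windowDepthZeroToOne(double zNdc, double n, double f)
{
    assert(zNdc >= 0.0 && zNdc <= 1.0);
    return (f - n) * zNdc + n;
}
```

With an unclamped range of (-1, 1) the first mapping reduces to the identity, and so does the second with the range (0, 1). Since the reversed projection only ever produces normalized depths in (0, 1], both paths write the same value. With the default range (0, 1) and the default convention, the same input would instead be squeezed into the upper half, e.g. 0.25 becomes 0.625.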

If you absolutely must have a non-infinite far clipping plane, use this modification to the projection matrix:

projectionMatrix[2][2] = zNear / (zFar - zNear)
projectionMatrix[3][2] = projectionMatrix[2][2] == 0 ? zNear : (zFar * zNear / (zFar - zNear))

But it will not result in any improvement in depth accuracy. Warning: double-check the signs of these terms against your own conventions; as written they assume the reversed [0, 1] setup above, with depth 1 at zNear and 0 at zFar.
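Since the signs are easy to get wrong, a small CPU-side check can help. The sketch below assumes the reversed convention used in this thread (GL_ZERO_TO_ONE, depth 1 at zNear, depth 0 at zFar, camera looking down -Z so that w_clip = d). The struct and function names are made up for illustration.

```cpp
#include <cassert>

// The two matrix terms that differ from the infinite reversed projection.
struct ReversedDepthTerms { double m22; double m32; };

// Terms derived from depth(zNear) = 1 and depth(zFar) = 0.
ReversedDepthTerms reversedFiniteTerms(double zNear, double zFar)
{
    assert(zNear > 0.0 && zFar > zNear);
    return { zNear / (zFar - zNear), zFar * zNear / (zFar - zNear) };
}

// Depth those terms produce at eye-space distance d
// (z_eye = -d, so z_clip = m22 * -d + m32 and w_clip = d).
double reversedFiniteDepth(double d, double zNear, double zFar)
{
    const ReversedDepthTerms t = reversedFiniteTerms(zNear, zFar);
    return (t.m22 * -d + t.m32) / d;
}
```

Evaluating reversedFiniteDepth at d = zNear and d = zFar should give 1 and 0 respectively; if it does not, one of the signs does not match the rest of your matrix.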

I consider the standard projection matrix and fixed point depth to be archaic. An extension specifying a bit range extraction of the IEEE float value for use in lower precision 16 or 24 bit depth buffers should be provided. For instance, the low seven bits of the exponent combined with the high seventeen bits of the mantissa would yield an excellent 24 bit depth buffer.
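To illustrate what such a bit extraction could look like, here is a CPU-side sketch. It is purely hypothetical (no such extension exists as far as this thread establishes) and assumes depth values in [0, 1], where the sign bit and the top exponent bit are always zero.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Pack a float depth in [0, 1] into 24 bits: the low seven exponent bits
// followed by the high seventeen mantissa bits. For this input range that
// is simply bits 6..29 of the IEEE 754 encoding.
// Hypothetical helper for illustration only.
std::uint32_t packDepth24(float depth)
{
    assert(depth >= 0.0f && depth <= 1.0f);
    std::uint32_t bits;
    std::memcpy(&bits, &depth, sizeof bits); // well-defined type pun
    return (bits >> 6) & 0xFFFFFFu;
}
```

Because the encoding of non-negative floats is monotonic, the packed values keep their ordering, so an ordinary integer depth comparison would still work on them.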

macarter · April 25, 2015, 2:13pm

Yesterday I experimented with depth buffer reads combined with extreme depth scales. My setup consisted of a large, coarsely tessellated ground plane observed from high altitude with the horizon line near the middle of the viewport. I varied the near plane distance to fill the frame buffer with arbitrarily scaled depth values. I was pleased to see that 99% of adjacent depth values were the same or differed by only a one-bit change in the floating-point mantissa, strong evidence that the GTX 680 depth interpolators implement full IEEE 32-bit floating-point precision. I did not see any denormalized floating-point values, evidence that the GTX 680 does not support floating-point numbers smaller than 2^-126 (1.1755×10^-38) other than zero. Therefore, most depth value steps should be proportional to depth, matching the floating-point mantissa steps, which vary between one part in 2^23 (8388608) and one part in 2^24 - 1 (16777215). (A double step is possible from the sum of the x and y depth interpolators.)

Accuracy deteriorated when depth values were made smaller than about 10^-33. This is expected as the depth interpolator coefficient exponents begin to underflow.

Unlike a fixed-point depth buffer, the near plane setting has no effect on depth accuracy. It only scales the distance at which depth accuracy begins to deteriorate due to underflow. A one millimeter near plane permits highly accurate depth to beyond the size of the observable universe (8.7×10^29 mm); see http://en.wikipedia.org/wiki/Orders_of_magnitude_(length)#Astronomical. If z clipping could be disabled, depths as small as the Planck length (10^-32 mm) could be supported at the same time.
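The representability part of that claim is easy to check on the CPU. The sketch below assumes the reversed infinite mapping depth = zNear / d and only asks whether the result is still a normalized 32-bit float; it says nothing about interpolator accuracy on real hardware. The function name is invented for this example.

```cpp
#include <cassert>
#include <cmath>

// True if the reversed depth zNear / d is still a normalized float,
// i.e. it has not underflowed to a denormal or to zero.
// Hypothetical helper for illustration only.
bool isNormalReversedDepth(double d, double zNear)
{
    assert(zNear > 0.0 && d >= zNear);
    const float depth = static_cast<float>(zNear / d);
    return std::isnormal(depth);
}
```

With zNear = 1 (millimeter) and d = 8.7e29 the depth is about 1.1e-30, comfortably above the 2^-126 limit, whereas a distance of 1e39 would fall below it.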

Inverting the depth value to get true depth is nearly lossless. The program below demonstrates nearly 83% of inverted depth values map to unique depths.

#include <stdint.h>
#include <stdio.h>
#include <math.h>
#define TINY_FLOAT  (pow(2, -126))
int main(void)
{
    uint32_t i, begin, end, diff, last, run, longRun;
    union
    {
        uint32_t l;  // exactly 32 bits, so it overlays the float on every platform
        float f;
    } u;
    // step through all possible floating point numbers from TINY_FLOAT to 1.0
    u.f   = TINY_FLOAT;
    begin = u.l;
    u.f   = 1;
    end   = u.l;
    for (i = begin, last = diff = longRun = run = 0; i < end; last = u.l, i++)
    {
        u.l = i;
        u.f = 1 / u.f;
        if (u.l != last) // test for a change from the previous inversion result
        {
            if (run > longRun) longRun = run;
            diff++;     // new value
            run = 0;
        }
        else run++;     // duplicate value
    }
    printf("Unique inversion fraction %g, from %g(0x%lX) to %g(0x%lX),"
           " longest run %lu\n", (double)diff / (end - begin), TINY_FLOAT,
           (unsigned long)begin, 1.0, (unsigned long)end,
           (unsigned long)longRun + 1);
    return 0;
}

I see no room for improvement on these results.

Osbios · April 29, 2015, 8:17am

Well there is GL_ARB_depth_clamp (Core since 3.2). So maybe a simple glDisable(GL_DEPTH_CLAMP) could do the trick? Never used that one and have no idea how it interacts with clip_control.
But maybe with that you can fully use the floating-point range and get Planck length accuracy!?

Alfonse_Reinheart · April 29, 2015, 8:25am

That wouldn’t work (also, it’s glEnable(GL_DEPTH_CLAMP) that turns off near/far clipping). Depth clamping means that Z values closer than the camera z-near are clamped.

Without depth clamping, if the perspective projection transform would have resulted in a Z outside the [0, 1] range (or without clip_control, outside of the [-1, 1] range), then the primitive would be clipped to within that range. With depth clamping, the value is clamped to that range. So it doesn’t give you any extra precision; all values closer than the camera z-near will have the same depth: 0.
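As a sketch of the clamping step only: with GL_DEPTH_CLAMP enabled, the window depth is clamped to the range spanned by the two glDepthRange values instead of the primitive being clipped. The function name is made up, and the code assumes the core glDepthRange case where both range values already lie in [0, 1].

```cpp
#include <algorithm>
#include <cassert>

// Window depth after depth clamping: the value is forced into
// [min(n, f), max(n, f)], where n and f are the depth range values.
// Hypothetical helper for illustration only.
double clampedWindowDepth(double zWindow, double n, double f)
{
    assert(n >= 0.0 && n <= 1.0 && f >= 0.0 && f <= 1.0);
    const double lo = std::min(n, f);
    const double hi = std::max(n, f);
    return std::min(std::max(zWindow, lo), hi);
}
```

Every out-of-range input collapses onto the same boundary value, which is why clamping cannot add precision. Which boundary that is depends on the mapping: values below the range end up at the lower bound, values above it at the upper bound.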

macarter · May 1, 2015, 8:30am

Actually with inverted depth the closer values will clamp to 1.
There must have been some debate over floating point depth clamps. NV_depth_buffer_float explicitly states floating point depth values are not clamped. ARB_depth_buffer_float explicitly states they are clamped. With inverted depth I desire a mixed clamping mode, clamping to zero (infinite depth), but not to 1.

Alfonse_Reinheart · May 1, 2015, 8:49am

Fair enough.