Deferred shading and light volumes

Hello,
currently I’m trying to optimize my deferred shading by using light volumes so that only the fragments affected by a light source are processed. I’ve read many tutorials, and I’m still confused about how to do that.
First of all I’ve created a crude sphere around my point light source and rendered it for testing purposes in the first (geometry) pass (see screenshot). But this doesn’t do any good, since the determination of the pixels to light up actually happens during the second (light) pass, right? Right now I draw a 2D fullscreen quad in the range [-1,1], supply the VS with the quad vertex coordinates (gl_Position = vec4(li_vtx.xyz, 1) * 2.0 - 1.0) and apply the light fragment shader. Fine so far. But how do I do that with a 3D sphere around the light source during the second (light) pass? I guess this has to be done somehow in screen space for every pixel? If I draw the sphere in 3D space in the light pass and apply the FS, I see only the sphere, and it seems that the FS tries to draw the textures onto the sphere. I simply don’t understand how to do exactly this step - using this sphere to determine the pixels the FS has to cover for the lighting pass. :confused:

Any comprehensive input is very much appreciated :slight_smile:

Thanks and regards
Saski

You can do that, though conventional wisdom is that if you have a lot of light sources (the goal), the overhead of all that per-light light-volume rasterization and the state changes can hinder lighting performance. It can be cheaper to just use screen-aligned quads … more on that in a sec. That also opens you up to tile-based deferred shading, where the real perf++ can be had if you have lots of light sources and/or light volume overlap in screen space.

For the light volumes thing, think about it this way. If on top of your G-buffer, you rasterize (with the same MV and PROJ) an opaque solid representing a light volume (where a light volume is the furthest extent of the light source’s illumination), then you know that it will occlude (in screen-space XY) every single G-buffer sample that it could possibly illuminate (and potentially a lot more if you consider screen-space Z). Right? And it’s typically less fill than a full-screen quad, right? Ok, so don’t render it opaque. Just use its rasterization to give you samples in the G-buffer that you need to run your lighting shader on for that light source. Congrats! You’ve just limited your lighting fill.

Now, about the over-coverage in screen-space Z… Think of a light source volume 1) completely in front of an opaque fragment, or 2) completely behind an opaque fragment. With the normal single-sided depth test, you can take care of one of these cases (rendering front faces with a LEQUAL depth test, or rendering back faces with a GEQUAL depth test, where the depth buffer is the G-buffer’s depth buffer). But to take care of both you need some special sauce. One approach is to do both passes, using stencil to AND the first result into the second. That gives you a precise volume-solid test in Z, but is even more expensive. Another approach is to use the depth bounds test and test against both a min Z and a max Z at once – that avoids the multipass per light as with the stencil trick, though it gives you only a constant min and max screen Z to trim fragments in Z (in practice, the min/max Z of your light source volume solid). And while it’s supported on NVidia, I don’t think I’ve seen confirmation that AMD supports it yet.
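For the depth bounds flavor, the min/max window-space Z of the volume is computed on the CPU from the light’s eye-space depth and radius and fed to glDepthBoundsEXT. A hedged sketch of that math (plain C, standard perspective depth mapping with near/far planes n and f; the function names are mine):

```c
/* Map an eye-space Z (negative in front of the camera) to window-space
 * depth in [0,1] using the standard GL perspective projection:
 *   ndcZ = (f+n)/(f-n) + 2fn/((f-n)*eyeZ),  winZ = ndcZ*0.5 + 0.5 */
static float eye_to_window_z(float eyeZ, float n, float f)
{
    if (eyeZ >= -n) return 0.0f;     /* at or in front of the near plane */
    float ndc = (f + n) / (f - n) + (2.0f * f * n) / ((f - n) * eyeZ);
    float win = ndc * 0.5f + 0.5f;
    if (win < 0.0f) win = 0.0f;      /* clamp to the depth range */
    if (win > 1.0f) win = 1.0f;
    return win;
}

/* Window-space depth bounds of a light volume with eye-space center
 * depth eyeZ and radius r -- the [zmin, zmax] for the depth bounds test. */
void light_depth_bounds(float eyeZ, float r, float n, float f,
                        float *zmin, float *zmax)
{
    *zmin = eye_to_window_z(eyeZ + r, n, f);  /* nearest point of volume  */
    *zmax = eye_to_window_z(eyeZ - r, n, f);  /* farthest point of volume */
}
```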

As I said, these are all nice, but can be problematic to batch together to avoid a bunch of state change “pipeline bubbles” per light. From what I gather, often what is done nowadays is to take the bounding solid for each light, bin it by a screen-space “tile”, and then blast the lights in a single tile all-in-one-go. Saves G-buffer read/write fill when there is potentially a lot of light volume overlap in screen space, and even more importantly allows you to batch a bunch of light sources together with no state changes in between.
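A minimal CPU-side binning sketch (plain C; TILE_SIZE, the struct, and the function are mine, just to illustrate the idea): project each light’s volume to a screen-space rectangle, then map that rectangle to the range of tiles it overlaps. Per-tile light lists then come from looping over that range and appending the light’s index.

```c
#define TILE_SIZE 16  /* pixels per screen tile; a common choice */

typedef struct { int x0, y0, x1, y1; } TileRange;  /* inclusive tile indices */

/* Bin one light's screen-space bounding rectangle (in pixels) into the
 * tile grid of a width x height framebuffer.  Returns 0 if the light
 * is entirely off screen. */
int bin_light(int minX, int minY, int maxX, int maxY,
              int width, int height, TileRange *out)
{
    if (maxX < 0 || maxY < 0 || minX >= width || minY >= height)
        return 0;                               /* off screen */
    if (minX < 0) minX = 0;                     /* clamp to the screen */
    if (minY < 0) minY = 0;
    if (maxX >= width)  maxX = width  - 1;
    if (maxY >= height) maxY = height - 1;
    out->x0 = minX / TILE_SIZE;
    out->y0 = minY / TILE_SIZE;
    out->x1 = maxX / TILE_SIZE;
    out->y1 = maxY / TILE_SIZE;
    return 1;
}
```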

Without tile-based binning, your best bet for batching might be rendering geometry-instanced light source volumes with a single-sided depth test.

Hi Dark Photon,

first I’d like to thank you for your time explaining to me the details. But I still don’t get the whole picture.

For the light volumes thing, think about it this way. If on top of your G-buffer, you rasterize (with the same MV and PROJ) an opaque solid representing a light volume (where a light volume is the furthest extent of the light source’s illumination), then you know that it will occlude (in screen-space XY) every single G-buffer sample that it could possibly illuminate (and potentially a lot more if you consider screen-space Z). Right?

Let me see if I got this: what exactly do you mean by “on top of the G-buffer”? I guess that I should draw the sphere together with the geometry into the G-buffer like I already did? Or should I do this in a separate pass together with the G-buffer read/light pass? Can you shed some more light on that, please? I’m having a hard time wrapping my mind around this :frowning:

Ok, so don’t render it opaque. Just use its rasterization to give you samples in the G-buffer that you need to run your lighting shader on for that light source. Congrats! You’ve just limited your lighting fill.

Well, if I render the sphere with the geometry to the G-buffer and make the sphere not opaque (transparent?), then the fragment shader has no way of detecting it after rasterization during the light pass, right? So this leaves me with the question: when exactly do I have to rasterize the sphere (geometry or light pass)? And I guess that the fullscreen 2D quad I draw is only for the fragments not affected by any light source, right?

I’ll postpone the screen-Z issue until I got the basics right.

Some thoughts about tile-based culling: it would probably suit my needs best, since I’ll have many overlapping light sources and the target hardware has narrow memory bandwidth (Intel SandyBridge with Mesa 9.1 on Linux). However, most implementations determine the tiles using DirectCompute or OpenCL. Since the target hardware on Linux isn’t capable of this, I’ll probably end up computing the tiles on the CPU, and I fear I’ll lose any performance advantage that way. Any opinions here?

Thanks & Regards
Saski

In case you don’t already know, it has been shown that tile based deferred shading trumps classic deferred shading and light pre-pass by a substantial margin. I refer you to Andrew Lauritzen’s slides. In there you’ll also find the same advice on bounding geometry that Dark Photon already suggested. Throw away those full-fledged spheres and cones!

I don’t understand why mapping light sources to tiles must be implemented using OpenCL or DirectCompute. It’s a simple mapping of bounding geometry to tile bounds. I suggest you implement it and compare it to your non-tiled solution.

BTW, SandyBridge + Linux + OpenCL works - just not with the GPU. So you could implement all the stuff using CL kernels and when the time comes that the GPUs are finally supported on Linux you’re already set.

Hi,
@thokra: You’re probably right that tiled shading is the best way to go. I’m kinda new to shading and don’t know too much about OpenCL, so I’ll give it a shot after I’ve got the old-style light volumes to work properly. I guess CPU-emulated OpenCL must be slow as hell, however.

@Dark Photon: I believe I found at least one more answer but I’m not there yet.
Here is what I do:
-Create FBO MRTs
-Enable writing to FBO (first (geometry) pass)
-Draw my scene geometry
-Disable writing to FBO
-Bind MRT textures
-Now, instead of drawing a FS quad, I draw the sphere with the same ModelView/Projection and do the lighting pass; do NOT draw the fullscreen quad.

I guess this is the way to go. If all goes right I should be seeing ONLY the parts that intersect with the sphere, i.e. the fragments that are lit, right? Well, it’s not happening though :(. All I see is the sphere in 3D space, and it seems that the shader maps the colors onto the sphere (see screenshot).

Let me show you some more detail (sorry about the messy code!):


    // enable writing to the FBO
    glBindFramebuffer(GL_DRAW_FRAMEBUFFER, ids[framebuffer]);
    SubSys_Set3DMode();
    SubSys_SetViewport(SubSys_GetVideoResolution());
    glClearColor(0, 0, 1, 1);
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    glEnable(GL_DEPTH_TEST);
    glDepthRange(0.0, 1.0);
    glDisable(GL_ALPHA_TEST);
    glDisable(GL_BLEND);
    glDisable(GL_SAMPLE_ALPHA_TO_COVERAGE);
    SubSys_LookAt(eye, dir);
    SubSys_SetState(r_cull, true);
    SubSys_EnableShaderStage(shader_fillMRT);
    DR_Draw(dr_geometry, 0xffff);
    SubSys_DisableShaderStage();
    SubSys_SetState(r_cull, false);
    glBindFramebuffer(GL_DRAW_FRAMEBUFFER, 0); // disable writing
    glDisable(GL_DEPTH_TEST);

here comes the part where I try to render the sphere:


    glClear(GL_COLOR_BUFFER_BIT);
    glActiveTexture(GL_TEXTURE0); glBindTexture(GL_TEXTURE_2D, ids[colorbuffer]);
    glActiveTexture(GL_TEXTURE1); glBindTexture(GL_TEXTURE_2D, ids[normalbuffer]);
    glActiveTexture(GL_TEXTURE2); glBindTexture(GL_TEXTURE_2D, ids[depthbuffer]);
    SubSys_EnableShaderStage(shader_deferred);
    // do the matrix stuff
    glGetFloatv(GL_MODELVIEW_MATRIX, modelview);
    glUniformMatrix4fv(si->loc[mvmatrix_loc], 1, GL_FALSE, modelview);
    glGetFloatv(GL_PROJECTION_MATRIX, projection);
    glUniformMatrix4fv(si->loc[pjmatrix_loc], 1, GL_FALSE, projection);
    glPushMatrix();
        glTranslatef(0.0f, 0.0f, 0.0f); // no-op translation
        // draw the sphere at the light position
        DisplaySphere(lpos);
    glPopMatrix();
    SubSys_DisableShaderStage();
    glActiveTexture(GL_TEXTURE2); glBindTexture(GL_TEXTURE_2D, 0);
    glActiveTexture(GL_TEXTURE1); glBindTexture(GL_TEXTURE_2D, 0);
    glActiveTexture(GL_TEXTURE0); glBindTexture(GL_TEXTURE_2D, 0);

// and the code to render the sphere (ugly as crap :()

void DisplaySphere(vec3f32_t* pos) {
    int r = 145;   // TODO: variable
    int lats = 30; // TODO: static
    int longs = 30;// TODO: static
    int16 i, j;
    for(i = 0; i <= lats; i++) {
        float32 lat0 = MATH_PI * (-0.5 + (float32) (i - 1) / lats);
        float32 z0  = r * M_fsin(lat0);
        float32 zr0 = r * M_fcos(lat0);
        float32 lat1 = MATH_PI * (-0.5 + (float32) i / lats);
        float32 z1 = r * M_fsin(lat1);
        float32 zr1 = r * M_fcos(lat1);
        // TODO: VBO
        glBegin(GL_QUAD_STRIP);
        for(j = 0; j <= longs; j++) {
            float32 lng = 2 * MATH_PI * (float32) (j - 1) / longs;
            float32 x = M_fcos(lng);
            float32 y = M_fsin(lng);
            glVertex3f(x * zr0 + pos->x, y * zr0 + pos->y, z0 + pos->z);
            glVertex3f(x * zr1 + pos->x, y * zr1 + pos->y, z1 + pos->z);
        }
        glEnd();
    }
} // <- this closing brace was missing

the vertex shader looks like this:


#version 130 
precision mediump float;

uniform mat4 u_mvmatrix;
uniform mat4 u_pjmatrix;
uniform vec3 u_lightpos;

in vec4 li_vtx;
in vec2 li_texcoord;

out vec2 gs_texcoord;
out mat4 gs_pjimatrix;

void main(void) {
  gs_texcoord = li_texcoord;
  gl_Position = u_pjmatrix * u_mvmatrix * li_vtx;
}

I meant in the same buffer as the G-buffer, with depth testing and writing enabled, after you rasterize the opaque scene into it. While you could actually do this if you want, writing an opaque sphere like that is just meant to give you a good mental picture of what would happen.

(Hmmm. Need a pic. Search images.google.com…) Ah! Here we go:

Imagine the plane is the opaque ground plane already in your G-buffer (the “scene”). Now imagine that sphere is the maximum illumination extent of a point light source. Further imagine your eyepoint is looking straight down on that plane with the sphere in-view. You can see that the sphere is going to cover all of the pixels on the G-buffer that could potentially be illuminated by the point light source. Right?

Now for the screen-Z issue, check this out:

If (same assumptions) the plane is the scene in your G-buffer and the sphere is the light source volume around a point light source, and you’re looking down on the plane from above the sphere, then you can see that the sphere, while it has screen-space XY coverage over some of the plane (G-buffer) pixels, doesn’t actually contain any of the G-buffer pixels in Z. So it can’t illuminate any of the plane pixels.

Make sense?

Well, if I render the sphere with the geometry to the G-buffer and make the sphere not opaque (transparent?), then the fragment shader has no way of detecting it after rasterization during the light pass, right?

Not quite, no. You’re only using the rasterization of the light volume solid (or quad, or whatever) to use to determine which samples from the G-buffer you’re going to read and apply illumination to for that point light source. The lit result is written to (and blended with) the lighting buffer.

So you don’t actually draw an opaque sphere to the lighting buffer or the G-buffer. You just use the pixels that would be rendered as an excuse to apply lighting to those pixels for that light source.

So this leaves me with the question: when exactly do I have to rasterize the sphere (geometry or light pass)?

After you’ve finished rasterizing your opaque scene into the G-buffer, and after you’ve retargeted your rendering from the G-buffer to the lighting buffer. The lighting passes “read” from the G-buffer (via G-buffer textures bound to texture units, for example) and “write” (blend) to the lighting buffer.
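To make that concrete, here’s a sketch of what such a lighting-pass fragment shader might look like (GLSL 1.30 to match your vertex shader; the uniform names are mine and the actual lighting math is elided). The crucial bit: you sample the G-buffer with gl_FragCoord-derived coordinates, not with the sphere’s own texcoords - the sphere is only a mask:

```glsl
#version 130
uniform sampler2D u_color;     // G-buffer albedo
uniform sampler2D u_normal;    // G-buffer normals
uniform vec2      u_screen;    // framebuffer size in pixels
uniform vec3      u_lightcol;

out vec4 fragColor;

void main(void) {
    // screen-space lookup: which G-buffer sample sits under this fragment?
    vec2 uv     = gl_FragCoord.xy / u_screen;
    vec3 albedo = texture(u_color, uv).rgb;
    vec3 n      = normalize(texture(u_normal, uv).xyz * 2.0 - 1.0);
    // ...reconstruct position from depth, compute N.L and attenuation...
    fragColor = vec4(albedo * u_lightcol, 1.0); // blended into the light buffer
}
```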

And I guess that the fullscreen 2D quad I draw is only for the fragments not affected by any light source, right?

Well, if they’re not illuminated by any light source, you can’t see them, right? :wink:

Most often, you’d rip a full-screen quad for the directional light source (sun/moon/etc.) if there is one in your scene. If you’re in a dungeon though, you might not have one.

Some thoughts about tile-based culling: it would probably suit my needs best, since I’ll have many overlapping light sources and the target hardware has narrow memory bandwidth (Intel SandyBridge with Mesa 9.1 on Linux). However, most implementations determine the tiles using DirectCompute or OpenCL. Since the target hardware on Linux isn’t capable of this, I’ll probably end up computing the tiles on the CPU, and I fear I’ll lose any performance advantage that way. Any opinions here?

It depends, but my guess is it’s probably still worth it if you’ve got a lot of lights and a lot of light overlap. If you’ve got slow memory and a fast CPU (the usual case), you’re likely gonna save a crapload of slow memory reads and writes with tiled. Try non-tiled first and see what you think. Optimize it. Then try tiled if you need more perf. You just need a way to bin them efficiently. If you’re doing culling already, you’ve probably already got a half-way efficient binning algorithm lying around that you can extend.

And no reason you can’t easily implement tiled on the CPU. Personally, I’d always implement something on the CPU first – it’s just too darn “easy” to get running and optimized compared to implementing it on the GPU. Then if you need more speed even after optimization, you can go multicore or go GPU with it.

[QUOTE=saski;1247756]Hi,
@thokra: You’re probably right that tiled shading is the best way to go. I’m kinda new to shading and don’t know too much about OpenCL so I’ll give it a shot after I’ve got the old-style light volumes to work properly.[/QUOTE]

If/when you do, allocate some time to learning it slowly. It’s not as easy as writing GLSL shaders. You have to think very carefully about parallel execution issues.

[QUOTE=saski;1247756]@Dark Photon: I believe I found at least one more answer but I’m not there yet.
Here is what I do:
-Create FBO MRTs
-Enable writing to FBO (first (geometry) pass)
-Draw my scene geometry
-Disable writing to FBO
-Bind MRT textures
-Now, instead of drawing a FS quad, I draw the sphere with the same ModelView/Projection and do the lighting pass; do NOT draw the fullscreen quad.

[/QUOTE]

Yep, you got it!

I guess this is the way to go. If all goes right I should be seeing ONLY the parts that intersect with the sphere, i.e. the fragments that are lit, right?

Right.

What’s all that blue stuff around your sphere?

Looks like you’ve got the sphere rendering. Now it’s just down to figuring out if/why you’re not generating the necessary illuminated fragments for the sphere pixels.

Hi Dark Photon,

again, thanks a bunch for your patience and effort, and this time I believe I finally got it :slight_smile:
The image below basically shows the sphere after some code tweaks. There is a “nice” artifact on the right side of the sphere, below the light bolt texture, indicating that the sphere is indeed a 3D object. Btw, to make the sphere visible I’ve used some GL_FUNC_ADD blending.

[ATTACH=CONFIG]368[/ATTACH]

And here is another shot with an additional FS Quad supplying all fragments with static ambient light:
[ATTACH=CONFIG]369[/ATTACH]

However, I’m still not happy with it. First, I do a glCullFace(GL_FRONT), and that makes the light disappear as soon as the camera enters the light volume. Disabling culling seems to help. Is there some other solution?
Second, drawing the FS quad and the sphere produces overdraw (rendering the lit fragments inside the sphere and then again with the new lit values).

And still there are many fragments that are not actually inside the sphere but in front of or completely behind it. You mentioned the screen-Z issue, and I guess I can cull out many fragments lying in front of or completely behind the sphere, right? Can I use a stencil buffer - similar to Carmack’s reverse - to do the culling?

How do I do that? I guess I can render my geometry as usual into the G-buffer, but I have to set up a separate stencil buffer, right? I found some hints to attach the depth buffer as a stencil buffer, much like this:


    glGenTextures(1, &ids[depthbuffer]);
    glBindTexture(GL_TEXTURE_2D, ids[depthbuffer]);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH24_STENCIL8, dim[0], dim[1], 0,
            GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8, NULL);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,
            GL_TEXTURE_2D, ids[depthbuffer], 0);
    glBindTexture(GL_TEXTURE_2D, 0);

Well, this one actually works for me. The OpenGL wiki says the format is 24 bits for the depth buffer and 8 bits for stencil (32 bits total). I’m confused about GL_UNSIGNED_INT_24_8. Sampling the depth texture gives us a value between 0…1, yet the format is unsigned int. So how is this supposed to work? Does the shader internally convert the unsigned int to float when sampling with texture()?

Assuming the MRT setup is good, how do I go on from there?

  • I guess: disable writing to the depth buffer after populating the G-buffer with the geometry.
  • Do an extra sphere render pass where I set up the stencil, like decrement for the front side and increment for the back side of a polygon?
  • Render the sphere again with the lighting pass enabled and set glStencilFunc correctly, right?

This is a crude description of how I think this might actually work but I’m very fuzzy about the particular details.

Any opinion or explanation is of course very much appreciated :slight_smile:

Thx a lot in advance
Saski

Hi,

I’ve made some progress using the stencil culling and it almost works :). There is however one more issue: depending on the camera view, the stencil buffer seems to have trouble culling out fragments that are completely in front of the sphere.

Screenshot below shows the sphere with a view in direction (0,0,-1):
[ATTACH=CONFIG]370[/ATTACH]

Well this looks pretty good actually. The blue background is intentional by the way ;).

Now lets see how the result is when the camera is in direction (-1,0,0):
[ATTACH=CONFIG]371[/ATTACH]

And in direction (0,0,1):

[ATTACH=CONFIG]372[/ATTACH]

It seems that some floor and ceiling fragments covering the part of the sphere that reaches below the floor/ceiling fail the stencil test. This is especially strange since it depends on the viewpoint. How can I fix that?

Below the code:

FBO/MRT setup:


    // create main framebuffer object
    glGenFramebuffers(1, &ids[framebuffer]);
    glBindFramebuffer(GL_FRAMEBUFFER, ids[framebuffer]);

    // create MRT textures

    // color / diffuse
    glGenTextures(1, &ids[colorbuffer]);
    glBindTexture(GL_TEXTURE_2D, ids[colorbuffer]);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, dim[0], dim[1], 0, GL_RGBA,
            GL_UNSIGNED_BYTE, NULL);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
            GL_TEXTURE_2D, ids[colorbuffer], 0);
    glBindTexture(GL_TEXTURE_2D, 0);

    // normal / specular
    glGenTextures(1, &ids[normalbuffer]);
    glBindTexture(GL_TEXTURE_2D, ids[normalbuffer]);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA16F, dim[0], dim[1], 0, GL_RGBA,
            GL_FLOAT, NULL);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT1,
            GL_TEXTURE_2D, ids[normalbuffer], 0);
    glBindTexture(GL_TEXTURE_2D, 0);

    // depth / stencil
    glGenTextures(1, &ids[depthbuffer]);
    glBindTexture(GL_TEXTURE_2D, ids[depthbuffer]);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH24_STENCIL8, dim[0], dim[1], 0,
            GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8, NULL);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT,
            GL_TEXTURE_2D, ids[depthbuffer], 0);

    // aux (lighting output)
    glGenTextures(1, &ids[auxbuffer]);
    glBindTexture(GL_TEXTURE_2D, ids[auxbuffer]);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, dim[0], dim[1], 0,
            GL_RGBA, GL_UNSIGNED_BYTE, NULL);
    glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT2,
            GL_TEXTURE_2D, ids[auxbuffer], 0);
    glBindTexture(GL_TEXTURE_2D, 0);

    // select the draw buffers (color and normals)
    GLenum buffers[] = {GL_COLOR_ATTACHMENT0, GL_COLOR_ATTACHMENT1};
    glDrawBuffers(2, buffers);

And the code to render the geometry to the gbuffer:


  // enable for writing to FBO
    glBindFramebuffer(GL_DRAW_FRAMEBUFFER, ids[framebuffer]);
    GLenum buffers[] = {GL_COLOR_ATTACHMENT0, GL_COLOR_ATTACHMENT1};
    glDrawBuffers(2, buffers); // attach albedo and normal/specular buffer to FBO
//------------------------------------------------>
  // Only the geometry pass updates the depth buffer
  SubSys_Set3DMode();
    SubSys_SetViewport(SubSys_GetVideoResolution());
    glEnable(GL_DEPTH_TEST);
    glDepthMask(GL_TRUE); // enable depth buffer for writing
    glClearColor(0,0,0,1); // clear color
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    // set depth buffer range from 0..1
    glDepthRange(0, 1);
    // disable blending
    glDisable(GL_ALPHA_TEST);
    glDisable(GL_BLEND);
    glDisable(GL_SAMPLE_ALPHA_TO_COVERAGE);
    // set camera position
  SubSys_LookAt(eye, dir);
  // enable backface culling
  SubSys_SetState(r_cull, true);
  SubSys_EnableShaderStage(shader_fillMRT, false);
    DR_Draw(dr_geometry, 0xffff); // populate geometry to gbuffer
  SubSys_DisableShaderStage();
  SubSys_SetState(r_cull, false);
//------------------------------------------------>
  glDepthMask(GL_FALSE); // make depth buffer readonly
  glDisable(GL_DEPTH_TEST);

The stencil pass:


  // stencil pass
  SubSys_EnableShaderStage(shader_null, false);
      glEnable(GL_STENCIL_TEST);
      glDrawBuffer(GL_NONE); // no color writes during the stencil pass
      glEnable(GL_DEPTH_TEST); // we definitely want to use the depth buffer
      glDisable(GL_CULL_FACE); // no culling during the stencil pass!
      glClear(GL_STENCIL_BUFFER_BIT); // clear stencil buffer
      glStencilFunc(GL_ALWAYS, 0, 0); // always pass the stencil test
      // increment stencil when a back face fails the depth test
      glStencilOpSeparate(GL_BACK, GL_KEEP, GL_INCR, GL_KEEP);
      // decrement stencil when a front face fails the depth test
      glStencilOpSeparate(GL_FRONT, GL_KEEP, GL_DECR, GL_KEEP);
      DisplaySphere(lpos); // draw sphere
      glDisable(GL_DEPTH_TEST);
  SubSys_DisableShaderStage();

The light pass:


  // bind new target colorbuffer
  glBindFramebuffer(GL_DRAW_FRAMEBUFFER, ids[framebuffer]);
  glDrawBuffer(GL_COLOR_ATTACHMENT2); // attach auxiliary color buffer
  glClearColor(0,0,1,1); // set fallback color for non touched fragments
  glClear(GL_COLOR_BUFFER_BIT); // set buffer color

  // activate MRTs
    glActiveTexture(GL_TEXTURE0); glBindTexture(GL_TEXTURE_2D, ids[colorbuffer]);
    glActiveTexture(GL_TEXTURE1); glBindTexture(GL_TEXTURE_2D, ids[normalbuffer]);
    glActiveTexture(GL_TEXTURE2); glBindTexture(GL_TEXTURE_2D, ids[depthbuffer]);
  // do the light pass
  SubSys_EnableShaderStage(shader_deferred, true);
        glStencilFunc(GL_NOTEQUAL, 0, 0xFF); // pass where stencil != 0
        glEnable(GL_CULL_FACE); // enable culling
        glCullFace(GL_FRONT);
        DisplaySphere(lpos); // draw the sphere
        glCullFace(GL_BACK);
        glDisable(GL_BLEND);
  SubSys_DisableShaderStage();
  // disable MRT TMUs
  glActiveTexture(GL_TEXTURE2); glBindTexture(GL_TEXTURE_2D, 0);
    glActiveTexture(GL_TEXTURE1); glBindTexture(GL_TEXTURE_2D, 0);
    glActiveTexture(GL_TEXTURE0); glBindTexture(GL_TEXTURE_2D, 0);
    glDisable(GL_STENCIL_TEST);    // disable stencil

  // blit FBO to screen
    glBindFramebuffer(GL_DRAW_FRAMEBUFFER, 0);
  glBindFramebuffer(GL_READ_FRAMEBUFFER, ids[framebuffer]);
  glReadBuffer(GL_COLOR_ATTACHMENT2);
  int16* sdim = SubSys_GetVideoResolution();
  glBlitFramebuffer(0, 0, sdim[0], sdim[1], 0, 0, sdim[0], sdim[1],
          GL_COLOR_BUFFER_BIT, GL_LINEAR);

And finally the (messy) code to draw the sphere:


void DisplaySphere(vec3f32_t* pos) {
    int r = 145; // TODO: variable
    int lats = 20; // TODO: static
    int longs = 20;// TODO: static
    int16 i, j;
    for(i = 0; i <= lats; i++) {
        float32 lat0 = MATH_PI * (-0.5 + (float32) (i - 1) / lats);
        float32 z0  = r * M_fsin(lat0);
        float32 zr0 = r * M_fcos(lat0);
        float32 lat1 = MATH_PI * (-0.5 + (float32) i / lats);
        float32 z1 = r * M_fsin(lat1);
        float32 zr1 = r * M_fcos(lat1);
        // TODO: VBO THIS!
        glBegin(GL_QUAD_STRIP);
        for(j = 0; j <= longs; j++) {
            float32 lng = 2 * MATH_PI * (float32) (j - 1) / longs;
            float32 x = M_fcos(lng);
            float32 y = M_fsin(lng);
            glVertex3f(x * zr1 + pos->x, y * zr1 + pos->y, z1 + pos->z);
            glVertex3f(x * zr0 + pos->x, y * zr0 + pos->y, z0 + pos->z);
            //glVertex3f(x * zr1 + pos->x, y * zr1 + pos->y, z1 + pos->z);
        }
        glEnd();
    }
}

Any ideas?

Thanks for your time and best regards
Saski

[QUOTE=saski;1247841]


...
    glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH24_STENCIL8, dim[0], dim[1], 0,
            GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8, NULL);

…I’m confused about GL_UNSIGNED_INT_24_8. Sampling the depth texture gives us a value between 0…1, yet the format is unsigned int. So how is this supposed to work? Does the shader internally convert the unsigned int to float when sampling with texture()?[/QUOTE]

The format and type args are just used when you provide texel data as input via the last arg. You provided NULL, so in practice they don’t matter.

Internally, the DEPTH24_STENCIL8 internal format is a 32-bit packed fixed-point format where 24 bits are fixed-point depth (0…2^24-1) and 8 bits are fixed-point stencil (0…2^8-1). How you end up with 0…1 depth values is that 0…2^24-1 is mapped to 0…1 when the pipeline operates on it.
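If you want the normalization spelled out, here’s a tiny sketch (plain C; note this models the client-side GL_UNSIGNED_INT_24_8 convention with depth in the high 24 bits and stencil in the low 8 - the driver’s internal storage may differ):

```c
#include <stdint.h>

/* Unpack a GL_UNSIGNED_INT_24_8-style texel: depth lives in the upper
 * 24 bits, stencil in the lower 8.  The fixed-point depth integer d in
 * [0, 2^24-1] maps to [0,1] by dividing by 2^24-1, which is the float
 * value texture() hands you in the shader. */
float depth24_to_float(uint32_t packed_24_8)
{
    uint32_t d = packed_24_8 >> 8;             /* drop the stencil byte */
    return (float)d / (float)((1u << 24) - 1); /* normalize to [0,1]   */
}
```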

Not sure from a glance exactly what the problem is. Make sure that the eyepoint isn’t inside the light source volume (or more correctly, ensure that no part of the near clip plane clips away part of the light source volume).

This is in practice similar to shadow volume boundary counting with stencil shadows, where you need a special case for when the eyepoint is inside a shadow volume (or more correctly, when the near clip plane clips away any shadow volume face).

Hello,

I’m still struggling with the light volume artifacts described in my last post. Stupid question: why isn’t the light volume being culled by the depth test? This seems to be the problem with my code.
The screenshot below shows that the sphere is still completely visible even when there are objects (the wall on the right side) between the sphere and the viewpoint. The light volume produced by the stencil test is still a 3D object, right?
[ATTACH=CONFIG]382[/ATTACH]

So what do I have to do to enable z-culling for the light volume?

here are the code snippets:

Render geometry to FBO:


    glBindFramebuffer(GL_DRAW_FRAMEBUFFER, ids[framebuffer]);
    GLenum buffers[] = {GL_COLOR_ATTACHMENT0, GL_COLOR_ATTACHMENT1};
    glDrawBuffers(2, buffers); // attach albedo and normal/specular buffer to FBO
  // geometry pass updates the depth buffer
  SubSys_Set3DMode();
    SubSys_SetViewport(SubSys_GetVideoResolution());
    glEnable(GL_DEPTH_TEST);
    glDepthMask(GL_TRUE); // enable depth buffer for writing
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    // set depth buffer range from 0..1
    glDepthRange(0, 1);
    // disable blending
    glDisable(GL_ALPHA_TEST);
    glDisable(GL_BLEND);
    glDisable(GL_SAMPLE_ALPHA_TO_COVERAGE);
    // set camera position
  SubSys_LookAt(eye, dir);
  // enable backface culling
  glEnable(GL_CULL_FACE);
    glCullFace(GL_BACK);
  SubSys_EnableShaderStage(shader_fillMRT, false);
    DR_Draw(dr_geometry, 0xffff); // populate geometry to gbuffer
  SubSys_DisableShaderStage();
  glDisable(GL_CULL_FACE);
  glDepthMask(GL_FALSE); // make depth buffer readonly
  glDisable(GL_DEPTH_TEST);

Render the sphere in the stencil pass:


  SubSys_EnableShaderStage(shader_null, false);
      glDrawBuffer(GL_NONE); // no color writes during the stencil pass
      glEnable(GL_DEPTH_TEST); // we definitely want to use the depth buffer
      glDisable(GL_CULL_FACE); // no culling during the stencil pass!
      glEnable(GL_STENCIL_TEST);
      glStencilMask(0xff);
      glClear(GL_STENCIL_BUFFER_BIT); // clear stencil buffer
      glStencilFunc(GL_ALWAYS, 0, 0); // always pass the stencil test
      // increment stencil when a back face fails the depth test
      glStencilOpSeparate(GL_BACK, GL_KEEP, GL_INCR, GL_KEEP);
      // decrement stencil when a front face fails the depth test
      glStencilOpSeparate(GL_FRONT, GL_KEEP, GL_DECR, GL_KEEP);
      GLS_Sphere();
      glDisable(GL_DEPTH_TEST);
  SubSys_DisableShaderStage();

// bind the target buffer to write stencil/color into


 // bind new target colorbuffer
  glBindFramebuffer(GL_DRAW_FRAMEBUFFER, ids[framebuffer]);
  glDrawBuffer(GL_COLOR_ATTACHMENT2); // attach auxiliary color buffer
  glClearColor(0,0,1,1); // set fallback color for non touched fragments
  glClear(GL_COLOR_BUFFER_BIT); // set buffer color

And the light pass:


 // ******************* light pass *******************
  // activate MRTs
    glActiveTexture(GL_TEXTURE0); glBindTexture(GL_TEXTURE_2D, ids[colorbuffer]);
    glActiveTexture(GL_TEXTURE1); glBindTexture(GL_TEXTURE_2D, ids[normalbuffer]);
    glActiveTexture(GL_TEXTURE2); glBindTexture(GL_TEXTURE_2D, ids[depthbuffer]);
  SubSys_EnableShaderStage(shader_deferred, true);
        glStencilFunc(GL_NOTEQUAL, 0, 0xFF); // pass where stencil != 0
        glEnable(GL_CULL_FACE); // enable culling
        glCullFace(GL_FRONT);
        GLS_Sphere();
        glCullFace(GL_BACK);
        glDisable(GL_BLEND);
        glDisable(GL_STENCIL_TEST);    // disable stencil
  SubSys_DisableShaderStage();
  // disable MRT TMUs
  glActiveTexture(GL_TEXTURE2); glBindTexture(GL_TEXTURE_2D, 0);
    glActiveTexture(GL_TEXTURE1); glBindTexture(GL_TEXTURE_2D, 0);
    glActiveTexture(GL_TEXTURE0); glBindTexture(GL_TEXTURE_2D, 0);

blit FBO to screen


    glMatrixMode(GL_PROJECTION); // reset projection
    glLoadIdentity();
    glBindFramebuffer(GL_DRAW_FRAMEBUFFER, 0);
  glBindFramebuffer(GL_READ_FRAMEBUFFER, ids[framebuffer]);
  glReadBuffer(GL_COLOR_ATTACHMENT2);
  int16* sdim = SubSys_GetVideoResolution();
  glBlitFramebuffer(0, 0, sdim[0], sdim[1], 0, 0, sdim[0], sdim[1],
          GL_COLOR_BUFFER_BIT, GL_LINEAR);

Never mind! Found the solution. It is always a good idea to watch the stencil buffer’s value range when incrementing/decrementing :whistle:

Doing:


   glStencilOpSeparate(GL_FRONT, GL_KEEP, GL_DECR_WRAP, GL_KEEP);
   glStencilOpSeparate(GL_BACK, GL_KEEP, GL_INCR_WRAP, GL_KEEP);

instead of:


   glStencilOpSeparate(GL_FRONT, GL_KEEP, GL_DECR, GL_KEEP);
   glStencilOpSeparate(GL_BACK, GL_KEEP, GL_INCR, GL_KEEP);

did the trick.

Best regards
Saski