new technique paper - for stenciled shadow volumes

Mark Kilgard and I have a paper on robust stenciled shadow volumes that’s probably a good read for anyone thinking of implementing them. We’ll be talking about this at the Advanced OpenGL course at GDC next Wednesday. So come and bring your questions.

This link will be live sometime tomorrow: http://developer.nvidia.com/view.asp?IO=robust_shadow_volumes

But you can look at the paper at: http://cvs1.nvidia.com/OpenGL/doc/whitepapers/RobustShadowVolumes.pdf (2.1 MB)

And the hard-to-find “personal communication” reference to Carmack’s Reverse: http://cvs1.nvidia.com/OpenGL/doc/whitepapers/CarmackOnShadowVolumes.txt (7 kB)

And Heidmann’s original article on using stencil for shadow volumes: http://cvs1.nvidia.com/OpenGL/doc/whitepapers/RealShadowsRealTime.pdf (9.1 MB)

Thanks -
Cass

I have one thing to say…

Thank you Thank you Thank you Thank you Thank you Thank you Thank you Thank you

I really wanted to make it to the conference, but a plane ticket from this side of the world to San Jose costs too much and I’m just a poor university student, so this is the next best thing, yay!

Also, how long will it be before the material covered at the conference becomes available to those of us who couldn’t go?

[This message has been edited by robert (edited 03-13-2002).]

Thanks for the info. And what I really like now is that the whole sdk is ‘browseable’ online! cvs1.nvidia.com

kon

I love it!
Especially that infinite projection-matrix trick is neat!
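
For anyone who hasn't opened the paper yet, here is a rough sketch of what the infinite projection matrix looks like. It is the usual OpenGL frustum matrix with the far distance taken to infinity. The helper names (`infinite_frustum`, `transform`) and the sample frustum values are made up for illustration and are not from the paper's demo code.

```python
def infinite_frustum(left, right, bottom, top, near):
    """OpenGL-style perspective matrix with the far plane pushed to infinity.

    Rows are listed top to bottom; the matrix multiplies column vectors.
    """
    return [
        [2.0 * near / (right - left), 0.0, (right + left) / (right - left), 0.0],
        [0.0, 2.0 * near / (top - bottom), (top + bottom) / (top - bottom), 0.0],
        [0.0, 0.0, -1.0, -2.0 * near],
        [0.0, 0.0, -1.0, 0.0],
    ]


def transform(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


# Hypothetical symmetric frustum with the near plane at distance 1.
P = infinite_frustum(-1.0, 1.0, -1.0, 1.0, 1.0)

# A point on the near plane still maps to NDC z = -1.
on_near = transform(P, [0.0, 0.0, -1.0, 1.0])

# A point at infinity (w = 0) gets a finite clip-space position with z == w,
# so it lands on the far plane instead of being clipped away.
at_infinity = transform(P, [0.3, -0.2, -1.0, 0.0])
```

The point is that shadow volume vertices extruded to infinity (w = 0) stay inside the clip volume, so the volume's far cap is never sliced off by a far plane.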

We should have all the material that we present at GDC on the NVIDIA web site within the following week.

Glad you like the paper. The demo(s) used to generate the images need a little bullet-proofing before we release them, but hopefully in the same timeframe.

Thanks -
Cass

Well, I think that paper is just awesome! You guys did an excellent job on it and the demos. I was kind of surprised by the low framerates on that one demo with the Q2 models, but it’s just a demo; it’s not like that is a fully optimized game engine. But even then, I would have figured the GeForce 4 could push it faster given what it was having to do. Oh well. Speaking of that, I sure hope my GeForce 4 Ti comes in this week. Yay!

-SirKnight

Great paper. I understood everything in one reading which is very rare for me with white papers.

Well the document has a very practical approach to it and ALL the details are there. Good job.

[This message has been edited by Gorg (edited 03-13-2002).]

I’ve had the paper for just over a month now and was also surprised by the simplicity of this new method. Mark Kilgard gave me a thorough explanation of a particularly obscure technical point. I was expecting this technical point (with Mark’s detailed explanation) to have been included in the paper. Personally, I think it’s an important point.

The only thing that I was worried about was the performance. It definitely consumes more fill but it can be implemented entirely and robustly on the GPU. And that’s a good thing.

Thank you! I’ve been looking for that Heidmann article for ages!

I like the Iris GL listing at the end.

Originally posted by PH:
[b]I’ve had the paper for just over a month now and was also surprised by the simplicity of this new method. Mark Kilgard gave me a thorough explanation of a particularly obscure technical point. I was expecting this technical point (with Mark’s detailed explanation) to have been included in the paper. Personally, I think it’s an important point.

The only thing that I was worried about was the performance. It definitely consumes more fill but it can be implemented entirely and robustly on the GPU. And that’s a good thing.[/b]

Please tell us more

Gorg,

Cass has a copy of the e-mail I got from Mark too. You could try to convince him to post the explanation here. Mark said they really wanted to address the technical point in the paper but wanted to keep it to 8 pages (the newly released paper is slightly different, so I wonder why it wasn’t added).

Ok, this is actually an interesting point, so here’s the relevant email thread.
http://cvs1.nvidia.com/OpenGL/doc/whitepapers/projected_shadow_volume_caps.html

Thanks -
Cass

Thanks for that damn good paper !!!

Now some questions, hints and calculations…
From The Paper:

In the case of a directional light, all the vertices of a possible silhouette edge loop project to the same point at infinity. In this case, a TRIANGLE_FAN primitive can render these polygons extremely efficiently (1 vertex/projected triangle).

This would also mean that it isn’t necessary to draw the light-backfacing polygons of the occluder, which saves additional fill rate, or am I wrong there?

In my current approach I use a special edge structure that holds only those edges that can possibly become silhouette edges. That means I pre-filter out edges that connect triangles with the same face normal.
This is a very common case, and it saves some CPU computation.
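
A minimal sketch of that kind of edge structure, assuming a closed indexed triangle mesh with consistent outward winding and a point light. All names here (`candidate_edges`, `silhouette_edges`, the cube test mesh) are hypothetical and only illustrate the idea; this is not code from the paper or from my engine.

```python
from collections import defaultdict


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit_normal(verts, tri):
    n = cross(sub(verts[tri[1]], verts[tri[0]]), sub(verts[tri[2]], verts[tri[0]]))
    length = dot(n, n) ** 0.5
    return (n[0] / length, n[1] / length, n[2] / length)


def candidate_edges(verts, tris, eps=1e-6):
    """Edges shared by two triangles whose normals differ (built once per mesh)."""
    normals = [unit_normal(verts, t) for t in tris]
    faces_of_edge = defaultdict(list)
    for f, (a, b, c) in enumerate(tris):
        for u, v in ((a, b), (b, c), (c, a)):
            faces_of_edge[(min(u, v), max(u, v))].append(f)
    edges = {}
    for edge, faces in faces_of_edge.items():
        # Skip edges between coplanar triangles: they can never be silhouettes.
        if len(faces) == 2 and dot(normals[faces[0]], normals[faces[1]]) < 1.0 - eps:
            edges[edge] = tuple(faces)
    return normals, edges


def silhouette_edges(verts, tris, normals, edges, light_pos):
    """Candidate edges with exactly one adjacent triangle facing the light."""
    facing = [dot(n, sub(light_pos, verts[t[0]])) > 0.0
              for n, t in zip(normals, tris)]
    return sorted(e for e, (f0, f1) in edges.items() if facing[f0] != facing[f1])


# Test mesh: a cube with two triangles per side, wound to face outward.
cube_verts = [(x, y, z) for z in (-1, 1) for y in (-1, 1) for x in (-1, 1)]
quads = [(1, 3, 7, 5), (0, 4, 6, 2), (2, 6, 7, 3),
         (0, 1, 5, 4), (4, 5, 7, 6), (0, 2, 3, 1)]
cube_tris = [t for a, b, c, d in quads for t in ((a, b, c), (a, c, d))]

normals, edges = candidate_edges(cube_verts, cube_tris)
silhouette = silhouette_edges(cube_verts, cube_tris, normals, edges, (10.0, 0.0, 0.0))
```

On the cube, the six diagonal edges inside each side are dropped up front, so the per-light test only visits the 12 real cube edges instead of all 18.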

But I also see the problem of the fill rate hit. If I summarize all drawing operations I get the following:

  1. ambient pass (no backfaces)
  2. The Volume sides (only stencil but very unpredictable in their size, and also front and back)
  3. volume caps pass (only stencil, but front and back)
  4. the light pass (no backfaces, no depth)

ambient-front + volume-front + volume-back + volumecaps-front + volumecaps-back + lighted-front

This would mean, at most, over four times as many passes as without shadows.
And I don’t know how much you gain by drawing only to the stencil buffer. Is it a relevant factor in the equation?
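
To make the pass list above concrete, here is a toy single-pixel model of it using depth-fail stencil counting (increment on back faces that fail the depth test, decrement on front faces that fail). The function names, depth values and light intensities are invented for the example, and the stencil counter is a plain integer rather than a real 8-bit buffer.

```python
def shadow_stencil(scene_depth, volume_faces):
    """Depth-fail stencil count for one pixel.

    volume_faces: list of (depth, is_front_facing) shadow volume fragments
    covering this pixel. Depth writes are off, so scene_depth never changes.
    """
    stencil = 0
    # Back faces are handled first (increment), then front faces (decrement),
    # which is meant to keep a non-wrapping counter from dipping below zero.
    for want_front, delta in ((False, +1), (True, -1)):
        for depth, is_front in volume_faces:
            depth_test_passes = depth < scene_depth  # same rule as GL_LESS
            if is_front == want_front and not depth_test_passes:
                stencil += delta
    return stencil


def shade_pixel(scene_depth, volume_faces, ambient=0.2, light=0.8):
    """Pass 1 writes ambient and depth, passes 2-3 fill stencil, pass 4 adds light."""
    color = ambient
    if shadow_stencil(scene_depth, volume_faces) == 0:
        color += light
    return color


# One closed shadow volume whose front face is at depth 2 and back face at depth 8.
volume = [(2.0, True), (8.0, False)]
inside = shade_pixel(5.0, volume)    # surface inside the volume: ambient only
in_front = shade_pixel(1.0, volume)  # surface in front of the volume: fully lit
behind = shade_pixel(9.0, volume)    # surface behind the volume: fully lit
```

Note that the stencil passes touch every volume fragment covering the pixel whether or not the count changes, which is where the extra fill comes from.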

Thanks
Lars

[This message has been edited by Lars (edited 03-13-2002).]

Lars,

That’s right. No reason to draw a projected end cap on the shadow volume because they’d all be degenerate triangles anyway.
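
A quick way to see this, sketched with the homogeneous extrusion used for projecting occluder vertices away from the light (the function and variable names below are just for illustration): with a directional light, w = 0 on the light, and every extruded vertex collapses to the same point at infinity.

```python
def extrude_to_infinity(vertex, light):
    """Project a homogeneous vertex away from a homogeneous light to infinity.

    light = (x, y, z, 1) is a point light; light = (x, y, z, 0) is directional,
    with (x, y, z) pointing toward the light.
    """
    vx, vy, vz, vw = vertex
    lx, ly, lz, lw = light
    return (vx * lw - lx * vw, vy * lw - ly * vw, vz * lw - lz * vw, 0.0)


# Three arbitrary silhouette vertices and two made-up lights.
silhouette = [(1.0, 0.0, 0.0, 1.0), (0.0, 1.0, 0.0, 1.0), (0.0, 0.0, 1.0, 1.0)]
point_light = (0.0, 5.0, 0.0, 1.0)
directional_light = (0.0, 1.0, 0.0, 0.0)

from_point = {extrude_to_infinity(v, point_light) for v in silhouette}
from_directional = {extrude_to_infinity(v, directional_light) for v in silhouette}
```

With the point light the three extruded positions are distinct, so a projected cap has area. With the directional light they are all the same point, so any cap triangle built from them is degenerate and the fan apex can be shared.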

Shadow volumes definitely eat fill. If you’re working with bounded light sources, be sure to use the scissor test to avoid wasting fill on regions that cannot be illuminated. There are additional tricks for limiting the geometry you actually send down too.
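
As a rough sketch of the scissor idea, assuming the bounded light is a sphere given in eye space and a symmetric perspective projection: project the corners of the sphere's bounding cube and take the enclosing window rectangle. The function name and parameters are hypothetical, and the near-plane handling is deliberately crude (it just falls back to the full viewport).

```python
import math


def light_scissor(center, radius, fovy_deg, near, width, height):
    """Conservative glScissor-style rectangle (x, y, w, h) for a bounded light.

    center is the light position in eye space (camera looks down -z).
    """
    cx, cy, cz = center
    if cz - radius > -near:
        return (0, 0, 0, 0)            # whole sphere is behind the near plane
    if cz + radius > -near:
        return (0, 0, width, height)   # sphere crosses the near plane: give up
    f = 1.0 / math.tan(math.radians(fovy_deg) / 2.0)
    aspect = width / height
    xs, ys = [], []
    for dx in (-radius, radius):
        for dy in (-radius, radius):
            for dz in (-radius, radius):
                depth = -(cz + dz)     # positive distance in front of the eye
                ndc_x = (cx + dx) * f / aspect / depth
                ndc_y = (cy + dy) * f / depth
                xs.append((ndc_x * 0.5 + 0.5) * width)
                ys.append((ndc_y * 0.5 + 0.5) * height)
    # Round outward so the rectangle never cuts into the lit region.
    x0 = max(0, math.floor(min(xs)))
    y0 = max(0, math.floor(min(ys)))
    x1 = min(width, math.ceil(max(xs)))
    y1 = min(height, math.ceil(max(ys)))
    return (x0, y0, max(0, x1 - x0), max(0, y1 - y0))


# Light of radius 1 ten units in front of the eye, 90 degree FOV, 800x800 window.
rect = light_scissor((0.0, 0.0, -10.0), 1.0, 90.0, 0.1, 800, 800)
```

The resulting tuple is in the order `glScissor` expects; everything outside it can be skipped for both the stencil passes and the light pass of that light.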

The real potential for making shadow volumes fast is figuring out when you don’t have to draw a shadow volume. Some approaches will take advantage of early culling hardware. Others will require detailed scene-level analysis of occluder geometry.

Because stenciled shadow volumes are becoming commonplace (especially in future games), you can expect the mode of writing stencil only to get faster as well.

Thanks -
Cass

Fine, Cass. Next Wednesday, I will be there to see it.
By the way, how many of you will be at GDC?

Cab,

Looking forward to it. There are 8 guys in my group that will be presenting at the OpenGL course. Lots more for other sessions and the dreaded booth duty.

Cass

Originally posted by cass:
you can look at the paper at: http://cvs1.nvidia.com/OpenGL/doc/whitepapers/RobustShadowVolumes.pdf

Cass,

Thanks for releasing this paper. However, could you please make it available in a PDF format that is readable by recent versions of GhostView? Not everybody runs Windows here.

It’s sad but true, many NVidia papers are unreadable without the PDF reader from Acrobat :stuck_out_tongue:

Julien.

Julien,

I’ll see what I can do - but you may have to remind me again. Preparing for GDC is taking all my time right now.

Thanks -
Cass

If you’re interested in seeing what clipped vs. infinite shadow volume geometry looks like, I just uploaded two wireframe shots:
http://www.geocities.com/SiliconValley/Pines/8553/ClippedShadowGeometry.html

The fill rate savings from clipped shadows is quite a lot for common bounded lights.
The difficulty with the clipping approach lies in trying to avoid clipping the front faces of the shadow volumes. If front faces are clipped, the original occluder geometry must be modified to avoid cracks.

PS. This is important for creating perfect beam trees. With an approximate beam tree, you can avoid clipping front faces.