aggressive scissor for shadow volumes

I did a bit of hand waving about this topic at GDC this year, so I figured I should do a demo to illustrate the point a little more clearly. You can find it at:
http://developer.nvidia.com/view.asp?IO=shadow_volume_intersection

Thanks -
Cass

I haven’t looked through the source yet, but it seems that you’re finding all the intersections with the bounding box of the “room” containing the shadow-caster. Would it not save more fillrate to clip the extruded quads to those calculated intersection points, making the stencil volume only exist inside the area it has influence in?

Yeah, earlier today I noticed this demo on the developer page. Looks pretty cool. I haven’t tried it out yet, but I will in a sec. The presentation from GDC that is linked from that demo’s page is pretty cool too. Lots of good stuff in there.

-SirKnight

Originally posted by Assassin:
I haven’t looked through the source yet, but it seems that you’re finding all the intersections with the bounding box of the “room” containing the shadow-caster. Would it not save more fillrate to clip the extruded quads to those calculated intersection points, making the stencil volume only exist inside the area it has influence in?

Good point, Assassin, it would save more fill to clip and cap the shadow volume to the bounds. The trouble is, that’s a pretty expensive operation to do on the CPU. This approach uses rough approximations with bounding volumes, and gets a lot of the same fill reduction by just using the scissor. The CPU and geometry load is significantly lighter.

Cass
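
For readers following along, here is a minimal sketch of the kind of coarse bounding-box scissor computation described above. It is not taken from the demo source: `boxToScissor`, `Vec3` and `ScissorRect` are hypothetical names, the matrix is assumed to be a column-major model-view-projection matrix, and the fallback to the full viewport when a corner is behind the eye is a simplifying assumption rather than proper clipping.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
struct ScissorRect { int x, y, width, height; };  // same order as glScissor arguments

// Hypothetical helper: bounds the projection of an axis-aligned box in window space.
// mvp is column-major (OpenGL convention): clip = mvp * (x, y, z, 1).
ScissorRect boxToScissor(const Vec3& lo, const Vec3& hi,
                         const std::array<float, 16>& mvp,
                         int viewportW, int viewportH)
{
    assert(viewportW > 0 && viewportH > 0);
    float minX = 1e30f, minY = 1e30f, maxX = -1e30f, maxY = -1e30f;
    for (int i = 0; i < 8; ++i) {
        // Pick one of the 8 box corners from the bits of i.
        const float x = (i & 1) ? hi.x : lo.x;
        const float y = (i & 2) ? hi.y : lo.y;
        const float z = (i & 4) ? hi.z : lo.z;
        const float cx = mvp[0] * x + mvp[4] * y + mvp[8]  * z + mvp[12];
        const float cy = mvp[1] * x + mvp[5] * y + mvp[9]  * z + mvp[13];
        const float cw = mvp[3] * x + mvp[7] * y + mvp[11] * z + mvp[15];
        // Assumption: a corner at or behind the eye plane means we give up
        // and use the whole viewport instead of clipping the box.
        if (cw <= 0.0f) return {0, 0, viewportW, viewportH};
        minX = std::min(minX, cx / cw);  maxX = std::max(maxX, cx / cw);
        minY = std::min(minY, cy / cw);  maxY = std::max(maxY, cy / cw);
    }
    // Clamp to the visible part of normalized device coordinates.
    minX = std::max(minX, -1.0f);  maxX = std::min(maxX, 1.0f);
    minY = std::max(minY, -1.0f);  maxY = std::min(maxY, 1.0f);
    if (minX > maxX || minY > maxY) return {0, 0, 0, 0};  // entirely off-screen

    // Map [-1, 1] to pixels, rounding outward so the rectangle stays conservative.
    const int x0 = static_cast<int>(std::floor((minX * 0.5f + 0.5f) * viewportW));
    const int x1 = static_cast<int>(std::ceil ((maxX * 0.5f + 0.5f) * viewportW));
    const int y0 = static_cast<int>(std::floor((minY * 0.5f + 0.5f) * viewportH));
    const int y1 = static_cast<int>(std::ceil ((maxY * 0.5f + 0.5f) * viewportH));
    return {x0, y0, x1 - x0, y1 - y0};
}
```

The result would be passed to glScissor(r.x, r.y, r.width, r.height) before rendering the shadow volume. Only eight points are transformed per box, which is why this stays cheap compared with clipping the volume itself.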

I think which way is faster depends on your application. One 3D engine may run faster one way and another engine the other way, so the best thing is to test both and see which one your application likes better. A program that does not use much CPU may do better with the method Assassin describes. On the other hand, if your program already uses quite a bit of CPU time, it would be better not to give the CPU more than it can handle at the moment, since that would hurt performance. That is where the technique in Cass’s demo would prevail.

-SirKnight

[This message has been edited by SirKnight (edited 04-11-2003).]

Originally posted by cass:
The trouble is, that’s a pretty expensive operation to do on the CPU.
Cass

Are you sure? In my tests, the function that computes the screen-space bounding rectangle of a light’s bounds for use with glScissor was quite fast. According to the VC++ 6.0 Pro profiler, it usually took under 0.010 ms to execute. Most of the time it was around 0.008 ms.

-SirKnight

Cass,

The presentation from GDC was very interesting. I like the semi-automatic shadow volume method ( I implemented it using 8 instructions for local lights ). The formula in the presentation seems to be wrong.

pos = pos*pos.w + (pos*L.w - L*pos.w)*(1-pos.w)

For L.w = 1, and a vertex with w = 0:
pos = pos*0 + (pos*1 - L*0) * (1-0) = pos

and w = 1
pos = pos*1 + (pos*1 - L*1) * (1-1) = pos

So you get the original vertex position in both cases ( unless I misunderstood something in the presentation ).
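
The substitution can also be checked numerically. The sketch below evaluates the formula exactly as quoted from the slides, pos*pos.w + (pos*L.w - L*pos.w)*(1-pos.w); the `Vec4` type and the function name `extrudeAsQuoted` are made up for illustration. It only reproduces the observation that the quoted expression returns the input vertex; it does not attempt to say what the intended formula was.

```cpp
#include <cassert>

struct Vec4 { float x, y, z, w; };

Vec4 scale(const Vec4& v, float s) { return {v.x * s, v.y * s, v.z * s, v.w * s}; }
Vec4 add(const Vec4& a, const Vec4& b) { return {a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w}; }
Vec4 sub(const Vec4& a, const Vec4& b) { return {a.x - b.x, a.y - b.y, a.z - b.z, a.w - b.w}; }

// The formula as quoted in the post:
//   pos' = pos*pos.w + (pos*L.w - L*pos.w)*(1 - pos.w)
// The scheme tags each vertex with w = 0 (to be extruded) or w = 1 (left in place).
Vec4 extrudeAsQuoted(const Vec4& pos, const Vec4& L)
{
    assert(pos.w == 0.0f || pos.w == 1.0f);
    const Vec4 kept     = scale(pos, pos.w);
    const Vec4 extruded = scale(sub(scale(pos, L.w), scale(L, pos.w)), 1.0f - pos.w);
    return add(kept, extruded);
}
```

With a local light L = (5, 6, 7, 1), both (1, 2, 3, 0) and (1, 2, 3, 1) come back unchanged, which matches the hand calculation above: the light position is always multiplied by zero, either through pos.w or through (1 - pos.w).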

Anyway, I have a question on the scissor topic: when computing the convex hull formed by the light and the view frustum, is it necessary to use the infinite view frustum or will it work using the ordinary frustum ( used when doing normal culling ) ? To me, it seems that as long as the shadows don’t project into the finite frustum, they can be disregarded ( I’m looking at the comments in the “Generalized Shadow Volume Culling Rule” section ).

Originally posted by SirKnight:

Are you sure? In my tests, the function that computes the screen-space bounding rectangle of a light’s bounds for use with glScissor was quite fast. According to the VC++ 6.0 Pro profiler, it usually took under 0.010 ms to execute. Most of the time it was around 0.008 ms.

I think Cass and Assassin were talking about actually clipping the shadow volumes. This is pretty expensive ( can be optimized quite a bit when using axis-aligned planes but is still non-trivial ).

Edit: For the interested, here’s a shot that shows clipped shadows with caps carved out of an axis-aligned box ( the expensive part ).

[This message has been edited by PH (edited 04-12-2003).]

That’s right. The bounding box test is pretty simple, but clipping/capping the actual high poly shadow volume to the light bounds is a lot more expensive.

That you can get very similar fill consumption and cheaper geometry processing with a relatively coarse bounding box calculation is the important result.

Cass

Originally posted by PH:
Cass,

The presentation from GDC was very interesting. I like the semi-automatic shadow volume method ( I implemented it using 8 instructions for local lights ). The formula in the presentation seems to be wrong.

pos = pos*pos.w + (pos*L.w - L*pos.w)*(1-pos.w)

For L.w = 1, and a vertex with w = 0:
pos = pos*0 + (pos*1 - L*0) * (1-0) = pos

and w = 1
pos = pos*1 + (pos*1 - L*1) * (1-1) = pos

So you get the original vertex position in both cases ( unless I misunderstood something in the presentation ).

Anyway, I have a question on the scissor topic: when computing the convex hull formed by the light and the view frustum, is it necessary to use the infinite view frustum or will it work using the ordinary frustum ( used when doing normal culling ) ? To me, it seems that as long as the shadows don’t project into the finite frustum, they can be disregarded ( I’m looking at the comments in the “Generalized Shadow Volume Culling Rule” section ).

Hi Paul,

Let me check the presentation on this again. Sounds like an error though. Thanks!

On the scissor determination, if you clip to the frustum, you need to clip based on the region of possible shadow. If you know that no shadow can fall beyond your “original” far plane, then computing your scissor based on that far plane is fine.

The way that bounds are treated can be uniform. Any information you can use to further constrain the “region of possible shadow” is fair game.

Thanks -
Cass
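
One way to read "any information is fair game" is that every conservative bound you have can simply be intersected with the others. The sketch below shows that for two window-space rectangles, for example a per-light rectangle and a per-object rectangle. `Rect` and `intersect` are hypothetical names, not something from the demo.

```cpp
#include <algorithm>
#include <cassert>

struct Rect { int x, y, width, height; };  // window-space, same order as glScissor

// Intersection of two scissor rectangles. If they do not overlap, the result
// has zero width and/or height, meaning no pixel can receive shadow.
Rect intersect(const Rect& a, const Rect& b)
{
    assert(a.width >= 0 && a.height >= 0 && b.width >= 0 && b.height >= 0);
    const int x0 = std::max(a.x, b.x);
    const int y0 = std::max(a.y, b.y);
    const int x1 = std::min(a.x + a.width,  b.x + b.width);
    const int y1 = std::min(a.y + a.height, b.y + b.height);
    return {x0, y0, std::max(0, x1 - x0), std::max(0, y1 - y0)};
}
```

Because each input is a conservative bound on the region of possible shadow, the intersection is still conservative, and an empty result means the shadow volume can be skipped for that pass.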

OOHHHH so that’s what was meant. Ok well then in that case I agree that the bounding box method is much faster. Alright, forget about my other post then. I think I was thinking about something else there. :P

-SirKnight

[This message has been edited by SirKnight (edited 04-12-2003).]

Just wondering, do we really need the larger scissor rectangle (the blue one in the demo) if we use the smaller (green) scissor rectangle? Or are both being drawn to show the difference of the light bounds scissor rect and the aggressive constrained scissor rect? If that’s the case, and I think it is, then using the aggressive constrained scissor saves a ton more fill than the light bound scissor.

-SirKnight

You only need the smaller scissor.

The blue one was shown to give an indication of how much better you could do with per-object (vs per-light) scissor.

Thanks -
Cass

Question:

I understand that the scissor will give massive savings in many cases (except for degenerate cases, where nothing will save you) if you extrude to infinity.

If you extrude to the end of the light radius in a vertex shader, rather than extruding to infinity, wouldn’t you get most (if not all) of the fill rate savings anyway? I like to avoid CPU geometry computations if possible.

I realize that there is a bit of a trade-off, because you have to extrude sufficiently further out from the light radius so that the edge segments don’t linearly cut into the bounding sphere, which may be hard to always get right. But CPU calculation seems like such a waste, especially if you’re also doing skinning and stuff. It seems we’re relegating vertex programs to applying the tangent space basis for us, and that’s about it; we do a lot of transform work on the CPU that’ll get re-done on the GPU…
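
To make the finite extrusion concrete, here is a rough CPU-side sketch of the per-vertex math such a vertex program would perform. `extrudeToDistance` and `Vec3` are hypothetical names, and the function assumes a point light. How far past the light radius the distance has to be padded is left to the caller, since that is exactly the part that is hard to get right.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Hypothetical helper: pushes vertex v directly away from a point light at L
// until it is `distance` units from the light.
Vec3 extrudeToDistance(const Vec3& v, const Vec3& L, float distance)
{
    const float dx = v.x - L.x;
    const float dy = v.y - L.y;
    const float dz = v.z - L.z;
    const float len = std::sqrt(dx * dx + dy * dy + dz * dz);
    assert(len > 0.0f);  // a vertex sitting on the light has no extrusion direction
    const float s = distance / len;
    return {L.x + dx * s, L.y + dy * s, L.z + dz * s};
}
```

Extruding every back vertex to exactly the light radius places the vertices on the bounding sphere, but the straight edges between them are chords that pass inside it. That is the reason for choosing a distance somewhat larger than the radius.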

Ok that’s what I thought. Thanks for the clarification.

BTW, I am using Visual Studio .NET and I can’t get this demo to compile. Are you aware of this problem with .NET? I can hardly get any NVIDIA demo to compile, though some do. I think it has problems with the demos that use the vector class, or at least some of them. Here is the error I get:

h:\NVIDIA Corporation\SDK\DEMOS\OpenGL\src\volume_intersect\volume_intersect.cpp(489): error C2475: 'std::vector<_Ty,_Ax>::size' : forming a pointer-to-member requires explicit use of the address-of operator ('&') and a qualified name
with
[
_Ty=edge,
_Ax=std::allocator<edge>
]

Thanks.
-SirKnight

[This message has been edited by SirKnight (edited 04-12-2003).]

You can fix that compile problem by adding “()” after the call to a.e.size on line 489 of volume_intersect.cpp.

I’ll get an updated version of the zip file up in a bit that fixes this issue.
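
For anyone hitting the same error before the updated zip is posted, this is the general shape of the problem. The snippet is an illustration only: the `edge` and `EdgeList` types are stand-ins, and the loop is not the actual code from line 489 of volume_intersect.cpp.

```cpp
#include <cstddef>
#include <vector>

struct edge { int v0, v1; };               // stand-in for the demo's edge type
struct EdgeList { std::vector<edge> e; };  // stand-in for whatever "a" is in the demo

std::size_t countEdges(const EdgeList& a)
{
    std::size_t n = 0;
    // Writing "a.e.size" without parentheses names the member function instead
    // of calling it, which Visual Studio .NET rejects with error C2475:
    //     for (std::size_t i = 0; i < a.e.size; ++i)
    for (std::size_t i = 0; i < a.e.size(); ++i)
        ++n;
    return n;
}
```

Older compilers such as VC++ 6.0 evidently let the missing parentheses through, which would explain why the demo built for some people and not others.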

Brilliant! At first I thought it wouldn’t be very useful, because you’d need to calculate six ray/plane intersections per shadow volume edge. Then I realized you’d just use the bounding box of the object for that, instead of the real object; hence the CPU cost remains relatively low.

So far I’ve been capping my shadow volumes to the light radius with a vertex shader to save fill-rate, and didn’t use the scissor test.

Y.

A few more quick questions on the presentation ( drifting slightly off topic ):

Is it really worth the trouble to build connected loops while extracting the silhouette ? Or put another way, is anyone actually doing this and can confirm whether this is a definite win ?

What about silhouette extraction of static occluders using the method from the “Silhouette Clipping” paper ?

I’ll probably just keep those on my wish/todo list .

Edit:
No problems with the code after all.

[This message has been edited by PH (edited 04-13-2003).]

>>>Is it really worth the trouble to build connected loops while extracting the silhouette ? Or put another way, is anyone actually doing this and can confirm whether this is a definite win ?<<<<

I did it for finding the true silhouette, but that is way too expensive.

Now I’m using Cass’s trick there and it’s not necessary to connect the edges in the “pseudo-silhouette”.

Of course if you do it, it will cost you.

Originally posted by jra101:
You can fix that compile problem by adding “()” after the call to a.e.size on line 489 of volume_intersect.cpp.

I’ll get an updated version of the zip file up in a bit that fixes this issue.

Ok thanks. I just realized that this actually was a very easy error to find and fix. :P If only I had clicked on the error message, I would have been brought right to it and seen the error. But I assumed it was one of the problems I had before, where clicking on the error would take me inside the vector file. I guess that’s what I get for assuming.

-SirKnight