some (advanced) shadow volumes questions

i read Eric Lengyel’s article on stencil shadow volumes on gamasutra and adapted his improvements to my existing code, works great :smiley: i still have some questions though:

  • he suggests to use an infinite projection matrix. should i use this projection all the time or just when i extrude my shadow volumes?

  • how can i perform frustum and occlusion culling for meshes that can potentially cast shadows? i cannot cull an object just because it’s outside the frustum, as it could still cast a shadow into the frustum… how do you do it instead?

  • i thought about faking soft shadows by rendering the stencil mask to a texture, blurring it and using the result in a shader, similar to a shadow map. does that make sense or is it a silly idea?

  • what can i do to prevent those “popping” self-shadows?

thanks! :slight_smile:

The soft shadows idea isn’t that bad, I did something similar with shadow mapping here:
http://www.humus.ca/index.php?page=3D&ID=54

The only thing you need to keep in mind is to weight the samples in your blur according to their depth difference, to avoid light and shadow bleeding across from the background or foreground.
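
The depth weighting described above can be sketched on the CPU as a one-dimensional blur over a row of the mask. This is only an illustration under assumptions: the name `blurShadowMask`, the `depthTolerance` parameter and the linear falloff are hypothetical and are not taken from the linked demo, which would do this in a shader.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Blur one texel of a shadow mask, letting a neighbour contribute only if
// its depth is close to the depth of the centre texel.
// shadow: 1.0 = lit, 0.0 = shadowed; depth: per-texel view-space depth.
float blurShadowMask(const std::vector<float>& shadow,
                     const std::vector<float>& depth,
                     std::size_t center, int radius, float depthTolerance)
{
    assert(shadow.size() == depth.size());
    assert(center < shadow.size() && depthTolerance > 0.0f);

    float sum = 0.0f;
    float weightSum = 0.0f;
    for (int offset = -radius; offset <= radius; ++offset)
    {
        const long i = static_cast<long>(center) + offset;
        if (i < 0 || i >= static_cast<long>(shadow.size()))
            continue; // sample falls outside the mask

        const std::size_t s = static_cast<std::size_t>(i);

        // The weight falls off linearly with the depth difference and is
        // zero at depthTolerance, so texels on another surface are ignored.
        const float dz = std::fabs(depth[s] - depth[center]);
        const float w = std::fmax(0.0f, 1.0f - dz / depthTolerance);

        sum += w * shadow[s];
        weightSum += w;
    }
    // The centre texel always has weight 1, so weightSum is never zero.
    return sum / weightSum;
}
```

With equal depths everywhere this degenerates to a plain box blur; across a depth discontinuity the lit foreground no longer leaks into the shadowed background.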

Hi Vexator,

I’ll answer the first two questions since they’re easier. :slight_smile:

  1. If you use the infinite projection matrix, you need to use it all the time.

  2. You need to check if the shadow volume is in the intersection of the light volume and the view frustum. This isn’t too hard if you use basic bounding volumes.

I’ll hand wave on the other two.

  3. Humus’s idea may be good. There’s also the penumbra wedges approach that Tomas Akenine-Möller came up with. I haven’t studied the problem much.

  4. There are a number of tricks, and I’ve even written some ideas about it, but nothing that could be called truly robust. :frowning:

Hope this helps some. :wink:

Cass
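
For reference, the infinite projection matrix from point 1 is the ordinary perspective matrix with the far plane taken to the limit. Below is a minimal sketch, assuming a symmetric frustum and OpenGL's column-major layout; the names `infiniteProjection` and `transform` are made up for this example.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat4 = std::array<float, 16>; // column-major, as OpenGL expects
using Vec4 = std::array<float, 4>;

// Perspective projection with the far plane pushed to infinity.
// fovY is the vertical field of view in radians.
Mat4 infiniteProjection(float fovY, float aspect, float nearPlane)
{
    assert(fovY > 0.0f && aspect > 0.0f && nearPlane > 0.0f);
    const float e = 1.0f / std::tan(fovY * 0.5f); // focal length

    Mat4 m{}; // all zeros
    m[0]  = e / aspect;
    m[5]  = e;
    m[10] = -1.0f;             // limit of -(f+n)/(f-n) as f -> infinity
    m[11] = -1.0f;             // w_clip = -z_eye
    m[14] = -2.0f * nearPlane; // limit of -2fn/(f-n) as f -> infinity
    return m;
}

// Multiply a column-major matrix by a column vector.
Vec4 transform(const Mat4& m, const Vec4& v)
{
    Vec4 r{};
    for (int row = 0; row < 4; ++row)
        for (int col = 0; col < 4; ++col)
            r[row] += m[col * 4 + row] * v[col];
    return r;
}
```

A direction with w = 0, such as a vertex extruded to infinity away from the light, lands exactly on the far plane (NDC z = 1) instead of being clipped. The matrix can be loaded with `glLoadMatrixf(m.data())` while `GL_PROJECTION` is the current matrix mode.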

Originally posted by Vexator:

  • how can i perform frustum and occlusion culling for meshes that can potentially cast shadows? i cannot cull an object just because it’s outside the frustum, as it could still cast a shadow into the frustum… how do you do it instead?

Hi –

You need to cast shadows for objects that intersect the convex hull formed by the view frustum and the light source. There are a bunch more details in the GDC presentations linked on the front page of terathon.com.

– Eric Lengyel
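
A rough sketch of that culling test follows. It is a conservative simplification and not the full construction from the presentations: it only uses the frustum planes that the light lies inside of, because the hull of frustum and light is guaranteed to stay inside those. The full hull adds planes through the light and the frustum's silhouette edges and culls more. All names (`Plane`, `Sphere`, `mayCastVisibleShadow`, `lightRange`) are assumptions made for this example.

```cpp
#include <array>
#include <cassert>
#include <cmath>

struct Vec3   { float x, y, z; };
struct Plane  { Vec3 n; float d; };  // unit normal; inside means dot(n,p)+d >= 0
struct Sphere { Vec3 c; float r; };  // bounding sphere of a shadow caster

inline float signedDistance(const Plane& p, const Vec3& v)
{
    return p.n.x * v.x + p.n.y * v.y + p.n.z * v.z + p.d;
}

// Returns false only if the caster certainly cannot cast a shadow into the
// view frustum. lightRange is the distance at which the point light is
// fully attenuated.
bool mayCastVisibleShadow(const std::array<Plane, 6>& frustum,
                          const Vec3& light, float lightRange,
                          const Sphere& caster)
{
    assert(lightRange > 0.0f && caster.r >= 0.0f);

    // Casters outside the light's range receive no light, so no shadow.
    const float dx = caster.c.x - light.x;
    const float dy = caster.c.y - light.y;
    const float dz = caster.c.z - light.z;
    if (std::sqrt(dx * dx + dy * dy + dz * dz) > lightRange + caster.r)
        return false;

    for (const Plane& p : frustum)
    {
        // If the light is outside this plane, the hull of frustum + light
        // extends past it, so the plane cannot be used for culling.
        if (signedDistance(p, light) < 0.0f)
            continue;

        // Light and frustum are both inside: a caster fully outside this
        // plane can only throw its shadow further away from the frustum.
        if (signedDistance(p, caster.c) < -caster.r)
            return false;
    }
    return true;
}
```

Ordinary frustum culling still decides whether the mesh itself is drawn; this test only decides whether its shadow volume has to be rendered for the given light.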

wow thanks for the helpful replies :slight_smile: i’ll read through your presentation, eric, thanks. regarding popping self-shadows: i took a look at the shadow volumes demo in the nvidia sdk and they are dividing the geometry of the shadow caster into polygons that face the light and polygons that do not… for the ones that do not face it, they call this:

glStencilFunc(GL_GEQUAL, 129, ~0); // don’t shadow polys that backface the light and are only in one shadow volume

why? and how can i implement this if i’m using vbo’s, where i do not have access to single vertices? @cass: what are those tricks you have? would you share them with me?

thanks :slight_smile:

The trick in the NVIDIA demo is mine.

Assuming you already have to know which way each face of the model points when generating the shadow volume, you can use that to generate two separate index lists: one for rendering the faces that face toward the light, and one for the faces that face away from it.

You don’t need a new VBO for vertex positions, but you do need new index lists.

Oh, and the reason it’s not really robust is that you’re relying on shading to darken the face, not the shadowing calculation.

Shading isn’t guaranteed to do the right thing, but it usually does. And the artifacts are less objectionable because they don’t tend to pop.
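
A minimal sketch of building those two index lists is shown below. It assumes indexed triangles with counter-clockwise front faces and a point light; `splitByLightFacing` and `FacingLists` are hypothetical names, not code from the NVIDIA demo.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

inline Vec3 sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 cross(const Vec3& u, const Vec3& v)
{
    return {u.y * v.z - u.z * v.y, u.z * v.x - u.x * v.z, u.x * v.y - u.y * v.x};
}

struct FacingLists
{
    std::vector<unsigned> towardLight;   // triangles facing the light
    std::vector<unsigned> awayFromLight; // triangles facing away from it
};

// Sorts each triangle into one of two index lists. Both lists refer to the
// same, unchanged vertex positions, so no second vertex buffer is needed.
FacingLists splitByLightFacing(const std::vector<Vec3>& positions,
                               const std::vector<unsigned>& indices,
                               const Vec3& lightPos)
{
    assert(indices.size() % 3 == 0);
    FacingLists lists;
    for (std::size_t t = 0; t + 2 < indices.size(); t += 3)
    {
        const Vec3& a = positions[indices[t]];
        const Vec3& b = positions[indices[t + 1]];
        const Vec3& c = positions[indices[t + 2]];

        // Face normal from the winding order; positive dot product with the
        // direction to the light means the triangle faces the light.
        const Vec3 normal = cross(sub(b, a), sub(c, a));
        const bool facesLight = dot(normal, sub(lightPos, a)) > 0.0f;

        std::vector<unsigned>& out = facesLight ? lists.towardLight
                                                : lists.awayFromLight;
        out.insert(out.end(), indices.begin() + t, indices.begin() + t + 3);
    }
    return lists;
}
```

Each list can then be drawn with its own `glDrawElements` call against the shared vertex buffer, setting the different `glStencilFunc` state before the list of faces that point away from the light.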

ok i’ll give it a try. it’s ok if it works in most cases, still better than those popping artefacts :stuck_out_tongue:

btw another question - so far i was using immediate mode to draw my shadow silhouette, like this (yes i know, i do not have to recompute the silhouette every frame and i do not need the caps in all cases, still have to fix this):

void Group::ExtrudeSilhouette( const Vector3D translation )
{
	// for all polygons..
	for( unsigned int i = 0; i < m_vpkPolygons.size(); i++ )
	{
		// get polygon
		Polygon *pkPolygon = m_vpkPolygons[i];

		// check if it's visible
		if( !pkPolygon->m_bIsVisible )
			continue;

		// construct volume's caps (front cap, then extruded back cap with reversed winding)
		glBegin( GL_TRIANGLES );
		glVertex3fv( m_vpkVertices[(i*3)] );
		glVertex3fv( m_vpkVertices[(i*3)+1] );
		glVertex3fv( m_vpkVertices[(i*3)+2] );
		glVertex3fv( m_vpkVertices[(i*3)]+(m_vpkVertices[(i*3)]-translation)*100.0f );
		glVertex3fv( m_vpkVertices[(i*3)+2]+(m_vpkVertices[(i*3)+2]-translation)*100.0f );
		glVertex3fv( m_vpkVertices[(i*3)+1]+(m_vpkVertices[(i*3)+1]-translation)*100.0f );
		glEnd();

		// for every vertex..
		for( unsigned int j = 0; j < 3; j++ )
		{
			// get neighbor (1-based index, 0 means no neighbour)
			int k = pkPolygon->m_aiNeighbors[j];

			// if there's no neighbour or the neighbour isn't visible..
			if( (!k) || (!m_vpkPolygons[k-1]->m_bIsVisible) )
			{
				// get the edges..
				Vector3D &kVertex1 = m_vpkVertices[(i*3)+j];
				Vector3D &kVertex2 = m_vpkVertices[(i*3)+(j+1)%3];

				// extrude them..
				Vector3D kVertex3 = kVertex1+(kVertex1-translation)*100.0f;
				Vector3D kVertex4 = kVertex2+(kVertex2-translation)*100.0f;

				// and construct volume
				glBegin( GL_TRIANGLE_STRIP );
				glVertex3fv( kVertex1 );
				glVertex3fv( kVertex3 );
				glVertex3fv( kVertex2 );
				glVertex3fv( kVertex4 );
				glEnd();
			}
		}
	}
}

now i thought it’s probably faster to use a vertex array, like this:

void Group::ExtrudeSilhouette( const Vector3D translation )
{
	vector<Vector3D> m_vpkSilhouetteVertices;

	// for all polygons..
	for( unsigned int i = 0; i < m_vpkPolygons.size(); i++ )
	{
		// get polygon
		Polygon *pkPolygon = m_vpkPolygons[i];

		// check if it's visible
		if( !pkPolygon->m_bIsVisible )
			continue;

		// construct volume's caps (front cap, then extruded back cap with reversed winding)
		m_vpkSilhouetteVertices.push_back( m_vpkVertices[(i*3)] );
		m_vpkSilhouetteVertices.push_back( m_vpkVertices[(i*3)+1] );
		m_vpkSilhouetteVertices.push_back( m_vpkVertices[(i*3)+2] );
		m_vpkSilhouetteVertices.push_back( m_vpkVertices[(i*3)]+(m_vpkVertices[(i*3)]-translation)*100.0f );
		m_vpkSilhouetteVertices.push_back( m_vpkVertices[(i*3)+2]+(m_vpkVertices[(i*3)+2]-translation)*100.0f );
		m_vpkSilhouetteVertices.push_back( m_vpkVertices[(i*3)+1]+(m_vpkVertices[(i*3)+1]-translation)*100.0f );

		// for every vertex..
		for( unsigned int j = 0; j < 3; j++ )
		{
			// get neighbor (1-based index, 0 means no neighbour)
			int k = pkPolygon->m_aiNeighbors[j];

			// if there's no neighbour or the neighbour isn't visible..
			if( (!k) || (!m_vpkPolygons[k-1]->m_bIsVisible) )
			{
				// get the edges..
				Vector3D &kVertex1 = m_vpkVertices[(i*3)+j];
				Vector3D &kVertex2 = m_vpkVertices[(i*3)+(j+1)%3];

				// extrude them..
				Vector3D kVertex3 = kVertex1+(kVertex1-translation)*100.0f;
				Vector3D kVertex4 = kVertex2+(kVertex2-translation)*100.0f;

				// and construct volume (two triangles per silhouette edge)
				m_vpkSilhouetteVertices.push_back( kVertex1 );
				m_vpkSilhouetteVertices.push_back( kVertex3 );
				m_vpkSilhouetteVertices.push_back( kVertex2 );
				m_vpkSilhouetteVertices.push_back( kVertex2 );
				m_vpkSilhouetteVertices.push_back( kVertex3 );
				m_vpkSilhouetteVertices.push_back( kVertex4 );
			}
		}
	}

	// nothing to draw (and [0] would be invalid on an empty vector)
	if( m_vpkSilhouetteVertices.empty() )
		return;

	glEnableClientState( GL_VERTEX_ARRAY );
	glVertexPointer( 3, GL_FLOAT, 0, &(m_vpkSilhouetteVertices[0]) );
	glDrawArrays( GL_TRIANGLES, 0, m_vpkSilhouetteVertices.size() );
	glDisableClientState( GL_VERTEX_ARRAY );
}

however, this is much slower, 25fps compared to 35fps - wtf?

well drawing your geometry with vertex arrays should never be slower than immediate mode ( but they are often not much faster ). So I think that your problem is probably with generating the array for VAs.

The problem may be connected with your usage of vector<Vector3D> m_vpkSilhouetteVertices;

I’m not sure how the vector class is implemented but I think that each time you call push_back() on it the whole array is reallocated and it can slow down the whole process ( and you call this method really often ). You can try to pre-allocate it with the reserve() method ( just compute how much memory you will need, it doesn’t have to be accurate so you can just compute the max possible memory you will need ). Maybe the problem is elsewhere but you can give it a try.

BTW, push_back() on a standard vector with the default allocator is fast enough. When the capacity runs out, it grows geometrically (typically by a factor of 1.5 or 2), so push_back() is amortized constant time and, I hope, isn’t the bottleneck.

But, just in case, it’s a good idea to reserve the needed capacity.
Also, your vertex array is in system memory, this is awfully bad, try using VBOs instead.

If your light position changes rarely (less than once per frame), you shouldn’t refill the vertex array every time you draw your extruded silhouette.
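
To make the reserve() suggestion concrete, here is a small sketch that counts how often a vector's storage moves with and without reserving first. The bound of 24 vertices per polygon follows from the code posted above (6 cap vertices plus up to 3 silhouette edges of 6 vertices each); the function names and the stand-in `Vector3D` struct are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vector3D { float x, y, z; }; // stand-in for the thread's vector class

// Upper bound on the vertices one ExtrudeSilhouette() call can emit:
// 6 cap vertices per polygon plus up to 3 silhouette edges * 6 vertices.
std::size_t maxSilhouetteVertices(std::size_t polygonCount)
{
    return polygonCount * (6 + 3 * 6);
}

// Appends n elements and counts how often the storage had to move.
std::size_t countReallocations(std::size_t n, bool reserveFirst)
{
    std::vector<Vector3D> v;
    if (reserveFirst)
        v.reserve(n);

    std::size_t reallocations = 0;
    const Vector3D* lastData = v.data();
    for (std::size_t i = 0; i < n; ++i)
    {
        v.push_back(Vector3D{0.0f, 0.0f, 0.0f});
        if (v.data() != lastData) // storage moved to a new allocation
        {
            ++reallocations;
            lastData = v.data();
        }
    }
    return reallocations;
}
```

Without reserve() the count stays small because of the geometric growth, which matches the observation that push_back() alone is unlikely to explain a 10 fps drop. Keeping the vector as a member and calling clear() each frame also retains the capacity between frames.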

ok i tried what you suggested and pre-reserved the memory instead of pushing back new data - didn’t increase performance, though :stuck_out_tongue:

“Also, your vertex array is in system memory, this is awfully bad, try using VBOs instead.”
yep, i’m using vbo’s now. i actually thought it’d be better to use standard va’s in this case, as i’m updating the geometry every frame. so is it no problem to call glBufferData() that frequently?

“If your light position changes rarely (less than once per frame), you shouldn’t refill the vertex array every time you draw your extruded silhouette.”
yes, i’m going to optimize this, thanks :slight_smile:

edit - just in case i made a stupid mistake, here’s the updated code:

void Group::ComputeSilhouette( const Vector3D translation )
{
	// clear silhouette edges array (keeps its capacity)
	m_vpkSilhouetteEdges.clear();

	// for all polygons..
	for( unsigned int i = 0; i < m_vpkPolygons.size(); i++ )
	{
		// get current polygon
		Polygon *pkPolygon = m_vpkPolygons[i];

		// check if it's visible
		if( !pkPolygon->m_bIsVisible )
			continue;

		// get the edges..
		Vector3D &kVertex1 = m_vpkVertices[(i*3)];
		Vector3D &kVertex2 = m_vpkVertices[(i*3)+1];
		Vector3D &kVertex3 = m_vpkVertices[(i*3)+2];

		// extrude them.. (by value: a non-const reference must not bind to a temporary)
		Vector3D kVertex4 = kVertex1+(kVertex1-translation)*100.0f;
		Vector3D kVertex5 = kVertex3+(kVertex3-translation)*100.0f;
		Vector3D kVertex6 = kVertex2+(kVertex2-translation)*100.0f;

		// and construct volume's caps
		m_vpkSilhouetteEdges.push_back( kVertex1 );
		m_vpkSilhouetteEdges.push_back( kVertex2 );
		m_vpkSilhouetteEdges.push_back( kVertex3 );
		m_vpkSilhouetteEdges.push_back( kVertex4 );
		m_vpkSilhouetteEdges.push_back( kVertex5 );
		m_vpkSilhouetteEdges.push_back( kVertex6 );

		// for every vertex..
		for( unsigned int j = 0; j < 3; j++ )
		{
			// get neighbor (1-based index, 0 means no neighbour)
			int k = pkPolygon->m_aiNeighbors[j];

			// if there's no neighbour or the neighbour isn't visible..
			if( (!k) || (!m_vpkPolygons[k-1]->m_bIsVisible) )
			{
				// get the edges..
				Vector3D &kVertex1 = m_vpkVertices[(i*3)+j];
				Vector3D &kVertex2 = m_vpkVertices[(i*3)+(j+1)%3];

				// extrude them..
				Vector3D kVertex3 = kVertex1+(kVertex1-translation)*100.0f;
				Vector3D kVertex4 = kVertex2+(kVertex2-translation)*100.0f;

				// and construct volume (two triangles per silhouette edge)
				m_vpkSilhouetteEdges.push_back( kVertex1 );
				m_vpkSilhouetteEdges.push_back( kVertex3 );
				m_vpkSilhouetteEdges.push_back( kVertex2 );
				m_vpkSilhouetteEdges.push_back( kVertex2 );
				m_vpkSilhouetteEdges.push_back( kVertex3 );
				m_vpkSilhouetteEdges.push_back( kVertex4 );
			}
		}
	}

	// nothing to upload (and [0] would be invalid on an empty vector)
	if( m_vpkSilhouetteEdges.empty() )
		return;

	if( !glIsBuffer(m_uiSilhouetteId) )
		glGenBuffers( 1, &m_uiSilhouetteId );

	// the data is respecified every frame, so hint streaming usage instead of GL_STATIC_DRAW
	glBindBuffer( GL_ARRAY_BUFFER, m_uiSilhouetteId );
	glBufferData( GL_ARRAY_BUFFER, m_vpkSilhouetteEdges.size()*sizeof(Vector3D), &(m_vpkSilhouetteEdges[0]), GL_STREAM_DRAW );
}

void Group::RenderSilhouette()
{
	glEnableClientState( GL_VERTEX_ARRAY );
	glBindBuffer( GL_ARRAY_BUFFER, m_uiSilhouetteId );

	glVertexPointer( 3, GL_FLOAT, 0, 0 );
	glDrawArrays( GL_TRIANGLES, 0, m_vpkSilhouetteEdges.size() );

	glBindBuffer( GL_ARRAY_BUFFER, 0 );
	glDisableClientState( GL_VERTEX_ARRAY );
}

It’s not a problem to update a VBO each frame.
I advise you to read this document: http://developer.nvidia.com/attach/6427

ok great, thank you!
@eric: i’d like to implement the scissor optimization that you’re discussing in your article, but i got lost after the third or fourth transformation… could you explain in other words to me how to compute those scissor rectangles? thanks in advance!

Vexator

see this example - scissor and depth bound test with sources and demo: http://download.developer.nvidia.com/dev…olume_intersect

Vexator,

If I understand what you’re asking about, the scissor optimization is basically computing a screen-space bounding rect (or volume if you use depth bounds test) that conservatively covers the intersection of the light volume and the shadow volume of an object.

With bounded light sources that are fully attenuated at some distance away from the light, it is wasted effort to do stencil incr/decr when you know for sure the light won’t illuminate there.

This is an important way to reduce fill costs for SSV rendering, though it does (unfortunately) come at some CPU cost.

Thanks -
Cass

For the scissor test, you’re basically just projecting the light’s bounding sphere onto the image plane and calculating the enclosing rectangle. You have to be careful to calculate the right tangent planes to the sphere, and there are a bunch of corner cases to watch out for. There’s a somewhat old derivation of the bounding rectangle here:

http://www.gamasutra.com/features/20021011/lengyel_01.htm

It can be simplified a little. A slightly better derivation can be found in the 2nd edition of Mathematics for 3D Game Programming and Computer Graphics.
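
As a rough illustration of the projection step, the sketch below computes the extent of the light's bounding sphere along one screen axis. It is not the algebraic derivation from the article: it finds the same tangent planes trigonometrically in the 2D slice through the eye, and it ignores the near plane. `projectSphereExtent`, `Extent` and the parameter names are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Extent { bool visible; float lo, hi; }; // range in NDC, within [-1, 1]

// c:      eye-space x (or y) of the sphere centre
// cz:     eye-space z of the sphere centre (the camera looks down -z)
// focal:  projection scale for this axis, e.g. e/aspect for x and e for y,
//         with e = 1/tan(fovY/2)
Extent projectSphereExtent(float c, float cz, float radius, float focal)
{
    assert(radius > 0.0f && focal > 0.0f);
    const float halfPi = 1.57079632679f;

    // No tangent plane through the eye exists: cover the whole axis.
    const float dist = std::sqrt(c * c + cz * cz);
    if (dist <= radius)
        return {true, -1.0f, 1.0f};

    const float theta = std::atan2(c, -cz);       // angle of centre from view dir
    const float alpha = std::asin(radius / dist); // half-angle of tangent cone
    const float a0 = theta - alpha;
    const float a1 = theta + alpha;

    // Both tangents point behind the eye: nothing can be on screen.
    if (a0 >= halfPi || a1 <= -halfPi)
        return {false, 0.0f, 0.0f};

    // A tangent that points behind the eye leaves that side unbounded.
    const float lo = (a0 <= -halfPi) ? -1.0f : std::max(-1.0f, focal * std::tan(a0));
    const float hi = (a1 >= halfPi) ? 1.0f : std::min(1.0f, focal * std::tan(a1));
    if (lo >= hi)
        return {false, 0.0f, 0.0f}; // entirely off one side of the screen

    return {true, lo, hi};
}
```

Running it once for x and once for y gives a rectangle in normalized device coordinates; mapping each value through `(ndc * 0.5 + 0.5) * viewportSize` yields the pixel values for `glScissor`.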

thank you, i downloaded the demo and will try to implement the tests :slight_smile: but i’m having problems organizing all those visibility checks atm:

my first pass is depth only, then follow the light passes, one for each light. this worked great before i implemented shadows. now i’ve been agonizing over this for hours: i have only one depth test, so i can only decide once per frame which objects are to be drawn. but actually, this can differ for every light. an object which casts visible shadows in light pass #1 does not necessarily do the same in light pass #2. but i can’t enable the depth mask for every light pass as this results in additively blended passes. how do i manage this?