I have written a Shadow Volume calculation routine for a game I am working on, but it appears to me that the algorithm is still not efficient enough. Shadow volumes are a rather expensive shadow implementation, but to me they seem a lot more intuitive and robust to work with and they also offer better visuals (pixel-accurate shadows, voluminous shadowing) than shadow maps, which would be the cheaper, more established method.

First, a small introduction to Shadow Volumes:

Other than shadow maps, which are textures, shadow volumes (as the name implies) are 3D meshes. They are defined by vertices with position and normal like any other mesh and have to be rendered with a draw call at some point. The vertices of the shadow mesh are determined by a caster mesh, which obscures a volume from a light. Because the shape of the shadow changes when either the light or the caster changes its position or rotation, shadow volumes need to be highly dynamic and often have to be recalculated each frame (but you can cache the shadow volumes for the shadows cast by a static object which is obscuring a static light), which means an algorithm which calculates these shadow volumes faster can be very valuable.

Once the shadow mesh is calculated, it is drawn to a stencil buffer in projected space. Here we increment the stencil buffer value for every front facing fragment by first culling the backfaces of the shadow mesh, and then decrementing the stencil value for every backfacing fragment by culling the frontfaces. in our lighting calculation we only light the fragments whose stencil buffer value is 0.

How do you calculate the light volume?

There are several slightly different approaches to this, but they all follow the same routine:

- find the front-/backfaces of a mesh, from the perspective of a light (by dotting the face normals with the light direction Vector)

- find the silhouette of the mesh from the light´s perspective by either:

- searching through all edges of a mesh and comparing the 2 adjacent faces. if one of them is front- and the other backfacing, this edge is part of the silhouette

- searching through all edges in the front- or backfaces and finding the index pair (edge) which only occurs once (it is an edge to the other side of the mesh, therefore part of the silhouette)

- while we search for the silhouette indices, we need to remember to retain the order which they had in thebackfaces, to ensure correct drawing order.

- after the silhouette indices have been found, we fetch the actual corresponding vertices and extrude them to infinity by integrating vertices with xyz values of the vector from the light to that vertex and a w value of 0.

- finally (not really) we have to generate proper indices for the shadow vertices. here we orient ourselves on the indices for the silhouette indices, which still have the correct drawing order preserved.

After all the shadow volume calculations we need to include the shadow volumes into our lighting calculations. We can do this by creating a screen sized, single integer component texture (ideally a stencil texture) for every light and drawing the shadow volumes cast by a light into the corresponding texture. Then, when we iterate through our lights during the lighting calculations, we can use the texture to determine if the fragment should be included or skipped by the lighting calculations.

Phew, this is a lot to calculate for roughlyevery mesh * every light... I would like to open this problem up for public discussion here on the forums, hopefully we can come up with some optimizations for this algorithm.

Here is the algorithm i have working right now, but its still kind of slow

calculating the shadows of

10 cubes (each 12 faces) with 1 light took about 6 ms;

100 cubes with 1 light: 60 ms;

1000 cubes with 1 light: 600 ms;

10 spheres (each 960 faces) with 1 light: 60 ms;

100 spheres with 1 light: 740 ms;

1000 spheres with 1 light: 7120 ms;

10 cubes with 2 lights: 10 ms;

100 cubes with 2 lights: 100 ms;

1000 cubes with 2 lights: 1150 ms

10 spheres with 2 lights: 140 ms;

100 spheres with 2 Lights: 1410 ms;

1000 spheres with 2 lights: 14190 ms;

before the frame loop i have to create an info structure of every mesh to be used in the shadow calculations:

Code :struct MeshInfo { std::vector<std::array<unsigned int, 3>> faces; //3 indices std::vector<std::array<unsigned int, 2>> edges;//2 indices into [I]faces[/I](not to vertices!) std::vector<std::array<glm::vec3, 2>> faceVertex; //for more accurate backface detection - face position and normal }; unsigned int Lighting::makeMeshInfo(unsigned int pIndexOffset, unsigned int pIndexCount) { puts("Generating Shadow Mesh"); MeshInfo sMesh; unsigned int faceCount = pIndexCount / 3; sMesh.faceVertex.resize(faceCount); sMesh.faces.resize(faceCount); std::vector<unsigned int> faceIndices; faceIndices.resize(pIndexCount); //allocate face indices for (unsigned int i = 0; i < pIndexCount; ++i) { //check whether index to position has already been used in faces unsigned int ind = OpenGL::allStaticIndices[pIndexOffset + i]; glm::vec3 vPos = OpenGL::allStaticVertices[ind].position; for (unsigned int j = 0; j < i; ++j) { unsigned int jim = sMesh.faces[j / 3][j % 3]; if (vPos == OpenGL::allStaticVertices[jim].position) { ind = jim; j = i; } } sMesh.faces[i / 3][i % 3] = ind; } //generate face normals and positions for (unsigned int i = 0; i < pIndexCount; i += 3) { glm::vec3 n = glm::normalize( OpenGL::allStaticVertices[OpenGL::allStaticIndices[pIndexOffset + i]].normal + OpenGL::allStaticVertices[OpenGL::allStaticIndices[pIndexOffset + i+1]].normal + OpenGL::allStaticVertices[OpenGL::allStaticIndices[pIndexOffset + i+ 2]].normal); glm::vec3& a = OpenGL::allStaticVertices[OpenGL::allStaticIndices[pIndexOffset + i]].position; glm::vec3& b = OpenGL::allStaticVertices[OpenGL::allStaticIndices[pIndexOffset + i+1]].position; glm::vec3& c = OpenGL::allStaticVertices[OpenGL::allStaticIndices[pIndexOffset + i+2]].position; glm::vec3 p = (a + b + c) / 3.0f; sMesh.faceVertex[i/3][0] = n; sMesh.faceVertex[i / 3][1] = p; } //find the faces of every edge sMesh.edges.resize(faceCount * 3); unsigned int edgeCount = 0; for (unsigned int f = 0; f < faceCount - 1; ++f) { for (unsigned int r = f; r < faceCount; ++r) { unsigned int found = 0; for (unsigned int fi = 0; fi < 3; ++fi) { for (unsigned int ri = 0; ri < 3; ++ri) { if (sMesh.faces[f][fi] == sMesh.faces[r][ri]) { ++found; ri = 3; if (found > 1) { fi = 3; std::array<unsigned int, 2> arr = { f, r }; sMesh.edges[edgeCount] = arr; ++edgeCount; } } } } } } sMesh.edges.resize(edgeCount); allMeshInfos.push_back(sMesh); return allMeshInfos.size()-1; }

every frame, for every light, i loop through all meshes and their entities

Code :for(lightIndex : allLights){ vec3 lightPos = allPositions[lightIndex]; for(mesh : allMeshes){ MeshInfo& mInfo = allMeshInfos[mesh.meshInfo]; for(entityIndex : mesh.entities){ mat4 entityMatrix = allMatrices[entityIndex]; calcShadowVolume(mInfo, entityMatrix, lightPos); } } }

using the light position and entity matrix, the backfaces of the MeshInfo mInfo can be determined.

This is one of the 2 heaviest functions of the whole calculation, due to all its loops

Code ://find silhouette by dotting face normals and light direction std::vector<unsigned int> Lighting::findSilhouette(unsigned int pMeshInfoIndex, glm::vec3 pObjectSpaceLightPos) { const MeshInfo& shadowMesh = allMeshInfos[pMeshInfoIndex]; unsigned int faceCount = shadowMesh.faces.size(); std::vector<unsigned int> backfaces; unsigned int backFaceCount = 0; backfaces.resize(faceCount); //determine back & front faces for (unsigned int i = 0; i < faceCount; ++i) { if (glm::dot(shadowMesh.faceVertex[i][0], glm::normalize(pObjectSpaceLightPos - shadowMesh.faceVertex[i][1])) < 0.0) { backfaces[i] = 1; ++backFaceCount; } else { backfaces[i] = 0; } } std::vector<std::array<unsigned int, 2>> silhouetteFacePairs; unsigned int silhouetteEdgeCount = 0; silhouetteFacePairs.resize(backFaceCount*3); //for every edge, compare the participating faces for their facing //store the faces of a silhouette edge in silhouetteFacePairs and //order them like this: (frontface, backface) for (unsigned int e = 0; e < shadowMesh.edges.size(); ++e) { if (backfaces[shadowMesh.edges[e][0]] != backfaces[shadowMesh.edges[e][1]]) { std::array<unsigned int, 2> arr; if (backfaces[shadowMesh.edges[e][0]]) { arr = { shadowMesh.edges[e][1], shadowMesh.edges[e][0] }; } else { arr = { shadowMesh.edges[e][0], shadowMesh.edges[e][1] }; } silhouetteFacePairs[silhouetteEdgeCount] = arr; ++silhouetteEdgeCount; } } silhouetteFacePairs.resize(silhouetteEdgeCount); //find common indices in face pairs //the resulting 2 indices give the silhouette edge //keep the order in the backface //compare each index of the backface with every index in the frontface std::vector<unsigned int> silhouette; silhouette.resize(silhouetteEdgeCount*2); unsigned int silhouetteIndices = 0; for (unsigned int p = 0; p < silhouetteEdgeCount; ++p) { //for every face pair unsigned int found = 0; std::array<unsigned int, 2> edge; int swap = 0; const std::array<unsigned int, 3>& front = shadowMesh.faces[silhouetteFacePairs[p][0]]; const std::array<unsigned int, 3>& back = shadowMesh.faces[silhouetteFacePairs[p][1]]; //for every index in the backface for (unsigned int b = 0; b < 3; ++b) { //iterate through every index in frontface for (unsigned int f = 0; f < 3; ++f) { if (front[f] == back[b]) { edge[found] = back[b]; ++found; f = 3; } } if (found == 2) { b = 3; }else if (found == 1) { if (b == 1) { edge[1] = back[2]; swap = 1; b = 3; ++found; } }else if (found == 0) { b = 3; edge[0] = back[1]; edge[1] = back[2]; } } silhouette[silhouetteIndices++] = edge[0 + swap]; silhouette[silhouetteIndices++] = edge[1 - swap]; } //silhouette indices of form: 0, 1, 1, 2, 2, 3, 3, 4 silhouette.resize(silhouetteIndices); return silhouette; }

finally, the shadow mesh has to be put together by generating the vertices at infinity and the indices to connect it all to make sense.

Code ://tie shadow mesh together unsigned int Lighting::genShadowMesh(unsigned int pEntity, unsigned int pLight, std::vector<unsigned int>& pSilhouetteIndices) { //get vertex world positions of the mesh silhouette and extrude them away from the light unsigned int silhouetteIndexCount = pSilhouetteIndices.size(); unsigned int shadowIndexCount = silhouetteIndexCount * 3; unsigned int localIndexOffset = allLightShadowVertices.size(); const Light& light = allLights[pLight]; glm::vec3 lightPos = light.position; glm::mat4 matrix = Component::positionArray[pEntity].matrix; std::vector<unsigned int> shadowIndices; shadowIndices.resize(shadowIndexCount); //extract unique global indices from silhouetteIndices, generate vertex at infinity //transform global indices to local (relative to shadowMesh) indices std::vector<unsigned int> uniqueGlobalIndices; uniqueGlobalIndices.resize(silhouetteIndexCount); unsigned int shadowVertexCount = 0; std::vector<unsigned int> localSilhouetteIndices = pSilhouetteIndices; for (unsigned int i = 0; i < silhouetteIndexCount; ++i) { //index into global vertex array unsigned int globalIndex = pSilhouetteIndices[i]; //plain index from the silhouetteIndices unsigned int localIndex; unsigned int prev = i; for (int find = i - 1; find >= 0; find--) { //backwards search for previous occourence of this globalIndex if (pSilhouetteIndices[find] == globalIndex) { prev = (unsigned int)find; find = -1; } } //prev is either index to previous occourence of globalIndex or i if (prev < i) { // index has been processed before //transformed index is in localIndices at prev localIndex = localSilhouetteIndices[prev]; } else { //this is the first time this index is encountered localIndex = shadowVertexCount + localIndexOffset; //generate local index uniqueGlobalIndices[shadowVertexCount /2] = globalIndex; //add unique global vertex to global vertex list for vertex generation shadowVertexCount += 2; //add cap and infinity vertex in advance (all even numbers are cap indices, all uneven are infinity } localSilhouetteIndices[i] = localIndex; } uniqueGlobalIndices.resize(shadowVertexCount / 2); std::vector<ShadowVertex> shadowVertices; shadowVertices.resize(shadowVertexCount); for (unsigned int i = 0; i < shadowVertexCount/2; i++) { glVertex v = allStaticVertices[uniqueGlobalIndices[i]]; ShadowVertex cap; cap.pos = matrix * glm::vec4(v.position, 1.0f); cap.normal = glm::vec3(matrix * glm::vec4(v.normal, 0.0f)); //BLUR?? ShadowVertex inf = cap; // generate vertex at infinity inf.pos = glm::vec4(glm::vec3(cap.pos) - lightPos, 0.0f); shadowVertices[i*2] = cap; shadowVertices[(i*2) + 1] = inf; } /* Vertex Array index space 0, 1, 2, 3, .. 2----0----6----4----8 | /| /| /| /| | / | / | / | / | | / | / | / | / | |/ |/ |/ |/ | 3----1----7----5----9 final indices 0, 2, 3, 0, 3, 1 4, 6, 7, 4, 7, 5 8, 4, 5, 8, 5, 9 */ for (unsigned int i = 0; i < silhouetteIndexCount; i += 2) { shadowIndices[i * 3] = localSilhouetteIndices[i]; //0 shadowIndices[(i * 3) + 1] = localSilhouetteIndices[i + 1]; // 2 shadowIndices[(i * 3) + 2] = localSilhouetteIndices[i + 1] + 1; //3 shadowIndices[(i + 1) * 3] = localSilhouetteIndices[i]; //0 shadowIndices[(i + 1) * 3 + 1] = localSilhouetteIndices[i + 1] + 1; //3 shadowIndices[(i + 1) * 3 + 2] = localSilhouetteIndices[i] + 1;//1 } allLightShadowVertices.insert(allLightShadowVertices.end(), shadowVertices.begin(), shadowVertices.end()); allLightShadowIndices.insert(allLightShadowIndices.end(), shadowIndices.begin(), shadowIndices.end()); allShadowVertexCount += shadowVertexCount; allShadowIndexCount += shadowIndexCount; return shadowIndexCount; }

and to give you some eye candy, here is the result so far, showing the shadow volumes: http://i.imgur.com/aafJQlC.gifv

(i think some buffer upload is not being synchronized properly, that why some cubes and planes are jumping around :P )

The final plan is to draw these volumes to a stencil buffer and not in the color buffer, and then not illuminate the fragments inside the shadow volume by the light.

But back on topic: the algorithm is still too slow.. as you can see i am running at ~22 FPS with 10 shadowed objects in the scene, 9 being cubes and one being a sphere, and I am only using one light!

Maybe you guys can help me figure out where there are flaws or rooms for optimization in this algorithm. I aswell will update this thread when i come up with anything which shows any improvement.

Thanks!