Octrees+Frustum Culling+Occlusion Queries

I’m currently working on an engine that uses octrees with frustum culling.On top of that,I use NV_OCCLUSION_QUERY.What I do is traverse the tree
top-to-down and front-to-back,draw the bbox of the node and do an occlusion query.If the node does not pass the query,I reject it and all of its children.If it passes,i continue down the tree and render any terminal nodes I encounter.
The problem is,when the camera is inside the node,
only the back face of the bbox is drawn(the front face is clipped agains the near plane).There are some cases where the back face is completely obscured,so the node is rejected,but still some children of this node closer to the camera are visible.I end up rejecting parts of the world that are supposed to be drawn.
Part of this problem is the fact that polys can be shared by two or more nodes.
I could try to split the polys,so every node is an indepedent entity,but I really would not want that.
I’m wondering if anyone has encountered this.I’m thinking of using NV_DEPTH_CLAMP to draw all the faces of the bbox,but i’m not sure this is completely right.I would appreciate any opinions from people with experience about this stuff.

Perhaps you could do a special case if the camera is inside the current node. Just assume the node is visible. Should be easy enough :slight_smile:
</font><blockquote><font size=“1” face=“Verdana, Arial”>code:</font><hr /><pre style=“font-size:x-small; font-family: monospace;”>void
Node::render()
{
if (::camera.is_inside(this)

Sorry, the forum software doesn’t like me, or it doesn’t like my posting. I’m definitely logged in. Don’t know what’s going on.

Originally posted by mikeman:
The problem is,when the camera is inside the node,
only the back face of the bbox is drawn(the front face is clipped agains the near plane).There are some cases where the back face is completely obscured,so the node is rejected,but still some children of this node closer to the camera are visible.I end up rejecting parts of the world that are supposed to be drawn.

I’m having trouble seeing how the backfaces of this node’s bbox could be totally obscured but children are still visible. Maybe you could elaborate. What are the backfaces being obscured by? (presumably, other nodes could not be “in front” of this node since we’re inside it).

BTW, for debugging purposes, you can insert a translation into the projection matrix to get a birds eye view of what’s being culled and why. Suggest drawing the frustum too and inspecting what’s really prevent those backfaces from being counted as visible. I suspect the problem is something else, but that’s just a gut impression.

Originally posted by mikeman:
Part of this problem is the fact that polys can be shared by two or more nodes.
I could try to split the polys,so every node is an indepedent entity,but I really would not want that.

That’s a thorny issue with quad/octtrees. Sharing objects across nodes can easily lead to drawing the objects multiple times as well as missing them. My suggestion is to split if you can, but the simpler solution is to propogate objects up to the parent node until they no longer span nodes.

Avi

Well, I’m also not sure why the back faces of the object would be obscured but the front faces wouldn’t (I would definitely suggest depth clamp).

As for the polygon sharing thing, yeah, that is a thorny issue. However, the “correct” way to solve that is to promote what ever is crossing the boundary to the next level (this totally doesn’t work for subdividing terrain). There are other ways to solve that, like loose octrees (Game Programming Gems 2 I think).

The other method (the only one I can think of while staying inside octants… i.e. with terrain) is to just subdivide your geometry along seems (there are other trade offs to this approach). As long as you aren’t trying to fit too few polygons into a octant it would probably be fairly successful (in fact, it is, I used that method in my last project).

EDIT: My brain is slow, I repeat too many things… sorry, hope it helps anyways.

Originally posted by zeckensack:

Perhaps you could do a special case if the camera is inside the current node. Just assume the node is visible.

This would be my first guess. The same would be true for a bsp traversal in portal rendering or for a vis matrix evaluation; if the camera is in the leaf, then draw the leaf. It’s simply a trivial acceptance case.

You could avoid multiple renderings of shared leaf data in the tree by simply embedding an integer counter in each render geometry. Each time you visit the geometry, bump the counter by 1, and only draw it if the counter equals the frame count, for instance. Depending on the geometry, this could be more efficient than furthur subdivision. And as you mentioned, that is not your first choice.

Im planing to do similar think with octrees. First, I have func that delvier 3 arrays of oct leafs (near, medium and far) based on frustum and distance (center of oct-leaf box) from camera.

Pass 1:
Render will draw near, medium then far leasf and for each oct leaf it will test occlusion. Discard all “invisible” leafs. This is actually “ZFill and ambient” pass.

Pass 2: Generate shadow map from light and render over existing framebuffer shadow map from light using glBlendFunc(GL_ONE, GL_ONE). In this pass I’ll also do occlusion query and eliminate rest of “invisible” leafs which are not detected in first pass (because arrays are not sorted).

Pass N:
Same as pass 2 but without occlusion test

Final pass:
Render scene with shaders and textures.

In all other passes I’ll render only “visible” oct leafs.

This algo can be more optimized (sorting near, medium and far leafs by distance). Or you can use expensive shaders only on near leafs and simple shaders for medium and far leafs.

yooyo

Hey -

The whole Occlusion queries for visibility in a tree traversal - isn’t it a bad thing?

I mean, when starting at the top of the tree, 8 nodes are there, they are submitted in an instant, and then we need the results right after, to figure out which of them was visible.

This effectively forces the gpu to catch up with us, which should be a bad thing?

Or maybe I am missing something?

Cyranose: My suggestion is to split if you can, but the simpler solution is to propogate objects up to the parent node until they no longer span nodes.
What most people do is use loose octrees, where a thing is classified based on its center, but the octree nodes are extended to include the entire things (and thus some area covered by more than one octree branch).

Querying becomes slightly more complex, and insertion/update needs to either dynamically recalculate branch extents, or just use a fixed extent inflation; at some point you don’t push an object further down no matter what (i e, say that you never accept more than 5% inflation). Pushing a world-size sphere centered on the origin down would be death :slight_smile:

But on the whole, everybody’s using the loose octrees these days, because they’re simple to implement, don’t require duplication of entitities, and perform very well in real-world situations.

Originally posted by jwatte:
[QUOTE]
But on the whole, everybody’s using the loose octrees these days, because they’re simple to implement, don’t require duplication of entitities, and perform very well in real-world situations.

Well, not everyone. :slight_smile:

I think loose octrees are a bit of a misnomer anyway. If, on the one hand, you customize the division planes of a fixed-size cell to something other than the half-way point, then fine. There are other names for that sort of graph, but “loose octree” is descriptive enough to fit.

But if you’re making a hierarchy of AABBs around objects and collections of objects with no big restrictions on the outer bounds of those boxes, then that’s an AABB tree, which is something else and nothing new. At that point, there’s no huge reason to limit each node to eight children either. There will be overlaps, and so multiple branches may need to be followed to answer point queries (which was one thing we are always trying to avoid with strict octrees).

For those of us with bandwidth issues on the plate, strict octrees and quadtrees also have a huge bonus – paths through the tree are completely implicit. Nodes don’t need to know anything except who their four or eight children are and which objects are fully contained. AABB trees, otoh, offer much more flexibiity for dynamic scenes.

Avi

I don’t think you’ve quite gotten the essence of what the loose octree is about. It’s different from the two suggestions you make.

First, it uses a fixed division plane. It doesn’t move division planes like, say, a KD-tree.

Second, it’s not an arbitrary collection of boxes. At its base, it’s just an octree, with fixed division planes.

The addition is that each node contains an inflation factor, which is either a fixed percentage, or the size of the biggest thing that’s inserted in that node or one of its children, or the maximum inflation of any of its children (which is the tightest fit, but a pain to implement).

When you update the tree, you make sure to update the node inflation factors appropriately. Things are classified based on their center point, using un-inflated octree node values, so it’s a pure octree that way.

When you query a tree, you follow any node whose inflated size intersects the query volume.

If you still don’t get it, I suggest going to some of the beginner tutorials on the web, which have code.

AABB trees have big problems when it comes to figuring out good clustering of sub-children, especially dynamically. Loose octrees are well defined and well behaved in this respect, just like regular octrees. The gain is that, at the cost of keeping an inflation factor per node, you gain the ability to store single instances of objects, never having to split or copy objects that straddle boundaries, while still being able to push them reasonably far down the tree.

Originally posted by jwatte:

If you still don’t get it, I suggest going to some of the beginner tutorials on the web, which have code.

What I don’t get is why you apperently feel you need to be insulting to make a largely algorithmic point. You’ve done this before and I wish you’d stop. It serves no one.

On the point, which is now largely off the original question, there is little functional difference between taking an AABB with, say, a min/max position vs using an inflated cell, with a min/max XYZ as derived from its normal size and an inflation factor. Perhaps a few bytes of storage. (I don’t think I need to explain that a cell of an octtree, inflated or not, as long as it’s not arbitrarily rotated, is a subclass of AABB.)

But if the boxes can be inflated over each other, the boxes can overlap. If the boxes can overlap, then multiple branches must be followed when a query point occurs in an overlap zone. That’s my main point, which I think you’ve already conceded.

The degree of overlap will largely determine how effective or ineffective the “loose octree” is vs other techniques, since, as you know, one of the main points of any tree is reducing the number of nodes that must be searched. And, as I said, one of the main benefits of further relaxing things to the point of properly calling this an AABB tree, is that cells in the tree can move more freely, without having to expand and contract nodes, which may cause “ripples of change” to propogate up and down the system more frequently than is needed.

I’m sorry if you find me attacking; I was attempting to clarify some points about loose trees which you did not address in your post.

We were talking about querying volumes, not points, through the trees, which means you need to query some number of neighboring nodes anyway in the case where the nodes split, as volumes can span boundaries in any kind of tree. The increase in the number of nodes you need to traverse in a loose octree is, at worst, a small constant.

You still get O(lg N) queries amortized cost, which is the big gain. Additionally, you save substantial memory compared to general AABB trees, and you save substantial computation, too, in the case where the tree is dynamic. Runtime for modern algorithms is largely dominated by cache misses, so saving “a few bytes” per nodes can make a big difference.

Originally posted by jwatte:
[b]I’m sorry if you find me attacking; I was attempting to clarify some points about loose trees which you did not address in your post.

We were talking about querying volumes, not points, through the trees, which means you need to query some number of neighboring nodes anyway in the case where the nodes split, as volumes can span boundaries in any kind of tree. The increase in the number of nodes you need to traverse in a loose octree is, at worst, a small constant.

You still get O(lg N) queries amortized cost, which is the big gain. Additionally, you save substantial memory compared to general AABB trees, and you save substantial computation, too, in the case where the tree is dynamic. Runtime for modern algorithms is largely dominated by cache misses, so saving “a few bytes” per nodes can make a big difference.[/b]
Thanks. Okay. Interesting conversation. I agree that “loose octrees” are a decent balance between strict octrees and completely generalized AABB trees. I can understand the motivation for the hybrid. However, I’d suggest that once we’re talking about volume vs. tree queries, it could still be cheaper in memory and instructions to propogate spanning objects up the tree until they’re fully contained in strict octnodes. Strict octrees will still be smaller than loose ones, plus you have to count the cache misses for the extra branches you traverse to solve a query.

A side note: I’m sure you’ve optimized the hell out of your strict oct/quadtrees, but my experience has been that people often implement them by testing a volume against each child node’s box, where all that’s really needed is two or three axis-aligned half-space tests (i.e., add/sign tests) per parent and everything else is implicit.

With independently bloated children, it swings back to at two or three extra half-space tests per child (or at worst, a box test per child), which would seem about as bad as an AABB tree in terms of CPU branching cost. In the “propogate objects up to fix spans” method, the only cost is increased false-positives from those formerly spanning objects. But even with an earth-sized tree, for example, I found that cost to be insignificant compared to the savings.

I’ll try comparing my optimized ‘strict’ tree to a loose one next chance I have – it’ll be interesting to see the comparison. FWIW, my experience with really big trees is that bloating the test volume just a little (akin to bloating the cells just a little) had a big mutiplier on the O(logN) factor in terms of real time, which is why I’m skeptical of the loose approach. But it’s best to test.

BTW, not to throw a blanket on this, but I’m not entirely sure how loose octrees would work for the original question of rendering the tree as occlusion queries. The overlaps would seem to be problematic, no? (not that I’m expecting the occlusion query to be any big win anyway…)

Cyranose:
So you haven’t implemented a loose Octree scheme youself? Be sure they work as advertised.

JWatte:
However I didn’t know the variant where the children can be independently bloated. The original loose octree suggestion by Thatcher Ulrich was to inflate the size of each octree node by 1/2 the distance of their center points. This way the benefits are:

  • Which node to store dynamic objects is determined and implicit. The level in the hierarchy is given by the size of the object.

  • Sizes and positions of nodes are implicit.

  • Each parent node still is a bounding volume of all their childs.

I would be happy to be enlighted by new theory though :slight_smile:

I read a post or two about Thatcher’s idea back in (maybe) 1999 and 2000 and did a quick search to refresh myself recently. To answer your question: I’ve implemented a half dozen variants on AABB and forms of ‘strict’ or ‘loose’ trees (not his variant). So I’ve tried to discuss options and variations on a theme, tradeoffs and costs.

I like your explanation of his variant. Having a fixed bloat down the entire tree would give most of the benefits a fixed scheme while being a bit more tolerant of spans. Would still seem to require some up propogation of objects that exceed even the bloat, though. Is that correct?

I haven’t implemented it, but having objects that are “bigger” than an AABB the size of the bloat amount (at what ever level you are dealing with) would mean they are not guaranteed to stay in that level.

I think the idea is to boost objects up one layer of the tree if they are bigger than the bloat, because then they are always guaranteed to be inside at least one node if they are smaller. It would cause more objects to be in higher nodes, but still way less than otherwise… because they can stay in one level their entire lives. Otherwise, it’s still the same as an octree that’s a lot less strict (which is good too).

Anyone have any suggestions for references for these types of algorithms? I need to implement something at some point soon, and many of them seem straight forward, but I’m sure there are gotchas.

It feels like we’re on the same page at this point.

Including the fact that trying to use the results of a render query during the same frame as the query was issues will cause too much stalling to be commonly useful. Or at least commonly high-performance.

Just out of curiosity, do you also use loose octrees for your collision detection or some other kind of parallel tree structure?

If you want to test if a point collides something, you can send it down the ‘rigid’ dividing planes of the loose octree and it falls out into a leaf. But because the boundary volumes are loose, you need to check neighboring leafs also to see if their ‘expanded’ volumes contain the point.

And for raycasting. In rigid octrees you have the generalization of bresenhams algorithm to pick the next cell to test, but with loose octrees you end up having to test a lot of neighbors, yes?

I suppose there must be some quick way to lookup neighbors without having to follow pointers around…

Originally posted by Cyranose:
Would still seem to require some up propogation of objects that exceed even the bloat, though. Is that correct?
Nope, the hiearchy level is determined by the size of the object (unless objects are dynamically changing sizes that is, my maybe you can have an upper bound on size for them).

So you go:

// get level from the exponent of the size of the object

int level;
frexp( c * obj.bounding_sphere_radius, &level );

push_object_down_tree( obj, level );

where c is some constant depending on the size of the root node.