Q: Space Partitioning Trees.

Hello, I would like to present some Questions i got about space Partitioning in OpenGL.
First and foremost i want to ask about the data structure types that are used to divide the space in slices. I am interested in those:

  • (BSP) K-d trees
  • Octree
  • Quadtree

As far as i know the k-d trees are based on traditional BST with some modification, but the idea is the same. Octrees/Quadtrees , have 4/8 leafs per node. Here are my two Questions:

  1. Can i get more indepth explanation on how exactly do they work? How are they constructed ?
  2. Which tree is useful for what ?

My goal is to make a culling and ray-triangle(mesh) intersection systems in my engine.
As far as i can figure i will need one tree to divide the mesh’s triangles efficently so i can perform ray-triangle intersection on that mesh - fast. Currently i am using bullet physics to perfrom ray-object intersection (and i presume that bullet has advanced structures to make that efficent).
For the culling i have objects i assume i will use the AABB provided by the bullet for each model to test if that box is outside the frustum ? But i also have huge terrains containing alot of vertices and i assume that i should devide that with octrees.
Also isnt it better to push everything on the scene (objects,terrains etc) in a tree and make the culling there.

Edit: Also i was wondering how would an octree actually work with VBO’s, containing vertex data and IBO’s containing indices. Furthermore, is it possible to batch the entire render process, since the different model types (terrain,models,anim-models) use different shaders and materials. I cant very well change them if i have 1 draw batch call. But what if i have an octree for each geometry(models,terrains, etc) is that efficient at all. I assume that this depends on the size of the geometry , and for smaller objects i should use bounding boxes. A Wild guess - the size is actually dependant on the scale of the object relative to the camera frustrum , so in case the model is smaller i should just use bounding box , and if its bigger i should push it into a tree and eventually cull it ?? I am probably totally wrong here, but I Just started reading about space partitioning and most of the things are kinda blury.

Edit2: Also if i want to test triangles of a mesh relative to a ray , what data structure should i use to perform the test fast and how would it be working/constructed

Edit3: Just thinking about the culling i came up with this: Let’s say each model has a built up octree and AABB, you may have tree options:

  • AABB outside frustrum - discard rendering
  • AABB inside frusturm - directly render
  • AABB partly inside frustrum - test the octree to see which parts you should render
    Is that the right approach ?

BSP trees, Kd trees and octrees are all methods of representing a hierarchical subdivision of space. Being hierarchical means that if there’s no intersection between a given node and a region (e.g. the view frustum) or a ray, you know that none of the children intersect either.

BSP trees are the most general. Each node partitions the space with arbitrary plane. Kd trees have the restriction that the planes are axis-aligned, and successive levels cycle through the axes. Octrees are further constrained in that each plane passes through the centroid of parent region, splitting into two equal halves. A consequence of that is that groups of 3 consecutive levels split the space into 8 equal cubes, so you normally consider groups of three planes (one for each axis) simultaneously.

Kd trees and octrees work best when the geometry tends to be axis-aligned. Octrees work best when the geometry is aligned to a regular grid (e.g. geometry generated from voxel data by the Marching Cubes algorithm).

If you use such approaches with arbitrary geometry, you either have to “cut” the geometry along the partition planes so that each portion lies within a specific region, or you have to allow for the fact that each object (or triangle) lies within multiple regions.

Bounding hierarchies don’t have this issue. But unlike space-partitioning trees, the child nodes in a bounding hierarchy typically overlap…

I was looking more for an answer to my previous post , what i dont seem to get is if the nodes arent supposed to contain data(read that on another thread) , but only the leaves how is the testing perfromed ? what do you test with ?
I cant seem to grasp how they work on algorithmic level not geometrical .
For example i want to take a model and push it into an ocree , partition its vertices (triangles?) and perfrom more efficient ray-triangle intersection.

  • how would that work ?
  • what happens if the model is translated in space (or is translatING in space during the run time)
  • etc.

PS: Feel like a good animation or pictures of the process will be useful

The bounds of the region corresponding to the node.

At the root node, the region is the entire scene. Each node defines a plane which splits the region into two halves.

[QUOTE=Asmodeus;1279075]For example i want to take a model and push it into an ocree , partition its vertices (triangles?) and perfrom more efficient ray-triangle intersection.

  • how would that work ?
    [/QUOTE]
    Each node of an octree splits a cube into 8 smaller cubes. You already know the two points where the ray intersects the faces of the parent cube (except at the root node, where these are calculated explicitly). You find where the ray intersects each of the 3 subdivision planes, and ignore any intersection points which lie outside of the bounding cube. You now have up to 5 points (the 2 you started with plus up to 3 new ones), all of which lie on the faces of (some of) the child cubes.

The ray will intersect each child cube on either two faces or none (in which case, it doesn’t intersect that cube at all). For each child cube the ray intersects, you repeat the process, recursively descending the tree. If you only want the nearest intersection, you’d process the child cubes from front to back.

Each of the 8 child cubes lies entirely on one side or another of each of the three partitioning planes. If two cubes are on opposite sides of a given plane, the one on the same side as the viewpoint is in front of the one on the opposite side (more accurately, it cannot be behind; either it’s in front or there is no overlap).

When processing a cube which contains geometry, you test the ray against the geometry. If the ray doesn’t intersect a given cube, it doesn’t intersect any geometry bounded by that cube, so you don’t need to test it.

[QUOTE=Asmodeus;1279075]- what happens if the model is translated in space (or is translatING in space during the run time)
[/QUOTE]
The octree would be in object space. To test for intersection of a ray with a transformed object, you transform the ray into the space of the object and perform calculations in object space. Consequently, the geometry remains fixed relative to the octree.

For arbitrarily-changing geometry, you need to weigh up the effort of reconstructing the tree whenever the geometry changes against the effort saved by using the tree.

IIRC, the idTech engines (which were based upon BSP trees) used BSP trees for static geometry and also for objects which moved along a constrained path (doors, elevator platforms, etc). They treated the swept path as a single object for the purpose of generating the tree, so that only the corresponding subtree(s) needed to be modified at run time as the object moved.

Is there any source you can apply ? A website , tutorial of some sort , or code snippets ?
Currently i managed to build an octree , where for each node i have saved the box’s min/max (similar to AABB so i can use for the ray to test with) , leaves contain the vertices and thats it pretty much.
Wouldn’t i want to just do something like:

  • test node AABB against Ray
  • if its true ,
  • if node has children
  • test against the node’s children
  • else repeat recursivly until leaf is found with the collection of vertices

This however is not very efficient since i get all the nodes that the ray passes thru’ , i just tested that and it works but i detect like 4-5 nodes on average and i loop over their vertices to find the intersection , but thats not optimal.



bool traverse(curr_node,ray)
{
if(ray_aabb(curr_node.aabb,ray))
{
   if(curr_node.subdivided())
     {
        for each child from curr_node
           traverse(child,ray)
     }
   else
     {
      //leaf with vertices collection found 
      //loop for triangles and test with the ray
 
      return true;
      }
}
return false;
}

Sorry to repost , for some reason i can not edit my last Post, so i want to add another question regarding culling the geometry , i will probably use the same Octree structure. From what i gather i should use the same approach, use the node’s AABB to test against the frustrum. If it fits keep digging until you find a leaf with the nodes and push those leaves in a list. Repeat the process. Then use the list , take all vertices from all leaves and push them into a VBO, and send them to be rendered on the GPU. My Question here would be:

  1. Is that efficient ? Rebuilding a new VBO every frame (or whenever the camera/frustum changes position). Isn’t that too expensive
  2. Is the idea the right approach to this problem or there is something else ? Better than this ?

Also any suggestions on my Previous Question about more reliable more efficient way to ray trace tru’ the nodes,. perhaps ?

You wouldn’t normally perform culling at the level of individual polygons. Below a certain level of granularity, testing whether or not to draw will be more expensive than drawing unconditionally. Enabling or disabling groups of polygons can be performed with e.g. glMultiDrawElements().

I dont think you understood what i meant. I dont want to perfrom culling at poly levels. I was talking about perfroming culling against the AABB (of the octree’s leaf) and the frustrum , if it passes get all vertices in the leaf and push them into a VBO , send them to GPU.This will be recursive function, similar to the recursive function above for the ray,so it will find all leaves that intersect the frustrum then i can simply push that data(vertices - from the visible leaves) into a vbo But is rebuilding the VBO even efficient (every frame that is, or when the camera moves)
For example if i have 5-6 visible leaves (each containing 500+/- triangles) pushing that much data into a vbo each frame is … honestly dont know , i think there are people more experienced that can share what’s best.
In case the entire object is visible there is an option where i can also keep all vertices in a vbo (cached when loading the engine) . So i just bind that VBO and send it , without rebuilding the full amount of vertices, but thats besides the condition if random leaves are visible each frame!

… unnecessary. You can use glMultiDrawElements to draw 5-6 separate ranges from the (static) VBO. The octree search would append one element onto the count and indices arrays for each node which passes.

Well, so rebuilding the VBO is not a practice with culling ? What is then ? I am just looking for a definitive answer, method.
Can you elaborate

… unnecessary. You can use glMultiDrawElements to draw 5-6 separate ranges from the (static) VBO. The octree search would append one element onto the count and indices arrays for each node which passes.

Also the other obvious solution to that is maybe to have a VBO/VAO for each leaf , then only rebind the leaves which intersect the frustum. So here arises the Question is rebinding bettter than rebuilding ? Guess yes ?



bool view_Culling(CFrustum& frusrum, COctree *node)
{
if (!node)
    return false;

if (frusrum.CubeInFrustum(node->GetCenter().x, node->GetCenter().y, node-       >GetCenter().z,(node->GetWidth()/2.0f)))
{
    if (node->IsSubDivided())
    {
        view_Culling(frusrum, node->OctreeNodes()[TOP_LEFT_FRONT]);
        view_Culling(frusrum, node->OctreeNodes()[TOP_LEFT_BACK]);
        view_Culling(frusrum, node->OctreeNodes()[TOP_RIGHT_BACK]);
        view_Culling(frusrum, node->OctreeNodes()[TOP_RIGHT_FRONT]);
        view_Culling(frusrum, node->OctreeNodes()[BOTTOM_LEFT_FRONT]);
        view_Culling(frusrum, node->OctreeNodes()[BOTTOM_LEFT_BACK]);
        view_Culling(frusrum, node->OctreeNodes()[BOTTOM_RIGHT_BACK]);
        view_Culling(frusrum, node->OctreeNodes()[BOTTOM_RIGHT_FRONT]);
    }
    else
    {
        //use prebuilded vbo here
        //or push these vertices here in a list later build a dynamic vbo
        //from them
        node->DrawOctree(node);
        return true;
    }
}
return false;
}


I will post this as a final note at what i want to acomplish, using multiple levels of tests.
The idea is that i have some kind of a BVH(maybe octree of AABBs) which will contain the Bounding Volumes of each object in a given scene. Then i will perfrom frustum tests on that structure to determine the nodes (and respectivly the models in that node) that are visible or not. Those that are not visible are simply not drawn, Those that are visible have 2 options:

  • They are Partly visible (their bounding volume is bigger than the frusum but still visible - Terrains,Big Buildings etc…) - Here is where the Octree from above comes into play. They will be tested against an Octree to determine exaclty which parts of them are visible. I wont go deep into my implementation but to clarify i can say that each object has modeldata property , modeldata can be reused among many objects. Each modeldata contains - vbo,ibo, vertices. Now it will contain prebuilt octree from the vertices so i can use it when i need to determine what to draw when the model is partly visible. I will probably test in object space since the vertices are already there and i will need to translate the frustum also in object space.
  • They are entirely visible (their bounding volume is smaller than the frusum) - Directly sent to be drawn
    Currently that is going to be applied to static , non-moving objects.

ALSO: How would i fit multiple objects bounding boxes in an octree (leaf/node?!) for the first test pass, where i should test against the objects as a whole. I cant actually grasp how i can devide the world space and put the bounding boxes(spheres) of the objects inside that octree

I worked out the above Questions after some time. But i was wondering, since my geometry is split between all leaves, so each leaf has some chunk of the terrain vertices (for now just vertices,no indices involved to avoid complications). So i am using flat out buffer where all vertices are ordered according to the indices, ready to be drawn. But since my geometry is split , how sould i specifiy offsets to the glVertexAttribPointer (or rather how would one obtain those offsets per leaf)
I want to draw with just specifying offset to the buffer without having to rebind or recreate a vbo for each leaf every time.
Thanks in Advance !

EDIT: What i have tried so far:

  • prebuilding 1 VBO per Leaf - Relatively Fast but i feel i can do better
  • bind VBO
  • render VBO
  • repeat for every visible leaf in Frustum (getting about 20-25 bind/draw calls on average depends on where i am looking ofc)
  • getting list of all visible nodes , rebuild a new vbo every frame containing all visible node’s vertices - Very Slow (even with gl_dynamic_draw)
    *get a list of all visible nodes
    *push each node’s vertices in an array
    *dynamicly create VBO with all vertices
  • bind the big VBO , render (1 bind , 1 Draw)
  • Here is the next option i want to try But dont really know how
  • bind the main/global Model’s vbo
  • for each visible leaf drawArrays/Elements with some offset and vertex count from the global Model vbo

I realize i cant very well avoid multiple draw calls but i can avoid dynamicly building buffers or rebinding each time i want to draw which maybe just enought to improve the perfromance.

If you’re using OpenGL 3.2 or later, you can draw as many sub-ranges of a VBO as you wish with a single glMultiDrawArrays() call.

With earlier versions, you can generate an index array dynamically then use a single glDrawElements() call; whether or not that’s faster than multiple draw calls will probably depend upon the number of primitives per draw call.

[QUOTE=GClements;1279206]If you’re using OpenGL 3.2 or later, you can draw as many sub-ranges of a VBO as you wish with a single glMultiDrawArrays() call.

With earlier versions, you can generate an index array dynamically then use a single glDrawElements() call; whether or not that’s faster than multiple draw calls will probably depend upon the number of primitives per draw call.[/QUOTE]

Great yea, glMultiDrawArrays may be what i am looking for. But yet again i am stumbled upon finding the offsets in each leaf.

If you use such approaches with arbitrary geometry, you either have to “cut” the geometry along the partition planes so that each portion lies within a specific region, or you have to allow for the fact that each object (or triangle) lies within multiple regions.

Objects don’t have to only exist in terminal nodes. You can just assign that object to the next level up node that completely encloses it. That’s what I do in Leadwerks.

While true, that can result in the object’s bounding box being orders of magnitude larger than is necessary (e.g. an object at the very centre of the octree would belong to the top-level node and thus always be “visible”).

Just out of curiousity. If i have an octree which divides certain geometry in leaves and i have a number of vertices per leaves that represent part of the model. How would one use one general VBO (that contains the entire model’s vertex data) to draw the visible leaves (using offsets in drawElements/Arrays).
I want to avoid having 1 VBO per leaf rather i want to have one general VBO that contains the entire model and just draw ranges from that VBO. But how would i generate the offset needed per leaf ???

The data for each node would be a (start,count) pair (or perhaps multiple such pairs) indicating the range(s) of indices corresponding to the geometry for that node. So to draw a given node, you’d append the node’s values to the indices and count arrays passed to glMultiDrawElements().

However, if you can have geometry belonging to multiple nodes, you’d first want to remove any duplicates prior to rendering.

Yea thats what i figured , but how would one find start. I have the count available, but how about the start. Should i get the first vertex from the node to be the start vertex ?
Also cant i use glDrawElements/Arrays to specify offset such as:
glDrawArrays(GL_TRIANGLES,first,count) ?