
Issue with (compiled) vertex arrays



Fugit
08-05-2001, 01:33 AM
I'm writing a quadtree engine. Until recently I was using a glBegin()/glEnd() pair to do a GL_TRIANGLE_FAN for all of the nodes to be drawn. I was getting around 70fps most of the time.
I decided it would be far more efficient to use vertex arrays, and so I converted all the terrain points to have a vertex, normal, and texcoord array element. Then, I changed the drawing code to basically this:

glBegin(GL_TRIANGLE_FAN);
glArrayElement(Node->TriFanIndex);
glArrayElement(Node->FanPointIndex[0]);
..
..
glEnd();

This works fine, but I only get 40-50fps, rising to 70 when I move right to the edge of the terrain and most of it is frustum-culled out of the quadtree. I thought this was strange, and so added support for the compiled vertex array extension, but this made no visible difference at all.
Please, can you help? :)

-Kieren

Nutty
08-05-2001, 02:03 AM
Don't ever use glArrayElement. If you're gonna use that, you may as well just use standard immediate mode commands.

What you should do is use glDrawElements or glDrawArrays. To do this within a quadtree, you'll have to set up vertex arrays per leaf node of the quadtree, so that you can easily draw an entire array without having to extract the polys in that node.

One possible solution is to create a display list per leaf node, so that when traversing the quad tree, you only have to call the display list. (Though this might use too much vid memory to be practical)

Hope that helps.

Nutty

P.S. The compiled vertex array extension doesn't really do that much, except cache transformed vertices in case they're used again. It's only ever useful in indexed mode. How big this cache is, I don't know. It seems that this extension should be good for multi-pass rendering, provided the cache of transformed vertices is big enuff. Doing a single pass, you're very unlikely to get any speed gain out of it.

[This message has been edited by Nutty (edited 08-05-2001).]

Fugit
08-05-2001, 02:19 AM
Woohoo, glDrawElements() (index-based), just what I was looking for ;) Thanks

Fugit
08-05-2001, 02:39 AM
Hmm, I nearly forgot, I can't use one (big) glDrawElements() call, as I'm using triangle fans at the moment. Calling it once per triangle fan gives me 55-60fps, which is OK, but would it be better to use triangles instead, with fewer (or just one) glDrawElements() calls, but using more memory (for the indices)?
Thanks again,

Kieren

Fugit
08-05-2001, 02:46 AM
Reading over what you said again, Nutty, I have a few things I should add :)
First, every node is a "leaf" node (assuming that means it contains polys) - I use the quadtree for dynamic LOD. Also, because of this LOD, nodes when rendered can have up to 4 extra polys to remove artifacts ('cracks') in the terrain... so I can't really do big static display lists, or arrays, even though the actual terrain is static.
So in your opinion, do you think it would be a better idea to (1) use glDrawElements to draw each triangle fan, or (2) add GL_TRIANGLE indices to an array, then draw that array ... although that uses more memory?
Thanks :)
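
For a rough feel of the memory trade-off in option (2), here is a minimal sketch in C of expanding one fan's indices into a GL_TRIANGLES list. The function name fan_to_triangles and the index layout (centre first, then the rim points in order) are assumptions for illustration, not code from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Expand one triangle fan (centre index first, then the rim indices in
 * order) into a GL_TRIANGLES-style index list.
 * fan_count is the total number of fan indices (centre + rim), >= 3.
 * out must have room for 3 * (fan_count - 2) indices.
 * Returns the number of indices written. */
size_t fan_to_triangles(const unsigned int *fan, size_t fan_count,
                        unsigned int *out)
{
    assert(fan_count >= 3);
    size_t n = 0;
    for (size_t i = 1; i + 1 < fan_count; ++i) {
        out[n++] = fan[0];      /* shared centre vertex */
        out[n++] = fan[i];
        out[n++] = fan[i + 1];
    }
    return n;
}
```

A closed fan of 8 triangles needs 10 indices as a fan and 24 as separate triangles, so the index data grows by about 2.4x, in exchange for being able to submit many fans in a single glDrawElements(GL_TRIANGLES, ...) call.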

Fugit
08-05-2001, 06:25 AM
Dawww, just tried building a big index list and calling glDrawElements()... 35fps on average :(
I've tried building the buffer and then drawing it all in one go, and also drawing + resetting every 1000 or every 100 triangles; that just makes it worse.
Please, someone help? :I

The Legend
08-05-2001, 06:38 AM
Have you ever thought that CVAs may be the wrong way to speed up your engine?

When the scene always changes, using arrays & display lists efficiently gets tricky.

Nutty
08-05-2001, 06:39 AM
Hmmmmmm......

If every node is a leaf node, i.e. it contains polys, then I assume your quadtree is only 1 level deep?

Is this right?

Nutty

P.S. Why not use Tri-strips instead of Tri-fans?



[This message has been edited by Nutty (edited 08-05-2001).]

Fugit
08-05-2001, 06:50 AM
Actually, the quadtree is up to 7 levels deep.
Basically, for each node, I calculate an LOD, which corresponds to a recursion depth.
If I want to draw a node at LOD 4, I simply recurse no further than depth 4 for that node, and draw the polygons in that node - which are in fact an approximation (pretty much) of the nodes below it..
The problem with tri-strips is, well, I can't get one to approximate a terrain with different LODs... can I? :eek:

Fugit
08-05-2001, 06:54 AM
I know this will be helpful in describing what I do at the moment (doesn't include recursion code).
You should also note that I call UpdateQuadtree() first, which does all the LOD, culling, etc., then RenderQuadtree(), which traverses and renders the quadtree.




if (*Node->EdgeIndices[0] > 1)
{
    ArrayIndices[NumArrayIndices++] = Poly->FanBaseIndex;
    ArrayIndices[NumArrayIndices++] = Poly->FanPointIndices[0];
    ArrayIndices[NumArrayIndices++] = Poly->ExtraFanPointIndices[0];

    ArrayIndices[NumArrayIndices++] = Poly->FanBaseIndex;
    ArrayIndices[NumArrayIndices++] = Poly->ExtraFanPointIndices[0];
    ArrayIndices[NumArrayIndices++] = Poly->FanPointIndices[1];
}
else
{
    ArrayIndices[NumArrayIndices++] = Poly->FanBaseIndex;
    ArrayIndices[NumArrayIndices++] = Poly->FanPointIndices[0];
    ArrayIndices[NumArrayIndices++] = Poly->FanPointIndices[1];
}
if (*Node->EdgeIndices[1] > 1)
{
    ArrayIndices[NumArrayIndices++] = Poly->FanBaseIndex;
    ArrayIndices[NumArrayIndices++] = Poly->FanPointIndices[1];
    ArrayIndices[NumArrayIndices++] = Poly->ExtraFanPointIndices[1];

    ArrayIndices[NumArrayIndices++] = Poly->FanBaseIndex;
    ArrayIndices[NumArrayIndices++] = Poly->ExtraFanPointIndices[1];
    ArrayIndices[NumArrayIndices++] = Poly->FanPointIndices[2];
}
else
{
    ArrayIndices[NumArrayIndices++] = Poly->FanBaseIndex;
    ArrayIndices[NumArrayIndices++] = Poly->FanPointIndices[1];
    ArrayIndices[NumArrayIndices++] = Poly->FanPointIndices[2];
}
if (*Node->EdgeIndices[2] > 1)
{
    ArrayIndices[NumArrayIndices++] = Poly->FanBaseIndex;
    ArrayIndices[NumArrayIndices++] = Poly->FanPointIndices[2];
    ArrayIndices[NumArrayIndices++] = Poly->ExtraFanPointIndices[2];

    ArrayIndices[NumArrayIndices++] = Poly->FanBaseIndex;
    ArrayIndices[NumArrayIndices++] = Poly->ExtraFanPointIndices[2];
    ArrayIndices[NumArrayIndices++] = Poly->FanPointIndices[3];
}
else
{
    ArrayIndices[NumArrayIndices++] = Poly->FanBaseIndex;
    ArrayIndices[NumArrayIndices++] = Poly->FanPointIndices[2];
    ArrayIndices[NumArrayIndices++] = Poly->FanPointIndices[3];
}
if (*Node->EdgeIndices[3] > 1)
{
    ArrayIndices[NumArrayIndices++] = Poly->FanBaseIndex;
    ArrayIndices[NumArrayIndices++] = Poly->FanPointIndices[3];
    ArrayIndices[NumArrayIndices++] = Poly->ExtraFanPointIndices[3];

    ArrayIndices[NumArrayIndices++] = Poly->FanBaseIndex;
    ArrayIndices[NumArrayIndices++] = Poly->ExtraFanPointIndices[3];
    ArrayIndices[NumArrayIndices++] = Poly->FanPointIndices[0];
}
else
{
    ArrayIndices[NumArrayIndices++] = Poly->FanBaseIndex;
    ArrayIndices[NumArrayIndices++] = Poly->FanPointIndices[3];
    ArrayIndices[NumArrayIndices++] = Poly->FanPointIndices[0];
}

if (NumArrayIndices / 3 >= INDEX_ARRAY_SIZE)
{
    // draw, reset
    glDrawElements(GL_TRIANGLES, NumArrayIndices, GL_UNSIGNED_INT, ArrayIndices);
    NumArrayIndices = 0;
}


I'll cry soon.. *sniff*

-Kieren
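
The four near-identical branches in the post above could be folded into a loop. The sketch below is one hedged way to do that. FanPoly, IndexBatch and emit_node are hypothetical names, and edge_split[i] stands in for the `*Node->EdgeIndices[i] > 1` test. Unlike the posted code, it checks for room before appending; that is a deliberate change, since the post does not show how large ArrayIndices is relative to INDEX_ARRAY_SIZE.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the fields used in the posted code. */
typedef struct {
    unsigned int FanBaseIndex;
    unsigned int FanPointIndices[4];
    unsigned int ExtraFanPointIndices[4];
} FanPoly;

/* Hypothetical batch of GL_TRIANGLES indices. */
typedef struct {
    unsigned int *indices;
    size_t count;
    size_t capacity;
} IndexBatch;

/* Append one node's triangles to the batch. edge_split[i] is nonzero when
 * edge i needs its extra point, which splits that edge's triangle in two.
 * Returns 1 on success, or 0 (writing nothing) if the batch lacks room. */
int emit_node(IndexBatch *b, const FanPoly *p, const int edge_split[4])
{
    assert(b != NULL && p != NULL && edge_split != NULL);

    size_t needed = 0;
    for (int i = 0; i < 4; ++i)
        needed += edge_split[i] ? 6 : 3;
    if (b->count + needed > b->capacity)
        return 0;

    for (int i = 0; i < 4; ++i) {
        unsigned int first = p->FanPointIndices[i];
        unsigned int last  = p->FanPointIndices[(i + 1) % 4];
        if (edge_split[i]) {
            unsigned int mid = p->ExtraFanPointIndices[i];
            b->indices[b->count++] = p->FanBaseIndex;
            b->indices[b->count++] = first;
            b->indices[b->count++] = mid;
            b->indices[b->count++] = p->FanBaseIndex;
            b->indices[b->count++] = mid;
            b->indices[b->count++] = last;
        } else {
            b->indices[b->count++] = p->FanBaseIndex;
            b->indices[b->count++] = first;
            b->indices[b->count++] = last;
        }
    }
    return 1;
}
```

When emit_node returns 0, the caller would draw the batch with glDrawElements(GL_TRIANGLES, count, GL_UNSIGNED_INT, indices), reset count to 0, and call emit_node again.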

08-05-2001, 06:55 AM
hi

perhaps you should try to split your terrain up into several "vertex-buffers". some buffers will never be touched in a particular frame. that saves some memory bandwidth, i think...

if you're using a GPU (geforce, radeon) use the appropriate extensions (fences for geforce) to make use of parallelism.

just my 2 cents

freakyboy

Fugit
08-05-2001, 12:01 PM
That made no difference either.. grr :|

...*sniff*...*sob*...

zed
08-05-2001, 12:07 PM
basically you're not drawing enuf polygons in one call.
using fans u would do, i assume, 8 tris per call.

try to draw as many tris per call as u can (but not too many; this varies, but if u stick to under 4000 vertices u should be ok)

without using extensions
fastest is glDrawArrays()
then glDrawElements()
..
begin..end()

BUT if the geometry changes (ROAM) using begin..end might be quickest.

CVAs incur a performance hit; they should only be used if u draw the geometry with multiple passes.

in my game i believe i render blocks of 4x4 quads. rendered as tris that is 4x4x2 = 32 tris per call in one degenerate tri_strip, but only cause i change textures quite a bit + cause i use quite fine culling.

from my experiments (which ive done quite a bit)

with these sized blocks 8x8,16x16,32x32
a 16x16 block is the quickest. 16x16x2 = 512 tris plus degenerates
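
As an illustration of the "one degenerate tri_strip per block" idea, the sketch below builds the index list for an n x n block of quads, stitching the rows together with repeated indices. It assumes the block's vertices are stored row-major in an (n+1) x (n+1) grid. The function names are made up for this example, and the winding that results depends on how that grid is oriented.

```c
#include <assert.h>
#include <stddef.h>

/* Index count for an n x n quad block drawn as one strip:
 * n rows of 2*(n+1) indices, plus 2 stitching indices between rows. */
size_t block_strip_index_count(unsigned int n)
{
    assert(n >= 1);
    return (size_t)n * 2u * (n + 1u) + 2u * (n - 1u);
}

/* Fill out[] with GL_TRIANGLE_STRIP indices for the whole block.
 * out must hold block_strip_index_count(n) entries.
 * Returns the number of indices written. */
size_t build_block_strip(unsigned int n, unsigned int *out)
{
    assert(n >= 1);
    const unsigned int w = n + 1;   /* vertices per grid row */
    size_t k = 0;
    for (unsigned int r = 0; r < n; ++r) {
        if (r > 0) {
            /* Degenerate stitch: repeat the last index of the previous
             * row, then the first index of this row. The triangles this
             * creates have zero area and are not rasterized. */
            out[k] = out[k - 1];
            ++k;
            out[k++] = r * w;
        }
        for (unsigned int c = 0; c < w; ++c) {
            out[k++] = r * w + c;         /* vertex on row r     */
            out[k++] = (r + 1) * w + c;   /* vertex on row r + 1 */
        }
    }
    return k;
}
```

For n = 16 this gives 574 indices, i.e. 512 real triangles plus 60 zero-area ones, all submitted with a single glDrawElements(GL_TRIANGLE_STRIP, ...) call.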

08-05-2001, 01:03 PM
I just (finally) solved a similar problem with Bezier patches. I'm also doing dynamic LOD with crack fixing at the edges, and I wanted to figure out how to do this without changing my vertex arrays or sending unnecessary verts to OpenGL.

After much experimentation, I came up with an algorithm that I like :) Instead of storing the verts for each patch linearly (top to bottom, left to right, or whatever), I store them in the order that they are used, from lowest LOD to highest. For example, the lowest LOD just uses the four corner verts, so those are 0, 1, 2 and 3 in the array. The next LOD adds five internal verts, so those are 4, 5, 6, 7 and 8 in the array. And so on.

This way, your indices for any given LOD will not be spread randomly around the vert array (which results in unnecessary transformations).

I don't know if this will be helpful to you, and the details of implementation are non-trivial (at least it took me a while to work out :-) But in short, figuring out how to use vertex arrays efficiently with dynamic tessellation is tricky, so I sympathize :)
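
One way to read the ordering described above, sketched in C: for a square grid with (2^levels + 1) vertices per side, where each LOD halves the spacing, assign new array positions level by level, so that LOD l only ever references positions 0 .. (2^l + 1)^2 - 1. The grid shape, the doubling between LODs, and the names build_lod_order and lod_vertex_count are all assumptions; the poster's Bezier patch code is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Number of vertices referenced by LOD level l (0 = coarsest). */
size_t lod_vertex_count(unsigned int l)
{
    size_t side = ((size_t)1 << l) + 1;
    return side * side;
}

/* Fill remap[y * side + x] with the position vertex (x, y) should take in
 * a vertex array ordered from lowest LOD to highest.
 * remap must hold side * side entries, where side = 2^levels + 1.
 * Returns the total number of vertices assigned. */
size_t build_lod_order(unsigned int levels, unsigned int *remap)
{
    assert(levels < 15);   /* keeps side * side within unsigned int range */
    const size_t side = ((size_t)1 << levels) + 1;
    const unsigned int UNSET = ~0u;

    for (size_t i = 0; i < side * side; ++i)
        remap[i] = UNSET;

    unsigned int next = 0;
    for (unsigned int l = 0; l <= levels; ++l) {
        /* Grid spacing at this level; vertices already claimed by a
         * coarser level keep their earlier position. */
        size_t step = (size_t)1 << (levels - l);
        for (size_t y = 0; y < side; y += step)
            for (size_t x = 0; x < side; x += step)
                if (remap[y * side + x] == UNSET)
                    remap[y * side + x] = next++;
    }
    return next;
}
```

With levels = 2 the four corners land at positions 0 to 3 and the next level adds positions 4 to 8, matching the counts in the post.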

Fugit
08-05-2001, 11:00 PM
Well, thanks for all your input... but I've decided to create my own mixture :)
Basically, I will fill up a buffer with triangle fans - probably by specifying the base point, number of fan points, then the fan points. This is because I tried using glBegin()/glEnd() as an alternative to glDrawElements() (i.e., I drew the buffer myself) with a slight performance INCREASE :)
So, I thought, why not just optimize that more and use fans? :) I'll try that when I get back from London today, and I'll try to post the results here... I'm sure you're all simply dying to know the outcome.. :|
Well, bye bye :)

Thanks again everyone

-Kieren
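
The buffer layout described in the last post (base point, number of fan points, then the fan points) could look roughly like this in C. pack_fan and next_fan are hypothetical helpers, and the drawing loop in the trailing comment is only a guess at how such a buffer might be consumed with glBegin()/glEnd().

```c
#include <assert.h>
#include <stddef.h>

/* Append one fan record [base, num_points, points...] at offset `used`.
 * Returns the new used size, or `used` unchanged if the record would not
 * fit, in which case the caller should draw and reset the buffer first. */
size_t pack_fan(unsigned int *buf, size_t used, size_t capacity,
                unsigned int base, const unsigned int *points,
                unsigned int num_points)
{
    assert(num_points >= 2);   /* a fan needs at least one triangle */
    if (used + 2 + (size_t)num_points > capacity)
        return used;
    buf[used++] = base;
    buf[used++] = num_points;
    for (unsigned int i = 0; i < num_points; ++i)
        buf[used++] = points[i];
    return used;
}

/* Read the record at *pos and advance *pos past it.
 * Returns 1 if a fan was read, 0 at the end of the buffer. */
int next_fan(const unsigned int *buf, size_t used, size_t *pos,
             unsigned int *base, const unsigned int **points,
             unsigned int *num_points)
{
    if (*pos + 2 > used)
        return 0;
    unsigned int n = buf[*pos + 1];
    if (*pos + 2 + (size_t)n > used)
        return 0;              /* truncated record: stop */
    *base = buf[*pos];
    *num_points = n;
    *points = buf + *pos + 2;
    *pos += 2 + (size_t)n;
    return 1;
}

/* Possible drawing loop (assumes vertex arrays are already enabled):
 *
 *   size_t pos = 0;
 *   while (next_fan(buf, used, &pos, &base, &pts, &n)) {
 *       glBegin(GL_TRIANGLE_FAN);
 *       glArrayElement((GLint)base);
 *       for (unsigned int i = 0; i < n; ++i)
 *           glArrayElement((GLint)pts[i]);
 *       glEnd();
 *   }
 */
```

Storing the count in the record keeps fans of different sizes (with or without the extra crack-fixing points) in one flat buffer without a separate offsets table.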