trees and polygons and tnl

Hi!

I just got my hands on a T&L-capable card (ELSA Gladiac), so I tried to code my NVIDIA tree routine, but:
I could not raise the polygon count above 35000. Beyond that, everything slows down dramatically.
I watched the results in TreeMark: about 128000 polygons at 21 fps. HOW?
What should I do to create something that fast?
How can I take advantage of hardware T&L? Everything I have read says not to do anything special because the driver will take care of it. But my code is so slow.

Here’s what I do:
I create a cube from 6 quads. I have a simple routine that builds a 3D matrix of cubes (let’s say 10x10x10).
I put it all into a vertex list and simply call glDrawArrays.

(I tried to lock the array, but there was no significant speedup).

By the way I use the 5.22 detonator driver.

thx.

Go to the NVIDIA site and grab their T&L FAQ. There might be something you are doing wrong.

There is just one type of data that is hardware accelerated right now, and it is:

3 floats for vertices, 3 floats for normals, 2 floats for uv, and 4 of something (I don’t remember) for colors

The restriction you mentioned only applies to the format which is accelerated using compiled vertex arrays (two sets of s,t texture coordinates, 4 bytes for the color, and one x,y,z point), though it’s possible more formats are supported in recent driver releases. As I understand it, compiling is a no-op for all other formats, but every format is hardware accelerated.
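To make that format concrete, here is a rough sketch of one way such a vertex could be laid out in memory. The struct name, the field order, and the choice to interleave everything in one struct are assumptions for illustration only; the post above names the components, not their layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interleaved vertex: one x,y,z point, 4 color bytes,
   and two sets of s,t texture coordinates. Field order is assumed. */
typedef struct {
    float x, y, z;          /* position: 3 floats */
    unsigned char rgba[4];  /* color: 4 bytes     */
    float s0, t0;           /* texture unit 0     */
    float s1, t1;           /* texture unit 1     */
} PackedVertex;

/* Sanity check: 12 + 4 + 16 bytes, assuming 4-byte floats. */
void check_packed_vertex_layout(void)
{
    assert(sizeof(PackedVertex) == 32);
}
```

With a layout like this, sizeof(PackedVertex) would be passed as the stride to glVertexPointer, glColorPointer and glTexCoordPointer before locking the array.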

There are lots of things that might keep you from getting optimal performance on the GeForce. Like enabling 8 spotlights, or using two-sided lighting, or certain texture coordinate generation modes. Are you using triangle or quad strips? They’re faster than quad or triangle lists. Definitely take a look at the GeForce OpenGL T&L Performance FAQ here:
http://www.nvidia.com/Marketing/Developer/DevRel.nsf/FAQ_Frame
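As a quick illustration of why strips help, the sketch below counts how many vertices have to be sent for the same number of quads in each mode. The function names are made up for this example.

```c
#include <assert.h>

/* Vertices submitted when n quads are drawn as independent GL_QUADS:
   every quad carries its own 4 vertices. */
int quad_list_vertices(int n)
{
    assert(n >= 0);
    return 4 * n;
}

/* Vertices submitted when the same n quads form one GL_QUAD_STRIP:
   the first quad needs 4 vertices, each further quad adds only 2. */
int quad_strip_vertices(int n)
{
    assert(n >= 0);
    return n > 0 ? 2 * n + 2 : 0;
}
```

For 10 connected quads that is 40 vertices as a list versus 22 as a strip. The saving only applies where neighbouring quads really share an edge and the shared vertices can use the same normal, color and texture coordinates, so separate flat-shaded cubes will benefit less than a smooth surface.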

–Chris

Originally posted by mandroka:
[b]Here’s what I do:
I am creating a cube from 6 quads. I have a simple routine to make a 3d matrix of cubes (let’s say 10x10x10).
I put it into a vertex list and simply call
drawArrays.

(I tried to lock the array, but there was no significant speedup).

By the way I use the 5.22 detonator driver.

thx.

[/b]

From what I know, you should use triangles, not quads, for your object. Last I heard, hardware has better acceleration for triangles than for quads.
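If you want to try triangles without rebuilding the vertex data, one possible approach is to keep the quad vertices as they are and generate an index list that splits each quad into two triangles. This is only a sketch; the helper name is hypothetical and it assumes the vertices are stored in GL_QUADS order (4 consecutive vertices per quad).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: for quad_count quads stored as consecutive
   vertices, write 6 indices per quad (two triangles: 0,1,2 and 0,2,3).
   out must have room for quad_count * 6 entries.
   Returns the number of indices written. */
size_t quads_to_triangle_indices(size_t quad_count, unsigned short *out)
{
    size_t q, n = 0;

    assert(out != NULL);
    assert(quad_count <= 16384);  /* keep indices within unsigned short */

    for (q = 0; q < quad_count; q++) {
        unsigned short base = (unsigned short)(4 * q);
        out[n++] = base;
        out[n++] = (unsigned short)(base + 1);
        out[n++] = (unsigned short)(base + 2);
        out[n++] = base;
        out[n++] = (unsigned short)(base + 2);
        out[n++] = (unsigned short)(base + 3);
    }
    return n;
}
```

The result can then be drawn with glDrawElements(GL_TRIANGLES, count, GL_UNSIGNED_SHORT, indices) instead of glDrawArrays(GL_QUADS, ...). Whether this is actually faster on a given driver is something to measure, not assume.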

On the Detonator2?? drivers, I had various hardware lockups in Win2k, both with my own development and with Quake3A.

There’s an nVidia-specific extension that lets you put your verts into video-memory or AGP memory, I believe that will get you a speedup. The tradeoff is that you have to exercise a little care when using it - reading the verts may be slow, and modifying them is bad.

Have a look at the spec here: http://oss.sgi.com/projects/ogl-sample/registry/NV/vertex_array_range.txt

A simpler option might be to stick your tree in a display-list. The tradeoff here is that you can’t edit a display-list at all once you’ve created it.

Mike F

[This message has been edited by Mike F (edited 06-12-2000).]

Thanks for the replies.

I’ve made a driver update (grabbed from elsa).
I’ve found several bugs in the 5.22, but the new one is working well.

But that was not a solution for me. So I created a really small vertex list for only one cube, locked it, and rewrote the cube matrix generator. Voilà! It worked.

Here is the code:

void drawScene()
{
    int i=0, j=0, k=0;

    glPushMatrix();
    glTranslatef(0,0,-120);          /* push the whole grid away from the viewer */
    glPushMatrix();
    glRotatef(h1,0,1,0);             /* rotate about the y axis */
    glRotatef(p1,1,0,0);             /* rotate about the x axis */
    for(k=0; k<xnum; k++)
        for(j=0; j<xnum; j++)
            for(i=0; i<xnum; i++)
            {
                /* one matrix change and one draw call per cube */
                glPushMatrix();
                glTranslatef(-2*num+i*4,-2*num+j*4,-2*num+k*4);
                glDrawArrays(GL_QUADS, 0, polynum*4);
                glPopMatrix();
            }
    glPopMatrix();
    glPopMatrix();
    glFlush();
}

Where to move from here? Any ideas?

It looks like your code could really benefit from rendering more quads per glDrawArrays call. State changes (which happen in your code each time you change the modelview matrix) are expensive. For optimal performance you should render as many polygons as possible between state changes. Depending on the value of xnum, you may get better performance by rendering a slab or even an entire cube with a single call to glDrawArrays.

–Chris
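A rough sketch of that idea: instead of translating the modelview matrix once per cube, bake the per-cube offsets into the vertex positions so that a whole slab goes out in one glDrawArrays call. The function name, the cube size (edge length 2) and the spacing parameter are assumptions for illustration, not taken from the posted code.

```c
#include <assert.h>
#include <stddef.h>

#define FLOATS_PER_CUBE (6 * 4 * 3)   /* 6 quads, 4 vertices each, x,y,z */

/* Assumed cube: edge length 2, centered on the origin, GL_QUADS order. */
static const float cube_template[FLOATS_PER_CUBE] = {
    -1,-1, 1,   1,-1, 1,   1, 1, 1,  -1, 1, 1,   /* +z face */
    -1,-1,-1,  -1, 1,-1,   1, 1,-1,   1,-1,-1,   /* -z face */
     1,-1,-1,   1, 1,-1,   1, 1, 1,   1,-1, 1,   /* +x face */
    -1,-1,-1,  -1,-1, 1,  -1, 1, 1,  -1, 1,-1,   /* -x face */
    -1, 1,-1,  -1, 1, 1,   1, 1, 1,   1, 1,-1,   /* +y face */
    -1,-1,-1,   1,-1,-1,   1,-1, 1,  -1,-1, 1    /* -y face */
};

/* Hypothetical helper: fill out with an nx-by-ny slab of cubes in the
   x,y plane, pre-translated on the CPU. out needs room for
   nx * ny * FLOATS_PER_CUBE floats. Returns the vertex count. */
size_t build_cube_slab(int nx, int ny, float spacing, float *out)
{
    size_t n = 0;
    int i, j, v;

    assert(out != NULL && nx > 0 && ny > 0);

    for (j = 0; j < ny; j++) {
        for (i = 0; i < nx; i++) {
            /* same pattern as a -2*count + index*4 offset when spacing is 4 */
            float ox = -0.5f * spacing * nx + i * spacing;
            float oy = -0.5f * spacing * ny + j * spacing;
            for (v = 0; v < FLOATS_PER_CUBE; v += 3) {
                out[n++] = cube_template[v]     + ox;
                out[n++] = cube_template[v + 1] + oy;
                out[n++] = cube_template[v + 2];
            }
        }
    }
    return n / 3;
}
```

The returned count would be passed straight to glDrawArrays(GL_QUADS, 0, count) after pointing glVertexPointer at the buffer. Only the z offset is left for glTranslatef, so a grid of xnum slabs costs xnum matrix changes per frame instead of xnum cubed.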

Well, I’ll try to put a couple of cubes into my vertex array and see what results I get.

But now it is time to do work…

Well, I did the following:

I put a 2D matrix of cubes into my vertex array (30*30 pieces).

In the inner loop I just draw it several times (30). So I got 30*30*30*6 polygons (I do not use backface culling) at 9.5 fps.
That means about 162000 polys per frame. Not bad.
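For a fairer comparison with the TreeMark figure quoted at the top of the thread, it helps to look at polygons per second instead of polygons per frame. A small sketch of that arithmetic, with made-up function names:

```c
#include <assert.h>

/* Polygons per frame for an n x n x n grid of cubes, 6 quads per cube. */
long cube_grid_polys(int n)
{
    assert(n >= 0);
    return (long)n * n * n * 6;
}

/* Throughput in polygons per second. */
double polys_per_second(long polys_per_frame, double fps)
{
    assert(polys_per_frame >= 0 && fps >= 0.0);
    return (double)polys_per_frame * fps;
}
```

With the numbers from this thread, 162000 polys at 9.5 fps comes to about 1.54 million polys per second, while 128000 polys at 21 fps comes to about 2.69 million. The two counts are not directly comparable (quads here, and the TreeMark figure is just what was observed on screen), so treat this as a rough yardstick.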

I tried turning off texturing… that gave 5 fps.

Any ideas for making it faster? I’ll make the source available soon.