Speed of the rendering

My engine is now at the stage where it looks ok, but it’s damn slow!
If I render my scene, which has about 2000 polygons (with 512-1024 sized textures), and use
multipass lightmapping, the fps at close range is something like 20 on a 1.4 GHz machine with a GeForce2.
So when my engine gets further along and there are larger maps and models with thousands of polygons, the rendering speed will only go down.
If I use octrees (or something like that) and some kind of LoD, can that be enough to speed up my engine? What ‘speed-up’ methods do commercial 3D games use to get good fps in large levels?

Before thinking of advanced optimizations, are you sure you’ve applied all the basic tricks? For example (there’s a minimal state-setup sketch right after this list):
-using interleaved vertex arrays
-converting your meshes to strips/fans (use NvTriStrip)
-glDisable(GL_NORMALIZE);
-disabling GL_BLEND when you’re rendering opaque triangles
-state-sorting
-glEnable (GL_CULL_FACE);
-using floats instead of doubles
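
To give a rough idea, here is a minimal sketch of the cheap state settings from that list (assuming your normals are already unit length and you’re drawing an opaque pass):

glDisable (GL_NORMALIZE);   /* normals are already normalized on the CPU */
glDisable (GL_BLEND);       /* no blending while drawing opaque triangles */
glEnable (GL_CULL_FACE);    /* skip back faces */
glCullFace (GL_BACK);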

Sorry if these questions look silly, but 20 fps for 2000 tris is indeed too low for your hardware. Now, LoD is a good technique, and it’s indeed used in many games, but you should be able to render the highest LoD at more than 20 fps! So LoD alone can’t solve all your problems. As for octrees, I don’t know much about them, but I thought they were about visible-surface determination. Here again, they won’t solve all your problems. For example, if your model is right in front of you, I can’t see how octrees would help (but here I might be mistaken). Last, you know, 512x1024 is a big size for a texture. Such a texture occupies 1.5 MB of texture memory (if it’s 24-bit RGB)! Not all your textures should be that large, even for the highest LoD, unless you have very few textures.

Hope this helps
Morglum

hi!
don’t get offended by what I say, but I think your engine is still just at the beginning.
To speed up your engine you should use some combination of techniques, depending on what kind of world you have.
What I would do at the beginning of my engine is:

  1. build bsp for my geometry
  2. implement frustum culling using bsp
  3. make a good hash for the triangles you render every frame, to minimize glBindTexture calls (see the sketch at the end of this post)
  4. implement sectors & portals to become level-size independent
  5. work on special effects, different ways of texture mapping, lighting etc…

that’s (points 1 to 4) what I guess makes computer games fast nowadays
good luck!
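
For point 3, here is a minimal sketch of the idea, just keeping the faces sorted by texture so each texture gets bound only once per frame. The Face structure and draw_face routine are made up for the example:

#include <GL/gl.h>

typedef struct { GLuint texture; /* plus whatever per-face data you keep */ } Face;

void draw_face (const Face *f);   /* your own triangle submission code, declared elsewhere */

/* faces[] is assumed already sorted by texture id (e.g. with qsort at load time) */
void draw_sorted (Face *faces, int num_faces)
{
    GLuint current = 0;
    int i;
    for (i = 0; i < num_faces; i++)
    {
        if (faces[i].texture != current)
        {
            current = faces[i].texture;
            glBindTexture (GL_TEXTURE_2D, current);   /* bind only when the texture actually changes */
        }
        draw_face (&faces[i]);
    }
}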

Could you explain some of those terms to a newbie:

-interleaved vertex arrays
-glDisable(GL_NORMALIZE);
-state-sorting

And about BSP, I don’t really know anything about it. I have read something about it somewhere, but never what it’s actually used for.

Have you seen the features of the upcoming Soldier of Fortune II? It has about 3000 polygons in each player model and about 1500 in the view weapons. It uses 1024-sized textures, curved surfaces, large and detailed outdoor environments and other hard things.
So how is it possible for that to run on any PC?

Originally posted by blender:
Have you seen the features of the upcoming Soldier of Fortune II? It has about 3000 polygons in each player model and about 1500 in the view weapons. It uses 1024-sized textures, curved surfaces, large and detailed outdoor environments and other hard things.
So how is it possible for that to run on any PC?

1) Geometry transfer optimizations like VAR/DRE/CVA/display lists, or a mixture of some of these techniques
2) Relaxed but efficient culling that doesn’t eat too many CPU cycles and instead relies on the relatively abundant geometry and clipping capacity of modern hardware
3) They’ve found a good tradeoff between points 1 and 2
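
Of those, display lists are probably the easiest to try first: record your static geometry once at load time and replay it each frame with a single call. A rough sketch (draw_static_level_geometry stands in for whatever drawing code you already have):

/* at load time: record the commands into a list */
GLuint level_list = glGenLists (1);
glNewList (level_list, GL_COMPILE);
draw_static_level_geometry ();   /* your existing immediate-mode or vertex-array calls get recorded */
glEndList ();

/* every frame: the driver replays the precompiled commands */
glCallList (level_list);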


AFAIK, SOF2 only uses those big textures IF you have a decent gfx card that can handle them, something like a GF4 Ti card, or a GF3 500 or so. They also use some fancy tricks that are way over a beginner’s head.

Use google.com to search for the terms you do not know; there are a lot of docs about BSP, vertex arrays, and all that stuff.

*** interleaved vertex arrays:
Do you know what a vertex array is? Simple: it’s an array of data about your model (containing vertex positions, texcoords, normal vectors…) that you hand to OpenGL in only a few API calls (glVertexPointer, glTexCoordPointer…) and that gets drawn with only one call (glDrawArrays / glDrawElements). It’s very efficient.
An interleaved vertex array is a vertex array in which the data is organized in such a way that it can be drawn even faster.
As Elixer says, you should look for a tutorial. There’s certainly one on nehe.gamedev.net. This topic is also well covered in the “opengl superbible”.
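
To give a rough idea of the code, here is a minimal sketch using glInterleavedArrays with the GL_T2F_N3F_V3F layout (the vertex/index counts and the loader that fills the arrays are placeholders):

#include <GL/gl.h>

#define NUM_VERTS   1024      /* placeholder counts */
#define NUM_INDICES 3072

/* one struct per vertex: texcoord, normal and position packed together */
typedef struct { GLfloat s, t; GLfloat nx, ny, nz; GLfloat x, y, z; } Vertex;

Vertex   verts[NUM_VERTS];       /* filled in by your model loader */
GLushort indices[NUM_INDICES];   /* three indices per triangle */

void draw_mesh (void)
{
    /* sets the texcoord, normal and vertex pointers in a single call */
    glInterleavedArrays (GL_T2F_N3F_V3F, 0, verts);

    /* draws the whole mesh with one call */
    glDrawElements (GL_TRIANGLES, NUM_INDICES, GL_UNSIGNED_SHORT, indices);
}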

*** glDisable (GL_NORMALIZE);
disables automatic normalization of normal vectors. If your normal vectors are already normalized, GL_NORMALIZE is a pure waste of time.
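
In practice that means normalizing the normals yourself once at load time, roughly like this, and then leaving GL_NORMALIZE disabled:

#include <math.h>
#include <GL/gl.h>

/* run once per normal when the model is loaded, not every frame */
void normalize_normal (GLfloat n[3])
{
    GLfloat len = (GLfloat) sqrt (n[0]*n[0] + n[1]*n[1] + n[2]*n[2]);
    if (len > 0.0f)
    {
        n[0] /= len;
        n[1] /= len;
        n[2] /= len;
    }
}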

*** state-sorting
In OpenGL, state changes, like calls to glEnable/glDisable, glBindTexture, glMaterial…, can be expensive. For example, if you have 1000 tris, 500 with texture 1 and 500 with texture 2, you should do:
glBindTexture (GL_TEXTURE_2D, texture1);
render_all_tris_having_texture (1);
glBindTexture (GL_TEXTURE_2D, texture2);
render_all_tris_having_texture (2);

instead of
for (i = 0; i < 1000; i++)
{
    glBindTexture (GL_TEXTURE_2D, texture_of_triangle (i));
    render_triangle (i);
}

The first program is much faster because it calls glBindTexture only twice, whereas the second one calls it 1000 times.
Got it?

Hope this helps
Morglum

Hi !

Couldn’t glDisable(GL_NORMALIZE) create weird output, since the normals may become incorrect after transformations? Also, most modern hardware has no extra cost for GL_NORMALIZE; I believe that, for example, all nVIDIA cards run at the same speed with or without GL_NORMALIZE, at least that’s what I read somewhere.

But as usual, I might be wrong here
Mikael

Originally posted by blender:
My engine is now at the stage where it looks ok, but it’s damn slow!
If I render my scene, which has about 2000 polygons (with 512-1024 sized textures), and use
multipass lightmapping, the fps at close range is something like 20 on a 1.4 GHz machine with a GeForce2.
So when my engine gets further along and there are larger maps and models with thousands of polygons, the rendering speed will only go down.
If I use octrees (or something like that) and some kind of LoD, can that be enough to speed up my engine? What ‘speed-up’ methods do commercial 3D games use to get good fps in large levels?

One problem I had was that logging was on and slowed down rendering to a crawl.

Make sure you turn off any logging that you’ve put in for debugging purposes.

I’d also say, use compiled vertex arrays too.
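
For reference, compiled vertex arrays are the EXT_compiled_vertex_array extension: you lock your existing arrays, draw as usual, then unlock. A rough sketch (the counts and index array are placeholders, and fetching the entry points is platform specific):

#include <GL/gl.h>
#include <GL/glext.h>   /* provides the PFNGLLOCKARRAYSEXTPROC typedefs */

PFNGLLOCKARRAYSEXTPROC   glLockArraysEXT;     /* fetched via wglGetProcAddress("glLockArraysEXT") on win32 */
PFNGLUNLOCKARRAYSEXTPROC glUnlockArraysEXT;   /* fetched via wglGetProcAddress("glUnlockArraysEXT") */

void draw_multipass (int num_vertices, int num_indices, const GLushort *indices)
{
    /* the vertex/texcoord pointers are assumed to be set up already */
    glLockArraysEXT (0, num_vertices);   /* promise the driver the arrays won't change while locked */
    glDrawElements (GL_TRIANGLES, num_indices, GL_UNSIGNED_SHORT, indices);   /* base texture pass */
    glDrawElements (GL_TRIANGLES, num_indices, GL_UNSIGNED_SHORT, indices);   /* lightmap pass - reuse is where CVA helps */
    glUnlockArraysEXT ();
}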

Luke.

As I understood it, you only need re-normalization when you are applying non-uniform scaling.

Michael

You need normalization for all kinds of scaling.

Hmmmmm. I thought uniform scaling doesn’t affect the normals, at least theoretically. Why should OpenGL rescale the normals when it is uniform scaling?

Originally posted by Michael Steinberg:
Hmmmmm. I thought uniform scaling doesn’t affect the normals, at least theoretically. Why should OpenGL rescale the normals when it is uniform scaling?

Because it can change the magnitude of the normals, i.e. if you scale up x, y, z by 2, the normals would become twice as long.

There’s also an option (the EXT_rescale_normal extension, GL_RESCALE_NORMAL) to rescale the normals instead of doing a full normalize. This of course only produces useful results for uniform scaling.

Deissum: I’m nitpicking here, but aren’t normals transformed by the inverse transpose of the modelview? For a uniform scale by s, that matrix scales the normals by 1/s, so larger scaling factors would make the normals smaller, not longer. This fits with my experience, i.e. scaling > 1 produces darker lighting results.

Deissum, yes, in the case where OpenGL transforms the normals as well. As I understood the mathematics, rotations don’t rescale the normals, and uniform scales shouldn’t need to either. So, in the case where we have a modelview matrix with translation, rotation and uniform scaling, we wouldn’t even have to touch the normals. Hmmmm. But from what you said I think OpenGL does it anyway.

Hi people,
I am trying to get a 3D engine working as well, and it is discouraging that you have a good 3D accelerator card, which you thought could handle millions of triangles per second, and when you try to render a model with merely 4000 triangles you get a frame rate lower than 20 fps. How is that possible? I know there are a lot of things to optimize, and I will try, but even as it is now it should give a decent fps, according to the advertising for your 3D card. In my case I have a 3D Prophet with a Kyro II and 64 MB, and a 1300 AMD. It should be enough to amaze me, but it is not. I wonder if it would be better with DirectX.
Really frustrating.

Regards

A good way to optimise culling that I have found is, instead of using the standard 6 planes, to use 3.

Cull everything behind you and everything too far away, and then generate a viewing cone.

Basically, obtain the angle from the viewpoint to the object you are testing, and if it is within a predefined angle the object should be drawn.
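
Something along these lines, using the dot product so no acos is needed (eye, dir and center are plain float[3] arrays, and the function name is made up):

#include <math.h>

/* returns 1 if the object's center lies inside a cone of half-angle 'fov' (radians)
   around the view direction; 'dir' must be unit length */
int in_view_cone (const float eye[3], const float dir[3],
                  const float center[3], float fov)
{
    float to_obj[3], len, cos_angle;

    to_obj[0] = center[0] - eye[0];
    to_obj[1] = center[1] - eye[1];
    to_obj[2] = center[2] - eye[2];

    len = (float) sqrt (to_obj[0]*to_obj[0] + to_obj[1]*to_obj[1] + to_obj[2]*to_obj[2]);
    if (len == 0.0f)
        return 1;   /* the viewpoint is inside the object */

    /* cosine of the angle between the view direction and the direction to the object */
    cos_angle = (to_obj[0]*dir[0] + to_obj[1]*dir[1] + to_obj[2]*dir[2]) / len;
    return cos_angle >= (float) cos (fov);   /* compare cosines instead of angles */
}

You’d want to make the cone a bit wider than the real field of view (or take the object’s bounding radius into account), otherwise objects at the edge of the screen pop out too early.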

frustum culling in all the Quakes is done using the bsp, thus it’s VERY fast to cull down to only the faces you want

you use 5 planes (left, right, top, bottom & near)

for each plane you keep extra information on which corners (2 corners) of your leaf’s bounding box need to be tested (as you may have noticed, you don’t need to test all 8 corners of the bounding box, but only 2 of them)
-> look at the BoxOnPlaneSide function in the quake3 code (written in assembler to speed it up a bit)

then, going recursively down your bsp tree, you keep another piece of extra information (usually a flag) which says against which planes the present node’s (or leaf’s) bounding box still needs to be tested - that’s because there’s no need to test a bounding box against any plane that already gave a fully-inside result for the parent’s bounding box
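
To illustrate the two-corner idea (this is just the principle, not the actual quake3 code, which precomputes the sign bits of each plane normal): the plane is stored as a normal plus a distance, the box as mins/maxs, and the return value is 1 = fully in front, 2 = fully behind, 3 = straddling.

typedef struct { float normal[3]; float dist; } plane_t;

int box_on_plane_side (const float mins[3], const float maxs[3], const plane_t *p)
{
    float pos[3], neg[3];
    int i, sides = 0;

    /* pick the corner furthest along the plane normal (positive vertex)
       and the one furthest against it (negative vertex) */
    for (i = 0; i < 3; i++)
    {
        if (p->normal[i] >= 0.0f) { pos[i] = maxs[i]; neg[i] = mins[i]; }
        else                      { pos[i] = mins[i]; neg[i] = maxs[i]; }
    }

    if (pos[0]*p->normal[0] + pos[1]*p->normal[1] + pos[2]*p->normal[2] - p->dist >= 0.0f)
        sides |= 1;   /* at least partly in front of the plane */
    if (neg[0]*p->normal[0] + neg[1]*p->normal[1] + neg[2]*p->normal[2] - p->dist < 0.0f)
        sides |= 2;   /* at least partly behind the plane */

    return sides;
}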

if you’re looking for some easy-to-understand code for an engine that does all these things, take a look at the “titan” engine (http://talika.fie.us.es/~titan/titan/) - I found it very helpful when learning about bsp and other interesting techniques
hope it may help you

mikael_aronsson, GL_NORMALIZE causes a significant slowdown on my Athlon XP + GeForce3 Ti200.

As Bob says, the normals have to be renormalized after ANY scaling, uniform or not.