
View Full Version : Speed of the rendering



blender
04-08-2002, 09:12 PM
My engine is now at the stage where it looks OK, but it's damn slow!
If I render my scene, which has about 2000 polygons (with 512-1024 sized textures), using
multipass lightmapping, the fps at close range is something like 20 on a 1.4 GHz machine with a GeForce2.
So when my engine gets further along and there are larger maps and models with thousands of polygons, the rendering speed will drop even more.
If I use octrees (or something like that) and some kind of LoD, can that be enough to speed up my engine? What 'speed-up' methods do commercial 3D games use to get a good fps in large levels?

Morglum
04-09-2002, 12:23 AM
Before thinking of advanced optimizations, are you sure you've applied all the basic tricks, like:
-using interleaved vertex arrays
-converting your meshes to strips/fans (use NvTriStrip)
-glDisable(GL_NORMALIZE);
-disabling GL_BLEND when you're rendering opaque triangles
-state-sorting
-glEnable (GL_CULL_FACE);
-using floats instead of doubles

Sorry if these questions look silly, but 20 fps for 2000 tris is indeed too low for your hardware. Now, LoD is a good technique, and it's indeed used in many games, but you should be able to render the highest LoD at more than 20 fps! So LoD alone can't solve all your problems. As for octrees, I don't know much about them, but I thought they were about visible-surface determination. Here again, they won't solve all your problems. For example, if your model is right in front of you, I can't see how octrees would help (but I might be mistaken here). Last, you know, 512x1024 is a big size for a texture. Such a texture occupies 1.5 MB of texture memory (if it's 24-bit RGB)! None of your textures should be that large, even for the highest LoD, unless you have very few of them. (A quick sketch of a few of the cheap state settings from the list follows below.)
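A minimal sketch (not Morglum's code) of a few of the cheap state settings from the list above, assuming your normals are already unit length and the pass is opaque:

glEnable(GL_CULL_FACE);      /* back-face culling */
glCullFace(GL_BACK);
glDisable(GL_NORMALIZE);     /* only safe if normals are already normalized */
glDisable(GL_BLEND);         /* no blending needed for the opaque pass */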

Hope this helps
Morglum

MickeyMouse
04-09-2002, 01:36 AM
hi!
don't get offended by what I say, but I think your engine is just at the beginning of its development.
to speed up your engine you should use some combination of techniques, depending on what kind of world you have.
what I would do at the beginning of my engine is:
1. build a BSP for my geometry
2. implement frustum culling using the BSP
3. make a good hash/sort for the triangles you render every frame, to minimize glBindTexture calls (see the sketch after this list)
4. implement sectors & portals to become level-size independent
5. work on special effects, different ways of texture mapping, lighting etc...
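A tiny sketch (my own names, not from the post) of the idea in point 3: remember the currently bound texture and skip redundant glBindTexture calls.

static GLuint currentTexture = 0;

void bindTextureCached(GLuint tex)
{
    if (tex != currentTexture)
    {
        glBindTexture(GL_TEXTURE_2D, tex);
        currentTexture = tex;
    }
}

Combined with sorting faces by texture (see the state-sorting explanation later in the thread), this keeps the number of binds per frame down to the number of distinct textures.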

that's (points 1 to 4) what I guess makes computer games fast(?) nowadays
good luck!

blender
04-10-2002, 06:23 AM
Could you explain some of those terms to a newbie:

-interleaved vertex arrays
-glDisable(GL_NORMALIZE);
-state-sorting

And about BSPs, I don't really know anything about them. I have read something about them somewhere, but never what they are used for.

blender
04-10-2002, 09:14 AM
Have you seen the features of the upcoming Soldier of Fortune II? It has 3000 polygons in each player model and 1500 in the view weapons. It uses 1024-sized textures, curved surfaces, large and detailed outdoor environments and other demanding things.
So how is it possible for that to run on any PC?

zeckensack
04-10-2002, 10:51 AM
Originally posted by blender:
Have you seen the features of the upcoming Soldier of Fortune II? It has 3000 polygons in each player model and 1500 in the view weapons. It uses 1024-sized textures, curved surfaces, large and detailed outdoor environments and other demanding things.
So how is it possible for that to run on any PC?

1) Geometry transfer optimizations like VAR/DRE/CVA/display lists, or a mixture of some of these techniques
2) Relaxed but efficient culling that doesn't eat too many CPU cycles and instead relies on the relatively abundant geometry and clipping capacity of modern hardware
/edit
3) They've found a good tradeoff between points 1 and 2 ;)

[This message has been edited by zeckensack (edited 04-10-2002).]

Elixer
04-10-2002, 11:32 AM
AFAIK, SOF2 only uses those big textures IF you have a decent gfx card that can handle it, something like a GF4 Ti card, or a GF3 500 or so. They also use some fancy tricks that are way over your head as a beginner.

Use google.com to search for the terms you do not know; there are a lot of docs about BSPs, vertex arrays, and all that stuff.

Morglum
04-10-2002, 10:38 PM
*** interleaved vertex arrays:
Do you know what a vertex array is? Simple: it's an array of data about your model (containing vertex positions, texcoords, normal vectors...) that you hand to OpenGL in only a few API calls (glVertexPointer, glTexCoordPointer...) and that you get drawn in only one call (glDrawArrays / glDrawElements). It's very efficient.
An interleaved vertex array is a vertex array in which the data is organized in such a way that it can be drawn even faster.
As Elixer says, you should look for a tutorial. There's certainly one on nehe.gamedev.net. This topic is also well covered in the "OpenGL SuperBible".
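A minimal sketch (names made up, not Morglum's code) of an interleaved array using the standard GL_T2F_V3F layout: two texcoords followed by three position floats per vertex, drawn in one call.

typedef struct
{
    float s, t;      /* texture coordinates */
    float x, y, z;   /* position            */
} TexturedVertex;

void drawInterleaved(const TexturedVertex *verts, int vertexCount)
{
    /* one call sets up the whole interleaved layout... */
    glInterleavedArrays(GL_T2F_V3F, 0, verts);

    /* ...and one call draws the whole batch */
    glDrawArrays(GL_TRIANGLES, 0, vertexCount);
}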

*** glDisable (GL_NORMALIZE);
disables automatic normalization of normal vectors. If your normal vectors are already normalized, GL_NORMALIZE is a pure waste of time.

*** state-sorting
in OpenGL, state changes, like calls to glEnable/glDisable, glBindTexture, glMaterial... can be expensive. For example, if you have 1000 tris, 500 with texture 1 and 500 with texture 2, you should do:

glBindTexture (GL_TEXTURE_2D, texture1);
render_all_tris_having_texture (1);
glBindTexture (GL_TEXTURE_2D, texture2);
render_all_tris_having_texture (2);

instead of

for (i = 0; i < 1000; i++)
{
    glBindTexture (GL_TEXTURE_2D, texture_of_triangle (i));
    render_triangle (i);
}

The first program is much faster because it calls glBindTexture only twice, whereas the second calls it 1000 times.
Got it?
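A minimal sketch (hypothetical Face struct, not from the thread) of sorting faces by texture once per level load, so the two-binds pattern above falls out naturally:

#include <stdlib.h>

typedef struct
{
    GLuint textureId;    /* GL texture object used by this face */
    int    firstVertex;  /* offset into the vertex array        */
    int    numVertices;
} Face;

static int compareByTexture(const void *a, const void *b)
{
    const Face *fa = (const Face *)a;
    const Face *fb = (const Face *)b;
    if (fa->textureId < fb->textureId) return -1;
    if (fa->textureId > fb->textureId) return  1;
    return 0;
}

/* after this, faces sharing a texture are contiguous, so you only need to
   call glBindTexture when the id changes between consecutive faces */
void sortFacesByTexture(Face *faces, int numFaces)
{
    qsort(faces, numFaces, sizeof(Face), compareByTexture);
}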

Hope this helps
Morglum

mikael_aronsson
04-10-2002, 11:27 PM
Hi !

Couldn't glDisable(GL_NORMALIZE) create weird output, as the normals may become incorrect after transformations? And most modern hardware does not have any extra cost for using GL_NORMALIZE; I do believe that, for example, all nVIDIA cards run at the same speed with or without GL_NORMALIZE, at least that's what I read somewhere.

But as usual, I might be wrong here :D
Mikael

Lucretia
04-11-2002, 04:39 AM
Originally posted by blender:
My engine is now at the stage where it looks OK, but it's damn slow!
If I render my scene, which has about 2000 polygons (with 512-1024 sized textures), using
multipass lightmapping, the fps at close range is something like 20 on a 1.4 GHz machine with a GeForce2.
So when my engine gets further along and there are larger maps and models with thousands of polygons, the rendering speed will drop even more.
If I use octrees (or something like that) and some kind of LoD, can that be enough to speed up my engine? What 'speed-up' methods do commercial 3D games use to get a good fps in large levels?

One problem I had was that logging was on and slowed down rendering to a crawl.

Make sure you turn off any logging that you've put in for debugging purposes.

I'd also say, use compiled vertex arrays too.

Luke.

Michael Steinberg
04-11-2002, 04:52 AM
As I understood it, you only need re-normalization when you are applying non-uniform scaling.

Michael

Bob
04-11-2002, 05:07 AM
You need normalization for all kinds of scaling.

Michael Steinberg
04-11-2002, 05:15 AM
Hmmmmm. I thought uniform scaling doesn't affect the normals, at least theoretically. Why should OpenGL scale the normals when it is a uniform scaling?

Deiussum
04-11-2002, 01:40 PM
Originally posted by Michael Steinberg:
Hmmmmm. I thought uniform scaling doesn't affect the normals, at least theoretically. Why should OpenGL scale the normals when it is a uniform scaling?

Because it can change the magnitude of the normals. I.e. if you scale up x, y, z by 2, the normals would become twice as long.

zeckensack
04-11-2002, 02:56 PM
There's also an option (an extension?) to rescale the normals instead of doing a full normalize. This of course only produces useful results for uniform scaling.

Deiussum: I'm nitpicking here, but aren't normals transformed by the inverse modelview? That would mean larger scaling factors would make the normals smaller. This fits with my experience, i.e. scaling > 1 produces darker lighting results.
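For reference, the extension zeckensack is presumably thinking of is EXT_rescale_normal (core in OpenGL 1.2); a sketch of using it instead of a full renormalize, valid only for uniform scaling:

/* one multiply per normal instead of a square root; only correct
   when the modelview contains a uniform scale */
glEnable(GL_RESCALE_NORMAL);   /* GL_RESCALE_NORMAL_EXT on pre-1.2 headers */
glDisable(GL_NORMALIZE);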

Michael Steinberg
04-11-2002, 08:58 PM
Deiussum, yes, in the case where OpenGL transforms the normals as well. As I understood the mathematics, rotations don't rescale the normals, and uniform scales don't need to. So, in the case where we have a modelview matrix with translation, rotation and uniform scaling, we wouldn't even have to work on the normals. Hmmmm. But from what you said I think OpenGL does it anyway.

BitBang
04-12-2002, 01:32 AM
Hi people,
I am trying to get a 3D engine working as well, and it is discouraging to have a good 3D accelerator card, which you thought could handle millions of triangles per second, and then get a frame rate lower than 20 fps when you try to render a model with merely 4000 triangles. How is that possible? I know there are a lot of things to optimize, and I will try, but even as it is now it should give a decent fps, according to the advertising for your 3D card. In my case I have a 3D Prophet with a Kyro II and 64 MB, and a 1300 AMD. That should be enough to amaze me, but it is not. I wonder if it would be better with DirectX.
Really frustrating.

Regards

DemonDaz
04-12-2002, 02:11 AM
A good way to optimise culling that I have found is, instead of using the standard 6 planes, to use 3.

Cull everything behind the viewer and everything too far away, then generate a viewing cone.

Basically, obtain the angle from the viewpoint to the object you are testing, and if it is within a predefined angle it should be drawn.
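A minimal sketch (names made up) of that cone test; comparing the dot product against the cosine of the cone's half-angle avoids an acos() per object:

#include <math.h>

/* returns 1 if objPos lies inside the view cone around viewDir (unit length) */
int insideViewCone(const float eye[3], const float viewDir[3],
                   const float objPos[3], float cosHalfAngle)
{
    float d[3], len, dot;

    d[0] = objPos[0] - eye[0];
    d[1] = objPos[1] - eye[1];
    d[2] = objPos[2] - eye[2];
    len = (float)sqrt(d[0]*d[0] + d[1]*d[1] + d[2]*d[2]);
    if (len == 0.0f)
        return 1;   /* object is at the eye: just draw it */

    dot = (d[0]*viewDir[0] + d[1]*viewDir[1] + d[2]*viewDir[2]) / len;
    return dot >= cosHalfAngle;
}

In practice you would also widen the cone by the object's bounding radius, so large objects near the edge of the cone don't pop out too early.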

MickeyMouse
04-12-2002, 03:17 AM
frustum culling in all the Quakes is done using the BSP, so it's VERY fast to cull only the faces you want to.

you use 5 planes (left, right, top, bottom & near).

for each plane you keep extra information about which corners (2 corners) of your leaf's bounding box need to be tested (as you may have noticed, you don't need to test all 8 corners of the bounding box, only 2 of them)
-> look at the BoxOnPlaneSide function in the Quake 3 code (in assembler, to speed it up a bit); there's a sketch of the idea below.

then, going recursively down your BSP tree, you keep another piece of extra information (usually a flag) which says against which planes the present node's (or leaf's) bounding box needs to be tested - there's no need to test a bounding box against any plane that already gave a positive result for its parent's bounding box.

if you're looking for some easy-to-understand code for an engine that does all these things, take a look at the "titan" engine (http://talika.fie.us.es/~titan/titan/) - I found it very helpful when learning about BSPs and other interesting techniques.
hope it may help you
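A minimal sketch of the two-corner idea (my own code, not Quake's BoxOnPlaneSide): for each plane, the box corner most aligned with the plane normal decides "fully outside", and the opposite corner decides "fully inside". Planes are ax + by + cz + d = 0 with the normal pointing into the visible half-space.

typedef struct { float a, b, c, d; } Plane;
typedef struct { float min[3], max[3]; } AABB;

/* returns 0 = fully outside, 1 = fully inside, 2 = straddling the plane */
int boxOnPlaneSide(const AABB *box, const Plane *p)
{
    const float n[3] = { p->a, p->b, p->c };
    float pv[3], nv[3];   /* "positive" and "negative" corners */
    int i;

    for (i = 0; i < 3; i++)
    {
        if (n[i] >= 0.0f) { pv[i] = box->max[i]; nv[i] = box->min[i]; }
        else              { pv[i] = box->min[i]; nv[i] = box->max[i]; }
    }

    if (n[0]*pv[0] + n[1]*pv[1] + n[2]*pv[2] + p->d < 0.0f)
        return 0;   /* even the most "visible" corner is behind the plane */
    if (n[0]*nv[0] + n[1]*nv[1] + n[2]*nv[2] + p->d >= 0.0f)
        return 1;   /* even the least "visible" corner is in front */
    return 2;
}

The parent/child trick from the post then simply means: once a node's box returns 1 for some plane, you can skip that plane for all of its children.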

Morglum
04-12-2002, 04:29 AM
mikael_aronsson, GL_NORMALIZE causes a significant slowdown on my Athlon XP + GeForce3 Ti 200.

As Bob says, the normals have to be renormalized after ANY scaling, uniform or not.

Furrage
04-12-2002, 05:56 AM
I have never been fortunate enough to use an OpenGL-accelerated card, but based on what I've read you have to be careful that you are not doing something that drops you into software mode. 4000 tris should not render that slowly. So it might be good to look at your pixel format and at whatever is enabled, to see if it makes a difference. Someone with a GeForce2 or Radeon could probably do a generic test with 2000 textured tris and see what their frame rate is like.

blender
04-12-2002, 06:28 AM
I've been thinking: is it possible to simply do a LoD that reduces the polygons of brushes as they get further away from the camera?
As if the engine would somehow take polygons directly off the brush and snap the resulting unlinked verts together, or something like that.

BTW, do mipmaps act like some kind of LoD? As in: the further away the textured polygon is, the smaller the texture used, which helps fill rate.
If mipmapping doesn't do that, could it somehow be implemented easily?

[This message has been edited by blender (edited 04-12-2002).]
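That is in fact what mipmapping does automatically once it's enabled: the hardware picks a smaller mip level as the polygon covers fewer pixels, which helps both fill rate and texture cache use. A minimal setup sketch (standard GL/GLU calls, hypothetical variable names):

GLuint tex;
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_2D, tex);

/* builds and uploads the full mip chain from the base image */
gluBuild2DMipmaps(GL_TEXTURE_2D, GL_RGB, width, height,
                  GL_RGB, GL_UNSIGNED_BYTE, pixels);

/* trilinear filtering: blends between the two nearest mip levels */
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);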

blender
04-12-2002, 06:32 AM
Furrage, the slowness might be because of multipass lightmapping WITHOUT vertex arrays.

I'm going to make my engine use EXT_compiled_vertex_array, and sort the brushes by surface type (solid, blended etc.).

Deiussum
04-12-2002, 07:48 AM
Originally posted by zeckensack:

Deiussum: I'm nitpicking here, but aren't normals transformed by the inverse modelview? That would mean larger scaling factors would make the normals smaller. This fits with my experience, i.e. scaling > 1 produces darker lighting results.

Yup. You're right. Normals are multiplied by the inverse matrix, so in my example the magnitude of the normals would be halved.

Deiussum
04-12-2002, 07:52 AM
Originally posted by Michael Steinberg:
Deiussum, yes, in the case where OpenGL transforms the normals as well. As I understood the mathematics, rotations don't rescale the normals, and uniform scales don't need to. So, in the case where we have a modelview matrix with translation, rotation and uniform scaling, we wouldn't even have to work on the normals. Hmmmm. But from what you said I think OpenGL does it anyway.

Normals are always multiplied by the inverse of the model view matrix. In the case of rotations and translations, the magnitude isn't affected.

If you take rotations as an example... they have to be transformed by the matrix. Say you have a simple quad facing the screen, normals pointing towards you... now you rotate the quad 180 degrees... if the normals weren't affected by the matrix in this case, they'd still be pointing at you and the lighting wouldn't have changed.
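A compact summary of the math under discussion (my notation, not from the posts). Strictly, normals transform by the inverse transpose of the upper 3x3 of the modelview matrix M:

n' = (M^{-1})^T n

For a pure rotation R, (R^{-1})^T = R, so normals simply rotate and keep their length (consistent with the quad example). For a uniform scale M = sI, (M^{-1})^T = (1/s)I, so the normal's length becomes |n|/s, which matches zeckensack's observation that scaling up gives darker lighting unless you renormalize (GL_NORMALIZE) or rescale (EXT_rescale_normal).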

blender
04-14-2002, 07:58 AM
Does anyone know how to adjust mipmapping?
I mean, how can I set the distance at which the texture changes to a smaller one?
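One way to nudge that distance (a suggestion, not something confirmed elsewhere in the thread): the EXT_texture_lod_bias extension lets you bias the mip level selection; a negative bias keeps the larger levels longer, a positive bias switches to smaller ones sooner.

/* requires the EXT_texture_lod_bias extension (check the extension string) */
glTexEnvf(GL_TEXTURE_FILTER_CONTROL_EXT, GL_TEXTURE_LOD_BIAS_EXT, -0.5f);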

Miguel_dup1
04-14-2002, 08:30 AM
A recommendation I have found useful is the use of averaged normals... Good stuff :)

And when using LOD, I recommend techniques such as ROAM or quadtrees (used by Tribes).

Both of these improve performance dramatically...

ROAM or quadtree implementations may become a Max Pain, but hey, once they are done you live la vida loca. There are also LGPL implementations of both of these techniques....

For averaged normals? Simply take all the polygons that share a vertex, average their normals, and use that single normal in glNormal().
It has the same overhead as computing normals the way you already would, but with an extra step of determining which polygons share a common vertex. (A small sketch follows below.)

:)
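A minimal sketch (hypothetical data layout, not from the post) of averaged vertex normals: accumulate the face normal of every triangle into each of its three vertices, then normalize the sums.

#include <math.h>

typedef struct { float x, y, z; } Vec3;

void averageNormals(const Vec3 *pos, const int *tris, int numTris,
                    Vec3 *outNormals, int numVerts)
{
    int i;

    for (i = 0; i < numVerts; i++)
        outNormals[i].x = outNormals[i].y = outNormals[i].z = 0.0f;

    for (i = 0; i < numTris; i++)
    {
        int a = tris[3*i], b = tris[3*i+1], c = tris[3*i+2];
        Vec3 e1 = { pos[b].x - pos[a].x, pos[b].y - pos[a].y, pos[b].z - pos[a].z };
        Vec3 e2 = { pos[c].x - pos[a].x, pos[c].y - pos[a].y, pos[c].z - pos[a].z };
        /* face normal = e1 x e2 (unnormalized, so larger faces weigh more) */
        Vec3 n = { e1.y*e2.z - e1.z*e2.y,
                   e1.z*e2.x - e1.x*e2.z,
                   e1.x*e2.y - e1.y*e2.x };
        outNormals[a].x += n.x; outNormals[a].y += n.y; outNormals[a].z += n.z;
        outNormals[b].x += n.x; outNormals[b].y += n.y; outNormals[b].z += n.z;
        outNormals[c].x += n.x; outNormals[c].y += n.y; outNormals[c].z += n.z;
    }

    for (i = 0; i < numVerts; i++)
    {
        float len = (float)sqrt(outNormals[i].x*outNormals[i].x +
                                outNormals[i].y*outNormals[i].y +
                                outNormals[i].z*outNormals[i].z);
        if (len > 0.0f)
        {
            outNormals[i].x /= len;
            outNormals[i].y /= len;
            outNormals[i].z /= len;
        }
    }
}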

Michael Steinberg
04-14-2002, 10:18 PM
Deiussum, yes, everything I meant applied to length. Translation and rotation do not affect length. Uniform scaling need not affect length. However, OpenGL handles the general case.

blender
04-16-2002, 02:16 AM
Here is some of my old code:




for (n = 0; n < NumFaces; n++)
{
    pFace  = &Face[n];
    pTextr = &Texture[pFace->Id];
    pLmap  = &Lightmap[pFace->Lid];

    /* first pass: base texture */
    glDisable(GL_ALPHA_TEST);
    glDisable(GL_BLEND);
    glBindTexture(GL_TEXTURE_2D, *pTextr->Texture.iData);

    glBegin(GL_TRIANGLE_FAN);
    for (m = pFace->Start; m < pFace->Start + pFace->Num; m++)
    {
        glTexCoord2f(Verts[m].Tv.x, Verts[m].Tv.y);
        glVertex3f(Verts[m].Pos.x, Verts[m].Pos.y, Verts[m].Pos.z);
    }
    glEnd();

    /* second pass: lightmap, blended over the base texture */
    glEnable(GL_ALPHA_TEST);
    glEnable(GL_BLEND);
    glBindTexture(GL_TEXTURE_2D, *pLmap->Texture.iData);

    glBegin(GL_TRIANGLE_FAN);
    for (m = pFace->Start; m < pFace->Start + pFace->Num; m++)
    {
        glTexCoord2f(Verts[m].Lv.x, Verts[m].Lv.y);
        glVertex3f(Verts[m].Pos.x, Verts[m].Pos.y, Verts[m].Pos.z);
    }
    glEnd();
}


Then the new code using vertex arrays:




glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);

for (n = 0; n < NumFaces; n++)
{
    pFace  = &Face[n];
    pTextr = &Texture[pFace->Id];
    pLmap  = &Lightmap[pFace->Lid];

    glVertexPointer(3, GL_FLOAT, 0, Brushes.Vert_Array);
    glTexCoordPointer(2, GL_FLOAT, 0, Brushes.TCoord_Array);
    glLockArraysEXT(0, NumVerts);

    glBindTexture(GL_TEXTURE_2D, *pTextr->Texture.iData);
    glDrawArrays(GL_TRIANGLE_FAN, pFace->Start, pFace->Start + pFace->Num);

    glUnlockArraysEXT();
}

glDisableClientState(GL_VERTEX_ARRAY);
glDisableClientState(GL_TEXTURE_COORD_ARRAY);


It runs for a while, and then it crashes.
At start-up I have put all the vertex and
texture coordinate information into the arrays, but I haven't yet sorted them so that the coordinates correspond to the right vertices.
Still, why does it crash? I haven't had much time to get into using vertex arrays, but still...
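For what it's worth, one likely culprit (an observation, not something confirmed in the thread): the third argument of glDrawArrays is a vertex count, not an end index, so passing pFace->Start + pFace->Num makes GL read past the end of the arrays. Assuming pFace->Num is the number of vertices in the fan, the call would be:

glDrawArrays(GL_TRIANGLE_FAN, pFace->Start, pFace->Num);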

04-26-2002, 07:29 AM
Well...

I won't go into OpenGL optimisation because so many people here know this stuff better than I do. However, when optimising, you should really, really, really get your hands on a "profiler". A profiler will tell you where most of the time is spent executing your code. It's really great because quite often most of the time is not spent in the graphics card but in your own code. In any case, just look for the biggest time spender in the profiler and optimise it as much as you can, often until it is no longer the biggest source of slowdown. Then move on to the one that's now the biggest slowdown, and so on. I can't stress this enough: don't get into modifying your software blindly, optimising anything and everything.

PkK
04-28-2002, 05:39 AM
You can find a free profiler on AMD's website. It profiles programs compiled with VC++ 6 or .NET and runs only on AMD processors. There are probably also free profilers for GNU/Linux that work on gcc-compiled programs.

barsanpipu
04-29-2002, 02:21 AM
I'm not that much of an OpenGL expert, but here is something that might be wrong:

I've noticed that by using display lists you lose a LOT of speed. I don't know why, as display lists should in theory be faster. I have a GeForce2 MX 200 64 MB, and without display lists I can render about 300000 filtered, textured, non-backface-culled, non-triangle-stripped polys and get about 5 FPS. I haven't optimised anything and I still have GL_NORMALIZE enabled, so this is totally unoptimised. Running the same thing from a display list I get 0.5 FPS (with GL_COMPILE, not GL_COMPILE_AND_EXECUTE).
Why, I can't say.
If someone can...

bakery2k
04-29-2002, 08:33 AM
Are you re-building the display list every frame?

blender
04-30-2002, 06:04 AM
Can I use ARB_multitexture with vertex arrays? What I mean is, can I give two texture coordinate array pointers?
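A minimal sketch (not confirmed in the thread; names like baseCoords and lightmapCoords are made up) of how this works with ARB_multitexture: glClientActiveTextureARB selects which texture unit the following glTexCoordPointer call applies to, so you can set one pointer per unit.

glClientActiveTextureARB(GL_TEXTURE0_ARB);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glTexCoordPointer(2, GL_FLOAT, 0, baseCoords);      /* base texture coords */

glClientActiveTextureARB(GL_TEXTURE1_ARB);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glTexCoordPointer(2, GL_FLOAT, 0, lightmapCoords);  /* lightmap coords */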