optimizations

in the quest for higher performance, i’d like to optimize the way i send information to opengl.
in particular, i’d like to minimize the transfer workload for t&l cards like the geforce.

since i want my models to be fairly accurate (many polygons), i minimize transfers to opengl by performing an object-space polygon culling, also known as early culling (a rough sketch follows the list below).
once i have selected the pvs of polygons, i have to send them to opengl, and there are three ways to go:
-straight opengl with glVertex, glTexcoord, …
-display lists
-vertex arrays
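
as for the early culling mentioned above, its simplest form could look something like this. just a sketch with made-up names (Face, cull_faces, eye_obj), not my actual code:

/* hypothetical per-face data: a precomputed object-space normal and
   one reference vertex of the face */
typedef struct {
    float normal[3];
    float vertex[3];
    int   visible;      /* set by the culling pass */
} Face;

/* eye_obj is the camera position transformed into the model's object
   space (world-space eye multiplied by the inverse model matrix once
   per frame, instead of transforming every face normal) */
void cull_faces(Face *faces, int num_faces, const float eye_obj[3])
{
    int i;
    for (i = 0; i < num_faces; ++i) {
        float vx = eye_obj[0] - faces[i].vertex[0];
        float vy = eye_obj[1] - faces[i].vertex[1];
        float vz = eye_obj[2] - faces[i].vertex[2];
        float dot = vx * faces[i].normal[0]
                  + vy * faces[i].normal[1]
                  + vz * faces[i].normal[2];
        faces[i].visible = (dot > 0.0f);   /* front-facing w.r.t. the eye */
    }
}

only the faces flagged visible get sent to opengl afterwards.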

as far as i know those three are the only existing methods… maybe there are others?

the objects are moving, so i won’t rely on exploiting coherency in some way: polygons will frequently move in and out of the pvs.

what is the best way to go?
…the first method aside, that’s what i’m currently using

how about using some “mesh to strip” algo every frame?

thank you people

Dolo//\ightY

This is interesting, I’m looking for something on this topic too.
What I would like to experiment with is the use of some algorithm to compute progressive level-of-detail meshes. I have a code example I found somewhere on the net… let me know if you are interested.
Also, I read that using compiled vertex arrays is one of the best ways to send geometry.
i.e. with CVAs you can exploit vertex sharing, and vertex processing occurs only once in multipass rendering.
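For reference, the usage pattern with CVAs is roughly this (just a sketch; it assumes the GL_EXT_compiled_vertex_array extension is exposed and the glLockArraysEXT/glUnlockArraysEXT entry points have already been fetched, e.g. with wglGetProcAddress; the array names are made up):

glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glVertexPointer(3, GL_FLOAT, 0, vertices);      /* shared vertex data       */
glTexCoordPointer(2, GL_FLOAT, 0, texcoords);

glLockArraysEXT(0, num_vertices);               /* driver may pre-transform */

/* pass 1 */
glDrawElements(GL_TRIANGLES, num_indices, GL_UNSIGNED_INT, indices);

/* pass 2 (different texture/blending) reuses the already-processed vertices */
glDrawElements(GL_TRIANGLES, num_indices, GL_UNSIGNED_INT, indices);

glUnlockArraysEXT();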


yes, i’m interested paolom!

i did some benchmarking; these are the results.

config:
Pentium II 350 128 MB windows 98
riva TNT agp, windowed 400x400, 32 bpp
early culling, release version
model is textured and chromed, 770 verts, 1482 faces

straight opengl: 90,92
display lists (compile): 85,48
display lists (compile&execute): 8,35
vertex arrays: 91,72
compiled vertex arrays: 88,87

it seems the way to go is with vertex arrays.

the model is a starship, with normals and texture coordinates per face (3 of each).
the model is then chromed in a second pass using opengl sphere mapping and additive/screen blending (GL_ONE,GL_ONE).
the geometry for the chroming pass is not evaluated again: this is to exploit the vertex array pre-transform (let’s call it that).
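
to give an idea, the two passes look something like this (a sketch with placeholder names, not the real code):

glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_NORMAL_ARRAY);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glVertexPointer(3, GL_FLOAT, 0, verts);
glNormalPointer(GL_FLOAT, 0, norms);
glTexCoordPointer(2, GL_FLOAT, 0, uvs);

/* pass 1: base texture */
glBindTexture(GL_TEXTURE_2D, base_tex);
glDrawElements(GL_TRIANGLES, num_idx, GL_UNSIGNED_INT, indices);

/* pass 2: chrome, sphere-mapped environment map added on top */
glBindTexture(GL_TEXTURE_2D, env_tex);
glTexGeni(GL_S, GL_TEXTURE_GEN_MODE, GL_SPHERE_MAP);
glTexGeni(GL_T, GL_TEXTURE_GEN_MODE, GL_SPHERE_MAP);
glEnable(GL_TEXTURE_GEN_S);
glEnable(GL_TEXTURE_GEN_T);
glEnable(GL_BLEND);
glBlendFunc(GL_ONE, GL_ONE);                    /* additive blend                    */
glDepthFunc(GL_LEQUAL);                         /* same geometry, equal depths pass  */
glDrawElements(GL_TRIANGLES, num_idx, GL_UNSIGNED_INT, indices);

/* restore state */
glDisable(GL_BLEND);
glDisable(GL_TEXTURE_GEN_S);
glDisable(GL_TEXTURE_GEN_T);
glDepthFunc(GL_LESS);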

it’s strange: with display lists in compile mode i create the model’s list and then draw it with glCallList(), and it runs smoothly.
when using compile & execute mode with glNewList(), opengl is supposed to create the list and draw things on the fly, so it should be optimized somehow and probably faster than a separate call to glCallList() after the list has been created…
but it runs at snail speed…
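
for the record, the two cases i’m comparing are these (sketch; draw_model() stands for the immediate-mode calls):

GLuint list = glGenLists(1);

/* case A: compile only, then call the list every frame */
glNewList(list, GL_COMPILE);
draw_model();
glEndList();
/* per frame: */
glCallList(list);

/* case B: compile and execute in one go (the slow case above) */
glNewList(list, GL_COMPILE_AND_EXECUTE);
draw_model();
glEndList();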

as always, it depends on the implementation!
i’ll benchmark it on the g200 at work, and on the geforce (when it comes in…)

Dolo//\ightY

Hi dmy!

“…a second pass using opengl sphere mapping and additive/screen blending (GL_ONE,GL_ONE)…”

How do you avoid a z-buffer fight in the second pass? I tried to put a texture onto a sphere and, in a second pass, a kind of reflection with blending (I do not want to do multitexturing!). But sometimes the second texture is in front, sometimes the first one, which looks more than ugly. I helped myself by scaling the second sphere a little bit larger, but this will not work with more complex objects. Could you please tell me a solution?

Thanks.

I read an Nvidia paper about alternative methods and their speeds on the GeForce, but I can’t find it. I tried to look for it on the Nvidia site without success, but I’m pretty sure it can be found there somewhere. http://www.nvidia.com/developer/

Originally posted by Marc:
How do you avoid a z-buffer fight in the second pass? I tried to put a texture onto a sphere and, in a second pass, a kind of reflection with blending (I do not want to do multitexturing!). But sometimes the second texture is in front, sometimes the first one, which looks more than ugly. I helped myself by scaling the second sphere a little bit larger, but this will not work with more complex objects. Could you please tell me a solution?

If the spheres are exactly the same for both passes, you should use glDepthFunc(GL_EQUAL) for the second pass.
But using this approach, I have problems with some (not all) NVidia drivers for my TNT card:
if I enable lighting for only one pass and disable it for the other, there are z-buffer artefacts! If the lighting conditions are equal (on or off) for both passes, everything is okay!
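
The idea in code, roughly (draw_sphere() is just a placeholder):

glDepthFunc(GL_LESS);             /* pass 1: base texture */
draw_sphere();

glEnable(GL_BLEND);
glBlendFunc(GL_ONE, GL_ONE);
glDepthFunc(GL_EQUAL);            /* pass 2: only fragments at the stored depth pass */
draw_sphere();                    /* identical geometry */

glDepthFunc(GL_LESS);
glDisable(GL_BLEND);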

Kosta

For both passes lighting is disabled; glDepthFunc(GL_EQUAL) doesn’t work (I think I have tried every possible combination of glDepthFunc for the first and the second pass); and I’m using the nVidia Detonator 3.68 driver on a Viper 550 board (TNT) under Win NT 4.0.
How does that thing with polygon offset work? Could it be a solution?

Have you tried GL_LESS for the first and GL_LEQUAL for the second?

Yes, but it doesn’t fix the problem, it only gets a little less bad.
By the way, the sphere is put into a display list and called twice with different textures, the second time with blending enabled, so the vertex coordinates are absolutely the same.

Originally posted by Marc:
By the way, the sphere is put into a display list and called twice with different textures, the second time with blending enabled, so the vertex coordinates are absolutely the same.

Can you post the code? So we can all search for a bug.

Have you tried to turn on lighting calculations for both passes?

Kosta

Err… I have the code at home. I’ll try to post it tomorrow.
Lighting should remain off because the object is supposed to look self-illuminated (a glowing ball reflecting the environment).

No code here, but I managed the problem with glPolygonOffset (there is no help topic for it in VC 5.0)!!! I don’t understand the parameters, but I fiddled out a good set. Thank you all for your contribution.
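
For reference, the general usage looks roughly like this (the values are just common starting points, not the set I fiddled out; they are implementation dependent, and draw_sphere() is a placeholder):

glEnable(GL_POLYGON_OFFSET_FILL);
glPolygonOffset(-1.0f, -1.0f);    /* factor scales with the polygon's depth slope,
                                     units adds a constant depth bias; negative
                                     values pull the pass toward the viewer */
draw_sphere();                    /* the second, blended pass */
glDisable(GL_POLYGON_OFFSET_FILL);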

A possible alternative solution to Marc’s problem: disable z-buffer writes on the first pass, and re-enable them for the second. Depth testing stays enabled for both passes.
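
In code, the idea would be roughly this (draw_sphere() as a placeholder):

glEnable(GL_DEPTH_TEST);

glDepthMask(GL_FALSE);            /* pass 1: depth test on, depth writes off */
draw_sphere();

glDepthMask(GL_TRUE);             /* pass 2: writes re-enabled */
glEnable(GL_BLEND);
glBlendFunc(GL_ONE, GL_ONE);
draw_sphere();
glDisable(GL_BLEND);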

glPolygonOffset is notorious for being a hack, and is likely to give different results on different systems.

I don’t want to do depth sorting (and all the transformations) myself, so I have to leave z-buffer writes activated. Otherwise the object (a sphere) will not be rendered correctly in the first pass.

I simply use GL_LEQUAL all the time (no switching), and any time they overlap it works fine. As I recall, too many state changes can affect performance, and changing the z-buffer state may also cost performance, so I try to use only one setting and it works for me. This should not be a real problem as far as I know, but it may not work with all implementations (like your TNT). The best test I have tried is when separate specular is not available and I use an additive (one,one) redraw of the specular only on top of the textured object. This has worked on 4 opengl implementations I have tried it on: TNT2U, GeForce256, Glint 500tx, and Realizm. I have not tried overlapping texture-mapped polygons on top of each other yet though, so I don’t know for sure.

What does the problem look like visually? Do some parts look correct and others unaffected, or does it look like discoloration?

hi all

yes, marc’s problem is strange…
like the Citizen, i use GL_LEQUAL too.
i always test on a matrox G200, an nvidia riva TNT, and the GeForce (NV15 chip, am i right?)

it works perfectly, no artifacts at all, in any condition.

about polygon offset… well… i don’t like hacks. it works very well, yes, but it could make some problems pop up, like that z-fight in half-life, down in the sewers…

Dolo//\ightY

I use GL_LEQUAL too and had no problems until I made those spheres. They are built out of quad strips, and without polygon offset most of the single quads look okay, but on some quads there are lines where the ‘inner’ sphere is in front of the ‘outer’ one. While animating, the whole thing flickers rather disturbingly.