Transform and lighting, bandwidth, caching, and interactivity...

I’m currently working on my own game engine, using level data that is not compiled into a BSP tree. The level I’m loading has alpha-channel transparency in some textures, and it is broken down into many objects. I’m trying to figure out how to incorporate transparency, and how to optimize collision detection. The way I was thinking of doing this was to let OpenGL handle transformation and lighting, grab the results in world space, transform the solid polygons into camera space, sort and render the transparent polygons, flip the page, and then use the world-space polygons to calculate collision detection.
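To be concrete about the “grabbing” step: the only standard OpenGL mechanism I know of for this is feedback mode, roughly like the sketch below (drawScene() and the buffer size are placeholders). One caveat is that feedback hands back window coordinates after projection, not world-space data.

    GLfloat buffer[4096];                    /* placeholder size */
    glFeedbackBuffer(4096, GL_3D, buffer);   /* register the buffer */
    glRenderMode(GL_FEEDBACK);               /* switch to feedback mode */
    drawScene();                             /* nothing reaches the screen */
    GLint values = glRenderMode(GL_RENDER);  /* back to normal rendering;
                                                returns the number of
                                                values written */
    /* buffer now holds tokens such as GL_POLYGON_TOKEN followed by
       per-vertex x, y, z in window coordinates */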

What I am wondering about, though, is performance. If I let OpenGL transform and light the objects and then grab the transformed results back, will a Radeon or GeForce GPU still be doing the work? And if it is, I would imagine the bandwidth between the video card and system memory would double; how much of a difference do you think that would make? I realize I can reduce the bandwidth by tracking object changes and reusing the cached world-space data when an object has not changed.

Does anyone have experience dealing with this issue?

Hi,
You do realize that sorting polygons without structures like BSPs or portals isn’t that simple?
And to draw transparent textures it’s almost always required to sort the polygons (or you’ll get visual artifacts).
As for ‘grabbing’ stuff back from your 3d card, that is not advisable… 3d hardware is not designed, and certainly not optimized, to actually send data back to the host…
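On the sorting point: the quick-and-dirty approach without a BSP is a back-to-front sort on distance from the camera, something like the sketch below (all the names are made up). Note that sorting by polygon center is only approximate; intersecting or mutually overlapping polygons will still give you the artifacts I mentioned.

    #include <algorithm>
    #include <vector>

    struct Vec3 { float x, y, z; };

    struct TransPoly {
        Vec3 center;   // precomputed polygon center
        // ... vertices, texture, etc.
    };

    float distSq(const Vec3& a, const Vec3& b) {
        float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
        return dx*dx + dy*dy + dz*dz;
    }

    void sortBackToFront(std::vector<TransPoly>& polys, const Vec3& cam) {
        std::sort(polys.begin(), polys.end(),
            [&](const TransPoly& a, const TransPoly& b) {
                // farther polygons first, so nearer ones blend over them
                return distSq(a.center, cam) > distSq(b.center, cam);
            });
    }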

All that matters is your target platform and what its capabilities are…
If you’re targeting cards that have a GPU, then use matrices for all your rotations, transformations, etc.
Don’t worry about caching; trying to do anything but the most optimized path will get you low framerates…
These days 3d cards are incredibly fast…
Just think of this: I made a test engine that loaded a Q3 map and rendered it to the screen without any BSP, just plain brushes…
I got 90 fps or higher…
on a GeForce1…
Of course, in a game you’ll need to do a lot more, like collision detection etc.
But it does show that drawing a little more than what you’re seeing and letting the GPU do all your transformations is a lot more effective than doing the transformations on the CPU and desperately trying to draw only the polygons you’ll actually see…
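Concretely, ‘use matrices’ just means handing each object’s matrix to the card and submitting vertex data that never changes. Something like this per object (obj.matrix and drawMesh() are made-up names):

    glMatrixMode(GL_MODELVIEW);
    glPushMatrix();
    glMultMatrixf(obj.matrix);   /* object -> world, done by the card */
    drawMesh(obj);               /* submit untouched local-space vertices */
    glPopMatrix();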

Keep in mind that moving data is probably a bigger performance drain than doing calculations (which preferably you’ll want to do on the GPU).
Try to do everything in as few steps as possible, and above all, time everything and try every possibility you can think of; sometimes you might be surprised.
Try using structures like brushes instead of raw polygons, at least for collision detection…
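A brush, for reference, is just a convex volume stored as its bounding planes, which keeps even the simplest collision query cheap. A sketch with made-up names (normals are assumed to face outward, Quake-style):

    #include <vector>

    struct Plane { float nx, ny, nz, d; };   // points p on the plane
                                             // satisfy n . p = d

    struct Brush { std::vector<Plane> planes; };

    bool pointInBrush(const Brush& b, float x, float y, float z) {
        for (const Plane& p : b.planes) {
            if (p.nx * x + p.ny * y + p.nz * z - p.d > 0.0f)
                return false;   // in front of one plane -> outside
        }
        return true;            // behind every plane -> inside the volume
    }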

Interestingly enough, JC is using straight brush data parsed out of .map files in Doom3…
Somehow he’s calculating edge coherency, t-spans, and at the very least some structure to do VSD with, and all of this at level-load time…
So it should be fairly fast…

That is why I was asking: to find out what the most optimized path is.

For collision physics, I need things in world coordinates so that I can compare object motion and geometry to see whether objects collide, where they collide, and then handle the collision. A scene/level will be composed of places, but the world also has a lot of child objects, such as instances of boxes, crates, and barrels. These need to be transformed into world coordinates, either redundantly (once for rendering, again for the physics calculations) or only once during rendering, while letting the physics calculations access each object’s world-transformed state.
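The second option is the one I’m leaning toward: transform once, cache the result, and flag it dirty when the object moves, so static objects cost nothing. A rough sketch of what I mean (all the names are mine):

    #include <cstddef>
    #include <vector>

    struct Vec3 { float x, y, z; };
    struct Mat4 { float m[16]; };   // column-major, like OpenGL

    // apply a column-major 4x4 to a point (rotation + translation)
    Vec3 transformPoint(const Mat4& t, const Vec3& v) {
        return {
            t.m[0]*v.x + t.m[4]*v.y + t.m[8]*v.z  + t.m[12],
            t.m[1]*v.x + t.m[5]*v.y + t.m[9]*v.z  + t.m[13],
            t.m[2]*v.x + t.m[6]*v.y + t.m[10]*v.z + t.m[14],
        };
    }

    struct Object {
        std::vector<Vec3> localVerts;   // authored data, never changes
        std::vector<Vec3> worldVerts;   // cached copy shared with physics
        Mat4 objectToWorld;
        bool dirty = true;              // set whenever objectToWorld changes

        const std::vector<Vec3>& worldSpace() {
            if (dirty) {
                worldVerts.resize(localVerts.size());
                for (std::size_t i = 0; i < localVerts.size(); ++i)
                    worldVerts[i] = transformPoint(objectToWorld, localVerts[i]);
                dirty = false;
            }
            return worldVerts;   // renderer and collision both read this
        }
    };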

Another part that is hard to deal with is games that use the mouse to select objects. This can be done by grabbing the geometry after projection, before it is rendered, and then checking which polygon of which object is closest to the mouse, using simple math. The other way to deal with it (and I have tried this before) is ray casting: treat the mouse as a ray projected from the view plane through the view volume and see which polygon of which object it hits first. This also has different trade-offs in terms of redundant calculations: either transforming polygons to screen coordinates when OpenGL already does that, or, when casting a ray, transforming the ray into object space, which would be the fastest way but still requires some time setting up the matrices.
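For the ray version, GLU can at least build the ray: unproject the mouse position at the near and far planes and take the difference. A sketch (mouse coordinates are assumed to be window coordinates):

    #include <GL/glu.h>

    void mouseRay(int mouseX, int mouseY,
                  GLdouble origin[3], GLdouble dir[3])
    {
        GLdouble model[16], proj[16];
        GLint view[4];
        glGetDoublev(GL_MODELVIEW_MATRIX, model);
        glGetDoublev(GL_PROJECTION_MATRIX, proj);
        glGetIntegerv(GL_VIEWPORT, view);

        GLdouble wy = view[3] - mouseY;   /* GL's window y starts at the bottom */
        GLdouble fx, fy, fz;
        gluUnProject(mouseX, wy, 0.0, model, proj, view,
                     &origin[0], &origin[1], &origin[2]);            /* near */
        gluUnProject(mouseX, wy, 1.0, model, proj, view, &fx, &fy, &fz); /* far */
        dir[0] = fx - origin[0];
        dir[1] = fy - origin[1];
        dir[2] = fz - origin[2];
        /* from here, multiply the ray by an object's inverse matrix to
           intersect in object space, as described above */
    }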

My concern, as you can see, is with redundant calculations. If there are a number of redundant calculations, then this may not be as optimized a path as it seems. And I definitely agree that bandwidth is a big issue with video cards: I recall reading that the Xbox is capable of rendering very high polygon counts but will be limited by bandwidth, and that they recommend using subdividable surfaces (Béziers, NURBS, etc.) to push the bandwidth down and the polygon count up. At the price of a GeForce today, I’d say a number of people will have one in their computer some time in the future. So I’ll have to agree with you that caching is probably not the best route to go.
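As an aside, classic OpenGL can already evaluate Bézier patches itself through evaluators, so only the 16 control points of a patch cross the bus instead of a tessellated mesh; whether a given card accelerates this is another question. A sketch with placeholder control-point data:

    GLfloat ctrl[4][4][3];   /* fill with your 16 control points */
    glMap2f(GL_MAP2_VERTEX_3,
            0.0f, 1.0f, 3, 4,     /* u: stride 3 floats, order 4 */
            0.0f, 1.0f, 12, 4,    /* v: stride 12 floats, order 4 */
            &ctrl[0][0][0]);
    glEnable(GL_MAP2_VERTEX_3);
    glMapGrid2f(16, 0.0f, 1.0f, 16, 0.0f, 1.0f);  /* 16x16 tessellation */
    glEvalMesh2(GL_FILL, 0, 16, 0, 16);           /* draw the patch */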

I still have to deal with transparency sorting, though. It may not be so bad if I only use transparency minimally, in certain cases (in BSP trees and particle systems). I could also simply use additive blending for particle systems instead, but blending quality would be affected, and particle effects probably won’t look as good, or will be limited to certain kinds of particles.
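In OpenGL terms the trade-off looks like this; the additive path is the one that lets me skip sorting:

    glEnable(GL_BLEND);
    glDepthMask(GL_FALSE);   /* transparent polys shouldn't write depth */

    /* standard alpha blending -- needs back-to-front sorted polygons */
    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);

    /* additive blending -- order-independent, no sorting, but colors
       saturate toward white, so it only suits glow/fire style particles */
    glBlendFunc(GL_SRC_ALPHA, GL_ONE);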