Heavy slowdown when rendering front-to-back



Jan
03-08-2004, 09:25 AM
Hi there

Since the Advanced Forum still seems to be down, I'll post my question here. It isn't that advanced anyway.

So, I started a new engine. I began by rendering my sectors as simply as possible: sorted by texture and then brute-force, letting the GPU do the rest (back-face culling, depth testing, etc.).

Speed was as expected. For 6500 textured triangles (no shaders), rendered in 8 batches (8 texture switches), I got 190 FPS.

A z-only pass sped it up once I added shaders.
Then I thought I could improve the efficiency of that z-only pass by rendering it front-to-back. No problem: no textures, no color writes, only a few big batches, just the order of the indices changed.

However, instead of speeding up a bit, it slowed down from 125 FPS to 35!

This is all on a Radeon 9600XT.

I read ATI's SDK and found a passage which says that random vertex accesses are worse than sequential ones, because of the pre-T&L cache.
Still, a slowdown of 90 FPS? Is this still expected behaviour?
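
To get a feel for why index order matters to a vertex-fetch cache, here is a toy model in Python. It is only a sketch: the LRU policy, the 64-byte line size and the 16-line capacity are assumptions made for illustration, not the documented behaviour of the Radeon 9600XT, and the shuffled list is just a stand-in for a per-triangle depth sort. It counts how many memory lines have to be fetched for the same triangles in mesh order and in scattered order.

```python
from collections import OrderedDict
import random


def count_line_misses(indices, vertex_size=64, line_size=64, num_lines=16):
    """Count line misses for a stream of vertex fetches (toy LRU model).

    All cache parameters are hypothetical; real hardware may differ.
    """
    cache = OrderedDict()
    misses = 0
    for idx in indices:
        first = (idx * vertex_size) // line_size
        last = (idx * vertex_size + vertex_size - 1) // line_size
        for line in range(first, last + 1):
            if line in cache:
                cache.move_to_end(line)        # mark as most recently used
            else:
                misses += 1
                cache[line] = True
                if len(cache) > num_lines:
                    cache.popitem(last=False)  # evict least recently used
    return misses


def grid_triangles(width, height):
    """Triangles of a grid of width x height quads, in row-major mesh order."""
    tris = []
    stride = width + 1
    for y in range(height):
        for x in range(width):
            v0 = y * stride + x
            v1 = v0 + 1
            v2 = v0 + stride
            v3 = v2 + 1
            tris.append((v0, v1, v2))
            tris.append((v1, v3, v2))
    return tris


def flatten(tris):
    """Turn a list of index triples into a flat index list."""
    return [i for tri in tris for i in tri]


mesh_order = grid_triangles(40, 40)      # 3200 triangles
scattered = mesh_order[:]
random.Random(0).shuffle(scattered)      # stand-in for a per-triangle depth sort

misses_mesh = count_line_misses(flatten(mesh_order))
misses_scattered = count_line_misses(flatten(scattered))
print(misses_mesh, misses_scattered)
```

In this model the scattered order misses on almost every fetch, while the mesh order reuses most vertices while they are still cached. Whether that alone explains a drop from 125 to 35 FPS is a separate question.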

The SDK also says that aligning data to 32 bytes will improve random-access speed. My vertex data is 64 bytes per vertex. I use VBOs, so the driver should be able to align it very well, no?
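
The alignment point can be checked with a little arithmetic. The sketch below assumes memory is fetched in 32-byte blocks (taken from the SDK's alignment figure; the actual fetch granularity of the hardware is an assumption) and counts how many blocks a 64-byte vertex overlaps, depending on where it starts.

```python
def lines_touched(offset, vertex_size=64, line_size=32):
    """Number of line_size-byte blocks overlapped by a vertex starting at offset."""
    first = offset // line_size
    last = (offset + vertex_size - 1) // line_size
    return last - first + 1


aligned = lines_touched(0)      # starts on a 32-byte boundary: 2 blocks
misaligned = lines_touched(16)  # starts 16 bytes past a boundary: 3 blocks
print(aligned, misaligned)
```

So a 64-byte vertex whose buffer starts on a 32-byte boundary never straddles an extra block, which is presumably what the driver arranges for a VBO.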

I don't understand this heavy slowdown. Anyway, BSP trees seem to lose their advantage in 3D rendering because of the heavy cache misses they cause.

Jan.

dorbie
03-09-2004, 12:49 PM
Remember you're introducing other overheads, for example those texture state changes that you can no longer sort for. It's not entirely clear at what level you depth-sorted. It seems like it may have been at primitive level at best, and you know that's going to be expensive.

Jan
03-09-2004, 02:01 PM
No, in a z-only pass I don't use any textures. I still use the same number of glDrawRangeElements calls, etc. Everything is the same, only the order of the elements has changed. So there is no other overhead, except for the random vertex access.

Jan.
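
The question raised in this thread about the level at which the depth sort happens can be made concrete. The following is a minimal sketch of sorting at batch level rather than primitive level: whole batches are ordered by the distance of their centre from the camera, and the index list inside each batch keeps its original, cache-friendly order. The batch layout, the `center` field and the function name are all hypothetical and not taken from the engine discussed above.

```python
import math


def sort_batches_front_to_back(batches, camera_pos):
    """Order whole batches by centre distance; indices inside a batch are untouched."""
    return sorted(batches, key=lambda b: math.dist(b["center"], camera_pos))


# Hypothetical batches; each would map to one glDrawRangeElements call.
batches = [
    {"name": "far_wall", "center": (0.0, 0.0, -50.0), "indices": [0, 1, 2, 2, 1, 3]},
    {"name": "near_floor", "center": (0.0, -1.0, -2.0), "indices": [4, 5, 6, 6, 5, 7]},
    {"name": "mid_pillar", "center": (3.0, 0.0, -10.0), "indices": [8, 9, 10]},
]

ordered = sort_batches_front_to_back(batches, camera_pos=(0.0, 0.0, 0.0))
draw_order = [b["name"] for b in ordered]
print(draw_order)
```

This gives only a coarse front-to-back order, but it leaves the vertex access pattern within each draw call exactly as it was before sorting.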