VAR - good on GF2mx, bad on GF3ti

I have a large(ish) heightmap (about 1000x1000 verts) which I am rendering using VAR.

On my GF2mx, I get a hefty speed increase (around 3x faster), but on a GF3Ti500 the performance difference between VAR and regular drawElements calls is negligible (with VAR coming out slightly slower).

I am not using fences yet (since everything fits into AGP mem). Could this be the source of the issue?

Has anyone had similar problems?

Are the GF2 and GF3 on the same PC?

No, the GF3 machine is slightly faster (only a few MHz in it). Other than that, the only real difference is the GF3.

It seems very odd that VAR gives such a huge gain in one instance and then a performance loss in the GF3 case.

I am going to try the GF3 in the slower machine soon… maybe it is just the extra beef in the current GF3 system (but I doubt it).

It could be differences in memory bandwidth affecting the result, and/or AGP configuration.

The cause is most definitely differences between the two systems.

Nutty

Check if you have your motherboard drivers installed (for AGP access) on the PC with the GF3. Check also your BIOS settings.

I don't think it is a hardware problem, since there is a visible improvement in performance in the learning_var sample on the GF3 machine when VAR is turned on.

I think opla is correct.

Maybe the AGP aperture size is too small on the gf3 machine to handle your terrain. But it could be enough for the learning_var sample.

Just a thought.
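
To put a rough number on the aperture idea, here is a minimal sketch that estimates how much memory the terrain's vertex data needs. The vertex layout (position, normal, one texcoord set) and the budget values are assumptions for illustration only; the original post does not say what the vertex format is, and the amount of AGP memory the driver will actually hand out is not necessarily the full aperture size set in the BIOS.

```cpp
#include <cstddef>

// Hypothetical vertex layout (NOT taken from the original post):
// 3 floats position + 3 floats normal + 2 floats texcoord = 32 bytes.
struct TerrainVertex {
    float position[3];
    float normal[3];
    float texcoord[2];
};
static_assert(sizeof(TerrainVertex) == 32, "expected a tightly packed 32-byte vertex");

// Bytes needed to store a width x height grid of vertices with the given stride.
constexpr std::size_t vertexBufferBytes(std::size_t width,
                                        std::size_t height,
                                        std::size_t strideBytes) {
    return width * height * strideBytes;
}

// True if the vertex data fits inside a memory budget given in bytes.
// The budget is whatever you believe is available; it is an input, not queried.
constexpr bool fitsInBudget(std::size_t neededBytes, std::size_t budgetBytes) {
    return neededBytes <= budgetBytes;
}
```

With this assumed layout, a 1000x1000 heightmap needs 32,000,000 bytes (about 30.5 MB), so it would fit a 64 MB budget but not a 16 MB one. A fatter vertex format changes the result proportionally.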

Do swap the cards and run the tests as well. If VAR is behaving differently it is more likely due to the mobo chipset than the GF2/GF3 difference.

One additional thing is, make sure you’re using indices of GL_UNSIGNED_SHORT for your drawing calls. GF2 only supports that size, so the driver forces ints to shorts. GF3 supports both ints and shorts as indices, so if you send in ints, that’s what gets sent to the hardware.

Thanks -
Cass
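
One practical wrinkle with the short-index advice: a 1000x1000 grid has 1,000,000 vertices, and a 16-bit index can only address 65,536 of them. A rough sketch of one way around that is to split the terrain into horizontal bands of full-width rows and build indices relative to each band's first vertex. The function names here are made up for illustration, and the sketch assumes row-major vertex storage and that you re-point the vertex array at the start of each band before drawing it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Largest number of full vertex rows whose vertices can all be addressed
// with 16-bit indices (0..65535) when each row holds `width` vertices.
int maxRowsPerBand(int width) {
    assert(width >= 2 && width <= 32768);  // need at least 2 rows per band
    return 65536 / width;
}

// Triangle-list indices for a band of `rows` vertex rows, each `width`
// vertices wide. Indices are relative to the band's first vertex, so the
// caller must offset the vertex pointer to that vertex before drawing.
std::vector<std::uint16_t> buildBandIndices(int width, int rows) {
    assert(width >= 2 && rows >= 2 && rows * width <= 65536);
    std::vector<std::uint16_t> indices;
    indices.reserve(static_cast<std::size_t>(rows - 1) * (width - 1) * 6);
    for (int r = 0; r + 1 < rows; ++r) {
        for (int c = 0; c + 1 < width; ++c) {
            const int i0 = r * width + c;  // top-left of the quad
            const int i1 = i0 + 1;         // top-right
            const int i2 = i0 + width;     // bottom-left
            const int i3 = i2 + 1;         // bottom-right
            // Two triangles per quad.
            indices.push_back(static_cast<std::uint16_t>(i0));
            indices.push_back(static_cast<std::uint16_t>(i2));
            indices.push_back(static_cast<std::uint16_t>(i1));
            indices.push_back(static_cast<std::uint16_t>(i1));
            indices.push_back(static_cast<std::uint16_t>(i2));
            indices.push_back(static_cast<std::uint16_t>(i3));
        }
    }
    return indices;
}
```

For a 1000-wide terrain this gives bands of up to 65 vertex rows (64 rows of quads). Adjacent bands share their boundary row, so band k would start at vertex row k * 64, and the same index list can be reused for every full-height band.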

> One additional thing is, make sure you’re
> using indices of GL_UNSIGNED_SHORT for
> your drawing calls. GF2 only supports
> that size, so the driver forces ints to
> shorts.

When I profiled this, almost a year ago, it was faster to use UNSIGNED_INT indices instead of UNSIGNED_SHORT indices. When I used UNSIGNED_SHORT, some percent of my driver time would be spent in a loop converting shorts to longs, and another equal percentage would be spent converting longs to shorts. When I sent in longs, only the second loop was present in the profile.

Granted, this could have changed, and it might make sense to re-run the profile.