NVIDIA perf issue with compressed textures
I'm running an application on two different systems: a laptop with an NVIDIA GF 9800M (Win7 32-bit, Core 2 Duo, 4GB RAM) and a desktop with an NVIDIA GF 470 (Win7 64-bit, i7, 6GB RAM). You'd think the app would run just as well, or even faster, on the more powerful desktop system. Instead, it runs significantly slower there: about 6 FPS on the laptop, and less than 0.5 FPS on the desktop.
(The problem was first reported to me by a client using a Quadro card. I'm waiting to get system specs from them, and will follow up here when I have that info.)
The app is OpenSceneGraph-based, so it opens an OpenGL 2.1 context. The dataset consists of 6 million vertices, all texture mapped, rendered as either triangles or tri strips. There is over a GB of texture data in DXT1 format, a mix of RGB and RGBA textures, stored in .dds files.
When the app comes up in a default view, the framerate is OK on both systems. But as I move the eyepoint closer to the model, the framerate suddenly drops significantly on the desktop. Note that OpenSceneGraph does some small-feature culling, so it's likely that, as the eye moves closer, parts of the dataset begin to render that previously didn't, probably requiring more texture data to be resident in GPU RAM. At least, that was my theory. But the fact that the problem is completely absent on the 9800M seems to contradict this.
The problem goes away if I disable texture mapping entirely. The performance is also acceptable if I render to a smaller window, but this doesn't necessarily imply a fill limitation, as OSG's small-feature culling renders less data in this case.
My laptop is running 260.99 drivers. On my desktop, both 260.89 and 275.33 exhibit the issue. The current release, 280.26, seems to make the issue worse: the performance is poor right from the first frame, while the application is still in the initial view position.
Does anyone have any suggestions for how I can move forward to work around this issue? I'm considering converting all the textures to non-compressed to see if that changes the behavior, or limiting their mipmap base level size.
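To get a rough sense of what those two options would do to the texture memory footprint, here is a back-of-the-envelope sketch in C++. It only does arithmetic: DXT1 stores each 4x4 texel block in 8 bytes, and I'm assuming uncompressed textures end up as 4 bytes per texel (RGBA8), which may not match what the driver actually allocates. The function names are my own, and the 2048x2048 size used below is a made-up example, not a measurement from my dataset.

```cpp
#include <algorithm>
#include <cstdint>

// Bytes for one mip level stored as DXT1: 8 bytes per 4x4 texel block,
// with each dimension rounded up to a whole number of blocks.
std::uint64_t dxt1LevelBytes(std::uint32_t w, std::uint32_t h) {
    std::uint64_t blocksWide = (w + 3) / 4;
    std::uint64_t blocksHigh = (h + 3) / 4;
    return blocksWide * blocksHigh * 8;
}

// Bytes for one mip level stored uncompressed, assuming 4 bytes per texel.
std::uint64_t rgba8LevelBytes(std::uint32_t w, std::uint32_t h) {
    return static_cast<std::uint64_t>(w) * h * 4;
}

// Sum the mip chain of a w x h texture down to 1x1, counting only the
// levels at or below `baseLevel` (0 = keep everything, 1 = drop the
// largest level, and so on).
template <typename LevelFn>
std::uint64_t mipChainBytes(std::uint32_t w, std::uint32_t h,
                            unsigned baseLevel, LevelFn levelBytes) {
    std::uint64_t total = 0;
    for (unsigned level = 0; ; ++level) {
        if (level >= baseLevel) {
            total += levelBytes(w, h);
        }
        if (w == 1 && h == 1) {
            break;  // reached the smallest mip level
        }
        w = std::max<std::uint32_t>(1, w / 2);
        h = std::max<std::uint32_t>(1, h / 2);
    }
    return total;
}
```

For the hypothetical 2048x2048 texture, mipChainBytes(2048, 2048, 0, dxt1LevelBytes) comes to 2,796,216 bytes, and dropping the top level (baseLevel = 1) brings that down to 699,064 bytes. The same texture as RGBA8 with a full mip chain would be 22,369,620 bytes, so if the problem really is about how much texture data has to be resident, converting to uncompressed should make it far worse, while skipping the top mip level should cut the footprint to about a quarter. On the GL side the second option would correspond to not uploading level 0 and setting GL_TEXTURE_BASE_LEVEL accordingly.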
But, ultimately, I'd really like to know why my older, slower GF 9800M performs better with this dataset than my newer, more expensive GF 470...
Thanks for any help.