My new GF4 slower than my old GF3!?!

Well, before I pull out ALL of my hair, perhaps I should ask here…

I am working on a mapping application for a military training product. The code was working great on the GeForce3 (Visiontek) I’ve been using for the past eight months or so.

I was just today handed an eVGA GeForce4 Ti4600 to test in its place.

I installed it and ran my app. It was terrible. I went from 50-60 fps down to 1 fps or worse.

The app is NOT optimized (there’s plenty of optimization left to do), but it was running great on the GF3 with a representative data set. When I use the exact same data set on the new card, performance goes into the toilet. SO… I reinstalled the older GF3 drivers to see if that would make a difference. Nothing.

GF3 Driver: 15.70 (win2k)
GF4 Driver: 28.32 (win2K/XP)

The app creates a winding GL_QUAD_STRIP which splits the entire earth into quads 6 degrees wide by 8 degrees tall (except at the poles, where they are 2 degrees tall), texture mapped with a single 512 x 512 low-res texture using a cylindrical (?) projection. Actually there are TWO quad strips, one for the northern hemisphere and one for the southern.

Then bitmaps and geometry info for smaller areas of the earth, say Phoenix, Arizona or Northwest Poland, are read in and individual quads are defined for them. Each of these quads is textured with a unique 256 x 256 24-bit texture. The whole thing is stored inside a single display list (with surface normals for lighting).
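
Roughly speaking, the strip for one hemisphere gets built something like this (a simplified sketch with made-up names, not my actual code):

[code]
/* Simplified sketch of building one hemisphere's strip inside a
 * display list.  buildNorthernHemisphere and the unit-sphere mapping
 * are made up for illustration. */
#include <windows.h>
#include <GL/gl.h>
#include <math.h>

#define DEG2RAD(d) ((d) * 3.14159265358979 / 180.0)

static void earthVertex(double lonDeg, double latDeg)
{
    /* lat/lon -> unit sphere, with a simple cylindrical texture mapping */
    double lon = DEG2RAD(lonDeg), lat = DEG2RAD(latDeg);
    double x = cos(lat) * cos(lon);
    double y = cos(lat) * sin(lon);
    double z = sin(lat);

    glNormal3d(x, y, z);                      /* surface normal for lighting */
    glTexCoord2d((lonDeg + 180.0) / 360.0,    /* u follows longitude */
                 (latDeg + 90.0)  / 180.0);   /* v follows latitude  */
    glVertex3d(x, y, z);
}

GLuint buildNorthernHemisphere(GLuint lowResTex)
{
    GLuint list = glGenLists(1);
    double lat, lon, latTop;

    glNewList(list, GL_COMPILE);
    glBindTexture(GL_TEXTURE_2D, lowResTex);   /* the single 512 x 512 texture */

    /* 8-degree-tall rows, each winding around in 6-degree steps;
     * the last row is clamped at the pole. */
    for (lat = 0.0; lat < 90.0; lat += 8.0) {
        latTop = (lat + 8.0 > 90.0) ? 90.0 : lat + 8.0;
        glBegin(GL_QUAD_STRIP);
        for (lon = -180.0; lon <= 180.0; lon += 6.0) {
            earthVertex(lon, latTop);
            earthVertex(lon, lat);
        }
        glEnd();
    }

    glEndList();
    return list;
}
[/code]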

The app reports that the individual textured quads number fewer than 500, and the quads in the quad strip work out to around 1440.

Once again, it ran fantastic on my GF3 with 64 MB of RAM, but the GF4 Ti 4600 with 128 MB is SUCKING on it, bad.

Am I being naive here? I can answer any questions you may have about specifics such as lighting, GL_TEXTURE_ENV modes, whatever.

I got the GF4 mainly for the added memory because the more I have the more of the earth I can map in hi res. I am not doing anything with vertex shaders, pixel shaders, or any advanced features or extensions at all.

Thanks in advance for any light you may shed on this problem.

MikeM

Wow, opportunities for my first two posts both in the same day. What are the odds?

Actually, this probably won’t solve your problem, but something similar happened to me recently so I thought I’d offer it up. We had loaded a proof-of-concept type application for a demo on the computer provided. To our horror it barely ran at all. Same video card, processor, etc. as our development machine. Eventually we discovered the bit depth was set lower than we usually used. I believe it was 16bpp instead of 32bpp. Setting it to the usual depth brought our frame rate back up.
As I recall, it wasn’t exactly OpenGL related, but rather a section of code that copied texture data to a memory device context. When the formats were both effectively BGRA it was fast, but since the memory device context was created compatible with the display, the app had to convert everything down to 16bpp, hence the drop in speed.
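
Something like this would have flagged it for us ahead of time (just a sketch, not our actual code):

[code]
/* Illustrative only: a memory DC created "compatible with the display"
 * inherits the desktop's bit depth, so pushing 32bpp BGRA data into it
 * forces a pixel-format conversion when the desktop runs at 16bpp. */
#include <windows.h>
#include <stdio.h>

void reportDesktopDepth(void)
{
    HDC screen = GetDC(NULL);                       /* DC for the whole screen */
    int bpp    = GetDeviceCaps(screen, BITSPIXEL) *
                 GetDeviceCaps(screen, PLANES);     /* effective bits per pixel */
    ReleaseDC(NULL, screen);

    if (bpp < 32)
        printf("Desktop is %dbpp - copies of 32bpp data will be converted\n", bpp);
    else
        printf("Desktop is %dbpp\n", bpp);
}
[/code]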

Now that I’ve typed all this, it’s probably going to turn out to be a driver issue or something. Still, someone someday might find this useful…

Chris Bond

Chris,

You don’t think your response is gonna help me, yet you STILL WASTE MY FREAKING TIME with all that crap!

You got SOME NERVE!

Sheesh!

Just kidding! Your idea about bit depth was dead on. I changed from 32bpp to 16bpp and it runs like a champ again! Thanks a ton!

MikeM

If nothing else has changed in your code, the dramatic performance drop makes it appear as though your app is using a software renderer.

Just out of curiosity, Mike, when you start up your application could you print out the vendor, renderer, and version strings for the current OpenGL implementation? You might get a surprise.
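
Something like this, called once the rendering context is current, will do it (a minimal sketch):

[code]
/* Call this after wglMakeCurrent() has succeeded.  If GL_RENDERER comes
 * back as "GDI Generic" you are on Microsoft's software implementation,
 * not the NVIDIA driver. */
#include <windows.h>
#include <GL/gl.h>
#include <stdio.h>

void printGLInfo(void)
{
    printf("GL_VENDOR   : %s\n", (const char *) glGetString(GL_VENDOR));
    printf("GL_RENDERER : %s\n", (const char *) glGetString(GL_RENDERER));
    printf("GL_VERSION  : %s\n", (const char *) glGetString(GL_VERSION));
}
[/code]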

Another piece of advice: even though there are often a lot of uninformed or somewhat off-topic replies to posts, you don’t encourage people who might have some insight into your problem by responding so negatively. I don’t think any of us wish to foster that sort of environment.

I frequently get misunderstood when I try to use my signature caustic sarcasm!

It was sort of a barb at the precipitous drop in manners around here lately. Totally meant in a joking manner!

I thought it might be funny for folks to read the first few lines and say… ‘oh no, here we go again…’

Anyway, it bombed!

Thanks Chris, again, for your quick reply and great suggestion. Let me know if I can do anything for you in return.

And heath, thanks for the other info. I will try it and post the results sometime tomorrow.

Thanks again!
MikeM

Hey, heath, give the guy a chance!

Originally posted by MikeM:
[b]Chris,

You don’t think your response is gonna help me, yet you STILL WASTE MY FREAKING TIME with all that crap!

You got SOME NERVE!

Sheesh!

Just kidding! Your idea about bit depth was dead on. I changed from 32bpp to 16bpp and it runs like a champ again! Thanks a ton!

MikeM[/b]

Perfectly fine answer. I could almost feel the relief of a problem solved, stress levels dropping and stuff. I like it.

So what was the cause? I thought GF3/4s were optimized for 32-bit textures?

Yeah, what WAS the cause? I’m looking into it this morning. My first guess was that with 16-bit textures I was just under the memory limit, but with 32-bit textures I was over it and everything was being paged between system and GPU memory every frame. But… I always thought that the pixel format of the textures being loaded was absolute, based on the call to gluBuild2DMipmaps() or glTexImage2D(), and not dependent on the pixel format of the HGLRC.
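
For what it’s worth, my texture loads look roughly like this (a sketch, not the real loader code), and my understanding has been that the internal format argument is what pins down the storage:

[code]
/* Sketch of a texture load, not my actual loader.  The second argument
 * is the *internal* format: a generic GL_RGB leaves the stored precision
 * up to the driver, while a sized format such as GL_RGB8 (8 bits per
 * channel) or GL_RGB5 (5 bits per channel) requests it explicitly -
 * the driver still picks the closest thing it supports. */
#include <windows.h>
#include <GL/gl.h>
#include <GL/glu.h>

void uploadAreaTexture(const unsigned char *pixels)   /* 256 x 256 24-bit BGR data */
{
    gluBuild2DMipmaps(GL_TEXTURE_2D,
                      GL_RGB8,            /* requested internal storage  */
                      256, 256,
                      GL_BGR_EXT,         /* layout of the source pixels */
                      GL_UNSIGNED_BYTE,
                      pixels);
}
[/code]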

Anyway, that wouldn’t explain why it worked fine on the GF3 with a 32bpp display color depth.

(BTW, none of this is full screen. It is a windowed app)

One other thing that may or may not have any bearing on this… When I make the call to gluBuild2DMipmaps() I found through trial and error that I must set the ‘format’ parameter to GL_BGR_EXT, although the documentation (Red Book) only mentions GL_BGR. Does anyone know why the Microsoft implementation renames this define, and whether it does the same thing as GL_BGR?

Thanks again for your assistance.

MikeM


Re: BGR vs BGR_EXT

The MS headers are for OpenGL 1.1, which did not include BGR as a supported format in the base specification. The BGR enumerant came from an extension that was rolled into the core specification in 1.2 (?)
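
The two names refer to the same enumerant value, so a fallback define along these lines lets the same code build against either set of headers (typed from memory, so double-check the value against your glext.h):

[code]
/* The MS 1.1 headers only define GL_BGR_EXT; newer headers define
 * GL_BGR.  Both names refer to the same enumerant value, so a small
 * fallback lets the same upload code build against either. */
#include <windows.h>
#include <GL/gl.h>

#ifndef GL_BGR
#define GL_BGR 0x80E0    /* same value as GL_BGR_EXT */
#endif
[/code]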

What depth is your desktop set to? I’ve never tried it, but using a 16bpp window context on top of a 32bpp desktop (or vice versa) might cause some performance issues.

Have fun,
– Jeff

It may be that some other framebuffer property was requested which wasn’t available in hardware on the GF4 at 32bpp. Stencil + alpha + a big depth buffer, for example (this isn’t a REAL example, just an illustration of what I mean). I’m still surprised by the result though.

I did some testing…

I found that when the Windows desktop color depth is set to 32bpp, I get around 75 pixel formats available in the system, and 36 of the top 39 are 32bpp, hardware-accelerated, double-buffered, draw-to-window formats.

When the Windows desktop color depth is set to 16bpp, I get 50 unique formats; some of them are even 32bpp color depth, but all of THOSE are software rendered, not hardware accelerated.

The pixel format scoring system I use heavily emphasizes Draw-to-Window and Double-Buffering, with a strong bias toward color depths >= 16bpp.
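
The enumeration behind that scoring is nothing fancy; it boils down to roughly this (a trimmed-down sketch, not the exact code):

[code]
/* Trimmed-down sketch of enumerating a window DC's pixel formats.
 * A format with PFD_GENERIC_FORMAT set and PFD_GENERIC_ACCELERATED
 * clear is Microsoft's software renderer. */
#include <windows.h>
#include <stdio.h>

void logPixelFormats(HDC hdc)
{
    PIXELFORMATDESCRIPTOR pfd;
    int i, count;
    const char *accel;

    /* Passing NULL for the descriptor just returns the format count. */
    count = DescribePixelFormat(hdc, 1, sizeof(pfd), NULL);

    for (i = 1; i <= count; ++i) {
        if (!DescribePixelFormat(hdc, i, sizeof(pfd), &pfd))
            continue;

        if (!(pfd.dwFlags & PFD_GENERIC_FORMAT))
            accel = "ICD (hardware)";
        else if (pfd.dwFlags & PFD_GENERIC_ACCELERATED)
            accel = "MCD (partly hardware)";
        else
            accel = "software";

        printf("format %2d: %2d bpp color, %2d bpp depth, %s%s%s\n",
               i, pfd.cColorBits, pfd.cDepthBits, accel,
               (pfd.dwFlags & PFD_DOUBLEBUFFER)   ? ", double buffered" : "",
               (pfd.dwFlags & PFD_DRAW_TO_WINDOW) ? ", draw to window"  : "");
    }
}
[/code]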

I’ve now added code to do view frustum culling of large chunks of textured geometry, so performance is screaming again. I still don’t know what caused the slowdown, but I’ve also added a logger that writes the details of the selected pixel format to a file every time the app runs, so if I run into this again I can tell whether I slipped into a software renderer, instead of blaming the hardware.
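
The culling itself is nothing exotic either; it amounts to something like this (a rough sketch with made-up names, assuming each chunk keeps a precomputed bounding sphere in the same coordinates as the geometry):

[code]
/* Sketch: extract the six frustum planes from the current projection
 * and modelview matrices, then skip any chunk whose bounding sphere
 * lies entirely behind one of them. */
#include <windows.h>
#include <GL/gl.h>
#include <math.h>

static double planes[6][4];     /* a, b, c, d for each frustum plane */

void extractFrustumPlanes(void)
{
    double proj[16], mv[16], clip[16];
    int row, col, k, p;

    glGetDoublev(GL_PROJECTION_MATRIX, proj);
    glGetDoublev(GL_MODELVIEW_MATRIX, mv);

    /* clip = projection * modelview (column-major, as OpenGL stores them) */
    for (col = 0; col < 4; ++col)
        for (row = 0; row < 4; ++row) {
            clip[col * 4 + row] = 0.0;
            for (k = 0; k < 4; ++k)
                clip[col * 4 + row] += proj[k * 4 + row] * mv[col * 4 + k];
        }

    /* Rows of the clip matrix combine into the plane equations:
     * left/right use row 0, bottom/top row 1, near/far row 2,
     * each added to or subtracted from row 3. */
    for (p = 0; p < 3; ++p)
        for (k = 0; k < 4; ++k) {
            planes[2 * p][k]     = clip[k * 4 + 3] + clip[k * 4 + p];
            planes[2 * p + 1][k] = clip[k * 4 + 3] - clip[k * 4 + p];
        }

    /* Normalize so the distance test below works in world units. */
    for (p = 0; p < 6; ++p) {
        double len = sqrt(planes[p][0] * planes[p][0] +
                          planes[p][1] * planes[p][1] +
                          planes[p][2] * planes[p][2]);
        for (k = 0; k < 4; ++k)
            planes[p][k] /= len;
    }
}

/* Returns 0 if a sphere at (x,y,z) with the given radius is completely
 * outside the frustum, 1 otherwise. */
int sphereVisible(double x, double y, double z, double radius)
{
    int p;
    for (p = 0; p < 6; ++p)
        if (planes[p][0] * x + planes[p][1] * y +
            planes[p][2] * z + planes[p][3] < -radius)
            return 0;
    return 1;
}
[/code]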

Man, I would HATE to work at nVidia. You create something as amazing as their cards are, and you STILL have lunkhead people (like me) constantly boobing about them…

MikeM