Getting lousy frame rates...

I appreciate in advance any help that I can get on this topic because it is frustrating me to no end…

I have used the NeHe model for my OpenGL application, and I placed a frame rate counter in it to measure how much the application slows down each time I add an object to the display. For some reason the screen runs at close to 75 FPS when no object is displayed, and then it steps down considerably (with noticeable chop when changing the projection matrix to move the ‘camera’) once I add a simple object to the screen. When I add a second object the scene slows considerably again.

I tried running my application in the Visual C++ debugger to see what could be causing the slowdown. From what I could tell, the only call causing a considerable performance hit was the native SwapBuffers function. I was getting close to 600 fps when that line was commented out. Once I put the call back in, performance dropped to less than 30 fps.

Seeing as I am running on a fast machine with a well-accelerated graphics card, I think I should be seeing faster frame rates than this. Could someone please let me know how I might speed up my OpenGL performance?
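(For reference, a frame counter of the sort described above can be as simple as the sketch below. The Win32 timer call and the variable names are assumptions for illustration only, not the actual code from the application.)

// Minimal sketch of a per-second frame counter around the buffer swap.
#include <windows.h>
#include <stdio.h>

static DWORD lastTime = 0;
static int   frames   = 0;

void SwapAndCount(HDC hDC)
{
    SwapBuffers(hDC);              // the call under suspicion in this thread
    frames++;

    DWORD now = GetTickCount();
    if (now - lastTime >= 1000)    // one second elapsed
    {
        printf("FPS: %d\n", frames);
        frames   = 0;
        lastTime = now;
    }
}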

Thank You,
Jason

Let me guess… you have a Voodoo card and your application is running in a window?

I may be totally wrong, but I seem to recall that Voodoo cards are not able to render into a window, only full screen. So one possibility is that your program renders everything with the OpenGL software rasterizer.
Another possibility, given what you say about swapping the buffers, is that it’s simply the cost of double buffering in windowed mode.
In fullscreen, swapping the front and back buffers only requires exchanging two memory addresses (the visible and invisible areas of video memory). In windowed mode, the back buffer has to be copied into the front buffer and displayed in the window (which will probably even involve a GDI operation, urk…).
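(As a rough illustration of what is being asked of the driver, here is the usual NeHe-style double-buffered pixel format request. This is a sketch, not code from the thread; PFD_SWAP_EXCHANGE is only a hint, and in windowed mode the driver is free to copy the back buffer instead of flipping pages.)

// Sketch: request a double-buffered pixel format (Win32 / wgl).
// hDC is assumed to come from GetDC() on the application window.
PIXELFORMATDESCRIPTOR pfd = {0};
pfd.nSize      = sizeof(pfd);
pfd.nVersion   = 1;
pfd.dwFlags    = PFD_DRAW_TO_WINDOW | PFD_SUPPORT_OPENGL |
                 PFD_DOUBLEBUFFER   | PFD_SWAP_EXCHANGE;   // hint: prefer page flipping
pfd.iPixelType = PFD_TYPE_RGBA;
pfd.cColorBits = 16;
pfd.cDepthBits = 16;

int format = ChoosePixelFormat(hDC, &pfd);
SetPixelFormat(hDC, format, &pfd);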

My suggestion: Try running it fullscreen and see if that helps any.

Dodger,

Thank you for your reply! I have been on so many forums where people either flame or don't reply on topic. Thank you again...

Unfortunately, I am not running a Voodoo card. I am running an NVIDIA TNT2-based graphics card with 32 MB of video RAM. I have also tried running at fullscreen (as shown in the NeHe tutorials) and this has not cleared up the problem. In fact I am unsure if the NeHe model for fullscreen is complete; I get desktop artifacts (like my task bar) that show through my window. I will keep trying different things and see what I can get. Keep on posting though...any help is appreciated.

Thank You,
Jason

Actually, even if you do have a Voodoo-based card, you should be able to use the newest drivers for it and get fairly good framerates even in windowed mode. In fact, on my Voodoo3 I get higher windowed FPS than full screen in NeHe’s apps.

One thing: is your TNT AGP or PCI? Also, do you have any other video cards in the system? If so, you need to go into Display Properties and disable them, since you’re not doing any driver selection in-app. This will ensure automatic hardware acceleration. Also, make sure you’re building your code using the “Release” configuration in VC++ rather than Debug when testing actual speed, as this can have a HUGE effect on the performance of your app. Of course, you probably know this already.

We might be able to help you more if you would post your machine specs (CPU, RAM, video card type, etc.) and also what kind and number of objects you’re adding (textured/colored, etc.). Also, when you said you get 300 FPS un-buffered, is this even while rendering an object? If you’re loading textures, be sure you’re loading them before the rendering loop so that you don’t have disk I/O with each frame rendered. Hope some of this helps.

OK, maybe some of this will help out more…

Machine specs :

AMD Athlon 750MHZ
128 MB RAM
TNT2 Video Card w/32 MB RAM AGP (only card in system)

Objects :

Approx 200 triangles per object in textured mode (although I get the same performance in wireframe mode)

All objects are loaded into display lists, and the textures are loaded only once at application initialization and applied to the object in the compiled display list

Window :

640x480x16 window running in either windowed or fullscreen mode (I found that both gave the same results)

When I mentioned earlier that I was getting 300 fps when running without the SwapBuffers command, what I meant was that my entire rendering loop was running and I had commented out the SwapBuffers line of code. When I ran the code in this manner I was getting 300 iterations per second of my main code with the render loop intact. The only thing I was not getting was the update to the display, because I was not swapping the back and front buffers. When I ran my code full steam (with the SwapBuffers call back in place) the iteration count dropped to the above-mentioned values.

What I was trying to do was measure the amount of time my code required to execute and remove individual lines in order to determine what was causing the holdup. The reason I was targeting the SwapBuffers command was that it was the part that, when removed, allowed my code to run at acceptable speeds. If there is any other place that I should be looking please let me know.
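(One way to narrow this down without commenting lines out is to time the suspect call directly. This is a sketch using the Win32 high-resolution counter, not code from the thread; hDC is an assumed device context handle.)

// Sketch: measure how long the SwapBuffers call itself takes.
LARGE_INTEGER freq, t0, t1;
QueryPerformanceFrequency(&freq);

QueryPerformanceCounter(&t0);
SwapBuffers(hDC);
QueryPerformanceCounter(&t1);

double ms = 1000.0 * (double)(t1.QuadPart - t0.QuadPart) / (double)freq.QuadPart;
printf("SwapBuffers took %.2f ms\n", ms);   // ~13 ms would mean waiting on a 75 Hz vsync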

I hope that this information is what you needed. Thank you for your help.

Thank You,
Jason

Hmmmm… Well, if you were rendering the same scene to the back buffer and the only difference between 300 ips (iterations per second) and 30 fps was that you were swapping the buffers, I would tend to agree that it looks as if, instead of swapping the memory pointers, it’s actually trying to bitblt the content from the back buffer to the front buffer. The only problem with that is that you should still be able to perform such a blt on a 640x480 surface and maintain reasonable framerates, especially on the aforementioned hardware!

I could understand the rendering itself causing a problem with a lot of polygons and texturing without backface culling, but from what you’ve said you were doing the very same rendering when you got 300 ips, only without the buffer swapping. So since the rendering doesn’t seem to be the holdup, it would seem that somehow it’s trying to do some sort of slower “swapping” operation (such as a bitblt), and perhaps it’s trying to do it in an unaccelerated manner. As to why your app would not be taking advantage of acceleration, I don’t know… And yet it seems that it IS using hardware acceleration, because I seriously doubt you’d be able to render to the back buffer at 300 fps using the software renderer.

One question: do NeHe’s unmodified apps exhibit this same behaviour?

Punchy,

Thank you for your attention to this subject. I am trying to get a working base for a graphics engine that I am starting, and I would hate to have to scrap the code I have written to switch over to DirectX (which is much more of a pain to work with than OpenGL, even though I have seen much better performance from it than from my current code).

The answer to your question is yes. Using the same iteration counter I tried doing the same thing in one of NeHe’s applications. First I tried using the empty app and I got a constant fps rate. Then I tried in one of his later tutorials and I experienced the same situation.

When running the base NeHe application (nothing displayed) with the swap buffers I was experiencing 74fps solid.

When running one of NeHe’s later tutorials (rendering a textured box to the display) I was cut down to approx. 30fps which is unacceptable for a game engine.

I have tried this on my computer here at work as well as on the one I have at home. Both machines are fast…one is the above-mentioned Athlon and the other is a Pentium III ~500MHz machine that also has a TNT-based graphics card in it. Both gave me similar results. I know from the OpenGL Super Bible that SwapBuffers is a native Windows GDI function that can be slow, but without any other alternative for back buffer swapping I am at a loss. I do not see how companies get 60+ fps in their games while using OpenGL…and I have SEEN those results running games like Counterstrike on my home PC. Is there something I am missing?

Thank You,
Jason

I say don't worry about it when you're beginning with GL; it's a very variable thing. E.g., take the last two screenshots on this page http://members.xoom.com/myBollux
They both run at the same fps even though one is drawing less than 100 triangles and the other is drawing more than 11,000 under götterdammerung.


What drivers are you using for your TNT2? Are you using the Detonator 3 drivers? My work machine is a PIII 600 with a TNT2 M64 and 128 MB of RAM. My engine’s code is based more or less on NeHe’s code and I get a framerate of ~160 fps with a 900-triangle model that is textured and lit. I have never used my fps counter code in NeHe’s stuff, so I don’t know what the exact performance is (I will test this out on Monday first thing). FYI: the M64 model of the TNT2 uses memory that is half as fast as a regular TNT2’s, so you should be getting even better performance than me. Do you have VSync enabled?

Just wondering: I have the same framerate problem using GLUT. Is using the Windows API any faster? Or does it matter?

The 75 fps sounds very likely to be the screen refresh rate.
If SwapBuffers() is synced to the monitor refresh, you don’t get framerates higher than the monitor refresh setting.

Each time you don’t finish the drawing within one refresh interval (here 1/75 of a second), the framerate drops to the next integer divisor of the refresh rate: 75/1 = 75, 75/2 = 37.5, 75/3 = 25, and so on.
See also the Red Book, Chapter 1, “The Refresh That Pauses”.

Try switching off the sync in the OpenGL control panel.
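(If the driver’s control panel doesn’t expose that setting, vsync can sometimes be turned off from the application itself through the WGL_EXT_swap_control extension. A sketch follows; not every driver of that era exports the entry point, so the availability check is essential.)

// Sketch: disable vsync via WGL_EXT_swap_control, if the driver provides it.
typedef BOOL (WINAPI *PFNWGLSWAPINTERVALEXTPROC)(int interval);

PFNWGLSWAPINTERVALEXTPROC wglSwapIntervalEXT =
    (PFNWGLSWAPINTERVALEXTPROC)wglGetProcAddress("wglSwapIntervalEXT");
if (wglSwapIntervalEXT)
    wglSwapIntervalEXT(0);   // 0 = no sync with refresh, 1 = sync to every refresh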

Relic could be right here. I need to check my settings, but I also have a TNT2 M64 (PCI, no less) and get framerates much higher than 30 on NeHe’s demos. I truly don’t know what could be happening other than a driver/settings problem or the refresh sync that Relic pointed out. Beyond those possible causes it’s tough to say, since I can run the very same apps off NeHe’s site and get better fps on SLOWER hardware (K6-2 380 w/ TNT2 M64 PCI). And when I use my Voodoo3 AGP, I get blazing fps! Try Relic’s suggestion and let us know if that does it. I’m interested in finding the solution to this since I’m new to OpenGL myself and would like to avoid these problems.

Thank you all for the help!

Relic, that sounds like it could very well be my problem when it comes to the screen refresh. I just have one question…how do you turn the screen sync off? I am running WindowsNT here at work and Windows98 at home and I am not sure where that setting can be switched.

Your help in this matter is GREATLY appreciated.

Thank You,
Jason

OK, here is the scoop…

I could not find an option in my current display driver to disable vsync, so I downloaded the Detonator 3 drivers from NVIDIA for Windows NT. After installing those drivers I found the option under the OpenGL settings and disabled vsync for the display. When I ran my code I was getting much larger iteration counts when not displaying any objects (count that one as a success). To be exact, I went from ~75 fps to ~175 fps (not really frames, since I was not displaying my models, only running the render code without objects) with vsync disabled. Once I added a model to the screen I experienced a MAJOR performance hit…

When I added the first model to the screen the fps counter dropped from the 175 fps I was getting to 24 fps, and moving the camera (changing the projection matrix) caused the fps to drop to an impressive 7 fps. That’s right, 7 fps. I am not sure whether this card works correctly with the Detonator 3 drivers or what, but that really hurt my overall performance on this system. To fix the problem I installed the Dell-supplied NVIDIA V770 drivers, and I am currently getting WORSE performance than before I started changing drivers around. Now when I run the program everything is choppy, and with one model on the screen I am getting only ~13 fps out of the system. Here is the kicker…

…there is no option in this Dell-provided driver to disable vsync on the card.

I will keep trying different things on this system, but if anyone out there has any suggestions I am very open to them.

Thank You,
Jason

How many polys make up that single model you’re using? Also, do you have backface culling turned on? It really sounds like you’re not getting hardware acceleration. Anyone know how to tell if you are or not? Anyway, this is really strange. What about other OpenGL games? Do you have the same problem with them being slow or with any other screensavers or demos?
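(For what it’s worth, one quick way to check for hardware acceleration, as a sketch rather than anyone’s actual code here, is to print the renderer strings once the context is current; Microsoft’s software fallback identifies itself as “GDI Generic”. And since backface culling came up, note that it is off by default in OpenGL.)

// Sketch: see which renderer is actually in use once the GL context is current.
printf("GL_VENDOR:   %s\n", (const char *)glGetString(GL_VENDOR));
printf("GL_RENDERER: %s\n", (const char *)glGetString(GL_RENDERER));   // "GDI Generic" = software
printf("GL_VERSION:  %s\n", (const char *)glGetString(GL_VERSION));

// Backface culling must be enabled explicitly.
glEnable(GL_CULL_FACE);
glCullFace(GL_BACK);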

The problem is now fixed…

After installing the Detonator 3 drivers on my home machine I was still getting horrible frame rates with my application. Once I turned off vsync the application still had the same problems, but my iterations per second with no objects displayed went up considerably. That led me to search deeper in my code for the problem…

It appears that the code I was basing mine on was loading the object’s texture into OpenGL INSIDE the compiled display list. You can already tell what was happening there: each time I called the display list, the same OpenGL texture commands were executed over and over. This was also causing a color shift (don’t ask me why) when I displayed the second object to the screen. Now my object class is correct and I only load the texture once, when the object is loaded from file.
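(For anyone hitting the same thing, the corrected pattern looks roughly like the sketch below. The names are made up for illustration, not Jason’s actual class; the point is that the glTexImage2D upload happens once at init, and only the bind plus geometry go into the compiled list.)

// Sketch: upload the texture ONCE at init; the display list only binds it.
GLuint texID, listID;

void InitObject(const unsigned char *pixels, int w, int h)
{
    glGenTextures(1, &texID);
    glBindTexture(GL_TEXTURE_2D, texID);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, w, h, 0,
                 GL_RGB, GL_UNSIGNED_BYTE, pixels);   // executed once, not per frame

    listID = glGenLists(1);
    glNewList(listID, GL_COMPILE);
        glBindTexture(GL_TEXTURE_2D, texID);          // only the bind is compiled in
        // ... glBegin(GL_TRIANGLES) / vertices / glEnd() for the ~200 triangles ...
    glEndList();
}

void DrawObject(void)
{
    glCallList(listID);   // no texture re-upload each call
}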

Now I am getting frame rates between ~150 and ~200 fps (really, with vsync off I should say iterations per second, since the monitor only displays frames up to its refresh rate) from my code with full texture rendering.

Thank you all for the help!

Jason