Need better Performance!!!

I am currently working on a 3D driving simulator and have found that performance is well below what I would like it to be. I will give you a little info about how it has been implemented first, and then give you my numerical data.

First, the game uses predefined city blocks to draw pieces of the world. All these predefined blocks are stored in display lists and when called are translated to their position in the world (Example: There are intersection pieces and straight road pieces). There are also predefined cars implemented the same way as the city blocks.

Now, each piece has about 150 polygons, mostly QUADS, and we draw at most 12 city blocks at a time, so that comes to about 1800 polygons for the blocks, plus cars. The cars have about 100 polygons apiece, again mostly QUADS, and we average about 6 cars per scene. So the polygon count averages about 2500 polygons per scene. We also use no more than 5 textures per city block, so possibly 60 textures per scene. Texture objects have been used to store the textures.

I am currently getting less than 20 fps running at 1152x864 resolution, in GLUT fullscreen mode, and a little over 20 fps at 640x480 resolution. These numbers seem really disappointing to me considering this is being run on a PIII 533 with a TNT2 card. What kind of frame rate should I be aiming for and how can I go about attaining this rate? The simulator is intended to run on as many machines as possible, so the faster it runs the better. Any help would be greatly appreciated.

Thanks in advance,
Kevin.

I cannot tell you for sure what frame rate you should expect, but could you give us some more info?

For example, what are the size and color depth of your textures?

If you are using 60 textures per frame, there is a chance that they do not all fit into your card’s memory.
In this case, you should always sort your objects per texture (which is not easy since you are using display lists…) and issue only ONE glBindTexture call per texture per frame. This avoids slow transfers between main memory and graphics card memory.
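The sorting itself can be sketched like this (plain C; the DrawCall struct and texture names are illustrative, and the actual GL calls are left as comments since they need a live context):

```c
#include <stdlib.h>

/* One recorded draw call: the texture it needs, plus whatever
   identifies the geometry (here just a display list id). */
typedef struct {
    unsigned int texture;  /* GL texture object name */
    int mesh_id;           /* display list to call */
} DrawCall;

static int by_texture(const void *a, const void *b)
{
    unsigned int ta = ((const DrawCall *)a)->texture;
    unsigned int tb = ((const DrawCall *)b)->texture;
    return (ta > tb) - (ta < tb);
}

/* Sort the frame's draw calls by texture, then walk the list and
   bind each texture only when it changes.  Returns the number of
   binds issued: one per distinct texture instead of one per object. */
int draw_sorted(DrawCall *calls, int n)
{
    int binds = 0;
    unsigned int bound = 0;  /* assume 0 is never used as a texture name */
    qsort(calls, n, sizeof(DrawCall), by_texture);
    for (int i = 0; i < n; i++) {
        if (calls[i].texture != bound) {
            /* glBindTexture(GL_TEXTURE_2D, calls[i].texture); */
            bound = calls[i].texture;
            binds++;
        }
        /* glCallList(calls[i].mesh_id); */
    }
    return binds;
}
```

With 60 textures per scene this drops the bind count to at most 60 per frame, no matter how many objects share them.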

Have you tried disabling texturing to see if this is the bottleneck?

Something else: are your objects stripped (i.e. converted to triangle strips)?

Do you use OpenGL lights ?

I am pretty sure the problem comes from the textures but please, have a look at what happens when you disable them.

Eric

P.S. : if you need your software to be tested on other machines, you’ll find people for that in this forum.

Yeah, I agree with Eric. With such a relatively low poly count I just can’t think of any other bottlenecks.

I would extend the idea to sort primitives per material instead of per texture. This also reduces the number of state changes. Using vertex arrays might also be worth trying.

Yeah, agree with Hude. Sorting per material is a very good idea too! (Well, indeed, I need to do it in my viewer… Perhaps my 300,000 polygons would display faster!)

Eric

Hello

Think I know a way to get a LOT better performance. You said you were using GLUT fullscreen mode. However, this is NOT “true” fullscreen; it’s a large window on the desktop. I suggest you go “true” fullscreen (if you are on Windows, that is) with ChangeDisplaySettings()/ChoosePixelFormat() and similar functions. If this is the problem, I can guarantee you will gain loads of FPS here.

Send me a mail if you need the code for going fullscreen at any resolution/colordepth.
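In the meantime, here is a rough sketch of the mode-switch part (Win32 only; the window itself still needs a popup style and a pixel format chosen via ChoosePixelFormat()/SetPixelFormat(), and you restore the desktop with ChangeDisplaySettings(NULL, 0) on exit):

```c
#include <windows.h>

/* Switch the display to width x height at the given color depth.
   CDS_FULLSCREEN tells Windows the change is temporary, so the
   desktop mode is restored automatically when the program exits. */
BOOL go_fullscreen(int width, int height, int bits)
{
    DEVMODE dm;
    ZeroMemory(&dm, sizeof(dm));
    dm.dmSize       = sizeof(dm);
    dm.dmPelsWidth  = width;
    dm.dmPelsHeight = height;
    dm.dmBitsPerPel = bits;
    dm.dmFields     = DM_PELSWIDTH | DM_PELSHEIGHT | DM_BITSPERPEL;
    return ChangeDisplaySettings(&dm, CDS_FULLSCREEN) == DISP_CHANGE_SUCCESSFUL;
}
```

Call it before creating the OpenGL window, e.g. go_fullscreen(640, 480, 32), and fall back to windowed mode if it returns FALSE.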

Bob

[This message has been edited by Bob (edited 03-22-2000).]

i don’t think he will gain much more performance by simply changing the desktop settings… at 1152x864 the application runs at 20 fps, and at 640x480 it still runs at 20 fps: this shows clearly that the bottleneck of the application is not the rasterizer stage.

in other words, currently it is not fill-limited,
so rasterizing fewer pixels won’t help much.

optimizations should be applied to the application or geometry stage.

however, opengl always renders to a window.
maybe it’s popup-style (no caption, no resize frame, just the client area) and as large as the desktop, but it’s still a window.
glut switches to fullscreen by creating a popup window, but it doesn’t have the ability to change the display settings itself.
game glut, instead, can switch modes, and it does so by calling ChangeDisplaySettings().

Dolo//\ightY

I get your point, dmy. Just want to say that I have tried running a 640x480 window, then 640x480 fullscreen (same color depth and so on), and there was quite a large FPS gain. Not saying this IS the problem, just that it might be.

I think it has something to do with the system not having to care about window clipping and other things drawn outside the window (the desktop or other apps, for example). This would mean that keeping the size of the window but changing the desktop resolution will affect performance.

Well, enough from me. I agree with the possible causes you listed, just wanted to add a new one.

Bob

I agree with all said here. Here’s my 2 cents. If it were me, I’d remove GLUT from the mix and use my own code to create the window, DC and RC.
I’d then change over to TRIANGLES instead of QUADS.
I’d use texture objects instead of regular tex mapping.
I’ve found lights REALLY slow performance.
I assume you’ve already got culling on. Calculate your own normals once, and store and rotate them along with the objects.
Try running on a machine with NO 3D accelerator, just to see if anything is helping at all. Depending on how you’ve coded it (60 textures), you might not be making effective use of the board.
Hope this helps.
fs http://fshana.tripod.com

Thanks, guys, for all the input. Now I’ll try to answer some of your questions.

Currently there is no lighting being used, surprisingly. I can’t believe the slow frame rate either.

Like I said before, textures are in texture objects and all bindings are done in the display lists. This brings up a question of my own: if the bindings are in compiled display lists, is the binding actually done every time the list is called, or just the one time when the list is compiled? About the textures: they are 24-bit bitmaps, all very small, none bigger than 256x256. I am loading them using AUX and using GL_NEAREST_MIPMAP_NEAREST for most of them.

Now, the game used to have no textures at all, and after adding all the textures the speed difference was hardly noticeable. I profiled the code before and after and the numbers were very close, except on machines with older 4MB cards. So I don’t think the textures are making that huge a difference.

As for the Windows code, we actually have another version running as a Win32 application, not GLUT, and it seems to be a little faster, visually. Sorry I don’t have the numbers for you; it is still pretty buggy.

I think dmy may be on the right track. Another guy working on the project removed one function from the application and the game ran at 115 fps. We’re not exactly sure why, because the function only included a couple of ‘if’ statements and about 4 assignments. So I think I’ll take a look at that today.

One more question for you guys about how the drawing of the world is currently implemented. We have divided the world into a grid and only draw the squares around the user’s car as it drives around; all the grid squares are in display lists, as I said before. Can anyone think of a more efficient way of rendering the world?
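For reference, our cell selection step is essentially this (plain C sketch; the cell size, world dimensions, and names are simplified from the real code):

```c
#define CELL 100.0f  /* world units per grid square (illustrative) */

/* Which grid cells to draw, given the car's world position.  The
   world is world_w x world_h cells; we take the 3x3 block centred
   on the car's cell, clamped at the world edges.  Writes (i,j)
   index pairs to out[] and returns how many were written; the
   caller calls each cell's display list. */
int visible_cells(float car_x, float car_z,
                  int world_w, int world_h,
                  int out[][2])
{
    int cx = (int)(car_x / CELL);
    int cz = (int)(car_z / CELL);
    int n = 0;
    for (int i = cx - 1; i <= cx + 1; i++) {
        for (int j = cz - 1; j <= cz + 1; j++) {
            if (i >= 0 && i < world_w && j >= 0 && j < world_h) {
                out[n][0] = i;
                out[n][1] = j;
                n++;
            }
        }
    }
    return n;
}
```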

Again thanks for your time,
Kevin

I am working with KGT on the simulator project. I have a Windows version of the simulator that is running almost twice as fast (almost 40 FPS as opposed to < 20 FPS). I use the ChangeDisplaySettings() function. I am also using DirectInput as opposed to the winmm library, which might make a little bit of difference.

I totally agree with dmy; you should check the code of your application.

The big problem is probably in calculations other than graphics.

EDIT: HOLY! That is an old thread awakened from the dead. And I just saw that my post was useless. Shame on me

[This message has been edited by Gorg (edited 06-30-2003).]

Maybe too obvious, but have you installed recent Nvidia drivers for your TNT2? The MS ones that come with the system absolutely cripple performance.

How do other 3D OGL apps (try Quake 3) run on your machine? If they run as poorly as your code does, then the problem most definitely lies in your system’s video subsystem.

I’d use texture objects instead of regular tex mapping.

What’s the difference?

Originally posted by KGT:
Now, each piece has about 150 polygons, mostly QUADS, and we draw at most 12 city blocks at a time, so that comes to about 1800 polygons for the blocks, plus cars. The cars have about 100 polygons apiece, again mostly QUADS, and we average about 6 cars per scene. So the polygon count averages about 2500 polygons per scene.
[skip]
a little over 20 fps

so, you say you render ’em all no matter whether they’re visible or not? I’d guess plenty of those polygons are outside the camera’s current field of view (frustum). So why don’t you try implementing at least frustum culling, to start with?
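A minimal sketch of a bounding-sphere frustum test (assuming you can obtain the six frustum planes, e.g. by extracting them from the combined projection and modelview matrices; the Plane struct and names are illustrative):

```c
/* A plane ax + by + cz + d = 0, with the normal (a,b,c) pointing
   toward the inside of the frustum.  In a real app the six planes
   are extracted once per frame from the view matrices; here they
   are supplied directly. */
typedef struct { float a, b, c, d; } Plane;

/* Conservative test on an object's bounding sphere (centre, radius):
   cull only if the sphere lies entirely outside some plane. */
int sphere_visible(const Plane planes[6],
                   float x, float y, float z, float r)
{
    for (int i = 0; i < 6; i++) {
        float dist = planes[i].a * x + planes[i].b * y
                   + planes[i].c * z + planes[i].d;
        if (dist < -r)
            return 0;  /* completely outside this plane: cull it */
    }
    return 1;  /* inside or intersecting: draw it */
}
```

Each city block and car gets one precomputed bounding sphere, so the per-frame cost is six dot products per object.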

[This message has been edited by matt_weird (edited 06-30-2003).]

This thread is 3 YEARS old! And now the answers start to drop in. LOL!

Originally posted by dbugger:
This thread is 3 YEARS old! And now the answers start to drop in. LOL!

LOL, indeed! …I was just following the topics…

I think the 20 FPS comes from the timer resolution. Divide 1 second by 20 and you’ll get… 50 ms! Doesn’t the Windows WM_TIMER post messages in 55 ms periods? I think that KGT uses (or has been using?) WM_TIMER to refresh frames instead of calling PostRedisplay() inside the display() function.

Standard Win32 timers are not guaranteed to have a resolution finer than the standard Windows timer, which still ticks at 18.2 times per second (roughly every 55 ms). To guarantee millisecond resolution, you need to use a multimedia timer. If you use a standard Windows timer, your program will be limited to about 18.2 frames per second.
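The cap is easy to see from the arithmetic (a tiny illustration; with a multimedia timer and timeBeginPeriod(1) the effective tick period drops to roughly 1 ms):

```c
/* If every frame waits for the next timer tick, the tick period
   caps the frame rate no matter how fast the renderer is. */
double max_fps_for_tick(double tick_ms)
{
    return 1000.0 / tick_ms;
}
```

max_fps_for_tick(55.0) is about 18.2, suspiciously close to the reported "a little over/under 20 fps"; max_fps_for_tick(1.0) would remove the cap entirely.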

Additionally, if you need to signal your OpenGL thread to render the screen, you probably will want to use PostMessage instead of SendMessage. PostMessage will immediately switch to the other thread and process the message. Using SendMessage will cause an indefinite amount of time to pass before the OpenGL window thread will get a chance to process the message.

Originally posted by dbugger:
This thread is 3 YEARS old! And now the answers starts to drop in . LOL!

Hmm, I was wondering if something funny was going on with the dates on the server. But heck, better late than never.

Originally posted by Riff:
Additionally, if you need to signal your OpenGL thread to render the screen, you probably will want to use PostMessage instead of SendMessage. PostMessage will immediately switch to the other thread and process the message. Using SendMessage will cause an indefinite amount of time to pass before the OpenGL window thread will get a chance to process the message.

I thought that SendMessage immediately switches to the other thread while the calling thread is stalled. But anyway, KGT: buy a better card, TNTs are pretty old