Code speedups and FPS

I have a strange problem, and I would like to know what exactly is causing it in terms of the hardware. I am guessing the bandwidth of my video card is the problem, but I don’t understand exactly how it works.

I spent some time over the last few days rewriting all of my math libraries with SIMD and got a huge performance boost. I timed them with the Pentium timer, and I cut down the number of cycles by 50% or more on many operations. I’d say on average my math library now takes about half the cycles on a P3/P4.

I then compiled my simple level into a display list. I timed the original render, which was about 500,000 cycles. With the display list, it is now 100,000 cycles to draw the scene.

So in every respect I have at least doubled or even tripled the performance of my CPU code.

But the FPS I am getting is only marginally better. Should I be getting higher frame rates now that my code is so much faster? It seems weird to me. I am guessing I am maxed out on how fast my video card can draw? I am not locking the frame rate in any way; I just run the loop as fast as possible and update the screen. I should be getting a huge boost in framerate, but I’m not. Or should I?

Any help understanding this would be appreciated.

Well, I just thought of something that I have to test. I am not sure how long it takes to swap the GL buffers. Maybe this is why I am getting the same FPS.

One thing to look at is whether vsync is on. It normally is by default, in which case you’ll always be capped to the monitor refresh rate. Control it with wglSwapIntervalEXT (from the WGL_EXT_swap_control extension) or use your card’s control panel… Otherwise, your bottleneck is elsewhere, not in those math libs.
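To put rough numbers on the vsync cap, here is a minimal sketch in C. It assumes double buffering, a 60 Hz monitor and a 1000 MHz CPU; none of those figures come from the original post, and the helper names are made up for illustration. The cycle counts only cover the CPU side of submitting the scene, so treat this as a model, not a measurement.

```c
/* Rough model of a vsync-capped frame rate (double buffering assumed).
   The function names are hypothetical, not part of any GL or WGL API. */

/* Convert a cycle count to milliseconds for a CPU clock given in MHz. */
double cycles_to_ms(double cycles, double cpu_mhz)
{
    return cycles / (cpu_mhz * 1000.0);   /* MHz * 1000 = cycles per ms */
}

/* With vsync on, the swap waits for the next vertical blank, so each
   frame occupies a whole number of refresh intervals (at least one). */
double vsync_fps(double frame_ms, double refresh_hz)
{
    double interval_ms = 1000.0 / refresh_hz;
    int intervals = 1;

    while (intervals * interval_ms < frame_ms)
        intervals++;

    return refresh_hz / intervals;
}
```

Under those assumptions, 500,000 cycles is 0.5 ms and 100,000 cycles is 0.1 ms. Both fit easily inside one 16.7 ms refresh interval, so `vsync_fps` gives 60 in both cases, and the speedup never shows up in the frame counter.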

You say you compile your scene into a display list. In this case, can I assume that your rendering code consists of a single glCallList() call and little more?

If so, it’s only logical that your performance stays the same, since it only calls your optimized maths routines during the display list compilation. The optimizations have no effect whatsoever on the compiled list or on the resulting framerates when rendering it. Your list probably does compile faster than before, though.

– Tom

Try rendering in a smaller window to see if you’re geometry or fillrate bound.
If it’s getting faster with a very small window, your card doesn’t have enough fillrate.
If not, you either have vsync on or the geometry engine or your app is maxed out.
If it’s your swap time, try to get a PFD_SWAP_EXCHANGE capable pixel format.
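The smaller-window test above boils down to comparing two FPS readings. A small sketch of that comparison in C follows; the enum, the function name and the `min_gain` threshold are all assumptions for illustration, so pick whatever margin is clearly above your measurement noise.

```c
/* Hypothetical helper for interpreting the "shrink the window" test. */
typedef enum { FILL_BOUND, NOT_FILL_BOUND } Verdict;

/* fps_full:  frame rate measured at the normal window size
   fps_small: frame rate measured in a very small window
   min_gain:  assumed threshold, e.g. 1.2 means "at least 20% faster" */
Verdict classify_window_test(double fps_full, double fps_small, double min_gain)
{
    /* Fewer pixels to fill made it clearly faster: fillrate is the limit.
       Otherwise look at vsync, the geometry stage, or the app itself. */
    return (fps_small >= fps_full * min_gain) ? FILL_BOUND : NOT_FILL_BOUND;
}
```

For example, going from 60 FPS to 180 FPS in the small window would come back as `FILL_BOUND`, while 60 to 61 would not. Run the test with vsync off, or the cap will hide any difference.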

With computers (and a lot of other things),
you’re only as fast as your slowest part (the bottleneck).
E.g. you can make all the optimizations to the CPU code you want, but it will make not a single iota of difference if you were fillrate bound in the first place.
The basic rule is
‘find the bottlenecks before you optimize’

not
‘optimize something that you *THINK is what’s causing the slowness’

*often what you think is wrong
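The "only as fast as your slowest part" rule can be sketched as a tiny model in C. It assumes the CPU, geometry and fill stages overlap perfectly, which real hardware only approximates, and the stage timings in the note below are invented for illustration.

```c
/* Simplified pipeline model: stages run in parallel, so the frame time
   is set by the slowest stage. frame_ms is a hypothetical helper. */
double frame_ms(double cpu_ms, double geometry_ms, double fill_ms)
{
    double worst = cpu_ms;

    if (geometry_ms > worst)
        worst = geometry_ms;
    if (fill_ms > worst)
        worst = fill_ms;

    return worst;
}
```

With made-up timings of 0.5 ms CPU, 2 ms geometry and 10 ms fill, the frame takes 10 ms. Cut the CPU time to 0.1 ms and the frame still takes 10 ms. Only reducing the fill time (a smaller window, less overdraw) moves the result.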