FPS seems locked on Nvidia

I recently put an old Nvidia FX5200 in my pc (was using onboard Intel chip) and noticed the FPS on my scenes seemed to be locked around 60 and 30. There doesn’t seem to be much of a linear decrease as my scene becomes more complex. It doesn’t stay right at 60 but usually fluctuates between 59.0 and 60.5 and then suddenly drops to 30 with a similar variance. As the scene becomes more complex, it then drops somewhat linearly all the way down.

I thought this was odd and wondered if the card was trying to sync with the monitor refresh or something. Does anyone know what might cause this? The integrated Intel chip didn’t seem to do this so I’m thinking it’s more hardware/driver related than my code.

Thanks,
Reg

Nvidia drivers usually let you either force vsync off, force it on, or let the app decide. Maybe the Nvidia default is “vsync on”, whereas on Intel it was “vsync off”.
On Windows you can use wglSwapIntervalEXT(0) (or 1) to force vsync off (or on).
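For reference, that WGL call has to be fetched at runtime, since it comes from the WGL_EXT_swap_control extension. A minimal sketch (the `set_vsync` helper name is mine, error handling omitted, and a GL context must already be current):

```c
/* Sketch: toggling vsync from a Windows OpenGL app via
 * WGL_EXT_swap_control. The function pointer must be queried
 * at runtime after a context is made current. */
#include <windows.h>
#include <GL/gl.h>

typedef BOOL (WINAPI *PFNWGLSWAPINTERVALEXTPROC)(int interval);

static void set_vsync(int interval)   /* 0 = off, 1 = on */
{
    PFNWGLSWAPINTERVALEXTPROC wglSwapIntervalEXT =
        (PFNWGLSWAPINTERVALEXTPROC)wglGetProcAddress("wglSwapIntervalEXT");
    if (wglSwapIntervalEXT)           /* extension may be absent */
        wglSwapIntervalEXT(interval);
}
```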

Uncheck V-sync in the Nvidia control panel.

http://en.wikipedia.org/wiki/V_sync

Thanks Z, that did the trick, although now I’m getting some jagged lines on vertical edges during movement in the 30-60 fps range. Would this be due to not double buffering? I thought I was set up to double buffer, although that part of my code was written when I was really new to OpenGL, so it may not be correct.

Thanks remdul, but I already checked that when I first started trying to figure this out, and I couldn’t find that option in the Nvidia control panel.

Hey, it is either 15/20/30/60 fps, OR unclamped fps plus jaggies (tearing)! Just think about it for a minute.

We had a discussion about a good middle ground: dynamically enabling vsync above the display refresh rate (60 Hz in your setup, apparently) to avoid tearing, and disabling vsync under this limit to avoid halving the FPS. It works very well, and apparently quite a lot of games do this nowadays, both on consoles and on PC.

Here it is, read the whole discussion by the way :wink:
http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=196733#Post196733

Good read. Thanks for pointing me in the right direction.

We had a discussion about a good middle ground: dynamically enabling vsync above the display refresh rate (60 Hz in your setup, apparently) to avoid tearing, and disabling vsync under this limit to avoid halving the FPS. It works very well, and apparently quite a lot of games do this nowadays, both on consoles and on PC.

Here it is, read the whole discussion by the way :wink:
http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=196733#Post196733

This, though, seems to suggest disabling vsync only if it’s under 30 Hz; from 30-60 Hz it just remains whatever it currently is.

Also, it’s been ages since I’ve seen screen tearing in any app (with vsync disabled). Is this because I’m using an LCD, i.e. is the problem more relevant to CRTs?

@zed.

Tearing is more visible on LCDs. Did you try playing Doom3 or Quake4 with vsync off? Tearing is very visible, especially when you shoot the machine gun… the strobe effect looks very bad.

With vsync ON and double buffering, FPS is 60, 30, 20, 15, …
With vsync ON and triple buffering, FPS is 60 or below, but without tearing.

Comparing triple buffering with dynamic vsync control: triple buffering will offer a bit lower FPS but no tearing, while dynamic vsync control will offer a bit higher FPS but with tearing.

The problem is that the app cannot control whether triple buffering is turned on.

No, the aim is mainly to turn vsync off between 60 and 30. Between 30 and 20 the drop is less noticeable.
But there is some kind of hysteresis (is that an English word?) in action.
Here is a description:

  • start with vsync on
    while effective FPS is 60, keep it on
  • now the actual frame rate falls a bit, to 59 FPS,
    so effective FPS goes down from 60 to 30 FPS
  • if the last frame duration is equivalent to 30 FPS or less: set vsync off
  • the next frame gets an effective 59 FPS
  • now scene complexity decreases, so effective/actual FPS goes to 61
  • if the last frame duration is equivalent to 60 FPS or more: set vsync on
  • the next frame gets an effective 60 FPS

Also, it’s been ages since I’ve seen screen tearing in any app (with vsync disabled). Is this because I’m using an LCD, i.e. is the problem more relevant to CRTs?

No, as yooyo said, it is still present. I recently (reluctantly) switched to an LCD screen. But apparently recent Nvidia drivers activate vsync by default, unless the app explicitly disables it.

And on the triple buffering front, I have been somewhat disappointed :

  • it removes tearing, just as vsync does
  • no more sudden framerate halving like vsync
    – but I found the resulting irregular framerate even more annoying than either vsync on or off with a classic double buffer.

The perfect solution ?
A 200Hz display ! :smiley:
So vsync on means 200,100,67,50,40… Not very noticeable drops in the interesting range.

OK, I looked a bit closer at Quake3: 1600x1200, ~90 fps on a GF7600GS :slight_smile:
Very hard to see; I think the tearing is changing constantly, and even looking closely I’m just guessing it’s there. From my memory of tearing, you’d see a horizontal tear in the screen that would be pretty constant over a second or two, like you often see TVs doing in films.
It’s also very hard to see in my game; as in Quake3, it’s only visible if I move the camera very quickly (but as with Q3A it’s hard to tell whether that’s tearing or something else).
Also, with Q3A, with force-vsync ON in the Nvidia display settings, FPS is still ~90 fps (it should be capped at 60 fps).

Anyway, enough of that.
ZbuffeR, you talk about effective and actual FPS; how do you measure them?
Effective is just the standard 1 / (time spent on last frame), I assume.
But actual, how do you query that? Is it even possible, or are you just assuming that, say, 43 Hz will be shown at 30 Hz?

By “actual” I mean the frame rate related to the actual render time of the video card (i.e. what you would get if vsync was off). Of course it is nearly impossible to measure if vsync is on (maybe with the Nvidia timer?).

From my memory of tearing, you’d see a horizontal tear in the screen that would be pretty constant over a second or two, like you often see TVs doing in films.

It is due to frame rates that are almost equal but not synchronised. TVs are synchronised via the mains frequency, but a handheld camera is not.

This almost constant line can mean that vsync is on but there is an extra delay after the vertical blank: I had a crappy VGA card 15 years ago with this problem, very ugly.
Or it can mean that vsync is off but the actual FPS is very near the display refresh rate: I think Doom3 has vsync off by default, but with frame capping at 60 FPS.

Well, it would be a nice concept, but it has flaws.

How do you measure the last frame duration?

The fact is, you can’t.
Even when using EXT_timer_query you have the latency
of the frames still pending in the command queue,
plus you can’t know if the driver uses double or triple buffering, so you have to account for extra latency.

The problem is that this vsync control reacts too late.

An intelligent automatic vsync control in the gfx driver/hardware
would be nice: if SwapBuffers is going to block, skip waiting for the vsync, except when the frame rate exceeds the display refresh rate.
The chance that SwapBuffers blocks could be reduced with triple buffering.

This is not really the perfect solution; it’s more like a brute-force approach.

The problem is that even if you had 1000 Hz, the GPU will wait some time for the vsync and idle during that time; a performance drop of up to 50% is always possible (if you had a target frame rate of 1000 Hz).

You may say your target frame rate is around 30 Hz (≈33 ms) and you have a 200 Hz (≈5 ms) display,
but you can still lose up to 5 ms of your 33 ms frame to the GPU idling/waiting for a vsync, which is still up to around 15% performance (this can be prevented in many cases with triple buffering, at the cost of higher latency, though not much thanks to the 200 Hz).

Additionally, you only generate 30 frames per second
but you send 200 frames to your display, which means a lot of wasted
memory bandwidth (especially in HDTV times), slowing the whole memory system.

The perfect solution would be a refreshless display:
somehow a “solid-state display”, like the technologies for renewable digital newspapers, to which you can send frames when they are ready. There you have the most efficient memory bandwidth/usage (you don’t even need double buffering, because the “front buffer” now resides on the display).

Hey, of course it has flaws, but for a mere human being it quite easily solves most of the visible problems caused by all the other available solutions (vsync on/off/triple buffer).
Don’t take my word for it, just test it, you will see. I was surprised it worked so well visually.

The “flaw” is that you don’t measure the current frame time but the previous one (sometimes even older frames, depending on the buffering), so probably 2 or 3 more frames should be taken into account, with the highest weight given to recent ones. And by the way, this is the reason triple buffering is ugly: by definition you get an irregular frame rate (frame 1: 1/30 s, frame 2: 1/60 s, frame 3: 1/30 s, etc.).

About the 200 Hz display, I said it was perfect “in the interesting range”, meaning around 60 fps, for us mere human beings. Of course if you truly need 200 fps, you will need an even higher-spec’ed display.

lol at the digital ink, all the technologies I saw needed way more than a tenth of a second to refresh… It won’t help :stuck_out_tongue:

I did implement it before my last post,
because I did like your idea, but it doesn’t work for me.

For me it gave many more “irregularities” when it decides
to switch vsync.

One example on my 60 Hz LCD notebook:

my app gets 100 fps with vsync disabled,
but enabling vsync doesn’t give me 60 Hz, it gives me only
around 40-50 Hz on average.
With your algorithm, it turns vsync on and off every few frames,
always swapping between the 100 fps of vsync-off and the ~45 fps of vsync-on,
giving very ugly artifacts.

If it works for you, there could be two reasons:

  1. your rendering is almost CPU-bound, so there is no latency when measuring it,
  2. your workload per frame doesn’t change much between adjacent frames.

Neither is true for my app.

This is wrong: triple buffering doesn’t look more ugly than vsync on. Triple buffering is one additional post-rendering buffer;
it only allows you to reduce the GPU idle time spent waiting for vsync.

frame 1 : 1/30 sec, frame 2 : 1/60 sec, frame 3 : 1/30 sec <– this happens because of vsync on (also with double buffering), and not specifically because of triple buffering.

The only effect triple buffering gives is higher frame rates at the cost of higher latency, but in no way does it look more ugly than double buffering with vsync on.

I didn’t say I want 200 fps; I only reminded you that even if you have a 200 Hz display for a 60 Hz scene, you still lose performance through vsync (in the worst case up to 15%), plus the higher memory impact of the scanline refresh,
so it’s not perfect.

Digital ink was only an example -.-; surely we’d need new technologies for it, but that’s the perfect way: no wasted performance, no tearing.

But it is still a good idea.
If you get a bullet-proof implementation working,
many would be interested.
If I find ways to improve it in my app, I will post them here too.

Good remarks indeed.
I will try to dig back and provide my small test with its sources.

I don’t have it right now, so all this is from memory :

  • Even with this very simple example, I tried to make the GPU work much more than the CPU, by drawing multiple fullscreen quads, as my GF 6800LE has quite a low fill rate.

  • The load increase or decrease was monotonic, so the problematic switch period happened only once in a dozen seconds. Maybe too idealized.

  • I don’t remember if I put some glFlush/glFinish inside, but it may prevent the GPU from buffering too many draw commands.

If I can afford the time, I will try to patch one of the opensource quake3 implementations, such as ioquake3 or OpenArena :smiley:

If someone is still interested, here is my full code for a dynamic vsync demo.

Should be cross-platform; uses the GLFW library, compiled with Dev-C++.

//========================================================================
// === playing with dynamic vsync ===
//
// keys :
// v - sets vsync on
// b - (default) sets vsync in dynamic mode :
//	         when frame rate goes (a bit) above DISPLAY_REFRESH_RATE, vsync ON to eliminate tearing
//           when frame rate goes (a bit) under, vsync OFF to get more FPS
// n - sets vsync off
//
// try to follow the moving white bar with your eyes, I am not responsible for epilepsy, seizures, etc
//	
// edit DISPLAY_REFRESH_RATE so that it matches your desktop refresh rate
// then edit OVERDRAW so that, with vsync off, the FPS can go down to around 45 and up to 200+
//
// author ZbuffeR
// from a nice code snippet published by Mikkel Gjoel,
// from what I thought was a good idea
//   Original discussion here :
// http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=196712#Post196712
//========================================================================

#include <stdio.h>
#include <math.h>   // for sin()
#include <GL/glfw.h>

/////////////////////////////////////////////
// display native refresh rate
const double DISPLAY_REFRESH_RATE = 60.0;

////////////////////////////////////////////
// max number of black fullwindow quads
// this value (30) is good for 1280*1024 with a geforce 6800LE
// feel free to increase :-)
const int OVERDRAW = 30;




const int VSYNC_OFF = 0;
const int VSYNC_ON = 1;
const int VSYNC_DYNAMIC = 2;



//========================================================================
// main()
//========================================================================

int main( void )
{
    int     width, height, running, vsync;
    double  t, t0, fps,offset;
    char    titlestr[ 200 ];
	long int frames =0;
	double tlast;
	
	int vsync_mode = VSYNC_DYNAMIC;
	const char* vsync_mode_string = "VSYNC_DYNAMIC";
	
    // Initialise GLFW
    glfwInit();

    // Open OpenGL window
    if( !glfwOpenWindow( 1280, 1024, 0,0,0,0, 0,0, GLFW_WINDOW ) )
    {
        glfwTerminate();
        return 0;
    }

    // Enable sticky keys
    glfwEnable( GLFW_STICKY_KEYS );

    // vertical sync (on cards that support it)
    glfwSwapInterval( 1 );
    vsync = 1;


    glDisable( GL_TEXTURE_2D );
	glEnable( GL_BLEND);

    // Main loop
    running = GL_TRUE;
    frames = 0;
    t0 = glfwGetTime();
    tlast=glfwGetTime();
//	glfwSetMousePosCallback( (GLFWmouseposfun)(checkmouse) );
    while( running )
    {
        // Get time and mouse position
        t = glfwGetTime();

        // Calculate and display FPS (frames per second)
        if( (t-t0) > 0.2 || frames == 0 )
        {
            fps = (double)frames / (t-t0);
            sprintf( titlestr, "[keys: v,b,n] mode:%s (%4.1f FPS)", vsync_mode_string, fps );
            glfwSetWindowTitle( titlestr );
            t0 = t;
            frames = 0;
        }
        frames ++;
        
		if (glfwGetKey( 'V' )) {
			vsync_mode = VSYNC_ON;
			vsync_mode_string = "VSYNC_ON";
			glfwSwapInterval( 1 );
			vsync = 1;
		}
		if (glfwGetKey( 'B')) {
			vsync_mode = VSYNC_DYNAMIC;
			vsync_mode_string = "VSYNC_DYNAMIC";
		}
		if (glfwGetKey( 'N')) {
			vsync_mode = VSYNC_OFF;
			vsync_mode_string = "VSYNC_OFF";
			glfwSwapInterval( 0);
			vsync=0;
		}
		
        if (vsync_mode == VSYNC_DYNAMIC) {
            if ( t-tlast > 1.0/(DISPLAY_REFRESH_RATE*0.60) ) {
//Mikkel Gjoel:     if( t-tlast>0.024) {
                // too slow : disable vsync
                if (vsync) {
                    vsync = 0;
                    glfwSwapInterval( 0 );
                }
            } else if ( t-tlast < 1.0/DISPLAY_REFRESH_RATE*1.15 ) {
//Mikkel Gjoel:     } else if( t-tlast<0.012){
                // too fast : enable vsync
                if (!vsync) {
                    vsync = 1;
                    glfwSwapInterval( 1 );
                }
            }
        }
		
        tlast=t;

        // Get window size (may be different than the requested size)
        glfwGetWindowSize( &width, &height );
        height = height > 0 ? height : 1;

        // Set viewport
        glViewport( 0, 0, width, height );

        // Clear color buffer
        glClearColor( 0.0f, 0.0f, 0.0f, 0.0f);
        glClear( GL_COLOR_BUFFER_BIT );

        // Select and setup the projection matrix
        glMatrixMode( GL_PROJECTION );
        glLoadIdentity();

        // Select and setup the modelview matrix
        glMatrixMode( GL_MODELVIEW );
        glLoadIdentity();
		

		// make it slow...
		int i = 0;
		glColor3f( vsync/8.0f, (1-vsync)/8.0f, 0.0f );
		// from 1 to n fullscreen black quads
		for (i = 0; i <= OVERDRAW*(1+sin(t)); i++) {
			glBegin( GL_QUADS );
				glVertex2f( -1.0f, -1.0f );
				glVertex2f(  1.0f, -1.0f );
				glVertex2f(  1.0f,  1.0f );
				glVertex2f( -1.0f,  1.0f );
			glEnd();
		}
		glColor3f( 1.0f, 1.0f, 1.0f );
		// the fast moving white quad
		offset = sin(6*t);
		glBegin( GL_QUADS );
			glVertex2f( offset,      -1.0f );
			glVertex2f( offset+0.5f, -1.0f );
			glVertex2f( offset+0.5f,  1.0f );
			glVertex2f( offset,       1.0f );
		glEnd();


        // Check if the ESC key was pressed or the window was closed
        running = !glfwGetKey( GLFW_KEY_ESC ) &&
                  glfwGetWindowParam( GLFW_OPENED );

        glfwSwapBuffers();

    }

    // Close OpenGL window and terminate GLFW
    glfwTerminate();

    return 0;
}

I added the subtle colour change in the background to track whether the automatic mode switching was messy or not (reddish means vsync active, greenish means vsync inactive).
… and you are right, it seems quite messy :frowning:

Moving the thresholds further apart seems to cure the mess, but then the reaction becomes too slow, so it’s not ideal anyway.
Putting the low threshold at 0.5 * DISPLAY_REFRESH_RATE and the high threshold at DISPLAY_REFRESH_RATE is not too bad, but it does not prevent the “frame-rate halving when just below refresh rate” of classic vsync, so it is not very interesting.

I will settle for a 200Hz display :slight_smile: