Vsync on and CPU usage

With vertical sync on, SwapBuffers + glFinish will wait for the next vertical retrace. The problem is that the driver wastes CPU cycles just waiting for the vsync signal.

Is there a way to synchronize without using CPU cycles?

"If you want to use the NV_fence extension, you can keep CPU usage down (<4% on my GF3) by replacing glFinish as follows:

//Draw
RenderScene();
glFinish();
Swap();

CHANGED TO:

//Init
glGenFencesNV(1, &m_uiFence);

//Draw
if (glSetFenceNV)
{
        RenderScene();
        // Set the fence after the scene's commands, so that it completes only once they have finished
        glSetFenceNV(m_uiFence, GL_ALL_COMPLETED_NV);
        while (!glTestFenceNV(m_uiFence))
        {
                Sleep(1);
        }
}
else
{
        RenderScene();
        glFinish();
}
Swap();

"

You can also do the same with a query, waiting until its result is available. That would work on both NVIDIA and ATI cards.

Is there a way to synchronize without using CPU cycles?
Yes, before calling SwapBuffers, use the CPU for whatever you want.

Calling glFinish causes the driver to pause and wait until all GPU work is finished. Why do you call this? Do you read back something from the GPU?

tfpsly, do you mean that the NV_fence extension is supported on ATI cards? That seems strange to me. Personally (though I only have NVIDIA cards), my cards do not support any ATI extensions. The extension registry only lists the NV and APPLE fence extensions.

Thanks for the information.

Originally posted by jide:
tfpsly, do you mean that the NV_fence extension is supported on ATI cards?
No, it is not. But using a query, you can achieve the result you want: issue a query of any kind at the end of your rendering, and as long as the query result is not available yet, keep sleeping with sleep(1).
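
The poll-and-sleep loop described above could be sketched roughly as follows. This is only a sketch: `waitWithSleep` and `isReady` are hypothetical names, and `isReady` stands in for a real availability check, for example a lambda that calls glGetQueryObjectuiv with GL_QUERY_RESULT_AVAILABLE on a query ended after the last draw call.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Polls `isReady` until it returns true, sleeping `interval` between polls so
// the calling thread gives up the CPU instead of spinning.
// Returns the number of times the thread went to sleep.
// `isReady` is a hypothetical stand-in for the real check (e.g. a lambda that
// wraps glGetQueryObjectuiv(query, GL_QUERY_RESULT_AVAILABLE, &available)).
inline int waitWithSleep(const std::function<bool()>& isReady,
                         std::chrono::milliseconds interval = std::chrono::milliseconds(1))
{
    int sleeps = 0;
    while (!isReady()) {
        std::this_thread::sleep_for(interval);
        ++sleeps;
    }
    return sleeps;
}
```

How long each sleep really lasts depends on the operating system's timer granularity, so the wait can overshoot the moment the result becomes available.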

On Windows, Sleep(1) will actually put the thread to sleep for 10-15 ms. Timers of this kind are not accurate.

yooyo

Thank you, I hadn’t understood it that way.

sleep(1) is just very expensive: one second is far too long. usleep might be better for that purpose if sleeping is really needed.

Yes, before calling SwapBuffers, use the CPU for whatever you want.

Calling glFinish causes the driver to pause and wait until all GPU work is finished. Why do you call this? Do you read back something from the GPU?
My application is multithreaded: there is a rendering thread plus other threads with lower priorities.
When the rendering thread has finished sending graphics commands, it has to wait for the GPU to finish its work and then wait for the next vertical retrace before it starts sending new commands.
Since the rendering thread has the highest priority, it must not use CPU cycles during the wait, so that the other threads can do some work.

sleep(1) is just very expensive: one second is far too long. usleep might be better for that purpose if sleeping is really needed.
Sleep(1) means a 1 ms sleep, not 1 second (look in MSDN).

Anyway, Sleep cannot be used because it is not precise enough.

What about a high-resolution multimedia timer (timeSetEvent)?

In the rendering thread, before calling SwapBuffers:

1/ measure the time left until the next vertical retrace
2/ start a timer event for this period
3/ take a semaphore (the rendering thread goes to sleep so that other threads can use the CPU)

When the timer expires, the callback function is called and releases the semaphore.

Could this work?
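
A portable sketch of that timer-plus-semaphore idea is shown below. It is not the Windows implementation: a helper std::thread stands in for the timeSetEvent callback, and `BinarySemaphore` and `sleepUntilRetrace` are hypothetical names chosen for illustration. The time left until the retrace is assumed to have been measured already and is passed in as `timeLeft`.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Minimal binary semaphore built on a mutex and a condition variable.
class BinarySemaphore {
public:
    void release() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            signaled_ = true;
        }
        cv_.notify_one();
    }
    void acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return signaled_; });  // blocks without spinning
        signaled_ = false;
    }
private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool signaled_ = false;
};

// Blocks the caller for roughly `timeLeft`. The helper thread plays the role
// of the timer callback: it releases the semaphore when the period expires.
// Returns the time the caller actually spent waiting.
inline std::chrono::steady_clock::duration
sleepUntilRetrace(std::chrono::microseconds timeLeft)
{
    BinarySemaphore sem;
    const auto start = std::chrono::steady_clock::now();
    std::thread timer([&sem, timeLeft] {
        std::this_thread::sleep_for(timeLeft);  // stand-in for the timer event
        sem.release();                          // what the callback would do
    });
    sem.acquire();  // rendering thread sleeps here; other threads can run
    timer.join();
    return std::chrono::steady_clock::now() - start;
}
```

While the caller is blocked in `acquire()` it uses no CPU, which is the property the rendering thread needs. The accuracy of the wake-up still depends on the timer used in place of `sleep_for`.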

Sleep(1) means a 1 ms sleep, not 1 second (look in MSDN).

Anyway, Sleep cannot be used because it is not precise enough.

Under Linux it works the way I described, and since I don’t have Windows installed, I simply couldn’t know that.

Originally posted by golem:

In the rendering thread, before calling SwapBuffers:

1/ measure the time left until the next vertical retrace
2/ start a timer event for this period
3/ take a semaphore (the rendering thread goes to sleep so that other threads can use the CPU)

When the timer expires, the callback function is called and releases the semaphore.

Could this work?
Yes. I have used this technique with Windows XP for several years.

  1. Initialize a timer resource with CreateWaitableTimer(NULL, TRUE, NULL).
  2. Use IDirectDraw_GetScanLine() to determine when the next vsync will occur.
  3. Use SetWaitableTimer(timer, period, 0, NULL, NULL, 0) and
    WaitForSingleObject(timer, INFINITE) to sleep for a period of time. No callback is required.

The horizontal scan line timing must be determined in order to convert the GetScanLine result to a time. This can be done using a calibration loop or a tool like the PowerStrip API.
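
As an illustration of the scan line to time conversion, here is a small sketch. The function name and all parameters are hypothetical: `totalLines` (visible plus blanking lines per frame) and `refreshHz` are assumed to come from the calibration step, and `targetLine` is whichever line you treat as the start of the vertical retrace.

```cpp
// Estimates the time, in microseconds, until the scan reaches `targetLine`.
// Assumptions (not taken from any API): every scan line takes the same time,
// `totalLines` counts visible plus blanking lines per frame, and
// `currentLine` is less than `totalLines`.
inline double microsecondsUntilLine(unsigned currentLine, unsigned targetLine,
                                    unsigned totalLines, double refreshHz)
{
    // Duration of one scan line: one frame period divided by lines per frame.
    const double lineTimeUs = 1e6 / (refreshHz * totalLines);
    // Lines still to be scanned, wrapping around into the next frame if the
    // target line has already been passed in the current one.
    const unsigned remaining = (targetLine + totalLines - currentLine) % totalLines;
    return remaining * lineTimeUs;
}
```

For example, with 1000 total lines at 100 Hz each line takes 10 µs, so going from line 200 to line 768 is 568 lines, or about 5680 µs. The result would then be converted to the units expected by the timer call.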

Perhaps I misunderstand the issue, but when the driver is waiting on vsync, isn’t it doing so by placing the thread into a wait state? It isn’t going to just sit there in a busy loop waiting for vsync time. Or at least I would hope that the driver is not busy-looping. So it seems to me that it isn’t really an issue, especially if you have a separate rendering thread.

Perhaps I misunderstand the issue, but when the driver is waiting on vsync, isn’t it doing so by placing the thread into a wait state? It isn’t going to just sit there in a busy loop waiting for vsync time. Or at least I would hope that the driver is not busy-looping. So it seems to me that it isn’t really an issue, especially if you have a separate rendering thread.
If you call glFinish right after SwapBuffers with vertical sync turned on, glFinish does not return until the next vertical retrace. If you launch Task Manager, you will see that your process is using 100% of the CPU (don’t forget to call SetThreadAffinityMask(GetCurrentThread(), 1); to lock your thread to one CPU).

If you don’t call glFinish after SwapBuffers, Task Manager reports CPU usage close to 0 (if you send very few graphics commands). The application is then blocked most of the time and does not use the CPU while it is blocked.

The reason I want to wait for the next vertical retrace before sending new commands is to control latency, i.e. the time that elapses between a move of the joystick and the display of the image corresponding to that joystick position. If you don’t call glFinish, you queue a certain number of frames in the driver FIFO (how many depends on the driver implementation) and you cannot guarantee a fixed response time.