[WIN32] Waiting for VSync spikes processor?

Goofing around w/ my wglSwapInterval today, I noticed something that may be very, very bad: the wait performed in SwapBuffers when the swap interval is greater than zero is NOT a simple block - it actually spikes my CPU. (P4 2.0 GHz, Win2k, GeForce4 MX 420 (driver 44.03), 512 MB RAM, Intel MB)

The reason I started goofing around w/ it was that I still sometimes get tearing when I set the swap interval to 1. (Yes, I have it set to On by default in my display settings panel.) I hadn’t given it much thought in a while, but then I remembered that back in the old mode 13h days, when you were actually examining bits, you waited for the VBL signal to go hot 2x to avoid tearing.

I tried wglSwapInterval( 2 ). When I did, Winamp started underrunning its buffers. Didn’t seem right, so I poured salt into the wound, and set wglSwapInterval( 20 ). Other apps were suddenly getting starved of CPU time and bus bandwidth. Nothing worked, and my CPU usage was pegged at 100%, with a high percentage of time spent in kernel mode (~80%), according to Task Manager.

I assumed they were using a mutex (kernel mode object) to sync when they block the rendering thread, so that it just goes to sleep efficiently, but that doesn’t seem to be what happens. I’m running at a 72 Hz refresh rate, so waiting for 20 VBLs takes 278 ms. On a P4, a lot can be done in 278 ms.
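For reference, the arithmetic behind that 278 ms figure can be sketched in a few lines of C. The function name vblank_wait_ms is made up for illustration and is not part of any OpenGL or Win32 API; it only assumes the wait lasts exactly the given number of refresh periods.

```c
/* Hypothetical helper (not a real WGL/Win32 call): how long a swap
 * would block, in milliseconds, if it waits for `interval` vertical
 * blanks on a display refreshing at `refresh_hz`. */
double vblank_wait_ms(int interval, double refresh_hz)
{
    /* One refresh period is 1000 / refresh_hz milliseconds. */
    return 1000.0 * (double)interval / refresh_hz;
}
```

At 72 Hz one refresh period is about 13.9 ms, so an interval of 2 comes to roughly 27.8 ms and an interval of 20 to roughly 278 ms.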

HOWEVER, then I got to thinking about the task scheduler and whatnot. If you were to block a thread and use a mutex for synchronization, you’re going to queue the thread at the back of the task list. That’s at LEAST 10ms - a long time for that low-latency rendering thread to wait. The only way to keep the task scheduler out of it is to spike the CPU at this point, like spin counts on critical sections.

If the CPU is spiked during a VBL wait, what kind of processing can you possibly do during that -relatively long- time?

Anyone have any thoughts or suggestions? I’d like to avoid tearing, and it seems to me that I’ll have to set the swap interval to 2. That’s going to chew up other apps, and most annoyingly, Winamp. I’m still wary of splitting the threads into render thread and process thread, just because that brings the scheduler into play, mandating at least 10ms waits between thread switches, but I’d go for it if I could sleep the thread that’s blocking for the VBL, while it’s waiting for the VBL.

Thank you for your bandwidth,
– Succinct

Do you have an IRQ assigned to the card?
I’m not sure whether NVIDIA chips can signal the blank period via interrupts, but that may just be it.

Are you using or have you tried double buffering?

I’m still wary of splitting the threads into render thread and process thread, just because that brings the scheduler into play, mandating at least 10ms waits between thread switches

That depends… A forced thread switch (signal/wait) usually takes about 2 microseconds (30-40 microseconds under Win9x). If both threads are working completely independently (no task synchronization), you may get problems if you do NOT sync to VBlank, since then you may end up in situations where you have 200 FPS during 10 ms, then 0 FPS during 10 ms, for instance (meaning that you draw 1-2 frames during your time slice, but none during the other thread’s slice). If you do have VSync on, however, then your frame rate should never be higher than the thread-switching rate (I have not actually tried this, but I think it should work OK).
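To put rough numbers on that, here is a small C sketch. Both function names are made up for illustration; the 10 ms time slice and the 200 FPS figure come from the discussion above, and the 72 Hz refresh rate from the original post. It is only arithmetic, not a measurement of what the scheduler actually does.

```c
/* Hypothetical helper: frames an unthrottled renderer finishes during
 * one scheduler time slice of `slice_ms` milliseconds. */
double frames_per_slice(double fps, double slice_ms)
{
    return fps * slice_ms / 1000.0;
}

/* Hypothetical helper: returns 1 if one frame period at `rate_hz`
 * lasts at least as long as a time slice, i.e. the renderer cannot
 * complete more than one frame per slice; returns 0 otherwise. */
int frame_outlasts_slice(double rate_hz, double slice_ms)
{
    return (1000.0 / rate_hz) >= slice_ms;
}
```

With these assumptions, frames_per_slice(200.0, 10.0) gives 2 frames per slice, which is the burst-then-stall pattern described above. With VSync at 72 Hz a frame takes about 13.9 ms, longer than a 10 ms slice, so frame_outlasts_slice(72.0, 10.0) returns 1.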

On the other hand, I do not know if using two threads would solve your problem. Just my comments…