Goofing around w/ my wglSwapInterval today, I noticed something that may be very, very bad: the wait performed in SwapBuffers when the swap interval is greater than zero is NOT a simple block - it actually spikes my CPU. (P4 2.0 GHz, Win2k, GeForce4 MX 420 (44.03 drivers), 512 MB RAM, Intel MB)
The reason I started goofing around w/ it was that I still sometimes get tearing when I set the swap interval to 1. (Yes, I have vsync set to On by default in my display settings panel.) I hadn’t given it much thought in a while, but then I remembered that back in the old mode 13h days, when you were actually examining bits, you waited for the VBL signal to go hot twice to avoid tearing.
I tried wglSwapInterval( 2 ). When I did, Winamp started underrunning its buffers. That didn’t seem right, so I poured salt into the wound and set wglSwapInterval( 20 ). Other apps were suddenly getting starved of CPU time and bus bandwidth. Nothing worked, and according to Task Manager my CPU usage was pegged at 100%, with a high percentage of that time (~80%) spent in kernel mode.
I had assumed they were using a mutex (a kernel-mode object) to sync when they block the rendering thread, so that it just goes to sleep efficiently. But I’m running at a 72 Hz refresh rate, so waiting for 20 VBLs takes about 278 ms. On a P4, a lot can be done in 278 ms.
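For anyone checking my math, here is the arithmetic as a tiny C++ helper. The function name vbl_wait_ms is just something I made up for this post; it is not part of WGL or any driver API.

```cpp
// Hypothetical helper (not a WGL call): how long a swap interval of
// `intervals` vertical blanks lasts at a given refresh rate, in ms.
double vbl_wait_ms(double refresh_hz, int intervals) {
    // One VBL period is 1000 / refresh_hz milliseconds.
    return 1000.0 * intervals / refresh_hz;
}
```

At 72 Hz one VBL is roughly 13.9 ms, so vbl_wait_ms(72.0, 2) comes to roughly 27.8 ms and vbl_wait_ms(72.0, 20) to roughly 278 ms.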
HOWEVER, then I got to thinking about the task scheduler and whatnot. If you block a thread and use a mutex for synchronization, you’re going to queue the thread at the back of the task list. That’s at LEAST 10ms - a long time for that low-latency rendering thread to wait. The only way to keep the task scheduler out of it is to spike the CPU at this point, like spin counts on critical sections.
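To make the sleep-vs-spin tradeoff concrete, here is a rough sketch of a middle ground: sleep for most of the wait, then spin only for the last stretch. This is NOT what the driver does (I have no idea what it does internally), and wait_until_hybrid is a made-up name. It also assumes you can estimate when the deadline is, which I don’t know how to do for a real VBL; the deadline here is just a point on std::chrono::steady_clock.

```cpp
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;

// Hypothetical hybrid wait: sleep while the deadline is far away, then
// busy-wait for the final `spin_margin`. Returns how late we finished.
Clock::duration wait_until_hybrid(Clock::time_point deadline,
                                  Clock::duration spin_margin) {
    // Coarse phase: hand the CPU to other threads (Winamp, etc.).
    // The scheduler may wake us late, so spin_margin should be larger
    // than the scheduler's wake-up granularity (assumed, not measured).
    if (deadline - Clock::now() > spin_margin) {
        std::this_thread::sleep_until(deadline - spin_margin);
    }
    // Fine phase: spin so the scheduler cannot add latency at the end.
    while (Clock::now() < deadline) {
        // intentionally empty: busy-wait
    }
    return Clock::now() - deadline;
}
```

With a 278 ms wait and a spin margin of, say, 15 ms, this would burn CPU for only a small slice of the wait instead of all of it, provided the sleep actually returns on time. If the sleep overshoots the deadline, the spin loop exits immediately and the returned lateness shows by how much.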
If the CPU is spiked during a VBL wait, what kind of processing can you possibly do during that -relatively long- time?
Anyone have any thoughts or suggestions? I’d like to avoid tearing, and it seems to me that I’ll have to set the swap interval to 2. That’s going to chew up other apps and, most annoyingly, Winamp. I’m still wary of splitting the work into a render thread and a process thread, just because that brings the scheduler into play, mandating at least 10ms waits between thread switches. But I’d go for it if I could put the thread that’s blocking for the VBL to sleep while it waits.
Thank you for your bandwidth,
– Succinct