Asynchronous buffer swap

I am using GL for a Windows application that needs almost-real-time control of the CPU. The polygon count is very low and I’m using fairly simple graphics, so it seems that the current limitation on my graphics performance is the time I have to block to call SwapBuffers().

On an ATI FireGL2, this operation seemed to be really fast (~100 microseconds for a small window size), but I’m also trying to use a GeForce4-based card (for cost and for what appears to be better dual-display support), for which the buffer swap time seems to be over a millisecond for the same window size. That’s plenty fast to satisfy my frame rate demands, but that’s a millisecond during which I’d like to use the CPU. My questions:

  1. The first card is AGP; the GeForce4 is a PCI card. Is the difference in buffer swap time likely due to the difference in bus speed, or to a difference in the cards? (The FireGL2 does cost $800 more than the GeForce after all…)

  2. Is there any hope of calling SwapBuffers() asynchronously, either using standard GL tricks or using manufacturer-specific extensions? I’ve heard this is likely to be a component of the GL2.0 standard, but is there any way to do this now? I will play around with some thread-based approaches to doing the same thing, but I feel like I can’t beat 1ms with that approach.

  3. Does anyone here know whether DirectX provides any support for asynchronous transfers that might not be available yet in OpenGL?

Thanks…

-Dan

Have you turned off VSync in the nVIDIA display control panel?

Also, I think GeForces swap using copy when not in full screen mode. It may be that ATI swaps using overlays, which is likely to be faster. But a millisecond is a long time!

Originally posted by dmorris:
I am using GL for a Windows application that needs almost-real-time control of the CPU.

“Real-time” and “Windows” are mutually exclusive, as far as I’m concerned…

Swap copy taking a millisecond on the HW side is nothing outrageous. But you’re looking at the CPU, I presume.

Note that what you may be noticing is our driver latency limitation; we will not return control to your application on SwapBuffers until the last SwapBuffers is done. If your app is hardware-limited, this will often be measurable.

It’s not at all clear to me what an “asynchronous” SwapBuffers would be???

- Matt

Originally posted by mcraighead:
“Real-time” and “Windows” are mutually exclusive, as far as I’m concerned…

That’s why I say “almost-real-time”… but this particular issue seems to be fairly OS-independent anyway. And actually, Windows is much better for this sort of job than it used to be. If you have a decent machine and you put your process at the highest priority class, I’ve observed that you really don’t lose the CPU at all (maybe five or six 500us blocks over the course of a few seconds of running time) if you don’t do anything else with your system. (I refer to Win2k; I have not even ventured to try such experiments on XP…)
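
For reference, raising the priority class could look roughly like the sketch below. It assumes that “highest priority class” means the Win32 `REALTIME_PRIORITY_CLASS`; the function name `request_realtime_priority` is made up for illustration, and the non-Windows branch exists only so the snippet compiles elsewhere.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: returns true only if the process actually ends up
// in the real-time priority class.
bool request_realtime_priority() {
#ifdef _WIN32
    HANDLE self = GetCurrentProcess();  // pseudo-handle, no need to close it
    if (!SetPriorityClass(self, REALTIME_PRIORITY_CLASS)) {
        return false;
    }
    // Read the class back rather than trusting the request: the system
    // may grant a lower class than the one asked for.
    return GetPriorityClass(self) == REALTIME_PRIORITY_CLASS;
#else
    return false;  // assumption: nothing to do outside Windows
#endif
}
```

A process at this priority can starve the rest of the system, so it is probably wise to test with `HIGH_PRIORITY_CLASS` first.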

It’s not at all clear to me what an “asynchronous” SwapBuffers would be???

Of course I don’t know the details of what really occurs during buffer swap, but it seems that the CPU doesn’t really need to be involved… this is basically a block transfer to the video buffer, which it seems could be managed by the graphics card and/or the system DMA controller without the CPU.

So this would seem like a good candidate for an asynchronous transfer, during which the user could do anything that doesn’t write to the framebuffer.

That’s what I had in mind, but I would believe that the reality of the buffer swap is much more complicated than I make it sound, so perhaps something like this has not yet been implemented. But that’s what I meant.

Also, turning off the vsync option made a big difference. I still would be curious - partially just as an academic question - about any way of not tying up the CPU during buffer swap, or whether any other manufacturers support this, but turning off vsync helped significantly.

-Dan

I solved this problem by mandating the presence of another cpu.

Cas

In the sense that you describe, it seems to me that SwapBuffers is already asynchronous. It can return control to the user long before the HW swap occurs.

Another sort of “asynchronous” SwapBuffers might be called from a thread other than the rendering thread. This can also be done with the current SwapBuffers API.
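
The second idea, issuing the swap from a thread other than the rendering thread, can be sketched with standard C++ threading. This is only an illustration under stated assumptions: `SwapWorker` and its methods are hypothetical names, and the `present` callable stands in for the real `SwapBuffers()` call. Whether the driver is happy to have the swap issued from a different thread should be verified on the target hardware.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical helper: runs a blocking "present" call on its own thread so
// the caller gets control back right away.
class SwapWorker {
public:
    explicit SwapWorker(std::function<void()> present)
        : present_(std::move(present)), thread_([this] { run(); }) {}

    ~SwapWorker() {
        {
            std::lock_guard<std::mutex> lock(m_);
            quit_ = true;
        }
        cv_.notify_all();
        thread_.join();  // any swap still in flight finishes first
    }

    // Hands one swap to the worker. Blocks only if the previous swap has
    // not finished yet, so at most one swap is ever outstanding.
    void request_swap() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !pending_; });
        pending_ = true;
        lock.unlock();
        cv_.notify_all();
    }

    // Blocks until no swap is outstanding; call before drawing again.
    void wait_idle() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !pending_; });
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(m_);
        for (;;) {
            cv_.wait(lock, [this] { return pending_ || quit_; });
            if (!pending_) return;  // woken only to quit
            lock.unlock();
            present_();             // the blocking call runs without the lock
            lock.lock();
            pending_ = false;
            cv_.notify_all();
        }
    }

    std::function<void()> present_;
    std::mutex m_;
    std::condition_variable cv_;
    bool pending_ = false;
    bool quit_ = false;
    std::thread thread_;  // declared last so the members above exist first
};
```

The main loop would call `request_swap()` once a frame is finished, do its non-graphics work, and then call `wait_idle()` before touching the framebuffer again. On a single-CPU machine this only helps if the swap call really sleeps rather than spins.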

- Matt

Originally posted by cix>foo:
I solved this problem by mandating the presence of another cpu.

Cas

Wouldn’t it work with “just another thread” too? I mean, the SwapBuffers call can hardly be blocking the CPU, just the thread (?). OK, it might if you’re transferring from main RAM to the video card, but this all happens on the video card, I suppose.