Very long swap times, frames getting dropped



Matt Phillips
03-01-2010, 11:01 AM
Hello,

I'm using OpenGL through Qt 4.6. I'm rendering relatively simple graphics--40 or so gluDisks about 75 pixels wide--and on some runs of the program, I'm losing up to 25% of the frames. I read the system clock immediately before and immediately after the swapBuffers() call, and when a frame is dropped the swap time is around 25ms (on a 60Hz monitor, where one refresh period is about 16.7ms). This varies somewhat from run to run, but within a run it can be quite consistent, e.g. 25500 microseconds +/-100us for each dropped frame. (Though typically other swap times, e.g. 30ms, will be interspersed.) Anyway, the result is graphics so choppy as to be unusable. Like I said, on some runs of the program it's worse than on others, but I have no idea why--everything else on the system is the same.

Does anybody know what's up, and how I can eliminate this? I call glFinish() and glFlush() after my drawing is done, before I call swapBuffers(). This doesn't happen on a Mac. Thanks!

Matt

Stephen A
03-01-2010, 12:37 PM
You need to give more information for us to make any suggestions:

- what video hardware are you using?
- are you getting hardware acceleration?
- how large is your viewport? (this will only matter if you are not getting hardware acceleration or if you are using very old/underpowered hardware)
- is vsync enabled?

You should not call glFlush() or glFinish() before SwapBuffers(): the swap already performs an implicit glFlush(), and an explicit glFinish() introduces a pipeline stall that may reduce performance.

Dark Photon
03-01-2010, 05:12 PM
On top of what Stephen A said:

- What does "glxinfo | egrep 'vendor|renderer|version'" print?
- Are you using a compositing-capable window manager?
- If so, do you have compositing disabled?
(for KDE4: Start->Configure Desktop->Desktop->Desktop Effects->Uncheck Enable desktop effects->Apply)

Matt Phillips
03-02-2010, 09:10 AM
Hello,

Thanks guys. I'm using an nVidia GeForce GTS 250 in TwinView mode on a new-ish (circa last summer) HP with 8GB RAM and generally 'high end' specs. I'm running my program under Ubuntu 9.10. As for hardware acceleration, I'm not sure? The GTS is supposed to have good OpenGL support, and the first chunk of my glxinfo output suggests to me that things are pretty well enabled--


direct rendering: Yes
server glx vendor string: NVIDIA Corporation
server glx version string: 1.4
server glx extensions:
    GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_SGIX_fbconfig,
    GLX_SGIX_pbuffer, GLX_SGI_video_sync, GLX_SGI_swap_control,
    GLX_EXT_texture_from_pixmap, GLX_ARB_create_context, GLX_ARB_multisample,
    GLX_NV_float_buffer, GLX_ARB_fbconfig_float, GLX_EXT_framebuffer_sRGB
client glx vendor string: NVIDIA Corporation
client glx version string: 1.4
client glx extensions:
    GLX_ARB_get_proc_address, GLX_ARB_multisample, GLX_EXT_visual_info,
    GLX_EXT_visual_rating, GLX_EXT_import_context, GLX_SGI_video_sync,
    GLX_NV_swap_group, GLX_NV_video_out, GLX_SGIX_fbconfig, GLX_SGIX_pbuffer,
    GLX_SGI_swap_control, GLX_ARB_create_context, GLX_NV_float_buffer,
    GLX_ARB_fbconfig_float, GLX_EXT_fbconfig_packed_float,
    GLX_EXT_texture_from_pixmap, GLX_EXT_framebuffer_sRGB,
    GLX_NV_present_video, GLX_NV_multisample_coverage
GLX version: 1.3
GLX extensions:
    GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_SGIX_fbconfig,
    GLX_SGIX_pbuffer, GLX_SGI_video_sync, GLX_SGI_swap_control,
    GLX_EXT_texture_from_pixmap, GLX_ARB_create_context, GLX_ARB_multisample,
    GLX_NV_float_buffer, GLX_ARB_fbconfig_float, GLX_EXT_framebuffer_sRGB,
    GLX_ARB_get_proc_address
OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: GeForce GTS 250/PCI/SSE2
OpenGL version string: 3.0.0 NVIDIA 185.18.36
OpenGL shading language version string: 1.30 NVIDIA via Cg compiler


About 100 OpenGL extensions are also listed. (Dark Photon, this answers your question.)

My viewport is the size of the entire screen, 1920x1200, one of two same-sized monitors hooked up to the card. Civ 4, which is vastly more graphically demanding than what I'm doing, plays completely smoothly under Windows.

I was not using vsync when I made my original post, but when I tried it via the nVidia settings panel it didn't seem to make much difference (once I removed the 'glFinish()' command). I only had a few frame drops running it just now, but that's still >0 so I don't think the problem is solved (and maybe the next run, more...). I am double buffering.

As for my window manager, I'm using Gnome with all the desktop effects turned off. Interestingly my system can't seem to handle these effects--when I turn them on the taskbars disappear and the effects revert back to 'none' after a few seconds.

As a last note, putting these 40+ rings on the screen--which is basically not a whole lot more than 40+ calls to gluDisk--takes 6-7ms, which seems like a *really* long time. On the other hand, I'm able to render a full-screen texture--which consists of a single small 1D texture repeated ~50000 times every frame--in less than 5ms per frame. I don't understand this at all. Maybe there's just some gross inefficiency in my code somewhere in the former case, but I don't think so.

Anyway, thanks for your consideration!

Matt

Dark Photon
03-02-2010, 09:32 AM
"My viewport is the size of the entire screen, 1920x1200"
Shrink your window and retry. First step to making sure you're not fill-limited.

Stanley L
03-02-2010, 10:26 AM
Have you also tried other objects than the rings created by gluDisk and your full-screen quad?

There's an off chance that an implementation difference in the gluDisk function might be causing the loss in performance. A quick Google on gluDisk returned a person mentioning that they had to reduce slice and loop counts to get similar performance when using gluDisk under Linux (http://www.devmaster.net/forums/showthread.php?t=4326).

Matt Phillips
03-02-2010, 06:08 PM
Hello,

Dark Photon--tried that, all the way down to 800x500. It didn't change anything.

Stanley L--I tried reducing the number of slices to 4, still got some frame drops. I also get frame drops when there are few rings on the screen.

To get a general idea of what my system is capable of, graphics-wise, I downloaded Sauerbraten, a first person shooter written using OpenGL. It played on *both* screens--3840x1200--perfectly smoothly. glxgears runs full-screen, rendering 60fps (same as monitor refresh rate) smoothly, maybe some rare dropped frames.

Things have been a lot better for the last few runs, perhaps since enabling vsync, but still with some dropped frames, visible as 'stutters' in the animation. Timing is pretty important for my application, so I'd like to get rid of them all. The Qt code for swapBuffers() doesn't appear to do much more than do some validity and timing checks, and then call glXSwapBuffers():


void QGLContext::swapBuffers() const
{
    Q_D(const QGLContext);
    if (!d->valid)
        return;
    if (!deviceIsPixmap()) {
        int interval = d->glFormat.swapInterval();
        if (interval > 0) {
            typedef int (*qt_glXGetVideoSyncSGI)(uint *);
            typedef int (*qt_glXWaitVideoSyncSGI)(int, int, uint *);
            static qt_glXGetVideoSyncSGI glXGetVideoSyncSGI = 0;
            static qt_glXWaitVideoSyncSGI glXWaitVideoSyncSGI = 0;
            static bool resolved = false;
            if (!resolved) {
                const QX11Info *xinfo = qt_x11Info(d->paintDevice);
                QGLExtensionMatcher extensions(glXQueryExtensionsString(xinfo->display(), xinfo->screen()));
                if (extensions.match("GLX_SGI_video_sync")) {
                    glXGetVideoSyncSGI = (qt_glXGetVideoSyncSGI)qglx_getProcAddress("glXGetVideoSyncSGI");
                    glXWaitVideoSyncSGI = (qt_glXWaitVideoSyncSGI)qglx_getProcAddress("glXWaitVideoSyncSGI");
                }
                resolved = true;
            }
            if (glXGetVideoSyncSGI && glXWaitVideoSyncSGI) {
                uint counter;
                if (!glXGetVideoSyncSGI(&counter))
                    glXWaitVideoSyncSGI(interval + 1, (counter + interval) % (interval + 1), &counter);
            }
        }
        glXSwapBuffers(qt_x11Info(d->paintDevice)->display(),
                       static_cast<QWidget *>(d->paintDevice)->winId());
    }
}


but I thought I would post it in case anybody sees anything. Thanks for your suggestions so far and I would definitely appreciate any more you might have--
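For readers puzzling over the glXWaitVideoSyncSGI() arguments in that listing: as far as the GLX_SGI_video_sync extension describes it, the call sleeps until the retrace counter modulo the first argument equals the second. The following is only a simulation of that condition, not a call into GLX, and the function name vblanksUntilWake is made up for illustration. It suggests that with the arguments Qt passes, the wait lasts `interval` vertical retraces before glXSwapBuffers() is reached.

```cpp
#include <cassert>

// Simulates the wake-up condition of
//   glXWaitVideoSyncSGI(divisor, remainder, &count)
// which is assumed here to return once (counter % divisor) == remainder.
// Returns how many vertical retraces pass before that condition holds,
// using the same divisor/remainder expressions as the Qt listing.
// Hypothetical helper: pure arithmetic, no GLX involved.
unsigned vblanksUntilWake(unsigned counter, int interval)
{
    assert(interval > 0);
    const unsigned step      = static_cast<unsigned>(interval);
    const unsigned divisor   = step + 1;
    const unsigned remainder = (counter + step) % divisor;

    unsigned waited = 0;
    while (counter % divisor != remainder) {
        ++counter;  // one vertical retrace elapses
        ++waited;
    }
    return waited;
}
```

With a swap interval of 1 this returns 1 for any starting counter, i.e. the code waits for the next retrace and then swaps. If the driver's own vsync is also enabled, the swap itself may wait again, which is one thing worth ruling out when chasing long swap times.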

Matt

Dark Photon
03-03-2010, 05:32 AM
"...I downloaded Sauerbraten, a first person shooter written using OpenGL. It played on *both* screens--3840x1200--perfectly smoothly. glxgears runs full-screen, rendering 60fps (same as monitor refresh rate) smoothly, maybe some rare dropped frames."
Sure sounds like it's your app then, not the window manager, GL drivers, or OS.


"Things have been a lot better for the last few runs, perhaps since enabling vsync, but still with some dropped frames, visible as 'stutters' in the animation. Timing is pretty important for my application, so I'd like to get rid of them all. The Qt code for swapBuffers() doesn't appear to do much more than do some validity and timing checks, and then call glXSwapBuffers()"

Suggest you take Qt out of the picture and retry. Throw your code in a GLUT app and see if you've still got the problem.

Put some timing calipers in your code to see where all your time is getting lost.

Do you have lots of GPU memory committed? e.g. textures, VBOs, etc.? We know you have a big framebuffer, and if you've got MSAA on, that can be a huge memory eater. E.g. 1920x1200 @ 4X MSAA = ~140MB. I've seen random stutter when you're blowing past GPU memory and the driver is having to demand-page onto the GPU.
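A rough way to sanity-check a figure like that is to multiply it out. The sketch below is a back-of-the-envelope estimate only: the per-sample byte counts are assumptions (4 bytes of color plus 4 bytes of depth/stencil per sample), the helper name msaaFramebufferBytes is hypothetical, and what a driver actually allocates may differ.

```cpp
#include <cassert>
#include <cstdint>

// Estimated storage for a multisampled framebuffer, in bytes.
// bytesPerSample is an assumption supplied by the caller, e.g.
// 8 = 4 bytes RGBA8 color + 4 bytes depth/stencil per sample.
// Hypothetical helper; real driver allocations are implementation-specific.
std::uint64_t msaaFramebufferBytes(unsigned width, unsigned height,
                                   unsigned samples, unsigned bytesPerSample)
{
    assert(samples > 0 && bytesPerSample > 0);
    return static_cast<std::uint64_t>(width) * height * samples * bytesPerSample;
}
```

Under those assumptions, msaaFramebufferBytes(1920, 1200, 4, 8) gives 73,728,000 bytes for one multisampled color plus depth/stencil set, and doubling the per-sample cost to 16 bytes (for example, a second multisampled set) gives 147,456,000 bytes, which is about 140 MiB and in the same range as the number quoted above.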

Matt Phillips
03-03-2010, 07:54 AM
Hi Dark Photon,


"Sure sounds like it's your app then, not the window manager, GL drivers, or OS."

My default assumption--but frame drops are so low-level, it's hard for me to pin this to a problem in my code. I.e., I have


t1 = GetTimeStamp();
swapBuffers();
t2 = GetTimeStamp();

and what I'm calling a frame drop is when t2 - t1 is ~25ms. But MSAA isn't the issue--I checked--and I typically don't have anything else running except the Qt IDE, Firefox, and the companion gui process which controls the OpenGL, and Firefox is in a different desktop. The gui is pretty simple so I don't think there's any kind of external overload. I tried running my program from the desktop instead of through the IDE as I have been doing, and frame drops *may* have decreased--I'll repost if this and release vs. debug building makes a difference. Otherwise we sure have run through a lot of possibilities at this point...
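The measurement described here can be sketched in portable C++ as follows. This is an illustration under assumptions, not the poster's actual code: GetTimeStamp() is replaced by std::chrono::steady_clock, the helper names timeMicroseconds and missedRefreshes are invented, and "dropped" is taken to mean that at least one whole refresh period elapsed inside the swap call.

```cpp
#include <cassert>
#include <chrono>

// Runs fn (for example a lambda wrapping swapBuffers()) and returns how long
// it took, in microseconds, using a monotonic clock.
template <typename Fn>
long long timeMicroseconds(Fn&& fn)
{
    const auto t1 = std::chrono::steady_clock::now();
    fn();
    const auto t2 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count();
}

// Number of whole refresh periods that fit inside a swap that blocked for
// swapMicros. 0 means the swap returned within one period; 1 or more means
// at least one retrace went by while the call was blocked.
int missedRefreshes(long long swapMicros, double refreshHz)
{
    assert(swapMicros >= 0 && refreshHz > 0.0);
    const double periodMicros = 1e6 / refreshHz;  // about 16667 us at 60 Hz
    return static_cast<int>(static_cast<double>(swapMicros) / periodMicros);
}
```

Fed with the numbers from this thread, a 25500 us swap at 60 Hz counts as one missed refresh, while a 10 ms swap counts as none. Logging the result per frame, rather than eyeballing raw times, makes it easier to see whether drops cluster or recur at a fixed interval.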


"Suggest you take Qt out of the picture and retry."

That's a good suggestion, as the graphics are all done within a single isolated process that doesn't have a gui. Unfortunately it would take time I don't have right now, but I will definitely keep this in mind.

Re timing, I've done that and found that in fact most of the 6-7ms I attributed to gluDisk draw time actually came from communication between my two processes (didn't see that coming--I thought local socket connections were supposed to be lightning fast). Oops, sorry about that.
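To put a number on local socket cost in isolation, a minimal POSIX sketch such as the one below can help. It is an assumption-laden illustration: it uses an AF_UNIX socketpair inside a single process, so it measures only the raw send/receive path and none of the scheduling delay between two real processes, which may well be where milliseconds go. The function name localSocketRoundTripMicros is hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <sys/socket.h>
#include <unistd.h>

// Sends one byte across a local (AF_UNIX) socket pair and back again, and
// returns the round-trip time in microseconds, or -1 if any call fails.
// Both ends live in this process, so inter-process scheduling latency is
// deliberately NOT included in the measurement.
long long localSocketRoundTripMicros()
{
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -1;

    const char out = 'x';
    char received = 0;
    char back = 0;

    const auto t1 = std::chrono::steady_clock::now();
    const bool ok = write(fds[0], &out, 1) == 1        // "gui" sends a command
                 && read(fds[1], &received, 1) == 1    // "renderer" receives it
                 && write(fds[1], &received, 1) == 1   // "renderer" replies
                 && read(fds[0], &back, 1) == 1;       // "gui" gets the reply
    const auto t2 = std::chrono::steady_clock::now();

    close(fds[0]);
    close(fds[1]);
    if (!ok)
        return -1;
    assert(back == out);  // the byte must come back unchanged
    return std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count();
}
```

If this reports a few microseconds while the two-process setup shows milliseconds, the difference is more likely due to how and when each process polls its socket than to the socket itself.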

Anyway, thank you--

Matt

Matt Phillips
03-03-2010, 10:24 AM
Lest I forget where this all started--the other thing which suggests something deeper than a coding flaw is going on is the fact that this doesn't happen on a Mac Pro with an ATI X1900. The rendering is overall much slower--I can't even get fullscreen textures rendered within a frame (though OpenGL on the Mac in general is fine) so I wonder if I'm even getting hardware acceleration there--but no dropped frames.

Matt

nigels
03-27-2010, 12:16 PM
I also think adapting the same draw code in a GLUT application is worth a try. There might be something particular to Qt 4.6 that is creating a context in a way that is hitting a slow path.

Matt Phillips
03-29-2010, 09:02 AM
Hi,

I just saw your post, nigels--thanks. And to modify my original report, I am getting this on Mac OS X as well, when I start running different variations on the same code. So it probably is a genuine Qt issue. Too bad, since Qt has solved the cross-platform text rendering problem (QGLWidget::renderText())... Anyway, I'll get back to the boards if/when I do overhaul things, to report whether I was able to get rid of the long swap times by getting rid of Qt.

Matt

nigels
03-29-2010, 09:45 PM
It's worth looking into taking closer control of the kind of OpenGL context that is being created. Qt might be assuming all kinds of options (slow ones, for your GPU) in order to support functionality that you might not actually need.

- Nigel