Spike in frame rendering time

Hi,

I’ve been working on an application that requires rendering 3D scenes in OpenGL under hard real-time constraints. It logs the time delta between the first and the last OpenGL rendering call to measure frame stutter, as illustrated below. When I run it on GeForce hardware, I observe a sudden, single, deterministic spike in rendering time. The frame where the spike occurs changes depending on the complexity of the rendered scene, but running the application multiple times with the same scene always produces a spike at the very same frame. Nothing in the application code could explain such a spike, so I suspect this is driver-side (it looks as though the driver is profiling the application and, after hitting a given threshold, calls an internal function which takes a relatively long time to execute). To further support this theory, I’ve tested the same application on Radeon and Quadro hardware and see no such behavior. I’ve tried different options in the Nvidia control panel to no avail (although turning off adaptive power management did prevent a slight increase in rendering time following the spike).

How can I address this issue?

Thanks,
PM

[Attached image: graph of per-frame rendering times showing the spike]

Are the times in the graph milliseconds (ms)? If so, wow, that’s some spike!

First step: make sure there’s no funny-business at work with the driver’s queuing of your OpenGL commands. For profiling purposes, change your frame loop to do this:


// Start of frame
FrameTimer::Start();
{
  RenderYourFrameHere();
  glFinish();      // force the driver to complete all queued GL work before stopping the timer
}
FrameTimer::Stop();

// End of frame
SwapBuffers();
glFinish();        // also wait for the swap, so the next frame starts from a known state

I write it this way so that you don’t have to disable VSync (sync-to-vblank) to get consistent timings. Note that the frame timer should be a high-res timer (sub-ms accurate) which measures “wall clock time”.
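If you don’t already have one, here’s a minimal sketch of such a timer on Windows using QueryPerformanceCounter (the FrameTimer name just mirrors the snippet above; adapt it to your own setup):

#include <windows.h>
#include <cstdio>

namespace FrameTimer
{
  static LARGE_INTEGER freq, t0;

  void Start()
  {
    QueryPerformanceFrequency(&freq);  // counts per second
    QueryPerformanceCounter(&t0);      // frame start timestamp
  }

  void Stop()
  {
    LARGE_INTEGER t1;
    QueryPerformanceCounter(&t1);
    double ms = 1000.0 * double(t1.QuadPart - t0.QuadPart) / double(freq.QuadPart);
    printf("frame time: %.3f ms\n", ms); // log or graph this per frame
  }
}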

Does the spike go away?

If not, it could be excess time consumption in your process, or it could be a background process kicking you off a CPU core for a while. Run a process monitor which sorts processes by CPU utilization and updates 10X a second. You can watch it to verify that your process is never preempted and if it is, by what (you may need to disable Sync-to-VBlank here to ensure your process has consistently high CPU utilization). If something else kicks you off, resolve that problem and retest.

If it’s not obviously a background process kicking you off the CPU, you can use timers to narrow down where in your application code this time consumption is occurring.

Just to clarify - the frames here are all the very same? In other words, with the exception of the spike, there is no other difference between the frame on which the spike occurs and all of the other frames between program startup and the spike, and the very same scene is drawn in all of them?

If so, and if you could maybe post a GLIntercept log of a single frame, it would help analyze what’s going on here. There may be a perfectly valid reason for the spike that’s nothing to do with aberrant driver behaviour; e.g. if you create and destroy buffer objects at runtime, your GPU memory could become fragmented and the spike frame could be your driver moving things around and coalescing free blocks. In the absence of something like that GLIntercept log it’s just guessing, but that’s one example of the kind of thing that could cause this without it needing to be a driver bug.

Are the times in the graph milliseconds (ms)? If so, wow, that’s some spike!

Yes, milliseconds.

If not, it could be excess time consumption in your process, or it could be a background process kicking you off a CPU core for a while. Run a process monitor which sorts processes by CPU utilization and updates 10X a second. You can watch it to verify that your process is never preempted and if it is, by what (you may need to disable Sync-to-VBlank here to ensure your process has consistently high CPU utilization). If something else kicks you off, resolve that problem and retest.

Thanks for the suggestions. I ran the application alongside Process Explorer and had the latter update as fast as it could. I’ve been unable to uncover any background process preempting the application, although that doesn’t rule out the possibility. But since the spikes happen on discrete frames, the behavior is deterministic (always the same frames), and it’s found exclusively on Nvidia GeForce cards (and not ATI), I suspect the slowdowns occur within the Nvidia driver. I generated a timeline using the Nsight performance analyzer and it seems the spikes happen in the middle of wglSwapBuffers calls, further reinforcing the suspicion that the driver is doing some weird stuff behind the curtain. I already made use of performance counters to obtain precise frame timings.

Just to clarify - the frames here are all the very same? In other words, with the exception of the spike, there is no other difference between the frame on which the spike occurs and all of the other frames between program startup and the spike, and the very same scene is drawn in all of them?

Yes, it’s always the exact same scene, and no GL resource allocation (or any other kind of allocation, for that matter) is taking place after the first frame is rendered. A call is made to glGenerateMipmap on every frame, which should theoretically not need to allocate anything beyond the first call, but even after commenting it out, I still get the spikes.

Here’s the GLIntercept trace you’ve asked for:

glBindFramebuffer(GL_DRAW_FRAMEBUFFER,1)
glClearColor(0.400000,0.400000,0.800000,1.000000)
glDrawBuffers(1,0042F954)
glClear(GL_DEPTH_BUFFER_BIT | GL_COLOR_BUFFER_BIT)
glViewport(0,0,2048,2048)
glUseProgram(4)
glBindBuffer(GL_UNIFORM_BUFFER,1)
glBufferSubData(GL_UNIFORM_BUFFER,176,48,005CDF04)
glBindBuffer(GL_UNIFORM_BUFFER,0)
glBindBuffer(GL_UNIFORM_BUFFER,1)
glBufferSubData(GL_UNIFORM_BUFFER,128,48,005CE300)
glBindBuffer(GL_UNIFORM_BUFFER,0)
glBindBuffer(GL_UNIFORM_BUFFER,1)
glBufferSubData(GL_UNIFORM_BUFFER,64,64,005CE084)
glBindBuffer(GL_UNIFORM_BUFFER,0)
glBindBuffer(GL_UNIFORM_BUFFER,1)
glBufferSubData(GL_UNIFORM_BUFFER,0,64,005CE0C4)
glBindBuffer(GL_UNIFORM_BUFFER,0)
glBindBuffer(GL_UNIFORM_BUFFER,2)
glBufferSubData(GL_UNIFORM_BUFFER,0,80,005CE4A0)
glBindBuffer(GL_UNIFORM_BUFFER,0)
glBindVertexArray(2)
glDrawElements(GL_TRIANGLES,17412,GL_UNSIGNED_INT,00000000)
glBindVertexArray(0)
glBindFramebuffer(GL_DRAW_FRAMEBUFFER,0)
glBindTexture(GL_TEXTURE_2D,1)
glGenerateMipmap(GL_TEXTURE_2D)
glFinish()
glGetTexImage(GL_TEXTURE_2D,2,GL_RGBA,GL_UNSIGNED_BYTE,05D60020)
glFlush()
wglSwapBuffers(4C0117AD)=true 

Your help is truly appreciated, thanks!
PM

First, add a glFinish() immediately after wglSwapBuffers() and retest. That is not a hack; it’s a production fix that is commonly required when you need consistent frame time consumption and minimal latency in responding to user input.
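In other words, the end of your frame would look something like this (a sketch; hdc is whatever device context you’re already presenting to):

wglSwapBuffers(hdc);  // present the frame
glFinish();           // block until the GPU has actually finished, so the next frame starts from a known state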

If that doesn’t affect your spike, you could try disabling multithreading in the Nvidia GL driver (the “Threaded optimization” setting in the Nvidia Control Panel) and retest.

Here’s the GLIntercept trace you’ve asked for:

Nothing here just jumps out at me and says “that’s it!” Some thoughts:

2048x2048 is a large image; if you reduce the size, does that change the spike? 17412 is quite a few tris in one batch; does the spike change if you reduce the triangle count? How about if you use GL_UNSIGNED_SHORT indices instead of GL_UNSIGNED_INT?

Your glFinish before glGetTexImage shouldn’t be necessary (it’s implicit). Also, glGet* calls are often slow, particularly when (as in this case) they pull data back from the GPU to the CPU. It’s more efficient to do the readback through a ring buffer of PBOs and delay accessing the data. This allows the driver to pipeline the readback more efficiently and avoids blocking the CPU thread for as long (or, best case, at all) to access a frame that has been read back. What happens to the spike if you comment out the glGetTexImage call? If it’s gone, then you just need to switch to a more efficient way of reading back the data.
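For what it’s worth, here’s a rough sketch of a two-deep PBO ring for that readback (a sketch, not your code: the texture, mip level and format come from your trace, the 512x512 size assumes level 2 of a 2048x2048 base level, and names like pbo and frame are illustrative):

// Setup (once): a small ring of pixel-pack buffers for asynchronous readback
GLuint pbo[2];
glGenBuffers(2, pbo);
for (int i = 0; i < 2; ++i)
{
  glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo[i]);
  glBufferData(GL_PIXEL_PACK_BUFFER, 512 * 512 * 4, NULL, GL_STREAM_READ);
}
glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);

// Per frame: kick off this frame's readback into one PBO (returns without waiting for the transfer)...
glBindTexture(GL_TEXTURE_2D, 1);
glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo[frame % 2]);
glGetTexImage(GL_TEXTURE_2D, 2, GL_RGBA, GL_UNSIGNED_BYTE, (void*)0); // writes into the bound PBO

// ...then map the PBO the *previous* frame filled, which should be done by now
glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo[(frame + 1) % 2]);
void* pixels = glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY);
if (pixels)
{
  // use the pixel data here
  glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
}
glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);

The key point is that the map happens a frame (or more) after the glGetTexImage, so the transfer can overlap with your rendering instead of stalling it.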
