eglClientWaitSync waits for a long time inconsistently

debinair · September 28, 2016, 2:34pm

I have 3 buffers that I render to, and for each frame I call eglClientWaitSync() to wait for completion of the buffer from two frames back (the previous-to-previous buffer). However, the call blocks for around 30 milliseconds in roughly 1 out of 10 frames; it does not happen on every frame.
How do I debug or solve this issue?
I can see that my application is vertex-bound, as there are around 121 objects in the scene, each with 1.5k vertices, and the “vertex tiling compute” component in the profiler is high. But if that were the cause, the wait should happen on every frame.
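
For readers following along, the pattern described above (3 buffers, waiting on the fence from two frames back) can be sketched roughly as below. This is only an illustration of the bookkeeping, not the poster's actual code: `FenceRing`, `wait_for_older`, and `submitted` are hypothetical names, and the wait callback is assumed to wrap a real call such as eglClientWaitSyncKHR on a fence created with eglCreateSyncKHR. The fence type is a template parameter so the sketch compiles without EGL headers.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical helper: one fence slot per buffer in a ring of N buffers.
// Fence stands in for whatever handle the sync API returns (e.g. EGLSyncKHR).
template <typename Fence, std::size_t N>
class FenceRing {
public:
    // wait: assumed to block until the given fence has signaled
    // (in a real app it might wrap eglClientWaitSyncKHR).
    explicit FenceRing(std::function<void(Fence)> wait) : wait_(std::move(wait)) {}

    // Call before rendering frame `frame`. Waits on the fence that was
    // recorded `lag` frames earlier, if one exists. Returns true if it waited.
    bool wait_for_older(std::size_t frame, std::size_t lag) {
        assert(lag >= 1 && lag <= N);
        if (frame < lag) return false;          // pipeline is still filling
        const std::size_t slot = (frame - lag) % N;
        if (!valid_[slot]) return false;        // nothing recorded for that slot
        wait_(fences_[slot]);
        valid_[slot] = false;
        return true;
    }

    // Call after submitting frame `frame`, passing the fence created for it.
    void submitted(std::size_t frame, Fence fence) {
        fences_[frame % N] = fence;
        valid_[frame % N] = true;
    }

private:
    std::function<void(Fence)> wait_;
    std::array<Fence, N> fences_{};
    std::array<bool, N> valid_{};   // value-initialized to false
};
```

With N = 3 and a lag of 2, frames 0 and 1 never wait, frame 2 waits on frame 0's fence, frame 3 on frame 1's, and so on. Timing the callback per frame is one way to see which frames stall.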

Dark_Photon · September 28, 2016, 7:06pm

I’m not sure you’ve provided enough info to point to exactly what’s wrong. Here are some related questions that come to mind:

Are you rendering to an EGL window surface supporting VSync and calling SwapBuffers?
Are you running with VSync enabled, and if so, what’s your VSync rate and SwapInterval (60Hz and 1?)?
Are your app, driver, and GPU set to run full-out for your application, rendering a frame every VSync period?
On frames where you don’t block in WaitSync for ~30ms, how much time do you block for?
Are you running double- or triple-buffered?
Do you ever use more than 1 buffer within a single frame’s command stream?

In the meantime, here are a few things to try:

Pull up your app’s timing in a Mali profiling tool (it sounds like you’re using a Mali GPU), and look at how the CPU/app, vertex, and fragment work lines up with VSync periods. Is your vertex or fragment work ever “blowing a frame” (exceeding a VSync period)? If so, I would dig into that.

Try simplifying your app/CPU, vertex, and fragment load so that the GPU can easily finish a frame within the minimum number of VSync intervals, enable VSync, and with that simplified workload, see whether you ever wait in WaitSync in your draw thread (after the first 2-3 frames while you fill the pipeline). Do you? If so, how often and for how long?

Enable VSync, and bump your SwapInterval up (e.g. 1 -> 2 or 3). What do you see now? Does that keep you from blocking on WaitSync?
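
As a rough aid for the "blowing a frame" check above, the arithmetic can be sketched as follows. At 60 Hz one VSync period is about 16.7 ms, so a ~30 ms wait spans close to two periods; that may hint at a missed VSync somewhere in the pipeline, but it does not prove it. The function names here are hypothetical and not part of EGL or any Mali tool.

```cpp
#include <cmath>

// Length of one refresh period in milliseconds for a given refresh rate.
inline double vsync_period_ms(double refresh_hz) {
    return 1000.0 / refresh_hz;
}

// Number of whole refresh periods needed to cover `work_ms` (at least 1).
// With VSync on, work that overruns one period lands on a later one.
inline int periods_needed(double work_ms, double refresh_hz) {
    const int n = static_cast<int>(std::ceil(work_ms / vsync_period_ms(refresh_hz)));
    return n < 1 ? 1 : n;
}
```

For example, `periods_needed(30.0, 60.0)` evaluates to 2, while 10 ms of work at 60 Hz fits in a single period.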

Another thing to keep in mind: normal GL sync objects cause you to wait until all the pipeline work is done, not just the vertex work. On tile-based GPUs, the fragment pipe runs a frame or two late and the sync object waits for that, so that’s a lot of latency! A mid-frame GL sync wait can also trigger a full fragment flush mid-frame, which is bad for performance on tile-based GPUs and can cause some rendering artifacts (so hopefully you’re not doing that). You have to be tricky to sync on just the vertex work completing, if that’s all you really need to wait for. But if that is the case and you manage it, it will reduce your buffer-complete latency and thus the number of buffers you need to have in flight to keep the pipeline full. It can also avoid the perf and artifact issues associated with a mid-frame flush.

debinair · September 29, 2016, 12:06pm

There are two threads, A and B, with different contexts. B takes a texture from A. They are using an EGL window surface. And yes, it is a Mali device, but I am having some issues setting up Streamline with driver.so, so I am not able to see per-call profiling info. I did get timing info using systrace, which shows me the duration of the sync wait.

I have another thread with a separate context, which takes one of the 3 buffers to show on the screen and runs with VSync, 60 times per second. It is in front-buffer mode. The problem is in the first thread, which renders the main scene and calls the sync wait: that is where the time is spent, while the other thread runs fine. So VSync is not the issue here.

Now, A has a chain of 3 buffers with a single context, so I guess all of its commands are submitted to one command buffer, which is separate from B’s command buffer. The question is: why does this wait occur so infrequently? For example, frames 1 and 2 run fine but frame 3 does not, or the behavior is otherwise random.

Another question: my A thread creates the context and calls eglMakeCurrent and eglClientWaitSync, so eglClientWaitSync() applies to the particular command buffer corresponding to A’s context, right?

Dark_Photon · September 29, 2016, 4:45pm

Sounds like you have quite a few balls in the air with this one. You might gradually simplify your scenario and see when the wait goes away. One theory for why it might come up is that your signaling thread doesn’t run in a consistent time for some reason (e.g. loss of thread CPU, competition for resources including the GPU, etc.). You might put wall clock timers on that thread and see if you can verify/refute that.

Another question: my A thread creates the context and calls eglMakeCurrent and eglClientWaitSync, so eglClientWaitSync() applies to the particular command buffer corresponding to A’s context, right?

Do you mean that CPU thread A, being the one that called ClientWaitSync, should be the thread that’s blocked waiting on the signal (if not signaled yet)? Yes, that’s my understanding.