Swapbuffers priority

macarter
04-14-2006, 08:52 AM
We have a multi-threaded Windows XP application with the drawing thread running at a real-time priority level. With Nvidia drivers, the buffer swap does not occur until a CPU is available that is not running a real-time thread. This is not true with ATI drivers. It therefore appears that the Nvidia buffer swap function does not run at or above the priority level of the calling thread, but at some reduced priority level. I have tested driver versions 77.77, 81.98, and 84.21, with and without vsync enabled, on a GeForce 7800 GTX, with no difference in behavior. Can anyone confirm our observations? Is there a way to boost the buffer swap priority? Do the Nvidia Linux drivers also have this characteristic?

ZbuffeR
04-14-2006, 09:15 AM
:eek: To me, realtime priority should be reserved for processes that have finite processing needs, not something with a render loop.

Do you really need realtime priority? Is very high not enough?

Under Linux, the window manager has an even lower priority:
"You can lock yourself out of the system by placing a cpu-heavy process in a realtime priority."
... and only root can do that.

Anyway, I am not from Nvidia, so I won't be able to help more...

ccbrianf
04-17-2006, 03:05 PM
Originally posted by ZbuffeR:
:eek: To me, realtime priority should be reserved for processes that have finite processing needs, not something with a render loop.

Do you really need realtime priority? Is very high not enough?

Yes! Trust macarter, he knows what he is doing. In our business, a missed frame means a lost customer.

Chuck0
04-17-2006, 03:26 PM
Originally posted by ccbrianf:

Originally posted by ZbuffeR:
:eek: To me, realtime priority should be reserved for processes that have finite processing needs, not something with a render loop.

Do you really need realtime priority? Is very high not enough?

Yes! Trust macarter, he knows what he is doing. In our business, a missed frame means a lost customer.

Well, then you must have lost all your Nvidia customers, since they lose quite a few frames.
I mean, setting a thread to realtime on a system that is only running your application is quite an overkill...
What do you gain by doing this, especially if you are working on multi-core/multi-processor systems? If the operating system dares to schedule another system process instead of yours, this will hardly cost you any frames.

Overmind
04-17-2006, 03:37 PM
In our business, a missed frame means a lost customer.

And what does this have to do with setting the rendering thread to realtime priority? Just setting the thread to the highest priority available is not going to help with application performance; it is most likely counterproductive (as you already noticed...).

def
04-18-2006, 07:17 AM
You would have to somehow change the priority of the driver thread...
But I have been working in the broadcast video business for a while, and there was never a need or practical reason to run at anything but standard priority (using dual-CPU machines with WinXP).

ccbrianf
04-18-2006, 08:41 AM
Originally posted by Chuck0:

Originally posted by ccbrianf:

Originally posted by ZbuffeR:
:eek: To me, realtime priority should be reserved for processes that have finite processing needs, not something with a render loop.

Do you really need realtime priority? Is very high not enough?

Yes! Trust macarter, he knows what he is doing. In our business, a missed frame means a lost customer.

I mean, setting a thread to realtime on a system that is only running your application is quite an overkill...

A system is never ONLY running your application. There are LOTS of processes running in an otherwise idle system. All of them can and do interfere with an application running without realtime priority.

What do you gain by doing this, especially if you are working on multi-core/multi-processor systems?

An application can have more than just a rendering thread! And, in order for your theory to work, you must have one core per possibly concurrently running thread. My idle XP system has 20+ processes "running". Not all of them need to run at the same time, but that's still a lot of cores to assure there is no competition with a rendering application.

If the operating system dares to schedule another system process instead of yours, this will hardly cost you any frames.

I disagree. It will cost many over the course of an application run.

ccbrianf
04-18-2006, 08:43 AM
Originally posted by Overmind:

In our business, a missed frame means a lost customer.

And what does this have to do with setting the rendering thread to realtime priority? Just setting the thread to the highest priority available is not going to help with application performance; it is most likely counterproductive (as you already noticed...).

The point is not to increase rendering application performance (obviously priority can't make code more efficient), but to keep other applications from interfering with the rendering application.

ccbrianf
04-18-2006, 08:45 AM
Originally posted by def:
You would have to somehow change the priority of the driver thread...
But I have been working in the broadcast video business for a while, and there was never a need or practical reason to run at anything but standard priority (using dual-CPU machines with WinXP).

Are you rendering smooth motion video without stepping using this technique?

Overmind
04-18-2006, 09:31 AM
Originally posted by ccbrianf:
The point is not to increase rendering application performance (obviously priority can't make code more efficient), but to keep other applications from interfering with the rendering application.

Of course, I was talking about realtime performance, not rendering performance. Sorry if I wasn't clear about that.

Under normal circumstances, the highest non-realtime priority is enough for what you want. By normal circumstances I mean that no application is running that abuses thread priority. And if you want soft realtime performance, you had better make sure no such application is running on the same machine; otherwise you won't have a chance with realtime threads either.

Threads with realtime priority have only a very limited special use. It is definitely not meant for a rendering loop, or any other permanently running loop, but only for short tasks with low latency. Switching the rendering task to realtime definitely falls into the category "fine if it works, but expect it to break with the next driver release, service pack, ...".

The problem with realtime priority is that you're giving your application a priority that's not only higher than the other applications, but also higher than most operating system services. And this can lead to all sorts of bad effects if you don't voluntarily limit your own CPU use. The swapbuffers deadlock you experienced is a typical example of this kind of problem.

ccbrianf
04-18-2006, 10:18 AM
Originally posted by Overmind:
Under normal circumstances, the highest non-realtime priority is enough for what you want. By normal circumstances I mean that no application is running that abuses thread priority. And if you want soft realtime performance, you had better make sure no such application is running on the same machine; otherwise you won't have a chance with realtime threads either.

Agreed.

Threads with realtime priority have only a very limited special use. It is definitely not meant for a rendering loop, or any other permanently running loop, but only for short tasks with low latency.

When a rendering loop is scheduled and coded properly, it is a short task that requires low latency ;-), once per frame.

Switching the rendering task to realtime definitely falls into the category "fine if it works, but expect it to break with the next driver release, service pack, ...".

Yes. Soft realtime programmers are all too familiar with a (Solaris, etc.) kernel patch, driver update, etc. breaking realtime behavior. That still doesn't mean the expectation and usage are incorrect. It just means it is not a quality testing priority.

The problem with realtime priority is that you're giving your application a priority that's not only higher than the other applications, but also higher than most operating system services. And this can lead to all sorts of bad effects if you don't voluntarily limit your own CPU use.

Yep. And the application has to be designed with this in mind.

The swapbuffers deadlock you experienced is a typical example of this kind of problem.

Unfortunately so :-(. One would still expect that swapbuffers should accommodate at least a high priority timeshare application. It appears not to do even that correctly. The swap appears to happen at HIGH_PRIORITY_CLASS, THREAD_PRIORITY_NORMAL.
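
For reference, the priority pair named above can be placed on the Windows base-priority scale. The sketch below is a plain C lookup that mirrors the class/level table in Microsoft's scheduling documentation; it calls no Windows API, and every enum and function name in it is made up for illustration. Under that table, HIGH_PRIORITY_CLASS with THREAD_PRIORITY_NORMAL works out to base priority 13, which sits below every realtime level (16 to 31).

```c
#include <assert.h>

/* Hypothetical names: these enums only mirror the Windows constants by name. */
enum prio_class { CLASS_IDLE, CLASS_BELOW_NORMAL, CLASS_NORMAL,
                  CLASS_ABOVE_NORMAL, CLASS_HIGH, CLASS_REALTIME };

enum thread_level { LEVEL_IDLE, LEVEL_LOWEST, LEVEL_BELOW_NORMAL, LEVEL_NORMAL,
                    LEVEL_ABOVE_NORMAL, LEVEL_HIGHEST, LEVEL_TIME_CRITICAL };

/* Base priority (1..31) for a priority class and a thread priority level. */
int base_priority(enum prio_class cls, enum thread_level lvl)
{
    /* Base priority of a THREAD_PRIORITY_NORMAL thread in each class. */
    static const int normal_base[] = { 4, 6, 8, 10, 13, 24 };

    assert(cls >= CLASS_IDLE && cls <= CLASS_REALTIME);
    assert(lvl >= LEVEL_IDLE && lvl <= LEVEL_TIME_CRITICAL);

    /* IDLE and TIME_CRITICAL saturate instead of being offsets. */
    if (lvl == LEVEL_IDLE)
        return cls == CLASS_REALTIME ? 16 : 1;
    if (lvl == LEVEL_TIME_CRITICAL)
        return cls == CLASS_REALTIME ? 31 : 15;

    /* LOWEST..HIGHEST are offsets of -2..+2 around the class base. */
    return normal_base[cls] + ((int)lvl - (int)LEVEL_NORMAL);
}

/* Nonzero if a runnable thread at thread_base is scheduled ahead of a helper
   at helper_base (strictly higher base priority). */
int outranks(int thread_base, int helper_base)
{
    return thread_base > helper_base;
}
```

With these numbers, any CPU-bound thread in the realtime class outranks a helper at 13, which would be consistent with the behavior described in this thread, assuming the helper really does run at that level.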

def
04-19-2006, 02:23 AM
Originally posted by ccbrianf:
Are you rendering smooth motion video without stepping using this technique?

Yes, I do, but to be honest, there are no other applications running and we disable quite a few unneeded services in Windows to make it as sleek as possible.

Overmind
04-19-2006, 04:09 AM
When a rendering loop is scheduled and coded properly, it is a short task that requires low latency ;-), once per frame.

Then code the rendering loop such that it renders the frame, then sends a signal to a lower priority thread that does the swapbuffer, and waits until it gets the "I'm done" signal back.

If you include the swapbuffers in the realtime rendering loop, it may never yield the CPU. I guess the driver has some work to do on swapbuffers that should not interrupt realtime applications (in the general case), while swapbuffers does some form of busy wait to reduce latency (which is required for vsync). With the communication to a low priority thread, you're basically explicitly telling the driver "you may interrupt me now".
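
The hand-off described above could be sketched roughly as follows. This is a minimal illustration using POSIX threads as a portable stand-in for the Windows primitives; the struct, the function names, and the counter that replaces the real swap call are all hypothetical. It does not set any thread priorities, and it does not address whether a given driver allows the swap to be issued from a second thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical state shared by the render thread and the swap thread. */
struct handoff {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    int frame_ready;   /* set by the render thread when a frame is drawn */
    int swap_done;     /* set by the swap thread: the "I'm done" signal  */
    int swaps;         /* counts placeholder swaps, for illustration     */
};

/* Body of the lower-priority thread: wait for one frame, swap, report back. */
void *swap_thread(void *arg)
{
    struct handoff *h = arg;

    pthread_mutex_lock(&h->lock);
    while (!h->frame_ready)
        pthread_cond_wait(&h->changed, &h->lock);
    h->frame_ready = 0;
    h->swaps++;                      /* placeholder for the real swap call */
    h->swap_done = 1;
    pthread_cond_broadcast(&h->changed);
    pthread_mutex_unlock(&h->lock);
    return NULL;
}

/* Called by the render thread after drawing. It blocks (rather than spins)
   until the swap thread reports completion, so the CPU is free meanwhile. */
void present_frame(struct handoff *h)
{
    pthread_mutex_lock(&h->lock);
    h->swap_done = 0;
    h->frame_ready = 1;
    pthread_cond_broadcast(&h->changed);
    while (!h->swap_done)
        pthread_cond_wait(&h->changed, &h->lock);
    pthread_mutex_unlock(&h->lock);
}

/* Runs one frame through the hand-off; returns the number of swaps done,
   or -1 if the helper thread could not be created. */
int run_one_frame(void)
{
    struct handoff h;
    pthread_t helper;
    int swaps;

    pthread_mutex_init(&h.lock, NULL);
    pthread_cond_init(&h.changed, NULL);
    h.frame_ready = 0;
    h.swap_done = 0;
    h.swaps = 0;

    if (pthread_create(&helper, NULL, swap_thread, &h) != 0)
        return -1;
    present_frame(&h);
    pthread_join(helper, NULL);

    swaps = h.swaps;
    pthread_cond_destroy(&h.changed);
    pthread_mutex_destroy(&h.lock);
    return swaps;
}
```

The point of the blocking wait in present_frame is that the high-priority thread gives up the CPU explicitly, which is the "you may interrupt me now" message described above.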


Unfortunately so :-(. One would still expect that swapbuffers should accommodate at least a high priority timeshare application.

I'd expect swapbuffers to have a high priority, but not necessarily the highest. No device driver should have realtime priority by default; this priority is reserved for applications that need to have a priority that is higher than anything else.

ccbrianf
04-19-2006, 09:25 AM
Originally posted by Overmind:
Then code the rendering loop such that it renders the frame and then sends a signal to a lower priority thread that does the swapbuffer and let it wait until it gets the "I'm done" signal back. If you include the swapbuffers in the realtime rendering loop, it may never yield the CPU.

It does wait. That's not the problem.

There are other realtime threads in the application with lower realtime priorities than the rendering loop. Those other threads are interfering with the high priority realtime rendering thread's swapbuffers.

I guess the driver has some work to do on swapbuffers that should not interrupt realtime applications (in the general case),

That may be the case, but I think that may be giving too much credit to the design. It's quite possible that this situation was just not considered. We'd like it to at least be configurable, or to run at the highest thread priority that calls swapbuffers.

while swapbuffers does some form of busy wait to reduce latency (which is required for vsync).

Not in our application. To reduce latency, we don't let frames pile up enough to spin in swapbuffers. Our frame scheduler is fairly sophisticated.


Unfortunately so :-(. One would still expect that swapbuffers should accommodate at least a high priority timeshare application.

I'd expect swapbuffers to have a high priority, but not necessarily the highest. No device driver should have realtime priority by default; this priority is reserved for applications that need to have a priority that is higher than anything else.

Agreed, but see above. It should have the highest priority of the calling threads it is servicing.

ccbrianf
04-19-2006, 09:26 AM
Originally posted by def:

Originally posted by ccbrianf:
Are you rendering smooth motion video without stepping using this technique?

Yes, I do, but to be honest, there are no other applications running and we disable quite a few unneeded services in Windows to make it as sleek as possible.

You wouldn't care to share a pointer to this information (tweaks, disabled services, etc.), would you?

def
04-19-2006, 10:37 AM
Originally posted by ccbrianf:
You wouldn't care to share a pointer to this information (tweaks, disabled services, etc.), would you?

Nothing special really, just going through all services and deciding what is actually needed. We have a memory footprint of about 95 MB for Windows XP; without networking we would be even lower.
But I am not saying we had problems without disabling... ;)

ccbrianf
04-19-2006, 01:09 PM
Originally posted by def:

Originally posted by ccbrianf:
You wouldn't care to share a pointer to this information (tweaks, disabled services, etc.), would you?

Nothing special really, just going through all services and deciding what is actually needed. We have a memory footprint of about 95 MB for Windows XP; without networking we would be even lower.
But I am not saying we had problems without disabling... ;)

We're somewhat Windows illiterate here :-( (darn UNIX realtime programmers), so I was hoping for an easy out. Oh, well... Thanks.

knackered
04-19-2006, 01:54 PM
I've done realtime video playing too and didn't need to touch the thread priority. If you're missing frames, I would suggest your sequencing is wrong. (By missing frames I mean not rendering a frame at consistent monitor refreshes.)
Maybe you're doing something like this:-

void renderLoop()
{
    if (timeToRenderFrame)
    {
        uploadTexture(); /* the upload sits between the frame deadline and the swap */
        swap();
    }
}

...when really you should be doing this:-

void renderLoop()
{
    if (timeToRenderFrame)
    {
        swap();          /* swap right at the frame deadline */
        uploadTexture(); /* then upload the data for the next frame */
    }
}

Don't take this too literally, it's only an example of what I mean by a sequencing mistake.

evanGLizr
04-19-2006, 02:39 PM
Well, frankly speaking, if you are using REALTIME_PRIORITY, you get what you deserve:


When manipulating priorities, be very careful to ensure that a high-priority thread does not consume all of the available CPU time. A thread with a base priority level above 11 interferes with the normal operation of the operating system. Using REALTIME_PRIORITY_CLASS may cause disk caches to not flush, hang the mouse, and so on.
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/setthreadpriority.asp

Windows is not a realtime system; if this is the only problem you've found, consider yourself lucky and expect many more due to priority inversion.

knackered
04-19-2006, 03:40 PM
he obviously hasn't run it on a single processor machine.

yooyo
04-19-2006, 04:48 PM
Can you explain where the sync signal comes from? Is it external sync, vsync, or sync from your app?
The point is... sync drives swap. If you use vsync, SwapBuffers would do the job.

Now... the problem can be the source. Is it the same frequency as the output? If it is, then start a decoding thread and use Windows sync objects (events, critical sections, and semaphores) to sync the decoding and rendering threads.

"Detach" rendering from any Windows message processing. You can create a window, get its DC, start a new rendering thread, create a GL context, and run the render loop.

Example:

void renderloop()
{
    if (wait_for_decoder(20)) // wait at most 20 ms for a new frame
    {
        // a new frame arrived; upload its data
        upload_new_frame();
    }
    render();
    swap();
}

The decoder thread should raise an event when a new frame arrives.
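
A timed wait of the kind wait_for_decoder(20) implies might look like the sketch below. It swaps in a POSIX condition variable for the Windows event object mentioned above so that it stays portable; the struct, signal_new_frame, and the extra event parameter on wait_for_decoder are assumptions made for illustration, not part of the original example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical auto-reset "new frame" event shared by decoder and renderer. */
struct frame_event {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int frame_ready;
};

/* Decoder side: mark a frame as available and wake the renderer. */
void signal_new_frame(struct frame_event *ev)
{
    pthread_mutex_lock(&ev->lock);
    ev->frame_ready = 1;
    pthread_cond_signal(&ev->cond);
    pthread_mutex_unlock(&ev->lock);
}

/* Renderer side: returns 1 if a frame arrived within timeout_ms, else 0. */
int wait_for_decoder(struct frame_event *ev, long timeout_ms)
{
    struct timespec deadline;
    int got_frame;

    assert(timeout_ms >= 0);

    /* pthread_cond_timedwait takes an absolute CLOCK_REALTIME deadline. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&ev->lock);
    while (!ev->frame_ready) {
        /* Any nonzero result (timeout or error) ends the wait. */
        if (pthread_cond_timedwait(&ev->cond, &ev->lock, &deadline) != 0)
            break;
    }
    got_frame = ev->frame_ready;
    ev->frame_ready = 0;             /* consume the event (auto-reset) */
    pthread_mutex_unlock(&ev->lock);
    return got_frame;
}
```

A frame_event can be set up with PTHREAD_MUTEX_INITIALIZER and PTHREAD_COND_INITIALIZER. Because the renderer blocks instead of polling, it leaves the CPU to other threads while no frame is pending.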

ccbrianf
04-21-2006, 11:45 AM
Originally posted by knackered:
I've done realtime video playing too and didn't need to touch the thread priority. If you're missing frames, I would suggest your sequencing is wrong. (By missing frames I mean not rendering a frame at consistent monitor refreshes.)

We've checked this MANY times and it works correctly on other platforms, so I'm sure this is not the case.

ccbrianf
04-21-2006, 11:49 AM
Originally posted by evanGLizr:
Well, frankly speaking, if you are using REALTIME_PRIORITY, you get what you deserve:


When manipulating priorities, be very careful to ensure that a high-priority thread does not consume all of the available CPU time. A thread with a base priority level above 11 interferes with the normal operation of the operating system. Using REALTIME_PRIORITY_CLASS may cause disk caches to not flush, hang the mouse, and so on.
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/setthreadpriority.asp

Windows is not a realtime system; if this is the only problem you've found, consider yourself lucky and expect many more due to priority inversion.

Believe me, we know all the above and do expect these issues. It is unfortunate that everyone resorts to this line of response when highly skilled professional realtime programmers are the ones asking the questions. We have many years of experience in realtime graphics systems.

ccbrianf
04-21-2006, 11:50 AM
Originally posted by knackered:
he obviously hasn't run it on a single processor machine.

Who he? Yes, we have run it on a single processor machine with and without hyperthreading. This is how we confirmed the problem. And, using ATI, there is no problem on said machines.

ccbrianf
04-21-2006, 12:00 PM
Originally posted by yooyo:
Can you explain where sync signal comes? Is it external sync or vsync or sync from your app?
vsync.

The point is... sync drives swap. In case you do vsync SwapBuffer would do the job.

You haven't read the thread closely enough. vsync drives swap if swapbuffers has completed.

The problem is that a high priority realtime thread calling swapbuffers on nVidia will block if a lower priority thread is still running. Any thread above the priority stated in my previous post will cause swapbuffers to wait for it to complete. This is wrong.

The high priority realtime thread calling swapbuffers should not have to wait on a lower priority thread to actually do the work. Or, it should at least inherit the priority of the thread requesting the work.
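
The inheritance rule asked for here can be stated compactly. The sketch below is only a toy model of that rule, not a description of how any driver or Windows actually behaves; the struct and function names are hypothetical, and the priorities are plain integers where larger means more urgent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a helper thread that does work on behalf of callers. */
struct helper {
    int own_priority;   /* priority the helper runs at when nobody is waiting */
};

/* Priority the helper would run at under the inheritance rule: the maximum
   of its own priority and the priorities of all callers waiting on it. */
int effective_priority(const struct helper *h,
                       const int *waiting_callers, size_t n)
{
    int prio;
    size_t i;

    assert(h != NULL);
    assert(n == 0 || waiting_callers != NULL);

    prio = h->own_priority;
    for (i = 0; i < n; i++) {
        if (waiting_callers[i] > prio)
            prio = waiting_callers[i];
    }
    return prio;
}
```

For example, a helper with its own priority of 13 and waiting callers at 24 and 26 would run at 26 under this rule, so a thread at 25 could no longer hold it off. With no callers waiting, it falls back to 13.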

Now.. problem can be a source. Is it same freq as output?

Yes.

If it is, then start a decoding thread and use Windows sync objects (events, critical sections, and semaphores) to sync the decoding and rendering threads.

This is realtime rendering, not video playback.

"Detach" rendering from any Windows message processing. You can create a window, get its DC, start a new rendering thread, create a GL context, and run the render loop.

Already all done long ago.

evanGLizr
04-21-2006, 12:50 PM
Originally posted by ccbrianf:

Originally posted by evanGLizr:
Windows is not a realtime system; if this is the only problem you've found, consider yourself lucky and expect many more due to priority inversion.

Believe me, we know all the above and do expect these issues. It is unfortunate that everyone resorts to this line of response when highly skilled professional realtime programmers are the ones asking the questions. We have many years of experience in realtime embedded proprietary IG systems.

It's also unfortunate that you keep being in denial when it's a highly skilled professional OpenGL driver developer with many years of experience answering.

Like I said, Windows is not an "embedded realtime system", so your experience is of no help here. RTOSes have mechanisms to deal with priority inversion; Windows doesn't. RTOSes have mechanisms to deal with hard and soft deadlines; Windows has nothing like that. And I should know, I wrote an RTOS kernel myself for x86 with immediate ceiling priorities and the works.

ccbrianf
04-21-2006, 06:05 PM
Originally posted by evanGLizr:

Originally posted by ccbrianf:

Originally posted by evanGLizr:
Windows is not a realtime system; if this is the only problem you've found, consider yourself lucky and expect many more due to priority inversion.

Believe me, we know all the above and do expect these issues. It is unfortunate that everyone resorts to this line of response when highly skilled professional realtime programmers are the ones asking the questions. We have many years of experience in realtime graphics systems.

It's also unfortunate that you keep being in denial when it's a highly skilled professional OpenGL driver developer with many years of experience answering.

I'm sorry if I offended you. That was not my intention. It's also hard to know anyone's background on these forums.

It just seems the typical response to any realtime question, Windows or otherwise, is "Don't do that, you don't know what you're doing." We in fact do know what we're doing in realtime programming, so I was trying to discourage that. We are still learning the poor soft realtime nature of Windows, but have fought through these soft realtime issues on Solaris before.

Like I said, Windows is not an "embedded realtime system", so your experience is of no help here.

We well know that Windows is not an embedded realtime system. Unfortunately, we're still figuring out exactly what that means. The same warnings you gave also apply to Solaris, but it does not have these issues.

RTOSes have mechanisms to deal with priority inversion; Windows doesn't. RTOSes have mechanisms to deal with hard and soft deadlines; Windows has nothing like that.

Again, well known.

And I should know, I wrote an RTOS kernel myself for x86 with immediate ceiling priorities and the works.

Us too.

After testing networking, which our application also needs soft realtime response for, and finding that the stack seems to run at the same priority as the swapbuffers helper, we have decided to reprioritize our application into the HIGH_PRIORITY class and carefully set thread priorities to avoid interfering with swapbuffers and the network stack.

Thanks for the response.

knackered
04-22-2006, 05:04 AM
so after insulting everyone who's tried to help you, you just back down without admitting you're wrong or apologising for your attitude.
Don't hurry back with any further problems.

ccbrianf
04-22-2006, 09:59 AM
Originally posted by knackered:
so after insulting everyone who's tried to help you, you just back down without admitting you're wrong or apologising for your attitude.
Don't hurry back with any further problems.

Um... I thought I made it pretty clear in my last post that my intention was never to insult anyone. It was simply to emphasize that, as a professional realtime programmer, I was well aware of the traditional warnings, priority inversion, and Windows' soft realtime nature. It was only an attempt to direct the conversation toward constructive comments about specific realtime swapbuffers behavior on nVidia rather than generic "don't do realtime because you don't know what you're doing" style responses.

If I need to apologize personally to anyone, I will gladly do it. I thought that stating my intention would clear up the misunderstanding without need for that. I guess I was wrong. So, if anyone feels the need for a personal apology, please speak up.

I don't believe I was wrong about how it "should" work in an ideal realtime world. However, I know that Windows is far from such. Conceding that we decided to do what the evidence suggests is not admitting anyone is right or wrong, but that we have to live with what is provided. I do not believe anyone gave us the specific information necessary to make this determination via this thread.

Again, I'm sorry you were offended. I try to keep my discussions technical in nature without any personal connotations. It is not always easy doing that correctly in print.

Please accept my apology, knackered, since I see you were one of the people offended.

ZbuffeR
04-22-2006, 11:23 AM
Conceding that we decided to do what the evidence suggests is not admitting anyone is right or wrong, but that we have to live with what is provided. I do not believe anyone gave us the specific information necessary to make this determination via this thread.

I gave the exact solution you are using in the first answer. If you don't want to listen, at least please don't say nobody helped ...

knackered
04-22-2006, 01:58 PM
Originally posted by ccbrianf:
Please accept my apology, knackered, since I see you were one of the people offended.

Yes, I was offended, but seeing as though you've been gracious enough to admit you behaved like an arse, I accept your apology on behalf of the rest of the GL posse.
Also, who the hell is macarter? I imagine him as being some poor junior programmer you ordered to find the answer to your problem, before elbowing him out of the way when the choice to run your app at realtime priority on a notoriously non-realtime OS was ridiculed on a public forum.
I could be wrong though.

edit: removed corporate thingy, just a joke, not serious, but could feasibly be misunderstood.

ccbrianf
04-22-2006, 04:05 PM
Originally posted by ZbuffeR:
I gave the exact solution you are using in the first answer. If you don't want to listen, at least please don't say nobody helped ...

Once again I have been misconstrued. I never said no one helped. Many people gave us useful information. But, I believe your exact solution of:


Originally posted by ZbuffeR:
To me, realtime priority should be reserved for processes that have finite processing needs, not something with a render loop.

Do you really need realtime priority? Is very high not enough?

lacked the concrete details necessary for us to craft an exact solution or understand that you were making more than a somewhat educated guess.

If you had stated:

"Many system processes and drivers that launch helper threads run at HIGH_PRIORITY_CLASS, THREAD_PRIORITY_NORMAL, so you should be careful what threads run above this. I don't know about nVidia swapbuffers specifically, but this is the general rule of thumb."

that would have given us enough for an exact solution.

Our application is multithreaded and requires basic thread prioritization. It is important for us to know the exact priority at which swapbuffers and other important system processes run in order to allocate these priorities correctly. Your response may have led us in the right direction, but it was unfortunately not enough for an exact solution. evanGLizr's response was much closer to this.

I/we do appreciate your help, though. Thanks.

ccbrianf
04-22-2006, 04:14 PM
Originally posted by knackered:

Originally posted by ccbrianf:
Please accept my apology, knackered, since I see you were one of the people offended.

Yes, I was offended, but seeing as though you've been gracious enough to admit you behaved like an arse, I accept your apology on behalf of the rest of the GL posse.

I am trying to be very professional about this and would appreciate you not using that kind of language to accept my apology, please.

I could be wrong though.

You are very wrong, and please do not bring any corporate entity into this technical discussion. Thanks.

ccbrianf
04-22-2006, 06:06 PM
I would like to make one last sincere apology to anyone I offended, especially ZbuffeR, since my last response to him may be construed this way as well.

Again, my posts were never meant to be offensive, insulting, condescending, or otherwise. Since I can't seem to properly word my posts to prevent this impression, I'll be voluntarily leaving this community.

Thank you once again for all the constructive advice.

Cheers.

knackered
04-23-2006, 08:12 AM
whatever.