Extensions for controlling SLI/CrossFire

The current generation of drivers handles SLI/CrossFire entirely within the OpenGL driver itself, with the application developer simply treating the two (or more) graphics cards as a single graphics context.

While this is convenient for making lots of applications work with SLI/CrossFire, it certainly isn’t the most efficient way to do things, and it will only scale really well for fill-limited applications.

What I’d like to see is a set of extensions for controlling the compositing back end of the SLI/CrossFire hardware, so that at the application level we could set up two (or more) separate graphics contexts, do all the cull and state sorting individually for each context, and send down to each context only the geometry and state that it requires.

Doing so would reduce the amount of work required by the graphics driver, and the amount of state thrashing that occurs (since less state is being used per context).
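Just to make the idea concrete, here’s roughly the kind of thing I have in mind. This is purely hypothetical - none of these entry points exist, and the names (createContextOnGPU, glCompositeRegionXXX, cullAndDrawForRegion, swapAndCompositeXXX) are made up purely for illustration:

[code]
// Hypothetical sketch only - none of these entry points exist today.
// Two contexts, one per GPU, each rendering half of the final window.
GLContext* ctx[2];
for (int i = 0; i < 2; ++i)
    ctx[i] = createContextOnGPU(i);                 // hypothetical per-GPU context creation

// Hypothetical call telling the SLI/CrossFire compositor which region of the
// final framebuffer each context contributes.
glCompositeRegionXXX(ctx[0], 0, 0,          width, height / 2);
glCompositeRegionXXX(ctx[1], 0, height / 2, width, height / 2);

// Application/scene-graph level work: cull and state sort against each
// context's half of the view, and send each context only what it needs.
for (int i = 0; i < 2; ++i)
{
    makeCurrent(ctx[i]);
    cullAndDrawForRegion(i);
}
swapAndCompositeXXX();                              // hardware merges the two halves
[/code]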

The upside to all of this? Better performance, of course, and an even more compelling reason to have multi-core CPUs and multiple GPUs.

Not all applications will have the ability to do the required multi-threading/multi-context work, but there is now a sizeable number of applications that are scalable in this way - all OpenSceneGraph-based applications, for instance, already have this scalability built in thanks to their multi-threaded/multi-pipe support.

Is there anything in the works in this direction?

NV was working on an extension for their SLI.
I think it is called NV_gpu_affinity (not sure)
My guess is that you can specify on which GPU you want your GL context to run.

What I’d like to see is a set of extensions for controlling the compositing back end of the SLI/CrossFire hardware, so that at the application level we could set up two (or more) separate graphics contexts, do all the cull and state sorting individually for each context, and send down to each context only the geometry and state that it requires.
I don’t understand. State sorting is something you should already do for each context. Culling large amounts of geom is also something you should do.
In principle, the driver should not state sort, or check for redundant calls, or do geometry culling.
If it’s doing that, then it is a high level driver.

Originally posted by V-man:
NV was working on an extension for their SLI.
I think it is called NV_gpu_affinity (not sure)
My guess is that you can specify on which GPU you want your GL context to run.

Thanks for the pointer. Do you have any relevant links?

However, we’d need more than just specifying an affinity; we’d also need to set up the composition of the final image from the separate contexts.
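From what I can gather, usage of the affinity extension would look something like the sketch below. I haven’t seen a spec, so the details are guesses based on the extension name and the usual WGL patterns (I’m assuming the entry points have already been fetched via wglGetProcAddress and that HGPUNV comes from the extension header):

[code]
#include <windows.h>

// Assumed to have been obtained via wglGetProcAddress() beforehand:
//   wglEnumGpusNV, wglCreateAffinityDCNV

HGPUNV gpu;
HGPUNV gpuList[2] = { 0, 0 };            // NULL-terminated list of GPUs for one DC
HGLRC  contexts[8];
UINT   numContexts = 0;

for (UINT i = 0; wglEnumGpusNV(i, &gpu) && numContexts < 8; ++i)
{
    gpuList[0] = gpu;
    HDC affinityDC = wglCreateAffinityDCNV(gpuList);    // DC restricted to this GPU

    PIXELFORMATDESCRIPTOR pfd = { sizeof(pfd) };
    pfd.dwFlags    = PFD_SUPPORT_OPENGL | PFD_DOUBLEBUFFER;
    pfd.iPixelType = PFD_TYPE_RGBA;
    pfd.cColorBits = 32;
    SetPixelFormat(affinityDC, ChoosePixelFormat(affinityDC, &pfd), &pfd);

    contexts[numContexts++] = wglCreateContext(affinityDC);   // context pinned to one GPU
}

// Each context is now tied to a single GPU, but there is still nothing here
// that tells the SLI/CrossFire back end how to composite their output - which
// is the part I'm really after.
[/code]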

[quote]What I’d like to see is a set of extensions for controlling the compositing back end of the SLI/CrossFire hardware, so that at the application level we could set up two (or more) separate graphics contexts, do all the cull and state sorting individually for each context, and send down to each context only the geometry and state that it requires.
I don’t understand. State sorting is something you should already do for each context. Culling large amounts of geom is also something you should do.
In principle, the driver should not state sort, or check for redundant calls, or do geometry culling.
If it’s doing that, then it is a high level driver.
[/QUOTE]The point I was trying to make is that the higher-level API can do culling and state sorting for each segment of the display. If you cull geometry, you also cull the state associated with it, so it’s a double win. This is very much the domain of the high-level API/application rather than of OpenGL.

The OpenGL driver can’t do this culling - it doesn’t know the bounds of the geometry, especially now that we have vertex programs - so it will have to render everything you pass it on both GPUs, and do all the state changes as well, since it won’t know what’s relevant and what’s not. This is all a great expense that you could avoid by doing things at a higher level.
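To illustrate the double win, here’s a minimal sketch; the types and helpers (BoundingSphere, Frustum, StateSet, applyStateLazily) are hypothetical stand-ins for whatever your scene graph provides:

[code]
#include <vector>
#include <GL/gl.h>

struct Drawable
{
    BoundingSphere bound;        // centre + radius in world coordinates
    StateSet*      state;        // textures, programs, modes used by this geometry
    GLuint         displayList;  // or VBO/draw call of your choice
};

void cullAndDraw(const Frustum& frustum, const std::vector<Drawable>& scene)
{
    for (size_t i = 0; i < scene.size(); ++i)
    {
        if (!frustum.contains(scene[i].bound))
            continue;                       // geometry culled -> its state changes skipped too

        applyStateLazily(scene[i].state);   // only issue the GL calls that actually differ
        glCallList(scene[i].displayList);
    }
}

// Run cullAndDraw() once per graphics context, with that context's frustum,
// and only the geometry and state relevant to that GPU ever reaches its driver.
[/code]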

The project I’m involved in, OpenSceneGraph, already does the culling and state sorting, and can do it for multiple graphics contexts and run it all multi-threaded. It uses Producer to configure the windows and the multi-threading; Producer could easily be extended to set up the compositing, and OpenSceneGraph would just leverage this automatically.
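For reference, the standard osgProducer viewer loop looks roughly like this (from memory, so treat the details loosely); the multi-pipe/multi-context layout comes from a Producer camera configuration, so the application code itself doesn’t need to change:

[code]
#include <osgProducer/Viewer>
#include <osgDB/ReadFile>

int main(int argc, char** argv)
{
    osg::ArgumentParser arguments(&argc, argv);

    // The Producer camera configuration (e.g. a .cfg file defining one camera
    // per pipe/context) is picked up from the arguments/environment.
    osgProducer::Viewer viewer(arguments);
    viewer.setUpViewer(osgProducer::Viewer::STANDARD_SETTINGS);

    osg::ref_ptr<osg::Node> model = osgDB::readNodeFiles(arguments);
    viewer.setSceneData(model.get());

    viewer.realize();                 // one graphics context (and thread) per camera

    while (!viewer.done())
    {
        viewer.sync();                // wait for the previous frame's draw threads
        viewer.update();              // update traversal
        viewer.frame();               // cull + draw, per context, in parallel
    }

    viewer.sync();
    return 0;
}
[/code]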

The upside to all of this? Better performance, of course, and an even more compelling reason to have multi-core CPUs and multiple GPUs.
There should never be a reason for people to have multiple graphics cards. People should never be forced to have those kinds of systems to get reasonable game performance.

I don’t ever want to have to put two hot graphics cards into a system in order to get reasonable performance out of it. Nowadays, you can buy two cheaper cards and get performance similar to one expensive one; that’s what SLI is for.

Originally posted by Korval:

I don’t ever want to have to put two hot graphics cards into a system in order to get reasonable performance out of it. Nowadays, you can buy two cheaper cards and get performance similar to one expensive one; that’s what SLI is for.

We all want infinite performance for zero cost. This doesn’t change the fact that if you do have a dual-core, dual-graphics-card system, you want to maximize its performance potential.

If you maximize the performance from it then your two cheap cards will be a more compelling replacement for a single expensive one.

Or if you do have an application that truly needs the performance, such as medical volume visualization, then you might just have a real reason for wanting the scalability that even a single expensive card can’t deliver.

I was able to find the page
http://www.opengl.org/discussion_boards/ubb/ultimatebb.php?ubb=get_topic;f=3;t=014507#000000

I’ve read (a little digression) that NV made a special driver path for Havok FX.
Something similar could be pretty interesting to expose. OK, I really understand that Havok is an established middleware, but come on…

I think back in the early GL2 proposals by 3Dlabs there was something like “task objects”… maybe this could cut it, or maybe I’m just goofing up.

For sure, setting a simple affinity does not seem to cut it for me: it takes less work but also delivers less potential.

I also hate SLI/CrossFire, but some users are getting it, so I should at least think about it. It does have interesting applications in non-gaming scenarios, however.

Korval wrote:
There should never be a reason for people to have multiple graphics cards.

I strongly disagree. There are many reasons for this. Put into the context of what you say next, though…

People should never be forced to have those kinds of systems to get reasonable game performance.
(emphasis added)
… I suspect (hope) the previous sentence was in relation to this? If so, I wholeheartedly agree. I’m very much against anyone (in the role of consumer/user) being forced into anything.

Now, to add something possibly useful to this thread - yes, I do think the vendors could add even more vendor-specific extensions to “their” OpenGL API, for very, very specific purposes, and such things should never show up on the SGI extension registry site (unfortunately it’d have to be in the context of the OpenGL API, which is why that could be hard to enforce). But I’m also a firm believer that such additions shouldn’t harm the millions upon millions of customers not running such setups or requiring this functionality, meaning the vendors would have to create specific drivers and specific install packages for these very special needs.

I’m totally against forcing even a single byte of extra download or driver size down millions upon millions of users’ throats just to provide what less than a millionth of the customers/users need, want, would or even could use.

(yeah, yeah, I know, isn’t it silly to say this when a single compressed driver download from major vendors is now so large that it has surpassed the size of many whole operating system distributions)

But I’m also a firm believer that such additions shouldn’t harm the millions upon millions of customers not running such setups or requiring this functionality, meaning the vendors would have to create specific drivers and specific install packages for these very special needs.
I’m sure the IHVs wouldn’t mind ;-). After all, this SLI stuff is good for business. Why buy 1 card when you can buy 2?

I’m totally against forcing even a single byte of extra download or driver size down millions upon millions of users’ throats just to provide what less than a millionth of the customers/users need, want, would or even could use.
Well, wouldn’t that view discourage a unified driver architecture, like Nvidia’s?

Originally posted by tamlin:

Now, to add something possibly useful to this thread - yes, I do think the vendors could add even more vendor-specific extensions to “their” OpenGL API, for very, very specific purposes, and such things should never show up on the SGI extension registry site (unfortunately it’d have to be in the context of the OpenGL API, which is why that could be hard to enforce). But I’m also a firm believer that such additions shouldn’t harm the millions upon millions of customers not running such setups or requiring this functionality, meaning the vendors would have to create specific drivers and specific install packages for these very special needs.

Having very specific extensions is a good thing. People are demanding SLI setups, and specific people need to access them from GL. For sure, D3D won’t be offering those options.
It’s a niche market that exists. It’s not for everyone because SLI is expensive.

The SGI site lists nearly all extensions ever created. You don’t want it listed there for what reason?

Leghorn: In a way, yes. I could be seen as “dissing” vendors for providing “unified drivers” nowadays - when those downloads have reached sizes surpassing whole operating systems.

Let me put it this way: you have a TNT2, you run Windows 2000, you are on dial-up, and you pay by the second you are connected. Would you prefer a driver for your card and be done with it, or would you prefer to bring down 98% more useless stuff, like drivers in Korean and hacks and workarounds for GLSL, that weighs in at tens of megabytes?

V-man: The reason I don’t want it listed is that I fear this is just a fad blowing over, just as VL (remember VESA Local Bus back in the beginning of the '90s?) cards were. Having such a small thing - both in potential number of users and even in lifetime - added to the SGI registry would, I figured, give it more credibility than it deserves.

On the other hand, you do have a point. Many of the vendor extensions listed aren’t supported today by anyone else, and in some cases not even by their own vendor (even some EXT extensions are in the latter camp, if memory serves me).

Hell, I don’t know. As the potential benefit would be for such a small market, I just didn’t want to see this become another thing where the many are punished for the cravings of a few. If that’s no longer a virtue, so be it.

you are on dial-up and you pay by the second you are connected.
You should get a better ISP. I’ve never had a dial-up ISP that required payment by the second connected.

If you have a TNT2, there is no need to update your drivers every time a new one comes out. They only work on drivers for GeForce FX and above. That is my guess, but I’m pretty sure GF4 and below are dead. Considering many people have a GF2, GF3 or GF4, maybe if a new popular game comes out that supports those dinosaurs, they might patch the driver.

Multi-GPU setups are not for everyone, but I think they will be around for years, until someone comes up with some miracle tech to run these things at 10 GHz while using 10 W.
It is likely that dual-GPU-per-card designs will resurface. Remember that ATI tried this once, and I think 3Dlabs has them too.

Still, I prefer going to the registry rather than searching around. It is where all extensions should be. They should perhaps sort them into those that are in the core, those that can be forgotten, …

Would you prefer a driver for your card and be done with it, or would you prefer to bring down 98% more useless stuff, like drivers in Korean and hacks and workarounds for GLSL, that weighs in at tens of megabytes?
Well, as long as they don’t toss in the lost Brady Bunch episodes, I don’t care. Bandwidth is pretty cheap these days.

Originally posted by V-man:
If you have a TNT2, there is no need to update your drivers every time a new one comes out. They only work on drivers for GeForce FX and above. That is my guess, but I’m pretty sure GF4 and below are dead.
Hm… I had a GF4 MX440 - not even GL_ARB_fragment_program in the Linux driver. Does anybody know if GL_ARB_fragment_program is included in the Windows driver for the GF4?

Originally posted by RigidBody:
Hm… I had a GF4 MX440 - not even GL_ARB_fragment_program in the Linux driver. Does anybody know if GL_ARB_fragment_program is included in the Windows driver for the GF4?
It is not. The GF4 HW does not have sufficient capabilities, especially the MX version.
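Easiest is to just check at runtime rather than guess from the card name. A quick sketch (it needs a current GL context; strictly you should match whole extension tokens rather than do a substring search, but this is good enough for a quick test):

[code]
#include <GL/gl.h>
#include <cstring>

bool hasExtension(const char* name)
{
    // The extension string is only valid once a GL context is current.
    const char* ext = reinterpret_cast<const char*>(glGetString(GL_EXTENSIONS));
    return ext != 0 && std::strstr(ext, name) != 0;
}

// e.g. if (hasExtension("GL_ARB_fragment_program")) { ... }
[/code]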

For those who doubt the value or impact of multiple-GPU machines, a recent poll on modsim.org is interesting: 39% of the systems are multiple-GPU machines.

http://www.modsim.org/modules.php?op=modload&name=Downloads&file=index&req=getit&lid=16

While vis-sim is a relatively small market compared to games, it is a market that really needs scalability in performance. The extensions I’ve requested for controlling the compositing present in the current SLI/CrossFire systems would certainly help achieve this.

The extensions would just so happen to benefit games too :)