GeForce GTX 680 and bindless textures, OpenGL ext?

I am wondering how Nvidia is going about doing this. A UBO? Or some kind of extension that isn’t out yet? I’ve been waiting forever for this tech, which lets you bind as many textures as you need, with no more need for texture atlases, texture arrays, etc…

There is this: Bindless Texture and image API patent. Going to the last two pages of the patent application PDF, one can see the interface. It looks to me like they are using a very similar interface to GL_NV_shader_buffer_load: you make the “texture handle resident” (see MakeTextureHandleResidentNV), and there appears to be a new uniform type whose value is set with the function UniformHandleui64NV. The application also mentions an extension name (which I guess is what the final thing will be called): NV_bindless_texture. There is stuff for both images (i.e. write access too) and textures (read only)…
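
For what it’s worth, piecing the interface together from the patent application (it matches the extension spec that turns up later in this thread), the C side would look roughly like this; the texture, program, and uniform name are placeholders of mine:

/* Sketch only: assumes GL_NV_bindless_texture, an already complete texture
   object `tex`, and a linked program `prog` with a sampler2D uniform named
   "diffuse" (tex/prog/"diffuse" are my placeholders, not from the spec). */
static void sample_without_binding(GLuint tex, GLuint prog)
{
    GLuint64 handle = glGetTextureHandleNV(tex);   /* handle baked from the texture's state  */
    glMakeTextureHandleResidentNV(handle);         /* make it legal to sample via the handle */

    glUseProgram(prog);
    glUniformHandleui64NV(glGetUniformLocation(prog, "diffuse"), handle);

    /* ...draw; no glActiveTexture/glBindTexture anywhere... */

    glMakeTextureHandleNonResidentNV(handle);      /* when you are done with it */
}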

Obviously, NVIDIA is trying to find a way to get people to write as little OpenGL in their OpenGL applications as possible. Currently, you can use NVIDIA-sanctioned APIs for shaders (Cg/assembly), vertex data (bindless attributes), uniform blocks and buffer textures (bindless uniforms). And now texture/image binding.

What does that leave? Blending? The viewport? FBOs (they’ll probably have a way to use “bindless textures” with that)? Can you even call it OpenGL at that point when 90% of your “OpenGL” calls have “NV” at the end?

Obviously, NVidia is trying to innovate to provide the fastest method to use their GPUs possible.

The purists can scoff. Those of us on the practical side shipping products where perf sells will happily just use it when advantageous and available. I’m personally glad they’re applying their expertise here.

Hopefully the ARB is looking carefully at what the vendors come up with for driving GPUs harder/faster and will promote up what is (or could be) broadly supported.

Let’s put it in a different perspective: the GL specification, although worked on a lot and enjoying a revival, has not kept up with the hardware capabilities… and neither has D3D.

AMD has partially resident textures (Cool)
NVIDIA has “no more limits on the number of texture units” (Cool)

and neither D3D nor core OpenGL has anything to expose these features. Let’s face it: D3D and unextended OpenGL represent the lowest common denominator.

FBOs (they’ll probably have a way to use “bindless textures” with that)?

Just to pour gasoline on the fire: GL 4.2, if one wishes to abuse it, already has something to get away from FBOs: the image API (i.e. GL_ARB_shader_image_load_store), and now with NVIDIA’s jazz it looks like the limit on the number of these active at once is essentially bounded only by VRAM… though using images instead of FBOs to render to texture is going to foo-bar performance; it is not meant for that :smiley:
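
For the curious, a rough sketch of that abuse, assuming GL 4.2 and an RGBA8 texture you want to scribble into from a shader (function and variable names are mine):

/* Rough sketch, assuming GL 4.2 (GL_ARB_shader_image_load_store) and an RGBA8
   texture `tex`; image unit 0 is chosen arbitrarily. */
static void bind_for_shader_writes(GLuint tex)
{
    /* Expose level 0 of the texture to shaders as a writable image. */
    glBindImageTexture(0, tex, 0, GL_FALSE, 0, GL_WRITE_ONLY, GL_RGBA8);

    /* The matching GLSL 4.20 declaration and write would be roughly:
         layout(rgba8, binding = 0) writeonly uniform image2D dst;
         imageStore(dst, ivec2(gl_FragCoord.xy), color);
       and the app should call glMemoryBarrier(GL_TEXTURE_FETCH_BARRIER_BIT)
       before sampling what was written. */
}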

I’d doubt, though, that 90% of calls end with NV.

That’s about right. The extension spec was published on the NVIDIA developer site this morning:

http://developer.download.nvidia.com/opengl/specs/GL_NV_bindless_texture.txt

I’ve submitted it for publication in the OpenGL extension registry, but it hasn’t been pushed out yet.

Does it require a Kepler GPU?

Yes; the extension is not supported on Fermi or earlier NVIDIA GPUs.

Why not? If it proves to be useful and becomes adopted by AMD in some form, why not?

OK, then we’ll call it NVGL. :slight_smile:
NV is very permissive with its GL implementation. Nobody prevents anyone from using “standard” OpenGL. But if there is a faster way to do something, and one can afford to buy a particular piece of hardware in order to achieve better performance, then I don’t see any reason why NV extensions shouldn’t be used.

Exactly! Those extensions maybe cannot be used for games and other software aimed at a wide audience at the moment, but for specialized software they are more than welcome. Furthermore, if the audience becomes aware of the benefits and buys more NV cards, it will force competitors (I mean AMD in the first place) to improve their capabilities and drivers.

There are no downsides to API improvement. We all benefit. Even AMD users will benefit from bindless, if AMD accepts the challenge.

Let’s face it: D3D and unextended OpenGL represent the lowest common denominator.

So why is nothing being done to improve the LCD?

Bindless vertex rendering identified a performance issue with regard to caching and such when it comes to vertex setup. That was years ago. The ARB has done nothing to solve this performance issue. Integrating bindless as-is is a bad idea, but they’ve done nothing to even attempt to correct the binding problem.

Why not? If it proves to be useful and becomes adopted by AMD in some form, why not?

Because NVIDIA patented it. Which means AMD would have to license it from NVIDIA.

Somehow, I don’t see that happening.

But if there is a faster way to do something, and one can afford to buy a particular piece of hardware in order to achieve better performance, then I don’t see any reason why NV extensions shouldn’t be used.

It’s called “vendor lock-in.”

NVIDIA needs to retain their hold over the workstation graphics card market, because CPU-built GPUs are currently eating the low-end GPU market. The best way to do that is to change their API from OpenGL to something proprietary. Of course, workstation app creators aren’t going to change APIs wholesale.

But they’d be willing to make a few modifications to the current one.

Hence NVGL. Cg/assembly shaders, fed by bindless vertex attributes, accessing bindless uniforms, and now sampling from bindless textures. You can pick how much NVGL you want, but once you start using it, your codebase is at least partially NVGL. And thus, unless you write a new codepath, it is not compatible with non-NVIDIA hardware.

For “Those of us on the practical side shipping products,” this means that they’ll have to continue spending money on NVIDIA’s products. Rather than letting capitalism and competition work to drive prices down, they’re locked into NVIDIA’s world. They write NVGL applications that only work on NVIDIA’s GPUs, and their customers must therefore continue to buy NVIDIA’s cards. It creates a cycle, a monopoly that’s difficult to break.

That is only a good thing for those who are personally invested in NVIDIA’s world.

The problem I have is that I want this on all graphics cards, and this feature is exactly what I wanted 8 years ago… I just couldn’t understand why the hell I needed to bind a texture to only 8 or 16 units and be limited that way when you have 2 GB of VRAM and should only have to pass a pointer to that memory location… Sigh. Hopefully AMD gets on board one way or another… As for Intel… LOL…

I just couldn’t understand why the hell I needed to bind a texture to only 8 or 16 units and be limited that way when you have 2 GB of VRAM and should only have to pass a pointer to that memory location…

Yeah, those hardware limitations that exist to make graphics cards fast and useful are so terrible.

Wait…

It’s one thing to say that you’ve always wanted it (though to be honest, with today’s 16 textures per stage, I can’t see how this could be particularly limiting. That’s not even the main feature of bindless texturing). It’s quite another to say that you should have had it on hardware that had separate vertex and fragment shaders.

Textures are not a “pointer to that memory location”; there’s a lot more to accessing a texture than just knowing where it is in memory. You have to know its format. You have to cache accesses to it. You have to decompress it, where applicable. You have to filter it. Etc.
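
To put that in concrete terms, here is a purely hypothetical sketch of the kind of descriptor a GPU needs per texture; none of these field names come from any real driver:

/* Hypothetical illustration only; not from any real driver or spec. */
struct texture_descriptor {
    GLuint64 base_address;      /* where the texel data lives            */
    GLenum   internal_format;   /* how to decode it (incl. compression)  */
    GLenum   min_filter, mag_filter;
    GLenum   wrap_s, wrap_t, wrap_r;
    GLint    width, height, levels;
    /* ...swizzle, border color, LOD clamps, anisotropy, and so on...    */
};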

Having limits on the numbers of these things is not unreasonable. We might like it if these limits didn’t exist, but we’d also like it if we could arbitrarily read and write from buffers without worrying about ordering and have the graphics card magically make it all work exactly the way we want.

We get what works, not what we like.

It’s called “vendor lock-in.”

NVIDIA needs to retain their hold over the workstation graphics card market, because CPU-built GPUs are currently eating the low-end GPU market. The best way to do that is to change their API from OpenGL to something proprietary. Of course, workstation app creators aren’t going to change APIs wholesale.

That sounds like a paranoid delusion. Here is why: there is a spec for the extension. Before wading into the patent issue, let’s not forget that bits of GL4 core and GL3 are also subject to patents. Since the extension spec is public, subject to a patent-cross-licensing deal, other vendors could implement it. The real issue is this: can the hardware of other vendors do it? Different hardware works in different ways. We have seen that ATI/AMD exposed (all the way back in the 4xxx series) hardware features that were not in the GL3 spec (anyone remember GL_AMD_vertex_shader_tessellator?).

Bindless vertex rendering identified a performance issue with regard to caching and such when it comes to vertex setup. That was years ago. The ARB has done nothing to solve this performance issue. Integrating bindless as-is is a bad idea, but they’ve done nothing to even attempt to correct the binding problem.

Ahh… such sweet memories; I remember, quite some time ago, the postings between Alfonse and me about bindless. But let’s get a few things clear. Firstly, “bindless” with respect to buffer objects is actually two bits:

  • GL_NV_shader_buffer_load defined a “GPU address” for the backing store of a buffer object. It allows unrivaled, flexible random read access to buffer objects in shaders: it gave you pointers in shaders whose backing store is a buffer object, with no more limits on reading buffer object data. This to me was the coolest thing, orders of magnitude better than what texture buffer objects allowed. It is still the best thing since sliced bread IMO for buffer read access (and NVIDIA’s GL4-era hardware added write access too).
  • GL_NV_vertex_buffer_unified_memory allowed one to set a vertex attribute source from a “GPU address” of a buffer object (see the sketch after this list).
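
To make the two concrete, a minimal sketch under the obvious assumptions (a buffer `vbo` already filled with tightly packed 3-float positions, `count` vertices, attribute 0; names are mine):

/* Minimal sketch, assuming GL_NV_shader_buffer_load +
   GL_NV_vertex_buffer_unified_memory. */
static void draw_positions_bindlessly(GLuint vbo, GLsizei count)
{
    GLuint64EXT gpu_addr = 0;

    glBindBuffer(GL_ARRAY_BUFFER, vbo);                      /* one last bind to set things up */
    glMakeBufferResidentNV(GL_ARRAY_BUFFER, GL_READ_ONLY);   /* lock the store in place        */
    glGetBufferParameterui64vNV(GL_ARRAY_BUFFER, GL_BUFFER_GPU_ADDRESS_NV, &gpu_addr);

    glEnableClientState(GL_VERTEX_ATTRIB_ARRAY_UNIFIED_NV);  /* attribs now come from GPU addresses */
    glEnableVertexAttribArray(0);
    glVertexAttribFormatNV(0, 3, GL_FLOAT, GL_FALSE, 3 * sizeof(GLfloat));
    glBufferAddressRangeNV(GL_VERTEX_ATTRIB_ARRAY_ADDRESS_NV, 0, gpu_addr,
                           (GLsizeiptr)(count * 3 * sizeof(GLfloat)));

    glDrawArrays(GL_TRIANGLES, 0, count);
}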

The issue of binding is this: GL makes GL names functionally act like pointers to GL objects, except that names are integers, not pointers. That’s one lookup right there to convert from integer to CPU address. Another bit of CPU cache-thrash is then to dereference that pointer to get the numbers to send to the video card… and then there are also the additional checks that things are cool. The concept of a GPU address cuts all of that out.
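
Purely as an illustration of that indirection (this is not real driver code, and the names are made up):

/* Hypothetical sketch of the indirection being described; not actual driver code. */
struct buffer_object { GLuint64 gpu_address; GLsizeiptr size; /* ... */ };

static struct buffer_object *name_table[65536];   /* made-up name -> object map */

static GLuint64 resolve_bound_name(GLuint name)
{
    struct buffer_object *obj = name_table[name]; /* hop 1: integer name -> CPU pointer */
    /* ...plus validation that the name and object state are "cool"...                 */
    return obj->gpu_address;                      /* hop 2: chase the pointer for the
                                                     data actually sent to the card    */
}

The GPU-address path simply skips both hops: the application hands the driver the address directly.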

The issue I see with bindless (both texture and buffer object) is this: do other GPU designs work enough in the same way to do it? That is both the question and the answer.

Moving on, the whole point of GL extensions was so that a piece of hardware can expose more capability than the lowest common denominator offers. And just to make sure the tin-foil hats come off: the people who wrote that extension are also involved in making the GL spec!

But really, saying it is for vendor-lock-in is a tinfoil hat joke.

I’d doubt, though, that 90% of calls end with NV.

Well, take any OpenGL application. What are the most common calls within a frame of execution? Not what you find in your code, but what actually gets called per-frame.

glVertexAttribPointer (and its ilk). glDraw*. glBindBuffer. glUseProgram. glBindTexture. glUniform.

Those are the big ones, the most frequently used OpenGL functions in most real applications. The only ones that NVIDIA doesn’t have alternate forms of are the glDraw* family.

So answer me this: is it healthy for an API when a fair number of people write applications, apps that sell, in which the majority of those common API calls use proprietary versions instead? I’m not talking about using actual hardware features. This is basic, fundamental stuff: “I’ve got some vertices to render; do that. I’ve got a shader here; use that.” Etc.

The last time this sort of thing happened was back when you had NVIDIA’s NV_vertex_array_range, NV_vertex_shader, NV_texture_shader, NV_register_combiners, etc., all up against ATI’s EXT_vertex_shader, ATI_fragment_shader, ATI_vertex_array_object, and so forth.

That was not a good time for OpenGL as an API. It was the time when some of the most vehement OpenGL supporters fled the API in terror and embraced D3D (though admittedly for different reasons, namely because there was no core equivalent to those extensions). OpenGL never really recovered from this period.

The “purists” among us are purists because we have a vested interest in OpenGL being platform-agnostic. It’s one thing to have some extensions that offer up different hardware functionality. It’s quite another when you can walk through page after page of ostensibly OpenGL code and see very few actual OpenGL calls. This ability is potentially dangerous to the long-term stability of the API.

That sounds like a paranoid delusion. Here is why: there is a spec for the extension. Before wading into the patent issue, let’s not forget that bits of GL4 core and GL3 are also subject to patents. Since the extension spec is public, subject to a patent-cross-licensing deal, other vendors could implement it.

Yes, some of GL4 and GL3 are subject to patents. The difference is that ARB members allow people to implement OpenGL without having to make patent-cross-licensing deals for patents owned by ARB members. That’s part of what the ARB/Khronos does: it allows more people to implement their specifications. The significant GL3/4 patent issues are about patents held by those outside the ARB; they can’t do anything about those.

That isn’t the case with this bindless texture stuff. NVIDIA has a patent on it, and they’re an ARB member. And unless they choose to allow the ARB to put it into OpenGL, there’s nothing anyone can do without licensing it directly. And why would NVIDIA license the tech to anyone? The only people who would want to license it are their competitors. Is it “paranoid” to say that a company isn’t going to actively aid its competitors?

You’re not going to see bindless texturing in OpenGL core. You’re not going to see bindless texturing implemented by others on their own.

As for lock-in, just look at Dark Photon. For him and his work, by all appearances, OpenGL is exactly what NVIDIA says it is. Which is fine for his needs; this isn’t an attack on him or his work. But this means that AMD simply cannot break into that market. If AMD were to magically have hardware 2x as good as NVIDIA’s, with drivers twice as stable, at half the price, they still couldn’t get in. Because they can’t implement the NVIDIA extensions.

That’s vendor lock-in. The software is written for NVIDIA’s hardware, so the people who buy that software cannot buy anything but NVIDIA’s hardware. It creates a closed loop that makes it more difficult for others to break into the market.

NVIDIA isn’t making these extensions out of the goodness of their heart. They make them because it makes software writers want to use it. Which in turn encourages users of that software to buy NVIDIA (assuming that there’s a non-NVIDIA fallback rendering path). Which gets NVIDIA more market share. Which makes software writers more likely to use NVIDIA extensions. And thus the cycle continues.

That’s pretty much the definition of “vendor lock-in”. You know, the whole “Embrace, Extend, Extinguish” strategy (though perhaps without the third part). Is it “paranoid” to call something exactly what it is?

You want to split hairs about whether an NV extension goes unimplemented elsewhere because of hardware reasons (lack of hardware), patent reasons, driver reasons, or whatever. That’s irrelevant; what matters is the effect of having so many calls in an application end in “NV”.

Moving on, the whole point of GL extensions was so that a piece of hardware can expose more capability than the lowest common denominator offers.

That’s hardly “the whole point”; there are many reasons to have extensions. For example, core extensions allow the ARB to avoid releasing GL 3.4, 3.5, etc., while still letting users gain access to things like separate_shader_objects and shading_language_420pack. Extensions allow the ARB to experiment with APIs before enshrining them in core, as was done with FBOs. And back when the ARB only released a new GL version once in a blue moon (and almost never with new, unproven features), they allowed people to use hardware functionality that was widely supported but not yet available in core.

The only extensions where “the whole point” is to access a particular piece of hardware are proprietary extensions.

Alfonse, your discussion is brilliant. But you have overlooked a simple fact: competition in the market is a very important driving force for progress.

Having a patent does not necessarily mean preventing others from using it. Clipmaps are patent-protected, but that doesn’t mean anyone implementing them in their own application has to pay for it. By patenting, the company ensures that nobody else can do the same and then impose restrictions on, or forbid, the actual implementation. I’m not sure what the restrictions on using bindless textures are, but we will see soon.

Not true. If that happened, we would flip. We do not require NVidia extensions, but we use them when available (a non-NV path is necessary just to support older cards anyway; we’re still supporting GeForce 7s).

That said, they have to compete with the performance we get on NV with bindless/etc., so unless they support it (or something better), they’re handicapped by that.

If AMD were to magically have hardware 2x as good as NVIDIA’s, with drivers twice as stable, at half the price, they still couldn’t get in

Just to point out: subject to AMD hardware being able to do it, there is no issue with the original bindless extensions (GL_NV_shader_buffer_load and GL_NV_vertex_buffer_unified_memory) appearing on AMD hardware… As for bindless_texture, considering that only NVIDIA’s newest cards support it, it’ll be a while…

Along similar lines, AMD could also implement CUDA: large portions of it are open-sourced, and the file specifications are public…

OK, back to topic:

What I do not understand is why they introduce a uint64 as the opaque texture handle. Why not introduce something along the lines of:

typedef struct __GLtexhandle *GLtexhandle;

This would be a perfect opaque handle on any platform. Sure, in the shader code you get into trouble when passing uint32/uint64 values into a shader and then casting them to a sampler (maybe this is not a good idea to start with). I think that if something like this is not handled right from the start, somewhere along the line we will get into similar trouble as we currently have with the unsigned int object handles.
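
For reference, as far as I can tell from the published spec, the simple case keeps the GLSL side opaque: the 64-bit value only exists on the application side (fed in via glUniformHandleui64NV), so the shader looks like any other. A rough sketch; the uniform name is my own:

/* Sketch of the shader side: the sampler declaration stays opaque; the app
   feeds it a 64-bit handle with glUniformHandleui64NV. "diffuse" is my name. */
static const char *frag_src =
    "#version 400\n"
    "#extension GL_NV_bindless_texture : require\n"
    "uniform sampler2D diffuse;   // looks like any other sampler\n"
    "in vec2 uv;\n"
    "out vec4 color;\n"
    "void main() { color = texture(diffuse, uv); }\n";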

This would be a perfect opaque handle on any platform.

And thus you’ve answered your own question :wink:

OK the technical explanation is… I don’t know.

When I started writing this post, I thought the technical explanation was simply that it looked more like the other bindless buffer APIs. Except that bindless texturing works differently. With the bindless buffers, you only get a “handle” after making the buffer resident. With bindless textures, you get the handle first. Indeed, the handle seems to be little more than an API mechanism to say, “I promise not to touch this texture/sampler’s state anymore.”

Indeed, having now read the specification, this (or something like it) looks surprisingly like something the ARB could actually do, unlike the lower-level bindless buffer stuff (though the patent issues make it rather more difficult, unless NVIDIA is willing to give implementers of OpenGL a free license). Especially considering the fact that the ARB implemented texture_storage with its immutable requirements. The handle stuff is pretty much an extended form of immutability.

The ARB version certainly should use a pointer rather than an integer. The ARB version would also be more likely to attach textures to the context rather than the program. I much prefer them being attached to the context; it involves less binding if you’re using the same textures across several programs. And the ARB version is far more likely to have an API to “unhandle” a texture, so that you can essentially unlock it from being state-immutable.

Indeed, the idea of state-immutability is something that could be extended to other object types like VAOs.

A better question is this:

As far as I can tell from the OpenGL extension viewer’s database, AMD_seamless_cubemap_per_texture is not implemented by any NVIDIA hardware. So seamless cubemapping is not possible when combined with this extension.

That’s rather odd. Understandable, since seamless cubemapping without the AMD extension is global state rather than per-object state. But still odd.
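
For anyone following along, the difference being pointed at is just global versus per-object state; a quick sketch, assuming a cube map texture `tex`:

static void enable_seamless(GLuint tex)
{
    /* Core GL 3.2+: seamless cube map filtering is a single global switch. */
    glEnable(GL_TEXTURE_CUBE_MAP_SEAMLESS);

    /* With AMD_seamless_cubemap_per_texture it becomes per-texture state, the
       kind of thing a texture handle could bake in: */
    glBindTexture(GL_TEXTURE_CUBE_MAP, tex);
    glTexParameteri(GL_TEXTURE_CUBE_MAP, GL_TEXTURE_CUBE_MAP_SEAMLESS, GL_TRUE);
}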

I did some tests using this functionality, regarding access to a large number of individual textures from a single shader in the context of virtual texturing. I posted some results in the drivers section [1], because I ran into some problems and think it fits better in that sub-forum.

[1] Nvidia: bindless textures - first experiences and bugs - OpenGL - Khronos Forums

Regards
-chris