Rendering without Data: Bugs for AMD and NVIDIA

According to the OpenGL 3.3 core specification, it should be possible to bind a completely empty VAO (with all attributes disabled) and render from it with glDraw*Arrays. The vertex shader will get default values for any numbered inputs, gl_VertexID will be filled in, and so on.

According to the OpenGL 3.3 compatibility specification, it is not possible to do this. Compatibility specifically says that something must be bound to attribute 0 or glVertex, or rendering fails with GL_INVALID_OPERATION.
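
To make the core-profile case concrete, here is a minimal sketch (not part of the original posts) of attributeless rendering as the 3.3 core specification describes it. It assumes a 3.3 core context is current and that prog was built by some hypothetical helper from the vertex shader source below, together with any fragment shader you like:

```c
/* Vertex shader: a full-screen triangle derived purely from gl_VertexID,
   with no vertex attributes at all (compiled and linked into 'prog' elsewhere). */
static const char *vs_src =
    "#version 330 core\n"
    "void main() {\n"
    "    vec2 p = vec2((gl_VertexID << 1) & 2, gl_VertexID & 2);\n"
    "    gl_Position = vec4(p * 2.0 - 1.0, 0.0, 1.0);\n"
    "}\n";

void draw_attributeless(GLuint prog)
{
    GLuint vao;
    glGenVertexArrays(1, &vao);
    glBindVertexArray(vao);            /* completely empty VAO: no arrays enabled */

    glUseProgram(prog);
    glDrawArrays(GL_TRIANGLES, 0, 3);  /* legal in 3.3 core; GL_INVALID_OPERATION
                                          according to 3.3 compatibility */
}
```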

Both AMD and NVIDIA get this wrong, though in different ways.

On AMD, it fails in both core and compatibility. On NVIDIA it succeeds in both core and compatibility.

So, that is 2:0 for NVIDIA! :wink:

I know that it works on NV, and I’m using it extensively. Although a specification prescribes what should be implemented, I like NVIDIA’s pragmatic implementation.

It’s great that you have drawn attention to the fact that AMD doesn’t support something it should. I’d rather change the Compatibility specification than lose functionality.

I’d rather change the Compatibility specification than lose functionality.

And I might agree, except you can’t change the compatibility specification. It’s already written; it’s done, it exists, and it cannot be changed (outside of making higher version numbers).

I’m talking about conformance with what the spec actually says, not what we might want to have.

And NVIDIA allowing it in compatibility is just as bad as AMD disallowing it in core.

Specifications exist for a reason. Being too permissive is no better than being too restrictive. Both of them are non-conforming.

Indeed, I would go so far as to say being too permissive is worse. Being too restrictive means that your code will still work on other implementations, even if it won’t be as optimal. Being too permissive means that code will break.

And then people who are used to that “functionality” will demand that a vendor who is conformant to the spec “fix” their implementation.

Of course, something written and published cannot be changed easily, but, as you’ve said, there will be future releases.

Strongly disagree!
The Compatibility profile should support all previous functionality as well as the current (Core) functionality. Do you agree with that?
So,
Core supports attributeless rendering AND Compatibility must support all functionality AND Core is a subset of that functionality => Compatibility must support attributeless rendering!

Agreed, if portability is your primary goal.

This cannot be emphasised enough. Allowing functionality in compatibility that should not work is bad, bad, bad, and it removes the usefulness of NVIDIA as a platform to develop on.

…or if having a reasonable guarantee that your program stands at least a decent chance of running on anything other than NVIDIA is any kind of a goal for you.

Of course, something written and published cannot be changed easily, but, as you’ve said, there will be future releases.

Not any that support 3.3-class hardware. All future releases will focus on the GL 4.x line of hardware and above.

The OpenGL Specification Version 3.3 will always say what it says now. And therefore, all OpenGL implementations on this hardware should conform to that.

They might make an extension to expose this, but it’s strange to make a compatibility-only extension.

The Compatibility profile should support all previous functionality as well as the current (Core) functionality. Do you agree with that?

You’re trying to have a different conversation. The conversation you want to have is “what should the specification say?” That’s not what I’m talking about. I’m talking about “what does the specification say?” Whether I agree about what the spec ought to say is irrelevant; the spec is what it is and it says what it says.

What matters is that both AMD and NVIDIA are deficient in this regard. One is too restrictive, the other too permissive. Both answers are wrong.

Agreed, if portability is your primary goal.

If you happen to live in that tiny, sequestered bubble called “NVIDIA-OpenGL”, fine. Be happy there. But anyone living in the rest of the world must accept the simple reality that their code will be run on non-NVIDIA hardware.

And that NVIDIA-only world? It’s slowly but surely getting smaller.

Being too restrictive means that code written against the spec (or against a conformant implementation) will break. The situation is pretty much identical.

Being too restrictive means that code written against the spec (or against a conformant implementation) will break. The situation is pretty much identical.

True. But specifications aren’t implementations. You can’t write your code against the specification. You can think you have. You can read your code carefully and believe you have done everything according to the spec. But the only way to know it is to actually run it. And that requires an implementation. This is why conformance tests are so important.

If something that ought to work fails, people can chalk it up to a driver bug. If something works as they expect it to, they generally don’t question it. That’s why permissiveness is more dangerous: it’s easy not to know that you’re using off-spec behavior.

After all, how many people do you think actually know that 3.3 core allows you to render without buffer objects, while 3.3 compatibility does not? This is an esoteric (though useful in some circumstances) use case, one that’s rarely if ever covered in documentation or secondary sources.

People generally do not read specifications. They read secondary documentation (reference manuals, the Redbook, etc) or tertiary materials (online tutorials, someone else’s code). They know what they’ve been exposed to. They know what they’ve been shown. And they know what their implementation tells them works or doesn’t work.

This means that, if you write code on a permissive implementation, it may not work on a conformant one. Whereas, if you write code on a restricted implementation, it will still work on the conformant one.

Both non-conformant implementations are wrong, but only one leads to writing non-portable code.

You know very well that the ARB created a mess with the GL 3.3/4.0 release. All the extensions introduced into the core of GL 4.1 are supported by SM4 hardware. The story that new specs wouldn’t support “older” hardware is pretty frivolous.

Exactly! Why should we stick to something written as if it were Holy Scripture? Even specs can contain errors.

You didn’t disagree with my “logical statement”, so it is true (and it should be, if all the premises are correct; it’s pure logic).

You know very well that the ARB created a mess with the GL 3.3/4.0 release. All the extensions introduced into the core of GL 4.1 are supported by SM4 hardware. The story that new specs wouldn’t support “older” hardware is pretty frivolous.

I don’t understand the problem. Yes, most of the 4.1 set of features are supported by 3.x hardware. But they’re also available as extensions. There’s no need to make a GL 3.4 just to bring a few extensions into core. That’s a waste of ARB time.

Why should we stick to something written as if it were Holy Scripture? Even specs can contain errors.

Because if you don’t stick to what the spec actually says, you have chaos. The purpose of a specification is to provide some reasonable assurance of conformance: that everyone who implements it is implementing the same thing, and that they should all provide exactly and only the described behavior.

The purpose of a specification is not to suggest. It is not to imply. It is to state exactly and only what the legal commands are.

You can want the specification to change. But until it does, non-conformance to it is wrong.

You didn’t disagree with my “logical statement”, so it is true

You fail logic forever. Not disagreeing with a statement does not make it true.

I said that it was irrelevant for this conversation. The veracity of your statement is not the issue at hand.

The key point here is that NVIDIA’s “everything always works” stance isn’t just wrong, it’s dangerous.

This doesn’t just apply to the topic of this thread; it applies everywhere this stance manifests.

It’s dangerous because a developer using NVIDIA hardware has a very real risk of producing a program that won’t work on any other hardware.

It’s dangerous because it perpetuates the situation where different OpenGL implementations behave differently and “To Hell With The Spec”.

It’s dangerous because it is a very real fact that OpenGL driver workarounds in programs will only continue so long as vendors are - and are allowed to be - lax with the spec.

This is damaging to OpenGL’s reputation as a solid API. This is damaging to OpenGL’s reputation as a dependable API. This is - in short - a classic example of the kind of nonsense that drives people to D3D.

It would be great to see future conformance tests be modified to check that things which are supposed to fail actually do fail.
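
Something along these lines, perhaps (a rough sketch, not an actual conformance test; the creation of the compatibility context and the bound program is assumed, and the pass/fail reporting is just printf):

```c
#include <stdio.h>

/* Negative test: in a 3.3 compatibility context, an attributeless draw is
   supposed to fail with GL_INVALID_OPERATION rather than silently "work". */
void test_attributeless_draw_must_fail(void)
{
    GLuint vao;
    glGenVertexArrays(1, &vao);
    glBindVertexArray(vao);                 /* nothing bound to attribute 0 */

    while (glGetError() != GL_NO_ERROR) {}  /* clear any stale errors */

    glDrawArrays(GL_TRIANGLES, 0, 3);

    GLenum err = glGetError();
    printf("attributeless draw (compatibility): %s\n",
           err == GL_INVALID_OPERATION ? "PASS - rejected as specified"
                                       : "FAIL - accepted or wrong error");
}
```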

Yes, I can only agree that NVIDIA’s behavior is indeed dangerous with regard to portability; I myself have run into problems stemming from the very fact that we use NVIDIA hardware as our primary development platform.

However, that is far from the only side of the problem. If something works on NVIDIA as you expected it would, and then it doesn’t work on other hardware, you are far too often left without any diagnostics at all. The standard is full of “undefined behavior” clauses that are surely sensible performance-wise, but for development it’s hell. Nobody would complain if there were a debug context that caught all these kinds of errors, and OpenGL would be a much better platform for it.

What drives people to D3D is poor development support and some poor design decisions.
IMO the standard should include mandatory functionality that eases development and debugging, the stuff that can actually enhance the quality of the API. It would be beneficial not just for developers but also for the vendors. It’s insane that the blame currently often goes to the vendors who implement the standard better, just because from the developer’s perspective the standard-compliant version doesn’t work and they can’t easily find out why.

But there is a very useful debugging mechanism exposed through the debug_output extension. It will become even more useful once vendors put more information into the error messages it returns.
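
For reference, hooking it up looks roughly like this (a sketch; it assumes the context was created with the debug bit, that ARB_debug_output is advertised, and that a loader such as GLEW has resolved the entry points; the exact typedef of the callback’s last parameter varies between header versions, hence the cast):

```c
#include <stdio.h>

#ifndef APIENTRY
#define APIENTRY   /* calling-convention macro; only matters on Windows */
#endif

static void APIENTRY debug_cb(GLenum source, GLenum type, GLuint id,
                              GLenum severity, GLsizei length,
                              const GLchar *message, const void *userParam)
{
    /* How useful this is depends entirely on how much detail the vendor
       puts into 'message'. */
    fprintf(stderr, "GL debug [source=0x%X type=0x%X id=%u severity=0x%X]: %s\n",
            source, type, id, severity, message);
}

void install_debug_output(void)
{
    glEnable(GL_DEBUG_OUTPUT_SYNCHRONOUS_ARB);  /* report at the offending call */
    glDebugMessageCallbackARB((GLDEBUGPROCARB)debug_cb, NULL);
}
```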

I completely agree with the first part of the statement. The development support isn’t just poor; there is almost none. As for the other part, I can only partially agree. Be aware that OpenGL is a cross-platform solution and a lower-level API than D3D. The multi-threading support should be much better (especially because we’ve seen that it is possible with the Quadro drivers).

Generally I agree that standardization is very important, but, as you said, some things are not well stated and vendors have to fix them in their implementations. The question is: do we want more rapid API development or a more stable specification?

We have witnessed severe API changes in the last few years. If we want to follow that trend, we have to accept flexibility in the interpretation of the specification. If something is specified but proves to be inefficient, it should be changed in the next release. That is very painful for developers. That’s why profiles were “invented”. The Core profile should be streamlined, while the Compatibility profile could be more conservative.

What makes you say such a thing?
From a hardware vendor’s POV, D3D is much more low-level, and users get pretty much the same level of abstraction, IMO.

AFAIK the HAL is an essential part of D3D. That implies a much higher level of abstraction.

The key difference is in the driver architecture.

With D3D there are two main components: the runtime, which is provided by Microsoft, and the HAL, which is provided by the vendor. The advantage here is that everyone is on a reasonably consistent runtime, and the programmer only needs to code to the runtime. The vendor also only needs to ensure that their HAL works with the runtime (instead of working with every program individually). So in effect the runtime can act as a layer of sanity checking between your code and the vendor’s code, which is one reason why D3D doesn’t suffer so much from the kind of problem being discussed here. Proper driver certification helps a little too, but vendors can and will try all manner of tricks to get around that. Nobody should forget the GeForce FX benchmarking fiasco.

The D3D “HAL” by the way is not a true hardware abstraction layer; it may have been in the past but these days it’s just a name. The HAL can in fact directly interface with the hardware itself, and does not provide any emulation of unsupported features.

(As an aside, there’s a nice story here about one way Microsoft have dealt with misbehaving drivers in the past.)

The obvious disadvantages are that it’s not extensible and that the extra layer from the runtime adds some overhead (in practice though it’s not even noticeable).

From one perspective this could be viewed as higher level. From another perspective D3D is much lower level in that it exposes the guts of the hardware more (sometimes brutally so), has no software emulation fallbacks, and (in versions 9 and below) requires you to write code yourself for a lot of things that OpenGL handles automatically (I’m thinking in particular of the dreaded D3DERR_DEVICELOST here).

If you write code on a conformant implementation it might fail on a restricted one.

And if, as you say, people don’t read specs they won’t care much which implementation is conformant. It only matters in that future bugfixes will generally gravitate towards conformance.

That’s quite hard, though, due to GL’s extension-without-enable design (outside of GLSL). So you have to adjust the conformance test for every extension.

If something works on NVIDIA as you expected it would, and then it doesn’t work on other hardware, you are far too often left without any diagnostics at all.

Don’t forget: what we’re talking about here is a driver bug (on both sides). The specifications are very clear about what should and should not happen around this functionality. The only thing that is even theoretically wrong with the specification in this circumstance is that the compatibility profile doesn’t let you render with no data. The compatibility profile is very clear about what you need to do to render with it, as is the core profile.

Diagnostics aren’t the problem: not following the specs is the problem.

The standard is full of “undefined behavior” clauses that are surely sensible performance-wise, but for development it’s hell.

But those clauses are part of the standard. While you can accidentally get yourself into undefined behavior, the standard clearly says when you are in undefined behavior. And these “accidental” cases generally revolve around you doing something “shady”: RTT shenanigans and so forth. And because it is shady, you should be checking the spec to make sure it’s legal.

Remember: in most of the cases where you can fall into undefined-land, it is because detecting that you have is not generally possible, at least not without performance problems. Take the case of rendering to the same image you’re reading from. There is no way for OpenGL to detect that you are certainly going to do so. It can tell that you may be doing so. But because of arbitrary fragment shader logic, it is impossible for the implementation to know for certain that you are. The only way to handle this is to declare it undefined.
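
For illustration, the shady setup in question looks something like this (the names fbo, tex, and vertex_count are illustrative): the same texture is both the render target and a bound sampler source, and only the fragment shader’s logic decides whether a texel is actually read back while being written.

```c
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                       GL_TEXTURE_2D, tex, 0);   /* writing to tex... */
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, tex);               /* ...while sampling from tex */
glDrawArrays(GL_TRIANGLES, 0, vertex_count);     /* result is undefined */
```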

Don’t mistake “undefined behavior” for “works on NVIDIA but not AMD.”

IMO the standard should include mandatory functionality that eases development and debugging, the stuff that can actually enhance the quality of the API. It would be beneficial not just for developers but also for the vendors. It’s insane that the blame currently often goes to the vendors who implement the standard better, just because from the developer’s perspective the standard-compliant version doesn’t work and they can’t easily find out why.

Your last sentence in this paragraph has nothing to do with the others. The reason why the last part happens is due to poor spec writing from the ARB. Stuff that “eases the development and debugging” would not change that problem.

Sometimes, specs have issues (sometimes lots of issues coughARB_separate_program_objectscough).

If you write code on a conformant implementation it might fail on a restricted one.

Which is why you develop on the restricted one.

All I’m saying is that it would be helpful if the standard also provided better debugging functionality and error detection. As Aleksandar said, there’s a debug context but we can only wait until it becomes usable in these cases.

It’s all fine that the specs say exactly when you are in undefined land, but once you are experiencing undefined behavior it’s hard to work out what’s going on without detecting the cause. Sure, there are cases where detection is hard if not impossible, but that doesn’t mean we should say it’s not worth it and that everybody should learn the standard to the letter instead. Being a purist and sending developers off to learn the spec in every detail doesn’t help the API win developers over at all. I’m trying to say that the standard should actively support the development process.

Don’t mistake “undefined behavior” for “works on NVIDIA but not AMD.”

I don’t. The problem is this: I, like many, didn’t read the standard in depth first, taking in all its statements and building a complete picture to guide me along the way. Instead I got the general idea and moved on to doing things with it, rather than becoming an OpenGL guru.

Then it turns out that on NVIDIA stuff often works more logically. For example, the locations of vertex attributes are numbered in their order of appearance. The standard says it’s undefined, and so ATI numbers them seemingly at random. A tiny detail, one might say, but IMO the standard should get rid of all unnecessary undefineds, because they make life harder without serving a real purpose. Again, saying that the standard says exactly that and that it’s the developer’s fault is right, fine. But it doesn’t exactly help.

You are right that in the cases where I’m doing something really shady I know it, and I consult the specs thoroughly, and usually the forums too. In those cases detection would be costly (even in debug contexts) or impossible to implement, so not everything can be handled this way to help detect errors and speed up development. But that’s OK; nothing is only black or white.

As Aleksandar said, there’s a debug context but we can only wait until it becomes usable in these cases.

And what if that extension had required more accurate error reporting? What if the ARB_debug_output extension had mandated specific, detailed messages for each kind of possible error? Do you think things would be better now?

Of course not. All that would happen is that nobody would implement it. At least, not yet. Implementing detailed error reporting is a pain, and it takes time.

If the ARB had forced their hand by somehow putting it into 4.1, then they would simply pretend to support it by exposing the entry points but giving generic error messages. You might say, “But that’s against the spec!”, but when has that ever stopped anyone from claiming to support a particular version of GL before?

I, like many, didn’t read the standard in depth first, taking in all its statements and building a complete picture to guide me along the way. Instead I got the general idea and moved on to doing things with it, rather than becoming an OpenGL guru.

I’m having some trouble imagining these circumstances. Could you describe a place in the spec where you could look at the API and get a really wrong idea about what is legal and what is not? And no, the example you gave doesn’t count, because:

For example, the locations of vertex attributes are numbered in their order of appearance. The standard says it’s undefined, and so ATI numbers them seemingly at random. A tiny detail, one might say, but IMO the standard should get rid of all unnecessary undefineds, because they make life harder without serving a real purpose.

OK. So how do you define “order of appearance” in GLSL?

Remember how the GLSL compilation model works. You compile shader strings into shader objects, then link those into a program. So how do you determine the order in which attributes appear if they are defined in different shader objects? Is it the order in which the shader objects are attached to the program? Is that something you really want to enforce?
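
A sketch of what that ambiguity looks like in code (names are illustrative; the glShaderSource/glCompileShader calls are elided): two vertex-stage shader objects, each declaring its own attribute, attached to one program. There is no single source string whose “order of appearance” the linker could follow.

```c
GLuint vsA = glCreateShader(GL_VERTEX_SHADER);   /* declares: in vec3 position; */
GLuint vsB = glCreateShader(GL_VERTEX_SHADER);   /* declares: in vec3 normal;   */
/* ...compile each shader object... */

GLuint prog = glCreateProgram();
glAttachShader(prog, vsB);                       /* attached in the "wrong" order */
glAttachShader(prog, vsA);
glLinkProgram(prog);                             /* locations are assigned here, in an
                                                    implementation-dependent way unless
                                                    you bind them explicitly */
```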

Furthermore, is that even a good idea? Do you really want to allow subtle breakages in code just because you rearranged the order of how you defined a couple of attributes? I’m not sure if I would consider that a good idea to begin with. At least with layout(location), there’s an explicit number in the shader; here, it’s based on something implicit.

Yes, you effectively have the same thing with uniform blocks: ordering in the shader can impact the results. But uniform blocks are at least cordoned off. They’re clearly separate from other global definitions; the ordering happens within a clearly-defined boundary.

Also, the spec doesn’t say that the attribute locations are “undefined.” It says that they are “implementation-dependent.” “Undefined” means you shouldn’t do it; “implementation-dependent” means you should expect it to vary from implementation to implementation.

Every example that uses shaders will either use glBindAttribLocation to set attribute locations before linking, layout(location) to set them in the shader, or glGetAttribLocation to query the location after the fact. None of them rely on NVIDIA’s ordering. So I have no idea how you even discovered NVIDIA’s ordering, let alone came to believe that this was in-spec behavior and relied on it.
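
For completeness, those three in-spec approaches look like this (a sketch with a hypothetical attribute named "position" and a program handle prog):

```c
/* 1. Fix the location before linking: */
glBindAttribLocation(prog, 0, "position");
glLinkProgram(prog);

/* 2. Or fix it in the shader itself (GL 3.3 / ARB_explicit_attrib_location):
      layout(location = 0) in vec3 position;                                  */

/* 3. Or query whatever location the implementation chose, after linking: */
GLint posLoc = glGetAttribLocation(prog, "position");
if (posLoc >= 0) {
    glEnableVertexAttribArray((GLuint)posLoc);
    glVertexAttribPointer((GLuint)posLoc, 3, GL_FLOAT, GL_FALSE, 0, (const void *)0);
}
```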

I’m guessing that NVIDIA puts the attributes in an array somewhere, and simply assigns indices using those array indices. Whereas ATI probably sticks them in a std::map or similar sorted structure, and therefore assigns indices based on some kind of name ordering (not necessarily lexicographical).