
Official feedback on OpenGL 4.3 thread



Khronos_webmaster
08-06-2012, 05:57 AM
August 6th, 2012 – Los Angeles, SIGGRAPH 2012 – The Khronos™ Group today announced the immediate release of the OpenGL® 4.3 specification, bringing the very latest graphics functionality to the most advanced and widely adopted cross-platform 2D and 3D graphics API (application programming interface). OpenGL 4.3 integrates developer feedback and continues the rapid evolution of this royalty-free specification while maintaining full backwards compatibility, enabling applications to incrementally use new features while portably accessing state-of-the-art graphics processing unit (GPU) functionality across diverse operating systems and platforms. The OpenGL 4.3 specification contains new features that extend functionality available to developers and enables increased application performance. The full specification is available for immediate download at http://www.opengl.org/registry.

Twenty years since the release of the original OpenGL 1.0, the new OpenGL 4.3 specification has been defined by the OpenGL ARB (Architecture Review Board) working group at Khronos, and includes the GLSL 4.30 update to the OpenGL Shading Language.

New functionality in the OpenGL 4.3 specification includes:


compute shaders that harness GPU parallelism for advanced computation such as image, volume, and geometry processing within the context of the graphics pipeline;
shader storage buffer objects that enable vertex, tessellation, geometry, fragment and compute shaders to read and write large amounts of data and pass significant data between shader stages;
texture parameter queries to discover actual supported texture parameter limits on the current platform;
high quality ETC2 / EAC texture compression as a standard feature, eliminating the need for a different set of textures for each platform;
debug capability to receive debugging messages during application development;
texture views for interpreting textures in many different ways without duplicating the texture data itself;
indirect multi-draw that enables the GPU to compute and store parameters for multiple draw commands in a buffer object and re-use those parameters with one draw command, particularly efficient for rendering many objects with low triangle counts;
increased memory security that guarantees that an application cannot read or write outside its own buffers into another application’s data;
a multi-application robustness extension that ensures that an application that causes a GPU reset will not affect any other running applications.


Learn more about what is new in OpenGL 4.3 at the BOF at SIGGRAPH on Wednesday, August 8th from 6-7 PM in the JW Marriott Los Angeles at LA Live, Gold Ballroom – Salon 3. Then join us to help celebrate the 20th anniversary of OpenGL on Wednesday, August 8th from 7-10 PM in the JW Marriott Los Angeles at LA Live, Gold Ballroom – Salons 1, 2 & 3.

Complete details on the BOF are here:
http://www.khronos.org/news/events/siggraph-los-angeles-2012#opengl_bof

Complete details on the Party are here:
http://www.khronos.org/news/events/opengl-20th-anniversary-party

kRogue
08-06-2012, 06:50 AM
A big thank you to all the contributors to the GL 4.3 specification. The reorganization of the specification (which is almost a complete rewrite of the text) is wonderful. Comments on the added goodies to follow from me.

Wish I could have made it to SIGGRAPH this year :(.

Groovounet
08-06-2012, 06:55 AM
OpenGL 4.3 review available: http://www.g-truc.net/doc/OpenGL4.3review.pdf :)

Gedolo
08-06-2012, 07:02 AM
The compute shaders with the shader storage buffer objects are very interesting. (Computing data and integrating it into the graphics pipeline, and the other direction as well; doing both in an advanced application could achieve very interesting blends of things.)
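
For instance, here is a minimal sketch of that kind of interop, assuming a compute program has already been compiled and linked (the shader source, buffer size and binding point are made up for illustration):

// Create a shader storage buffer that a compute shader will fill and a later draw can read.
GLuint ssbo;
glGenBuffers(1, &ssbo);
glBindBuffer(GL_SHADER_STORAGE_BUFFER, ssbo);
glBufferData(GL_SHADER_STORAGE_BUFFER, 1024 * sizeof(GLfloat), NULL, GL_DYNAMIC_COPY);
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, ssbo);   // matches "binding = 0" in the shader

// computeProgram is assumed to contain a GLSL 430 compute shader along the lines of:
//   layout(local_size_x = 64) in;
//   layout(std430, binding = 0) buffer Result { float data[]; };
//   void main() { data[gl_GlobalInvocationID.x] = float(gl_GlobalInvocationID.x); }
glUseProgram(computeProgram);
glDispatchCompute(1024 / 64, 1, 1);

// Make the compute shader writes visible to subsequent shader reads of the storage buffer.
glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT);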

High quality ETC2 / EAC texture compression as a standard feature: hopefully this solves texture compression for the foreseeable future, and it becomes core in a future OpenGL version.

Multi-application robustness: important, I can't believe OpenGL only has this now. Hope to see this in OpenGL ES and WebGL too!

Feature requests:

Direct State Access functions: less error prone, and the same work done with less code.

Bindless textures: less error prone, and the abstraction allows things that were not possible before, while the driver can still be optimized for GPU-specific architecture details.

When I debug an application, I want to be able to see whether API calls are deprecated, either in the OpenGL version of the current context or in other versions (let me specify APIs, version ranges, compatibility ranges, and profiles). If I use a 3.3 context, also show me deprecations for later versions. First list the currently used version, then the other ones (from earliest deprecation to most recent). That way a developer can work from the earlier items that should be tackled first to the later items that should be tackled last. Make sure this is not tied to the context the OpenGL application is using, just like the debug output.

Alfonse Reinheart
08-06-2012, 08:22 AM
The Unofficial OpenGL 4.3 Feature Awards!

I hereby hand out the following awards:

We Did What We Said We Were Gonna Award

The OpenGL Graphics System: A Specification (Version 4.3)

Literally as I was downloading the spec, I said out loud something to the effect of, "There's no way the ARB actually did the rewrite they promised." I was wrong.

It's a much better spec. There are still a couple of issues, like why VAOs are still presented after all of the functions that set state into them. But outside of a few other anomalies like that, it is much more readable. Props for having a dedicated error section for each function, with a different background, to make it clear what the possible errors are.

Most Comprehensive Extension Award

ARB_internalformat_query2

This is something OpenGL has needed for ages, and I generally don't like giving props for people finally doing what they were supposed to have done long ago. But this extension provides pretty much every query you could possibly imagine. It's even a little too comprehensive, as you can query aspects of formats that aren't implementation-dependent (color-renderable, for example).

One Little Mistake Award

ARB_vertex_attrib_binding

This was good functionality, until I saw that the stride is part of the buffer binding, not the vertex format. Indeed, that's why it can't use glBindBufferRange: it needs a stride. Why is the stride there and not part of the format? Even NVIDIA's bindless attribute stuff puts the stride in the format.

This sounds like some horrible limitation ported over from D3D's stream nonsense that OpenGL simply doesn't need.
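
For reference, the split being discussed looks roughly like this (the buffer name and stride value are placeholders); note that the stride travels with glBindVertexBuffer rather than with glVertexAttribFormat:

// Format state: attribute 0 is three floats at relative offset 0 within a vertex.
glVertexAttribFormat(0, 3, GL_FLOAT, GL_FALSE, 0);
// Map attribute 0 to vertex buffer binding point 0.
glVertexAttribBinding(0, 0);
glEnableVertexAttribArray(0);

// Binding state: which buffer, the offset into it, and -- the contentious part -- the stride.
glBindVertexBuffer(0, positionBuffer, 0, 6 * sizeof(GLfloat));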

We Declare A Do-Over Award

ARB_compute_shader

The existence of ARB_compute_shader is a tacit admission that OpenCL/OpenGL interop is a failure. After all, the same hardware that runs compute shaders will run OpenCL. So why wouldn't you use OpenCL to do GPGPU work and interop with OpenGL through the interop layer?

3D Labs Is Finally Dead Award

ARB_explicit_uniform_location

The last remnants of the Old Republic have been swept away forever. GLSL, as 3D Labs envisioned it, is now dead and buried in a shallow grave. OpenGL has finally accepted that uniform locations won't be byte offsets and will have to be mapped into a table to reconstruct those byte offsets.

So it's high time the ARB cut their losses with the last of 3D Labs's horrible ideas and provided functionality in a way that coincides better with reality.

We Need More Ways To Do Things Award:

ARB_shader_storage_buffer_object

You know what OpenGL doesn't have enough of? Ways to access buffer objects from shaders. I mean, buffer textures, UBOs, and image buffer textures just weren't enough, right? We totally needed a fourth way to access buffer objects from shaders.

Yeah, I get what they were saying in Issue #1 (where they discussed why they didn't re-purpose UBOs for this). But it doesn't change the fact that there are now 4 separate ways to talk to buffer objects from shaders, and each one has completely different performance characteristics.
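
For anyone counting, the new fourth way looks something like this in GLSL 4.30 (the block name, binding point and contents are illustrative, not from the extension):

#version 430
layout(std430, binding = 0) buffer ParticleBuffer {
    vec4 positions[];   // unsized array: the shader can read and write large amounts of data
};

void main()
{
    // Unlike a UBO, a storage block is writable and can be much larger.
    positions[gl_VertexID].y += 1.0;
    gl_Position = positions[gl_VertexID];
}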

Let's Rewrite Our API And Still Leave The Original Award:

ARB_program_interface_query

I get the idea, I really do. It provides a much more uniform and extensible way to query information from shaders. That's a good thing, and I'm not disputing that. But we already have APIs to do that; indeed, pretty much the only thing missing from those APIs was querying fragment shader outputs.

Now the API is very cluttered. We have the old way of doing things, plus this entirely new way.

You Were Right Award

Mhagain's suggestion for texture_views (http://www.opengl.org/discussion_boards/showthread.php/178473-mutable-texture-formats?p=1240438&viewfull=1#post1240438), which was implemented. I_belev's initial idea was close, but much narrower (changing the format of an existing texture, rather than creating a new texture that references the old data) than the actual texture view functionality provided by 4.3.
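
For the record, what made it in looks roughly like this (texture names are placeholders; origTex is assumed to be an immutable-format GL_RGBA8 texture created with glTexStorage2D, which shares a view class with GL_SRGB8_ALPHA8):

// Reinterpret the same storage as sRGB -- one mip level, one layer -- without copying any texels.
GLuint viewTex;
glGenTextures(1, &viewTex);
glTextureView(viewTex, GL_TEXTURE_2D, origTex, GL_SRGB8_ALPHA8, 0, 1, 0, 1);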

Overall, I'm getting a feeling of deja vu: once again, OpenGL has many ways to do things, and little guidance as to how to do it. We've got multiple ways to set up vertex formats and buffers, multiple ways to read data from buffers in shaders, etc.

Of course, we won't see another API cleanup and function removal round, since the last one went so well.

aqnuep
08-06-2012, 08:37 AM
One Little Mistake Award

ARB_vertex_attrib_binding

This was good functionality, until I saw that the stride is part of the buffer binding, not the vertex format. Indeed, that's why it can't use glBindBufferRange: it needs a stride. Why is the stride there and not part of the format? Even NVIDIA's bindless attribute stuff puts the stride in the format.

This sounds like some horrible limitation ported over from D3D's stream nonsense that OpenGL simply doesn't need.
I couldn't agree more; the stride is just strange there, as it's more a format specifier than a resource specifier. Also, the additional indirection between vertex attribute indices and vertex buffer binding indices sounds like way too much abstraction. But to be honest, personally, I would drop all the vertex array stuff altogether. Using shader storage buffers is now probably the best way to access vertex data, if you can live with the limitation that your shaders have to know your vertex data formats explicitly (actually, performance-wise it's probably preferable to be explicit).


We Declare A Do-Over Award

ARB_compute_shader

The existence of ARB_compute_shader is a tacit admission that OpenCL/OpenGL interop is a failure. After all, the same hardware that runs compute shaders will run OpenCL. So why wouldn't you use OpenCL to do GPGPU work and interop with OpenGL through the interop layer?
Sad, but probably true. I wouldn't have expected it either, but if that's what it takes to have efficient compute-graphics interworking, then so be it.

Alfonse Reinheart
08-06-2012, 08:41 AM
But to be honest, personally, I would drop all the vertex array stuff altogether. Using shader storage buffers is now probably the best way to access vertex data, if you can live with the limitation that your shaders have to know your vertex data formats explicitly (actually, performance-wise it's probably preferable to be explicit).

... how? Attribute format conversion is free; doing it in the shader would not be free. How is "not free" faster than "free"?

Furthermore, attributes are fast, with dedicated caches and hardware designed to make their particular access pattern fast. Shader storage buffers are decidedly not. Specific hardware still beats general-purpose code.

kyle_
08-06-2012, 08:46 AM
Also, the additional indirection between vertex attribute indices and vertex buffer binding indices sounds like way too much abstraction.

Hey, but we have gained some consistency. Now both the inputs (attributes) and the outputs (frag outs) of the pipeline have the indirection ;)

kRogue
08-06-2012, 11:01 AM
Combing the forums, one can see that lots of stuff stated in "Suggestion for next release of GL" found its way into GL 4.3:

Texture views (though I strongly suspect that it was in the draft specs before it appeared as a suggestion)
The decoupling of vertex attribute source and format
explicit uniform location
read stencil values from depth-stencil texture
ability to query in's and outs of shaders and programs
arbitrarily formatted and structured writes to buffer objects from shaders (I freely confess that what I was begging for was NVIDIA's GL_NV_shader_buffer_load/store, but what is delivered in GL 4.3 is still great)


Really happy about the spec rewrite/reorg.

kRogue
08-06-2012, 11:49 AM
Hmmm....



Section 7.2 Shader Binaries:



shaders contains a list of count shader object handles. Each handle refers to a
unique shader type,


Why do the shader types need to be unique? Is this a typo, and should it read that the handles are unique? In contrast, AttachShader in section 7.3 says:



Multiple shader objects of the same type may be attached to a single program
object, and a single shader object may be attached to more than one program object.



Section 10.3.1 (Specifying Arrays for Generic Vertex Attributes) and 10.4 (Vertex Array Objects)
Table 10.2 refers to glVertexAttribPointer calls rather than glVertexAttribFormat calls.
I found that reading the extension GL_ARB_vertex_attrib_binding made it easier to grok the index indirection that VertexAttribBinding specifies. The interaction of VAOs with VertexAttribBinding also takes a moment or so to grok correctly. It is in the table, but perhaps a little more text to hold a reader's hand on it would help.


"Clear Texture"?
There is ClearBufferSubData, why is there not an analogue for textures? Or is it in the specification and I missed it?


Easier if...
Section 7.6.2.2 (Standard Uniform Block Layout) 1st paragraph, last sentence:



"Applications may query the off-sets assigned to uniforms inside uniform blocks with query functions provided by
the GL"


It would be a touch more merciful if a reference to the query function and its parameters were given, indicating how to query a GLSL program for the format of a uniform block... this is nitpicking, but hey, I missed SIGGRAPH this year, so I am grouchy.

Groovounet
08-06-2012, 12:21 PM
... how? Attribute format conversion is free; doing it in the shader would not be free. How is "not free" faster than "free"?

Furthermore, attributes are fast, with dedicated caches and hardware designed to make their particular access pattern fast. Shader storage buffers are decidedly not. Specific hardware still beats general-purpose code.

First, the stride parameter needs to be a binding parameter. It allows reusing the same vertex format with data that is spread across a varying number of buffers. Typically we can expect cases where a vertex format is used with a perfectly packed buffer that interleaves all the data. We can however imagine the same vertex format being reused, without switching, for data stored in two buffers: one static and one dynamically updated.

Second, attribute format conversion has the same cost when done in the shader, because it is effectively already done in the shader behind our backs. GPUs no longer use dedicated hardware for that; it takes space that can't be reused for something else. That will be more and more the case.

aqnuep
08-06-2012, 12:51 PM
First, the stride parameter needs to be a binding parameter. It allows reusing the same vertex format with data that is spread across a varying number of buffers. Typically we can expect cases where a vertex format is used with a perfectly packed buffer that interleaves all the data. We can however imagine the same vertex format being reused, without switching, for data stored in two buffers: one static and one dynamically updated.

Second, attribute format conversion has the same cost when done in the shader, because it is effectively already done in the shader behind our backs. GPUs no longer use dedicated hardware for that; it takes space that can't be reused for something else. That will be more and more the case.
Exactly. Shaders do perform the fetching and attribute format conversion internally, so they have to know the stride in order to generate the appropriate fetch and conversion code. Thus, even though we now have separate format and binding state, the stride is coupled with the binding, so there can still be internal recompiles of the vertex fetching code even when only the binding changes (as we had with the old API), which defeats the purpose.

But, of course, this all depends on hardware and driver implementation.

Groovounet
08-06-2012, 01:33 PM
Could the stride parameter be a variable whose content would be fetched from a register file?

aqnuep
08-06-2012, 03:51 PM
Could the stride parameter be a variable whose content would be fetched from a register file?
Maybe; it all depends on the hardware and driver implementation, though I suppose that not all existing hardware can do it that way, but maybe I'm wrong. However, that means an additional indirection, which might not be good performance-wise for some applications. I still believe that for new applications programmable vertex fetching is the way to go, especially with something like shader storage buffers in place.

However, shader storage buffers have their problems too:
1. They have to be writable, so they cannot be supported on GL3 hardware.
2. GL 4.3 only requires a maximum of 16 MB for storage buffers, which may be too small for some use cases (of course, implementations are free to allow larger buffers, but it still just sounds too small to me).
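
To make the idea concrete, programmable vertex fetching through a storage buffer might look something like the sketch below (the struct layout and binding point are assumptions for illustration, and this relies on the implementation exposing storage blocks in the vertex stage):

#version 430
struct PackedVertex {
    vec4 position;   // the shader has to know the packing explicitly
    vec4 normal;
};
layout(std430, binding = 0) readonly buffer VertexData {
    PackedVertex vertices[];
};

void main()
{
    // No vertex attributes at all: pull the data by gl_VertexID instead.
    gl_Position = vertices[gl_VertexID].position;
}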

Alfonse Reinheart
08-06-2012, 03:54 PM
First, the stride parameter needs to be a binding parameter. It allows reusing the same vertex format with data that is spread across a varying number of buffers. Typically we can expect cases where a vertex format is used with a perfectly packed buffer that interleaves all the data. We can however imagine the same vertex format being reused, without switching, for data stored in two buffers: one static and one dynamically updated.

What good is that? The assumption with this is that the non-stride part of the vertex format is the performance-limiting issue, rather than the buffer binding itself. Bindless suggests quite the opposite: that changing the vertex format is cheap, but binding buffer objects is expensive. At least for NVIDIA hardware.


Second, attribute format conversion has the same cost when done in the shader, because it is effectively already done in the shader behind our backs. GPUs no longer use dedicated hardware for that; it takes space that can't be reused for something else. That will be more and more the case.

Do you have evidence of this? And for what hardware is this true?

aqnuep
08-06-2012, 04:08 PM
Bindless suggests quite the opposite: that changing the vertex format is cheap, but binding buffer objects is expensive. At least for NVIDIA hardware.
What makes you think that? Please give a reference to where you read that, because I would also be interested.

From bindless, I feel that NVIDIA thinks two things to be expensive:
1. Vertex format change
2. Mapping buffer object names to GPU memory addresses

Maybe I'm wrong, so please disprove my assumptions.

Booner
08-06-2012, 04:16 PM
Most Comprehensive Extension Award

ARB_internalformat_query2

This is something OpenGL has needed for ages, and I generally don't like giving props for people finally doing what they were supposed to have done long ago. But this extension provides pretty much every query you could possibly imagine. It's even a little too comprehensive, as you can query aspects of formats that aren't implementation-dependent (color-renderable, for example).



Thanks!
Definitely agree that this was long overdue. As for the 'too comprehensive' aspect -- my desire is that this will become widely supported across GL and GLES versions. In that usage, the properties that aren't implementation-dependent for a specific version of GL do have value.

kyle_
08-06-2012, 04:35 PM
GPUs no longer use dedicated hardware for that

So when can we expect a fully programmable vertex pulling (with the right performance) extension from AMD? :D
I think it's a little bold to assume that about a significant amount of the hardware out there.

I know of at least some recent hardware that can't do that (without a silly amount of shader recompilation before batch submission).

mhagain
08-06-2012, 07:10 PM
What good is that? The assumption with this is that the non-stride part of the vertex format is the performance-limiting issue, rather than the buffer binding itself. Bindless suggests quite the opposite: that changing the vertex format is cheap, but binding buffer objects is expensive. At least for NVIDIA hardware.

In D3D at least, when using IASetVertexBuffers (slot, num, buffers, strides, offsets), changing the strides and/or offsets alone is cheaper than changing everything. Since the buffer part of the specification is not changing, the driver can make use of this knowledge and optimize behind the scenes.

With GL this is now explicit in the new state that has been introduced. VERTEX_BINDING_STRIDE and VERTEX_BINDING_OFFSET are separate states, so they are the only states that need to be changed on a BindVertexBuffer call where everything else remains equal.

Where this functionality is useful is when you may have multiple models packed into a single VBO, or multiple frames for the same model in a single VBO, or for LOD schemes. You can jump to a different model/frame/LOD with a single BindVertexBuffer call, rather than having to specify individual VAOs for each model/frame/LOD, or respecify the full set of VertexAttribPointer calls for each model/frame/LOD (worth noting that stride is not really a big deal here and is unlikely to change in real-world programs; offset is the important one).
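
In code, the idea reads roughly like this (the buffer, offsets and vertex struct are invented for the example):

// All models share one interleaved VBO; the vertex format was set up once with
// glVertexAttribFormat/glVertexAttribBinding.  Switching models is a single call:
glBindVertexBuffer(0, packedVBO, modelA_byteOffset, sizeof(PackedVertex));
// ... draw model A ...
glBindVertexBuffer(0, packedVBO, modelB_byteOffset, sizeof(PackedVertex));
// ... draw model B ...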

The decoupling of buffer from layout introduced here is useful functionality on its own. Your VertexAttribArray (VertexAttribFormat) calls are no longer dependent on what the previous BindBuffer call was, which introduces extra flexibility and reduces the potential for error. Getting rid of such inter-dependencies is also some nice cleaning up of the API, and it's a good thing that GL has finally got this; the only seemingly bad part (and I haven't fully reviewed the spec, so I may have missed something) is that BufferData/BufferSubData/MapBuffer/MapBufferRange haven't been updated to take advantage of the new binding points.

I recommend sitting down and writing some code for non-trivial cases using this API; you should soon see how superior this method is to the old.

Alfonse Reinheart
08-06-2012, 08:21 PM
You can jump to a different model/frame/LOD with a single BindVertexBuffer call, rather than having to specify individual VAOs for each model/frame/LOD, or respecify the full set of VertexAttribPointer calls for each model/frame/LOD (worth noting that stride is not really a big deal here and is unlikely to change in real-world programs; offset is the important one).

Welcome to my entire point: if the stride isn't going to change in any real-world system, why is the stride not part of the vertex format?

There's a reason I gave it the "one little mistake" award: the only thing wrong with the functionality is that the stride is in the wrong place for no real reason. Or at least, the only reason is because "Direct3D does it that way." It doesn't actually make sense; that's just how they do it.


the only seemingly bad part (and I haven't fully reviewed the spec, so I may have missed something) is that BufferData/BufferSubData/MapBuffer/MapBufferRange haven't been updated to take advantage of the new binding points.

There aren't new binding points. glBindVertexBuffer does not bind the buffer to a binding target the way that glBindBufferRange does. That's why it doesn't take a "target" enum. It only binds it to an indexed vertex buffer binding point; it doesn't bind the buffer to a modifiable target.

This was almost certainly done to allow glVertexAttribPointer to be defined entirely in terms of the new API. glVertexAttribPointer doesn't change GL_ARRAY_BUFFER's binding, nor does it change any other previously-visible buffer binding state. Therefore, glBindVertexBuffer doesn't either.

Personally, I don't have a problem with the minor API inconsistency.

mhagain
08-06-2012, 08:38 PM
Welcome to my entire point: if the stride isn't going to change in any real-world system, why is the stride not part of the vertex format?
On the other hand, what harm does it do? I personally can't see any reason to change the stride either, but the functionality is now there and maybe someone will find a use for it? I don't see it as being a "wrong thing", more of an odd but ultimately inconsequential decision. Doing what D3D does can make sense in many cases - it makes it easier to port from D3D to GL, after all. That's gotta be a good thing. But in this case the D3D behaviour is also odd but ultimately inconsequential. It could be worse - just be thankful that it didn't take an array each of buffers/strides/offsets like D3D does - that's painful to use.


There aren't new binding points.
Just using the terminology from http://us.download.nvidia.com/opengl/specs/GL_ARB_vertex_attrib_binding.txt

Alfonse Reinheart
08-06-2012, 09:36 PM
I don't see it as being a "wrong thing", more of an odd but ultimately inconsequential decision.

But it's not inconsequential. It's taking something that is by all rights part of the format and putting it elsewhere. It's not broken as specified, but it's not what it should be.

It's like not being able to specify attribute indices in shaders and many other API issues with OpenGL, past and present. Yes, you can live without it, but it would clearly be better to have it done right.

mhagain
08-07-2012, 12:34 AM
Hypothetical reason why you may wish to change the stride - skipping over vertexes for an LOD scheme.

To be honest I think you're wasting too much negative energy on this. Not having attribute indices in shaders was a colossal pain in the rear-end; this is nowhere even near the same magnitude. If it's a genuine API issue that is going to cause torment to those using it, then by all means yell about it from the rooftops (I'll be right there beside you). This isn't.

aqnuep
08-07-2012, 06:44 AM
Hypothetical reason why you may wish to change the stride - skipping over vertexes for an LOD scheme.
I don't understand what the use case is here. How does stride help you with "skipping over vertices for a LOD scheme"? Also, skipping over vertices should be done by giving a different base index to DrawElements calls, since you probably use indices anyway, and if you use LOD I hardly believe that you want to use the same set of indices. Why would you? That would mean that all of your LOD levels render the same number of vertices, which defeats the purpose.
Also, if you don't want to use indices, you are probably better off sending a different first-vertex parameter to your DrawArrays calls instead of always changing the offset and/or stride of your vertex buffers.

xahir
08-07-2012, 07:59 AM
from spec 2.5.10

Vertex array objects are container objects including references to buffer objects, and are not shared

Even with vertex formats no longer holding buffer object references, I still need to carry vertex format info from my loading thread to the main thread in order to finalize my OpenGL objects.


Of course, we won't see another API cleanup and function removal round, since the last one went so well.

so this just makes me sad...

aqnuep
08-07-2012, 10:50 AM
But it's not inconsequential. It's taking something that is by all rights part of the format and putting it elsewhere. It's not broken as specified, but it's not what it should be.

It's like not being able to specify attribute indices in shaders and many other API issues with OpenGL, past and present. Yes, you can live without it, but it would clearly be better to have it done right.
Agreed; not to mention that the per-attribute relativeoffset parameter is still specified for the vertex attributes themselves, and in practice these relative offsets don't make any sense unless you are also aware of the stride, so again it defeats the purpose.

kRogue
08-07-2012, 12:12 PM
I think the intended use pattern was that the format of the attribute data stays unchanged, but whether, and how, it is interleaved with other attribute data varies... the current interface does effectively have an offset in both glBindVertexBuffer and glVertexAttrib*Format... so the issue is which use case comes up more often:

Keeping the format the same, but varying buffer sources and interleaving

OR

Using the same buffer, but varying interleave and format


What is in the spec makes the first case possible by setting only the buffer sources, whereas what some are wanting is to do the second more often.


It looks to me like the interface is made for a GL implementation that works like this:


The attribute puller has only two things: the location from which to grab data and the stride at which to grab it
The attribute interpreter converts the raw bytes from the puller for the vertex shader to consume


If a GL implementation worked like that, then I can see how a GL implementer would strongly prefer how the interface came out. Though an offset within the format setter kind of invalidates the above idea without more hackery...

Dean Calver
08-08-2012, 01:55 AM
In recent (though I'm not sure about the latest) hardware there is no such thing as a 'vertex puller'; feeding vertex data into a shader consists of two steps: a DMA unit that moves blocks of vertex data into registers or memory closer to the shader, and then the attribute converter/loader that feeds the actual shader. The DMA unit doesn't really care what's in the vertex itself, only about the address and total size of each vertex. Hopefully you can see where the interface in D3D and GL 4.3 comes from: you're effectively programming the two processes separately.
On at least one platform, when programming at a lower-level API, it was possible to leave some vertex DMA streams on even if the data wasn't used, and this could be a serious performance loss. The DMA unit would pay the bandwidth cost of retrieving the data, but then nothing would actually need or use it.
It was a simple pipeline optimisation: because vertex/index data is highly predictable (it's predefined), you can use simple DMA to ensure the data is in the best place beforehand.

However, I don't believe the latest hardware uses this optimisation (I suspect they use the general cache and thread switches to achieve a similar effect), so its usefulness going forward may be doubtful...

mhagain
08-08-2012, 04:32 AM
I don't understand what the use case is here. How does stride help you with "skipping over vertices for a LOD scheme"? Also, skipping over vertices should be done by giving a different base index to DrawElements calls, since you probably use indices anyway, and if you use LOD I hardly believe that you want to use the same set of indices. Why would you? That would mean that all of your LOD levels render the same number of vertices, which defeats the purpose.
Also, if you don't want to use indices, you are probably better off sending a different first-vertex parameter to your DrawArrays calls instead of always changing the offset and/or stride of your vertex buffers.

This assumes that all of your VBO streams are going to be using the same stride or offset, which is not always the case. You may have a different VBO for texcoords than you have for position and normals, and you may only need to change the stride or offset for the position/normals VBO. The old API wouldn't let you do that without respecifying the full set of vertex attrib pointers; the new one lets you do it with a single BindVertexBuffer which - because stride and offset are separate state - can be much more efficient.

I really get the feeling that this is very new territory for many of you. Because you've never had this capability, you don't see the advantages and flexibility of it, and need to have explained in detail what others have been successfully using for well over a decade now. There's an element of "the Americans have need of the telephone, but we do not. We have plenty of messenger boys" in that, and that's why I mentioned actually sitting down and writing some code that uses it earlier on.

The sentiment that "just because it's D3D functionality it doesn't mean that GL has to do it" has a counterpart - just because it's D3D functionality it doesn't mean that GL doesn't have to do it either - because GL is not D3D and can evolve whatever functionality is deemed appropriate; whether or not it's similar is not relevant. Exhibiting opposition to functionality just because it works in a similar manner to D3D is quite preposterous, to be honest.

Eosie
08-08-2012, 05:02 AM
Welcome to my entire point: if the stride isn't going to change in any real-world system, why is the stride not part of the vertex format?

Even though it might not make much sense to you from a theoretical standpoint, the reason the spec's been written like that is that it maps perfectly on the current hardware. There's no other reason. The stride is just part of the vertex buffer binding.

Gedolo
08-08-2012, 05:20 AM
Would like to see more removal of deprecated functions and stuff. Cleaning.
We have Core and Compatibility profiles now. There is no reason not to remove/deprecate/clean old cruft from the core profile.

For people who find that it is more trouble than it's worth:
If it is too much work to replace obsolete functions and rebuild obsolete parts of algorithms, then why the need to adapt the current program to a new OpenGL version at all? New features can also force a rethinking of the architecture.

I was a bit disappointed when I saw the OpenGL ES spec had a section of legacy features, because OpenGL ES 2.0 did not have backwards compatibility; breaking it again would not have surprised many and would even have been expected.
Please do not do backwards compatibility again with OpenGL ES. It is simply not necessary, and avoiding it allows for a lean specification without a lot of old cruft that makes conforming drivers more work to build.

aqnuep
08-08-2012, 12:03 PM
The old API wouldn't let you do that without respecifying the full set of vertex attrib pointers; the new one lets you do it with a single BindVertexBuffer which - because stride and offset are separate state - can be much more efficient.
That's a good point, I can accept that one. Though still not as flexible as programmable vertex fetching.


Even though it might not make much sense to you from a theoretical standpoint, the reason the spec's been written like that is that it maps perfectly on the current hardware. There's no other reason. The stride is just part of the vertex buffer binding.
Maps perfectly to what hardware? It may fit one and not fit another. Lots of extensions map well to some hardware but may be pretty inefficient on others. OpenGL tries to match what hardware does as best as possible; however, as usual, there is no one-size-fits-all design, so just assuming that whatever OpenGL supports maps perfectly to any hardware supporting it is just too naive.

kyle_
08-08-2012, 02:13 PM
Maps perfectly to what hardware?

Probably to Khronos members' hardware; otherwise I guess the feature would have been vetoed from the core spec.

Eosie
08-08-2012, 02:36 PM
Maps perfectly to what hardware? It may fit one and not fit another. Lots of extensions map well to some hardware but may be pretty inefficient on others. OpenGL tries to match what hardware does as best as possible; however, as usual, there is no one-size-fits-all design, so just assuming that whatever OpenGL supports maps perfectly to any hardware supporting it is just too naive.
Of course it doesn't map to all hardware in existence, but it maps exactly to AMD and NVIDIA hardware since GL3-capable chipsets. I don't really care about the rest.

mhagain
08-08-2012, 04:06 PM
There's also the case that it will make it easier to extend the vertex buffer API going forward (based on the assumption that it maps to, and continues to map to, hardware, of course) - which would result in cleaner, more robust drivers with fewer shenanigans going on behind the scenes. Plus it's setting clear precedent for something like a hypothetical GL_ARB_multi_index_buffers in a hypothetical future version (a similar API could be used) - and that's something I don't think even Alfonse could nitpick over. ;)

Alfonse Reinheart
08-08-2012, 04:07 PM
This assumes that all of your VBO streams are going to be using the same stride or offset, which is not always the case. You may have a different VBO for texcoords than you have for position and normals, and you may only need to change the stride or offset for the position/normals VBO. The old API wouldn't let you do that without respecifying the full set of vertex attrib pointers; the new one lets you do it with a single BindVertexBuffer which - because stride and offset are separate state - can be much more efficient.

That doesn't explain what this has to do with LODs.

Furthermore, glDrawElementsBaseVertex already dealt with the "offset" issue quite well; you can just render with different indices, using a base index added to the indices you fetch. No need to make glVertexAttribPointer calls again.
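
For example (the counts, offsets and index type are illustrative):

// Both LOD levels live in the same vertex and index buffers; basevertex shifts which block of
// vertices the stored indices refer to, with no attribute respecification at all.
glDrawElementsBaseVertex(GL_TRIANGLES, lod1IndexCount, GL_UNSIGNED_SHORT,
                         (const GLvoid *)lod1IndexByteOffset, lod1BaseVertex);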

mhagain
08-08-2012, 04:40 PM
That doesn't explain what this has to do with LODs.

Furthermore, glDrawElementsBaseVertex already dealt with the "offset" issue quite well; you can just render with different indices, using a base index added to the indices you fetch. No need to make glVertexAttribPointer calls again.

Mental note to self: don't pull random examples out of ass when the Spanish Inquisition are around. ;)

See my comment on the previous page for one case where a base index is insufficient.

Alfonse Reinheart
08-08-2012, 06:26 PM
New topic: internalformat_query2.

There's one minor issue that's effectively a spec bug.

Internalformat_query2 allows you to query GL_FRAMEBUFFER_RENDERABLE. All well and good. But it doesn't really say what it means for it to have FULL_SUPPORT. It just says:


The support for rendering to the resource via framebuffer attachment is returned in <params>

What I mean is this: GL_FRAMEBUFFER_UNSUPPORTED is allowed by the spec to happen for the use of formats that aren't supported, or for the use of a combination of formats that isn't supported. However, the spec sets aside a number of formats which are not allowed to return GL_FRAMEBUFFER_UNSUPPORTED. You can use any combination of any of these formats and the implementation is required to accept it.

If I test a format that isn't on OpenGL's required list, and it returns FULL_SUPPORT, does that mean that I can never get UNSUPPORTED if I use it? No matter what? The spec doesn't say. The exact behavior is not detailed, only that it is "supported".

I think section 9.4.3 should be amended as follows. It should be:



Implementations must support framebuffer objects with up to MAX_COLOR_ATTACHMENTS color attachments, a depth attachment, and a stencil attachment. Each color attachment may be in any of the required color formats for textures and renderbuffers described in sections 8.5.1 and 9.2.5. The depth attachment may be in any of the required depth or combined depth+stencil formats described in those sections, and the stencil attachment may be in any of the required combined depth+stencil formats. However, when both depth and stencil attachments are present, implementations are only required to support framebuffer objects where both attachments refer to the same image.

Any internal format that offers FULL_SUPPORT from the FRAMEBUFFER_RENDERABLE query can be used in non-layered attachments in any combination with other required formats or formats that offer FULL_SUPPORT. Any internal format that offers FULL_SUPPORT from the FRAMEBUFFER_RENDERABLE_LAYERED query may be used in layered attachments in any combination with other required formats or formats that offer FULL_SUPPORT.


This would give the query some actual teeth, because right now, it's not clear what it means to fully support FRAMEBUFFER_RENDERABLE. Also, it explains what CAVEAT support means: that the format can be used in certain combinations with other formats, but it's not guaranteed to work in combination with all other fully supported formats. NONE obviously means that it can never be renderable no matter what.
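
For context, the query in question is something along these lines (the target and format chosen here are arbitrary):

GLint support = GL_NONE;
glGetInternalformativ(GL_TEXTURE_2D, GL_RGBA16F, GL_FRAMEBUFFER_RENDERABLE, 1, &support);
if (support == GL_FULL_SUPPORT) {
    // ...but the spec doesn't currently say whether this guarantees the attachment can never
    // produce GL_FRAMEBUFFER_UNSUPPORTED in combination with other fully supported formats.
}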


See my comment on the previous page for one case where a base index is insufficient.

I think there's a basic failure to communicate here. I understand why we want to have a separation between buffer objects and vertex formats. I understand how that's useful.

I don't understand why we need to have a separation between strides and vertex formats. That's my only problem with the extension: that the stride is with the buffer and not the format.

Eosie suggests that it's a hardware thing, and I can understand that. However, NVIDIA's bindless graphics API also provides separation between format and buffers (well, GPU addresses, but effectively the same thing: buffer+offset). And yet there, they put the stride in the format.

So why the discrepancy? I imagine that NVIDIA's bindless accurately describes how their hardware works, more or less.

thokra
08-09-2012, 07:01 AM
First of all, thank you ARB. The spec seems much clearer in many respects.


There is ClearBufferSubData, why is there not an analogue for textures? Or is it in the specification and I missed it?

No, as far as I can tell there isn't. I find the whole extension strange and certainly confusing at least to the beginner. Try explaining why to use


glClearBufferData(GL_ARRAY_BUFFER, GL_R32F, GL_RED, GL_FLOAT, 0)

to someone trying to initialize a vertex buffer with zeros instead of using glBufferData with an accordingly sized array of zeros. BTW, I hope I set this call up correctly.

It's good to have a memset() equivalent for convenience and for performance reasons when resetting a buffer during rendering (i.e. not having to transfer a complete set of data instead of a single value), but currently I can't really imagine many convincing example uses that justify introducing 2 (or 4) new APIs. If anyone has some, please share.

Alfonse Reinheart
08-09-2012, 07:05 AM
currently I can't really imagine many convincing example uses that justify introducing 2 (or 4) new APIs.

There needs to be a SubData version for clearing part of a buffer. Sometimes, that's really what you want to do. At the same time, there should be a Data version for clearing the whole thing, without having to query its size.
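
A minimal sketch of that case, with placeholder offsets, assuming a buffer is currently bound to GL_ARRAY_BUFFER: zeroing just one region without re-uploading anything.

// Clear bytes [regionOffset, regionOffset + regionSize) of the bound buffer;
// passing NULL as the data pointer means "fill with zeros".
glClearBufferSubData(GL_ARRAY_BUFFER, GL_R8, regionOffset, regionSize,
                     GL_RED, GL_UNSIGNED_BYTE, NULL);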

Now, why they bothered with the non-DSA versions when several extensions don't provide non-DSA versions... that's a good question.

thokra
08-09-2012, 07:54 AM
kRogue: BTW, the workaround could be to use glClearBuffer() on a framebuffer attachment - it's not pretty, but it does essentially what you ask for. I can't help but feel that naming a function glClearBuffer{Sub}Data in the presence of glClearBuffer wasn't the wisest decision to make. Again, to the knowing developer it's not a problem, but when seeing it for the first time you start thinking.
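
Something along these lines (the FBO and texture names are placeholders; error checking and state restoration omitted):

// Workaround for the missing glClearTexture: attach the texture to a scratch FBO and clear it.
const GLfloat zero[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, scratchFBO);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, tex, 0);
glClearBufferfv(GL_COLOR, 0, zero);   // the glClearBuffer() call referred to above
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, 0);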

Alfonse Reinheart
08-09-2012, 10:57 AM
Yes, probably not the best nomenclature. But what else could you call it? glMemsetBufferData? glClearBufferObjectData?

The ARB was kinda screwed by glClearBuffer; it should have been named something like glClearFramebufferImage or glClearFramebufferAttachment. Something that has "Framebuffer" in it.

kRogue
08-09-2012, 10:59 AM
Attaching the texture to an FBO and doing the clear-buffer call is what I do now. It's just awkward... it also affects GL state and obscures what I am after: memsetting a texture. Clearing a texture before use is really freaking important with respect to GL-accelerated browsers. Oh well. I'll live, for now.

Alfonse Reinheart
08-09-2012, 11:07 AM
It's just awkward... it also affects GL state and obscures what I am after: memsetting a texture.

You assume that the hardware implementation would not have to do the same thing: use the framebuffer clearing mechanism to clear the texture's data. Remember: doing an actual clear like this on texture memory is not exactly a common operation. 99 times out of 100, if you're clearing an image, it's because you're about to render to it. Textures not meant to be render targets are typically uploaded to, not cleared.

kRogue
08-09-2012, 11:29 AM
Um... how to go about this. The information leak is the following for a web browser supporting WebGL:

Create texture
Attach texture to FBO
execute glReadPixels

OR

Create texture, but do not initialize it
use the texture to draw contents
glReadPixels from the framebuffer


Now, if a WebGL implementation does not clear the texture before it is used (as far as the user of WebGL is concerned), then a WebGL application can read image data from discarded memory, in particular previous image data held in a texture, be it from the same browser process or even a completely different program.

This is bad. So a WebGL implementation must add additional code to either track whether the texture has already been cleared, or have all of it set, etc... which is a right pain in the rear when all one wants is the ability to memset the freaking memory when it is allocated.

Alfonse Reinheart
08-09-2012, 11:40 AM
Why exactly is this bad? You may be reading image data from discarded memory. You may not. It's certainly not something reliable, and even if it were, all you get is... a picture.

kRogue
08-09-2012, 11:50 AM
Ahh... a picture... from a secure website... maybe your private chats, maybe online banking... There are already attacks of this form. It is a security risk; it leaks information. The picture can be sent to a remote site for analysis, etc... Blackmail, forgery, etc.

Leaking information, like a freaking screenshot is a security risk.

At any rate, let's get back on target: I'd like a glClearTexture call rather than bind framebuffer, attach texture, call glClear**(). They had the chance with the "immutable texture" thing that was introduced; I just wish those texture-creation functions included a memset value. Whine whine, I'd like some cheese please.

kyle_
08-09-2012, 01:17 PM
Are you doing an implementation of WebGL or something? :)
There certainly is a security vulnerability possible to implement, but I don't think it's that big of a deal (as in, for the driver - I hope it's a big deal for the browsers).

It would probably be best to specify a context flag that would require all 'uninitialized' bits to be set to zero, without any API overhead.

Alfonse Reinheart
08-10-2012, 07:00 AM
Sorry, I seem to have missed an award:

We Can't Be Bothered To Use A Diff File To Fix Our Spec Bugs

This one goes to whoever is responsible for maintaining the .spec files.

Some time ago, I made a .diff file available (https://bitbucket.org/alfonse/gl-xml-specs/downloads) that fixes a large number of .spec bugs. Missing enumerators, wrong enumerators, etc. And yet... those .spec bugs still exist.

It's really not that hard to run `patch` over the .spec files, guys. I did the hard work for you.

Booner
08-11-2012, 09:17 AM
Some time ago, I made a .diff file available (https://bitbucket.org/alfonse/gl-xml-specs/downloads) that fixes a large number of .spec bugs. Missing enumerators, wrong enumerators, etc. And yet... those .spec bugs still exist.

Did you file a bug with your fixes? If not, please file one at https://www.khronos.org/bugzilla/enter_bug.cgi?product=OpenGL

sqrt[-1]
08-11-2012, 06:00 PM
I think it may have been logged as this bug (2011-09-12):
https://www.khronos.org/bugzilla/show_bug.cgi?id=529

Contains a link to the thread that has the fixes.

Marco Di Benedetto
08-11-2012, 07:11 PM
I thank the Khronos Group for the new features.
I also thank the members for the cleaner specs.

I can NOT understand their holding back on direct state access (DSA).
Are they able to understand that bind-to-edit/bind-to-use is a killer for clean, library-based, multi-developer environments?
Surely they are aware of this. And yes, DSA would introduce a lot of new API functions.

In my dreams, I see two options:
1) completely rewrite the GL API (!!!)
2) put into core specs the DSA functionalities

History told me that I have to give up on option 1.
For option 2, given the experience we have had with bad naming conventions, what about defining every DSA function as glDirect<Something>()? It has already been advised, but no one on the upper floors listens to it.

Sorry, but I am very upset about the lack of a proper, well structured DSA.
And I am also worried about all the advertising about GL overtaking DX. GL has caught up with DX, great, kudos to all, really. But WE WANT A GOOD API.

</rant>

m.

thokra
08-12-2012, 01:48 AM
It has already been advised, but no one on the upper floors listens to it.

Actually, GL_ARB_clear_buffer_object defines 2 APIs having the substring "Named" in them, which makes sense as you identify the object being modified by its name (a GLuint) and not by its binding target. IMHO that's ok as far as naming conventions go.

mhagain
08-12-2012, 05:21 AM
The DSA extension as it stands isn't appropriate for promotion to core - for one thing it modifies a LOT of deprecated functionality, and any hypothetical DSA-in-core would very likely not include these parts. Existing programs will be more likely to either retain their old code or do a full port to core, and new programs can start out as core-only, so it would be a lot of spec work for something that wouldn't be used.

The big win from DSA is removal of bind-to-modify but DSA as it is specifies a lot more than just that, and builds on the existing functionality (mostly by just adding an extra param to each call) rather than specifying a real new way. That means that all of the other nastiness in e.g. texture objects remains exposed with DSA just fixing up one part.

There's a clear alternative path available here, which is to unify texture and buffer storage into a single new object type (it's all just video RAM if you think about it, with the only real difference being how it's used, which is a program-specific feature and doesn't seem to justify any major API-separation), and provide DSA-only entry points for that object type. The existing non-DSA API could then be layered on top of that in the driver, in much the same way as the fixed pipeline is layered on top of shaders, and immediate mode is (probably) layered on top of vertex buffers in current implementations. In an ideal world the hardware vendors could even get together and provide a common such layer that they would all ship, but I don't see that being anything other than a slim possibility.

That seems the sensible route, but this is the ARB that we're talking about, so we'll need to wait and see. After recent spec evolutions I've a little more faith than before, and they may yet surprise us in a nice way, but that's just me and it's entirely possible I may be wrong.

(As an aside: it's nice to see the newer functionality taking a DSA-like approach so it's obvious that this is something that the ARB do recognise the value of, meaning that it's most likely not a case of resistance to the idea of DSA but more a case of difficulties in getting a sensible specification together.)

Alfonse Reinheart
08-12-2012, 04:25 PM
As an aside: it's nice to see the newer functionality taking a DSA-like approach so it's obvious that this is something that the ARB do recognise the value of, meaning that it's most likely not a case of resistance to the idea of DSA but more a case of difficulties in getting a sensible specification together.

The problem is that they're very inconsistent about it.

For example, ARB_invalidate_subdata (http://www.opengl.org/registry/specs/ARB/invalidate_subdata.txt) is pure DSA. It doesn't add a GL_INVALIDATE_BUFFER target, nor does it use buffers attached to the context. It simply takes buffer and texture objects directly. ARB_copy_image (http://www.opengl.org/registry/specs/ARB/copy_image.txt) works similarly.

And yet, ARB_framebuffer_no_attachments (http://www.opengl.org/registry/specs/ARB/framebuffer_no_attachments.txt), which adds parameters to framebuffers, works just like the standard OpenGL way. You don't pass an FBO; you have to bind it and modify it. Similarly, ARB_clear_buffer_object (http://www.opengl.org/registry/specs/ARB/clear_buffer_object.txt) isn't DSA; you have to bind it to the context.

Both of them have EXT functions that are DSA, but the core functions are not. So you can invalidate a buffer via DSA, but not clear it.

I would say that the ARB doesn't recognize the value of it; NVIDIA does. Just look at the Contributors section. For ARB_copy_image, you have 10 people; 9 of them are from NVIDIA and one is from Transgaming. For ARB_invalidate_subdata, 2 of the 3 contributors are from NVIDIA. The non-DSA-style extensions are credited as "Members of the Khronos OpenGL ARB TSG" or simply don't have a Contributors section at all.

Looking at the Revision History, the DSA-style extensions seem to have been more or less done internally by NVIDIA, then presented to the ARB for editing and approval. Things like "internal revisions" and "based on NV_copy_image". The others seem to have been formed by the ARB themselves.

Of course, this also explains why ARB_vertex_attrib_binding isn't DSA-style (it doesn't even add DSA EXT functions). Because the objects they would be modifying are VAOs, and NVIDIA doesn't seem to like VAOs or encourage their use. Granted, VAB tends to work against 70% of the whole point of VAOs, but that's another issue.

So the ARB isn't trying to make DSA happen; NVIDIA is. That's why we don't have DSA in core, because only one member of the ARB actually wants it to happen.


The big win from DSA is removal of bind-to-modify

DSA does not, and never did, remove bind-to-modify. It simply provides an alternative. Removing "bind-to-modify" would require removing every function that operates on state that happens to be encapsulated into an object.

mhagain
08-12-2012, 05:21 PM
...on the other hand sampler objects are primarily from AMD and use a DSA-style API: http://www.opengl.org/registry/specs/ARB/sampler_objects.txt

And as for DSA not removing bind-to-modify, check out http://www.opengl.org/registry/specs/EXT/direct_state_access.txt and "void TextureSubImage2DEXT(uint texture, enum target, ..." or "void NamedBufferSubDataEXT(uint buffer, intptr offset, ..." - what's that if not removal of bind-to-modify?

With the main object types that bind-to-modify affects in real-world code being texture objects and buffer objects, the point that a replacement API without bind-to-modify would suit this requirement more than building on top of the existing API by providing variants for every function still stands. Unless we're going to squabble over semantics of "modify" versus "load data", of course... ;)
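
In other words, the difference between the following two approaches (the buffer name, size and data pointer are placeholders):

// Bind-to-modify: the GL_ARRAY_BUFFER binding is disturbed just to upload some data.
glBindBuffer(GL_ARRAY_BUFFER, buf);
glBufferSubData(GL_ARRAY_BUFFER, 0, size, data);

// EXT_direct_state_access: the object is named directly and no binding is touched.
glNamedBufferSubDataEXT(buf, 0, size, data);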

Alfonse Reinheart
08-12-2012, 06:47 PM
...on the other hand sampler objects are primarily from AMD and use a DSA-style API: http://www.opengl.org/registry/specs...er_objects.txt

True. But it's also creating an entirely new object type. Whereas the current brand of extensions are just changing what you can do to them.


And as for DSA not removing bind-to-modify, check out http://www.opengl.org/registry/specs...ate_access.txt and "void TextureSubImage2DEXT(uint texture, enum target, ..." or "void NamedBufferSubDataEXT(uint buffer, intptr offset, ..." - what's that if not removal of bind-to-modify?

It's adding the ability to modify textures without binding them. It doesn't remove the possibility of modifying textures by binding them. Bind to modify is not removed by DSA. An individual application may never use bind-to-modify again. But because it is still allowed by the API, drivers must assume that the user can and will do it.

mhagain
08-13-2012, 02:57 AM
Mental note to self: don't use "removal of {X}" as shorthand for "removal of the absolute need to use {X} in every conceivable situation while still allowing that the ability to use {X} may be retained", because it will be taken blindly literally. Sigh.
In any event that's what the deprecation mechanism is for.

Alfonse Reinheart
08-13-2012, 04:27 AM
Mental note to self: don't use "removal of {X}" as shorthand for "removal of the absolute need to use {X} in every conceivable situation while still allowing that the ability to use {X} may be retained", because it will be taken blindly literally. Sigh.

"every conievable situation" obviously not including, "still retaining backwards compatibility with implementations that don't implement DSA" or "we're not going to rewrite our entire codebase just because someone came out with a new OpenGL version." Because those are inconceivable.

In any case, without actually getting rid of bind-to-modify, I never really saw the point of DSA. Because as long as drivers have to assume that an application could be binding an object to modify it, the driver can't do useful things like assume that when you bind that VAO, you actually mean to render with it. And so forth. Without that, it's little more than API convenience.

A nice one to be sure. But I don't know that convenience alone is really worth adding 100 more OpenGL functions.


In any event that's what the deprecation mechanism is for.

Just like the ARB deprecated `glUniform*` when ARB_separate_shader_objects gave us `glProgramUniform*`. Like they deprecated the sampler object state inside of textures when they created separate sampler objects. Like they deprecated glVertexAttribPointer when they came out with glVertexAttribFormat and glBindVertexBuffer. Like they deprecated glGetActiveUniform when they made the new program querying API.

I can keep going, but I think my point is clear: deprecation is dead. They're not doing that again.

absence
09-10-2012, 01:44 AM
In any case, without actually getting rid of bind-to-modify, I never really saw the point of DSA. Because as long as drivers have to assume that an application could be binding an object to modify it, the driver can't do useful things like assume that when you bind that VAO, you actually mean to render with it. And so forth. Without that, it's little more than API convenience.

Why should drivers have to assume anything about binding? The binding part can be moved to a sluggish wrapper around the DSA driver that only gets used when an application does not request the core DSA context.

Alfonse Reinheart
09-10-2012, 09:57 AM
The binding part can be moved to a sluggish wrapper around the DSA driver

You can't move the binding part anywhere, because you still need binding in order to render.

Also, what is a "core DSA context"? That sounds suspiciously like deprecation, which, as previously stated, isn't going to happen again. Indeed, notice how implementations only implemented ARB_debug_output in debug contexts, but KHR_debug is now core. That shows that the ARB doesn't like splitting functionality across different context flags the way ARB_debug_output was split.

They have only one split right now: core/compatibility. They're not going to make compatibility/core/coreDSA.

absence
09-14-2012, 11:48 AM
They have only one split right now: core/compatibility. They're not going to make compatibility/core/coreDSA.

I was thinking compatibility/coreDSA. People who can't or won't rewrite can use compatibility or previous core versions, and new projects can use DSA. I do concede that such a change may be too radical for a committee to agree on, but something radical is needed if GL is ever going to come close to DX.

Alfonse Reinheart
09-14-2012, 03:36 PM
Besides the form of the API (which is generally a stylistic question), in what way is OpenGL lagging behind D3D?

absence
09-16-2012, 02:26 PM
Besides the form of the API (which is generally a stylistic question), in what way is OpenGL lagging behind D3D?

Disregarding the source of the problem wouldn't work. Compare:



void OpenGLEngine::Buffer::Update(PVOID data)
{
    glBindBuffer(..., mBuffer);
    glBufferSubData(..., data);
    glBindBuffer(..., 0);
}

void Direct3DEngine::Buffer::Update(PVOID data)
{
    context->UpdateSubresource(mBuffer, ..., data, ...);
}


Notice how the stylistic form of the API affects the number of API calls.

mhagain
09-16-2012, 03:04 PM
Disregarding the source of the problem wouldn't work. Compare:



void OpenGLEngine::Buffer::Update(PVOID data)
{
    glBindBuffer(..., mBuffer);
    glBufferSubData(..., data);
    glBindBuffer(..., 0);
}

void Direct3DEngine::Buffer::Update(PVOID data)
{
    context->UpdateSubresource(mBuffer, ..., data, ...);
}


Notice how the stylistic form of the API affects the number of API calls.

It's actually worse because you need to save out and restore the previously bound buffer, otherwise you're potentially going to mess with state used for drawing (e.g. VAO state). That is the single biggest problem here - there's an artificial connection between state used for drawing and state used for creating or updating, and each can mess with the other.

Yes, we all know that this can be abstracted away in your own "UpdateBuffer" routine - that's not the point. The point is: you shouldn't have to. It's putting extra work on the developer, creating bear traps for inexperienced people to get caught in, and building more places where things can go wrong into your program. A good API helps you to use it properly and helps you to avoid this kind of basic mistake.
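
For the record, here's a minimal sketch of what that "UpdateBuffer" abstraction ends up looking like without DSA (the names and the choice of GL_ARRAY_BUFFER are illustrative); with EXT_direct_state_access the whole thing collapses to a single call:

// Sketch of the save/restore dance a non-DSA update wrapper has to do.
void UpdateBuffer(GLuint buffer, GLintptr offset, GLsizeiptr size, const void* data)
{
    GLint previous = 0;
    glGetIntegerv(GL_ARRAY_BUFFER_BINDING, &previous);   // save whatever was bound

    glBindBuffer(GL_ARRAY_BUFFER, buffer);               // bind purely to modify
    glBufferSubData(GL_ARRAY_BUFFER, offset, size, data);

    glBindBuffer(GL_ARRAY_BUFFER, (GLuint)previous);     // restore drawing state
}

// With EXT_direct_state_access the same operation is a single call:
// glNamedBufferSubDataEXT(buffer, offset, size, data);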


I was thinking compatibility/coreDSA. People who can't or won't rewrite can use compatibility or previous core versions, and new projects can use DSA. I do concede that such a change may be too radical for a committee to agree on, but something radical is needed if GL is ever going to come close to DX.

DSA as it is will never go into core. Much of it relates to deprecated functionality, some of it has been superseded (why use glTextureParameteriEXT when you've got sampler objects?) and some of it is already there (glProgramUniform calls, for example). What really needs to go into core as a matter of priority is buffer and texture updates; anything else needed can follow at its own pace.

Alfonse Reinheart
09-16-2012, 04:56 PM
Disregarding the source of the problem wouldn't work.

The source of what problem? What problem are you trying to solve with this? Because I don't see one.

The number of API calls is generally irrelevant to performance. Or, if it is relevant, you need to prove that it is in some way. This is all the more so when you consider that most of these API calls won't actually touch GPU state directly.


A good API helps you to use it properly and helps you to avoid this kind of basic mistake.

Yes, but nobody's saying that OpenGL is a good API. My claim is that OpenGL does not need anything to "come close to DX". OpenGL is just as functional and performant as Direct3D. Yes, they do things a different way, and OpenGL would be better off if we could magic many of its functions out of existence and magic a bunch more into existence. But that's not going to happen.

DSA is not a new concept. And as much as NVIDIA keeps trying to push DSA, the ARB seems just as determined to avoid DSA. It's not going to happen, so there's no point in wishing that it would.

kRogue
09-17-2012, 12:52 AM
DSA is not a new concept. And as much as NVIDIA keeps trying to push DSA, the ARB seems just as determined to avoid DSA. It's not going to happen, so there's no point in wishing that it would.


... but DSA is slowly happening. For example, as you noted, glProgramUniform found its way into core (you can argue it was needed for SSO, but SSO also added a function to select which program glUniform affects, so the DSA form wasn't strictly required). In addition, sampler objects have a DSA interface. As you stated, the big ones are buffer objects and textures; I'd also argue VAOs, and to a lesser extent FBOs. In the ARB's defense, for editing buffer objects there are already some bind points that are not related to rendering: TEXTURE_BUFFER, COPY_READ_BUFFER and COPY_WRITE_BUFFER. Indeed, TEXTURE_BUFFER as a binding point for a buffer object affects NOTHING, so one can fake DSA for buffer objects by simply always using TEXTURE_BUFFER as the bind point for modifications. This is still worse than having DSA, but it avoids all those bear traps. This might be why DSA on buffer objects is not happening so quickly... or the ARB just doesn't like introducing lots and lots of functions in one go.
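
For what it's worth, a minimal sketch of that workaround, assuming a GL 3.1+ context where the TEXTURE_BUFFER target exists; the wrapper name and parameters are made up for illustration:

// Always use a "semantics-free" target (TEXTURE_BUFFER here; COPY_WRITE_BUFFER
// would also do) so that updating a buffer never disturbs bindings used for drawing.
void FakeDsaBufferUpdate(GLuint buffer, GLintptr offset, GLsizeiptr size, const void* data)
{
    glBindBuffer(GL_TEXTURE_BUFFER, buffer);      // binding here affects nothing else
    glBufferSubData(GL_TEXTURE_BUFFER, offset, size, data);
    glBindBuffer(GL_TEXTURE_BUFFER, 0);
}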

mhagain
09-17-2012, 01:08 AM
DSA is not a new concept. And as much as NVIDIA keeps trying to push DSA, the ARB seems just as determined to avoid DSA. It's not going to happen, so there's no point in wishing that it would.

But the ARB ain't avoiding DSA. Sampler objects have a DSA API (and they're primarily from AMD, so less of the "NVIDIA trying to push" thing, because it's just not true). glProgramUniform is in core. Some form of DSA is getting there, slowly but surely - maybe not in the very same form as the GL_EXT_direct_state_access spec, but it's getting there.

absence
09-17-2012, 03:54 AM
The source of what problem? What problem are you trying to solve with this? Because I don't see one.

I think mhagain hit the nail on the head:


there's an artificial connection between state used for drawing and state used for creating or updating and each can mess with the other.


Yes, but nobody's saying that OpenGL is a good API. My claim is that OpenGL does not need anything to "come close to DX".

That is a contradiction. A good API is exactly what OpenGL needs to come close to Direct3D. As you say, it's not about performance or features.


DSA as it is will never go into core.

Just to be clear, I meant the concept of DSA, not the current implementation of the extension.

mhagain
09-17-2012, 04:32 AM
It's worth noting that back in the old GL 1.1 vs D3D 3 days, the fact that OpenGL was unquestionably a better API was a very significant influencing factor. Both had similar capabilities and could hit similar performance, but OpenGL 1.1 helped the programmer to be productive while D3D 3 didn't. So the importance of an API being good shouldn't be downplayed; a good API lets you focus on productive and useful work that gets stuff done rather than on writing boilerplate and wrappers that shouldn't be needed. A good API lets you iterate more quickly on proof-of-concept work and lets you focus on fixing errors in and adding cool stuff to your own code rather than on dealing with bizarre and outdated design decisions.

For me at least this is nothing whatsoever to do with any concept of OpenGL coming close to D3D. Each has its own world-view that's subtly different from the other (and in some ways GL 4.3 is already quite well ahead of D3D 11, so if it were to come close it would involve it getting worse). This is to do with OpenGL becoming a good API again, in its own right. So "just because D3D does it" is a poor reason to add functionality to OpenGL, but "just because D3D does it" is also a damn poor reason not to add functionality. Ignore what D3D does, look at what OpenGL does, ask yourself "is this screwed?", and if the answer is "yes" then ask "what needs to be done?" That's all.

Alfonse Reinheart
09-17-2012, 07:48 AM
In the ARB's defense, for editing buffer objects there are already some bind points that are not related to rendering: TEXTURE_BUFFER, COPY_READ_BUFFER and COPY_WRITE_BUFFER. Indeed, TEXTURE_BUFFER as a binding point for a buffer object affects NOTHING.

Actually, there are quite a few more binding points that have no semantic component associated with them. All of the indexed targets (GL_TRANSFORM_FEEDBACK_BUFFER, GL_UNIFORM_BUFFER, GL_ATOMIC_COUNTER_BUFFER, etc) have actual binding points in addition to the indexed locations, but because they're indexed targets, these bind points don't have any semantic component associated with them.

Indeed, only about half of the targets actually have explicit semantics: GL_ARRAY_BUFFER, GL_ELEMENT_ARRAY_BUFFER, GL_PIXEL_PACK/UNPACK_BUFFER, and GL_DRAW/DISPATCH_INDIRECT_BUFFER.


But the ARB ain't avoiding DSA. Sampler objects have a DSA API (and they're primarily from AMD, so less of the "NVIDIA trying to push" thing because it's just not true). glProgramUniform is in core. Some form of DSA is getting there, slowly but surely, and not the very same as the GL_EXT_direct_state_access spec, but it's getting there.

"Getting there" would imply progress, the gradual decrease in new features that don't use DSA interfaces. But that's not what we see. There's no gradual decrease in non-DSA interfaces; it's completely random and haphazard. We discussed this (http://www.opengl.org/discussion_boards/showthread.php/178711-Official-feedback-on-OpenGL-4-3-thread?p=1241367&viewfull=1#post1241367): except for new object types, the only OpenGL extensions that are DSA only are those that originated from NVIDIA (like SSO; EXT_SSO didn't require DSA because NVIDIA already implemented glProgramUniformEXT through the DSA extension).

The ARB's policy clearly appears to be not to adopt DSA. But they don't seem willing to make changes to NVIDIA-born extensions just to make them non-DSA.

So I don't see us "getting there" here. New features are just as likely as not to use DSA interfaces.


That is a contradiction. A good API is exactly what OpenGL needs to come close to Direct3D.

Do you honestly believe that the reason OpenGL is not used as often today as D3D is because of its API?


Both had similar capabilities and could hit similar performance, but OpenGL 1.1 helped the programmer to be productive, D3D 3 didn't.

D3D v3 did not have "similar capabilities" at all. It had no software T&L, it had fewer blending modes (granted, so did the hardware, so that was a performance trap of GL 1.1 at the time), and so forth.

Also, we need to recognize that there is a major difference between how D3D v3's API was terrible and how OpenGL's API is poor. GL's API makes your code a bit inelegant, or requires you to be a little careful. Looking at D3D v3 code is like staring into the Ark of the Covenant. It was bad.

OpenGL's issues are a minor inconvenience.

thokra
09-17-2012, 08:36 AM
A good API is exactly what OpenGL needs to come close to Direct3D. As you say, it's not about performance or features.

What good is a beautifully designed API if it lacks functionality, is a bitch to make fast, and can only be used on a restricted range of platforms? This is not to say that Direct3D lacks functionality that's versatile enough for the mainstream market or isn't fast; it's simply to clarify that a good API is much more than simple convenience.

And what does coming close to Direct3D mean anyway? Can you do anything with Direct3D that you can't do with OpenGL? Can you port Direct3D apps to natively run on any Unix-like OS that's supported by hardware vendors? Do you recognize the importance of Android and iOS? Are you aware that the PlayStation and the Wii don't give a damn about Direct3D? It may sound like zealotry on my part, but these are aspects that are independent of API design yet just as important to the potential of the API.

What I miss as an OpenGL developer is a more powerful GLSL compiler (and of course the corresponding spec) and something like XNA - an official set of tools for use with OpenGL that eases your everyday pain. Sure, we can gather all sorts of good third-party libs like glm, GLEW and so forth, but it's much more effort than it should be. IMHO, the ecosystem of D3D's accompanying libs and tools, all official and actively developed, is where D3D shines. Still, this has nothing to do with OpenGL as an API. If you're talking about OpenGL versus Direct3D and take them for what they are, then you will see stuff that's tedious in OpenGL and better solved in Direct3D, but the result will not differ if done right, and that's good enough for me.

One thing I have come to understand in the past months is that arguing how OpenGL could be better here and there doesn't get you anything unless you have proven, with working code, that a real problem exists. Have you taken any timings regarding your DSA example above? Are you absolutely sure that binding, updating and unbinding is actually slower than the single call to UpdateSubresource()? It might look logical since you save two API calls, but do you have any idea about how much time the functions actually take?

Unless you can prove that OpenGL falls short in terms of the results you get out of it, Direct3D might be the better-designed API - but it's not a better solution.

Alfonse Reinheart
09-17-2012, 11:52 AM
all official and actively developed

Except for XNA, whose future is very much in doubt (http://gamedev.stackexchange.com/questions/22292/what-is-the-future-of-xna-in-windows-8-or-how-will-manged-games-be-developed-in). And for D3DX, which is not being "actively developed" (http://preview.library.microsoft.com/en-us/library/ee663275%28v=vs.85%29.aspx), though they did spin off DirectXMath.

So being "official" doesn't seem to have changed the fact that if you want to make Win8 games, you're going to have to ditch your XNA-based system and use something else. Personally, I'd rather work with something open source if I don't have the money to buy/license a real engine; at least then, I can go in and work on it if the original owner decided to stop.

thokra
09-17-2012, 12:24 PM
Except for XNA, whose future is very much in doubt (http://gamedev.stackexchange.com/questions/22292/what-is-the-future-of-xna-in-windows-8-or-how-will-manged-games-be-developed-in). And for D3DX, which is not being "actively developed" (http://preview.library.microsoft.com/en-us/library/ee663275%28v=vs.85%29.aspx), though they did spin off DirectXMath. So being "official" doesn't seem to have changed the fact that if you want to make Win8 games, you're going to have to ditch your XNA-based system and use something else.

I wasn't aware of that - maybe my narrow interest in Windows 8 is to blame. I can't fathom why some undoubtedly useful stuff has apparently been deemed to have outlived its time.

mhagain
09-17-2012, 12:43 PM
One thing I have come to understand in the past months is that arguing how OpenGL could be better here and there doesn't get you anything unless you have proven, with working code, that a real problem exists. Have you taken any timings regarding your DSA example above? Are you absolutely sure that binding, updating and unbinding is actually slower than the single call to UpdateSubresource()? It might look logical since you save two API calls, but do you have any idea about how much time the functions actually take?

Unless you can prove that OpenGL falls short in terms of the results you get out of it, Direct3D might be the better-designed API - but it's not a better solution.

Here's the way things work.

There are at least three main classes of functionality that are candidates for adding to OpenGL.

The first is to fix bugs in the existing spec, and I don't think anyone could argue against that - not even Alfonse.

The second is to add new functionality; ditto, but the form it takes is worth discussing while everything is still up for grabs.

The third is what we're talking about here - general spec cleanup and usability improvements, and it seems that an argument is being made that this class has no place in any hypothetical future spec.

That's a spurious argument, and it seems to be advanced for the sake of having an argument rather than for furthering the discussion. GL has had tons of additional functionality of this nature added. Was GL_ARB_explicit_attrib_location necessary? No, yet it was added. Was GL_ARB_explicit_uniform_location necessary? No, but likewise. Is it necessary (pulling a possible future example off the top of my head, please don't take it too seriously) to be able to provide initial/default values for sampler uniforms? No. Is it spec cleanup and a usability improvement? Yes.

Hell, even GL_ARB_separate_shader_objects doesn't provide functionality that can't be worked around otherwise (and I seem to recall that a certain someone was opposed to that idea too). That one is actually a classic example of exactly the kind of argument that is being advanced against any notion of any kind of DSA here (and is also a good example of something that involved a significant number of new entry points and a handful of new GLenums).

Again with the specific problem that DSA resolves: there is an artificial connection between state used for drawing objects and state used for creating/updating them. Doing one has consequences that the developer most likely does not intend for the other. Yet they are clearly different operations. So that artificial connection needs to be broken.

So - we're being told "come up with a compelling argument for" to which I'll respond: "come up with a compelling argument against, and you may be worth taking seriously".

absence
09-17-2012, 12:56 PM
Do you honestly believe that the reason OpenGL is not used as often today as D3D is because of its API?

Not at all, or I would have made the claim. I believe lack of driver quality and developer tools are more important factors, but that's not an excuse to make the lives of developers harder than necessary by ignoring the API issue.

thokra
09-17-2012, 02:26 PM
mhagain: No one in their right mind can be against fixing bugs, and no one in their right mind can be against API changes that ease development. absence's argument was that stuff like that has to happen to raise OpenGL to be on par with Direct3D, but absence offered no substantial facts whatsoever to base that claim on.

What I'd like to see is for people to show real examples of where the differences between the two APIs are really significant - not just praise for functionality that makes your life easier but is no more powerful.

Regarding DSA I agree. However, Alfonse has a point: there doesn't seem to be a clear direction the ARB is headed in. It's the same wishy-washiness that made deprecation useless. I'd like to see DSA in core too, but if the ARB actually decides to introduce a complete and consistent DSA API they might as well revamp the whole thing and finally break backwards compatibility.

Alfonse Reinheart
09-17-2012, 02:38 PM
The third is what we're talking about here - general spec cleanup and usability improvements

No, what we're talking about here is DSA, a specific usability improvement. One that has been around in extension form since 2008. One that the ARB has had 6 OpenGL spec versions to incorporate. One that the ARB has failed to incorporate in each of those 6 spec versions.

This isn't like ARB_explicit_uniform_location, where people clearly wanted it from day one, but there never was an EXT spec for it or anything. This is something that's been an extension spec for 4 years; the ARB clearly knows about it since a good half of the extension specs that make up 4.3 have EXT_DSA functions in them.

If someone has clearly been presented with an idea, and they reject that idea six times, you should recognize that they're not going to do it. They didn't put DSA into OpenGL ES 3.0. They didn't put DSA into desktop GL 3.1, 3.2, 3.3, 4.0, 4.1, 4.2, or 4.3.

Is there some reason at all to expect them to put it into 4.4 or 5.0?


Hell, even GL_ARB_separate_shader_objects doesn't provide functionality that can't be worked around otherwise (and I seem to recall that a certain someone was opposed to that idea too).

Originally, way back when 3DLabs proposed GLSL, I was. However, I was primarily against EXT_separate_shader_objects because it didn't let you have that separation for user-defined inputs/outputs.

Lastly, I'm not "opposed to" DSA. I'm a realist: I have no faith that an organization that has refused to incorporate this through 6 versions of OpenGL will suddenly decide it's a great idea and incorporate it the seventh time. Especially when they've clearly shown that they don't want to add DSA-style APIs for many of their new features.

In short: it's not happening, so let it go already.


we're being told "come up with a compelling argument for"

No, my original question, the one you seem to have forgotten, was, "Besides the form of the API (which is generally a stylistic question), in what way is OpenGL lagging behind D3D?" That's what was being discussed until you shifted it to the question of why OpenGL needs DSA irrespective of D3D.

absence
09-18-2012, 04:00 AM
I'd like to see DSA in core too, but if the ARB actually decides to introduce a complete and consistent DSA API they might as well revamp the whole thing and finally break backwards-compatibility.

So we agree after all! They could even maintain the current API as the compatibility spec in order to avoid breaking anything. They're already maintaining two specs anyway.

absence
09-18-2012, 04:06 AM
Lastly, I'm not "opposed to" DSA. I'm a realist:

This is a feedback thread, and I provide feedback. If you don't even disagree with it, please keep the noise down. We're all disillusioned with the ARB, but keeping quiet isn't going to solve anything.

thokra
09-18-2012, 04:22 AM
They're already maintaining two specs anyway.

At some point this nonsense should just stop. Introducing a complete DSA API is the perfect opportunity to realize what was originally planned 4 years ago.

Dark Photon
09-18-2012, 06:16 PM
We're all disillusioned with the ARB...
Hey, speak for yourself! :D I second the rest of what you said, but let's avoid the grandstanding too. Each of us represents our own suggestions and opinions -- no others.

Over the years, I've come to appreciate the difficult job the ARB has. They're never going to please anyone completely (and they're limited by corporate budgets and priorities), so if anyone out there's holding their breath for complete nirvana, they might as well sit down before they pass out. Just throw your opinion out there for consideration, and don't get your panties all in a wad if you don't get everything you want. You're not the only fish in the pool.

absence
09-20-2012, 06:44 AM
Hey, speak for yourself!

Sorry, I meant to point out that Reinheart isn't the only one who's disillusioned, not to grandstand. Back to the topic of feedback, I forgot to mention the elephant in the room: can we please have multi-threading in OpenGL?

thokra
09-20-2012, 06:56 AM
Can we please have multi-threading in OpenGL?

A GPU is inherently multi-threaded - following AMD rhetoric it's even ultra-threaded. The problem is the single command buffer you can fill. Multi-threaded OpenGL can be done using two or more contexts in different threads. However, if you have a single command buffer and you switch threads it's nothing more than adding GL commands sequentially to that command buffer depending on what thread is currently pushing. If you have multiple command buffers, as with multiple GPUs, you can have true multi-threading.

malexander
09-20-2012, 06:58 AM
Can we please have multi-threading in OpenGL?

You can - just create a separate GL context for each thread, belonging to the same share group. You can then have some threads create objects, fill buffers, etc. while another does the rendering. This is precisely what GL_ARB_sync is for.
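
As a rough sketch of that pattern - assuming two contexts in the same share group have already been created and made current on their respective threads (context creation is platform-specific and omitted), and with the function names invented for illustration:

// Loader thread (its own context, shared with the render context):
GLsync UploadOnLoaderThread(GLuint buffer, GLsizeiptr size, const void* data)
{
    glBindBuffer(GL_COPY_WRITE_BUFFER, buffer);
    glBufferData(GL_COPY_WRITE_BUFFER, size, data, GL_STATIC_DRAW);

    // Fence so the render thread knows when the upload has actually completed.
    GLsync fence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
    glFlush();          // make sure the fence command is submitted
    return fence;       // hand the sync object over to the render thread
}

// Render thread (the other shared context), before first use of the buffer:
void WaitForUpload(GLsync fence)
{
    glWaitSync(fence, 0, GL_TIMEOUT_IGNORED);   // server-side wait, no CPU stall
    glDeleteSync(fence);
}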

aqnuep
09-20-2012, 07:55 AM
Can we please have multi-threading in OpenGL?
I don't understand why everybody is asking for multi-threading in OpenGL just because D3D has it. FYI: OpenGL has always had multi-threading support.
D3D's multi-threading approach is not much different. Only deferred contexts are something that OpenGL doesn't have, though they're somewhat similar to display lists. Also, deferred contexts promise more than they actually achieve; in practice they barely give any benefit.

absence
09-21-2012, 01:47 AM
However, if you have a single command buffer and you switch threads it's nothing more than adding GL commands sequentially to that command buffer depending on what thread is currently pushing. If you have multiple command buffers, as with multiple GPUs, you can have true multi-threading.

Neither of those cases is very interesting, since the most common hardware setup is multiple CPU cores and a single GPU. Direct3D solves this by having multiple command lists for a single GPU. Isn't that feasible for OpenGL as well?


Only deferred contexts are something that OpenGL doesn't have, though it's kind of similar to display lists.

It doesn't matter if it's similar, because OpenGL doesn't have display lists.

thokra
09-21-2012, 02:25 AM
Direct3D solves this by having multiple command lists for a single GPU.

And does the hardware actually map that to multiple parallel command streams as well? Multiple streams in software don't give you much gain if you are still forced to be sequential in hardware.

mhagain
09-21-2012, 06:11 AM
A D3D deferred context is almost exactly as described - record API calls on a separate thread, then play them back on the main thread. There's a very obvious case it targets: the case where the cost of making these API calls sequentially on the main thread outweighs the overhead of threading and of making two passes over each command (once to record and once to play back). It shouldn't be viewed as an "implement this and you'll be teh awesome" feature; rather, it needs careful profiling of your program and informed decision making. Implement it in a program that doesn't have the performance characteristics it targets and things will get worse.
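
Conceptually it's nothing more exotic than a record-and-replay pattern along these lines (a deliberately naive C++ sketch of the idea, not how D3D actually implements deferred contexts):

#include <functional>
#include <vector>

// A worker thread records closures instead of issuing API calls directly;
// the main/render thread replays them in order.
struct CommandList
{
    std::vector<std::function<void()>> commands;

    // Worker thread: record a call instead of executing it now.
    void Record(std::function<void()> cmd) { commands.push_back(std::move(cmd)); }

    // Main thread: replay everything in order, then discard.
    void Execute()
    {
        for (auto& cmd : commands)
            cmd();
        commands.clear();
    }
};

// Usage (illustrative): list.Record([=]{ /* some state change or draw */ }); ... list.Execute();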

absence
09-25-2012, 01:48 AM
It shouldn't be viewed as a "implement this and you'll be teh awesome" feature, rather, it needs careful profiling of your program and informed decision making.

Correct, and in order to decide whether to use a feature or not it needs to exist.

Leith Bade
10-27-2012, 01:37 AM
Will GL_EXT_depth_bounds_test be promoted to core now that both AMD and NVidia support it?

mbentrup
10-28-2012, 04:59 AM
The OpenGL 4.3 core spec removed the GL_ALPHA format for glReadPixels (chapter 18.2.1, page 458). The other single-component formats GL_RED, GL_GREEN and GL_BLUE are supported, but GL_ALPHA is only supported in the compatibility profile. In 4.2 core all 4 individual components could be read with glReadPixels.

Is there a reason why the alpha channel is treated specially here?

Brandon J. Van Every
11-14-2012, 09:50 AM
Correct, and in order to decide whether to use a feature or not it needs to exist.

It also needs to be used by a lot of people. If very few people actually use a feature, then it usually gets very little testing or IHV support. Anything "not on the common paths" is very likely to blow up or not provide any benefit, just due to the exponential branching complexity of oh-so-many features. Ideally, any proposed feature would have some kind of "preliminary consensus" that it can indeed be a benefit and would indeed match a lot of people's expected use cases.

ManDay
03-13-2013, 02:38 PM
The nonsensical description of DrawElements was eventually fixed in the 4.3 spec - nice! However, it would be good if the description of the "indices" parameter reflected the same meaning on the reference pages as well (i.e. that it is an offset, not any sort of C pointer type).

Eosie
04-07-2013, 06:40 PM
It also needs to be used by a lot of people. If very few people actually use a feature, then it usually gets very little testing or IHV support. Anything "not on the common paths" is very likely to blow up or not provide any benefit, just due to the exponential branching complexity of oh-so-many features. Ideally, any proposed feature would have some kind of "preliminary consensus" that it can indeed be a benefit and would indeed match a lot of people's expected use cases.

This is mostly incorrect. Features are (or should be) tested before they are released, regardless of the popularity of each feature, and the way to achieve that is to write tests or use an existing test suite that can assess the quality of an OpenGL implementation. There is the official OpenGL conformance test suite developed by Khronos, which driver developers should use first. There is also Piglit, the unofficial open-source OpenGL test suite, which is pretty huge and getting bigger every day, with 89 contributors so far. Finally, driver developers should write their own tests for the features they implement (I know major vendors do that, though the specification coverage of their tests might vary a lot, considering AMD passed about 75% of Piglit tests last time I checked, which is not very good, and NVIDIA even less than that).

Alfonse Reinheart
04-08-2013, 02:17 AM
This is mostly incorrect. Features are (or should be) tested before they are released regardless of popularity of each feature, and the way to achieve that is to write tests or use an existing test suite which can assess the quality of an OpenGL implementation.

You seem very conflicted about what you're saying.

On the one hand, you're saying that the post is wrong to claim that only some features get tested. Yet you cite low Piglit scores and the obvious lack of comprehensive testing by driver developers. So clearly some features are not being properly tested, and you agree with Brandon's point: that staying on commonly trodden ground is the most effective way to avoid driver bugs.

The post you're responding to is not talking about what "should be"; it is talking about what currently is. And you seem to agree with his explanation of the current state of things.

So what is "mostly incorrect" about what he's saying?


There is the official OpenGL conformance test suite developed by Khronos that driver developers should use first.

If there is a desktop OpenGL conformance test, I haven't heard of it. The old conformance test for desktop GL was never updated; it never even checked conformance to 2.0. So it's worthless for anything more recent. OpenGL ES 2.0 and 3.0 have conformance tests, but not desktop GL.

Khronos said that they were interested in making one, but they haven't announced anything beyond interest.