OpenGL 4.6 request list

Please add the following features to OpenGL 4.6.

Add OpenGL ES 3.2 context creation functionality to OpenGL 4.6 core.

Add the extensions from OpenGL ES 3.2 to core OpenGL.
Make OpenGL a superset of OpenGL ES again.

Make ASTC support mandatory in core OpenGL and S3TC optional (or, what I would like to see more, deprecate/remove S3TC).
Possibly adding one of the following ASTC extensions:
OES_texture_compression_astc
https://www.khronos.org/registry/gles/extensions/OES/OES_texture_compression_astc.txt
texture_compression_astc_hdr
https://www.opengl.org/registry/specs/KHR/texture_compression_astc_hdr.txt
Maybe make a full and portable profile for ASTC with different texture limits to serve the full spectrum of devices?

Put shader draw parameters in core.
https://www.opengl.org/registry/specs/ARB/shader_draw_parameters.txt

Allow more varied names in core for the components of low-component texture formats.
Especially single- and dual-channel formats.
Allow not only R but G, B, and A as well.
And allow another letter for differentiating between colour components and other miscellaneous components.
(Maybe C for component or channel, maybe another letter.)
The reason/rationale for such a thing: it makes detecting errors much easier for programmers, letting them see much more easily whether their code is using the components correctly.
Do not mandate how to use the component names to avoid them becoming a programming limitation instead of a more expressive way to write code.

If you introduce an extension for async shader compilation, please put async somewhere in the name. And have both compilation and loading of shaders done asynchronously by the extension.
Perhaps the name could be GL_ARB_parallel_async_shader_compile for OpenGL, also used in OpenGL ES.
Not GL_ARB_parallel_shader_compile.
If it provides async compilation, that is a big feature; such a feature needs to be advertised in the name.
Unifying with how async is done in Vulkan might be a good idea.
It seems only minor adjustments would need to be done to the following extension:
https://www.opengl.org/registry/specs/ARB/parallel_shader_compile.txt
Also have the specification provide plenty of information about how it interacts with a shader cache. Put plenty of information about shader caches in the specification: what a shader cache is, what it does and allows, and mention shader caches a few more times in the description of the extension.
Do make sure there is good information about what async shader compilation and loading allow, especially in reducing lag spikes: increasing predictability and performance while reducing lag and stutter.
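
For context, the closest application-visible cache hook OpenGL already has is ARB_get_program_binary; a minimal sketch of the save side (assumes a context with GL 4.1 or that extension):

[CODE]
// Sketch: persisting a linked program with ARB_get_program_binary, the
// nearest existing relative of a driver shader cache.
#include <vector>

std::vector<char> saveProgramBinary(GLuint program, GLenum& formatOut)
{
    GLint length = 0;
    glGetProgramiv(program, GL_PROGRAM_BINARY_LENGTH, &length);

    std::vector<char> blob(length);
    GLsizei written = 0;
    glGetProgramBinary(program, length, &written, &formatOut, blob.data());
    blob.resize(written);
    return blob; // write to disk; restore next run with glProgramBinary()
}
[/CODE]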

Do NOT put in features from Vulkan YET.
The following is not applicable to putting compatibility contexts between OpenGL and Vulkan.
It’s too early, given several things:

  • apparently a new Vulkan release this summer
  • the Vulkan spec churn (new documentation release every week)
  • the resulting spec churn from the new Vulkan release this summer
  • getting feedback from developers about desired feature sets
    Vulkan really is not ready yet to base individual OpenGL features on.
    Once more time has passed, it will be.
    Once the documentation becomes somewhat more stable (maybe as early as next year: 2017), once Vulkan’s features have crystallized into feature sets, and once the new release of Vulkan has happened.
    After those things have happened, it will be the right time to start doing feature cross-pollination between the two APIs.
    Also, don’t put in SPIR-V when there is a new release coming up this summer.
    Until then, it makes little sense to start copying features between the two APIs.
    Especially since, with Vulkan, there will be feedback on what features developers want through the process of determining feature sets. Knowing which features are popular will let the spec makers at Khronos optimally choose which features to copy to other APIs.

Add OpenGL ES 3.2 context creation functionality to OpenGL 4.6 core.

OpenGL doesn’t define any “context creation functionality”. So it’s not clear what that would mean.

But in any case, you can use the EXT_create_context_es_profile extension to create any version of OpenGL ES contexts, where supported.
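
On Windows, for example, the ES profile goes through the usual create-context-attribs path. A minimal sketch, assuming the driver advertises WGL_EXT_create_context_es_profile and wglCreateContextAttribsARB has already been loaded:

[CODE]
// Sketch: creating an OpenGL ES 3.2 context on Windows via the ES profile
// bit. Assumes WGL_EXT_create_context_es_profile is advertised and the
// wglCreateContextAttribsARB entry point has been fetched already.
#include <windows.h>

#define WGL_CONTEXT_MAJOR_VERSION_ARB  0x2091
#define WGL_CONTEXT_MINOR_VERSION_ARB  0x2092
#define WGL_CONTEXT_PROFILE_MASK_ARB   0x9126
#define WGL_CONTEXT_ES_PROFILE_BIT_EXT 0x00000004

typedef HGLRC(WINAPI* PFNWGLCREATECONTEXTATTRIBSARBPROC)(HDC, HGLRC, const int*);

HGLRC createEs32Context(HDC dc, PFNWGLCREATECONTEXTATTRIBSARBPROC createContextAttribs)
{
    const int attribs[] = {
        WGL_CONTEXT_MAJOR_VERSION_ARB, 3,
        WGL_CONTEXT_MINOR_VERSION_ARB, 2,
        WGL_CONTEXT_PROFILE_MASK_ARB,  WGL_CONTEXT_ES_PROFILE_BIT_EXT,
        0 // list terminator
    };
    // Returns NULL when the ES profile (or the version) is unsupported.
    return createContextAttribs(dc, NULL, attribs);
}
[/CODE]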

Make OpenGL a superset of OpenGL ES again.

No version of desktop OpenGL was ever a superset of any version of OpenGL ES. There was always code you could write in ES that would not do the same thing on desktop GL.

Make ASTC support mandatory in core OpenGL

That’s not practical. ASTC is only supported by a very small set of hardware. Unless you want nobody to implement GL 4.6.

ASTC is not a real thing yet.

S3TC optional (or, what I would like to see more, deprecate/remove S3TC).

First, good news: S3TC was never adopted into core OpenGL. It has always been an extension.

FYI: RGTC isn’t S3TC. It’s similar to it, but it doesn’t have the patent issues. Which is why RGTC is core and S3TC is not.

Second, even if it was mandatory, why get rid of perfectly valid, functional, and useful technology? It’s not like IHVs will be ripping out support for it from their texture fetch units. Not so long as applications still use it.

Put shader draw parameters in core.

Intel doesn’t support it. It would be better to just bring in the gl_InstanceIndex functionality from khr_vk_glsl. That’s the most important part of draw parameters that OpenGL doesn’t support, and it’s something we know Intel can support.
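
To make that suggestion concrete: khr_vk_glsl's gl_InstanceIndex includes the draw's base instance, while desktop gl_InstanceID counts from zero per draw. Where ARB_shader_draw_parameters is present today, the Vulkan behavior can be reconstructed by hand, as in this sketch:

[CODE]
// Sketch: reconstructing Vulkan-style gl_InstanceIndex in desktop GLSL.
// gl_InstanceID restarts at zero for each draw; the draw's base instance
// has to be added back, which today requires ARB_shader_draw_parameters.
const char* vertexSrc = R"(
    #version 450
    #extension GL_ARB_shader_draw_parameters : require
    layout(location = 0) in vec4 position;

    void main()
    {
        // Equivalent of khr_vk_glsl's gl_InstanceIndex:
        int instanceIndex = gl_InstanceID + gl_BaseInstanceARB;
        gl_Position = position + vec4(0.0, float(instanceIndex), 0.0, 0.0);
    }
)";
[/CODE]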

And have both compilation and loading of shaders done asynchronously by the extension.

What exactly does that mean? Loading shader code is the job of the application, not OpenGL. It can’t make that asynchronous.

Unifying with how async is done in Vulkan might be a good idea.

That would be the opposite of that extension. In Vulkan, there is no asynchronous shader compilation support. What there is in Vulkan are two things:

  1. When you call vkCreateShaderModule, you are guaranteed that the compilation is finished (successfully or with failure) by the time it returns. Similarly, when you call vkCreateGraphicsPipelines, you are guaranteed that the compilation is finished (successfully or with failure) by the time it returns.

  2. Both of those calls are fully reentrant. You can call them on the same VkDevice from multiple threads. You can even have multiple threads all using the same pipeline cache without synchronization.

Vulkan doesn’t make shader compilation parallel or asynchronous (FYI: those words mean the same thing in this context). It simply provides you with the tools to compile shaders asynchronously.
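
In code, that difference looks roughly like this (a sketch; `device` and the SPIR-V words are assumed to already exist):

[CODE]
// Sketch: Vulkan shader "compilation" is synchronous but reentrant, so the
// application supplies the asynchrony itself, e.g. one task per module.
#include <vulkan/vulkan.h>
#include <future>
#include <vector>

VkShaderModule compileOne(VkDevice device, const std::vector<uint32_t>& spirv)
{
    VkShaderModuleCreateInfo info = {};
    info.sType    = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO;
    info.codeSize = spirv.size() * sizeof(uint32_t);
    info.pCode    = spirv.data();

    VkShaderModule module = VK_NULL_HANDLE;
    // Finished (successfully or not) by the time this returns.
    vkCreateShaderModule(device, &info, NULL, &module);
    return module;
}

std::future<VkShaderModule> compileAsync(VkDevice device,
                                         const std::vector<uint32_t>& spirv)
{
    // Fully reentrant: many of these may run on the same VkDevice at once.
    return std::async(std::launch::async, compileOne, device, std::cref(spirv));
}
[/CODE]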

By contrast, parallel_shader_compile provides you with the tools to realize that the OpenGL implementation may compile shaders in parallel, and it gives you the tools to stop interfering in that process (by asking if there was an error before the compile has finished).
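
In code, that model is a polling pattern (GL_COMPLETION_STATUS_ARB is the query the extension adds; the sketch assumes a function loader is in place):

[CODE]
// Sketch: the ARB_parallel_shader_compile pattern. Kick off the compile,
// then poll completion instead of forcing a blocking status query.
#ifndef GL_COMPLETION_STATUS_ARB
#define GL_COMPLETION_STATUS_ARB 0x91B1
#endif

bool compileFinished(GLuint shader)
{
    GLint done = GL_FALSE;
    glGetShaderiv(shader, GL_COMPLETION_STATUS_ARB, &done);
    return done == GL_TRUE; // if false, keep rendering and poll next frame
}

// Usage: call glCompileShader(shader), go do other work, and only ask for
// GL_COMPILE_STATUS (which may block) once compileFinished() returns true.
[/CODE]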

It’s two different models for two very different APIs. In one case, the API is asynchronous; in the other case, the API is reentrant.

Also have the specification provide plenty of information about how it interacts with a shader cache. Put plenty of information about shader caches in the specification: what a shader cache is, what it does and allows, and mention shader caches a few more times in the description of the extension.

That is not how a specification works. A specification defines behavior, not how it gets implemented.

Vulkan talks about a pipeline cache because it is an explicit object which is part of the Vulkan system. It’s part of the API; it can’t not talk about it.

OpenGL has no similar construct. If the implementation uses a cache when compiling shaders, that’s not something OpenGL can explain, since it does not affect the behavior of the system or its interface. It only would affect performance.

It’s an implementation detail in OpenGL.
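
For contrast, a sketch of just how explicit the Vulkan object is; the application itself serializes the cache bytes (assumes `device` and `cache` already exist):

[CODE]
// Sketch: VkPipelineCache is a first-class API object the application
// creates, feeds to vkCreateGraphicsPipelines, and serializes itself.
#include <vulkan/vulkan.h>
#include <cstdio>
#include <vector>

void savePipelineCache(VkDevice device, VkPipelineCache cache, const char* path)
{
    size_t size = 0;
    vkGetPipelineCacheData(device, cache, &size, NULL);        // query size
    std::vector<char> blob(size);
    vkGetPipelineCacheData(device, cache, &size, blob.data()); // fetch bytes

    if (FILE* f = std::fopen(path, "wb")) {
        std::fwrite(blob.data(), 1, blob.size(), f);
        std::fclose(f);
    }
}
[/CODE]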

  • apparently a new Vulkan release this summer

Where did you hear about this summer release? Is it scheduled for SIGGRAPH?

[QUOTE=Alfonse Reinheart;1282835]
That’s not practical. ASTC is only supported by a very small set of hardware. Unless you want nobody to implement GL 4.6.[/QUOTE]

To add to this, the Vulkan database shows that the only desktop hardware that supports ASTC is Intel. Even for NVIDIA, only their mobile Tegra line supports ASTC. This could be due to immature Vulkan drivers, but it does match up with the OpenGL support.

So while ASTC may be the future, it is definitely not the present.
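
Which means portable code has to detect it at run time rather than assume it; a minimal sketch:

[CODE]
// Sketch: detecting ASTC (LDR profile) at run time.
#include <cstring>

bool hasAstcLdr()
{
    GLint count = 0;
    glGetIntegerv(GL_NUM_EXTENSIONS, &count);
    for (GLint i = 0; i < count; ++i) {
        const char* ext =
            reinterpret_cast<const char*>(glGetStringi(GL_EXTENSIONS, i));
        if (ext && std::strcmp(ext, "GL_KHR_texture_compression_astc_ldr") == 0)
            return true; // ASTC formats may be used with glCompressedTexImage*
    }
    return false;
}
[/CODE]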

It’s also worth noting that this is what OpenGL specs used to do in the past: define a software interface with little or no consideration to how hardware can support it (or even whether hardware supports it). That approach manifestly failed; the upshot was that OpenGL implementations tended to end up with functionality that was software-emulated, but it was not queryable if that was the case, so you could find yourself rudely thrown back to software emulation and single-digit framerates. That’s OK if you’re in a scenario where “everything must work and performance is secondary”, but that’s not always the case with everybody, and those for whom it wasn’t the case were poorly served by OpenGL.

ASTC is certainly feasible if all of the hardware vendors come onboard and implement support in conjunction with a future evolution of the spec. But that should be a requirement - the spec cannot evolve in isolation.

Far more interesting (and useful) would be to bring anisotropic filtering into core. The 20-year term from the priority date has now expired (US6005582A - Method and system for texture mapping images with anisotropic filtering - Google Patents), so it should now be doable.
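
Part of the appeal is how small the feature is; a sketch of the EXT_texture_filter_anisotropic usage that core promotion would standardize:

[CODE]
// Sketch: clamping a requested anisotropy degree to the queried maximum
// and applying it to a texture, per EXT_texture_filter_anisotropic.
#ifndef GL_TEXTURE_MAX_ANISOTROPY_EXT
#define GL_TEXTURE_MAX_ANISOTROPY_EXT     0x84FE
#define GL_MAX_TEXTURE_MAX_ANISOTROPY_EXT 0x84FF
#endif

void setAnisotropy(GLenum target, GLfloat requested)
{
    GLfloat maxAniso = 1.0f;
    glGetFloatv(GL_MAX_TEXTURE_MAX_ANISOTROPY_EXT, &maxAniso);
    glTexParameterf(target, GL_TEXTURE_MAX_ANISOTROPY_EXT,
                    requested < maxAniso ? requested : maxAniso);
}
[/CODE]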

It should also be noted that other features are dealt with in the same way. Things are only added to core when a broad base of hardware can support them. In the case of OpenGL 4.x, something only becomes core if all current 4.x hardware can support it. Features of note which do not have such a broad base of support are:

  • ARB_fragment_shader_interlock: No AMD hardware support.
  • KHR_blend_equation_advanced: No AMD hardware support.
  • ARB_bindless_texture: No Intel/pre-GCN AMD hardware support.
  • ARB_sparse_texture/buffer: No Intel/pre-GCN AMD hardware support.

What OpenGL really lacks is Vulkan’s “feature” concept, which is effectively functionality defined in the core specification but whose support is not required. OpenGL can express a form of this through implementation-defined limits. For example, image load/store and SSBOs are only required to be supported in fragment and compute shaders; other stages can express support by having non-zero limits.
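
A sketch of that non-zero-limit pattern, using vertex-stage image load/store as the example:

[CODE]
// Sketch: optional per-stage support expressed as a limit, not an extension.
bool vertexStageSupportsImageLoadStore()
{
    GLint maxVertexImages = 0;
    glGetIntegerv(GL_MAX_VERTEX_IMAGE_UNIFORMS, &maxVertexImages);
    return maxVertexImages > 0; // zero means "unsupported"; it is not an error
}
[/CODE]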

But features like the above can’t really be expressed as “limits”. The best way OpenGL has to express such optional features is as ARB/KHR extensions. And there is nothing wrong with using an extension, either conditionally or by relying on it outright.

Maybe the best solution to these problems would be setting up a new OpenGL 5.0 :). (This has already happened in the past: OpenGL 1.x -> OpenGL 2.0, OpenGL 3.x -> OpenGL 4.0.)
It would distinguish the newer generation of GPU hardware from the older ones.
It would support:

  1. ASTC compression,
  2. a standard binary format for shaders and programs,
  3. bindless textures and buffers,
  4. some kind of “GL_NV_command_list”,
  5. a good implementation of GL_KHR_no_error,
  6. and somewhat bigger multithreaded support.

IMHO this would push OpenGL forward, and it would still be easy to learn (compared to Vulkan) and can be as efficient as Vulkan.

It would distinguish the newer generation of GPU hardware from the older ones.

You’re making an assumption that all “newer generation” hardware would be able to support all of that.

No Intel hardware supports bindless textures, and no AMD hardware supports non-ARB bindless stuff. There’s a reason why Vulkan doesn’t do things the bindless way, that it uses descriptor sets rather than just arbitrary numbers you throw around. It’s a better abstraction, one which can be implemented across lots of hardware while still providing the featureset required (access to arbitrary amounts of stuff in a shader).

The only functionality missing from that is NVIDIA’s passion for passing GPU memory pointers around. Note that Vulkan doesn’t let you do that, despite being lower-level than OpenGL.

Bindless is not a good hardware abstraction.

As for a variation of NV_command_list… why? If you’re willing to go through that much trouble, you may as well just use Vulkan. It’d be a much cleaner API, and you’d get more functionality out of it in the long run.

can be as efficient as Vulkan

No. No it can’t.

I’ve been watching the evolution of GPU hardware for quite a long time, and my instinct tells me the future could look like this.

  1. ASTC compression -> the newest mobile GPUs have it, so I’m pretty sure desktop GPUs will have it too. It’s too good (for now) not to implement.
  2. A standard binary format for shaders and programs -> SPIR-V is a good candidate for this.
  3. You are right that pointers are not perfect for that. Maybe OpenGL needs descriptor sets too. Maybe something else; that’s why this is only a suggestion.
  4. Why NV_command_list? I think looking for ways of “Approaching Zero Driver Overhead in OpenGL” is a good idea. Multi-draw indirect does not solve all the problems.
    And the solution proposed by NVIDIA is worth considering. Finding the best way to efficiently pack state changes behind a clean API is, IMHO, a new goal for OpenGL.
    5) and 6)
    When I said OpenGL can be as efficient as Vulkan, I meant I want OpenGL to be as efficient as Vulkan (of course, in a single-threaded environment only :) ).
    And if the driver is not the bottleneck, I think it is possible. That’s why GL_KHR_no_error is needed.

Much of what you list here isn’t actually anything to do with hardware though; what you’re talking about is evolution of a software abstraction, and you’re requesting to move the OpenGL software abstraction so close to Vulkan that it may as well just be Vulkan and be done with it.

How OpenGL should evolve in a post-Vulkan world is a valid topic for discussion of course, and some may even make a case that it’s more useful for OpenGL to evolve towards an even higher-level abstraction than it currently is.

OpenGL is in a strange place, abstraction-wise. Its abstraction is not a good fit for modern hardware from a performance standpoint, so it doesn’t really work there. But abstracting things more branches out into the realm of scene graphs, and there are innumerable ways of designing a scene graph. OpenGL is as high-level as you can reasonably get without going there.

The only real advantage OpenGL’s abstraction has is that it strikes an interesting balance between performance and ease-of-use. Handling synchronization as well as whatever gymnastics are needed in order to change framebuffers willy-nilly and so forth. You can get reasonable performance out of OpenGL as well as access to good hardware features, but without a lot of the explicit work that Vulkan requires.

And yet, engines like Unity, Unreal, and the like give you all kinds of power while hiding the details of APIs like Vulkan, D3D12, etc. They are easier to use than OpenGL, and they don’t really lose performance. But at the same time, they do lose the generality that OpenGL provides. If you’re not making a game, if it’s just a graphics demo or whatever, then there’s a lot that those engines do which you won’t care about.

But on the other hand, this “strange place” can be a good thing for OpenGL. More cross-platform API competition can be good for OpenGL. So let’s take what is best in Vulkan and do it in “OpenGL style” :)
OpenGL is not only for game engines. I’m thinking now about visual engines for simulations. Those engines will not quickly leave OpenGL for Vulkan. Those products have a lot of inertia.
We know that OpenGL has some drawbacks, and my proposals are focused on one of them: driver overhead. “GPU pointers” aren’t perfect, but they are already in use in some parts of OpenGL (bindless textures, persistent mapping), and they are doing a good job.

The new solutions would also help the future development of WebGL. It’s also an interesting, future-oriented field.

So let’s take what is best in Vulkan and do it in “OpenGL style”

But that’s anathema to “OpenGL style”.

For example, OpenGL is all about changing state. Vulkan is all about you not changing state. You can’t simultaneously have both. Not in a coherent API.

What is best in Vulkan is that it’s Vulkan. By taking it, you will be losing what’s best in OpenGL. Making OpenGL act like Vulkan would simply be making Vulkan with a crappy API.

I’m thinking now about visual engines for simulations. Those engines will not quickly leave OpenGL for Vulkan. Those products have a lot of inertia.

If they have “a lot of inertia”, then they have sufficient inertia that they’re not going to be willing to do things the NV_command_list way either. If they’re willing to write code in the Vulkan-through-OpenGL way, then they’ll probably be willing to just write it with Vulkan.

“GPU pointers” aren’t perfect, but they are already in use in some parts of OpenGL (bindless textures, persistent mapping)

Bindless texture handles are not pointers. They are arbitrary 64-bit values which the implementation is able to use to understand texture data. That is all they are. They might be pointers, but they might not.
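
A sketch of the ARB_bindless_texture calls, to make the point concrete; the handle is minted by the driver and is opaque to the application:

[CODE]
// Sketch: a bindless handle is an opaque 64-bit value, not a pointer.
GLuint64 makeResidentHandle(GLuint texture)
{
    GLuint64 handle = glGetTextureHandleARB(texture); // opaque 64-bit value
    glMakeTextureHandleResidentARB(handle);           // must be resident to use
    return handle; // feed to glUniformHandleui64ARB or store in a buffer
}
[/CODE]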

My 2c:

If the ARB had thought that they could do Vulkan-like things in OpenGL, they wouldn’t have created Vulkan in the first place and instead we’d have GL5.0. A lot of smart people collaborated on Vulkan, and it looks like a great API, so I trust their judgement.

People who want the ease of GL and the performance of Vulkan probably want a third-party API built upon Vulkan that handles all the low-level stuff for them (memory allocation, ref counting, etc.).

[QUOTE=Alfonse Reinheart;1282897]But that’s anathema to “OpenGL style”.
For example, OpenGL is all about changing state. Vulkan is all about you not changing state. You can’t simultaneously have both. Not in a coherent API.[/QUOTE]

Of course, what I mean by “OpenGL style” is finding a solution that helps reduce the cost of state changes.
I remember the “display lists” days (by the way, for me they have a lot in common with Vulkan’s recorded command buffers),
and when I last checked, rendering with old-style display lists was faster than all other draw commands (pre multi-draw indirect).
Why? I’m not a driver programmer, but I think the driver’s validation cost was lower for them than for other draw commands.
That’s why NV_command_list (or a similar solution) that helps pack state changes would be very good for OpenGL.
I think this is the best place to talk about it and exchange ideas, and maybe somebody from the Khronos Group will read it and find the best solution :)

[QUOTE=Alfonse Reinheart;1282897]
If they have “a lot of inertia”, then they have sufficient inertia that they’re not going to be willing to do things the NV_command_list way either. If they’re willing to write code in the Vulkan-through-OpenGL way, then they’ll probably be willing to just write it with Vulkan.[/QUOTE]

I would like to disagree with you on this point. For them, changing to NV_command_list would be “warp speed” faster than changing to Vulkan.

Of course, they may or may not be pointers. That’s why I wrote “GPU pointers” in quotes. What I want to talk about is some mechanism which helps the driver with validation. 64-bit values can be a starting point.

and when I last checked, rendering with old-style display lists was faster than all other draw commands (pre multi-draw indirect).

Only because NVIDIA spent time and effort optimizing that rendering path. For other implementations, display list rendering wasn’t a particularly great performance improvement.

Also, even NVIDIA only optimized draw calls from display lists, not arbitrary state changes.

Why? I’m not a driver programmer, but I think the driver’s validation cost was lower for them than for other draw commands.

No. It was because NVIDIA’s display list implementation absorbed the vertex data from the client-side arrays and put them in GPU memory. It could select an optimal format for each vertex attribute array. And since the GPU memory could never be modified (unlike buffer objects), it could put these vertices in the most optimal place.

In those days, validation costs were, while not unimportant, not the most painful part of sending a rendering call.
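
A sketch of the path being described: the client-array data is dereferenced and captured once, at compile time, which is what gave the driver the freedom to reformat and relocate it:

[CODE]
// Sketch: legacy display-list compilation. glDrawArrays dereferences the
// client-side array at glNewList/glEndList time, so the driver captures a
// private, immutable copy it can lay out however it likes.
GLuint buildList(const GLfloat* positions, GLsizei vertexCount)
{
    GLuint list = glGenLists(1);
    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(3, GL_FLOAT, 0, positions);

    glNewList(list, GL_COMPILE);
    glDrawArrays(GL_TRIANGLES, 0, vertexCount); // vertex data captured here
    glEndList();

    glDisableClientState(GL_VERTEX_ARRAY);
    return list; // replay later with glCallList(list)
}
[/CODE]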

I would like to disagree with you on this point. For them, changing to NV_command_list would be “warp speed” faster than changing to Vulkan.

… why? The only things you gain over Vulkan with that are:

  1. Nicer texture loading, without having to explicitly convert formats or stage buffers.

  2. Implicit synchronization.

Equally importantly, it’s unclear if such applications need that performance. Granted, mobile apps prove that everyone needs better CPU performance (for lower battery drain). But outside of that, do “visual engines for simulations” really need such performance?

And if they did, wouldn’t it be much easier for them to just buy a graphics engine built on Vulkan?

It’s great to hear that NVIDIA is doing something to improve the performance of OpenGL, even if it is something as small and old as display lists. This is a big plus for NVIDIA.

Yes, you are right. We need something, some API, which supports not only draw calls but *arbitrary state changes* too.

Every AAA game nowadays gets the same treatment too :)

Those applications need performance too. For example, they can’t cheat when rendering some visual effect (the way games do).
Those engines have been developed since OpenGL 1.0 :) and many, many man-hours were put into extending their capabilities and optimizing them. So now drop all that code and rewrite it on the Vulkan API? Management will not be happy about this.
Correct me if I’m wrong. You want to say: “Sorry guys, OpenGL will not bring new features which help with performance. Go use Vulkan.”

[QUOTE=mhagain;1282893]Much of what you list here isn’t actually anything to do with hardware though; what you’re talking about is evolution of a software abstraction, and you’re requesting to move the OpenGL software abstraction so close to Vulkan that it may as well just be Vulkan and be done with it.

How OpenGL should evolve in a post-Vulkan world is a valid topic for discussion of course, and some may even make a case that it’s more useful for OpenGL to evolve towards an even higher-level abstraction than it currently is.[/QUOTE]

Interesting remark about Vulkan and OpenGL.
Making OpenGL into Vulkan.

[QUOTE=Alfonse Reinheart;1282895]OpenGL is in a strange place, abstraction-wise. Its abstraction is not a good fit for modern hardware from a performance standpoint, so it doesn’t really work there. But abstracting things more branches out into the realm of scene graphs, and there are innumerable ways of designing a scene graph. OpenGL is as high-level as you can reasonably get without going there.

The only real advantage OpenGL’s abstraction has is that it strikes an interesting balance between performance and ease-of-use. Handling synchronization as well as whatever gymnastics are needed in order to change framebuffers willy-nilly and so forth. You can get reasonable performance out of OpenGL as well as access to good hardware features, but without a lot of the explicit work that Vulkan requires.

And yet, engines like Unity, Unreal, and the like give you all kinds of power while hiding the details of APIs like Vulkan, D3D12, etc. They are easier to use than OpenGL, and they don’t really lose performance. But at the same time, they do lose the generality that OpenGL provides. If you’re not making a game, if it’s just a graphics demo or whatever, then there’s a lot that those engines do which you won’t care about.[/QUOTE]

Abstraction-wise, this position OpenGL holds brings up some interesting questions about abstractions and layers.
What is your opinion about making OpenGL a layer on top of Vulkan?
Using SPIR-V and other Vulkan features, such as Vulkan’s resource description stuff.
Of course, OpenGL would need a big rewrite.
Such a big rewrite wouldn’t be out this year, or maybe not even the following year, if it hypothetically started this year (in the summer).
Such a big change would call for a major version change.
An OpenGL 5.0 release.

First and foremost, thanks for the feedback, the constructive criticism, and the insight into hardware.

About bringing in only gl_InstanceIndex:
It’s a great idea to bring in gl_InstanceIndex functionality as a good discrete jump in functionality for an OpenGL release, if the whole shader draw parameters extension/functionality can’t be added yet.

[QUOTE=Alfonse Reinheart;1282835]

From where have you heard of this summer release? Is it scheduled for SIGGRAPH?[/QUOTE]

The summer release rumours are mentioned in an article on phoronix.com, which also mentions a SIGGRAPH timeslot lacking a subject description:
New Vulkan Slides; Wondering If “OpenGL 4.6” Will Be Out This Summer
http://www.phoronix.com/scan.php?page=news_item&px=Vulkan-DevDay-2016-Slides

It’s great to hear that NVIDIA is doing something to improve the performance of OpenGL, even if it is something as small and old as display lists. This is a big plus for NVIDIA.

That was done last decade. It’s not something new.

Those applications need performance too. For example, they can’t cheat when rendering some visual effect (the way games do).

Will they care about the CPU overhead of render calls when they’re trying to do something highly complex? Or will they simply consider it the cost of doing business?

Or will they switch to a graphics engine that internally uses Vulkan?

Those engines have been developed since OpenGL 1.0, and many, many man-hours were put into extending their capabilities and optimizing them. So now drop all that code and rewrite it on the Vulkan API? Management will not be happy about this.

You say that as if using an NV_command_list-style API would be any less of a rewrite.

We’ve seen what happens to an API when you try to evolve it in line with people who aren’t willing to rewrite their code. You get the horrible nonsense of glTexImage allocating mipmap levels instead of full texture storage. You get glVertexAttribPointer relying on some parameter that’s specified by glBindBuffer instead of by the function itself. And any number of other stupidities of OpenGL.

Also, people who were unwilling to rewrite their code were the ones who were responsible for the failure of Longs Peak. I see no reason to cater to them. If they want to stick with their slow API, so be it. But if they want to get into the 21st century with the rest of us, they should use Vulkan.

You want to say: “Sorry guys, OpenGL will not bring new features which help with performance. Go use Vulkan.”

Not exactly. I’m saying that OpenGL should not try to become Vulkan. People who need Vulkan should use Vulkan.

Would adding descriptor sets to OpenGL be a good idea? Maybe. It would allow more bindless-like functionality in a hardware-neutral way. But such a system would still provide OpenGL’s normal validation and implicit synchronization mechanisms. So would it be as fast as Vulkan? No. Something similar could be said for push-constants and dynamic uniform/SSBO binding points. Good and useful features that improve performance.

But they wouldn’t match Vulkan’s performance.
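
For reference, a sketch of the Vulkan push-constant mechanism being alluded to (assumes the pipeline layout declares a matching push-constant range):

[CODE]
// Sketch: push constants are a small blob written straight into the command
// buffer; no buffer object and no binding point are involved.
#include <vulkan/vulkan.h>

struct DrawParams { float tint[4]; };

void recordDraw(VkCommandBuffer cmd, VkPipelineLayout layout,
                const DrawParams& params)
{
    vkCmdPushConstants(cmd, layout, VK_SHADER_STAGE_FRAGMENT_BIT,
                       0, sizeof(params), &params);
    vkCmdDraw(cmd, 3, 1, 0, 0); // one triangle, tinted via the pushed data
}
[/CODE]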

The other thing you forget is that Vulkan’s primary advantage in CPU performance is the ability to thread command buffer construction. Oh, validation and synchronization matter. But not nearly as much as being able to create command buffers from different threads.
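
A sketch of that advantage: each thread records into its own pool (command pools are externally synchronized), and only the final submit is serialized. Error handling and pool cleanup are omitted for brevity:

[CODE]
// Sketch: threaded command-buffer recording, one VkCommandPool per thread.
#include <vulkan/vulkan.h>
#include <thread>
#include <vector>

void recordInParallel(VkDevice device, uint32_t queueFamily,
                      std::vector<VkCommandBuffer>& out, unsigned threadCount)
{
    out.assign(threadCount, VK_NULL_HANDLE);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < threadCount; ++t) {
        workers.emplace_back([&, t] {
            VkCommandPoolCreateInfo poolInfo = {};
            poolInfo.sType = VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO;
            poolInfo.queueFamilyIndex = queueFamily;
            VkCommandPool pool = VK_NULL_HANDLE;
            vkCreateCommandPool(device, &poolInfo, NULL, &pool); // per thread

            VkCommandBufferAllocateInfo alloc = {};
            alloc.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO;
            alloc.commandPool = pool;
            alloc.level = VK_COMMAND_BUFFER_LEVEL_PRIMARY;
            alloc.commandBufferCount = 1;
            vkAllocateCommandBuffers(device, &alloc, &out[t]);

            VkCommandBufferBeginInfo begin = {};
            begin.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
            vkBeginCommandBuffer(out[t], &begin);
            // ... record this thread's share of the frame ...
            vkEndCommandBuffer(out[t]);
        });
    }
    for (std::thread& w : workers) w.join();
    // All command buffers are then submitted together from a single thread.
}
[/CODE]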

What is your opinion about making OpenGL a layer on top of Vulkan?

The more I look at Vulkan’s API, the worse I see that being.

Despite Vulkan being lower-level, it is still an abstraction of the hardware. And a very different one from OpenGL. Consider pipeline objects and render passes. If you built an OpenGL implementation on top of Vulkan, you would have to develop a complex system of caching for pipelines and render passes. Whereas if you built your OpenGL implementation on top of the actual hardware directly, you could probably simplify a lot of things, because you can make assumptions about what that specific hardware would be doing. With Vulkan, you have to cache big pipeline objects. When implementing directly to hardware, if a render pass changes, you won’t necessarily rebuild the shader code. Or if you do, it will be for hardware-specific reasons.

At best, what you would have is an API that has a huge number of performance traps, but can work perhaps slightly faster if you do everything exactly like the underlying Vulkan implementation wants.

There is a way to build a safer and easier-to-use API on top of Vulkan. But it wouldn’t be as free-wheeling as OpenGL.

[QUOTE=Alfonse Reinheart;1282907]That was done last decade. It’s not something new.
[/QUOTE]
So this means that only NVIDIA has cared about the performance of OpenGL since last decade? :)

The way I see this engine rewrite: it is easier to pack state changes into an NV_command_list-style API (and not call lots of OpenGL API functions) and keep all the code for FBOs, textures, and other buffers, than to make a new engine based on Vulkan, which would take quite some time. And you know, time = money :).

Like you, I’m sad and disappointed about the failure of Longs Peak. That is a big loss for OpenGL. But do you think that OpenGL 5.0 can’t do what the idea behind Longs Peak was?
If it depended on me, I would be willing to switch to Vulkan (once it is more stable and mature).
For the simulator industry, one thing would convince managers to put their money into switching engines to Vulkan: multi-device and multi-display support done a magnitude better than on OpenGL. I hope Khronos won’t forget to think about this subject.