OpenGL 4.6 request list



Gedolo2
05-22-2016, 02:36 PM
Please add the following features to OpenGL 4.6

Add OpenGL ES 3.2 context creation functionality to OpenGL 4.6 core.

Add the extensions from OpenGL ES 3.2 to core OpenGL.
Make OpenGL a superset of OpenGL ES again.

Make ASTC support mandatory in core OpenGL and S3TC optional (or, what I would prefer: deprecate/remove S3TC).
Possibly add one of the following ASTC extensions:
OES_texture_compression_astc
https://www.khronos.org/registry/gles/extensions/OES/OES_texture_compression_astc.txt
texture_compression_astc_hdr
https://www.opengl.org/registry/specs/KHR/texture_compression_astc_hdr.txt
Maybe make a full and portable profile for ASTC with different texture limits to serve the full spectrum of devices?
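
For illustration, this is roughly what using ASTC looks like today where KHR_texture_compression_astc_ldr is available (a minimal sketch; the texture object, dimensions, and payload variables are hypothetical placeholders):

```c
/* Sketch: uploading pre-compressed ASTC 4x4 data via the KHR extension
 * enums. Assumes a GL context and headers (glext.h) that define them. */
void upload_astc_level0(GLuint tex, GLsizei width, GLsizei height,
                        const void *astc_data, GLsizei astc_size)
{
    glBindTexture(GL_TEXTURE_2D, tex);
    glCompressedTexImage2D(GL_TEXTURE_2D, 0,
                           GL_COMPRESSED_RGBA_ASTC_4x4_KHR,
                           width, height, 0, astc_size, astc_data);
}
```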

Put shader draw parameters in core.
https://www.opengl.org/registry/specs/ARB/shader_draw_parameters.txt

Allow more varied names in core for low-component texture formats, especially single- and dual-channel ones.
Allow not only R but G, B and A as well.
And allow another letter for differentiating between colour components and other miscellaneous components (maybe C for component or channel, maybe another letter).
The rationale for this is: it makes detecting errors much easier for programmers, allowing them to see much more easily whether their code is using the components correctly or not.
Do not mandate how the component names are used, to avoid them becoming a programming limitation instead of a more expressive way to write code.



If you introduce an extension for async shader compilation, please put "async" somewhere in the name, and have both compilation and loading of shaders done asynchronously by the extension.
Perhaps the name could be GL_ARB_parallel_async_shader_compile for OpenGL, also used in OpenGL ES.
Not GL_ARB_parallel_shader_compile.
If it provides async compilation, that is a big feature/selling point; such a feature needs to be advertised in the name.
Unifying this with how async work is done in Vulkan might be a good idea.
It seems only minor adjustments would be needed to the following extension:
https://www.opengl.org/registry/specs/ARB/parallel_shader_compile.txt
Also have the specification provide plenty of information about how it interacts with a shader cache: what a shader cache is, what it does and allows, and mention shader caches a few more times in the description of the extension.
Do make sure there is good information about what async shader compilation and loading allow, especially in reducing lag spikes:
increasing predictability and performance while reducing lag and stutter.


Do NOT put in features from Vulkan YET.
(The following is not applicable to compatibility contexts between OpenGL and Vulkan.)
It's too early, for several reasons:
- apparently a new Vulkan release this summer
- the Vulkan spec churn (a new documentation release every week)
- the resulting spec churn from the new Vulkan release this summer
- getting feedback from developers about desired feature sets
Vulkan really is not ready yet to base individual OpenGL features on.
Once more time has passed it will be:
once the documentation becomes somewhat more stable (maybe as early as next year, 2017), once Vulkan's features have crystallized into feature sets, and once the new release of Vulkan has happened.
After those things have happened, it will be the right time to start doing feature cross-pollination between the two APIs.
Also, don't put in SPIR-V when there is a new release coming up this summer.
It makes little sense to start copying features between both APIs yet.
Especially since, with Vulkan, there will be feedback on what features developers want through the process of determining feature sets. Knowing which features are popular will allow the spec makers at Khronos to optimally choose which features to copy to other APIs.

Alfonse Reinheart
06-04-2016, 07:45 PM
Add OpenGL ES 3.2 context creation functionality to OpenGL 4.6 core.

OpenGL doesn't define any "context creation functionality". So it's not clear what that would mean.

But in any case, you can use the EXT_create_context_es_profile extension to create any version of OpenGL ES contexts, where supported.
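
For illustration, a minimal sketch of that path under GLX, assuming a `dpy` and `fbconfig` are already set up, the extension is advertised, and `glXCreateContextAttribsARB` has been loaded:

```c
/* Request an OpenGL ES 3.2 context from a desktop GL driver via
 * GLX_EXT_create_context_es_profile. */
int attribs[] = {
    GLX_CONTEXT_MAJOR_VERSION_ARB, 3,
    GLX_CONTEXT_MINOR_VERSION_ARB, 2,
    GLX_CONTEXT_PROFILE_MASK_ARB,  GLX_CONTEXT_ES_PROFILE_BIT_EXT,
    None
};
GLXContext ctx = glXCreateContextAttribsARB(dpy, fbconfig, NULL, True, attribs);
```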


Make OpenGL a superset of OpenGL ES again.

No version of desktop OpenGL was ever a superset of any version of OpenGL ES. There was always code you could write in ES that would not do the same thing on desktop GL.


Make ASTC support mandatory in core OpenGL

That's not practical. ASTC is only supported by a very small set of hardware (http://opengl.gpuinfo.org/gl_listreports.php?listreportsbyextension=GL_KHR_texture_compression_astc_ldr). Unless you want nobody to implement GL 4.6.

ASTC is not a real thing yet.


S3TC optional (or, what I would prefer: deprecate/remove S3TC).

First, good news: S3TC was never adopted into core OpenGL. It has always been an extension.

FYI: RGTC isn't S3TC. It's similar to it, but it doesn't have the patent issues. Which is why RGTC is core and S3TC is not.

Second, even if it was mandatory, why get rid of perfectly valid, functional, and useful technology? It's not like IHVs will be ripping out support for it from their texture fetch units. Not so long as applications still use it.


Put shader draw parameters in core.

Intel doesn't support it (http://opengl.gpuinfo.org/gl_listreports.php?listreportsbyextension=GL_ARB_shader_draw_parameters). It would be better to just bring in the `gl_InstanceIndex` functionality from khr_vk_glsl. That's the most important part of draw parameters that OpenGL doesn't support, and it's something we know Intel can support.
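
To illustrate the gap (a sketch; GLSL shown as a C string): desktop GL's `gl_InstanceID` does not include the base instance of the draw, while Vulkan's `gl_InstanceIndex` does. Where ARB_shader_draw_parameters exists, the Vulkan behaviour can be reconstructed:

```c
/* Vertex shader sketch: emulating Vulkan's gl_InstanceIndex on desktop GL. */
static const char *vs_source =
    "#version 450\n"
    "#extension GL_ARB_shader_draw_parameters : require\n"
    "flat out int instance_index;\n"
    "void main() {\n"
    "    /* Equivalent of Vulkan's gl_InstanceIndex: */\n"
    "    instance_index = gl_BaseInstanceARB + gl_InstanceID;\n"
    "    gl_Position = vec4(0.0);\n"
    "}\n";
```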


And have both compilation and loading of shaders done asynchronously by the extension.

What exactly does that mean? Loading shader code is the job of the application, not OpenGL. It can't make that asynchronous.


Unifying this with how async work is done in Vulkan might be a good idea.

That would be the opposite of that extension. In Vulkan, there is no asynchronous shader compilation support. What there is in Vulkan are two things:

1. When you call `vkCreateShaderModule`, you are guaranteed that the compilation is finished (successfully or with failure) by the time it returns. Similarly, when you call `vkCreateGraphicsPipelines`, you are guaranteed that the compilation is finished (successfully or with failure) by the time it returns.

2. Both of those calls are fully reentrant. You can call them on the same `VkDevice` from multiple threads. You can even have multiple threads all using the same pipeline cache without synchronization.
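
A minimal sketch of point 2 (all handles assumed to be created elsewhere): worker threads may build pipelines on the same device, sharing one pipeline cache, with no locking by the application:

```c
#include <vulkan/vulkan.h>
#include <pthread.h>

typedef struct {
    VkDevice device;
    VkPipelineCache cache;   /* shared by all workers, no mutex needed */
    const VkGraphicsPipelineCreateInfo *info;
    VkPipeline pipeline;
} PipelineJob;

static void *build_pipeline(void *arg)
{
    PipelineJob *job = arg;
    /* Reentrant per the Vulkan spec; safe to call concurrently. */
    vkCreateGraphicsPipelines(job->device, job->cache, 1,
                              job->info, NULL, &job->pipeline);
    return NULL;
}
```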

Vulkan doesn't make shader compilation parallel or asynchronous (FYI: those words mean the same thing in this context). It simply provides you with the tools to compile shaders asynchronously.

By contrast, parallel_shader_compile provides you with the tools to realize that the OpenGL implementation may compile shaders in parallel, and it gives you the tools to stop interfering in that process (by asking if there was an error before the compile has finished).

It's two different models for two very different APIs. In one case, the API is asynchronous; in the other case, the API is reentrant.
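
For comparison, the OpenGL model is a polling one. A sketch of using ARB_parallel_shader_compile (names from the extension spec; availability must be checked at runtime):

```c
/* Hint that the driver may use up to 4 background compiler threads. */
glMaxShaderCompilerThreadsARB(4);

glCompileShader(shader);   /* may now return before compilation finishes */

/* ... do other work, then poll instead of blocking: */
GLint done = GL_FALSE;
glGetShaderiv(shader, GL_COMPLETION_STATUS_ARB, &done);
if (done) {
    GLint ok;
    glGetShaderiv(shader, GL_COMPILE_STATUS, &ok);  /* no longer stalls */
}
```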


Also have the specification provide plenty of information about how it interacts with a shader cache. Put in the specification plenty of information about shader caches. What a shader cache is, what it does, allows and mention shader cache a few times more in the description of the extension.

That is not how a specification works. A specification defines behavior, not how it gets implemented.

Vulkan talks about a pipeline cache because it is an explicit object which is part of the Vulkan system. It's part of the API; it can't not talk about it.

OpenGL has no similar construct. If the implementation uses a cache when compiling shaders, that's not something OpenGL can explain, since it does not affect the behavior of the system or its interface. It only would affect performance.

It's an implementation detail in OpenGL.


- apparently a new Vulkan release this summer

Where did you hear about this summer release? Is it scheduled for SIGGRAPH?

Alfonse Reinheart
06-07-2016, 08:14 AM
That's not practical. ASTC is only supported by a very small set of hardware (http://opengl.gpuinfo.org/gl_listreports.php?listreportsbyextension=GL_KHR_texture_compression_astc_ldr). Unless you want nobody to implement GL 4.6.

To add to this, the Vulkan database shows that the only desktop hardware that supports ASTC is Intel (http://vulkan.gpuinfo.org/listreports.php?feature=textureCompressionASTC_LDR). Even for NVIDIA, only their mobile Tegra line supports ASTC. This could be due to immature Vulkan drivers, but it does match up with the OpenGL support.

So while ASTC may be the future, it is definitely not the present.

mhagain
06-07-2016, 10:11 AM
It's also worth noting that this is what OpenGL specs used to do in the past: define a software interface with little or no consideration to how hardware can support it (or even whether hardware supports it). That approach manifestly failed; the upshot was that OpenGL implementations tended to end up with functionality that was software-emulated, but it was not queryable if that was the case, so you could find yourself rudely thrown back to software emulation and single-digit framerates. That's OK if you're in a scenario where "everything must work and performance is secondary", but that's not always the case with everybody, and those for whom it wasn't the case were poorly served by OpenGL.

ASTC is certainly feasible if all of the hardware vendors come onboard and implement support in conjunction with a future evolution of the spec. But that should be a requirement - the spec cannot evolve in isolation.

Far more interesting (and useful) would be to bring anisotropic filtering into core. The 20-year term from the priority date has now expired (http://www.google.com/patents/US6005582), so it should now be doable.
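
For reference, the existing extension is trivial to use, which is part of why core adoption seems overdue (a sketch using the EXT enums; the texture is assumed bound):

```c
/* EXT_texture_filter_anisotropic: clamp to the implementation's limit. */
GLfloat max_aniso = 1.0f;
glGetFloatv(GL_MAX_TEXTURE_MAX_ANISOTROPY_EXT, &max_aniso);
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MAX_ANISOTROPY_EXT, max_aniso);
```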

Alfonse Reinheart
06-07-2016, 10:34 AM
It should also be noted that other features are dealt with in the same way. Things are only added to core when a broad base of hardware can support it. In the case of OpenGL 4.x, something only becomes core if all current 4.x hardware can support it. Features of note which do not have such a broad base of support are:

* ARB_fragment_shader_interlock: No AMD hardware support.
* KHR_blend_equation_advanced: No AMD hardware support.
* ARB_bindless_texture: No Intel/pre-GCN AMD hardware support.
* ARB_sparse_texture/buffer: No Intel/pre-GCN AMD hardware support.

What OpenGL really lacks is Vulkan's "feature" concept, which is effectively functionality that is defined in the core specification but for which support is not required. OpenGL can express a form of this by using implementation-defined limits. For example, image load/store and SSBOs are only required to be supported for fragment and compute shaders. Other stages can express support by having non-zero limits.
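
A sketch of that "limits as optional features" pattern in practice:

```c
/* Vertex-stage image load/store and SSBO support are optional in core GL;
 * an implementation advertises them through non-zero limits. */
GLint vs_images = 0, vs_ssbos = 0;
glGetIntegerv(GL_MAX_VERTEX_IMAGE_UNIFORMS, &vs_images);
glGetIntegerv(GL_MAX_VERTEX_SHADER_STORAGE_BLOCKS, &vs_ssbos);
if (vs_images == 0) {
    /* No image load/store in vertex shaders on this implementation. */
}
```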

But features like the above can't really be expressed as "limits". The best way OpenGL has to express such optional features is as ARB/KHR extensions. And there is nothing wrong with using an extension, either conditionally or relying on it.

jurgus
06-08-2016, 02:02 AM
Maybe the best solution to these problems would be setting up a new OpenGL 5.0 :). (This has already happened in the past: OpenGL 1.x -> OpenGL 2.0, OpenGL 3.x -> OpenGL 4.0.)
It would distinguish the newer generation of GPU hardware from older ones.
It would include support for:
1) ASTC compression,
2) a standard binary format for shaders and programs,
3) bindless textures and buffers,
4) some kind of "GL_NV_command_list",
5) a good implementation of GL_KHR_no_error,
6) and somewhat bigger multithreading support.
IMHO this would push OpenGL forward, and it would still be easy to learn (compared to Vulkan) while being as efficient as Vulkan.

Alfonse Reinheart
06-08-2016, 07:07 AM
It would distinguish the newer generation of GPU hardware from older ones.

You're making an assumption that all "newer generation" hardware would be able to support all of that.

No Intel hardware supports bindless textures, and no AMD hardware supports non-ARB bindless stuff. There's a reason why Vulkan doesn't do things the bindless way, that it uses descriptor sets rather than just arbitrary numbers you throw around. It's a better abstraction, one which can be implemented across lots of hardware while still providing the featureset required (access to arbitrary amounts of stuff in a shader).

The only functionality missing from that is NVIDIA's passion for passing GPU memory pointers around. Note that Vulkan doesn't let you do that, despite being lower-level than OpenGL.

Bindless is not a good hardware abstraction.

As for a variation of NV_command_list... why? If you're willing to go through that much trouble, you may as well just use Vulkan. It'd be a much cleaner API, and you'd get more functionality out of it in the long run.


can be as efficient as Vulkan

No. No it can't.

jurgus
06-08-2016, 08:33 AM
I've been watching the evolution of GPU hardware for quite a long time, and my instinct tells me what the future can look like.
1) ASTC compression -> the newest mobile GPUs have it, so I'm pretty sure that desktop GPUs will have it too. It's too good (for now) not to implement.
2) A standard binary format for shaders and programs -> SPIR-V is a good candidate for this.
3) You are right that pointers are not perfect for that. Maybe OpenGL needs descriptor sets too. Maybe something else; that's why this is only a suggestion.
4) Why NV_command_list? I think looking for a solution for "Approaching Zero Driver Overhead in OpenGL" is a good idea. Multi-draw indirect does not solve all problems.
And the solution proposed by NVIDIA is worth considering. Finding the best way to efficiently pack state changes behind a clean API is IMHO a new goal for OpenGL.
5) and 6)
When I said OpenGL can be as efficient as Vulkan, I meant I want OpenGL to be as efficient as Vulkan (of course in a single-threaded environment only :) ).
And if the driver is not the bottleneck, I think it is possible. So GL_KHR_no_error is needed.

mhagain
06-08-2016, 10:38 AM
I've been watching the evolution of GPU hardware for quite a long time, and my instinct tells me what the future can look like.

Much of what you list here isn't actually anything to do with hardware though; what you're talking about is evolution of a software abstraction, and you're requesting to move the OpenGL software abstraction so close to Vulkan that it may as well just be Vulkan and be done with it.

How OpenGL should evolve in a post-Vulkan world is a valid topic for discussion of course, and some may even make a case that it's more useful for OpenGL to evolve towards an even higher-level abstraction than it currently is.

Alfonse Reinheart
06-08-2016, 11:44 AM
How OpenGL should evolve in a post-Vulkan world is a valid topic for discussion of course, and some may even make a case that it's more useful for OpenGL to evolve towards an even higher-level abstraction than it currently is.

OpenGL is in a strange place, abstraction-wise. Its abstraction is not a good fit for modern hardware from a performance standpoint, so it doesn't really work there. But abstracting things more branches out into the realm of scene graphs, and there are innumerable ways of designing a scene graph. OpenGL is as high-level as you can reasonably get without going there.

The only real advantage OpenGL's abstraction has is that it strikes an interesting balance between performance and ease-of-use. Handling synchronization as well as whatever gymnastics are needed in order to change framebuffers willy-nilly and so forth. You can get reasonable performance out of OpenGL as well as access to good hardware features, but without a lot of the explicit work that Vulkan requires.

And yet, engines like Unity, Unreal, and the like give you all kinds of power while hiding the details of APIs like Vulkan, D3D12, etc. They are easier to use than OpenGL, and they don't really lose performance. But at the same time, they do lose the generality that OpenGL provides. If you're not making a game, if it's just a graphics demo or whatever, then there's a lot that those engines do which you won't care about.

jurgus
06-08-2016, 12:49 PM
OpenGL is in a strange place, abstraction-wise.

But on the other hand, this "strange place" can be a good thing for OpenGL. More cross-platform API competition can be good for OpenGL. So let's take what is best in Vulkan and do it in "OpenGL style" :)
OpenGL is not only for game engines. I'm thinking now about visual engines for simulations. Those engines will not quickly leave OpenGL for Vulkan. Those products have a lot of inertia.
We know that OpenGL has some drawbacks, and my proposals are focused on one of them: driver overhead. "GPU pointers" aren't perfect but are already in use in some parts of OpenGL (bindless textures, persistent mapping), and they are doing a good job.

The new solutions would also help the future development of WebGL. It's also an interesting future-oriented field.

Alfonse Reinheart
06-08-2016, 02:43 PM
So let's take what is best in Vulkan and do it in "OpenGL style"

But that's anathema to "OpenGL style".

For example, OpenGL is all about changing state. Vulkan is all about you not changing state. You can't simultaneously have both. Not in a coherent API.

What is best in Vulkan is that it's Vulkan. By taking it, you will be losing what's best in OpenGL. Making OpenGL act like Vulkan would simply be making Vulkan with a crappy API.


I'm thinking now about visual engines for simulations. Those engines will not quickly leave OpenGL for Vulkan. Those products have a lot of inertia.

If they have "a lot of inertia", then they have sufficient inertia that they're not going to be willing to do things the NV_command_list way either. If they're willing to write code in the Vulkan-through-OpenGL way, then they'll probably be willing to just write it with Vulkan.


"GPU pointers" aren't perfect but are know in use in some parts of OpenGL ( Bindless textures, Persistent Mapping )

Bindless texture handles are not pointers. They are arbitrary 64-bit values which the implementation is able to use to understand texture data. That is all they are. They might be pointers, but they might not.
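
A sketch of the ARB_bindless_texture flow, which shows why "pointer" is the wrong mental model: the application only ever passes the opaque 64-bit value around (`tex` and `loc` are assumed to exist):

```c
/* The handle is an opaque GLuint64, not a dereferenceable address. */
GLuint64 handle = glGetTextureHandleARB(tex);
glMakeTextureHandleResidentARB(handle);  /* must be resident before use */
glUniformHandleui64ARB(loc, handle);     /* the shader sees a sampler */
```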

malexander
06-08-2016, 02:56 PM
My 2c:

If the ARB had thought that they could do Vulkan-like things in OpenGL, they wouldn't have created Vulkan in the first place and instead we'd have GL5.0. A lot of smart people collaborated on Vulkan, and it looks like a great API, so I trust their judgement.

People that want the ease of GL and the performance of Vulkan probably want a 3rd party API built upon Vulkan that handles all the low-level stuff for them (mem allocation, ref counting, etc).

jurgus
06-09-2016, 12:12 AM
But that's anathema to "OpenGL style".
For example, OpenGL is all about changing state. Vulkan is all about you not changing state. You can't simultaneously have both. Not in a coherent API.

Of course. What I mean by "OpenGL style" is to find a solution which helps reduce the cost of state changes.
I remember the "display lists" days (by the way, for me they have a lot in common with Vulkan's recorded command buffers),
and when I last checked, rendering old-style display lists was faster than all other draw commands (pre multi-draw-indirect).
Why? I'm not a driver programmer, but I think driver validation cost was lower for them than for other draw commands.
That's why NV_command_list (or a similar solution) which helps pack state changes would be very good for OpenGL.
I think this is the best place for talking about it and exchanging ideas, and maybe somebody from Khronos Group will read it and find the best solution :)



If they have "a lot of inertia", then they have sufficient inertia that they're not going to be willing to do things the NV_command_list way either. If they're willing to write code in the Vulkan-through-OpenGL way, then they'll probably be willing to just write it with Vulkan.

I would like to disagree with you at this point. For them, changing to NV_command_list would be "warp speed" faster than changing to Vulkan.



Bindless texture handles are not pointers.


Of course, they might or might not be pointers. That's why I wrote "GPU pointers" in quotes. I want to talk about some mechanism which helps the driver
with validation. 64-bit values can be a starting point.

Alfonse Reinheart
06-09-2016, 07:41 AM
and when I last checked, rendering old-style display lists was faster than all other draw commands (pre multi-draw-indirect).

Only because NVIDIA spent time and effort optimizing that rendering path. For other implementations, display list rendering wasn't a particularly great performance improvement.

Also, even NVIDIA only optimized *draw calls* from display lists, not arbitrary state changes.


Why? I'm not a driver programmer, but I think driver validation cost was lower for them than for other draw commands.

No. It was because NVIDIA's display list implementation absorbed the vertex data from the client-side arrays and put them in GPU memory. It could select an optimal format for each vertex attribute array. And since the GPU memory could never be modified (unlike buffer objects), it could put these vertices in the most optimal place.

In those days, validation costs were, while not unimportant, not the most painful part of sending a rendering call.


I would like to disagree with you at this point. For them, changing to NV_command_list would be "warp speed" faster than changing to Vulkan.

... why? The only things you gain over Vulkan with that are:

1. Nicer texture loading, without having to explicitly convert formats or stage buffers.

2. Implicit synchronization.

Equally importantly, it's unclear if such applications need that performance. Granted, mobile apps prove that everyone needs better CPU performance (for lower battery drain). But outside of that, do "visual engines for simulations" really need such performance?

And if they did, wouldn't it be much easier for them to just buy a graphics engine built on Vulkan?

jurgus
06-09-2016, 10:40 AM
Only because NVIDIA spent time and effort optimizing that rendering path. For other implementations, display list rendering wasn't a particularly great performance improvement.
It's great to hear that NVIDIA is doing something to improve the performance of OpenGL, even if it is something as small and old as display lists. This is a big plus for NVIDIA.


Also, even NVIDIA only optimized *draw calls* from display lists, not arbitrary state changes.

Yes, you are right. We need some API which supports not only *draw calls* but *arbitrary state changes* too.



No. It was because NVIDIA's display list implementation absorbed the vertex data from the client-side arrays and put them in GPU memory. It could select an optimal format for each vertex attribute array. And since the GPU memory could never be modified (unlike buffer objects), it could put these vertices in the most optimal place.

Every AAA game nowadays gets the same treatment too :)



Equally importantly, it's unclear if such applications need that performance. Granted, mobile apps prove that everyone needs better CPU performance (for lower battery drain). But outside of that, do "visual engines for simulations" really need such performance?
And if they did, wouldn't it be much easier for them to just buy a graphics engine built on Vulkan?


Those applications need performance too. For example, they can't cheat when rendering some visual effect (which is what games do).
Those engines have been developed since OpenGL 1.0 :) and many, many man-hours have been put into extending their capabilities and
optimizing them. So now drop all that code and rewrite it on the Vulkan API? Management will not be happy about this.
Correct me if I'm wrong. You want to say "Sorry guys, OpenGL will not bring new features which help with performance. Go use Vulkan."

Gedolo2
06-09-2016, 11:31 AM
Much of what you list here isn't actually anything to do with hardware though; what you're talking about is evolution of a software abstraction, and you're requesting to move the OpenGL software abstraction so close to Vulkan that it may as well just be Vulkan and be done with it.

How OpenGL should evolve in a post-Vulkan world is a valid topic for discussion of course, and some may even make a case that it's more useful for OpenGL to evolve towards an even higher-level abstraction than it currently is.

Interesting remark about Vulkan and OpenGL.
Making OpenGL into Vulkan.


OpenGL is in a strange place, abstraction-wise. Its abstraction is not a good fit for modern hardware from a performance standpoint, so it doesn't really work there. But abstracting things more branches out into the realm of scene graphs, and there are innumerable ways of designing a scene graph. OpenGL is as high-level as you can reasonably get without going there.

The only real advantage OpenGL's abstraction has is that it strikes an interesting balance between performance and ease-of-use. Handling synchronization as well as whatever gymnastics are needed in order to change framebuffers willy-nilly and so forth. You can get reasonable performance out of OpenGL as well as access to good hardware features, but without a lot of the explicit work that Vulkan requires.

And yet, engines like Unity, Unreal, and the like give you all kinds of power while hiding the details of APIs like Vulkan, D3D12, etc. They are easier to use than OpenGL, and they don't really lose performance. But at the same time, they do lose the generality that OpenGL provides. If you're not making a game, if it's just a graphics demo or whatever, then there's a lot that those engines do which you won't care about.

Abstraction-wise, this position OpenGL is in brings forth some interesting questions about abstractions and layers.
What is your opinion about making OpenGL a layer on top of Vulkan?
Using SPIR-V and other Vulkan features, such as Vulkan's resource description stuff.
Of course OpenGL would need a big rewrite.
Such a big rewrite won't be out this year, and maybe not even the following year, if it hypothetically started this year (in the summer).
Such a big change would call for a major version change:
an OpenGL 5.0 release.

Gedolo2
06-09-2016, 11:42 AM
Intel doesn't support it (http://opengl.gpuinfo.org/gl_listreports.php?listreportsbyextension=GL_ARB_shader_draw_parameters). It would be better to just bring in the `gl_InstanceIndex` functionality from khr_vk_glsl. That's the most important part of draw parameters that OpenGL doesn't support, and it's something we know Intel can support.


First and foremost, thanks for bringing feedback with your constructive criticism and insight into hardware.

About only bringing in gl_InstanceIndex:
it's a great idea to bring in the gl_InstanceIndex functionality as a good discrete jump in functionality for an OpenGL release, if the whole shader draw parameters extension/functionality can't be added yet.




Where did you hear about this summer release? Is it scheduled for SIGGRAPH?

Summer release rumours were mentioned in an article on phoronix.com, which also mentions a SIGGRAPH timeslot lacking a subject description:
New Vulkan Slides; Wondering If "OpenGL 4.6" Will Be Out This Summer
http://www.phoronix.com/scan.php?page=news_item&px=Vulkan-DevDay-2016-Slides

Alfonse Reinheart
06-09-2016, 12:28 PM
It's great to hear that NVIDIA is doing something to improve the performance of OpenGL, even if it is something as small and old as display lists. This is a big plus for NVIDIA.

That was done last decade. It's not something new.


Those applications need performance too. For example, they can't cheat when rendering some visual effect (which is what games do).

Will they care about the CPU overhead of render calls when they're trying to do something highly complex? Or will they simply consider it the cost of doing business?

Or will they switch to a graphics engine that internally uses Vulkan?


Those engines have been developed since OpenGL 1.0 and many, many man-hours have been put into extending their capabilities and
optimizing them. So now drop all that code and rewrite it on the Vulkan API? Management will not be happy about this.

You say that as if using an NV_command_list-style API would be any less of a rewrite.

We've seen what happens to an API when you try to evolve it in line with people who aren't willing to rewrite their code. You get the horrible nonsense of `glTexImage` allocating mipmap levels instead of full texture storage. You get `glVertexAttribPointer` relying on some parameter that's specified by `glBindBuffer` instead of by the function itself. And any number of other stupidities of OpenGL.

Also, people who were unwilling to rewrite their code were the ones who were responsible for the failure of Longs Peak. I see no reason to cater to them. If they want to stick with their slow API, so be it. But if they want to get into the 21st century with the rest of us, they should use Vulkan.


You want to say "Sorry guys, OpenGL will not bring new features which help with performance. Go use Vulkan."

Not exactly. I'm saying that OpenGL should not try to become Vulkan. People who need Vulkan should use Vulkan.

Would adding descriptor sets to OpenGL be a good idea? Maybe. It would allow more bindless-like functionality in a hardware-neutral way. But such a system would still provide OpenGL's normal validation and implicit synchronization mechanisms. So would it be as fast as Vulkan? No. Something similar could be said for push-constants and dynamic uniform/SSBO binding points. Good and useful features that improve performance.

But they wouldn't match Vulkan's performance.

The other thing you forget is that Vulkan's primary advantage in CPU performance is the ability to thread command buffer construction. Oh, validation and synchronization matter. But not nearly as much as being able to create command buffers from different threads.


What is your opinion about making OpenGL a layer on top of Vulkan?

The more I look at Vulkan's API, the worse I see that being.

Despite Vulkan being lower-level, it is still an abstraction of the hardware. And a very different one from OpenGL. Consider pipeline objects and render passes. If you built an OpenGL implementation on top of Vulkan, you would have to develop a complex system of caching for pipelines and render passes. Whereas if you built your OpenGL implementation on top of the actual hardware directly, you could probably simplify a lot of things, because you can make assumptions about what that specific hardware would be doing. With Vulkan, you have to cache big pipeline objects. When implementing directly to hardware, if a render pass changes, you won't necessarily rebuild the shader code. Or if you do, it will be for hardware-specific reasons.

At best, what you would have is an API that has a huge number of performance traps, but can work perhaps slightly faster if you do everything exactly like the underlying Vulkan implementation wants.

There is a way to build a safer and easier-to-use API on top of Vulkan. But it wouldn't be as free-wheeling as OpenGL.

jurgus
06-09-2016, 01:52 PM
That was done last decade. It's not something new.

So does this mean that only NVIDIA has cared about the performance of OpenGL since last decade? :)


You say that as if using an NV_command_list-style API would be any less of a rewrite.

How I see this engine rewrite: it is easier to pack state changes into an NV_command_list-style API (rather than calling lots of OpenGL API functions) and keep all the existing code for FBOs, textures, and other buffers etc., than to build a new engine based on Vulkan, which would take quite some time. And you know, time = money :).


We've seen what happens to an API when you try to evolve it in line with people who aren't willing to rewrite their code. You get the horrible nonsense of `glTexImage` allocating mipmap levels instead of full texture storage. You get `glVertexAttribPointer` relying on some parameter that's specified by `glBindBuffer` instead of by the function itself. And any number of other stupidities of OpenGL.

Like you, I'm sad and disappointed with the failure of Longs Peak. That is a big loss for OpenGL. But do you think that an OpenGL 5.0 can't do what the idea behind Longs Peak was?
If it depended on me, I would be willing to switch to Vulkan (if it were more stable and mature).
For the simulator industry, one thing would convince managers to put their money into switching engines to Vulkan: multi-device and multi-display support done a magnitude better than in OpenGL. I hope Khronos won't forget to think about this subject.

Alfonse Reinheart
06-09-2016, 04:52 PM
Multi-device and multi-display support done a magnitude better than in OpenGL.

First, a rock could do that better than OpenGL, so that's not expecting much ;)

Second, Vulkan already does. You can query what devices are available. These devices have string names, but they also come with a vendorId field that Khronos actually manages. Each ID uniquely identifies a particular vendor of Vulkan implementations. They also have driver versions, unique identifiers for a specific device, etc.

As for multiple displays, Vulkan's WSI handles that just fine. Each display can have its own properties, swap chains, etc.
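
A sketch of that device query with core Vulkan 1.0 calls (`instance` assumed created; error checking elided):

```c
uint32_t count = 0;
vkEnumeratePhysicalDevices(instance, &count, NULL);

VkPhysicalDevice devices[16];
if (count > 16) count = 16;   /* sketch only; allocate dynamically in real code */
vkEnumeratePhysicalDevices(instance, &count, devices);

VkPhysicalDeviceProperties props;
vkGetPhysicalDeviceProperties(devices[0], &props);
/* props.vendorID, props.deviceID, props.deviceName, props.driverVersion */
```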

jurgus
06-10-2016, 01:31 AM
First, a rock could do that better than OpenGL, so that's not expecting much ;)

You see how little is needed to make somebody happy :).



Second, Vulkan already does. You can query what devices are available. These devices have string names, but they also come with a vendorId field that Khronos actually manages. Each ID uniquely identifies a particular vendor of Vulkan implementations. They also have driver versions, unique identifiers for a specific device, etc.
As for multiple displays, Vulkan's WSI handles that just fine. Each display can have its own properties, swap chains, etc.

Thanks for confirming how it can be done in Vulkan. But I forgot to add that synchronisation across all displays is needed too, like in NV_swap_group.
I hope it can be done without support from specialised hardware. Can I dream about it? :)

Gedolo2
06-11-2016, 05:42 AM
Would adding descriptor sets to OpenGL be a good idea? Maybe. It would allow more bindless-like functionality in a hardware-neutral way. But such a system would still provide OpenGL's normal validation and implicit synchronization mechanisms. So would it be as fast as Vulkan? No. Something similar could be said for push-constants and dynamic uniform/SSBO binding points. Good and useful features that improve performance.

But they wouldn't match Vulkan's performance.

The other thing you forget is that Vulkan's primary advantage in CPU performance is the ability to thread command buffer construction. Oh, validation and synchronization matter. But not nearly as much as being able to create command buffers from different threads.

Would adopting descriptor sets in OpenGL improve functionality and performance?
Would putting descriptor sets in place of current OpenGL hardware abstractions, in (but not limited to) the bindless functionality and bindless extensions, be a good idea and provide advantages?
Could descriptor sets be added in such a way that they replace what's used in the current bindless extensions without changing the functions' form? Becoming an under-the-hood drop-in replacement? A transparent change for application programmers?
How do the current techniques, including but not limited to the bindless extensions, stack up against descriptor sets?
Would descriptor sets allow better performance and programming flexibility/ease/techniques than other extensions, including but not limited to the bindless functionality?

Would bringing the WSI, or some of the WSI stuff from Vulkan, to OpenGL (ES) be a good idea?
Would bringing SPIR-V 1.1 (to be released this summer) to OpenGL be a good idea?
Would it be a good idea to bring other Vulkan features to OpenGL?



The more I look at Vulkan's API, the worse I see that being.

Despite Vulkan being lower-level, it is still an abstraction of the hardware. And a very different one from OpenGL. Consider pipeline objects and render passes. If you built an OpenGL implementation on top of Vulkan, you would have to develop a complex system of caching for pipelines and render passes. Whereas if you built your OpenGL implementation on top of the actual hardware directly, you could probably simplify a lot of things, because you can make assumptions about what that specific hardware would be doing. With Vulkan, you have to cache big pipeline objects. When implementing directly to hardware, if a render pass changes, you won't necessarily rebuild the shader code. Or if you do, it will be for hardware-specific reasons.

At best, what you would have is an API that has a huge number of performance traps, but can work perhaps slightly faster if you do everything exactly like the underlying Vulkan implementation wants.

There is a way to build a safer and easier-to-use API on top of Vulkan. But it wouldn't be as free-wheeling as OpenGL.

Interesting considerations.
Is there, in your opinion, room in Vulkan for some additional and higher-level constructs (with good defaults) instead of extra layers?

Sounds like Vulkan and OpenGL will never unify into a layered architecture.
It also sounds like you hint at there being room for some feature cross-pollination between the APIs.
What features would you like to see appear in OpenGL from Vulkan?

Gedolo2
06-11-2016, 06:20 AM
First, a rock could do that better than OpenGL, so that's not expecting much ;)

Second, Vulkan already does. You can query what devices are available. These devices have string names, but they also come with a vendorId field that Khronos actually manages. Each ID uniquely identifies a particular vendor of Vulkan implementations. They also have driver versions, unique identifiers for a specific device, etc.

As for multiple displays, Vulkan's WSI handles that just fine. Each display can have its own properties, swap chains, etc.

Can each application window (full-screen and windowed/non-full-screen) inside a display also have its own properties (refresh rate, independent synchronization) when supported, with graceful fallback?
I'm asking because of Multi-Stream Transport (MST).
https://en.wikipedia.org/wiki/DisplayPort#Multiple_displays_on_single_DisplayPort_connector


multiple independent video streams (daisy-chain connection with multiple monitors) called Multi-Stream Transport
https://en.wikipedia.org/wiki/DisplayPort#1.2

Alfonse Reinheart
06-11-2016, 10:29 AM
Would adopting descriptor sets in OpenGL improve functionality and performance?

You could possibly get some performance improvement out of it.


Could descriptor sets be added in such a way that they replace what's used in the current bindless extensions without changing the functions' form? Becoming an under-the-hood drop-in replacement? A transparent change for application programmers?

No. Bindless texturing and especially the buffer stuff is all about avoiding context state. Descriptor sets are context state.


Would bringing SPIR-V 1.1 (to be released this summer) to OpenGL be a good idea?

I see no reason not to adopt SPIR-V 1.0 (https://www.opengl.org/discussion_boards/showthread.php/186156-SPIR-V-consumption-in-OpenGL?p=1266021#post1266021). It has more or less everything OpenGL needs already.


Sounds like Vulkan and OpenGL will never unify into a layered architecture.

You could implement OpenGL on top of Vulkan. But it would have even more performance traps than pure OpenGL does. Though at least the implementation would be available, so that bugs could be fixed.


It also sounds like you hint at there being room for some feature cross-pollination between the APIs.
What features would you like to see appear in OpenGL from Vulkan?

Only hardware capabilities that are actually missing from Vulkan: conditional rendering and transform feedback. NVIDIA seems to think (PDF) (https://developer.nvidia.com/sites/default/files/akamai/gameworks/VulkanDevDaypdaniel.pdf) that we need more dynamic state in pipelines, but I'm not sure how most of that would impact mobile platforms.

I have no idea why NVIDIA seems to think that shader subroutines are important...

Alfonse Reinheart
06-11-2016, 07:20 PM
Would bringing SPIR-V 1.1 (to be released this summer) to OpenGL be a good idea?

I just discovered this. SPIR-V 1.10 already exists (https://www.khronos.org/registry/spir-v/); it was released in April.

And just about everything it added relative to 1.00 requires the Kernel capability. So it's stuff that's not appropriate for either Vulkan or OpenGL.

Gedolo2
06-12-2016, 05:20 AM
You could possibly get some performance improvement out of it.



No. Bindless texturing and especially the buffer stuff is all about avoiding context state. Descriptor sets are context state.
Got it, the descriptor sets and the bindless functionality are complementary.
Would it be good for OpenGL in the future, in the next few years (not this summer's OpenGL 4.6 release), to move both bindless and descriptor sets into core to achieve maximum performance in applications?


I just discovered this. SPIR-V 1.10 already exists (https://www.khronos.org/registry/spir-v/); it was released in April.

And just about everything it added relative to 1.00 requires the Kernel capability. So it's stuff that's not appropriate for either Vulkan or OpenGL.

Oops, looks like I mixed up Vulkan and SPIR-V versions.
I meant the SPIR-V version that will be released this summer together with the new Vulkan and OpenGL releases.

Would it be feasible to adopt the newer version of SPIR-V without any disadvantages for GPUs?
Making it not harmful for Vulkan and OpenGL to use the 1.1 version, or do you see a need for a GPU profile for SPIR-V to avoid disadvantages for the GPU APIs?
Although you say SPIR-V 1.1 does not offer anything substantial for GPUs,
future SPIR-V releases might include features that substantially help GPUs.
A GPU profile would be ideal for taking care of this use case.

Alfonse Reinheart
06-12-2016, 07:54 AM
Got it, the descriptor sets and the bindless functionality are complementary.

No, they're not complementary. They're different. You don't need or want both.


Although you say SPIR-V 1.1 does not offer anything substantial for GPUs.

No, I said it doesn't offer anything for graphics. Kernels run on GPUs too; they just don't run on graphics APIs.

jurgus
06-21-2016, 01:05 PM
I have a question.
One of the problems with OpenGL is that the driver has no clue what and how much stuff it will have to process.
This is why in Vulkan there are command buffers.
So would developing the concept behind GL_NV_command_list help the driver a lot? (Packing draw commands and state would mean less guessing for the driver.)
What are your thoughts on this?

Alfonse Reinheart
06-21-2016, 01:34 PM
One of the problems with OpenGL is that the driver has no clue what and how much stuff it will have to process.
This is why in Vulkan there are command buffers.

Command buffers don't solve that problem. At all. The problem that command buffers solve is being able to build sequences of commands asynchronously on multiple threads.

jurgus
06-22-2016, 01:44 AM
Correct me if I'm wrong, but when recording a command buffer those commands are not executed immediately,
but rather when the command buffer is submitted to the render queue. Only then does the driver process it and send it to the graphics card.
So command buffers don't only "solve being able to build sequences of commands asynchronously on multiple threads". Additionally, the driver has more knowledge about "what to draw".
One more plus: command buffers can be baked once and submitted without recording them again. This is a gain for the CPU.
When I read GL_NV_command_list I see very similar ideas in it. So is this not a good direction of development?
I'm not saying that GL_NV_command_list must be added to core, but something based on it could be.

mhagain
06-22-2016, 02:36 AM
Additionally, the driver has more knowledge about "what to draw".

I fail to see how command buffers could give this knowledge.

If you were restricted to a single command buffer per frame, it might make sense. But you're not. So in any given frame, you're either going to build another command buffer or you're not, and because the decision takes place outside of the driver, the driver doesn't have this knowledge. If you do build a command buffer, you're either going to submit it or you're not, because you don't have to submit command buffers as soon as you build them. You could, for example, record a command buffer for use in a future frame (which might be a performance win if the current frame is lightweight). And again, the driver can't have advance knowledge of this.

So the driver is still dealing with rendering commands as they come in, and with no advance knowledge - same as it ever was.

Of course, this is how drivers already behave under traditional APIs anyway. You didn't think that OpenGL calls went direct to the GPU, did you? The driver buffers them up, then submits the buffered-up commands at intervals - when its buffer is full, at the end of a frame, when you explicitly call glFlush or glFinish, whatever. All that command buffers do is expose this functionality to the programmer, but don't get the idea that there's any kind of deep voodoo going on. Think of command buffers as being broadly analogous to the old display list functionality instead.
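
A sketch of that record/submit split in Vulkan terms (handles assumed created; render pass begin/end and error checking elided):

```c
VkCommandBufferBeginInfo begin = {
    .sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO,
};
vkBeginCommandBuffer(cmd, &begin);
vkCmdDraw(cmd, vertex_count, 1, 0, 0);   /* recorded, not executed */
vkEndCommandBuffer(cmd);

VkSubmitInfo submit = {
    .sType = VK_STRUCTURE_TYPE_SUBMIT_INFO,
    .commandBufferCount = 1,
    .pCommandBuffers = &cmd,
};
vkQueueSubmit(queue, 1, &submit, VK_NULL_HANDLE);  /* work reaches the GPU here */
```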

jurgus
06-22-2016, 03:02 AM
I fail to see how command buffers could give this knowledge.

From my understanding, I see command buffers (primary and secondary) as a way of "packing" more information about rendering in one place.
So, as you said, the driver buffers them up, then submits the buffered-up commands at intervals; this time (used for buffering up commands) can be used for something else, e.g.
for analysis for efficient dispatch of commands to the GPU.
This "packed" information is the "deeper knowledge" of what to render.
That "packed structure of commands" would help the OpenGL driver a lot.

mhagain
06-22-2016, 07:49 AM
From my understanding, I see command buffers (primary and secondary) as a way of "packing" more information about rendering in one place.
So, as you said, this time (used for buffering up commands) can be used for something else, e.g.
for analysis for efficient dispatch of commands to the GPU.
This "packed" information is the "deeper knowledge" of what to render.
That "packed structure of commands" would help the OpenGL driver a lot.

You're not making much sense here.

The one thing which the new APIs (and NV_command_list) have which can allow this "deeper knowledge" is actually nothing to do with command buffers; it's state objects. What state objects allow is for all potentially interacting state to be defined together, and for creation-time validation rather than draw-time validation.

But you can have state objects without command buffers. And you can have command buffers without state objects.

I honestly don't understand the rest of what you're saying.

Alfonse Reinheart
06-22-2016, 09:59 AM
Correct me if I'm wrong, but when recording a command buffer those commands are not executed immediately,
but rather when the command buffer is submitted to the render queue. Only then does the driver process it and send it to the graphics card.
So command buffers don't only "solve being able to build sequences of commands asynchronously on multiple threads". Additionally, the driver has more knowledge about "what to draw".

... so what?

What does the driver really know about what you're doing with a command buffer? It only knows the sequence of commands in that buffer. It doesn't know:

1: What commands were executed before that buffer.

2: What commands will be executed after that buffer.

So what does it really know about what you're doing?

You might wonder why it matters what commands were executed before or after. But that's very important.

Consider a vital concept in Vulkan: image layouts. Every image exists in a particular layout, which potentially restricts what operations it can be used with. Well, the layout an image "currently" is in is completely unknown to Vulkan. You, the user, are expected to keep up with that. After all, commands are built asynchronously and executed in a generally undefined order.

So when you're building a command buffer, and you use an image in a descriptor set... what layout is it in? Even if the image is in the wrong layout when you're building that CB, Vulkan cannot know if you will execute a command buffer before this one which will transition the layout to an appropriate one.

Therefore, Vulkan requires that you specify the layout of an image when you use it. And if, by the time that command executes, the image isn't actually in that layout, you're boned.
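
A sketch of what that looks like in practice: the application records the transition itself, naming both the old and new layouts explicitly (`cmd` and `image` assumed to exist):

```c
/* Transition: color attachment -> shader-readable. In OpenGL the driver
 * does the equivalent behind the scenes; in Vulkan it is your job. */
VkImageMemoryBarrier barrier = {
    .sType               = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER,
    .srcAccessMask       = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,
    .dstAccessMask       = VK_ACCESS_SHADER_READ_BIT,
    .oldLayout           = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
    .newLayout           = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL,
    .srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
    .dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
    .image               = image,
    .subresourceRange    = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 },
};
vkCmdPipelineBarrier(cmd,
                     VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
                     VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
                     0, 0, NULL, 0, NULL, 1, &barrier);
```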

OpenGL doesn't have image layouts or anything even remotely like them. Why? Because OpenGL's execution model is essentially synchronous. Stuff like image layouts and transitions are an implementation detail to OpenGL, something the driver can handle behind the scenes. All thanks to the synchronous execution model.

If you attach an image to an FBO, render to it, then detach it and bind it as a sampler, OpenGL can see that you're doing all of those things in order. If the implementation needs to do layout transitions, it can do them as needed between those operations. This is possible because OpenGL's execution model requires all commands to behave as if they were executed synchronously.

The only way to make something like NV_command_list work is to make the execution model asynchronous (and add other features like image layouts). At which point, why bother using OpenGL at all? It's not like you can slowly transition to using this. You have to rewrite far too much code to be able to handle asynchronous execution effectively and efficiently. You have to deal with things like image layouts and async memory transfer operations. And so forth.

Just look at NV_command_list as it is. To use it, you have to use non-core APIs for vertex specification, uniform buffer binding, SSBOs, and texture and image binding. By the time you're finished with all of this, how many OpenGL API calls are you using that don't have an NV or ARB suffix on them?

By the end of this process, all you will have is OpenGL-in-name-only. Better to just use Vulkan and get it over with.

There is no low-hanging fruit to be picked from the Vulkan tree. You can't just pull parts of Vulkan over and expect the result to make sense. Direct3D11 tried that with deferred contexts. Notice how NVIDIA implemented that in their hardware.

Ever notice that AMD didn't? I'd bet that image layout issues were a huge part of the reason why. NVIDIA has no such concept in their hardware; they ignore all the Vulkan layout stuff. AMD's hardware really relies on it.

SPIR-V makes sense to be able to be consumed by OpenGL. Descriptor sets might make sense for OpenGL, in some respects. And maybe push-constants & input attachments. Anything else requires fundamental changes to the very execution model of OpenGL before they can actually work.

And if you change that much, why are you using OpenGL? Because you don't like writing `vk` in front of your function names?

----

As for more of the "what does the driver really know with CBs", consider Vulkan's complex render-pass system. Why does this system exist? To permit implementations to actually know something about how you're going to render.

Pipelines are built based on being executed within a specific subpass of a render pass. At pipeline creation time, the implementation knows the numbers and formats of the images that are used as render targets. It knows when you're going to do read/modify/write glTextureBarrier gymnastics. It knows this at compile time. It can therefore compile your shaders (particularly for tile-based architectures) into efficient forms to utilize this knowledge.
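
A sketch of that coupling (most fields elided): the render pass and subpass are baked into the pipeline at creation time, which is exactly when the expensive compilation is allowed to happen:

```c
VkGraphicsPipelineCreateInfo info = {
    .sType      = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO,
    /* ... shader stages, vertex input, blend state, etc. elided ... */
    .renderPass = render_pass,   /* attachment formats known up front */
    .subpass    = 0,
};
vkCreateGraphicsPipelines(device, pipeline_cache, 1, &info, NULL, &pipeline);
```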

If command buffers were so useful for knowing something about what you're rendering, why would render passes and subpasses exist? Surely, the command buffer building system could just read the current framebuffer attachments and make similar inferences based on them, right?

No. Because that happens at command submission time. Whereas the way Vulkan does things, it happens at pipeline creation time, a thing that is expected to be a heavyweight operation. Command submissions are expected to be lightweight operations, and Vulkan makes sure that implementations never have a reason to do things like recompile or reconfigure shaders based on various state.

So no, command buffers don't exist to allow implementations to know more about what you're trying to render. The thing Vulkan does to permit this is that it forces users to specify things up-front.

kaufenpreis
10-28-2016, 04:05 AM
Multi-device and multi-display support done a magnitude better than in OpenGL

mhagain
10-28-2016, 09:44 AM
Multi-device and multi-display support done a magnitude better than in OpenGL

OpenGL doesn't handle these, the OS does.