Official feedback on OpenGL 3.2 thread



Khronos_webmaster
08-03-2009, 07:06 AM
Latest desktop GPU functionality now fully accessible through cross-platform open 3D standard; Close alignment with OpenCL for parallel compute, OpenGL ES for mobile graphics and new WebGL standard for 3D on the Web

The Khronos™ Group today announced OpenGL® 3.2, the third major update in twelve months to the most widely adopted 2D and 3D graphics API (application programming interface) for personal computers and workstations. This new release continues the rapid evolution of the OpenGL standard to enable graphics developers to portably access cutting-edge GPU functionality across diverse operating systems and platforms. The full specification is available for immediate download at http://www.opengl.org/registry.

OpenGL 3.2 adds features for enhanced performance, increased visual quality, accelerated geometry processing and easier portability of Direct3D applications. In addition, the evolution of OpenGL and other Khronos standards, including OpenCL™ for parallel compute, OpenGL ES for mobile 3D graphics and the new WebGL™ standard for 3D on the web, is being coordinated to create a powerful graphics and compute ecosystem that spans many applications, markets and devices. The installed base of OpenGL 3.2 compatible GPUs already exceeds 150 million units.

The OpenGL ARB (Architecture Review Board) working group at Khronos has defined GLSL 1.5, an updated version of the OpenGL Shading Language, and two profiles within the OpenGL 3.2 specification, giving developers the choice of the streamlined Core profile for new application development or the Compatibility profile, which provides full backwards compatibility with previous versions of the OpenGL standard for existing and workstation applications.

OpenGL 3.2 has been designed to run on a wide range of recent GPU silicon and provides a wide range of significant benefits to application developers, including:
- Increased performance for vertex arrays and fence sync objects to avoid idling while waiting for resources shared between the CPU and GPU, or between multiple CPU threads;
- Improved pipeline programmability, including geometry shaders in the OpenGL core;
- Boosted cube map visual quality and multisampling rendering flexibility by enabling shaders to directly process texture samples.
In addition, Khronos has defined a set of five new ARB extensions that enable the very latest graphics functionality introduced in the newest GPUs to be accessed through OpenGL – these extensions will be absorbed into the core of a future version of OpenGL when this functionality is proven and widely adopted.
“Khronos has proven to be a great home for the OpenGL ARB,” stated Dr. Jon Peddie, founder and principal of Jon Peddie Research. “Not only has the ARB put the pedal to the metal to enable OpenGL to be a true platform for graphics innovation, but the synergy of coherently developing a family of related standards is leveraging OpenGL’s strengths - OpenGL is truly the foundation on which rich graphics for mobile devices and the Web is being built.”

Groovounet
08-03-2009, 07:55 AM
Maybe the most interesting progress comes from all the D3D 10.1 features provided by the new extensions!

GL_ARB_sync is such good news...

Again, after the OpenGL 3.1 release, I wasn't expecting this many new features!

Deep reading of the spec in progress!

Jose Goruka
08-03-2009, 08:24 AM
I'm very happy to see where OpenGL is going, and it makes me glad to see the release cycle gain so much momentum recently. OpenGL is truly becoming relevant again in the multimedia and games industries.

Executor
08-03-2009, 11:15 AM
Why isn't anisotropic filtering in core? :(

Alfonse Reinheart
08-03-2009, 11:26 AM
I have one question:

Can we assume that GL 3.2 functionality will be exposed in all GL 3.0 capable hardware?


Why isn't anisotropic filtering in core?

Rob explained this here (http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=261647#Post261624).

mfort
08-03-2009, 11:38 AM
http://developer.nvidia.com/object/opengl_3_driver.html

Q: What NVIDIA hardware will support OpenGL 3.0, OpenGL 3.1 or OpenGL 3.2?
A: The new features in OpenGL 3.0, OpenGL 3.1 and OpenGL 3.2 require G80 or newer hardware. Thus OpenGL 3.0/3.1/3.2 is not supported on NV3x, NV4x, or G7x hardware.

Rosario Leonardi
08-03-2009, 11:47 AM
Wonderful, now I have to spend my vacation reading the new specs. I hate/love you, Khronos. :)

By the way, maybe it's time to update the online documentation; it's still at version 2.1. :-S

Stephen A
08-03-2009, 12:33 PM
By the way, maybe it's time to update the online documentation; it's still at version 2.1. :-S

++

I'm generating C# bindings for the new specs as we speak.

Groovounet
08-03-2009, 02:53 PM
My second impression: http://www.g-truc.net/#news0170

Stephen A
08-03-2009, 03:24 PM
Thanks for fixing the LightProperty-instead-of-LightParameter error. This has been in the specs since the early SGI days!

A few more errors in the latest specs:
1. "R_SNORM" should be defined in VERSION_3_1 enum. However, it's value is not defined *anywhere*.
2. "TIMEOUT_IGNORED" is defined twice in VERSION_3_2 enum.
3. "2X_BIT_ATI" is defined twice in ATI_fragment_shader enum.
4. "FRAMEBUFFER_ATTACHMENT_TEXTURE_LAYER" is defined twice in FramebufferParameterName enum.
5. "RGB5" is defined twice in RenderbufferStorage enum.

Apart from the first issue (bug #195 (http://www.khronos.org/bugzilla/show_bug.cgi?id=195)), the rest are not serious. However, it would be nice to have them fixed, if only to make spec converters, like the one that generates the C headers, simpler.

glfreak
08-03-2009, 05:45 PM
Excellent! Now we can say GL rules. With GS it's now the true multi-platform solution where we can benefit from the latest hardware on all platforms, without having to upgrade the operating system. ;)

Let's give it some time to see good working drivers in action :)

Alfonse Reinheart
08-03-2009, 05:52 PM
With GS it's now the true multi-platform solution where we can benefit from the latest hardware on all platforms, without having to upgrade the operating system.

Too bad that Win7 is almost out. GL might have gained some users if this had been out while the Vista FUD kept people using WinXP.

BTW, geometry shaders are not widely used. ARB_draw_elements_base_vertex, ARB_texture_multisample, ARB_sync, and ARB_seamless_cubemap will be far more used and are far more useful than geometry shaders.
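
To illustrate why ARB_draw_elements_base_vertex is so handy, here is a minimal sketch of packing two meshes into one vertex/index buffer pair; the names, counts and offsets are hypothetical:

glBindVertexArray(vao);

/* First mesh: indices start at offset 0 and reference vertices 0..N-1. */
glDrawElements(GL_TRIANGLES, indexCountA, GL_UNSIGNED_INT, (void *)0);

/* Second mesh: its indices can stay 0-based because basevertex is added
   to every index at draw time - no index rewriting when packing meshes. */
glDrawElementsBaseVertex(GL_TRIANGLES, indexCountB, GL_UNSIGNED_INT,
                         (void *)(indexOffsetB * sizeof(GLuint)),
                         baseVertexB);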

glfreak
08-03-2009, 06:23 PM
Nobody cared about Vista anyway... I saw people complaining and reinstalling XP. However, GL 3.2 will be a revolutionary change in the 3D realm, from CAD to real-time simulation and games. But again, we want to see good drivers.

Alfonse Reinheart
08-03-2009, 06:28 PM
However, GL 3.2 will be a revolutionary change in the 3D realm.

How? It's good for OpenGL, but I fail to see how this is a revolutionary change in anything.

glfreak
08-03-2009, 06:30 PM
I meant to say many high-end and game developers will adopt it as soon as drivers are out there.

dpoon
08-03-2009, 06:44 PM
Noticed the following whilst reading the GLSL 1.50 specs with changes (GLSLangSpec.1.50.09.withchanges.pdf).

On page 45:
The fragment language has the following predeclared globally scoped default precision statements:
precision mediump int;
precision highp float;

The float precision predeclaration is new in GLSL 1.50 and is not marked in magenta.

Alfonse Reinheart
08-03-2009, 07:13 PM
I meant to say many high-end and game developers will adopt it as soon as drivers are out there.

Which ones? And by that, I mean the ones that aren't already using OpenGL?

More specifically, what is it about OpenGL 3.2 that would make a game developer stop doing work on making their game, abandon Direct3D, and adopt OpenGL?

Eosie
08-03-2009, 09:19 PM
Great work. Nice to see OpenGL moving forward so rapidly.

Just a question: What's holding back incorporating S3TC into the core? Is VIA not willing to give up their intellectual property in favor of OpenGL?

glfreak
08-03-2009, 10:40 PM
More specifically, what is it about OpenGL 3.2 that would make a game developer stop doing work on making their game, abandon Direct3D, and adopt OpenGL?

Based on the fact that GL 3.2 is now competitive with D3D 10 or 11 and evolving rapidly, provided there are reliable drivers soon. D3D will not be abandoned in a day. It never will be. But supporting two rendering paths is now possible.

I have experience with both APIs, and frankly, choosing one API over the other is largely a driver issue.

Hope the IHVs work hard on good drivers.

Major CAD packages now recommend D3D drivers for stability reasons.

Jean-Francois Roy
08-03-2009, 11:11 PM
Not adding too much to the discussion, but here's my quick overview of the major new features in OpenGL 3.2 and GLSL 1.5.

http://www.devklog.net/2009/08/03/opengl-3-2-officially-released/

Heiko
08-04-2009, 12:12 AM
Noticed the following whilst reading the GLSL 1.50 specs with changes (GLSLangSpec.1.50.09.withchanges.pdf).

On page 45:
The fragment language has the following predeclared globally scoped default precision statements:
precision mediump int;
precision highp float;

The float precision predeclaration is new in GLSL 1.50 and is not marked in magenta.

Are you sure? I already used that in GLSL 1.30.

Demirug
08-04-2009, 12:16 AM
Based on the fact that GL 3.2 is now competitive with D3D 10 or 11 and evolving rapidly, provided there are reliable drivers soon. D3D will not be abandoned in a day. It never will be. But supporting two rendering paths is now possible.

It was always possible. I can't see why 3.2 should motivate more game developers to go this way than before. If you don't target the Mac or, even more unlikely, Linux, there is simply no good reason to write an additional OpenGL rendering path. While the pure API spec is getting better, there are still so many holes when it comes to the infrastructure.

GeLeTo
08-04-2009, 01:04 AM
Are there plans to use ARB_sync with VBOs?

For instance after calling glBufferData - the data will not be copied immediately, and it can be safely changed or deleted only after the sync object is signaled. Calling ClientWaitSync will copy the data immediately (as is the current behavior of glBufferData ).

Is there a way to do this with the current API or an extension?

Jon Leech (oddhack)
08-04-2009, 01:29 AM
Are there plans to use ARB_sync with VBOs?

For instance after calling glBufferData - the data will not be copied immediately, and it can be safely changed or deleted only after the sync object is signaled. Calling ClientWaitSync will copy the data immediately (as is the current behavior of glBufferData ).

Is there a way to do this with the current API or an extension?

I'm not certain what you're asking, but if it is whether we would make sync objects affect the behavior of previously existing API calls, there are no plans to do that. BufferData in GL 3.2 does just what BufferData always did before. Placing a fence after a GL command and waiting on the corresponding fence sync object in another context simply provides a (potentially) more efficient way to know that the command has completed from the point of view of the other context.
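
For readers new to ARB_sync, a minimal sketch of the fence pattern Jon describes (the drawing call is a placeholder; error handling omitted):

/* Issue the commands you care about, then drop a fence behind them. */
glDrawArrays(GL_TRIANGLES, 0, vertexCount);
GLsync fence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);

/* Later - possibly from another context sharing objects - wait on it. */
GLenum status = glClientWaitSync(fence, GL_SYNC_FLUSH_COMMANDS_BIT,
                                 1000000000 /* 1 s, in nanoseconds */);
if (status == GL_ALREADY_SIGNALED || status == GL_CONDITION_SATISFIED) {
    /* The fenced commands have completed from this context's point of view. */
}
glDeleteSync(fence);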

Dan Bartlett
08-04-2009, 02:15 AM
Are ARB extensions #74 + #75 (at http://www.opengl.org/registry/) meant to be labelled as "WGL_ARB_create_context" and "GLX_ARB_create_context", or "WGL_ARB_create_context_profile" and "GLX_ARB_create_context_profile", or are they a special case because they modify existing extensions?

Also just noticed that with wglGetExtensionsStringARB, NVidia beta drivers 190.56 display the extension string "WGL_ARB_create_context_profile", but not "WGL_ARB_create_context".

GeLeTo
08-04-2009, 02:26 AM
I'm not certain what you're asking, but if it is whether we would make sync objects affect the behavior of previously existing API calls, there are no plans to do that....
OK, then let's call it glBufferDataRetained / glBufferSubDataRetained.

Currently sending data to VBO (or any other buffer data object) works like this:
1. Allocate and fill the data
2. Give it to glBufferData
3. glBufferData immediately makes a copy of the data (or sends it directly to the card which is unlikely)
4. When glBufferData returns the data is no longer needed and I can delete or modify it

What I want is to skip #3 to avoid the extra copy done by glBufferData. When using glBufferDataRetained the data will not be immediately copied and it cannot be changed/deleted till OpenGL signals that this data is no longer needed.

The ARB_sync API is probably not designed for this case, but something similar could be very useful. Using ARB_sync semantics, this functionality would work like this:

1. Allocate and fill the data
2. Give the data and a sync object to glBufferDataRetained
3. glBufferDataRetained returns immediately without copying anything
4. The next time I need to change or delete the data I check if the sync object is signaled - and if so I can use it right away. If the object is not signaled - I can either call ClientWaitSync (or whatever) to ensure the data is copied right away or I can allocate the changed data in a new place.

Currently the only way (that I know of) to avoid the extra copying is to use glMapBuffer. And this is another can of worms...
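
For clarity, a purely hypothetical sketch of the proposal above - glBufferDataRetained does not exist in OpenGL 3.2 or any extension; only the ARB_sync calls are real:

GLsync fence = 0;

/* Hypothetical call: the driver keeps my pointer and copies nothing yet. */
glBufferDataRetained(GL_ARRAY_BUFFER, dataSize, myData, GL_STREAM_DRAW, &fence);

/* ... later, before touching or freeing myData ... */
if (glClientWaitSync(fence, 0, 0) != GL_ALREADY_SIGNALED) {
    /* Not consumed yet: either force the copy to finish now ... */
    glClientWaitSync(fence, GL_SYNC_FLUSH_COMMANDS_BIT, 1000000000);
    /* ... or leave myData alone and put the changed data somewhere else. */
}
glDeleteSync(fence);
free(myData);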

Executor
08-04-2009, 02:32 AM
Why isn't anisotropic filtering in core?

Rob explained this here (http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=261647#Post261624).

Waiting until OGL4 for a feature from ten years ago? So stupid... :(

Heiko
08-04-2009, 02:41 AM
Why isn't anisotropic filtering in core?

Rob explained this here (http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=261647#Post261624).

Waiting until OGL4 for a feature from ten years ago? So stupid... :(


There is no need to wait. The functionality is available from every vendor anyway. Just use the extension; why would that be a problem?
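
For anyone wondering what "just use the extension" amounts to, a minimal sketch assuming EXT_texture_filter_anisotropic is advertised (myTexture is a placeholder):

GLfloat maxAniso = 1.0f;
glGetFloatv(GL_MAX_TEXTURE_MAX_ANISOTROPY_EXT, &maxAniso);

glBindTexture(GL_TEXTURE_2D, myTexture);
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MAX_ANISOTROPY_EXT, maxAniso);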

Executor
08-04-2009, 02:49 AM
There is no need to wait. The functionality is available from every vendor anyway. Just use the extension; why would that be a problem?

I want to use core functionality only, but I'm forced to use extensions too, because core doesn't have the needed functionality. This is sad.

Jon Leech (oddhack)
08-04-2009, 03:01 AM
Are ARB extensions #74 + #75 (at http://www.opengl.org/registry/) meant to be labelled as "WGL_ARB_create_context" and "GLX_ARB_create_context", or "WGL_ARB_create_context_profile" and "GLX_ARB_create_context_profile", or are they a special case because they modify existing extensions?

We folded the profile extension into the same documents as the original create_context extensions, largely because it was necessary to get them out on time for SIGGRAPH, for arcane reasons involving the Khronos spec approval process. So they are two extensions, in one file.

Jon Leech (oddhack)
08-04-2009, 03:05 AM
Currently sending data to VBO (or any other buffer data object) works like this:
1. Allocate and fill the data
2. Give it to glBufferData
3. glBufferData immediately makes a copy of the data (or sends it directly to the card which is unlikely)
4. When glBufferData returns the data is no longer needed and I can delete or modify it

What I want is to skip #3 to avoid the extra copy done by glBufferData. When using glBufferDataRetained the data is not immediately copied and I cannot change/delete it till OpenGL signals that this data is no longer needed.

OK, I understand now. I think both Sun and Apple have done vendor extensions along these lines, and we have sometimes discussed it as a future use case in the ARB. If we do something like this in a future release I hope it would use sync objects to signal the driver being done with the client buffer, but at present it's not being actively discussed in the group.

Scribe
08-04-2009, 03:23 AM
Now I'm just waiting for extension APIs like GLEW to catch up with a major release.

Does anyone know of any APIs like GLEW that allow for the easy handling of extensions in openGL 3.1/3.2?

GeLeTo
08-04-2009, 03:31 AM
I think both Sun and Apple have done vendor extensions along these lines, and we have sometimes discussed it as a future use case in the ARB. If we do something like this in a future release I hope it would use sync objects to signal the driver being done with the client buffer, but at present it's not being actively discussed in the group.
Now that we have sync objects, the API to implement this functionality is a no-brainer, so I hope to see it implemented sooner rather than later.

mfort
08-04-2009, 03:40 AM
Currently the only way (that I know of) to avoid the extra copying is to use glMapBuffer. And this is another can of worms...

At least for PBO the glMapBuffer API works fine.
- Mapping a buffer is a pretty straightforward usage pattern.
- NVIDIA drivers provide good performance.
- Loading the data using glTexImage and similar is async.
- The current ARB_sync should fill the missing gap.
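
A minimal sketch of that PBO pattern, with hypothetical sizes and a pre-created texture/PBO pair (this is one common way to do it, not the only one):

glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
glBufferData(GL_PIXEL_UNPACK_BUFFER, imageSize, NULL, GL_STREAM_DRAW);

void *dst = glMapBuffer(GL_PIXEL_UNPACK_BUFFER, GL_WRITE_ONLY);
memcpy(dst, pixels, imageSize);            /* fill with your image data */
glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER);

/* The last argument is an offset into the bound PBO, not a client pointer,
   so the call can return before the pixels are actually consumed. */
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
                GL_RGBA, GL_UNSIGNED_BYTE, (void *)0);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);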

Scribe
08-04-2009, 03:49 AM
In my experience glMapBuffer works very well for the initial loading of any buffer. It's once you start to use that buffer that you suffer synch issues. Certainly it's much faster than multiple calls to glBufferSubData.

mfort
08-04-2009, 04:00 AM
Yes, calling glBufferData with data set to NULL is painless.

Maybe glCopyBufferSubData (ARB_copy_buffer) for updating subdata can help.

Stephen A
08-04-2009, 04:06 AM
C# wrappers for OpenGL 3.2 available here (http://www.opentk.com). Also usable by VB.Net, F# and the rest of the Mono/.Net family of languages.

Y-tension
08-04-2009, 04:18 AM
Very well done, Khronos!! Now let's see some drivers! NVIDIA (officially) released 3.1 just a few weeks ago.
Again, great release!

GeLeTo
08-04-2009, 04:26 AM
At least for PBO the glMapBuffer API works fine.
- Mapping a buffer is a pretty straightforward usage pattern.
- NVIDIA drivers provide good performance.
- Loading the data using glTexImage and similar is async.
- The current ARB_sync should fill the missing gap.

A straightforward usage pattern? Mapping a buffer to update data will most certainly lead to sync issues. Even discarding the buffer with glBufferData(…, NULL) and updating the whole thing might not help (http://www.stevestreeting.com/2007/03/17/glmapbuffer-vs-glbuffersubdata-the-return/). And of course updating the whole buffer might not be what you want.
glTexImage being async comes at a cost - it will most likely create a copy of your image data right away. And this is what I want to avoid.

In my experience glMapBuffer works very well for the initial loading of any buffer. It's once you start to use that buffer that you suffer synch issues. Certainly it's much faster than multiple calls to glBufferSubData.
This very much depends on the sizes of your data/subdata.
The intricacies of glMapBuffer have been discussed to death. The solution to use buffer objects with sync objects is both elegant and simple.

mfort
08-04-2009, 05:10 AM
And of course updating the whole buffer might not be what you want.

OK


glTexImage being async comes at a cost - it will most likely create a copy of your image data right away. And this is what I want to avoid.

It is hard to argue about this. Only the driver guys can tell. But I see no reason why the driver needs to make a copy. It probably starts a DMA transfer to copy the data from system memory to GPU memory.



This very much depends on the sizes of your data/subdata.
The intricacies of glMapBuffer have been discussed to death. The solution to use buffer objects with sync objects is both elegant and simple.

Using buffers for small chunks of data is questionable. The overhead could be larger than using simple memory pointers and letting the driver copy it.

I am using glBufferData(…, NULL) buffers for streaming several hundred megabytes per second without problems. Of course I am not using SubData. Driver developers should write some performance hints & tips for using smaller chunks with SubData. I can imagine it is not free.

If you are afraid of copying data in system memory, try using write-combined memory and/or SSE instructions with a non-temporal hint so you don't pollute the cache.
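
For reference, a minimal sketch of the glBufferData(…, NULL) "orphaning" idiom described above (buffer name and size are placeholders; the refill could equally be glBufferData with a real pointer):

glBindBuffer(GL_ARRAY_BUFFER, vbo);

/* Re-specify the store with NULL: the driver can hand back fresh memory
   instead of stalling on a buffer the GPU may still be reading. */
glBufferData(GL_ARRAY_BUFFER, bufSize, NULL, GL_STREAM_DRAW);

void *p = glMapBuffer(GL_ARRAY_BUFFER, GL_WRITE_ONLY);
memcpy(p, freshVertexData, bufSize);
glUnmapBuffer(GL_ARRAY_BUFFER);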

dpoon
08-04-2009, 05:28 AM
Noticed the following whilst reading the GLSL 1.50 specs with changes (GLSLangSpec.1.50.09.withchanges.pdf).

On page 45:
The fragment language has the following predeclared globally scoped default precision statements:
precision mediump int;
precision highp float;

The float precision predeclaration is new in GLSL 1.50 and is not marked in magenta.

Are you sure? I already used that in GLSL 1.30.

All I meant in my original post was that in the GLSL 1.50 spec they've added the precision predeclaration for the float type to the global scope of the fragment language. In previous versions of the GLSL spec only the int type was predeclared in the global scope of the fragment language. So in the GLSL 1.50 spec the precision predeclaration for the float type should have been highlighted in magenta.

GLSLangSpec.Full.1.30.08.withchanges.pdf (page 36):

The fragment language has the following predeclared globally scoped default precision statement:
precision mediump int;

GLSLangSpec.Full.1.40.05.pdf (page 37):

The fragment language has the following predeclared globally scoped default precision statement:
precision mediump int;

GLSLangSpec.1.50.09.withchanges.pdf (page 45):

The fragment language has the following predeclared globally scoped default precision statements:
precision mediump int;
precision highp float;

GeLeTo
08-04-2009, 05:42 AM
It is hard to argue about this. Only the driver guys can tell. But I see no reason why the driver needs to make a copy. It probably starts a DMA transfer to copy the data from system memory to GPU memory.

Which will cause glTexImage not to return until the copying is done. Which is another thing that I want to avoid. And then there is glTexSubImage, which depending on how the driver works may stall waiting for a sync or create a copy of the data...


I am using glBufferData(…, NULL) buffers for streaming several hundred megabytes per second without problems.

glBufferData(…, NULL) also has a cost - it uses two buffers on the card where in some cases it could use only one (wait for the buffer to be no longer needed, then DMA-copy the data and signal the related sync object). When you "stream hundreds of megabytes per second" this may have an impact.

I also use glBufferData(…, NULL) currently. Which is a pity, as in many cases only a small part of the mesh has changed. I will probably separate the mesh into chunks, among other reasons to avoid sending a million verts when only 1000 have changed. I will also test glBufferSubData vs. changing parts of the mapped buffer vs. updating the whole buffer. Then the app will selectively use whichever of the 3 approaches works better in the current case, but hey - isn't that ugly? There has to be a better way to do this.

Scribe
08-04-2009, 06:04 AM
It is hard to argue about this. Only the driver guys can tell. But I see no reason why the driver needs to make a copy. It probably starts a DMA transfer to copy the data from system memory to GPU memory.

Which will cause glTexImage not to return until the copying is done. Which is another thing that I want to avoid. And then there is glTexSubImage, which depending on how the driver works may stall waiting for a sync or create a copy of the data...


I am using glBufferData(…, NULL) buffers for streaming several hundred megabytes per second without problems.

glBufferData(…, NULL) also has a cost - it uses two buffers on the GPU where in some cases it could use only one (wait for the buffer to be no longer needed, then DMA-copy the data and signal the related sync object). When you "stream hundreds of megabytes per second" this may have an impact.

I also use glBufferData(…, NULL) currently. Which is a pity, as in many cases only a small part of the mesh has changed. I will probably separate the mesh into chunks, among other reasons to avoid sending a million verts when only 1000 have changed. I will also test glBufferSubData vs. changing parts of the mapped buffer vs. updating the whole buffer. Then the app will selectively use whichever of the 3 approaches works better in the current case, but hey - isn't that ugly? There has to be a better way to do this.

I recently coded a library for loading bit-mapped TTF fonts into OpenGL textures (and VBOs). Even loading a single font showed a noticeable difference in loading times between glMapBuffer and glBufferSubData. Running on the latest nVidia drivers, the latter is completely useless for small chunks of data; I'm unsure how well it scales up with larger data chunks.

Scribe
08-04-2009, 06:14 AM
C# wrappers for OpenGL 3.2 available here (http://www.opentk.com). Also usable by VB.Net, F# and the rest of the Mono/.Net family of languages.

It's quite embarrassing when there are better wrappers available for C# than for C++! People seem reluctant to integrate OpenGL 3.1/3.2 into their wrappers and APIs - why is this? Are the changes that difficult for them to make? (That's not me being rude.)


Very well done, Khronos!! Now let's see some drivers! NVIDIA (officially) released 3.1 just a few weeks ago.
Again, great release!

nVidia have already released OpenGL 3.2 drivers. It's more AMD/ATI that are the problem in pushing the post-3.0 spec.

mfort
08-04-2009, 06:48 AM
I feel we should move to another thread.



Which will cause glTexImage not to return until the copying is done. Which is another thing that I want to avoid. And then there is glTexSubImage, which depending on how the driver works may stall waiting for a sync or create a copy of the data...

Not sure about glTexImage, I am using it only to "create" the texture. BTW, at least on NV this is almost a no-op. It does almost nothing. The real hard work is done when I call glTexSubImage for the first time (even when a PBO is in use).

glTexSubImage with a PBO is async for sure. It returns immediately (in less than 0.1 ms).




glBufferData(…, NULL) also has a cost - it uses two buffers on the card where in some cases it could use only one (wait for the buffer to be no longer needed, then DMA-copy the data and signal the related sync object). When you "stream hundreds of megabytes per second" this may have an impact.


Yes, it has some cost. But this way you can trade memory for CPU clocks. The driver can allocate a second buffer to avoid waiting for the PBO until it is available.

aqnuep
08-04-2009, 07:24 AM
I agree that OpenGL is going to the right direction. Especially the ARB_sync extension is nice.
I am pretty surprised by the way that ARB_geometry_shader4 is core from now because it's a deprecated feature. I think it was put into core just because D3D supports it. I would rather go into the direction of the tesselation engine provided by AMD/ATI since HD2000 series cards. That is a much more flexible functionality and it's already or will be soon supported by D3D. The same things can be done with it as with geometry shaders and even much more.
This geometry shader thing is only present because at the time HD2000 came out, NVIDIA's G8x cards weren't able to do such thing.

P.S.: This buffer object performance related discussion gone out of control by the way so you should better continue it in a much more appropriate place :)

Aleksandar
08-04-2009, 08:01 AM
All I meant in my original post was that in the GLSL 1.50 spec they've added the precision predeclaration for the float type to the global scope of the fragment language. In previous versions of the GLSL spec only the int type was predeclared in the global scope of the fragment language. So in the GLSL 1.50 spec the precision predeclaration for the float type should have been highlighted in magenta.


Have you read GLSLangSpec.1.40.07 (May 1st 2009)?

GLSLangSpec.1.40.07 (Pg.36)

The fragment language has the following predeclared globally scoped default precision statements:
precision mediump int;
precision highp float;


And I don't know why it is important at all, because...

GLSLangSpec.1.40.07 (pg.35) / GLSLangSpec.1.50.09 (pg.44)

4.5 Precision and Precision Qualifiers

Precision qualifiers are added for code portability with OpenGL ES, not for functionality. They have the
same syntax as in OpenGL ES, as described below, but they have no semantic meaning, which includes no
effect on the precision used to store or operate on variables.
Only Catalyst drivers require precision qualifiers in fragment shaders. But maybe even that will change when OpenGL 3.1 support arrives.

Alfonse Reinheart
08-04-2009, 10:38 AM
nVidia have already released OpenGL 3.2 drivers.

Beta drivers don't count.

kRogue
08-04-2009, 10:44 AM
I am glad that NV_depth_clamp finally got into core (and an ARB backport extension too!). And we have geometry shaders as well.

Too bad that GL_EXT_separate_shader_objects did not make it into core in some form (and there is a bit that is kind of troubling: in it one writes to gl_TexCoord[], but GL 3.2 says that is deprecated, so to use GL_EXT_separate_shader_objects does one need to create a compatibility context?).

Also a little perverse is that context creation now has another parameter, so now we have:

forward compatible GLX_CONTEXT_FORWARD_COMPATIBLE_BIT_ARB for the attribute GLX_CONTEXT_FLAGS_ARB
and
GLX_CONTEXT_CORE_PROFILE_BIT_ARB/GLX_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB for the attribute GLX_CONTEXT_PROFILE_MASK_ARB
(and similarly under windows).

I also have the fear that for each version of GL, the description of GLX_ARB_create_context/GLX_ARB_create_context_profile will grow, i.e. a case for each GL version >= 3.0 (shudders)
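
For the record, a minimal sketch of what that attribute soup looks like on the GLX side (WGL is analogous); it assumes glXCreateContextAttribsARB was fetched with glXGetProcAddress and that display and fbconfig are already set up:

static const int attribs[] = {
    GLX_CONTEXT_MAJOR_VERSION_ARB, 3,
    GLX_CONTEXT_MINOR_VERSION_ARB, 2,
    GLX_CONTEXT_FLAGS_ARB,         GLX_CONTEXT_FORWARD_COMPATIBLE_BIT_ARB,
    GLX_CONTEXT_PROFILE_MASK_ARB,  GLX_CONTEXT_CORE_PROFILE_BIT_ARB,
    None
};
GLXContext ctx = glXCreateContextAttribsARB(display, fbconfig,
                                            NULL /* no share context */,
                                            True /* direct rendering */,
                                            attribs);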


So now I dream of:
1) direct state access for buffer objects, i.e. killing off bind to edit buffer objects.
2) decoupling of filter and texture data and a direct state access API to go with it.
3) nVidia's bindless API: GL_NV_shader_buffer_load and GL_NV_vertex_buffer_unified_memory

nVidia's bindless graphics is pretty sweet IMO.

Jon Leech (oddhack)
08-04-2009, 11:28 AM
I also have the fear that for each version of GL, the description of GLX_ARB_create_context/GLX_ARB_create_context_profile will grow, i.e. a case for each GL version >= 3.0 (shudders)

Well, we kinda had to introduce an enabling mechanism to select the profile, once we introduced profiles. If you don't specify a profile or version or context flags in the attribute list, then something (hopefully) reasonable still happens: you get the core profile of the latest version supported by the driver. There is some redundancy in that a forward-compatible compatibility profile is exactly equivalent to a compatibility profile, since nothing is deprecated from the compatibility profile, but I think the attributes all make sense.

We don't have any more profiles under discussion, if that's a concern. Conceivably OpenGL ES and OpenGL could eventually merge back together through the profile mechanism, but that's a long way into the future if it happens at all.

Jon Leech (oddhack)
08-04-2009, 11:42 AM
So in the GLSL 1.50 spec the precision predeclaration for the float type should have been highlighted in magenta.

To speak somewhat in our own defense, the GL and GLSL specs are huge, complex documents that go through many, many revisions during the process of creating a new version. John and I try to keep the change markings sensible but it's not going to be a 100% error free process; they're mostly just a guideline to finding changed stuff, to accompany the new feature summaries. We don't mark the places where a lot of text is removed with strikethroughs, for example. Our focus is on getting the new spec out on schedule, so some stuff like this is always going to fall through the cracks.

kRogue
08-04-2009, 12:35 PM
I see the use/reason for having forward compatible and profile selection (i.e. core vs. compatibility) separate, as they address 2 different issues, but those issues overlap (at this point a great, great deal)... and can you imagine the mess that occurs when trying to teach this stuff to people new to GL? They will scream that one is splitting hairs.

But I am really surprised that bind-free editing of buffer objects didn't make it in. I figured that it would have been a kind of no-brainer to specify and for driver writers to deal with: something as simple as, for each buffer object function glSomeBufferFunction, a new function glSomeBufferFunctionNamed which takes one extra parameter, a GLuint naming the buffer object; ditto for the texture stuff too.

One point of irony for me right now with the deprecation (or compatibility-profile-only status) is the naming of interpolators in shaders: I prefer, in the language of GL_EXT_separate_shader_objects, "rendezvous by resource" to the current "rendezvous by name". The argument that removing an interpolator at link time helps optimization I have always found kind of thin (especially since for each tuple of (vertex,fragment) or (vertex,geometry,fragment) shaders one had to have a different program)... does anyone have some reasonable real-world examples where one leaves an interpolator present even though it would get optimized out?


Just out of curiosity, is there active movement on:
1) direct state access
2) separation of texture data from filtering method
?

I am starting to think, given that 2) is quite hairy to do well, we won't see anything like that till GL 4.0...

Also, I am really, really overjoyed that the GL API is now being updated more regularly - is that pace expected to continue? (I suspect that the pace of updates will slow once GL core exposes all the features that D3D exposes at that time, and then there will only be big updates at new GFX card generations and minor updates making things "cleaner".)

Jon Leech (oddhack)
08-04-2009, 12:46 PM
I see the use/reason for having forward compatible and profile selection (i.e. core vs. compatibility) separate, as they address 2 different issues, but those issues overlap (at this point a great, great deal)... and can you imagine the mess that occurs when trying to teach this stuff to people new to GL? They will scream that one is splitting hairs.

They overlap but the use cases are very different. The FC context is really only intended as a futureproofing aid for developers, along the lines of the (otherwise so-far mythical) "debug context". Personally I wouldn't even tell someone new to GL about it for quite a while.

Separation of filters and samplers is on the agenda for a future release, yes. DSA has been brought up as well, though it is perhaps less far along in terms of being seen as desirable by everyone.

Eosie
08-04-2009, 01:06 PM
glTexImage being async comes at a cost - it will most likely create a copy of your image data right away. And this is what I want to avoid.
You can't avoid the copy with glBufferData. The data MUST be stored in page-locked system memory (to make sure your operating system won't move it) before it can be asynchronously transferred to the GPU. There is only one way to access this memory directly and so avoid the copy - glMapBuffer{Range}. The same applies to textures: use a PBO. The buffer object represents both a GPU buffer and a corresponding buffer in page-locked system memory.

If I understand correctly, you want to have a fine-grained control over PCIe transfers. In CUDA it's easy, you can use cudaMemcpyAsync and place a fence (called "event" in CUDA) after that, and then ask whether all preceding commands have ended. Because you have no such control in OpenGL, it's hard to give advice. ARB_copy_buffer might help here but I am not sure. As someone already said, "only driver guys can tell".

kRogue
08-04-2009, 01:14 PM
... Personally I wouldn't even tell someone new to GL about it for quite a while.


That is my point, someone new to GL would say "What the?" But it is quite awkward if you look at what effective practice was:

GL3.0: request forward compatible context to make sure one was not using naughty deprecated stuff, mostly fixed function pipeline.

GL3.1: ditto, but if you did not request a forward compatible context then expect ARB_compatibility to be present; chances are you then need code to check whether it is there. Often the context creation code is buried in some platform-dependent monster that nobody likes to read - worse for cross-platform code.

GL3.2: now be aware of the difference between deprecated features and compatibility features; in theory a twisted one could request a forward compatible context and request a compatibility profile (shudders).

hopefully, no more new context creation attributes will be added.

On a related note to my post (but not to GL 3.2) reading GL_EXT_separate_shader_objects, I found something that I thought was just plain *bad*



16. Can you use glBindFragDataLocation to direct varying output
variables from a fragment shader program created by
glCreateShaderProgramEXT to specific color buffers?

UNRESOLVED:

Tenative resolution: NO for much the same reason you can't do
this with attributes as described in issue 15. But you could
create the program with the standard GLSL creation process where
you attach your own shaders and relink.

For fragment shader programs created with
glCreateShaderProgramEXT, there is already the gl_FragData[]
builtin to output to numbered color buffers. For integer
framebuffers, we would need to add:

varying out ivec4 gl_IntFragData[];

User-defined output fragment shader varyings can still be used
as long as the application is happy with the linker-assigned
locations.


my thoughts on that are *ick*... since if one is doing MRT, then you would possibly need to call glDrawBuffers on every shader switcheroo... ick... it would be better if they just added a pragma-like interface to fragment shaders:



pragma(out vec4 toFrag0, 0)
pragma(out ivec4 toFrag1, 1)

or extend the layout() deal in fragment shaders:



layout(fragment_output, 0) out vec4 toFrag0;
layout(fragment_output, 1) out ivec4 toFrag1;

and along those lines one could then use that kind of mentality on interpolators, i.e in vertex shaders:



layout(vertex_output, 0) out vec4 myValue;
layout(vertex_output, 1) out flat myFlatValue;


and in geometry shaders:



layout(geometry_input, 0) in vec4 myValue;
layout(geometry_input, 1) in vec4 myFlatValue;

layout(geometry_output, 0) out vec4 myValueForFragging;


and then even in fragment shaders:



layout(fragment_input, 0) in vec4 myValueForFragging;

but on closer inspection since out/in qualifier is already there, it can all be collapsed to:



layout(location, N) in/out [flat, centroid, etc] type variable_name;


The sweet part of this being that one can then dictate where attributes and interpolators are; the lazy could even skip calling glBindAttribLocation...

This comment probably would be best in suggestion for next release or some kind of GL_EXT_separate_shader_objects thread.

Jon Leech (oddhack)
08-04-2009, 01:31 PM
GL3.2: now be aware of the difference between deprecated features and compatibility features; in theory a twisted one could request a forward compatible context and request a compatibility profile (shudders).

You can do that, but as previously noted, nothing is deprecated from the compatibility profile, so a forward-compatible compatibility profile is exactly the same thing as a non-FC CP :-)

I would be a little surprised if any of the remaining deprecated-but-not-removed features in the core profile are actually removed from core anytime in the near future. It's challenging enough dealing with the number of options we have today.

AlexN
08-04-2009, 01:46 PM
...
On a related note to my post (but not to GL 3.2) reading GL_EXT_separate_shader_objects, I found something that I thought was just plain *bad*

[quote]
16. Can you use glBindFragDataLocation to direct varying output
variables from a fragment shader program created by
glCreateShaderProgramEXT to specific color buffers?

UNRESOLVED:

Tenative resolution: NO for much the same reason you can't do
this with attributes as described in issue 15. But you could
create the program with the standard GLSL creation process where
you attach your own shaders and relink.

For fragment shader programs created with
glCreateShaderProgramEXT, there is already the gl_FragData[]
builtin to output to numbered color buffers. For integer
framebuffers, we would need to add:

varying out ivec4 gl_IntFragData[];

User-defined output fragment shader varyings can still be used
as long as the application is happy with the linker-assigned
locations.


my thoughts on that are *ick*... since if one is doing MRT, then you would possibly need to call glDrawBuffers on every shader switcheroo...

...

This comment probably would be best in suggestion for next release or some kind of GL_EXT_separate_shader_objects thread.


I don't like this, either. The previous issue, about vertex input attribute binding locations, has the same problem. I'm used to binding my vertex attributes to known locations when I load a shader, to avoid both excessive vertex array rebinding and to remove the need to query and store the locations of every attribute, for every shader...

Brolingstanz
08-04-2009, 02:48 PM
A big sloppy kiss for the quick ref cards! :)

Alfonse Reinheart
08-04-2009, 04:40 PM
I don't like this, either.

Which is why this is an nVidia extension (despite the EXT name) and not the eventual solution to this problem. It tries to solve the problem by making OpenGL pretend to do things the D3D way, but without realizing that OpenGL has issues with doing things that way.

Scribe
08-04-2009, 04:47 PM
By the way, does anyone know if glu.h is OK with the gl3.h header file? I know some of GLU is no longer relevant to OpenGL 3.2; is it compatible at all with a forward-compatible context?

Many thanks

GeLeTo
08-04-2009, 05:04 PM
You can't avoid the copy with glBufferData...
There is only one way to access this memory directly and so to avoid the copy - glMapBuffer....
If I understand correctly, you want to have a fine-grained control over PCIe transfers...
No, I don't want that. I want a modification of glBufferData/glBufferSubData that will not copy the data right away but will return immediately; the driver will then wait for the GPU buffer to be available before DMA-copying the data from my original pointer. When this is done the driver will use the new sync object API to signal that the data is now safe to delete or modify. If the app at some point wants the data to be available for deletion or modification before the copying is done, it can force this by calling ClientWaitSync.

See my first two posts in this thread and Jon Leech's answers for more details on pages 3 and 4.

Rob Barris
08-04-2009, 06:30 PM
You can't avoid the copy with glBufferData...
There is only one way to access this memory directly and so to avoid the copy - glMapBuffer....
If I understand correctly, you want to have a fine-grained control over PCIe transfers...
No, I don't want that. I want a modification of glBufferData/glBufferSubData that will not copy the data right away but will return immediately; the driver will then wait for the GPU buffer to be available before DMA-copying the data from my original pointer. When this is done the driver will use the new sync object API to signal that the data is now safe to delete or modify. If the app at some point wants the data to be available for deletion or modification before the copying is done, it can force this by calling ClientWaitSync.


Are you not able to accomplish what you want to do using MapBufferRange? Pay careful attention to the extra mapping options it provides that are not available to the original MapBuffer call.
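
For anyone following along, a minimal sketch of the MapBufferRange options Rob is referring to (offset and size are placeholders); GL_MAP_UNSYNCHRONIZED_BIT can additionally be set to skip the implicit wait entirely, at which point ordering becomes the application's problem (e.g. via ARB_sync fences):

glBindBuffer(GL_ARRAY_BUFFER, vbo);
void *p = glMapBufferRange(GL_ARRAY_BUFFER, updateOffset, updateSize,
                           GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_RANGE_BIT);
memcpy(p, newVertexData, updateSize);      /* write only the changed range */
glUnmapBuffer(GL_ARRAY_BUFFER);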

Heiko
08-05-2009, 12:38 AM
A big sloppy kiss for the quick ref cards! :)

Yes, I noticed them as well. Very nice! I've been thinking about printing them in full colour and sealing them.

mfort
08-05-2009, 12:52 AM
Nice. Please also make a RefCard for the Core profile.

GeLeTo
08-05-2009, 01:38 AM
Are you not able to accomplish what you want to do using MapBufferRange? Pay careful attention to the extra mapping options it provides that are not available to the original MapBuffer call.

No.
I am moving this discussion here:
http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=261827

Jan
08-05-2009, 02:13 AM
... it would be better if they just added a pragma-like interface to fragment shaders:



pragma(out vec4 toFrag0, 0)
pragma(out ivec4 toFrag1, 1)

or extend the layout() deal in fragment shaders:



layout(fragment_output, 0) out vec4 toFrag0;
layout(fragment_output, 1) out ivec4 toFrag1;

and along those lines one could then use that kind of mentality on interpolators, i.e in vertex shaders:



layout(vertex_output, 0) out vec4 myValue;
layout(vertex_output, 1) out flat myFlatValue;


and in geometry shaders:



layout(geometry_input, 0) in vec4 myValue;
layout(geometry_input, 1) in vec4 myFlatValue;

layout(geometry_output, 0) out vec4 myValueForFragging;


and then even in fragment shaders:



layout(fragment_input, 0) in vec4 myValueForFragging;

but on closer inspection since out/in qualifier is already there, it can all be collapsed to:



layout(location, N) in/out [flat, centroid, etc] type variable_name;


The sweet part of this being that one can then dictate where attributes and interpolators are; the lazy could even skip calling glBindAttribLocation...

This comment probably would be best in suggestion for next release or some kind of GL_EXT_separate_shader_objects thread.



I fully agree !!!

The suggested method to bind attributes to locations from within the shader is an extremely good idea.

This will make writing GL applications so much easier AND it would bring us a big step forward on the "binary blob" issue.

A central part of my GL code is either inefficient or wastes a lot of memory caching all those states for attribute locations, uniform locations and such. I automatically build shader combos, which generates a few hundred versions of a shader, and all of those need to be kept track of.
Often I simply rebind everything, just to make sure, because it would be too complex to keep track of everything.

I'd say the current method could be kept as a fallback, but allowing the shader writer to fix locations would be great. And GL could detect when all locations are fixed and then allow querying for the shader's binary blob.

Also, an engine could reject a shader where someone forgot to fix some attribute's location.

Well, thinking a bit more about it, it is a very complicated issue. For example you will definitely have a problem when you want to do this with uniforms, because an engine could provide so many different uniforms, that when their locations are fixed, some shader will always need some, that would clash.

Jan.

Khronos_webmaster
08-05-2009, 05:14 AM
Would folks be interested in purchasing a pre-printed and plasticized Quick Reference Card? If so how much would be considered the right price.

The PDF would remain free to download, of course.

Aleksandar
08-05-2009, 05:29 AM
Has anyone tried to program using the new NV drivers? I have a problem with them (190.56 beta for XP32): when I try to bind a VAO (e.g. glBindVertexArray(m_vaoID[0]);), the CPU is 100% utilized (both cores on a C2D CPU) and the program hangs. It happens even if I use a GL 3.0 or 3.1 rendering context. Until these drivers everything worked perfectly using a GL 3.1 forward-compatible context. Any clue?

Heiko
08-05-2009, 07:13 AM
Would folks be interested in purchasing a pre-printed and plasticized Quick Reference Card? If so how much would be considered the right price.

The PDF would remain free to download, of course.

Perhaps... I did a quick lookup of prices for full-colour plasticized prints at a copy shop (assuming double-sided prints to save on plasticizing costs) and that came to a total of 20 euros. Of course this can be done more cheaply if mass produced... and they have to be shipped as well (not a big package, but still).

So I assume for something like 10-15 dollars/euros including shipment I would be interested.

I agree with rsp91 (below) that a reference card with only the OpenGL 3.2 core and GLSL 1.50 core functions on it would be even more appreciated. Of course, this means less pages, but I'd still buy it for about 10 euros.

rsp91
08-05-2009, 07:14 AM
Would folks be interested in purchasing a pre-printed and plasticized Quick Reference Card? If so how much would be considered the right price.

The PDF would remain free to download, of course.
I'm not interested in the quick ref card at all until there is one without the fixed pipeline functions on it... Other than that it looks good and I'd be willing to get one in real life as well as some other nerd gear like an OpenGL baseball cap or something...

Scribe
08-05-2009, 07:23 AM
Would folks be interested in purchasing a pre-printed and plasticized Quick Reference Card? If so how much would be considered the right price.

The PDF would remain free to download, of course.
I'm not interested in the quick ref card at all until there is one without the fixed pipeline functions on it... Other than that it looks good and I'd be willing to get one in real life as well as some other nerd gear like an OpenGL baseball cap or something...

I have similar feelings. I just had the OpenGL programming guide for 3.0 and 3.1 delivered today, and it's even thicker, filled with the crap from 2.x (which I already have a book for). Why is anyone new to OpenGL going to want to purchase a 3.x book when, if they really wanted to learn 2.x, they could buy an old edition? I feel a little ripped off. It's bloated and difficult to navigate. I feel that, like the API itself, the Quick Reference card and the programming guides should not include any of the old functionality; there's no logic to it.

What's worse is that I've had the book on pre-order for weeks, and they dispatched it the day the 3.2 spec was released - that's terrible =S I'll end up having to buy the 3.2 book in no time. So please, at least remove the crap and slim it down so that it's cheaper!

Groovounet
08-05-2009, 08:14 AM
The quick reference card includes all the crap, but in a different color! I actually really like having it included, to see the functions I don't want to use. I was surprised to notice that texture priority is deprecated; without this quick card I still wouldn't know about that!

However, besides the goodies and the geeky side of having it printed, I don't see the point of the idea. I don't like loose paper on my desk and I have a third screen for exactly that purpose: the documentation! When it comes to documentation I like "references" on the screen, copy/paste, and "deep" articles in books.

So instead of the redbook I would prefer the OpenGL reference pages updated to OpenGL 3.2. The quick reference is good too.

To be fair, my first documentation is my own code; it gives an instance of use of a function, a concept - that's the best!

Alfonse Reinheart
08-05-2009, 10:53 AM
Would folks be interested in purchasing a pre-printed and plasticized Quick Reference Card?

I'd like to see a version of the Quick Reference card that was OpenGL 3.2 Core.

Jon Leech (oddhack)
08-05-2009, 12:20 PM
I'm not interested in the quick ref card at all until there is one without the fixed pipeline functions on it... Other than that it looks good and I'd be willing to get one in real life as well as some other nerd gear like an OpenGL baseball cap or something...

So maybe a baseball cap with the quick ref guide printed on it? :-)

rsp91
08-05-2009, 12:44 PM
I'm not interested in the quick ref card at all until there is one without the fixed pipeline functions on it... Other than that it looks good and I'd be willing to get one in real life as well as some other nerd gear like an OpenGL baseball cap or something...

So maybe a baseball cap with the quick ref guide printed on it? :-)

Now we're talking.

kRogue
08-05-2009, 12:55 PM
Well, thinking a bit more about it, it is a very complicated issue. For example you will definitely have a problem when you want to do this with uniforms, because an engine could provide so many different uniforms, that when their locations are fixed, some shader will always need some, that would clash.


My take is that for uniforms you won't be able to bind the location a priori from the shader source. The simple reason being that not all uniforms are vec4s (I don't think drivers collapse 2 vec2 interpolators into one vec4, so the locations of ins and outs are kind of easy). Getting the location binding of uniforms correct by hand is a really, really nasty business when alignment comes into play, not to mention inadvertent aliasing if you get the sizing wrong. (UBOs have a packing format and the rules can get icky in the details.)

Also worth noting: in GL_EXT_separate_shader_objects the scope of uniforms is _per_ shader, i.e. if one has a uniform named myUniform declared in both a fragment and a vertex shader, in GL_EXT_separate_shader_objects they refer to different variables (and thus if they are to mean the same thing, both need to be set).

Brolingstanz
08-05-2009, 01:46 PM
I like the QR baseball cap idea, maybe even a quick ref t-shirt or a raincoat.

As far as price goes I'd be willing to toss my kids in for it. Heck I'll give them to you without anything in return. ;-)

Eric Lengyel
08-05-2009, 06:07 PM
I understand why most of the deprecated features in OpenGL 3.x were deprecated, but there are a few that don't make sense to me. Could someone on the inside please explain why the following features were ripped out? The only plausible explanation seems to be that these features aren't in DX10, so they were dropped from OGL3 just for parity. I know that nothing has actually been removed from OGL3 -- I'm just looking for the reasons that came up when the ARB decided to put the following items on the deprecated list.

1) Alpha test. Yes, we can put kill/discard statements in our fragment shaders, but alpha test has been in hardware forever and is faster than adding shader instructions. This feature also presents a negligible burden on driver developers.

2) Quads. These are extremely useful for all kinds of things, and they're supported directly by the setup hardware. Yes, I know the hardware splits quads into two triangles internally, but being able to specify the GL_QUADS primitive saves us from either (a) having to add two more vertices to the four needed for each quad, or (b) adding an unnecessary index array to the vertex data for a list of disjoint quads. This feature is also trivial for driver writers to implement.

3) Alpha, luminance, luminance/alpha, and intensity formats. These are very useful for specifying 1-2 channel textures! But the most important reason to keep these is that the hardware has remapping circuitry in the texture units that's independent of the shader units, so a shader doesn't have to be modified in order to work with an RGBA texture or a LA texture. If these formats can't be used, then two separate shaders would be necessary: one that operates on an RGBA sample, and another that reads an R or RG sample and then swizzles/smears to get the proper result.

mfort
08-05-2009, 10:39 PM
3) Alpha, luminance, luminance/alpha, and intensity formats. These are very useful for specifying 1-2 channel textures! But the most important reason to keep these is that the hardware has remapping circuitry in the texture units that's independent of the shader units, so a shader doesn't have to be modified in order to work with an RGBA texture or a LA texture. If these formats can't be used, then two separate shaders would be necessary: one that operates on an RGBA sample, and another that reads an R or RG sample and then swizzles/smears to get the proper result.


The solution to this is using R or RG textures with GL_EXT_texture_swizzle.
IMO, once this extension is in core we don't need I, IA, or A textures.
It would be nice to promote this extension to core one day.
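
A minimal sketch of that emulation with EXT_texture_swizzle (texture names are placeholders); the shader keeps sampling .rgba and never needs a second variant:

/* LUMINANCE from a single-channel R texture: replicate R into RGB, alpha = 1. */
GLint lumSwizzle[4] = { GL_RED, GL_RED, GL_RED, GL_ONE };
glBindTexture(GL_TEXTURE_2D, luminanceTex);
glTexParameteriv(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_RGBA_EXT, lumSwizzle);

/* ALPHA from a single-channel R texture: RGB read as 0, R feeds alpha. */
GLint alphaSwizzle[4] = { GL_ZERO, GL_ZERO, GL_ZERO, GL_RED };
glBindTexture(GL_TEXTURE_2D, alphaTex);
glTexParameteriv(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_RGBA_EXT, alphaSwizzle);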

Jan
08-06-2009, 12:59 AM
Quick Ref Cards:

I wouldn't pay money for such a thing in general. The idea as a PDF is really nice. The fact that it includes the GL2.1 [censored] is sooo GL 3.0....

Deprecated stuff:

I am actually happy to see 1) and 3) go away. One less state to manage, the shader clearly says what it does (discard or not), and I always hated L, I, LA, etc. textures, because they were a solution to the FFP syndrome. Also, we still have more than enough texture formats left.

Quads are a thing that I would like to see included again; they are indeed useful.

Jan.

memory_leak
08-06-2009, 02:24 AM
Why have it as a PDF... pleeease - put it in HTML on the OpenGL website.

As HTML I can read it in the browser, cross-navigate links, use +/- to simply zoom in and out, bookmark pages, and have several pages open in the same window (at least in the Opera browser)... and did I mention easily following links? I know I can follow links from Acrobat Reader, but it is such a pain in the a*s, and I can't have two pages open at the same time (unless they follow one another). I really can't understand why people are monkeying about with PDFs when HTML is so much more user-friendly. As for paper... it is such a waste for the environment - realize it. In a few months you will want a new spec again... bläh. Not to mention that such literature is best read when you need it - and that is when you code... If you need something to do in bed, get yourself a wife *coff coff* /* just a joke - hope you don't mind :-) */

plasmonster
08-06-2009, 12:55 PM
Thanks Khronos for a really nice update!

Agree with Eric and Jan - quads will be sorely missed.

Alfonse Reinheart
08-06-2009, 01:06 PM
Why have it as a PDF ... pleeease, put it in HTML on the opengl.org website.

You can't style HTML like that PDF. It also wouldn't be anywhere near as printable.

Scribe
08-06-2009, 01:19 PM
I understand why most of the deprecated features in OpenGL 3.x were deprecated, but there are a few that don't make sense to me. Could someone on the inside please explain why the following features were ripped out? The only plausible explanation seems to be that these features aren't in DX10, so they were dropped from OGL3 just for parity. I know that nothing has actually been removed from OGL3 -- I'm just looking for the reasons that came up when the ARB decided to put the following items on the deprecated list.

1) Alpha test. Yes, we can put kill/discard statements in our fragment shaders, but alpha test has been in hardware forever and is faster than adding shader instructions. This feature also presents a negligible burden on driver developers.

2) Quads. These are extremely useful for all kinds of things, and they're supported directly by the setup hardware. Yes, I know the hardware splits quads into two triangles internally, but being able to specify the GL_QUADS primitive saves us from either (a) having to add two more vertices to the four needed for each quad, or (b) adding an unnecessary index array to the vertex data for a list of disjoint quads. This feature is also trivial for driver writers to implement.

3) Alpha, luminance, luminance/alpha, and intensity formats. These are very useful for specifying 1-2 channel textures! But the most important reason to keep these is that the hardware has remapping circuitry in the texture units that's independent of the shader units, so a shader doesn't have to be modified in order to work with an RGBA texture or a LA texture. If these formats can't be used, then two separate shaders would be necessary: one that operates on an RGBA sample, and another that reads an R or RG sample and then swizzles/smears to get the proper result.


I'll attempt to answer some of this... Basically, OpenGL 3.x is geared towards Shader Model 4.0 and above hardware. It is not here to cater to old hardware with non-programmable components. As you suggest, this is to keep up with the performance of DirectX and to allow the API to better model itself around a programmable pipeline. For old hardware you'll just have to continue using OpenGL 2.1 with extensions, which makes sense really; you're not exactly losing out.

Taking the above into account, this explains the removal of alpha blending: many people will use their own custom techniques in shaders that require data to be passed differently, etc. Alpha blending is easy to emulate in shaders, and fully programmable hardware has no fixed support for it, so there's no performance loss. Given that, it's a pain for developers of new hardware/OSes to have to implement something like alpha blending that simply makes life easier for 10% of their coders in order to achieve OpenGL certification. If those developers see this as a pain, they won't adopt OpenGL, and that's bad for everyone.

In regards to quads, again it's a pain for OS developers when, from your point of view, you simply need to use GL_TRIANGLE_STRIP, keeping the same number of vertices and only slightly adjusting their ordering for the strip to be drawn correctly.

In regards to texture formats: yes, they are quite simple to implement, and maybe this was jumping the gun, but as others have said there are still plenty of extensions or alternatives that can be used in place of those dropped.

Hopefully this explains a bit of the considerations that may have been involved in dropping these features.

Eric Lengyel
08-06-2009, 03:35 PM
this explains the removal of alpha blending: many people will use their own custom techniques in shaders that require data to be passed differently, etc. Alpha blending is easy to emulate in shaders, and fully programmable hardware has no fixed support for it, so there's no performance loss.

You don't know what you're talking about, and you're speaking to a long-time OpenGL expert as if he's some ignorant newbie. (And you seem to have some confusion between alpha testing and alpha blending.) All modern hardware still has explicit support for alpha testing that's independent of shaders. For example, in the G80+ architecture, the alpha test is accessed through hardware command registers 0x12EC (enable), 0x1310 (reference value, floating-point), and 0x1314 (alpha function, OpenGL enumerant). There is a small decrease in performance if you use discard in simple shaders instead of using the alpha test. (Although for long shaders, there is sometimes an advantage to using discard instead of the alpha test because subsequent texture fetches can be suppressed for the fragment if there aren't any further texture fetches that depend on them.) I think it's a mistake to remove access to a hardware feature that actually exists and is useful.


In regards to quads, again it's a pain for OS developers when, from your point of view, you simply need to use GL_TRIANGLE_STRIP, keeping the same number of vertices and only slightly adjusting their ordering for the strip to be drawn correctly.

Again, you don't know what you're talking about. Triangle strips cannot be used to replace quads that aren't connected to each other.
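Just to spell out the cost of option (b): for a list of disjoint quads, the core profile leaves you building something like the following extra index array (a sketch; 'quadCount' and the buffer bindings are assumed, and stdlib.h is needed for malloc/free):

/* Six triangle indices per quad (two triangles), for quads stored as */
/* four consecutive vertices each.                                    */
GLuint *indices = malloc(quadCount * 6 * sizeof(GLuint));
for (GLuint q = 0; q < quadCount; ++q) {
    GLuint v = q * 4;                    /* first vertex of this quad */
    GLuint *p = indices + q * 6;
    p[0] = v + 0;  p[1] = v + 1;  p[2] = v + 2;   /* first triangle  */
    p[3] = v + 0;  p[4] = v + 2;  p[5] = v + 3;   /* second triangle */
}
/* ...upload into a bound GL_ELEMENT_ARRAY_BUFFER, then: */
glDrawElements(GL_TRIANGLES, quadCount * 6, GL_UNSIGNED_INT, 0);
free(indices);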

Eric Lengyel
08-06-2009, 03:38 PM
3) Alpha, luminance, luminance/alpha, and intensity formats. These are very useful for specifying 1-2 channel textures! But the most important reason to keep these is that the hardware has remapping circuitry in the texture units that's independent of the shader units, so a shader doesn't have to be modified in order to work with an RGBA texture or a LA texture. If these formats can't be used, then two separate shaders would be necessary: one that operates on an RGBA sample, and another that reads an R or RG sample and then swizzles/smears to get the proper result.


The solution to this is to use R or RG textures with GL_EXT_texture_swizzle.
IMO, once this extension is in core we won't need the I, LA and A formats.
It would be nice to promote this extension to core one day.


I agree. With the GL_EXT_texture_swizzle extension, the I, LA, and A formats are no longer necessary. But as you pointed out, this extension is not a core feature, so the problem I described still exists in OGL3. The proper solution would be to deprecate those texture formats *and* put the GL_EXT_texture_swizzle functionality in the core to avoid losing useful functionality.

Simon Arbon
08-07-2009, 12:22 AM
AMD_vertex_shader_tessellator and the DX11 tessellator both support the tessellation of quad patches.
But the core profile does not allow quads.
So does that mean that vendors are not allowed to provide a tessellation extension for the core profile?

Jan
08-07-2009, 01:14 AM
Good point!

I hope to see something like AMD_vertex_shader_tessellator in core (or ARB) in the near future. In that case quads are the foundation for quadpatches and thus will certainly be included again, anyway.

Jan.

Gedolo
08-07-2009, 04:12 AM
The new version is good news for OpenGL.
And I love the new features.

But there is still stuff that OpenGL is missing or can do better, in my opinion.
These are suggestions for improving OpenGL.
They are ideas and thoughts, nothing more.

Please add tessellation in the next version of OpenGL.
And quads and quad patches.

Remove the binding system; it's a horrible thing.
Add/enable atomic operations.



Use the version numbers in an API-oriented way!
What do I mean? Let me explain:

Normally the first number, the major version, signals a big change, a rewrite, or incompatibility.
The second signals many new features and bug fixes.
The third is a bug-fix, minor increment.
The fourth is a build number.

x.y.z.b

When there are API additions, increment y.
Only when incompatibility must arise, collect as much as possible and make a big leap: increment x.
Remove as much deprecated functionality as possible.


Following this logic, OpenGL would be:
OpenGL 3.2 => OpenGL 2.4
which is quite different from the current version scheme.

The ARB is currently using profiles for this, which is not a good idea because everything stays in the specification, thereby growing the specification without streamlining it.
The deprecation mechanism is a good improvement over having nothing. But there needs to be a clear cut-off.

Dump the compatibility profiles, get rid of them.
I have nothing against profiles in general.
They just don't work for this problem.
We just need to work with versions.

What about the legacy stuff, compatibility?
The specification should use major versions for this.
The major versions could be implemented side by side by drivers to provide compatibility versions.
Which versions are available is the driver creator's responsibility.

Every application asks for a certain OpenGL version,
and thereby asks for a certain context to run in.
The driver can check its versions and execute the program with the right one.
Support then depends on the driver, not on specification profiles any more.

Deprecated features don't have to be in a compatibility profile of the newest version, because the driver can implement an older version on which those applications run without complaining. The user and the applications won't notice a thing about removed deprecated features.

This allows the specification to remove deprecated functionality completely in newer major versions while keeping backwards compatibility!

The specification could state, for backwards compatibility, that earlier OpenGL versions can be provided by the driver, and contain a link pointing to those specifications.

This does a much better job than a compatibility profile, doesn't it?


For the current situation, a good version roadmap would be to continue the 3.y line,
add this version usage to the 2.x specification,
and start drafting a 4.y line with a new, revamped, clean and lean API.
(Let 4 spend a long, very long time in the development/draft process to iron everything out very well.)
The 4.y line can get:

- Cleaner API:
In the 4.y line remove deprecated features; no compatibility profile there,
because adding the 3.y context + compatibility profile in the drivers would serve as the compatibility profile/mode/version for 4.y.


- Leaner API:

e.g. (there are without a doubt a lot of other and/or bigger such improvements possible to the API) currently there is a command for cube maps and a command for seamless cube maps.
In OpenGL 4.y there would be only one command, for seamless cube maps (that one would get the shortest name; now it has a longer one), and only if really necessary should a command for cube maps with seams be added, with a longer name,
thus discouraging developers from using it.


What does everybody think of this idea to use major versions to provide backwards-compatibility AND API-cleanup?

Xmas
08-07-2009, 04:33 AM
All modern hardware still has explicit support for alpha testing that's independent of shaders.
Not true. But even if you had explicit support for alpha test in specific hardware, you could extract the necessary information from the shader code during compilation and convert a discard into fixed-function alpha test where it makes sense.

As hardware is becoming more and more programmable it is a good idea to get rid of state which might require the driver to modify shaders on-the-fly. Ideally the driver would know all state which may affect shader execution ahead of time, so that it can all be compiled into shader code.

Eosie
08-07-2009, 09:38 AM
It's not easy to get rid of alpha test. Hardware must support all the applications written using DX9 or GL1, where this feature is present and used extensively. Therefore, some hardware support is expected to be there for some time...

BTW, ATI R6xx-R7xx cards (i.e. all recent ones) support alpha test too, see:
Radeon R6xx/R7xx 3D Register Reference Guide (http://www.x.org/docs/AMD/R6xx_3D_Registers.pdf), page 126
Radeon R6xx/R7xx Acceleration (http://www.x.org/docs/AMD/R6xx_R7xx_3D.pdf), page 7, section 2.1.4

Alfonse Reinheart
08-07-2009, 11:08 AM
Larrabee won't.

I get your point with alpha test. And personally, I would have kept both alpha test and quads. But the reason for removing them is to make it that much easier for future hardware that won't have this kind of fixed-function thing.

OpenGL has always been an odd compromise between what happened before, what is now, and what things are trying to be in the future. That's why GLSL 1.30 uses the in/out terminology rather than attributes and other stage-dependent terms.

If some future hardware that doesn't support alpha test has to modify your shader every time you turn it on/off, you'd rather it was embedded in the shader to begin with.

Now personally, I would have done it by actually embedding it in the shader. That is, at link time, you can specify alpha test parameters. That way, it will always work like that, and the implementation can implement it in the most efficient way possible.
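Purely as an illustration, a fictive entry point along those lines might look like this (nothing like it exists in any GL specification; the function name is made up):

/* FICTIVE API: not part of any OpenGL spec. Bake the alpha test into the */
/* program object at link time, so the implementation can either fold it  */
/* into the generated shader code or use fixed-function hardware.         */
glProgramAlphaTestXXX(program, GL_TRUE, GL_GREATER, 0.5f); /* enable, func, ref */
glLinkProgram(program);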

Eric Lengyel
08-07-2009, 03:55 PM
All modern hardware still has explicit support for alpha testing that's independent of shaders.
Not true.

Why do you feel you are qualified to tell me that my statement is not true? Do you write drivers for Nvidia or AMD? We have now given you the actual hardware register numbers where alpha test is explicitly supported in the latest chips from both Nvidia and AMD, so you are obviously wrong. I know what I'm talking about, but you're just making claims that you can't back up.

Jan
08-08-2009, 01:25 AM
Eric, calm down. The people who know you know that you are right. But this is still a forum, so people with different levels of knowledge come together, and not everybody knows who you are and what you do, and thus doesn't know how seriously to take your claims.

Scribe really just wanted to help out, and maybe Xmas had some contradicting information from some source, too. I'm pretty sure they were quite surprised by your harsh reply.

Now back to business.

Jan.

kRogue
08-08-2009, 03:56 AM
I have one comment about the alpha testing deal: considering that NVIDIA (and I imagine ATI too) will support the compatibility profile, you do get alpha test back in hardware, though ironically one has to use a compatibility profile to access a hardware feature.... I feel bad for new hardware vendors when they do a desktop GL driver: they almost have two standards to deal with.

As for killing off alpha test in GL 3.x... my bet is that it is because it is not in GLES2. That might seem like an odd reason, but perhaps there are some long-term dream goals of unifying all the GLs we have (the list below is oversimplified too):
1. Desktop GL: 1.x/2.x, 3.x core, 3.x compatibility
2. GLES: GLES 1.x, GLES 2.x
3. Safety Critical GL

though I cannot imagine how one could ever get GL 3.x to play with GL SC.

Gedolo
08-08-2009, 04:10 AM
This post adds to my previous post:
Feedback on OpenGL 3.2 (http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=261969#Post261969)


The same numbering problems apply to the OpenGL ES family.
(And other OpenGL stuff in general.)

(This best practice also applies to standards in general.)

Don't use the version numbers for differences in functionality.
What is going to happen if there is a need for OpenGL ES 1.y to get a major overhaul? Continuing the numbering won't be possible. Using profiles would be the way to go here!

OpenGL ES 2.y can stay the same.

A new version of OpenGL ES, presumably 3.y, could in a major revision get reduced functionality compared to OpenGL but exactly the same syntax, and would become a subset of OpenGL.

Backwards-compatibility concerns?
Drivers could provide OpenGL ES 2.y and 3.y side-by-side.
Allowing legacy and new programs to run without problems.

A programmer writing OpenGL-code could check if his code is also completely covered in OpenGL ES,
making porting very easy if there is enough scope in ES.

OpenGL ES 1.y would be better off with an extra tag:
e.g. OpenGL ES FF (Fixed Function), or something like that.


Hardware vendors can be happy with this. Why?
Because they can slap more stickers on their product and give the impression it works with more stuff, does more.

e.g. a typical OpenGL graphics card would also get a sticker for OpenGL ES.

e.g. a graphics card could be compliant with:
OpenGL 3.y, OpenGL 4.y, OpenGL ES 3.y, WebGL and OpenCL.

That's a lot more than just OpenGL, WebGL and OpenCL.


----------------

The binding system and the (bad/non/mis-use of the) numbering scheme are currently the two biggest problems of OpenGL.

arekkusu
08-08-2009, 07:58 AM
Eric, Xmas's point is valid. Nvidia and AMD are not the only companies making "modern" hardware. Consider embedded hardware and ES 2.0.

Alfonse Reinheart
08-08-2009, 12:25 PM
Eric, Xmas's point is valid. Nvidia and AMD are not the only companies making "modern" hardware. Consider embedded hardware and ES 2.0.

True, but OpenGL does not run on the hardware that OpenGL ES is implemented on and vice-versa. The whole point of having two separate specifications is to allow each to best serve the needs of their clients.

Alpha test is available on all desktop graphics hardware. Regular OpenGL is meant to serve desktop graphics hardware. Therefore, it should expose it. Alpha test may not be available on certain embedded systems hardware. OpenGL ES is meant to serve the needs of embedded systems. Thus, it makes sense for OpenGL ES to not support it.

Joining OpenGL and OpenGL ES is a bad idea so long as the hardware they support have substantive differences. It's one thing to have API similarity, such that similar functionality on one works in the same way on the other. Sharing GLSL is an example. But limiting functionality on one because of the other is just a terrible idea.

bobvodka
08-08-2009, 02:30 PM
Good point!
I hope to see something like AMD_vertex_shader_tessellator in core (or ARB) in the near future. In that case quads are the foundation for quadpatches and thus will certainly be included again, anyway.


If this happens then it would have to be part of some greater tessellation-shader type setup anyway; the AMD tessellator and what is coming in D3D11 hardware are not the same thing, and the currently exposed AMD extension covers 1/3 of the functionality in terms of pipeline stages.

As a side note: can anyone confirm GL 3.1 support from AMD/ATI in the recent Cat drivers, and if so on what OS?
I'm currently unable to confirm this using the Cat 9.8 'beta' drivers, nor the Cat 9.7 or Cat 9.6, on Win7 x64 (GL Extension Viewer reports 2.1 and 3.1 forward-compatible, GL Caps Viewer gives 2.1), while others are saying that they have support for 3.1.

(Also, the text input box is HORRIBLY screwed when using IE8 on Win7, so much so that once I got past "what is coming in D3D11 hardware are not the same thing, the" I had to resort to finishing my post in Notepad because the text box kept jumping up and down as I typed.)

Scribe
08-08-2009, 03:03 PM
this explains the removal of alpha blending: many people will use their own custom techniques in shaders that require data to be passed differently, etc. Alpha blending is easy to emulate in shaders, and fully programmable hardware has no fixed support for it, so there's no performance loss.

You don't know what you're talking about, and you're speaking to a long-time OpenGL expert as if he's some ignorant newbie. (And you seem to have some confusion between alpha testing and alpha blending.) All modern hardware still has explicit support for alpha testing that's independent of shaders. For example, in the G80+ architecture, the alpha test is accessed through hardware command registers 0x12EC (enable), 0x1310 (reference value, floating-point), and 0x1314 (alpha function, OpenGL enumerant). There is a small decrease in performance if you use discard in simple shaders instead of using the alpha test. (Although for long shaders, there is sometimes an advantage to using discard instead of the alpha test because subsequent texture fetches can be suppressed for the fragment if there aren't any further texture fetches that depend on them.) I think it's a mistake to remove access to a hardware feature that actually exists and is useful.


In regards to quads, again it's a pain for OS developers when, from your point of view, you simply need to use GL_TRIANGLE_STRIP, keeping the same number of vertices and only slightly adjusting their ordering for the strip to be drawn correctly.

Again, you don't know what you're talking about. Triangle strips cannot be used to replace quads that aren't connected to each other.


My apologies, I misread alpha testing as blending. Though on that note, there are many advantages in a likely production environment to using shader-based alpha tests, such as avoiding aliasing by allowing the fragment to be sampled, etc. As you say, any potential performance losses range from minimal in the worst case to performance gains in the best case, and there are quality gains to be had in certain situations. It's worth reminding everyone that fixed-function alpha testing was also removed in DirectX 10 almost 3 years ago and is no longer required for hardware to be compliant. As such it is likely that future hardware will choose to drop fixed support in favor of room for an extra shader core (or something along those lines of thinking). Given this, I think dropping support in OpenGL was the right move; it allows new developers to future-proof and standardise the way they handle alpha testing. On the other hand, if you really want the extra performance from supporting hardware, this is exactly what the compatibility support is for, or the use of extensions on a 2.x context.

In regards to QUADs, again I apologise, I did not realise that you were concerned from a tessellation/geometry perspective. Obviously for simple geometry and texturing they are equivalent. Perhaps when tessellation is further standardised and moved to core we'll see a new primitive like QUAD_PATCH or something. I can only suspect that keeping QUADs would have caused a little confusion in regards to implementation, as the way drivers handle them can vary, and an extension that takes in QUADs as a set of connected vertices would require the data to be fed in in a specific manner? So again I would suggest this was perhaps done for the sake of semantics and to limit implementation confusion where standardisation could have been difficult.

In relation to Xmas' comment on hardware alpha testing, there are other cards by companies such as VIA S3, Intel and SiS. Whilst it is possible that these companies' latest cards also implement alpha testing in hardware, it could also be implemented via software emulation. Perhaps this is what he was getting at?

Mars_999
08-09-2009, 08:37 AM
All modern hardware still has explicit support for alpha testing that's independent of shaders.
Not true.

Why do you feel you are qualified to tell me that my statement is not true? Do you write drivers for Nvidia or AMD? We have now given you the actual hardware register numbers where alpha test is explicitly supported in the latest chips from both Nvidia and AMD, so you are obviously wrong. I know what I'm talking about, but you're just making claims that you can't back up.


Ouch! To Xmas, Eric isn't a noob and he knows his stuff. Maybe a bit more clarification on your part is needed...

I agree with Eric; IMO, until we have an alpha shader or whatever, I would like to see alpha testing/blending kept around...

Mars_999
08-09-2009, 08:39 AM
Good point!
I hope to see something like AMD_vertex_shader_tessellator in core (or ARB) in the near future. In that case quads are the foundation for quadpatches and thus will certainly be included again, anyway.


If this happens then it would have to be part of some greater tessellation-shader type setup anyway; the AMD tessellator and what is coming in D3D11 hardware are not the same thing, and the currently exposed AMD extension covers 1/3 of the functionality in terms of pipeline stages.

As a side note: can anyone confirm GL 3.1 support from AMD/ATI in the recent Cat drivers, and if so on what OS?
I'm currently unable to confirm this using the Cat 9.8 'beta' drivers, nor the Cat 9.7 or Cat 9.6, on Win7 x64 (GL Extension Viewer reports 2.1 and 3.1 forward-compatible, GL Caps Viewer gives 2.1), while others are saying that they have support for 3.1.

(Also, the text input box is HORRIBLY screwed when using IE8 on Win7, so much so that once I got past "what is coming in D3D11 hardware are not the same thing, the" I had to resort to finishing my post in Notepad because the text box kept jumping up and down as I typed.)

Yes it's coming on Aug 12th as stated by AMD... Figured you wouldn't care as you are in DX land now. Phantom!

Mars_999
08-09-2009, 08:49 AM
Would folks be interested in purchasing a pre-printed and plasticized Quick Reference Card? If so how much would be considered the right price.

PDF would remain free to download course.

$10 I would be game. If they are like these
http://www.barcharts.com/Products/Laminated-Reference?CID=1224

I would be also, but I would like to see another page added that lists the deprecated functions or states and what one should use instead to get the same results on GL3+. I don't want to look up every old-school way of doing something. It would be nice to glance at this card and say, OK, I need to upload my matrices to the vertex shader now instead. This is more for newbies and us GL2 coders that haven't kept up with all the GL3 features... Hence the price you pay. So I will pay if need be.

Thanks

aqnuep
08-09-2009, 12:51 PM
Hi,

I had to check out all the specs that got out lately (not just talking about GL spec but also extensions and hardware related stuff), so I come up with my future OpenGL release wish-list a bit late but I would like all you folks to comment on it if you think there's something to be added or you just don't feel that these are the most important things.

So enough from the chit-chat, here's the list:

ARB_atomic_operation (fictive)
I know everybody is talking about EXT_direct_state_access, but that extension is a bit weird and it's also written against OpenGL 2.1, so it wouldn't be a good idea to accept it in this form. This should be something similar, but it should be a forward-looking extension with a cleaner design.

ARB_tessellation_shader (fictive)
This should be something similar to AMD_vertex_shader_tessellator, but that extension is also a bit crappy. One of the main points for a new extension here is ARB_geometry_shader4, which is (IMO unfortunately) in core with OpenGL 3.2. I don't see any benefit in geometry shaders over tessellation, but I think it was introduced to expose NVIDIA's G80+ early tessellation hardware possibilities. Anyway, if it's already in core we'll need a tessellation extension which can interact with geometry shaders, otherwise the API would be a bit confusing, not allowing tessellation and geometry shaders at the same time. As it appears in the AMD_vertex_shader_tessellator extension, there was an issue about whether to introduce a new shader in the pipeline to replace vertex unpack or to modify vertex shader functionality. I think introducing a new, so-called tessellator shader before the vertex shader would be a design that fits much better into the already existing API. Anyway, I think this should be a MUST for OpenGL 3.3, because ATI's hardware has supported it since the HD2000 series and it's sad that until now no graphics API has provided a mechanism to expose this hardware capability (not even DX).

ARB_instanced_arrays
This extension is already provided by ATI for a long time and it's a quite useful feature from point of view of performance optimization and also as a room for new functionality in vertex shaders. Unfortunately I don't know if NVIDIA has hardware support for such mechanism.

EXT_timer_query
Another nice extension, this time provided by NVIDIA for years. It not only fits nicely into the already existing query API, but also opens huge room for application developers to optimize their rendering and easily identify bottlenecks (a minimal usage sketch follows at the end of this list). As far as I know there is no hardware limitation in ATI that would prevent them from implementing this if it were core.

EXT_texture_swizzle
As many of you already mentioned, to replace some needed stuff as a result of the deprecation model, we would have to have something like this extension. It would also reduce the number of shaders needed to accomplish a certain series of operations. It's supported by both NVIDIA and ATI so I think there shouldn't be any reason why not to include in OpenGL 3.3.

ARB_texture_cube_map_array
For cube map textures associated with meshes to fit nicely into a texture-array-based renderer, this is a MUST. I don't think that I need any further explanation.

AMD_texture_texture4/GL_ARB_texture_gather
This is actually an extension that provides custom filtering possibilities, especially 4xPCF. As I know this is a feature introduced to DX with version 10.1. If OpenGL wants to keep up with DX then this extension is also a MUST for the next release. As far as I see it is a plan for Khronos as well.

ARB_blend_shader (fictive)
This would be a new shader that would replace the alpha blending mechanism. I think it would be a much better idea to have a separate shader for this purpose than to extend ARB_draw_buffers with EXT_draw_buffers2 and ARB_draw_buffers_blend. Of course this is just my opinion, and it also doesn't really expose any new hardware functionality, but hey! this is my wish-list, not an order-list :)

ARB_gpu_association (fictive)
Based on AMD_gpu_association and NV_gpu_affinity there should be an OpenGL-provided mechanism to specify which GPU we would like to address with a specific command. Maybe the best way to put it into the API would be to have some GPU objects + different command queues for them. Anyway, there is a lot of work to do for such an extension, so it cannot be expected in the near future.
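(As promised above, a minimal EXT_timer_query usage sketch; it assumes the EXT enums and the 64-bit getter from glext.h and a context that exposes the extension.)

/* Time a block of draw calls with EXT_timer_query. */
GLuint q;
GLuint64EXT elapsedNs = 0;
glGenQueries(1, &q);
glBeginQuery(GL_TIME_ELAPSED_EXT, q);
/* ... draw calls to be measured ... */
glEndQuery(GL_TIME_ELAPSED_EXT);
/* Blocks until the GPU has finished; poll GL_QUERY_RESULT_AVAILABLE to avoid stalling. */
glGetQueryObjectui64vEXT(q, GL_QUERY_RESULT, &elapsedNs);
glDeleteQueries(1, &q);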

There should also be a new and clean API for handling texture objects, because the old one stinks: it was designed for a different purpose than what it is actually used for nowadays in GL3+. I know that this would be a very big change, so I don't expect it until e.g. OpenGL 4 or something like that; I just want to emphasize that there is a big need for such a thing in the future (maybe it can be introduced with the fictive ARB_atomic_operation extension that I presented above).
For the new design I would expect that texture filtering and wrap modes won't be part of the texture object; instead, those will be moved to the scope of shaders, so the texture fetching functions in GLSL will accept filtering- and wrap-mode-related parameters. I think this would fit much better with the design of the API and also with how hardware is/should be evolving.
I think the guys at Khronos are working on something like this as well, and for example that's why they don't put EXT_texture_filter_anisotropic into core: because it uses the old way of doing things (and is also crappy IMO).

Even if I was a bit offensive in my post, as a final conclusion I would like to emphasize that I'm strongly committed to OpenGL and I strongly appreciate the way the guys at Khronos are doing their job nowadays, because I really believe that at this schedule and pace OpenGL will not just keep up with DX but may even expose new hardware features sooner than its rival.

Thanks for all of you and keep up the good work!

bobvodka
08-09-2009, 03:43 PM
Yes it's coming on Aug 12th as stated by AMD... Figured you wouldn't care as you are in DX land now. Phantom!

I care about putting out information/opinions which are based on the facts at hand; I claimed AMD didn't have a working GL3.1 driver right now, it was claimed by others they did, a claim I couldn't confirm with my own testing.

Thus far the claims are:
- AMD have a working 3.1 context, which I can't confirm
- AMD will have a working 3.1 context by Aug 12th (Wednesday).

I'll be interested to see what happens Wednesday, as right now I'm jumping on any new releases in the hope they will fix some other issues. (I suspect that, regardless of cost or power requirements, my next card will be an NV one; after years of no ATI driver problems they have started to go south, so I might as well see what the other side is like when DX11 hardware appears.)

Mars_999
08-09-2009, 04:30 PM
Yes it's coming on Aug 12th as stated by AMD... Figured you wouldn't care as you are in DX land now. Phantom!

I care about putting out information/opinions which are based on the facts at hand; I claimed AMD didn't have a working GL3.1 driver right now, it was claimed by others they did, a claim I couldn't confirm with my own testing.

Thus far the claims are:
- AMD have a working 3.1 context, which I can't confirm
- AMD will have a working 3.1 context by Aug 12th (Wednesday).

I'll be interested to see what happens Wednesday, as right now I'm jumping on any new releases in the hope they will fix some other issues. (I suspect that, regardless of cost or power requirements, my next card will be an NV one; after years of no ATI driver problems they have started to go south, so I might as well see what the other side is like when DX11 hardware appears.)

Well Aug 12th came from the horse's mouth... So yeah we'll see.

I would wait for the new GF300 series cards out this fall/winter. DX11 and from the leaked specs OMG! 6x better performance than a GTX 280 card!!! Can't wait to see if that is the case. BTW the new card is rumored to cost $500!! or more for top of the line and they are supposed to have various other levels of cards to accommodate the cheaper crowd.

BTW I heard you got a Core i7 920... How's that treating you? What about compile times? Are they any faster? I am in the market for a new machine and may wait until Lynnfield is out on Sep 6th... or build a Core Duo Quad on the cheap if Core i7 isn't worth it.

Alfonse Reinheart
08-09-2009, 04:40 PM
ARB_instanced_arrays
This extension is already provided by ATI for a long time and it's a quite useful feature from point of view of performance optimization and also as a room for new functionality in vertex shaders. Unfortunately I don't know if NVIDIA has hardware support for such mechanism.

This is an awful extension that should never be made core. ARB_draw_instanced is fundamentally superior and is already in the core.


AMD_texture_texture4/GL_ARB_texture_gather
This is actually an extension that provides custom filtering possibilities, especially 4xPCF. As I know this is a feature introduced to DX with version 10.1. If OpenGL wants to keep up with DX then this extension is also a MUST for the next release. As far as I see it is a plan for Khronos as well.

NVIDIA hardware does not support this, so making it core would prevent them from providing a core implementation for that version.

Thus far, all OpenGL 3.x core features work on the same kind of hardware. It is very important that the ARB maintains this.


EXT_texture_swizzle
As many of you already mentioned, to replace some needed stuff as a result of the deprecation model, we would have to have something like this extension. It would also reduce the number of shaders needed to accomplish a certain series of operations. It's supported by both NVIDIA and ATI so I think there shouldn't be any reason why not to include in OpenGL 3.3.

This is not a good extension. Or, let me put it another way. The idea behind the extension is good; the particulars are not.

Implementing this extension without dedicated hardware for it requires modifying the shader based on what textures are bound to it. There is already enough modifying of shaders based on uniforms and other such going on in drivers. We do not need to have extensions sanction this practice.


ARB_texture_cube_map_array
For cube map textures associated with meshes to fit nicely into a texture-array-based renderer, this is a MUST. I don't think that I need any further explanation.

Same problem as ARB_texture_gather.

Eric Lengyel
08-09-2009, 10:45 PM
Eric, calm down. The people who know you know that you are right. But this is still a forum, so people with different levels of knowledge come together, and not everybody knows who you are and what you do, and thus doesn't know how seriously to take your claims.

Scribe really just wanted to help out, and maybe Xmas had some contradicting information from some source, too. I'm pretty sure they were quite surprised by your harsh reply.

Sorry about being harsh. I'm just very frustrated with some of the design decisions being made in OpenGL 3, and that has put me on kind of a short fuse when it comes to people telling me that things I know to be true about graphics hardware aren't true. I realize that the people here were trying to be helpful, but they also need to have a realistic understanding of their knowledge level and refrain from stating unsubstantiated information in an authoritative manner. I asked for reasoning from the "insiders" who've posted in this thread, not for arbitrary speculation by people who aren't actually familiar with the silicon.

Eric Lengyel
08-09-2009, 10:52 PM
EXT_texture_swizzle
As many of you already mentioned, to replace some needed stuff as a result of the deprecation model, we would have to have something like this extension. It would also reduce the number of shaders needed to accomplish a certain series of operations. It's supported by both NVIDIA and ATI so I think there shouldn't be any reason why not to include in OpenGL 3.3.

This is not a good extension. Or, let me put it another way. The idea behind the extension is good; the particulars are not.

Implementing this extension without dedicated hardware for it requires modifying the shader based on what textures are bound to it. There is already enough modifying of shaders based on uniforms and other such going on in drivers. We do not need to have extensions sanction this practice.

Fortunately, hardware support for this extension goes back a very long way on both NV and ATI chips. I have register layouts for this functionality for NV40+ and R400+, and earlier chips may support it as well.

Jan
08-09-2009, 11:19 PM
I think EXT_texture_swizzle is a good idea, and I don't know how that extension is "not good". How else could you do such a thing?

I am definitely happy to see the luminance / alpha / intensity / whatever textures go away, replaced by one very clear extension.

Jan.

Eric Lengyel
08-09-2009, 11:29 PM
As you say, any potential performance losses range from minimal in the worst case to performance gains in the best case.

I should mention that there is at least one case in which alpha test hardware is very important: when rendering a shadow map with alpha-tested geometry. Since the shader used for this is almost always very short (one cycle on a lot of hardware), adding a KIL instruction to the shader will cut performance by as much as 50%. The equivalent alpha test is free.

Also, since the alpha test must be supported by the driver for the foreseeable future, I don't think IHVs are going to drop hardware support for it any time soon. It's not difficult to implement in hardware, and it would be silly to burden the driver with recompiling a shader just because the alpha test was enabled/disabled or the alpha function was changed.
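To make the comparison concrete, this is everything the compatibility path needs versus what the core profile forces into the shader (a sketch for a depth-only pass; the sampler and varying names are illustrative):

/* Compatibility profile: one piece of fixed-function state, zero shader cost. */
glEnable(GL_ALPHA_TEST);
glAlphaFunc(GL_GEQUAL, 0.5f);

/* Core profile: the same cut-out has to live in the fragment shader. */
const char *fs =
    "#version 150\n"
    "uniform sampler2D foliage;\n"
    "in vec2 uv;\n"
    "void main() {\n"
    "    if (texture(foliage, uv).a < 0.5) discard;\n"   /* the KIL that costs */
    "}\n";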

Groovounet
08-10-2009, 02:16 AM
ARB_instanced_arrays
This extension is already provided by ATI for a long time and it's a quite useful feature from point of view of performance optimization and also as a room for new functionality in vertex shaders. Unfortunately I don't know if NVIDIA has hardware support for such mechanism.

This is an awful extension that should never be made core. ARB_draw_instanced is fundamentally superior and is already in the core.


I can't remember my source, so I'm going to say "I think" that ARB_instanced_arrays is exposed on GeForce 6 and 7, but the hardware feature has been removed in GeForce 8 in favour of GL_ARB_draw_instanced.

Groovounet
08-10-2009, 02:35 AM
Eric, Xmas's point is valid. Nvidia and AMD are not the only companies making "modern" hardware. Consider embedded hardware and ES 2.0.

True, but OpenGL does not run on the hardware that OpenGL ES is implemented on and vice-versa. The whole point of having two separate specifications is to allow each to best serve the needs of their clients.


Actually it does run on hardware that OpenGL ES is implemented on, and vice-versa. For PowerVR SGX we have drivers for OpenGL ES, OpenGL, Direct3D 9 and 10. I don't know exactly how public those drivers are; I guess it depends on the platform where the chip is used. I'm not saying that the drivers other than OpenGL ES are as feature-complete ...

aqnuep
08-10-2009, 03:19 AM
ARB_instanced_arrays
This extension is already provided by ATI for a long time and it's a quite useful feature from point of view of performance optimization and also as a room for new functionality in vertex shaders. Unfortunately I don't know if NVIDIA has hardware support for such mechanism.

This is an awful extension that should never be made core. ARB_draw_instanced is fundamentally superior and is already in the core.

I agree that ARB_draw_instanced is far more useful for instanced drawing but imagine the many other ways how you can use it. For example an attribute for every second triangle and so on...



AMD_texture_texture4/GL_ARB_texture_gather
This is actually an extension that provides custom filtering possibilities, especially 4xPCF. As I know this is a feature introduced to DX with version 10.1. If OpenGL wants to keep up with DX then this extension is also a MUST for the next release. As far as I see it is a plan for Khronos as well.

NVIDIA hardware does not support this, so making it core would prevent them from providing a core implementation for that version.

Thus far, all OpenGL 3.x core features work on the same kind of hardware. It is very important that the ARB maintains this.

NVIDIA hardware supports or will support it in the future because it's a DX 10.1 feature. I think NVIDIA's DX11 hardware will support this feature as well.



EXT_texture_swizzle
As many of you already mentioned, to replace some needed stuff as a result of the deprecation model, we would have to have something like this extension. It would also reduce the number of shaders needed to accomplish a certain series of operations. It's supported by both NVIDIA and ATI so I think there shouldn't be any reason why not to include in OpenGL 3.3.

This is not a good extension. Or, let me put it another way. The idea behind the extension is good; the particulars are not.

Implementing this extension without dedicated hardware for it requires modifying the shader based on what textures are bound to it. There is already enough modifying of shaders based on uniforms and other such going on in drivers. We do not need to have extensions sanction this practice.

I don't think it's a bad extension. Maybe if some new texture object mechanism is introduced then another extension should be introduced instead, but this one does its work and it's already in drivers.



ARB_texture_cube_map_array
For cube map textures associated with meshes to fit nicely into a texture-array-based renderer, this is a MUST. I don't think that I need any further explanation.

Same problem as ARB_texture_gather.

Again, if NVIDIA would like to keep up with ATI in new features, they should support this as well. Anyway, it's not likely that anything will prevent NVIDIA from adopting it in the future.

Xmas
08-10-2009, 03:40 AM
Sorry about being harsh. I'm just very frustrated with some of the design decisions being made in OpenGL 3, and that has put me on kind of a short fuse when it comes to people telling me that things I know to be true about graphics hardware aren't true. I realize that the people here were trying to be helpful, but they also need to have a realistic understanding of their knowledge level and refrain from stating unsubstantiated information in an authoritative manner. I asked for reasoning from the "insiders" who've posted in this thread, not for arbitrary speculation by people who aren't actually familiar with the silicon.
That should be fine then, as I am familiar with silicon and with the ARB as well.

I'm not taking offense. However, if you want to only talk about hardware from AMD and NVidia I'd suggest you say so instead of making sweeping statements about "all modern hardware".

Alfonse Reinheart
08-10-2009, 10:28 AM
Fortunately, hardware support for this extension goes back a very long way on both NV and ATI chips. I have register layouts for this functionality for NV40+ and R400+, and earlier chips may support it as well.

Oh. Well, never mind then.

Though it is still a concern for hardware that doesn't explicitly have such functionality.


I agree that ARB_draw_instanced is far more useful for instanced drawing but imagine the many other ways how you can use it. For example an attribute for every second triangle and so on...

NVIDIA already removed hardware support for it from the G80 line. So there is really no reason to bring it into the core.


NVIDIA hardware supports or will support it in the future because it's a DX 10.1 feature.

I think you misunderstand something.

The OpenGL 3.x core is all supported by a certain set of hardware. That hardware being G80 and above, and R600 and above. Core features for the 3.x line should not be added unless they are in this hardware range.

The two extensions, texture_gather and cube_map_array are not available in NVIDIA's current hardware line. They will be some day, but not in this current line of hardware. And therefore, it is not proper for these to be core 3.x features; they should be core 4.x features.

Extensions are not evil. They serve a purpose.

Oh, and NVIDIA is not going to support DX10.1 unless it is in DX11-class hardware.

barthold
08-10-2009, 10:43 AM
Would folks be interested in purchasing a pre-printed and plasticized Quick Reference Card? If so how much would be considered the right price.

PDF would remain free to download course.

$10 I would be game. If they are like these
http://www.barcharts.com/Products/Laminated-Reference?CID=1224



The reference cards we handed out at Siggraph last week were like the ones you linked to. Four letter sized pages, laminated, printed on both sides.




but I would like to see another page added that lists the deprecated functions or states and what one should use instead to get the same results on GL3+. I don't want to look up every old-school way of doing something.
Thanks

The reference cards (including the PDF http://www.khronos.org/files/opengl-quick-reference-card.pdf) mark all functionality not in the core profile as blue. Thus you can already get this info from the reference card.

Regards,
Barthold
(with my ARB hat on)

kRogue
08-10-2009, 11:35 AM
I just had one more odd thought about why killing alpha test in the core profile can be a good thing: portability. Admittedly one might think it is utterly stupid to want one's 3D code to be portable between GLES2 and desktop GL 3.x (limited) {i.e. only use what they have in common; but, ick, that is not very true either: FBO setup with regard to depth and stencil buffers is very different in practice} BUT, open up Qt. I don't like Qt, but it has a GLES2 drawing backend and a desktop fixed-function-pipeline drawing backend {both suck, actually, in my eyes, and the entire QPainter architecture needs some serious love}. Continuing this thought: one writes a simple 3D app and wishes for it to run on an embedded device or a desktop; if one does not push the hardware at all on the desktop and writes shaders only for a carefully limited GL 2.1 (i.e. shader version 120, avoiding most extensions unless they are provided by Qt wrappers) and a couple more icky ifs, then in theory the code ports between desktop and portable. Lots of ifs, but *cough* many of the Qt GL examples are written for both GLES2 and desktop GL.

The above might seem like a WTF, no one would do that, but that is where I have seen Qt going... Right now it has some bonked-in-the-head framebuffer wrappers that map to the GLES2 or desktop GL API and expose only what one can do in both (actually the stencil part needs a little tweaking to get it to work correctly under GLES2). Sigh, and it is written using GL_EXT_framebuffer_object. It also has some bonked-in-the-head shader wrappers, etc.

Jan
08-10-2009, 11:52 AM
While we are at Qt: does anyone know whether / how it is possible to create a GL 3.x context using Qt? I'm not a Qt master and all my tries and searches have failed.

I'm pretty sure it is possible SOMEHOW (writing one's own QGLWidget replacement or such), but I would really need some pointers on how to do it.

That's the real point holding me back from switching to GL 3.x: some of my tools use Qt, but all my applications share shaders, so it's not possible to mix and match.
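For reference, outside of Qt the raw route looks roughly like this on Windows (a sketch using the WGL_ARB_create_context / WGL_ARB_create_context_profile enums from wglext.h; 'hdc' is the window's device context, and wglCreateContextAttribsARB has to be fetched with wglGetProcAddress from a temporary legacy context first). What I can't figure out is how to hook something like this into QGLWidget:

const int attribs[] = {
    WGL_CONTEXT_MAJOR_VERSION_ARB, 3,
    WGL_CONTEXT_MINOR_VERSION_ARB, 2,
    WGL_CONTEXT_PROFILE_MASK_ARB,  WGL_CONTEXT_CORE_PROFILE_BIT_ARB,
    0
};
HGLRC rc = wglCreateContextAttribsARB(hdc, NULL, attribs);   /* NULL = no share context */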

Jan.

Mars_999
08-10-2009, 03:09 PM
Would folks be interested in purchasing a pre-printed and plasticized Quick Reference Card? If so how much would be considered the right price.

PDF would remain free to download course.

$10 I would be game. If they are like these
http://www.barcharts.com/Products/Laminated-Reference?CID=1224



The reference cards we handed out at Siggraph last week were like the ones you linked to. Four letter sized pages, laminated, printed on both sides.




but I would like to see another page added that lists the deprecated functions or states and what one should use instead to get the same results on GL3+. I don't want to look up every old-school way of doing something.
Thanks

The reference cards (including the PDF http://www.khronos.org/files/opengl-quick-reference-card.pdf) mark all functionality not in the core profile as blue. Thus you can already get this info from the reference card.

Regards,
Barthold
(with my ARB hat on)



Yeah, I like the format so far, but I would really like to see a reference page for all the deprecated functions and what one needs to use with GL 3.x. I have looked at the new "Red Book" ver. 7 and it doesn't cover these kinds of heads-ups or discuss what one will have to use in their place. IMO this is going to be very handy for someone like myself moving from GL 2.x to GL 3.x. e.g. matrix math will now be handled by YOU... So state this in the card, and that the user will need a good math lib or to make their own... Stuff like this needs to be explained so one doesn't go "where are glScale, glRotate and glTranslate?", and for the countless other functions that are more shader-specific, just state that it is to be done in a shader... I am just throwing out ideas here.

Thanks

aqnuep
08-10-2009, 03:15 PM
I agree that ARB_draw_instanced is far more useful for instanced drawing but imagine the many other ways how you can use it. For example an attribute for every second triangle and so on...

NVIDIA already removed hardware support for it from the G80 line. So there is really no reason to bring it into the core.

OK! Maybe you're right, but I think it is still useful as an optimization, because sometimes you need per-triangle data or things like that, and in such cases you either have to have redundant data or use things like buffer textures with the primitive ID for lookup, which I think would hit performance a bit. Anyway, if NVIDIA had support for it, maybe it's not a big deal to put it back again, or someone should just come up with something similar, because in my rendering engine I would really be able to take advantage of it.



NVIDIA hardware supports or will support it in the future because it's a DX 10.1 feature.

I think you misunderstand something.

The OpenGL 3.x core is all supported by a certain set of hardware. That hardware being G80 and above, and R600 and above. Core features for the 3.x line should not be added unless they are in this hardware range.

The two extensions, texture_gather and cube_map_array are not available in NVIDIA's current hardware line. They will be some day, but not in this current line of hardware. And therefore, it is not proper for these to be core 3.x features; they should be core 4.x features.

Hmm. I don't think that new OpenGL versions should include just already existing hardware functionality; they should also look forward, otherwise OpenGL will always be behind DX, releasing core functionality only after the hardware for it is already out.


Oh, and NVIDIA is not going to support DX10.1 unless it is in DX11-class hardware.

Yes, but DX11-class hardware will support DX10.1 as well, and it will be out soon, so it's time for OpenGL to support such stuff (like tessellation for example, even though ATI has supported it for a long time).

My vision for OpenGL's future is that the specification should already be out when the hardware supporting it appears. This can be achieved, because the ARB is a strong cooperation between vendors. Microsoft has already achieved this; why shouldn't OpenGL?

I have concerns with the attitude of most people working with OpenGL, because they are NVIDIA supporters. Of course I know why this is so: NVIDIA has always had the best support for OpenGL. Maybe I'm the only one who believes that ATI/AMD can also be an excellent choice for OpenGL. Anyway, if we only care about what NVIDIA supports and don't care about at least the second biggest player in the desktop 3D world, then OpenGL will just become NVIDIA's "proprietary" API.

Off-topic, but two more points for ATI: they have quite good drivers nowadays and they really have more raw horsepower, which I really like when using heavyweight shaders.

aqnuep
08-10-2009, 03:26 PM
Oh, I forgot to mention a very simple, but handy use case of ARB_instanced_arrays:

Think about using texture arrays for materials. How can you assign a particular material to each and every triangle/mesh? There are at least three choices:

1. use the primitive ID or instance ID and map it somehow to a layer index for addressing the texture array (the mapping is at least a bit expensive)

2. have the material ID in the VBO for each and every vertex (at least triples the data; redundant and not optimal)

3. use an attribute divisor of e.g. 3 using ARB_instanced_arrays and voila

I think obviously the third option is the best. If you know better ideas please share it, I'm really interested because I need some stuff like this.

Ilian Dinev
08-10-2009, 04:35 PM
3. use an attribute divisor of e.g. 3 using ARB_instanced_arrays and voila

I think obviously the third option is the best. If you know better ideas please share it, I'm really interested because I need some stuff like this.
If you're using an IBO, with a "flat" vtx attribute you could bring the vtx count down to the number of triangles. Still, if triangles don't share vertices, it'll again make vtx-count = num-tris*3. I really liked the divisor, and am sad to hear it's been dropped from GL3-class silicon; but primitiveID/instanceID are obviously more powerful (you can do more powerful division and resource-fetch inside the shader).
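A sketch of that primitiveID + resource-fetch route (assumes GL 3.1+ buffer textures and GLSL 1.50 for gl_PrimitiveID in the fragment shader; 'materialIds' and 'triCount' are placeholders holding one layer index per triangle):

GLuint tbo, tboTex;
glGenBuffers(1, &tbo);
glBindBuffer(GL_TEXTURE_BUFFER, tbo);
glBufferData(GL_TEXTURE_BUFFER, triCount * sizeof(GLuint), materialIds, GL_STATIC_DRAW);
glGenTextures(1, &tboTex);
glBindTexture(GL_TEXTURE_BUFFER, tboTex);
glTexBuffer(GL_TEXTURE_BUFFER, GL_R32UI, tbo);   /* expose the buffer to the shader */
/* Fragment shader side (GLSL 1.50):                                  */
/*   uniform usamplerBuffer materialLookup;                           */
/*   uniform sampler2DArray materials;                                */
/*   uint layer = texelFetch(materialLookup, gl_PrimitiveID).r;       */
/*   vec4 color = texture(materials, vec3(uv, float(layer)));         */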

Alfonse Reinheart
08-10-2009, 04:49 PM
I don't think that new OpenGL versions should include just already existing hardware functionality; they should also look forward, otherwise OpenGL will always be behind DX, releasing core functionality only after the hardware for it is already out.

Looking forward would be releasing an OpenGL 4.0 specification now, similar to how Microsoft has released DX11 API documentation. Notice that Microsoft did not release DX10.2; it is DX11.

When making major hardware changes, you bump the major version numbers. OpenGL 3.x should not promote to core features that do not exist in 3.x-class hardware.


Yes, but DX11-class hardware will support DX10.1 as well, and it will be out soon, so it's time for OpenGL to support such stuff (like tessellation for example, even though ATI has supported it for a long time).

OpenGL core should support those things exactly and only when they make version 4.0. As pointed out previously, 3.x is for a particular level of hardware; 4.0 is for a higher level of hardware.

Breaking this model screws everything up. Take 3.2 for example. It includes GLSL 1.5, ARB_sync, and ARB_geometry_shader4 as part of the core, as well as the compatibility/core profile system. If you add on top of this a DX11 feature that can only be made available on DX11 hardware, then anyone using these features on standard 3.x hardware has to use them as extensions of 3.1, not core features of 3.2.

No. 4.0 should be where DX11 features are supported, not 3.2.


Think about using texture arrays for materials. How can you assign a particular material to each and every triangle/mesh?

Why would you? Outside of some form of gimmick, I can't think of a reason why you would need this.

Eric Lengyel
08-10-2009, 04:53 PM
I'm not taking offense. However, if you want to only talk about hardware from AMD and NVidia I'd suggest you say so instead of making sweeping statements about "all modern hardware".

Please give an example of a GPU currently in production that supports OpenGL 3, but does not have alpha test hardware.

Rob Barris
08-10-2009, 06:35 PM
I'm not taking offense. However, if you want to only talk about hardware from AMD and NVidia I'd suggest you say so instead of making sweeping statements about "all modern hardware".

Please give an example of a GPU currently in production that supports OpenGL 3, but does not have alpha test hardware.

For extra credit, name one that isn't supported by the ARB_compatibility extension, which would preserve that feature among many others.

Eric Lengyel
08-10-2009, 07:13 PM
Think about using texture arrays for materials. How can you assign a particular material to each and every triangle/mesh?

Why would you? Outside of some form of gimmick, I can't think of a reason why you would need this.

We actually allow this kind of material assignment for the voxel terrain system in the C4 Engine. The user is able to paint materials onto the terrain at voxel granularity, so the shader ends up needing to fetch textures based on per-polygon material data. Array textures on SM4 hardware make this somewhat easier, but we also have to maintain a fallback for older hardware that stuffs a bunch of separate textures into a single 2D palette texture. Unfortunately, array textures are currently broken under ATI drivers if you use S3TC, so we only get to use them on Nvidia hardware for now.


Off-topic, but two more points for ATI: they have quite good drivers nowadays

I would agree that the ATI drivers are much better than they have been in recent times, but I wouldn't go as far as calling them "quite good" yet. The problem with compressed array textures I mentioned above is only one of several open bug reports that we have on file with ATI right now.

Another particularly annoying bug is that changing the vertex program without changing the fragment program results in the driver not properly configuring the hardware to fetch the appropriate attribute arrays for the vertex program. This forces us to always re-bind the fragment program on ATI hardware. You can see this bug for yourself by downloading the following test case with source:

http://www.terathon.com/c4engine/ATIGLTest.zip

Under correctly working drivers, you'll see a red quad and a green quad. Under broken drivers, you'll see only a red quad.
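For reference, a minimal sketch of the workaround described above (not the C4 Engine's actual code; it assumes ARB assembly program objects):

void BindPrograms(GLuint vertexProgram, GLuint fragmentProgram)
{
    glBindProgramARB(GL_VERTEX_PROGRAM_ARB, vertexProgram);

    // Redundant on well-behaved drivers, but it forces the affected ATI
    // drivers to reconfigure the vertex attribute fetch state correctly.
    glBindProgramARB(GL_FRAGMENT_PROGRAM_ARB, fragmentProgram);
}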

An apparently related bug is that any program environment parameters that you've set for a fragment program are erased whenever you change the vertex program.

The TXB instruction also does not function correctly in fragment programs.

ATI still has a way to go before they catch up to Nvidia's stability.

Mars_999
08-10-2009, 10:12 PM
Hey Eric, are you using glGenerateMipmap() or GL_GENERATE_MIPMAP?

I found this same bug on Nvidia a while back, and it took a while, even after talking with Pat @ Nvidia, to get texture arrays and compression working in the current drivers you see now. It was a mess; I would get white textures IIRC, or black... can't remember. Anyway, I think I had to use GL_GENERATE_MIPMAP to get around it, though...

Eric Lengyel
08-10-2009, 10:38 PM
Hey Eric, are you using glGenerateMipmap() or GL_GENERATE_MIPMAP?

Neither. We generate the mipmaps off-line and store them in resources.

Here's a test app (with source) for the array texture bug:

http://www.terathon.com/c4engine/TextureArrayTest.zip

Under working drivers, you'll see a quad on the screen that is half red and half green. With broken drivers, you'll just see black.

kRogue
08-11-2009, 01:42 AM
In reply to Jan's question about Qt and GL 3.x context creation:

You won't like the answer: creating a GL context is hidden under Qt's covers (as the implementation is different for MS Windows vs. X Windows), and, drum roll please, it does not use the new entry points to create a GL context. You cannot even request that it do so. If you are into this kind of thing, open up Qt's source code, src/opengl/qgl_x11.cpp, and you can find the context creation code, and LOOK, nothing about the new context creation method.

But it gets richer: bits of the Qt source for OpenGL use the ARB interface for shaders (at least Qt's shader API is using the 2.0+ interface).

And the FBO stuff is a laughing matter too, so, sighs, messed up. Qt for desktop only checks for GL_EXT_framebuffer_object and uses those entry points for its QGLFramebufferObject API. Epic fail.


To get a GL 3.x context under Qt: use an nVidia driver, since it will return a 3.x compatibility context/profile with the old entry point. ATI, from what I last _heard_, was stricter about generating a 3.x context; it insisted on going through the new context creation method to return a GL 3.x context.

That was prolly not a nice answer though.

skynet
08-11-2009, 02:17 AM
I suggest creating your own GL3-Context and let it render into Qt's DC. Would that work for you?
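For what it's worth, a hedged sketch of what that might look like on Windows (assuming the tokens from wglext.h, an hdc obtained from the Qt widget's window, and wglCreateContextAttribsARB loaded via wglGetProcAddress from a temporary old-style context):

const int attribs[] = {
    WGL_CONTEXT_MAJOR_VERSION_ARB, 3,
    WGL_CONTEXT_MINOR_VERSION_ARB, 2,
    WGL_CONTEXT_PROFILE_MASK_ARB,  WGL_CONTEXT_CORE_PROFILE_BIT_ARB,
    0
};
HGLRC gl3Context = wglCreateContextAttribsARB(hdc, 0, attribs);
wglMakeCurrent(hdc, gl3Context);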

aqnuep
08-11-2009, 03:07 AM
Looking forward would be releasing an OpenGL 4.0 specification now, similar to how Microsoft has released DX11 API documentation. Notice that Microsoft did not release DX10.2; it is DX11.

When making major hardware changes, you bump the major version numbers. OpenGL 3.x should not promote to core features that do not exist in 3.x-class hardware.

OpenGL core should support those things exactly and only when they make version 4.0. As pointed out previously, 3.x is for a particular level of hardware; 4.0 is for a higher level of hardware.

Breaking this model screws everything up. Take 3.2 for example. It includes GLSL 1.5, ARB_sync, and ARB_geometry_shader4 as part of the core, as well as the compatibility/core profile system. If you add on top of this a DX11 feature that can only be made available on DX11 hardware, then anyone using these features on standard 3.x hardware has to use them as extensions of 3.1, not core features of 3.2.

No. 4.0 should be where DX11 features are supported, not 3.2.

I never said I was talking about OpenGL 3.2. In my initial post I just wrote my wish list for future releases of the API. I never mentioned that it should be 3.x or 4.x. What about tessellation? It is also a DX11 feature and nobody complained that I listed it. I think you either misunderstand me or just want to find some mistake in what I've said. Please, I'm just talking about what I would like to see in future versions, because I would really like to use them.



Think about using texture arrays for materials. How can you assign a particular material to each and every triangle/mesh?

Why would you? Outside of some form of gimmick, I can't think of a reason why you would need this.

Eric Lengyel pointed out one example, but here's mine: I would like to batch multiple object draws, which (without texture arrays) can only be done using texture atlases (which suck, IMO) or 3D textures, but then I cannot use mipmapping. Nowadays rendering is more CPU-bound than GPU-bound, so batching is one of the main areas for optimization. That's why I would use it this way.

aqnuep
08-11-2009, 03:21 AM
I would agree that the ATI drivers are much better than they have been in recent times, but I wouldn't go as far as calling them "quite good" yet. The problem with compressed array textures I mentioned above is only one of several open bug reports that we have on file with ATI right now.

OK! You're right; unfortunately they don't put much effort into their OpenGL drivers in favour of the DX ones. I also hate this, because I'm a "weird animal" as an OpenGL user and ATI supporter :)
That's why I also appreciate Khronos' schedule, because at least it forces ATI/AMD to adopt new OpenGL features. They will do it, because they want to put the "OpenGL 3.x compliant" sticker on their cards :)


Another particularly annoying bug is that changing the vertex program without changing the fragment program results in the driver not properly configuring the hardware to fetch the appropriate attribute arrays for the vertex program. This forces us to always re-bind the fragment program on ATI hardware.

As I see, you're using assembly shaders. I liked them much more a few years ago as well, because there is much more opportunity to optimize with them, but unfortunately ATI is very weak there; at least, they do not put much effort into developing assembly shaders any further and are just concentrating on GLSL. Again, I think this is because GLSL is the advertised way of using shaders.


ATI still has a way to go before they catch up to Nvidia's stability.

Yes, I totally agree. But I believe in them anyway, because ATI's history started a long time ago, when they came up by purchasing small hardware companies and using their interesting ideas to catch up with the big players. That's why I like them, and I think they still come up with some innovative ideas. OK, I'm not objective, but who is? :)

Anyway, all of this was just my personal opinion. Maybe I'm not the most competent contributor on the topic. I'm just a so-called garage OpenGL developer: even though I'm a professional software developer, I'm currently working in the telecommunications industry. There isn't much need for OpenGL developers in Hungary :)

So sorry for sticking to my opinion, and thanks for the many replies. That's exactly what I meant to achieve: to get some objective feedback on my vision.

Eric Lengyel
08-11-2009, 03:54 AM
As I see, you're using assembly shaders. I liked them much more a few years ago as well, because there is much more opportunity to optimize with them, but unfortunately ATI is very weak there; at least, they do not put much effort into developing assembly shaders any further and are just concentrating on GLSL. Again, I think this is because GLSL is the advertised way of using shaders.

I still use assembly shaders whenever possible because the compile times are so much faster, and the C4 Engine generates shaders on the fly. I love Nvidia for the awesome effort they put into maintaining assembly support for all the new hardware features. The C4 Engine can also generate GLSL fragment shaders, and those are used for any particular shader requiring a feature that is only exposed through GLSL on ATI hardware (or only works correctly in GLSL on ATI hardware, like texture fetch with bias).

Another big advantage to using assembly shaders is the ability to have global parameters. I still can't believe those were left out of GLSL. According to issue #13 of GL_ARB_shader_objects, this functionality was considered useful, but was deferred for some idiotic reason. So now, if I need to render 50 objects with 50 different shaders, and they all need to access the same light color, I'm forced to specify that light color 50 times as a parameter for all 50 shaders. Brilliant.

aqnuep
08-11-2009, 04:11 AM
I still use assembly shaders whenever possible because the compile times are so much faster, and the C4 Engine generates shaders on the fly. I love Nvidia for the awesome effort they put into maintaining assembly support for all the new hardware features. The C4 Engine can also generate GLSL fragment shaders, and those are used for any particular shader requiring a feature that is only exposed through GLSL on ATI hardware (or only works correctly in GLSL on ATI hardware, like texture fetch with bias).

Yes, this makes sense of course. I read about your engine in the past. I abandoned assembly shaders mainly because they are only kept up to date in NVIDIA drivers. BTW, it's also the ARB's fault that they aren't maintained, because there hasn't been a real vendor-independent extension for them in a long time.


Another big advantage to using assembly shaders is the ability to have global parameters. I still can't believe those were left out of GLSL. According to issue #13 of GL_ARB_shader_objects, this functionality was considered useful, but was deferred for some idiotic reason. So now, if I need to render 50 objects with 50 different shaders, and they all need to access the same light color, I'm forced to specify that light color 50 times as a parameter for all 50 shaders. Brilliant.

Using uniform buffers should solve the problem from now on. But I'm not familiar with the exact use case, so maybe I'm wrong.
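For illustration, a hedged sketch (all names made up) of sharing one set of "global" parameters across many programs with uniform buffers:

// GLSL, in every shader that needs it:
//     layout(std140) uniform Globals { vec4 lightColor; };

// C side, once per program at load time, binding every program's
// "Globals" block to the same binding point (0):
GLuint blockIndex = glGetUniformBlockIndex(program, "Globals");
glUniformBlockBinding(program, blockIndex, 0);

// C side, whenever the light color changes (affects all 50 programs at once):
GLfloat lightColor[4] = { 1.0f, 0.9f, 0.8f, 1.0f };
glBindBufferBase(GL_UNIFORM_BUFFER, 0, globalsUBO);
glBufferSubData(GL_UNIFORM_BUFFER, 0, sizeof(lightColor), lightColor);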

Jan
08-11-2009, 04:16 AM
kRogue, skynet, thanks for the info.

"I suggest creating your own GL3-Context and let it render into Qt's DC. Would that work for you?"

I haven't thought this through, but as I see it, that could work. I am working on Windows exclusively, so I assume I would create the 3.x context myself after Qt initialized the widget and then somehow bind it to the same window?

Maybe I'll try it in a few days. But I think it would be more appropriate to move the discussion into another thread then.

Actually, I was hoping that Trolltech would add flags for 3.x context creation to Qt soon, but I can't find any information on whether that is planned.

Jan.

kRogue
08-11-2009, 04:47 AM
I, err, don't think you can make the GL context yourself; let me elaborate:

If you want to use QGLWidget, you must let Qt make the context for you; in doing so, the painter back end will then use GL to draw.

If you make the context yourself, you will need to use a QWidget and duplicate a lot of the QGLWidget code yourself: swapping buffers, etc. Worse, you will need to get into some more hackery if you want to use QPainter on your widget, because chances are it will do ungood but interesting things...


Or just hack Qt yourself and, *ick*, rebuild it on Windows... ewwww...

Scribe
08-11-2009, 07:12 AM
As you say, any potential performance losses are minimal in the worst case to performance gains in best case.

I should mention that there is at least one case in which alpha test hardware is very important: when rendering a shadow map with alpha-tested geometry. Since the shader used for this is almost always very short (one cycle on a lot of hardware), adding a KIL instruction to the shader will cut performance by as much as 50%. The equivalent alpha test is free.

Also, since the alpha test must be supported by the driver for the foreseeable future, I don't think IHVs are going to drop hardware support for it any time soon. It's not difficult to implement in hardware, and it would be silly to burden the driver with recompiling a shader just because the alpha test was enabled/disabled or the alpha function was changed.


I would be interested to see just how much room alpha-testing hardware requires on a GPU. If you could fit an extra shader core in, then that at least explains why you'd want to remove it. I mean, a 50% slowdown in creating shadow maps vs. a performance increase on all other pixel ops may balance out. It would be nice to know the answer to this one.

Also, in future generations, such as the rumored *cough* 6 times!? faster DX11 GT300 *cough*, ray tracing etc. may look more attractive, and devs may start wanting shader core power over other fixed-function features?

On a side note I'm seriously struggling to believe x6!

Jan
08-11-2009, 07:16 AM
"On a side note I'm seriously struggling to believe x6!" - me too.

kRogue: Yes, I thought Qt would create the "main" context (2.1), and then via the extensions I would create the 3.x context and use that for further rendering, no?

kRogue
08-11-2009, 11:38 AM
It would work, except that you need to change the GL context handle that Qt is using to the one you get back from wgl/glXCreateContextAttribsARB, whose value (and type!) is hidden away in the _source_ files of Qt. When you look into a Qt header you will often see this pattern:

class QClassPrivate;
class QClass // and then some icky Qt quasi-macros for moc
{
public:
    // yada-yada-yada
protected:
    // yada-yada-yada

    // some more yada-yada-yada for slots and signals

private:
    QClassPrivate *d;
};

and the class definition for QClassPrivate will be different depending on the platform Qt is compiled for.

Qt's abstraction for a GL context is QGLContext, and as expected it does not expose the actual context handle from the windowing system, much less let you change it, so, sighs.

In Qt's defense though, it is the same API for all of the following (well, mostly the same Qt API, with the caveat that the platform must support that API functionality):

X11 with Desktop GL using glX to make context
X11 with GLES1 or GLES2 using egl to make context
Windows with desktop GL using wgl to make context
WindowsCE with GLES1 or GLES2 using *I think* egl to make context

an example caveat: QGLShader and QGLShaderProgram are not supported in GLES1 (duh).


um, err, you are right we should make a Qt GL thread for this, sighs... well if you want to write more on this (and for me to write as well) start a new thread and I will write there. My apologies to those wanting to read on GL 3.2 feedback and not Qt.

skynet
08-11-2009, 12:39 PM
I created a new thread here:
Using GL3 with Qt (http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=262179#Post262179)

Mars_999
08-11-2009, 06:13 PM
Hey Eric, are you using glGenerateMipmap() or GL_GENERATE_MIPMAP?

Neither. We generate the mipmaps off-line and store them in resources.

Here's a test app (with source) for the array texture bug:

http://www.terathon.com/c4engine/TextureArrayTest.zip

Under working drivers, you'll see a quad on the screen that is half red and half green. With broken drivers, you'll just see black.


Runs fine on my Mac mini 2009 with GF9400 under WinXP...

Is there any reason other than speeding up load times why you are doing it ahead of time vs. just letting the GPU generate the mipmaps?

Eric Lengyel
08-11-2009, 06:27 PM
Runs fine on my Mac mini 2009 with GF9400 under WinXP...

In other words, Nvidia's drivers work correctly. The test app was made for ATI devrel.


Is there any reason other than speeding up load times why you are doing it ahead of time vs. just letting the GPU generate the mipmaps?

We do things to the mipmaps other than standard box/tent filtering.

Brolingstanz
08-12-2009, 02:00 AM
As a side note; can anyone confirm GL3.1 support from AMD/ATI in the recent Cat drivers and if so on what OS?

I can confirm that on Vista 9.7 reports a 3.0 and 3.1 beta, and that's as far as I've gone with it.


(Also, the text input box is HORRIBLY screwed up when using IE8 on Win7, so much so that once I got past "what is coming in D3D11 hardware are not the same thing, the" I had to resort to finishing my post in Notepad, because the text box kept jumping up and down as I typed.)

I gave up on using IE here long, long ago ;-)

Heiko
08-12-2009, 03:10 AM
As a side note; can anyone confirm GL3.1 support from AMD/ATI in the recent Cat drivers and if so on what OS?

I can confirm that on Vista 9.7 reports a 3.0 and 3.1 beta, and that's as far as I've gone with it.

Same with Linux 9.7.
I wrote a small OpenGL 3.0 program a while ago, just to make a context, try some VAOs, create my own matrix stack, etc. Fully forward compatible. That runs fine on Linux Catalyst 9.6 and 9.7. I didn't try very complex things though (basically: VAO, FBO with a 32-bit depth texture and an RGBA16F color texture, just rendering the depth buffer of a rotating triangle to the screen).

Jacek Nowak
08-12-2009, 03:15 AM
aqnuep: Don't worry, I prefer development on ATI too :) If a shader works on ATI, I know it will work on NVIDIA too; sadly, it doesn't work both ways (NVIDIA allows "saturate" in GLSL shaders, which is a Cg function; ATI doesn't). There are many cases like these, and people often react as if NVIDIA were better because it is less strict.
(I'm not talking about cases like the ones Eric Lengyel posted; those are obviously driver bugs.)

Has anyone had success with the newest beta drivers (190.56)?

Heiko
08-12-2009, 05:07 AM
aqnuep: Don't worry, I prefer development on ATI too :) If a shader works on ATI, I know it will work on NVIDIA too; sadly, it doesn't work both ways (NVIDIA allows "saturate" in GLSL shaders, which is a Cg function; ATI doesn't). There are many cases like these, and people often react as if NVIDIA were better because it is less strict.
(I'm not talking about cases like the ones Eric Lengyel posted; those are obviously driver bugs.)

Has anyone had success with the newest beta drivers (190.56)?

Over here, another ATI/OpenGL and even a Linux user ;). I fully agree with Jacek: if it works on ATI, it will probably work on nVidia as well.

bertgp
08-12-2009, 08:08 AM
aqnuep: Don't worry, I prefer development on ATI too :) If a shader works on ATI, I know it will work on NVIDIA too; sadly, it doesn't work both ways (NVIDIA allows "saturate" in GLSL shaders, which is a Cg function; ATI doesn't). There are many cases like these, and people often react as if NVIDIA were better because it is less strict.

NVIDIA publishes a utility called "nvemulate" which lets you set, among other things, a flag that asks the GLSL compiler to adhere strictly to the language spec. The option is called "Generate Shader Portability Errors", and with it enabled the compiler will refuse to compile shaders that don't follow the OpenGL spec.

Jacek Nowak
08-12-2009, 08:25 AM
Really? I will try it immediately!

Edit: Nope. I set Generate Shader Portability Errors to "enabled" and hit Apply. Still no errors with the line:
float diffuse = saturate(dot(L, N));

Of course, ATI complains about the undefined symbol "saturate".

bertgp
08-12-2009, 08:48 AM
Really? I will try it immediately!

Edit: Nope. I set Generate Shader Portability Errors to "enabled" and hit Apply. Still no errors with the line:
float diffuse = saturate(dot(L, N));

Of course, ATI complains about the undefined symbol "saturate".

Hmmm... That's weird. Did you put "#version 120" (or your specific version) at the top of the shader source? I can't get a shader to compile with the saturate() function with or without the "Generate Shader Portability Errors", but I put "#version 120" as the first source line.

Also, make sure _no_ OpenGL context is opened while you change settings in NVEmulate (same goes for NV control panel settings)

We used the portability flag to detect errors when implicit conversions between differently sized vecs were done by the compiler. These conversions were not defined in the GLSL spec so they did not compile on ATI and this allowed us to catch them.
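For reference, a small sketch of the portable GLSL spelling of the same line; clamp() is in the spec, while saturate() is Cg/HLSL:

#version 120
varying vec3 L;
varying vec3 N;

void main()
{
    // clamp() is the GLSL equivalent of Cg/HLSL saturate()
    float diffuse = clamp(dot(normalize(L), normalize(N)), 0.0, 1.0);
    gl_FragColor = vec4(vec3(diffuse), 1.0);
}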

bobvodka
08-12-2009, 01:59 PM
As a side note; can anyone confirm GL3.1 support from AMD/ATI in the recent Cat drivers and if so on what OS?

I can confirm that on Vista 9.7 reports a 3.0 and 3.1 beta, and that's as far as I've gone with it.


Is this your own code or a separate detection app?

*looks at date and time*
*looks at lack of new drivers*

I'd like to say I'm surprised but....



I gave up on using IE here long, long ago ;-)

Yes, well, I like IE8; last time I tried to use FF or Chrome I'm sure it tried to give me eye cancer or something :D

Heiko
08-12-2009, 02:36 PM
...

Is this your own code or a separate detection app?


On Linux (as far as I know, the Linux driver shares its OpenGL stack with the Windows driver), I use the following code in my own application:


int const glxContextAttributes[] = {
    GLX_CONTEXT_MAJOR_VERSION_ARB, 3,
    GLX_CONTEXT_MINOR_VERSION_ARB, 1,
    GLX_CONTEXT_FLAGS_ARB, GLX_CONTEXT_FORWARD_COMPATIBLE_BIT_ARB,
    0
};
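(For completeness, a hedged sketch of how such an attribute list is typically consumed; display, fbConfig and the entry point itself are assumed to come from the usual glXChooseFBConfig / glXGetProcAddress setup:)

GLXContext ctx = glXCreateContextAttribsARB(display, fbConfig,
                                            0,    /* no shared context */
                                            True, /* direct rendering  */
                                            glxContextAttributes);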

And glGetString(GL_VERSION) reports:
3.1.8787 BETA Forward-Compatible Context
(GL_SHADING_LANGUAGE_VERSION still reports 1.30 though).

(Catalyst 9.7, was the first driver in which this works)

Mars_999
08-12-2009, 02:41 PM
To Phantom...


Well, looks like it's going to be Monday for you.

AMD is showing off 9.8 at Quake Con tomorrow. And public availability is Monday.

http://twitter.com/CatalystMaker/status/3247435928

From PCPer:

It looks like AMD GPU fans attending QuakeCon 2009 (and you really should) will get a bit of an early dig into the world of the Catalyst 9.8 drivers. According to this post on Catalyst driver lead Terry Makedon's twitter page, users interested in getting the first taste of 9.8 should be following him:

If you want to be the first in the world to try Catalyst 9.8 and are at #quakecon, follow me for info!

We have confirmed that Catalyst 9.8 will be available to anyone at QuakeCon by coming to the ATI booth on Thursday. It will be publicly posted on Monday August 17th, and AMD found a way to make it available a few days earlier for the gamers as a special thank you. We expect them to be available on the torrents from Thursday once they spread at QuakeCon; we might even post them here if we get our hands on it.

Also make sure to be at our Hardware Workshop at QuakeCon where Terry Makedon will join our panel.

PS: AMD has some other surprises up their sleeves for QuakeCon as evident from this cryptic post on AMD's blog about the new Wolfenstein game...

Brolingstanz
08-12-2009, 05:22 PM
Is this your own code or a separate detection app?

My own flea-bitten code. It was actually by fluke, when I tried creating these 3+ contexts, that I found they worked; 3.0 is not reported as the top-level version with a basic context, as it is with NV.

barthold
08-12-2009, 09:13 PM
Hmmm... That's weird. Did you put "#version 120" (or your specific version) at the top of the shader source?


That is excellent advice, please do that. If you don't, you'll get version 1.10 by default and the GLSL compiler will fall into different code paths.

Thanks,
Barthold
(with my NVIDIA hat on)

Jacek Nowak
08-13-2009, 01:18 AM
I'd love to solve this mystery. I ticked everything in NVEmulate except for "Force software rasterization", obviously, and it generates some logs for me. Here they are: http://pastebin.com/m1fa6b5f2

The log says no errors, no warnings (actually it doesn't say anything at all).

Edit: I tested it on my brother's PC, and it reports an error with "saturate". But it doesn't on my PC. Could it be because I am using beta drivers (190.56)?

Gedolo
08-13-2009, 03:35 AM
This is a post to clarify a few things of my previous posts.

http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=261969#Post261969

http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=262005#Post262005

By joining the OpenGLs I mean the APIs, and only the APIs.

The specification may be totally different, but it's the APIs that the programmer works with when writing OpenGL software.
API similarity is all there has to be. OpenGL doesn't have to limit functionality or have the same specification.
I fully agree with Alfonse Reinheart on this.
Here is his post about the subject: http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=262016#Post262016

Don't use the version to show differences in functionality; use profiles for this.
OpenGL ES 1.0 & 2.0,
Rename the former OpenGL ES and the latter OpenGL ES FF (Fixed Function)
Major API revisions: OpenGL ES 3.0 and OpenGL ES FF 1.0

Maybe in time the vendors will write OpenGL and OpenCL drivers.
The deprecated OpenGL functions could be written in OpenCL.
This way the vendor/driver writers only have to write the code once and can enable it on any OpenCL-compatible card.

Groovounet
08-13-2009, 05:21 AM
I wonder if there isn't an issue in the OpenGL 3.2 core spec.
gl_FragColor and gl_FragData are deprecated in the GLSL 1.5 specification but still present in OpenGL 3.2 core ...

Page 182, "Shader Outputs"

barthold
08-13-2009, 09:00 AM
I wonder if there isn't an issue in the OpenGL 3.2 core spec.
gl_FragColor and gl_FragData are deprecated in the GLSL 1.5 specification but still present in OpenGL 3.2 core ...

Page 182, "Shader Outputs"

That's not a spec bug. Deprecated means "flagged for possible future removal". Deprecated does not mean "is already removed". gl_FragColor and gl_FragData are still present in the Core profile, but might be removed in future versions. Hence, you're probably better off not using them anymore if you are writing shaders.

Hope that helps,
Barthold
(with my ARB hat on)

Groovounet
08-14-2009, 02:26 AM
Thanks, it does help!
I realized how confused I am by all this ... please keep the OpenGL 3.2 deprecation model for OpenGL 3.3; the profile thing seems good. (Just maybe a bit confusing, but it might be just that it evolves from version to version.)

By the way, the OpenGL reference card is missing gl_FragColor and gl_FragData. I don't know who I'm supposed to report this to.

I actually wonder which color they're supposed to be on the card ...

barthold
08-14-2009, 11:56 AM
> By the way, the OpenGL reference card is missing gl_FragColor and gl_FragData. I don't know who I'm supposed to report this to.

Good catch! That was on purpose. We ran out of space to put it. Since they are marked deprecated, you're not supposed to use it anymore, and hence they were the first to go when we ran out of space.

Here's the complete list of deprecated, but still available, functionality. See slide 20 from my BOF presentation also:

http://www.khronos.org/developers/librar...graph-Aug09.pdf (http://www.khronos.org/developers/library/2009_siggraph_bof_opengl/OpenGL-BOF-Overview-Siggraph-Aug09.pdf)

* Wide lines
  ** calling LineWidth() with values greater than 1.0
  ** Deprecated in OpenGL 3.0

* Global component limit query
  ** API: MAX_VARYING_COMPONENTS and MAX_VARYING_FLOATS
     *** Deprecated in OpenGL 3.2
  ** GLSL: gl_MaxVaryingFloats
     *** Deprecated in GLSL 1.30 and aliased to gl_MaxVaryingComponents (also deprecated)
  ** GLSL: gl_MaxVaryingComponents. Use gl_MaxVertexOutputComponents, gl_MaxGeometryInputComponents, gl_MaxGeometryOutputComponents, gl_MaxFragmentInputComponents
     *** Deprecated in GLSL 1.50

* Misc other deprecated in GLSL
  ** Keywords: attribute, varying
  ** Outputs: gl_FragColor and gl_FragData[]
  ** Texture functions with the sampler type in their name (texture1D*, texture2D*, etc.)
  ** Deprecated in GLSL 1.30

Barthold
(with my ARB hat on)

Alfonse Reinheart
08-14-2009, 12:04 PM
Wait. The ARB kept LineWidth deprecated but unremoved, but fully removed quads and alpha test?

CatDog
08-14-2009, 02:33 PM
According to this (http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=254245#Post254245) post, they marked wide lines deprecated in 3.0, but no longer in 3.1 and up.

CatDog

Mars_999
08-14-2009, 08:31 PM
Here is the info on ATI's 9.8 driver for GL3+

This release of the ATI Catalyst driver provides OpenGL 3.1 extension support. The following is a list of OpenGL 3.1 features and extensions added in ATI Catalyst 9.8:

* Support for OpenGL Shading Language 1.30 and 1.40.
* Instanced rendering with a per-instance counter accessible to vertex shaders (GL_ARB_draw_instanced).
* Data copying between buffer objects (GL_EXT_copy_buffer).
* Primitive restart (NV_primitive_restart). Because client enable/disable no longer exists in OpenGL 3.1, the PRIMITIVE_RESTART state has become server state, unlike the Nvidia extension where it is client state. As a result, the numeric values assigned to PRIMITIVE_RESTART and PRIMITIVE_RESTART_INDEX differ from the NV versions of those tokens.
* At least 16 texture image units must be accessible to vertex shaders, in addition to the 16 already guaranteed to be accessible to fragment shaders.
* Texture buffer objects (GL_ARB_texture_buffer_object).
* Rectangular textures (GL_ARB_texture_rectangle).
* Uniform buffer objects (GL_ARB_uniform_buffer_object).
* SNORM texture component formats.


Here is the link to DL 9.8!!!

Scroll down to bottom of blog.

http://blogs.amd.com/play/2009/08/14/what%E2%80%99s-a-good-title-for-a-quakecon-blog/

memory_leak
08-19-2009, 07:32 AM
Why have it only in PDF ... pleeease, put it in HTML on the opengl.org website.

You can't style HTML like that PDF. It also wouldn't be anywhere near as printable.
Does it have to be 100% like the PDF to be nice enough for reading offline (on paper)? I think you can achieve more than acceptable printing results. With CSS you can use a separate style sheet for printing; maybe you would like to hire me as a CSS/web developer? I am looking for new employment :-)

Chris Lux
08-20-2009, 02:22 AM
> By the way, the OpenGL reference card is missing gl_FragColor and gl_FragData. I don't know who I'm supposed to report this to.

Good catch! That was on purpose. We ran out of space to put it. Since they are marked deprecated, you're not supposed to use it anymore, and hence they were the first to go when we ran out of space.
That is one point that bugged me during the BoF. Why is gl_Position still there and gl_FragColor not? What is the difference? I have to declare the name of my fragment shader output before linking the program; why shouldn't I do this with the vertex shader in a similar way? Don't get me wrong, I think some default outputs should be there; declaring the name of the output before linking is cumbersome.

-chris

Alfonse Reinheart
08-20-2009, 10:25 AM
I have to declare the name of my fragment shader output before linking the program; why shouldn't I do this with the vertex shader in a similar way?

I think it's because there is, and always will be, some fixed functionality associated with the position output from a vertex shader (or geometry shader).

Brolingstanz
08-21-2009, 02:09 AM
Plus this seems to be consistent with a "black box" user specified input and output, particularly in view of the fact that there are potentially multiple output targets at the fragment stage (analogous to the multiple inputs at the vertex stage).

Chris Lux
08-21-2009, 04:42 AM
This does not answer the question of why gl_Position is not deprecated and gl_FragColor/Data is. All your arguments are valid either way (vertex or fragment domain).

Xmas
08-21-2009, 05:09 AM
gl_Position is special as its type is fixed (vec4), and it is the input to a unique fixed function stage which determines its semantic meaning. Fragment shader outputs can have different types and various semantics.
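For illustration, a hedged GLSL 1.50 sketch of the user-declared fragment output being discussed (the names are made up; the output is tied to a draw buffer with glBindFragDataLocation before linking, or left to the linker's default assignment):

#version 150

uniform sampler2D diffuseMap;
in vec2 vTexCoord;
out vec4 fragColor;   // replaces the deprecated gl_FragColor

void main()
{
    fragColor = texture(diffuseMap, vTexCoord);
}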

Rosario Leonardi
08-21-2009, 08:33 AM
Hey... somebody is listening to me. :)
Now on the main page menu I see

"OpenGL 3.2 Reference page (Under construction)"

I hope they finish soon.

Thanks Khronos ^_^

Alfonse Reinheart
08-21-2009, 10:22 AM
This does not answer the question of why gl_Position is not deprecated and gl_FragColor/Data is. All your arguments are valid either way (vertex or fragment domain).

Like I said, it connects to fixed functionality. gl_FragColor does not.

barthold
08-25-2009, 12:30 PM
Has anyone tried programming with the new NV drivers? I have a problem with them (190.56 beta for XP32): when I try to bind a VAO (e.g. glBindVertexArray(m_vaoID[0]);), the CPU is 100% utilized (both cores on a C2D CPU) and the program stops. It happens even if I use a GL 3.0 or 3.1 rendering context. Until these drivers, everything worked perfectly using a GL 3.1 forward-compatible context. Any clue?

Aleksandar,

This bug should now be fixed with the new OpenGL 3.2 drivers we just released:

http://developer.nvidia.com/object/opengl_3_driver.html

Barthold
(with my NVIDIA hat on)

ScottManDeath
08-25-2009, 10:19 PM
It seems that GL errors are not propagated to the app, even after checking the proper setting in the NVIDIA control panel. The faulty command is getting ignored, but glGetError() returns 0.

This was on WinXP 32-bit, on an 8800 GTX, with a "normal" context, i.e. the context was created [by GLUT] without using WGL_ARB_create_context.

This also happened with the first OpenGL 3.2 beta drivers.

Aleksandar
08-26-2009, 02:00 AM
Thank you, Barthold!

Believe it or not, I have visited NVIDIA's site day after day expecting new drivers. :)
I have already downloaded the 190.57 drivers and am eagerly waiting for this afternoon to try them at home (I still don't dare to install them on my computer at the workplace). :)

I also hope that input blocks are fixed too...

Thank you for the new drivers!
Your team did a great job!
Thumbs up!

Aleksandar

Groovounet
08-26-2009, 08:48 AM
> By the way, the OpenGL reference card is missing gl_FragColor and gl_FragData. I don't know who I'm supposed to report this to.

Good catch! That was on purpose. We ran out of space to put it. Since they are marked deprecated, you're not supposed to use it anymore, and hence they were the first to go when we ran out of space.

Barthold
(with my ARB hat on)

Looking back at the reference card, you will notice that there is enough space for four variables in the "fragment language" part.

I really like this quick reference card as a reference for what is deprecated and what isn't. It would be great to have a complete reference card for that purpose.

Actually, it would be even better to have three colors for "removed", "deprecated" and "core".

barthold
08-26-2009, 09:36 AM
It seems that GL errors are not propagated to the app, even after checking the proper setting in the NVIDIA control panel. The faulty command is getting ignored, but glGetError() returns 0.

This was on WinXP 32-bit, on an 8800 GTX, with a "normal" context, i.e. the context was created [by GLUT] without using WGL_ARB_create_context.

This also happened with the first OpenGL 3.2 beta drivers.



Can you provide more details? Code examples would be great.

Thanks!
Barthold
(with my NVIDIA hat on)

Aleksandar
08-26-2009, 12:29 PM
The new NV drivers (190.57) implement nine more ARB extensions than 190.56.

GL_ARB_draw_elements_base_vertex
GL_ARB_framebuffer_sRGB
GL_ARB_seamless_cube_map
GL_ARB_sync
GL_ARB_texture_compression_rgtc
GL_ARB_texture_env_crossbar
GL_ARB_texture_multisample
GL_ARB_uniform_buffer_object

I guess that the GL 3.2 implementation list is now complete.

Once again, congratulations!

Groovounet
08-26-2009, 01:41 PM
Niceeeee .......

Alfonse Reinheart
08-26-2009, 03:03 PM
GL_ARB_texture_env_crossbar

Huh? An ancient extension like that? Crossbar hasn't been relevant since the R300/NV30 days.

Jan
08-26-2009, 03:06 PM
I was wondering the same thing...

Aleksandar
08-27-2009, 01:57 AM
Huh? An ancient extension like that? Crossbar hasn't been relevant since the R300/NV30 days.
I didn't test whether it really works (maybe it is accidentally exposed by the driver in the extension string); I just compared the ARB extension lists of the two drivers. I have only tried GL_ARB_uniform_buffer_object, and it works (at least the things that I tried).

sqrt[-1]
08-27-2009, 02:13 AM
I think the crossbar extension was always supported by Nvidia through their own NV extension (it used the same tokens and everything). I think the only difference was the behaviour when accessing a texture stage that did not have a texture bound.

From the ARB spec:
If a texture environment for a given texture unit references a texture unit that is disabled or does not have a valid texture object bound to it, then it is as if texture blending is disabled for the given texture unit

I think the Nvidia spec said the behaviour was undefined when accessing an unbound texture stage.

It always seemed to me that Nvidia was being really pedantic about the spec; and since it already had an extension that did exactly the same thing (in non-error cases), it was impossible for them to support it fully.

Eosie
08-28-2009, 07:50 AM
The Crossbar extension is part of OpenGL 1.4 and has been removed from OpenGL 3.1. Now it's back again.

Alfonse Reinheart
08-28-2009, 10:55 AM
The Crossbar extension is part of OpenGL 1.4 and has been removed from OpenGL 3.1. Now it's back again.

But it's part of ARB_compatibility, since it was once part of core; you don't need to advertise it explicitly.

Gedolo
09-02-2009, 09:12 AM
Original from: http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=263316#Post263316

A feature request about precision.

Bundling of pipelines/stream processors so they act as wider, more precise pipelines/stream processors, enabling dynamic, programmable precision.

I have seen that with GLSL, shaders can be coupled (coupled in series) to do multiple effects.

What if you could couple pipelines in parallel for enhanced precision?
Not for parallel processing, just for adding precision.
E.g. couple eight pipelines, each with full 32-bit accuracy per component, into one combined pipeline with 8*32 bits = 256-bit accuracy.

This has the advantage of being very scalable with a good specification.
If there is only one pipeline, then that pipeline will just have to take more time on the calculations and store data in cache memory.
If pipelines are left over because of the size of the combined pipeline (e.g. 5*32 bits on 17 stream processors/pipelines leaves 17 mod 5 = 2 left over),
no problem, then so be it.

This would allow dynamic, programmable precision, which is useful and welcome (essential?) in several areas.
Physics simulations, for instance.

But also in more mainstream applications:
position calculations in games on huge maps without running into precision issues. (It's going to be slower than lower precision, but at least it is possible to have good animation and movement.)


I got this idea after thinking about precision problems in physics simulations and the fact that shaders can be chained one after another; each pipeline has a certain precision, and current graphics processors have a lot of them (ATI has graphics cards with 800 stream processors/pipelines).

Gedolo
09-02-2009, 09:35 AM
Sounds good, doesn't it?
Why not bundle/couple the stream processors to enhance precision when needed, anyway?

Jan
09-02-2009, 10:48 AM
Please don't cross-post.

Alfonse Reinheart
09-04-2009, 02:49 PM
The enumext.spec file has an error in it. There is a line:


use VERSION_3_1 R_SNORM

in the EXT_texture_snorm extension. It should be this:


use VERSION_3_1 RED_SNORM

as "R_SNORM" doesn't exist.

Additionally, in gl.tm, there is no entry for Int64, though it is used in the gl.spec file. There is an entry for Int64Ext.

Stephen A
09-05-2009, 07:36 AM
@Alfonse Reinhart: I can confirm both issues.

The first was filed as bug #195 (http://www.khronos.org/bugzilla/show_bug.cgi?id=195) back in August.

Couldn't find anything about the second one, filed it as bug #201 (http://www.khronos.org/bugzilla/post_bug.cgi).

Rsantina
09-07-2009, 04:39 AM
Does anyone know if ATI will keep supporting Opengl 1.x/2.x drivers? I know that this is the case for nvidia

Stephen A
09-07-2009, 08:35 AM
Ati tends to avoid announcing future plans, but I consider it highly, highly unlikely that they will ever stop supporting OpenGL 1.x / 2.x.

Obviously, I cannot speak for Ati and this is just a personal opinion, so take that as you will. However, Ati's commercial customers are *all* using OpenGL 1.x / 2.x right now - I consider that as good a guarantee as any.

Alfonse Reinheart
09-26-2009, 12:36 AM
The wglenumext.spec file is missing enumerators (and the definition of the extension itself) for WGL_ARB_framebuffer_sRGB.

Stephen A
09-26-2009, 02:08 AM
The wglenumext.spec file is missing enumerators (and the definition of the extension itself) for WGL_ARB_framebuffer_sRGB.

Confirmed and logged as bug #210 (http://www.khronos.org/bugzilla/show_bug.cgi?id=210). This also affects glxenumext.spec.

On a related note, the issues from the previous page (#195 and #201) were marked as fixed yesterday.

Alfonse Reinheart
09-26-2009, 01:09 PM
On a related note, the issues from the previous page (#195 and #201) were marked as fixed yesterday.

I checked the spec files, and they're fixed. They even added GLuint64 and GLsync, which I hadn't mentioned were also missing.

Alfonse Reinheart
09-27-2009, 09:25 PM
Also missing in the WGL spec files are entrypoints for WGL_EXT_swap_control.

nesister
10-08-2009, 05:02 AM
Hi, I've got two requests:

First, I think there's a problem with the current gl.spec: the following are defined as OpenGL 3.2 functions but are not specified in the spec PDF.



# OpenGL 3.2 (ARB_geometry_shader4) commands

ProgramParameteri(program, pname, value)
return void
param program UInt32 in value
param pname GLenum in value
param value Int32 in value
category VERSION_3_2
version 1.2
extension
glxropcode ?
glxflags ignore
offset ?
...
FramebufferTextureFace(target, attachment, texture, level, face)
return void
param target GLenum in value
param attachment GLenum in value
param texture UInt32 in value
param level Int32 in value
param face GLenum in value
category VERSION_3_2
version 1.2
extension
glxropcode ?
glxflags ignore
offset ?


ProgramParameteri is useless, as its parameters are now set in the geometry shader source code.
FramebufferTextureFace was used to attach a single face of a cube map to a framebuffer object, but that can be done with FramebufferTexture2D.
So I think gl.spec has it wrong here.
I might add that the latest nVidia GL 3.2 driver on Linux (190.18.05) doesn't expose them (only the ARB versions, from ARB_geometry_shader4).
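For reference, a hedged sketch of the equivalent call (texture name and level are illustrative): attaching the +X face of a cube map to the bound framebuffer is just

glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                       GL_TEXTURE_CUBE_MAP_POSITIVE_X, cubeTexture, 0);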

Second, I've already asked if there was an equivalent to glGetActiveUniform but for fragment shader outputs (see this post (http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Main=50199&Number=258443#Post258443)). There is an equivalent to glGetUniformLocation (glGetFragDataLocation), but output variable names can't be dynamically queried. So is this something planned?

Thanks.

kRogue
10-12-2009, 12:32 PM
I see the following in the GL 3.2 core spec (version dated 20090803) {looks like this is more of a GL 3.1 spec bug that went unnoticed}, in Section 2.19 (page 97):




This view volume may be further restricted by as many as n client-defined half-
spaces. (n is an implementation-dependent maximum that must be at least 6.)


but in Section G.2 (page 343)



Update MAX_CLIP_DISTANCES from 6 to 8 in section 2.19 and table 6.37,
to match GLSL (bug 4803).


Spec bug?

In truth, I found this while checking whether it was supposed to be 6 or 8, because nVidia drivers have it at 6: nVidia MAX_CLIP_DISTANCE only 6 (http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=265513#Post265513)

Alfonse Reinheart
10-31-2009, 12:57 PM
Several non-core ARB extensions are missing the suffix from their enum names. GL_ARB_cube_map_array is not a core extension, and the extension spec clearly has ARB suffixes on the names. But the .spec files do not.

This is also true of GL_ARB_texture_gather, GL_ARB_draw_buffers_blend, and GL_ARB_sample_shading.

mossland
01-04-2010, 06:13 AM
Some of the new features have not been fully implemented by the hardware vendor, for example the conditional rendering commands and the conditional wait commands (glWaitSync).
Independent stencil buffer and depth buffer for an FBO, etc.
Yet most of them have not been officially documented by the hardware vendor.

Alfonse Reinheart
01-04-2010, 05:37 PM
Which one of "the hardware vendor" are you referring to? And which driver version are you on?


Independent stencil buffer and depth buffer for an FBO, etc.

Nobody implements these. Don't try to use them. Just use depth_stencil.

mossland
01-05-2010, 08:10 AM
Thanks for your kindness.
The vendor: nVidia
Driver version: 195.62
The problem:
a command sequence like:
...
glFenceSync
glWaitSync
glDeleteSync
...
in my render code,
after hundreds of frames,
the GL server seems to hang for several seconds,
and meanwhile doesn't respond to any GL command.

Maybe I should report this bug to nVidia,
but I quote it here just as a comment on the
present OpenGL state:
a standard supported by multiple vendors,
but not every function defined by the specs
is supported by every vendor.
And some of the unsupported functions are not
reported intuitively and clearly.
For example, the independent stencil buffer and depth buffer
are reported by the nvidia driver as "attachment incomplete",
not as "unsupported", which left me confused for some days.

mossland
01-05-2010, 08:29 AM
A method for the GL server to report its unsupported functions, and sometimes the reason for that, is very important to
developers.
For example, when I created a 3.2 core profile context and
checked the version, it was confirmed as a 3.2 context.
Then I assumed that all the functions and features defined by the 3.2 spec were supported and implemented. I checked the shader version; it is 1.50. I declared this version at the beginning
of my shader file, but later on it seemed that the noise1
shader function does not work. I debugged and debugged over time,
and in the end I suspected that it is not implemented by nvidia.
The search result: it is true.
How can I get that result from the GL API?
And how does the GL server report this?
I hope future GL versions provide a nice way to
let the GL server report its unsupported functions or,
vice versa, report its fully implemented functions.
Right now, the state is:
Function Pointer: Exists
Report Error: No
Effect: Nothing
Reason: found by web searching.

martinsm
01-05-2010, 12:18 PM
The noise function in GLSL can return the value 0. It is a perfectly legal value to return. Nvidia has done nothing wrong there.

Brolingstanz
01-05-2010, 02:38 PM
Besides, I don’t think that this is a driver feedback thread; rather it's an OpenGL 3.2 feedback thread. If you’ve found a reproducible bug in a particular driver you should probably report it to your vendor. Or if you need help diagnosing a problem with your code then you might try the developer forums.

Alfonse Reinheart
01-05-2010, 03:49 PM
in my render code,
after hundreds of frames,
the GL server seems to hang for several seconds,

First, stop putting random
line
breaks
in your
sentences.

That just makes your post hard to read. HTML will word-wrap for you; you don't have to do it manually. Just put a line between paragraphs, and everyone can understand you.

Second, did you read the specification? Are you using the functions properly? Because, from the ARB_sync spec, there is a particular way of doing things. And if you don't do them correctly, the spec warns you that you may get into an infinite loop. It's been a while since I read the spec, but there's something about having your first wait on a sync do a flush or some such.
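For illustration only, a hedged sketch of the client-side pattern the ARB_sync spec describes (not mossland's actual code); the flush bit on the first wait is what prevents waiting forever on a fence that was never flushed:

GLsync fence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
/* ... issue other work, swap buffers, etc. ... */
GLenum result = glClientWaitSync(fence, GL_SYNC_FLUSH_COMMANDS_BIT,
                                 1000000000); /* timeout in nanoseconds */
if (result == GL_TIMEOUT_EXPIRED || result == GL_WAIT_FAILED) {
    /* handle the timeout or error instead of spinning forever */
}
glDeleteSync(fence);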

Third, as others have pointed out, this thread is about the OpenGL 3.2 specification, not any particular implementation thereof.

kyle_
01-16-2010, 05:18 PM
The noise function in GLSL can return the value 0. It is a perfectly legal value to return. Nvidia has done nothing wrong there.
I don't think it is. As far as I remember, it's specified to have some coverage of the returned values.
But then again, nobody serious cares, as it's supported only in Mesa.

Y-tension
01-30-2010, 08:14 AM
Well, it's been a while since we heard any news from anyone. Not that I have a lot to complain about: the 3.2 spec was good enough to rekindle a spark of interest among developers, and we already have implementations. However, I think the next step to promote the API is the availability of better tools. Many people have already suggested this in these forums, and as my own experience with the library grows I realize more and more that this need exists. I would really like to see some work in that direction from Khronos as well. I really hope that more developers will adopt the API and start pushing in this direction.

Eosie
01-30-2010, 10:33 AM
Khronos can't do anything about tools. All they can do is put words on paper.

AMD GPU PerfStudio is a great tool to profile your OpenGL applications.
AMD RenderMonkey is a nice IDE for developing GLSL shaders.
And there's much more...

serino
02-19-2010, 11:01 AM
INFORMATION ABOUT THE SHADOW MATRIX


Who can explain the shadow matrix?

Dark Photon
02-19-2010, 02:38 PM
INFORMATION ABOUT THE SHADOW MATRIX

Who can explain the shadow matrix?
Next time, please start a new post. This apparently doesn't have anything to do with the thread you appended to.

So, Shadow Matrix in what context?

Depends on the application and the shadowing technique being discussed. But commonly it is a term used in shadow mapping for a transform that maps from object-space, world-space, or camera-eye-space coordinates to a light-NDC-space (mapped to 0..1). However in planar-projected shadows, it's a matrix that smashes 3D geometry onto a plane.
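For the shadow-mapping flavour, a hedged GLSL 1.40-style sketch (matrix names are illustrative and conventions vary by engine): map eye-space positions into the light's [0..1] texture space.

// Scale/offset matrix mapping NDC [-1,1] to [0,1] (column-major constructor)
mat4 bias = mat4(0.5, 0.0, 0.0, 0.0,
                 0.0, 0.5, 0.0, 0.0,
                 0.0, 0.0, 0.5, 0.0,
                 0.5, 0.5, 0.5, 1.0);
mat4 shadowMatrix = bias * lightProjection * lightView * inverse(cameraView);
// In the shader: vec4 shadowCoord = shadowMatrix * eyeSpacePosition;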

Alfonse Reinheart
11-12-2011, 10:02 AM
By the way, maybe it's time to update the online documentation; it's still at version 2.1.

You mean this online documentation (http://www.opengl.org/sdk/docs/man4/)?