View Full Version : My wish list for OpenGL next versions

08-07-2009, 05:43 AM
http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=261969#Post261969 Response to the official feedback topic about OpenGL 3.2 (http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=261969#Post261969)

Instead of duplicating my wish list here:
the link above leads to my reply in the official OpenGL 3.2 feedback thread.

08-07-2009, 07:32 AM
I do agree that tessellation should eventually make its way into core, and that the binding machinery has to be rethought and unified. It's not that it's particularly bad; it just needs some work. Sometimes binding simply means "use this", but at other times it's just the first of multiple steps you have to take before you can start rendering.

But as for the rest: no. Why?

08-09-2009, 04:09 AM
Yep, the next version of DX is more or less the proverbial writing on the wall, at least where major features are concerned. You needn't look much further than DX11 to see what's in store for the next several GL3 increments (not that there isn't room for originality or uniqueness here and there, as was clearly demonstrated in 3.2's sync and way cool uniform blocks).

08-13-2009, 04:49 AM
The binding system is bad. It can be done more simply.

Every extra bit of programming detail makes things harder.
You already use your brain a lot when writing code.
Sure, it's simple. Yes, when you're doing simple projects.
I don't want to get frustrated in complicated projects because of this kind of thing.

I don't want to have to remember yet another thing when I'm building complicated rendering engines. It slows me down and makes things unnecessarily complicated. Every bit of complexity is an extra burden on the brain.
And every burden that can be removed is one burden too many.

By unifying the different OpenGL versions I mean the APIs.
The specifications may be completely different and offer different functionality.
There is nothing wrong with that,
as long as the APIs for the same functionality are the same.
I completely agree with Alfonse Reinheart on this:
http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=262016#Post262016

Programmers work with the APIs, and if the APIs are the same, porting is very easy.

All the rest is really about being able to make major revisions without conflicting versions (as OpenGL ES 1.0 and 2.0 currently do),
and about having backwards compatibility as an option.

Anyway, there are two more posts about my wish list:

http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=262005#Post262005

http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=262276#Post262276

09-02-2009, 10:07 AM
A feature request about precision.

Bundling of pipelines/stream processors to act as wider, more precise pipelines/stream processors, enabling dynamic, programmable precision.

I have seen that with GLSL, shaders can be chained (coupled in series) to do multiple effects.

What if you could couple pipelines in parallel for enhanced precision?
Not for parallel processing, just for added precision.
E.g. couple eight pipelines, each with full 32-bit accuracy per component, into one combined pipeline with 8*32 bit = 256-bit accuracy.

This has the advantage of being very scalable, given a good specification.
If there is only one pipeline, that pipeline will just have to spend more time on the calculations and store intermediate data in cache memory.
If some pipelines are left over because of the size of the combined pipeline (e.g. 5*32 bit on 17 stream processors/pipelines leaves 17 mod 5 = 2 unused),
no problem; so be it.

This would allow dynamic, programmable precision, which is useful and welcome (essential?) in several areas.
Physics simulations, for instance.

But also in more mainstream applications:
Position calculations in games on huge maps without running into precision issues. (It will be slower than lower precision, but at least good animation and movement become possible.)

I got this idea after thinking about precision problems in physics simulations and the fact that shaders can be chained one after another: each pipeline has a certain precision, and current graphics processors have a lot of them (ATI ships cards with 800 stream processors/pipelines).

Alfonse Reinheart
09-02-2009, 11:30 AM
Position calculations in games on huge maps without running into precision issues.

Things like that almost never come up. In games with huge maps, the renderer will likely have culled objects that are that far away.

Further, if you're rendering objects that far away, then your 24-bit depth buffer isn't sufficient. And that is hardcoded; replacing it means losing performance features like Early-Z and Hi-Z/etc.

This seems more like something that should go into OpenCL as an option.

09-02-2009, 11:47 AM
"Have seen that with GLSL, shaders can be coupled (coupled in serie) to do multiple effects."

What are you talking about?? This is not possible and certainly never will be. You have to program such things yourself.

About the precision stuff: This is all OpenCL area. OpenGL/GLSL doesn't (want to) know anything about stream processors, pipelines and all those nasty "details".

If you want to do physics on the GPU, use OpenCL / Cuda, not GLSL, it is intended for such purposes.

Also, I am not sure one can use "more pipelines for more precision". I don't think the ALUs are designed to allow computations to be done that way.


09-02-2009, 11:48 AM
My wish list would be: stop where you are until we have reliable drivers that support the current spec, at least from the major vendors. :)

09-02-2009, 05:34 PM
Stop? Please, don't stop! (oooooh....eeeeeeee.....ahhhhhh!)

;) :eek: :D :cool:

09-02-2009, 10:13 PM
Define sarcasm :)

How to make progress on the driver side is more important, I guess.

BTW, what's the current version Apple supports?

Dark Photon
09-03-2009, 12:34 PM
Position calculations in games on huge maps without running into precision issues.
Things like that almost never come up. In games with huge maps, the renderer will likely have culled objects that are that far away.
A guess, but I think he may be talking about huge arenas/worlds where float32 precision isn't sufficient to represent world-space positions without insufferable precision errors.

The errors computing MODELVIEW in this case can often be avoided by doing your matrix math in float64 (double) on the CPU.

However, that still doesn't mean you can manipulate world-space in your shader to the accuracy you need. You have to resort to local coordinate system tricks for that, since shaders can't do float64.

09-03-2009, 01:17 PM
Define sarcasm :)

Wasn't being sarcastic. Damn the torpedoes... full speed ahead!


09-05-2009, 04:34 AM
That's the stuff I'm talking about.
Dark Photon is correct.
How would you do physics calculations that could produce an error, or unsolvably big problems, when the precision isn't high enough?
(This can happen; such physics models exist. Don't tell me it can't, because there can always be bugs in someone's code.)

Take Celestia, a free and open-source space simulator. It can simulate the (somewhat simplified) universe and renders a lot of stars. Many people want to place stars and planets in far-away galaxies in add-ons, but they can't because of accuracy issues: the stars would get stacked on top of each other. The same goes for spacecraft orbits; single precision is not good enough for those. And in some future cases, maybe even more precision than 64-bit floating point is needed. It is also a problem with the telescope feature: trying to view exoplanet stars and far-away bodies in our solar system is troublesome.
Zooming with the telescope feature on a spacecraft located in a solar-system add-on for Celestia, in a galaxy cluster 10^16 to 10^20 light years away, definitely requires more precision than any current datatype can provide (even more than float128). :mad:


I try never to say:
"I don't see a use now, so let's assume it will never be needed."

It's like the people who say:
"If I can't see it, it doesn't exist!"
(Then they should be able to see atoms, but nobody can.)
People who say this are very short-sighted and egocentric.
They should be ashamed of even suggesting this sort of anti-progressive behaviour.
Stop the [censored] whining about how it's not useful; get over it and realize that other people might have other needs.

This forum is about discussing what could be improved or added in OpenGL,
not just about adding whatever is missing compared to the newest DirectX version.

And with 64-bit datatypes the precision is merely better, not adjustable to everybody's needs.
I don't need it currently, but maybe someone will,
and it's important to realize that!
I don't know for sure, but neither can you be sure it is not needed.

Datatypes and parameters:
Here is the solution to precision problems, and also to encoding problems, in datatypes:
datatypes with parameters!

(These examples are just for illustration.)
integer: int(32) /* a 32-bit integer; one bit is reserved for the sign */
int() /* an integer with a default width, could be 32-bit signed */
int(u,16) /* an unsigned 16-bit integer */
int(512) /* a 512-bit integer */

float: float(256) /* a float with 256bit */

/* For strings there is the encoding issue:
there are a lot of encodings, and you often don't know which one the language is using under the hood, or you may want to force a certain encoding. Encoding parameters add that kind of flexibility. */

string(UTF8) /* a string with UTF8 encoding */
char(UTF8) /* a character with UTF8 encoding */

These points also apply to OpenCL.
They solve the problem that it is sometimes unclear how many bits the compiler reserves for a data type, so exchanging source code produces different results on different computers, which makes debugging more difficult because more parameters are involved.

The binding system once again:

The binding system is totally useless.
Binding could be improved on, or replaced by atomic operations.
Binding makes code larger and therefore harder to debug.
Binding takes up space while being completely unnecessary and replaceable with something better.
The binding system is bloat.

There is also a problem of expectations in how people see 1.0 and 2.0 relate. The fixed-function pipeline should be noticeable in the name, for clarity and to avoid confusion among the general public. Ignoring this can harm OpenGL's reputation.

Alfonse Reinheart
09-05-2009, 11:27 AM
How would you do physics calculations that could produce an error, or unsolvably big problems, when the precision isn't high enough?

Use OpenCL. If you're doing physics calculations that are serious enough to need precision greater than a 32-bit float, you probably need a lot of things OpenCL provides that OpenGL does not.

09-07-2009, 12:00 PM
Zooming with the telescope feature on a spacecraft located in a solar-system add-on for Celestia, in a galaxy cluster 10^16 to 10^20 light years away, definitely requires more precision than any current datatype can provide (even more than float128). :mad:
Well, 10^20 light years are 9.46e35 metres. 2^128 is ~3.4e38, so 128 bit integers/fixed point have a fixed precision of 2.8 mm over that distance. That's quite a lot, considering that no data measured at such distance can ever be that precise.

Anyway, you can't just "bundle pipes" for more precision. Even the simplest example, a 2N-wide integer addition built from an N-bit adder, needs to propagate the carry bit and would thus have to be performed in sequence. A wide multiplication requires several narrow multiplication and addition operations. Combining floats to form a wider float is a lot more complicated (and probably pointless if you have integers), as is any operation beyond multiplication and addition.

If you are willing to spend the time working out how to do it, you can do all that in sequence in one pipeline. No need to bundle them if you're processing multiple vertices/fragments in parallel anyway.

10-04-2009, 09:56 AM
With older cards I will only be able to use lower precision with OpenGL.

Combining several pipes will make the combined pipe slower; that's how hardware works. Duh!
The point is that it would still be faster than doing it in steps in software. Every card will have a maximum, of course. But once this is present, older cards can offer a lot of precision, making developers' work easier in the future.
You're saying that bundling two 32-bit pipes and using a 64-bit datatype have the same delay in theory, since the software/hardware still has to do all the necessary work in some way.

But there is a large difference between software doing that in sequence and the hardware being able to do it.
Hardware acceleration will make a difference; it will be faster!
How much? I don't know. It could be significant in some situations.
(Search for float64 operations-per-second benchmark comparisons on 32-bit and 64-bit OSes on 32-bit and 64-bit CPUs.)


A better illustration: right now developers have to invent all sorts of algorithms to work around precision problems,
in situations where older hardware can't be ignored!

A lot of older graphics cards only have single precision, which makes it much harder to do big things on them.
It's possible to write code that uses certain techniques to process things in steps and still be fast. Allowing bundling would avoid running into this problem again a few times in the future.
And it would enable better performance, as described above in this post.

My point is that doing it in steps is slower than when the card can bundle its pipes, and on top of that it's much more difficult to program the extra algorithms needed to handle older hardware in steps and stay fast.
How the internals work will be up to the IHVs and driver writers; how it behaves will be up to the OpenGL specification. OpenGL could also dictate a minimum and/or maximum for clarity.

10-05-2009, 05:28 AM

I'm not talking about performance but about hardware complexity. You can't simply connect two single precision FP ALUs to form one double precision FP ALU, that's just not how it works. You seem to have no idea how expensive it would be to actually implement your suggestion.

(Search for float64 operations-per-second benchmark comparisons on 32-bit and 64-bit OSes on 32-bit and 64-bit CPUs.)
The execution speed of double precision operations on the FPU has absolutely nothing to do with the OS running in 32 bit or 64 bit mode.

10-12-2009, 06:38 AM
It makes a difference on the application side.
I'm well aware that this suggestion is very complex to realize in hardware. This is an ideas list; I just let my imagination run freely.

In the end the OS does matter for application speed; I just included it for completeness. A different OS can also have different drivers and a lot of other things that make a difference.

10-12-2009, 06:43 AM
While textures are very mature.
(Here comes my next suggestion.)
Something to put vector-based textures in memory.
Not as bitmaps but as vectors, with the ability to do transformations and other operations on them. This belongs more to OpenVG, but it would probably be used a lot in OpenGL too, and this is a list I started in OpenGL.
So: OpenGL support for vector textures, based on SVG 1.2.

(Eventually, when going on-screen, the necessary parts would be rasterized, either into another texture or directly into the framebuffer.)

10-12-2009, 06:55 AM

*GetString(enum name);

Is this usable everywhere?
If not please allow it to be used everywhere in a program.

10-12-2009, 07:00 AM
For the preprocessor:
allow the version to be a string.
Right now it's already 1.5.0;
in the future there could be too little space for something like 1.11.0.

Further: wherever there are versions, use a string datatype,
and define a general API with methods to check <, > and = on the whole string or a substring. Generalization and standardization of version-related matters across all of OpenGL.
(Maybe also for others, e.g. OpenVG, OpenMAX, OpenKODE, OpenCL, ...)

10-12-2009, 07:27 AM
Direct State Access present and in core:

It's discussed on slides 59 to 68.

Alfonse Reinheart
10-12-2009, 11:21 AM
Is this usable everywhere?

What do you mean by "everywhere?" It can only be used while OpenGL is active, and outside of glBegin/glEnd (I think). Other than that, yes.

For the preprocessor.
Allow the version to be a string.
Because now, it's already 1.5.0.

Are you talking about the GLSL preprocessor? If so, what does it matter whether it's a string or a number?

10-20-2009, 12:37 PM

That's what I meant by everywhere; thanks.

About the preprocessor's version attribute:
it's said that it's not very good.
What I mean is that it is rather inconvenient to check the minor version. (A string of three sub-strings would be handier to work with in those situations.)

For flexibility, it would be handier and safer to be able to manipulate it as a string consisting of sub-strings.
There is nothing wrong with using an array of integers internally, since an unsigned 16-bit integer has a range of 0 to 65,535.

Why safer? It doesn't matter until the version number hits something like 1.10! Converting to an integer internally and re-encoding it for display could, if the programmer doesn't code it right, produce bugs (showing 2.0, say).
It opens a few pitfalls (things that are easily overlooked) on the way to making it robust.

Alfonse Reinheart
10-20-2009, 02:26 PM
What I mean is that it is rather inconvenient to check the minor version. (A string of three sub-strings would be handier to work with in those situations.)

GLSL is a shading language. As such, it neither has nor needs strings or string manipulation functions.

I don't know how exactly it is difficult to use the floating-point version number. Wouldn't a ">= 1.3" test be sufficient for your needs?

However, if it is, then the proper solution is to have separate major and minor version numbers.
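As an aside, the actual GLSL preprocessor already sidesteps the string-versus-float question: the #version directive and the predefined __VERSION__ macro use a plain integer (110, 120, 130, 150, ...), so a minor-version check is an ordinary integer comparison:

```glsl
#version 150
// __VERSION__ is the integer 150 here, not a string "1.50"
#if __VERSION__ >= 130
    // can rely on features introduced in GLSL 1.30, e.g. in/out qualifiers
#endif
```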

Stephen A
10-21-2009, 10:47 PM
I'm not sure I'm parsing Gedolo's post correctly, but it would make sense to add two new GetProgram parameters that return the major and minor versions, respectively (as integers).

The issue with string parsing is that it exposes you to localization and string comparison issues (where the dot might not mean what you think it does and the sorting order might not be what you expect, i.e. "1.100" < "1.50" or "1.20" != "1,20"...)

Relatively easy to avoid, yes, but still a potential issue that you have to test and verify. Better get the API right from the beginning (http://blogs.msdn.com/brada/archive/2003/10/02/50420.aspx) and return plain ints.

10-22-2009, 11:07 AM

You're correct; I hadn't thought of that as a reason.
(My main focus was flexibility.)
Speaking of which, OpenGL would be better off if all version parameters consisted of a few ints:
ui - uint (32 bits)
To be complete, the version should have 3 ints,
because a version has the form x.y.z.
Writing code to compare values is easy with this system.

Thus, a change of request:
change all version strings into arrays of ints, with defined naming and comparison behaviour.

For example (this is for illustration purposes):
3 ints:
Major = first int
Minor = second int
third = third int

(API: major and minor is a good idea. But wait: why not use numbers that also indicate importance, leaving 0 for the build; if there is no build, then don't use zero.
1 comes first, thus:

getVersion: returns everything
get1Version: returns the major version number
get2Version: returns the minor version number

Notice how naming the numbers this way is intuitive, efficient and clear.
Determining which number is larger is just a matter of comparing the first; if those match, the second, and so on.
And the first numbers always have a higher importance
(major > minor) than the numbers after them when sorting:
example: 3.1 > 1.4; tada, no string issues.
Because it's comparing two sets of ints that are ordered by precedence: the first has the highest precedence, and the higher the index, the lower the precedence.

Not to mention flexible and scalable:
// When the version number of something weird has extra components,
// the following becomes:
for w: get4Version
for v: get5Version

For completeness:
the build version is an int.
get0Version: returns the build version

Please do allow modular use of these methods.

End of example.

About the OpenGL --> SVG idea:
the geometry shader looks like a very convenient starting point for this. Maybe this is (going to be?) possible to do with OpenGL, OpenCL (and OpenVG)?

02-19-2010, 09:29 AM
Another feature request:
OpenGL control over IGPs, discrete cards and other graphics hardware.
(I'm seriously starting to consider a thread in the OpenCL forum.)
(Just one more.)

Something that has come up fairly recently:
graphics card switching while running, i.e. switching between a discrete and an integrated graphics processor at runtime.
It would be useful if a developer who is, say, making a game could ask for maximum performance without IGP fall-back, if he WANTS to. When you're in a game whose load sits right on the edge between discrete and integrated, the computer could end up switching a lot, so it would be handy to be able to limit such scenarios on the OS or application side.

It would also be handy for debugging and profiling purposes to be able to control which cards are used: whether the IGP is used instead of, or together with, the discrete card, and whether multiple graphics cards should stay on. (Profiling and testing purposes; write once, deploy anywhere.)

Also: a GPU-minimal mode where it's still possible to pipe output to the connectors of the discrete card, and a slightly less minimal mode that requests GPU-minimal but can still use the discrete card's VRAM from, e.g., an IGP. This would be especially useful when the IGP has very little VRAM but the processing power of the discrete card isn't needed. (The power savings are a good thing too.) The IGP could still access all resources very fast and even pipe its output back to the discrete card's connector. (Handy for applications that want to decide which screen they appear on.) This is more a matter for drivers, but applications can also benefit from load balancing; hence the reasoning for having an API in OpenCL / OpenGL? / OpenWF? (the port/output-connector thingy).

This could be handy for, e.g., profiling and load balancing of applications with settings.
Such an API would give applications and/or the operating system a standard interface to communicate about the use, load balancing, profiling and power savings of graphics cards.
(E.g. being able to let the application use the present graphics processors' resources separately: memory, ports, GPU.)

More granularity for instancing, please.