What was wrong with the old API?



BManTR
11-26-2011, 04:10 AM
Hi all,
I started OpenGL coding 10 years ago and did it for several years; then I quit graphics programming. Recently I returned to the graphics field, and what I see is that OpenGL coding has changed in a strange and complicated way.
Previously, it was easy to learn, simple to write, and fast to develop graphics applications with, and the best part was its simplicity compared to D3D.
My question is:
Why do people want to remove the old part of the API, which made OpenGL a better choice at that time?
The OpenGL 3+ API is not as simple as the previous one at first sight (no matrix operations, no transformations, etc.).
Thanks

thokra
11-26-2011, 06:22 AM
Why do people want to remove the old part of the API, which made OpenGL a better choice at that time?
The OpenGL 3+ API is not as simple as the previous one at first sight (no matrix operations, no transformations, etc.).

First of all, being the better choice at some point doesn't mean being the better choice later. That's especially true for graphics programming and graphics hardware.

Second, no one is telling you to forsake fixed-function coding style, GL2 paradigms and so on. NVIDIA and AMD fully support the compatibility profile, which basically guarantees functionality dating back to before you started graphics programming. If you want to use immediate mode, if you want to use fixed-function Gouraud shading and so forth, feel free to do so. Still, no one serious about OpenGL programming would suggest that ignoring recent developments is a good idea.

Third, matrix operations and the matrix stack have been dropped, yes. But that doesn't mean much: if you really wanted to do matrix math in the past, you'd have had to employ a third-party library or your own functions anyway. Implementing a matrix stack is 1-2 days of work. And there are plenty of well-designed math libraries out there which allow for almost anything you could ask for in graphics programming, most notably GLM by board moderator Groovounet.
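
For illustration only (a minimal sketch, not from this post; GLM is assumed to be available and the class name is made up), a replacement matrix stack really is a small amount of code:

#include <stack>
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// A tiny stand-in for glPushMatrix/glPopMatrix/glTranslatef and friends.
class MatrixStack
{
public:
    MatrixStack() { m_stack.push(glm::mat4(1.0f)); }

    void push() { m_stack.push(m_stack.top()); }   // like glPushMatrix
    void pop()  { m_stack.pop(); }                 // like glPopMatrix (no underflow check here)

    void translate(const glm::vec3 &v) { m_stack.top() = glm::translate(m_stack.top(), v); }
    void scale(const glm::vec3 &v)     { m_stack.top() = glm::scale(m_stack.top(), v); }

    const glm::mat4 &top() const { return m_stack.top(); } // upload this as a shader uniform

private:
    std::stack<glm::mat4> m_stack;
};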

Fourth, no transformations? You are aware that shaders give you complete freedom to affect the orientation and position of every single vertex in the scene in almost any way you can think of, right? This goes hand in hand with my first and second points. Ten years ago there were no shaders - register combiners maybe, assembly-language programs maybe, but no shading languages and compilers anywhere near as powerful as what we have today. Please pardon the language, but dude, you absolutely gotta learn about shaders.
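
To make that concrete (an illustrative sketch, not part of this post; the attribute and uniform names are made up), the "lost" transformations simply move into a vertex shader that you write yourself:

// Minimal GLSL 3.30 vertex shader: the whole transform is whatever you compute,
// fed by a matrix the application builds on the CPU (e.g. with GLM) and uploads.
const char *kVertexShaderSrc = R"GLSL(
#version 330 core
layout(location = 0) in vec3 aPosition;
uniform mat4 uModelViewProjection;
void main()
{
    // Any per-vertex math can go here; nothing is fixed any more.
    gl_Position = uModelViewProjection * vec4(aPosition, 1.0);
}
)GLSL";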

Alfonse Reinheart
11-26-2011, 11:11 AM
Why do people want to remove the old part of the API, which made OpenGL a better choice at that time?

For many reasons.

The fixed-function pipeline often gives the illusion of knowledge. For example, all (http://stackoverflow.com/questions/8273470/c-opengl-glulookat) of (http://stackoverflow.com/questions/8265636/how-to-get-specular-color-in-opengl) these (http://stackoverflow.com/questions/8263690/problems-with-opengl-lighting) questions (http://stackoverflow.com/questions/8252886/glulookat-not-affecting-gluquadric-objects) represent (http://stackoverflow.com/questions/8277349/unable-to-use-lighting-properly-in-opengl-project) people who think that they know how OpenGL works, but don't. In reality, these people have been copying and pasting code into all of their projects, and now suddenly they want to do something different for which there's no code to copy, or their mass of conflicting copy-and-paste code has resulted in a bug.

You cannot get that with shaders. If you're using shaders, you have to know how they work.

Another thing is that shaders are much simpler to actually understand than the various pieces of fixed-function state. Shaders spell out in meticulous detail exactly what's going on. They're programs, written in a C-like language.

They're not some conflicting mass of global enables and state that could be set from anywhere in your code. To know what happens when you render with a shader, all you need to know is what uniforms were given, what textures were bound, and what attributes were passed. Some of those bugs I linked to came from not setting some arbitrary piece of state to enable something in order to cause some effect to happen.
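
As a hedged illustration of that point (names such as program, mvpLocation, mvp, diffuseTexture, vao and indexCount are placeholders, not from the post), everything a shader-based draw depends on is stated explicitly right next to it:

glUseProgram(program);                                     // which shader runs
glUniformMatrix4fv(mvpLocation, 1, GL_FALSE, &mvp[0][0]);  // what data it receives (mvp: e.g. a glm::mat4)
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, diffuseTexture);              // which texture it samples
glBindVertexArray(vao);                                    // which attributes and index buffer it reads
glDrawElements(GL_TRIANGLES, indexCount, GL_UNSIGNED_INT, nullptr);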

Dealing with fixed function can be painful unless you know every detail about how it works. How TexEnv works. How the lighting works, and what the lighting computes. How to feed data from lighting into TexEnv. How to feed texture coordinate generation into TexEnv. Using crossbar to change the order of things. Etc.

Once you understand shaders, they're a lot easier to work with than fixed-function. Oh, they don't hold your hand. They don't allow you to use lighting without knowing the algorithm behind lighting. But that too is a benefit, because it requires the person to at least minimally understand what's going on.

I wrote the tutorial in my signature as a way to help people over that hump.

glfreak
11-26-2011, 12:36 PM
I agree there were very nice features in the old API (AKA the compatibility API), as you already mentioned. However, video hardware has changed, the fixed pipeline has become ancient, and the API needs to catch up with the new hardware.
Now what's next? I anticipate no API at all before long: just let the CPU giants create a super CPU and the problem is solved! Remember the FPU emulation era? Or even the physics hardware? All done on the CPU now.
So what I expect is that there will be no need for an API; shading will be done in your standard programming pipeline.

V-man
11-27-2011, 02:33 AM
Hi all,
I started OpenGL coding 10 years ago and did it for several years; then I quit graphics programming. Recently I returned to the graphics field, and what I see is that OpenGL coding has changed in a strange and complicated way.
Previously, it was easy to learn, simple to write, and fast to develop graphics applications with, and the best part was its simplicity compared to D3D.
My question is:
Why do people want to remove the old part of the API, which made OpenGL a better choice at that time?
The OpenGL 3+ API is not as simple as the previous one at first sight (no matrix operations, no transformations, etc.).
Thanks


Yes, GL 1.1 was an easy API to learn, but it was not successful in the Windows market. D3D was a bigger success because it represented what the hardware could do. D3D was guaranteed to be fast.

I'm not going to give a full lesson here, but matrix functions are not done by the GPU and neither are the matrix stacks. Therefore, it belongs in an external library such as GLU or something else. PS: GLU is dead, so you need to use something else.

Anyway, to each his own. Use GL 1.1. If your objective is not to make a high end game, then 1.1 does the job.

Curiously, GL3 is not a big success either. Most games use DX9, DX10, or DX11. Perhaps GL will never take over that market. However, GL ES 1.1 has been a big success. GL ES 2.0 is big too and does away with fixed function.

Aleksandar
11-27-2011, 10:44 AM
I'm not going to give a full lesson here, but matrix functions are not done by the GPU and neither are the matrix stacks. Therefore, it belongs in an external library such as GLU or something else. PS: GLU is dead, so you need to use something else.

glMatrixMode()/glMultMatrix()/glPushMatrix()/glPopMatrix()/etc. have never been part of GLU!

What is done on the GPU is not the predominant factor. Many functions are still executed on the CPU side.

The main advantage of the "new" approach is flexibility. Everything that is done with fixed functionality can be done by shaders, but the opposite is not (and never will be) the case. On the other hand, using shaders and leaving it to programmers to implement their own functionality enables a leaner and more optimized implementation of the drivers (at least it could, if the core functionality takes precedence in the optimization).

Honestly, the learning curve is now steeper, but when you come up against serious problems you'll find that without shaders many of them are hard or even impossible to solve.

glfreak
11-27-2011, 12:23 PM
The API does not matter now. It did before, at least to me, when GL 1.x was very straightforward and "clean" compared to the other competitor.

But now the APIs compete on almost the same ugliness in order to take advantage of the new hardware and how it works.

But who cares? I mean you can abstract your graphics rendering path on top of whatever API or even your own software renderer. Then you make the switch trivial and transparent.

And I quote myself here: "My graphics API of choice is C++" - glfreak :D

V-man
11-27-2011, 05:24 PM
glMatrixMode()/glMultMatrix()/glPushMatrix()/glPopMatrix()/etc. have never been part of GLU!

I never said they were part of GLU. I used the word "belongs" in the sense that this functionality should have been in GLU or some other library.



What is done on the GPU is not the predominant factor. Many functions are still executed on the CPU side.

Which functions are done by the CPU? Sure, there are functions for setting things up, and then you fire a "Draw" call, but those don't exactly count.

I would say that it is a major factor. D3D's ideology has always been to be a thin layer between your program and the hardware. GL3 seems to have picked up the same concept, but not too well, since NVIDIA and AMD still implement their "compatibility" profile.


The main advantage of the "new" approach is flexibility. Everything that is done with fixed functionality can be done by shaders
Although that is true, the same can be said about GL 2.0. So it isn't something specific to the "core profile".


On the other hand, using shaders and leaving it to programmers to implement their own functionality enables a leaner and more optimized implementation of the drivers (at least it could, if the core functionality takes precedence in the optimization).
Probably. I saw an article that compared the core profile with compatibility on an NVIDIA card, and there was a 5% speed advantage.

But it isn't just about flexibility and speed. It is also about simple and stable drivers (I hope).

mbentrup
11-28-2011, 01:14 AM
Do you have any links to that comparison ? The last info I saw on that subject from Nvidia said that compatibility is always at least as fast if not faster than core, but that info is already a year old.

(http://www.slideshare.net/Mark_Kilgard/gtc-2010-opengl, page 97)

aqnuep
11-28-2011, 01:43 AM
Core profile being slower than compatibility is purely the fault of the driver developers, as they did a quick and dirty implementation of the core profile by adding another wrapper around the functions and checking for the added restrictions of the core profile there.

If you use only core features, the core profile should have the very same speed as compatibility, and using compatibility features will in most cases be slower than their core equivalents.

Maybe today this is not yet the case, but I wouldn't use that as an indication of future performance. If you really need ultimate performance, use core features only; that's what the hardware really supports (at least in 99% of the cases). Compatibility features are really nothing more than something that could be done with a 3rd-party library built on top of OpenGL.

mbentrup
11-28-2011, 01:56 AM
I don't dispute that future drivers or graphics cards may drop hardware support for compatibility-only features, but *Nvidia* claims to support them in hardware in the *current* generation of GPUs, with at least equal or sometimes better performance than the equivalent rewritten core code.

Vendors' performance claims have been wrong in the past, and a benchmark can't run on hypothetical future GPUs, so I was just interested in seeing some actual numbers.

aqnuep
11-28-2011, 06:14 AM
I didn't say that vendors will not support compatibility features; I said that they are actually done in "software". It *is* hardware accelerated, but even fog, fixed-function lighting and all the other deprecated features are actually done using shaders. In fact, they have been done that way for a long time now.

We are not talking about hypothetical future GPUs, we are talking about the GPUs of the last 5 years. These don't have hardware for almost any of the fixed-function stuff. It is just emulated by driver magic, nothing more.

V-man
11-28-2011, 12:38 PM
Do you have any links to that comparison ? The last info I saw on that subject from Nvidia said that compatibility is always at least as fast if not faster than core, but that info is already a year old.

(http://www.slideshare.net/Mark_Kilgard/gtc-2010-opengl, page 97)


I didn't keep the link anywhere. I think it was a gamedev article, and the author admitted that he didn't render anything extensive.

I found that slide while searching. I also found a post in the Linux forum about core being slower than compatibility.

Is this something that has been properly tested? Is it slow on Windows as well?

kRogue
11-29-2011, 05:14 AM
We are not talking about hypothetical future GPUs, we are talking about the GPUs of the last 5 years. These don't have hardware for almost any of the fixed-function stuff. It is just emulated by driver magic, nothing more.


One bit worth noting: most of the time, if a driver implements a bit of FF, it will do a better job than one would do implementing it oneself on top of a shader-only API. Makes sense, as the GL implementation has greater access to the gizmo.


Along those lines, GLES2 had an epic brain fart: it does not have user-defined clipping planes (be it via gl_ClipDistance or old-school gl_ClipVertex), likely because someone said "just use discard" instead... with no realization that hardware clipping is SOOOO much better than hacky clipping via discard... the hardware has to clip anyway to something that looks like user-defined clip distances of -1 <= clip_z <= 1.
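
To spell out the contrast (an illustrative sketch, not from this post; uClipPlane and vClipDistance are made-up names): with hardware clipping the vertex shader writes one distance and the clipper does the cutting, while the ES2 workaround pays per-fragment cost.

// Desktop GL, GLSL 3.30: one line here, plus glEnable(GL_CLIP_DISTANCE0) on the CPU side.
const char *kClipVertexShader = R"GLSL(
#version 330 core
layout(location = 0) in vec4 aPosition;
uniform mat4 uModelViewProjection;
uniform vec4 uClipPlane;                 // plane in the same space as aPosition
void main()
{
    gl_Position = uModelViewProjection * aPosition;
    gl_ClipDistance[0] = dot(uClipPlane, aPosition);   // >= 0 keeps the vertex
}
)GLSL";

// GLES2 workaround: interpolate the distance yourself and discard per fragment.
const char *kClipFragmentShaderES2 = R"GLSL(
precision mediump float;
varying float vClipDistance;             // written by the vertex shader
void main()
{
    if (vClipDistance < 0.0)
        discard;                         // "clipping" after rasterization, per fragment
    gl_FragColor = vec4(1.0);
}
)GLSL";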


On another note, I really, really wish people would not call them "GL drivers"... it totally borks the mental image of the jazz in my eyes. I'd prefer to call it a GL implementation... the reason being that "driver" implies something simple and low-level ("feed these bytes to the device to make it beep"), whereas a GL implementation really is a renderer (different rendering strategies, etc.)... not to mention all the other stuff it has, like a compiler.

aqnuep
11-29-2011, 07:02 AM
One bit worth noting: most of the time, if a driver implements a bit of FF, it will do a better job than one would do implementing it oneself on top of a shader-only API. Makes sense, as the GL implementation has greater access to the gizmo.
I would rather say that this is true only if there is also FF hardware behind it. I agree that sometimes the driver writers really are able to optimize certain use cases better, but in general, unless there is FF hardware behind the thing, an FF API vs. a programmable one is more of a limitation than a feature.

mhagain
11-29-2011, 12:02 PM
One bit worth noting: most of the time, if a driver implements a bit of FF, it will do a better job than one would do implementing it oneself on top of a shader-only API. Makes sense, as the GL implementation has greater access to the gizmo.
I would rather say that this is true only if there is also FF hardware behind it. I agree that sometimes the driver writers really are able to optimize certain use cases better, but in general, unless there is FF hardware behind the thing, an FF API vs. a programmable one is more of a limitation than a feature.

I'd go 50/50.

If you're trying to write a fully featured FF-replacement shader yourself (or if the shader you're writing is starting to veer in that direction) then you should definitely stop and ask yourself "does the driver actually do this better?"

On the other hand, you know your own program better than the driver does. The driver needs to make fairly general assumptions for the sake of flexibility, conformance and invariance rules, but you can probably do better. You know your data sets, you know the operations you want to perform, you know the operations you can afford to skip, and you should be able to pull out better performance by rolling your own with specific reference to your use case requirements.

They're two extremes and any individual case is more likely to be somewhere in the middle. In that case I doubt if the performance difference is going to be anything other than marginal (your main bottlenecks are almost certainly elsewhere), so what should tip the decision is your general code architecture and personal preferences.

kyle_
11-29-2011, 12:39 PM
On the other hand, you know your own program better than the driver does.
Unless you are an AAA title maker, in which case the driver knows your program pretty damn well too.

Zenja
11-30-2011, 07:37 PM
Do you have any links to that comparison ? The last info I saw on that subject from Nvidia said that compatibility is always at least as fast if not faster than core, but that info is already a year old.

(http://www.slideshare.net/Mark_Kilgard/gtc-2010-opengl, page 97)


I didn't keep the link anywhere. I think it was a gamedev article, and the author admitted that he didn't render anything extensive.

I found that slide while searching. I also found a post in the Linux forum about core being slower than compatibility.

Is this something that has been properly tested? Is it slow on Windows as well?

For an embedded system I'm working on under Linux, we still get slower performance with the core profile than with the compatibility profile.

Compatibility profile (4.1.1161 driver): 115 fps.
Core profile (4.1.11161): 93 fps.

As we've already concluded, the AMD developers have most probably implemented the core profile on top of the compatibility profile, with additional run-time checks. This is why it's slower.

kRogue
12-01-2011, 03:29 AM
They're two extremes and any individual case is more likely to be somewhere in the middle. In that case I doubt if the performance difference is going to be anything other than marginal (your main bottlenecks are almost certainly elsewhere), so what should tip the decision is your general code architecture and personal preferences.


Examples of where FF is better on modern hardware than programmable:

1. Alpha test (many GPUs have a dedicated bit for that)
2. "Built-in primitive types" vs "using a geometry shader", for example GL_QUADS and GL_QUAD_STRIP
3. Clipping vs discard
4. Image format conversion (there is hardware out there that has FF image conversion jazz, usually on top of and separate from the GPU, though)
5. Filtered texture lookup and gather (witness all the texture lookup functions that can be "implemented" with unfiltered texture lookup)

The first is a minor improvement on desktop but can be notable on embedded. The second can be a big improvement on desktop. The third is a HUGE freaking difference. The fourth is sometimes pretty critical on embedded, and the last is huge everywhere.

So yes, it depends, but for common uses (such as the above) dedicated FF is likely better. For other things (like texture combiners a la the GeForce3 days) I am all for a programmable interface.

mbentrup
12-01-2011, 03:59 AM
Clipping is supported in the core profile. Only gl_ClipVertex has been removed from core, but gl_ClipDistance has not.

kRogue
12-03-2011, 06:39 AM
Clipping is supported in the core profile. Only gl_ClipVertex has been removed from core, but gl_ClipDistance has not.

When it comes to clipping not being present, I was talking about OpenGL ES2 (not desktop OpenGL), and I brought it up as an example of doing an operation via programmability (clipping via discard) vs. having FF.

thokra
12-03-2011, 07:56 AM
kRogue: How is something that may be IHV specific, like separate conversion hardware, relevant to the core API? I thought we weren't talking about the low-level benefits of special-purpose hardware goodness, but how hardware features are exposed on the application level.

kRogue
12-05-2011, 01:40 AM
kRogue: How is something that may be IHV specific, like separate conversion hardware, relevant to the core API? I thought we weren't talking about the low-level benefits of special-purpose hardware goodness, but how hardware features are exposed on the application level.


Keep in mind that most of my bile is directed firmly at OpenGL ES2. With that disclaimer in mind, let's look at one epic fail train that is in OpenGL ES2 and NOT OpenGL: the glTexImage family.

Under OpenGL ES2, essentially, the GL implementation is not supposed to do any format conversion for you. Thus if you made your texture GL_RGB565, then you need to feed it GL_RGB565 data, i.e. the GL implementation is not supposed to convert it for you. This is stupid. Firstly, many GL implementations do not store their textures scan line by scan line, but rather twiddled, so the implementation needs to touch the data anyway. Secondly, it is a royal pain in the rear to write the conversion code optimized for each freaking platform. Worse, it is pointless, since like 99% of the time the bits need to be twiddled anyway. Some hardware has special bits to do that conversion, so if the freaking API said it would do the conversion, that hardware would be used by the GL implementation. Instead, we all write for the lowest common denominator, so 99% of the time the conversion is done by the application on the CPU... and since lots of SoCs do not have NEON, not even using NEON... epic fail. Whereas if the specification said it would do the conversion for you, then the implementation could use specialized hardware, or special CPU instructions, etc. Instead, a very fixed-function kind of thing that is common is not done by the GL implementation because the GLES2 spec is retarded. For what it is worth, there are some extensions that will let you store image data as YUV and get RGB back when you use it in a shader.
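
To make the complaint concrete (an illustrative sketch, not from the post; width, height and pixels565 are placeholders): in ES2 the internalformat argument must match the format/type pair, so the application has to hand over data already packed exactly as the texture is declared.

// ES2-style upload: no conversion happens, so pixels565 must already be
// tightly packed RGB565 data produced by the application itself.
glTexImage2D(GL_TEXTURE_2D,
             0,                           // mip level
             GL_RGB,                      // ES2: unsized internal format, must equal the client format
             width, height,
             0,                           // border, must be 0
             GL_RGB,                      // client format
             GL_UNSIGNED_SHORT_5_6_5,     // client type: 16-bit 5-6-5 packing
             pixels565);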

Adding to my bile: using half floats under GLES2 just plain sucks. You need to do the conversion yourself, and it is a bit of a crapshoot guessing the endian order at times. I've seen the exact same code seg-fault in glTexSubImage2D on half floats (and floats) on one GLES2 implementation but work just fine on another... both platforms had the same endianness and yes, both listed the float and half-float texture extensions...
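
For the record, the kind of helper everyone ends up writing looks roughly like this (an illustrative sketch, deliberately the naive version: it truncates the mantissa and ignores NaN, rounding and denormals):

#include <cstdint>
#include <cstring>

// Naive float -> half packing: flushes out-of-range values to infinity
// and values too small for a normalized half to zero.
static std::uint16_t float_to_half(float value)
{
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);

    const std::uint16_t sign     = static_cast<std::uint16_t>((bits >> 16) & 0x8000u);
    const std::int32_t  exponent = static_cast<std::int32_t>((bits >> 23) & 0xFFu) - 127 + 15;
    const std::uint32_t mantissa = bits & 0x007FFFFFu;

    if (exponent >= 31)   // too large (or inf/NaN): return infinity
        return static_cast<std::uint16_t>(sign | 0x7C00u);
    if (exponent <= 0)    // too small for a normalized half: flush to zero
        return sign;
    return static_cast<std::uint16_t>(sign | (exponent << 10) | (mantissa >> 13));
}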

What I am saying is that very common bits, done very often for graphics and often the same way, should be (mostly) FF: clipping, image conversion, some primitive types beyond triangles, lines and point sprites, alpha test, texture filtering and gather, LOD computation.

Alfonse Reinheart
12-05-2011, 03:04 AM
Whereas if the specification said it would do the conversion for you, then the implementation could use specialized hardware, or special CPU instructions, etc. Instead, a very fixed-function kind of thing that is common is not done by the GL implementation because the GLES2 spec is retarded.

Alternatively, it could be broken. And then what do you do? If the implementation's conversion for some particular color ordering you're using doesn't work, you have to implement it yourself.

The more stuff you force into implementations, the greater chance they will be broken. And you've already pointed out that the handling of floats/halfs in some implementations is suspect. And that's just straight copy uploading; adding conversion on top of that is just begging for trouble.

Desktop GL developers had to expend effort to stop conversions from happening. We had to come up with lots of ad-hoc rules just to figure out what the proper ordering would be to prevent conversion (beyond swizzling, which is just a fancy memcpy). This is necessary for getting maximum texture upload performance. If the implementation simply gave an error when a conversion would have happened, it would be a lot easier for all involved.

Conversion is something somebody could stick in a library and just give it away. It is far from essential.

V-man
12-05-2011, 03:36 AM
I think you guys are confused. This site is for OpenGL. If you want to make suggestions for GL ES, then khronos.org is what you want.

We are talking about old GL and the advantages/disadvantages of new GL in this thread.

mhagain
12-05-2011, 04:42 AM
Thing is, though, that there are some ES features that it would be nice to see going into full GL, and explicit matching of format with internalformat is one of them. This has a bearing on one major thing that was wrong with old OpenGL: because of its heritage as "a software interface to graphics hardware ... that may be accelerated", there are various software layers that any given OpenGL call must go through, and one of those may be a conversion layer. That in itself is OK; what's not OK is that you as the programmer have no way of knowing when it happens, aside from lots of profiling and educated guesswork after the fact.

kRogue
12-05-2011, 06:28 AM
I think you guys are confused. This site is for OpenGL. If you want to make suggestions for GL ES, then khronos.org is what you want.


So true, I just want to make sure that the idiocy of GLES2's texture image specification does not infect OpenGL.



Thing is, though, that there are some ES features that it would be nice to see going into full GL, and explicit matching of format with internalformat is one of them.


Shudders. In OpenGL you can set the precise internal format; you cannot do that with OpenGL ES2. Rather, it is determined implicitly. If you want to make sure that there is no image-space conversion, you can do that in OpenGL: make sure the internal format you specify is exactly the same as what you feed GL. Now, if one uses formats like GL_RGB in desktop GL for the internal format, then one is asking for trouble, as then the GL implementation chooses the internal format.
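
For comparison, a hedged desktop-GL sketch of what is being described (width, height and pixels are placeholders): request a specific sized internal format and feed client data laid out the same way, instead of letting the implementation pick.

// Desktop GL: the sized internal format is explicit and the client data matches it,
// so beyond any internal twiddling there is nothing left for the driver to convert.
glTexImage2D(GL_TEXTURE_2D,
             0,                    // mip level
             GL_RGBA8,             // explicit sized internal format
             width, height,
             0,                    // border
             GL_RGBA,              // client format matching GL_RGBA8...
             GL_UNSIGNED_BYTE,     // ...one byte per channel
             pixels);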



Alternatively, it could be broken. And then what do you do? If the implementation's conversion for some particular color ordering you're using doesn't work, you have to implement it yourself.


Um, if an implementation's color conversion is broken, it is a bug in the GL implementation! Mind you, in desktop GL we all, like 99.999% of the time, assume the GL implementation gets the color conversion correct when we don't make the internal and external formats match. We do this all the time with half-float textures, for example.

I just want to make sure that the idiocy of GLES2's no-image-format-conversion does not invade desktop GL.

aqnuep
12-05-2011, 07:03 AM
If we have to talk about GLES then I have a few words about this format conversion or no format conversion debate.

The reason why this was left out of the GLES specification is that it involves a lot of GL implementation code and the possibility of using suboptimal inputs, as was already mentioned.

Please keep in mind that embedded hardware has limited resources compared to desktop, and even though embedded hardware has evolved a lot, this is still the case. Personally, I am happy to see that they don't waste memory and run time on larger GL implementations on embedded hardware; I am also happy to see that they don't allocate a potentially big temporary buffer just to do the format conversion. Even a few megabytes is really a premium on many embedded devices, so I think format conversions are something that should be done at build time, not at run time. This is actually true for desktop as well if you are doing something serious.

kRogue
12-05-2011, 08:28 AM
Thing is, though, that there are some ES features that it would be nice to see going into full GL, and explicit matching of format with internalformat is one of them.


Given that I work on embedded and GLES2 all freaking day, let me share some bits. Firstly, because most GLES2 implementations store pow2 texture data twiddled, the tex image calls all need to do a conversion in terms of pixel ordering anyway. The image conversion could be done at the same time as the repacking of pixel order. As for needing a large buffer to do the conversions, that is not true at all either, as the color conversion can be done in place in the RAM location where the texture data resides. And regardless, someone needs to do that conversion. So we find that every freaking application/framework needs to provide the image conversion routines, i.e. write the same freaking thing over and over again. If a framework does the conversion for you, that WILL take up more memory to store the converted data before giving it to the GL implementation, which will in turn copy and twiddle it internally. When the GL implementation does the color conversion, it can perform it IN PLACE at the location where it stores the actual texture data. Lastly, GL implementations are usually tightly tuned to the SoC, so not only will a GL implementation providing the color conversion use less memory than a framework providing it would, but 99.99% of the time it will run faster and use less CPU. And oh yes, less work for most developers is always a good thing.

Secondly, if you really want to know where GLES2 implementations eat memory, you need to look at the tiled renderers... for these buggers, each GL context can easily take several megabytes (I've seen as high as 32 MB), NOT including the framebuffer, textures or buffer objects. That memory cost comes from the tile buffer allocated for each GL context.

aqnuep
12-05-2011, 10:37 AM
Just two questions about something that I still don't understand:

1. Why don't you do the image format conversion off-line, during build? In most cases this can be done.
2. How do you do in-place conversion between formats of different size?

kRogue
12-05-2011, 11:47 AM
The first answer is kind of simple: when deciding what format to use for the gizmo, you need to check at run time what extensions and how much memory the gizmo has. Additionally, some data you do not have until the app is running (for example, procedurally generated data and/or data fetched externally).

In-place conversion is not rocket science; you just write, for each pair (accepted input format, internal format), a function that does the job:



// convert one image from an accepted client-side format into the texture's
// internal storage format, writing directly into outpixels (the texture's own memory)
typedef void (*store_and_convert)(const void *inpixels, int width, int height,
                                  int bytes_per_line, void *outpixels);

store_and_convert funcs[number_input_types_supported][number_output_types_supported];


and then the teximage call becomes just dereferencing the function table and calling the function.
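
For concreteness, one entry in that table might look roughly like this (an illustrative sketch, not from the post; it follows the typedef above and packs RGBA8 client data down to RGB565):

// Example table entry: convert tightly packed RGBA8 rows into RGB565,
// writing straight into the implementation's own texture storage.
static void convert_rgba8_to_rgb565(const void *inpixels, int width, int height,
                                    int bytes_per_line, void *outpixels)
{
    const unsigned char *src = static_cast<const unsigned char *>(inpixels);
    unsigned short *dst = static_cast<unsigned short *>(outpixels);

    for (int y = 0; y < height; ++y) {
        const unsigned char *row = src + y * bytes_per_line;
        for (int x = 0; x < width; ++x) {
            const unsigned r = row[4 * x + 0];
            const unsigned g = row[4 * x + 1];
            const unsigned b = row[4 * x + 2];
            // 5 bits of red, 6 bits of green, 5 bits of blue
            *dst++ = static_cast<unsigned short>(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
        }
    }
}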

Alfonse Reinheart
12-05-2011, 12:00 PM
In OpenGL you can set the precise internal format; you cannot do that with OpenGL ES2.

That is really the problem with ES. Not that there's no conversion, but that it doesn't allow you to pick a specific internal format. So the format matching is done by convention, rather than by explicit request (and thus making it possible to mis-match).


Mind you, in desktop GL we all, like 99.999% of the time, assume the GL implementation gets the color conversion correct when we don't make the internal and external formats match. We do this all the time with half-float textures, for example.

We do? In general, you wouldn't use a half-float internal format if you weren't passing half-float data. Sure, you could, but no application that actually cared about performance ever would.


The image conversion could be done at the same time as the repacking of pixel order.

Assuming of course that the repacking is not a special DMA mode. Because if it is, you can't really do in-place conversion.


Um, if an implementation's color conversion is broken, it is a bug in the GL implementation!

Yes, but opening up more avenues for bugs isn't exactly helping the "buggy OpenGL drivers" situation.

mhagain
12-05-2011, 12:20 PM
There are two sides to the argument going on here: performance and flexibility. OpenGL is traditionally on the flexibility side; you provide the data, specify its format and layout, tell GL what internal representation you'd like, and let the driver work the rest out.

That's cool for some use cases but it's not cool for others.

Sometimes you want the data to go up FAST. You're going to be using that texture later in the current frame, or in the next frame at the absolute latest, you're doing an upload, and you need to know that the format and layout you're using is going to exactly match the internal representation OpenGL is going to give you. Passing through any kind of software conversion layer - whether your own or in the driver - just doesn't cut it.

So how do you find that out? Because right now it's trial and error; you have no way of knowing. Even worse, it can vary from implementation to implementation, so it's not something you can do a bunch of tests on and then code the results into your program.

At the very least that opens a requirement for a "gimme gimme exact format baby" extension, but by now it's more appropriate for the Suggestions forum than here.

V-man
12-05-2011, 03:52 PM
For the case of GL ES, I really do not see what the big deal is.

Isn't there a format that is supported by all the GPUs? On the desktop, they all support BGRA 8888.
For the embedded world, there should be a common supported format, whether it is R5G6B5 or BGRA 8888 or BGR 888 and so on. Even if there is a GPU that doesn't support the "common" format in question, surely it supports something.

I can imagine that it can be a huge problem if there were 100 different GPUs and each supports its own unique format and they are forcing you to write that format converter.

So, how big is the problem?

aqnuep
12-06-2011, 01:45 AM
Yes, you are right, R5G6B5 and BGRA8 are generally supported by 99.99% of embedded devices.

Things only become more complicated in the case of compressed texture formats (e.g. PVRTC or ETC), but hey, you store the compressed images separately anyway; I mean, you don't do on-the-fly texture compression on embedded hardware, do you? So what's the problem? You can solve the selection with a single conditional in the client-side code, and there is really no need for format conversion.

kRogue
12-06-2011, 10:47 AM
Sometimes you want the data to go up FAST. You're going to be using that texture later in the current frame, or in the next frame at the absolute latest, you're doing an upload, and you need to know that the format and layout you're using is going to exactly match the internal representation OpenGL is going to give you. Passing through any kind of software conversion layer - whether your own or in the driver - just doesn't cut it.

So how do you find that out? Because right now it's trial and error; you have no way of knowing. Even worse, it can vary from implementation to implementation, so it's not something you can do a bunch of tests on and then code the results into your program.



Let's talk conversions, OK? Lots of us are familiar with the RGBA vs BGRA vs ABGR fiascos for conversions. In the embedded world it is even richer and much more hilarious. The formats exposed by GLES2 for textures are essentially RGB565, RGBA4444, RGBA5551, RGBA8, RGB8, L8, A8, LA8... and that is mostly it. Mind you, plenty of platforms do not really support RGB8; they pad it to RGBA8. Also, in the embedded world, like 99.999999% of the time, it is a unified memory model. It gets richer: there are SoCs that let you write _directly_ to a location in memory from which GL will source texture data. There are some pretty anal rules about the formats, but you can see that getting an actual location to directly write bytes of image data to is orders of magnitude better than the idiot guessing game of "will it convert or not". If you need to stream texture data on an embedded platform, using glTexSubImage2D is never going to have a happy ending; you need something better, you need a real memory location to write the image data to directly.

With that in mind, and bearing in mind that GL implementations will more often than not twiddle the texture data, the color conversion issue is not that big a deal on top of everything else, really it is not. Besides, I would wager that the folks making the GL implementation will do a much better job than me, or for that matter most developers, simply because that is core work for GL implementations: twiddling bits to make the hardware happy.

What is quite troubling, especially when you think about it, is that color conversions are done all the freaking time, every time you draw one frame of data (texture lookups [internal texture format to the format used in shader arithmetic] and then writes to the framebuffer)... and yet no one cares. Looking in particular at the context of texture lookups, there really is no excuse for color conversion not to be in a GL implementation.