
View Full Version : Choice for Toolkit



3KyNoX
05-06-2011, 01:21 PM
Hello everyone,

I am new to OpenGL and, for a future project, I am trying to choose the toolkit that best fits my needs.

I have already looked at many, such as:

- SDL
- SFML
- OpenTK (this one is really interesting)
- Glut

But for example:

- SDL does not let you use OpenGL 4.1 and stops at the 3.x versions
- SFML doesn't support OpenGL ES or mobile devices
- OpenTK looks great: it supports OpenGL 4.1 and has OpenGL ES, but mobile devices are not fully supported and its maths could use some improvements
- GLUT really doesn't fit the project

Well, as you see, I need to use the latest OpenGL and be able to build my future game engine on Windows, OS X, mobile (iPhone, Android, ...) and maybe consoles.

Any information would be great.

Thanks a lot.

ZbuffeR
05-07-2011, 01:32 AM
Even without the toolkit, I don't see how you can write GL code that is 100% compatible with both GL 4.1 and GLES.
Dependence on the toolkit should be minimal, so having at least two distinct low-level code paths, one for desktop and one for mobile, seems unavoidable.

3KyNoX
05-07-2011, 07:56 AM
OpenTK should be a great choice.

It uses OpenGL 4.1 on systems that support that version and automatically falls back to the highest version available. On a Mac, for example, OpenTK should work on 10.7. OpenGL support is determined at runtime: if you run OpenTK on a 2.1-capable system you will be able to use 2.1 methods (newer ones will throw an exception); if you run it on a 3.2-capable system you will be able to use 3.2, and so on.

It also supports OpenGL ES.
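The runtime check described above can be sketched in plain C. This is only an illustration: `parse_gl_version` and `supports_gl` are made-up helper names, and a real program would feed in the string returned by glGetString(GL_VERSION) from a live context.

```c
#include <stdio.h>

/* Parse the "major.minor" prefix of a GL version string.
 * Returns 1 on success, 0 if the string does not start with a version. */
static int parse_gl_version(const char *version, int *major, int *minor)
{
    if (version == NULL)
        return 0;
    return sscanf(version, "%d.%d", major, minor) == 2;
}

/* Decide at run time whether a given feature level is usable, the way a
 * toolkit can gate its newer entry points behind the detected version. */
static int supports_gl(int have_major, int have_minor,
                       int want_major, int want_minor)
{
    return have_major > want_major ||
           (have_major == want_major && have_minor >= want_minor);
}
```

A toolkit can run this once at context creation and raise an error from any entry point newer than the detected version, which matches the "newer ones will throw an exception" behavior described above.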

kRogue
06-01-2011, 09:55 PM
Even without the toolkit, I don't see how you can write GL code that is 100% GL 4.1 and GLES compatible.


Totally agree. Here are some specifics:

Shader differences between GLES2 and GL 2.x/3.x/4.x, in particular the precision qualifiers. There are API differences too: glTexImage in GLES2 is a different beast than it is in GL, and a fair number of GLES2 API calls are much more limited than their GL2/3/4 counterparts. GL 3.3 (or was it 3.2?) also mandated using vertex array objects; since there is no such thing in GLES2, that alone is an API break within the intersection of the GL 3.3 and GLES2 APIs.
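One common way to paper over the precision-qualifier difference is to prepend a per-target preamble to the shader source before handing it to glShaderSource. A minimal sketch, assuming the helper name `make_portable_fs` (made up for illustration):

```c
#include <stdio.h>
#include <string.h>

/* Prepend the declaration a fragment shader needs to compile on both
 * GLES2 and older desktop GL.  GLES2 requires a default float precision;
 * desktop GLSL before 1.30 rejects precision qualifiers entirely, so the
 * preamble is only added when targeting ES.  Output is truncated if
 * `out_size` is too small. */
static void make_portable_fs(const char *body, int target_es,
                             char *out, size_t out_size)
{
    const char *prefix = target_es
        ? "precision mediump float;\n"
        : "";  /* desktop GLSL < 1.30: no precision qualifiers at all */
    snprintf(out, out_size, "%s%s", prefix, body);
}
```

The same trick extends to other source-level differences (e.g. a `#define` mapping names between dialects), which is why many engines keep a per-platform shader preamble rather than per-platform shaders.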

Additionally, coding for GLES 2.x is a different beast than coding for GL 3.x, just as there is a difference between coding for GL 3.x and 4.x, simply because they are different generations of hardware. On top of that, almost every embedded GLES2 implementation has its peculiarities (PowerVR SGX, ARM Mali and even NVIDIA Tegra), though I freely confess that Tegra's peculiarity with respect to blending is a strength, not an issue! If you try to write entirely platform-agnostic code, be prepared to be severely underwhelmed by performance when you go to the embedded world: on each of these GPUs one needs to optimize one's rendering strategy differently. The SGX is by far the most sensitive, and the Tegra is by far the closest to a desktop GL implementation. Each vendor's developer site/newsletter/whatever gives good guidelines for how to write for its hardware.

Lastly, in regards to SDL 1.2: on the desktop it simply creates a GL context the old-fashioned way, which means on both NVIDIA and ATI you get a compatibility-profile GL 4.x context on current-generation hardware and a GL 3.x compatibility-profile context on previous-generation hardware. With SDL 1.3 you can specify the nature of the profile, though. In all honesty, I recommend SDL for getting a GL context and handling event jazz; the API is simple to use. For iOS, it goes through the extra pain of doing this for the device (iOS does NOT use EGL to create contexts but a very Apple-specific API). For MS Windows and desktop Linux, SDL uses wgl and glx respectively. For embedded Linux platforms (such as the Nokia N900), SDL exposes the window handle so that one can use EGL directly (if you are writing for the N900, go ahead and use the SDL_gles package to handle creating the context, but beware: at least the last time I checked, it is only safe for creating one context, as its multi-context API and code are not right).

At the end of the day, what you are worried about is very small potatoes: creating a GL context and making it current.

Xmas
06-03-2011, 07:33 AM
Shader differences between GLES2 and GL 2.x/3.x/4.x, in particular the precision qualifiers. There are API differences too: glTexImage in GLES2 is a different beast than it is in GL, and a fair number of GLES2 API calls are much more limited than their GL2/3/4 counterparts. GL 3.3 (or was it 3.2?) also mandated using vertex array objects; since there is no such thing in GLES2, that alone is an API break within the intersection of the GL 3.3 and GLES2 APIs.
- Precision qualifiers were added to GLSL 1.30 to aid portability.
- Some GLES2 commands are more restricted than their GL counterparts, but that doesn't really make it harder to write portable code. Just target GLES2 first.
- Not core, but there is an extension available at least on some platforms (including iOS): OES_vertex_array_object (http://www.khronos.org/registry/gles/extensions/OES/OES_vertex_array_object.txt)

kRogue
06-03-2011, 08:53 PM
- Precision qualifiers were added to GLSL 1.30 to aid portability.

True, and in GLSL 1.30 precision qualifiers mean nothing, i.e. they are ignored. You do get source-level compatibility, but the need to carefully choose the precision qualifier in a fragment shader under most GLES2 implementations cannot be overstated.

On the subject of shaders, writing to gl_FragDepth is NOT in GLES2. There is the extension GL_EXT_frag_depth to bring it back, but it is not a widely supported extension.



- Some GLES2 commands are more restricted than their GL counterparts, but that doesn't really make it harder to write portable code. Just target GLES2 first.


This is mostly true, but there are some serious exceptions. The most serious one is glTexImage2D/glTexSubImage2D. In GLES2, the 3rd argument (internalFormat) must be one of GL_ALPHA, GL_LUMINANCE, GL_LUMINANCE_ALPHA, GL_RGB or GL_RGBA, i.e. it states only the number of channels and their interpretation. In contrast, in GL that argument can be used to state precisely how the texture is stored (à la GL_RGB8 or GL_RGB565, in addition to floating-point and integer textures). In GLES2, the "internal storage format" of a texture is instead set by the format and type arguments (the 7th and 8th) of glTexImage2D. The bigger pain comes into play when using the floating-point texture extension of GLES2.
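To make the difference concrete, here is a sketch of how portable texture-upload code ends up choosing the third argument. The enum values are copied from the standard GL headers; `pick_internal_format` is a made-up helper name for illustration:

```c
/* GL enum values copied from the standard gl2.h/glext.h headers. */
#define GL_RGBA           0x1908u
#define GL_RGBA8          0x8058u

/* Pick the internalFormat argument for glTexImage2D on each API.
 * GLES2 only accepts the unsized channel enums and requires
 * internalFormat == format; desktop GL also accepts sized formats.
 *
 *   GLES2:   glTexImage2D(t, 0, GL_RGBA,  w, h, 0, GL_RGBA, type, p);
 *   desktop: glTexImage2D(t, 0, GL_RGBA8, w, h, 0, GL_RGBA, type, p);
 */
static unsigned pick_internal_format(int is_gles2)
{
    return is_gles2 ? GL_RGBA   /* unsized: must equal `format` */
                    : GL_RGBA8; /* sized: exactly 8 bits per channel */
}
```

A portable loader typically hides this behind one such switch per texture format, since the desktop sized enums simply do not exist in the GLES2 headers.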

Another little irritation is that some GL API entry points use doubles (usually really clampd), but GLES2 only supports float (which does make sense, really).

Other annoyances include glGenerateMipmap: even if a GLES2 implementation supports GL_OES_texture_npot, glGenerateMipmap does not work for non-power-of-2 textures (at least on implementations tightly conforming to the spec). Indeed, SGX and Mali will generate an error when glGenerateMipmap is called on a non-power-of-2 texture, even though both support GL_OES_texture_npot.

On the subject of GLSL programs: under desktop GL one can attach many shaders of the same type to one GLSL program, whereas under GLES2 one can attach only one shader of each type. This is not really a big deal, but it is worth noting. Additionally, the binary shader API differs between GL4 and GLES2: GL4 works at the GLSL program level, whereas GLES2 works at the shader level. Just to make life more complicated, there is also GL_OES_get_program_binary, which brings the GL4 binary API to GLES2.

Lastly, the divergence between GLES2 and the GL3/4 core profile is a stinker, as the intersections of the APIs are not compatible with each other (vertex array objects in particular, along with 1- and 2-channel textures).

There are many times that I really, really freaking hate GLES2 for cutting out so much useful functionality: flexible buffer mapping, almost all read-back (buffer objects and textures), severe limitations on FBOs, control over active mipmap levels (restored via an extension, though), and more if I keep thinking about it :sleeping:

Alfonse Reinheart
06-03-2011, 09:41 PM
In GLES2, the 3rd argument (internalFormat) must be one of GL_ALPHA, GL_LUMINANCE, GL_LUMINANCE_ALPHA, GL_RGB or GL_RGBA, i.e. it states only the number of channels and their interpretation. In contrast, in GL that argument can be used to state precisely how the texture is stored (à la GL_RGB8 or GL_RGB565, in addition to floating-point and integer textures). In GLES2, the "internal storage format" of a texture is instead set by the format and type arguments (the 7th and 8th) of glTexImage2D.

What? That's absolutely ridiculous. The whole point of having two sets of parameters was that one set was for the texture's internal format (which could be sized) and the other described only the format of the data that the user was providing. Is this here to avoid copying data or something? And even if it was, couldn't they just say that the transfer format and internal format have to match, that you have to use sized internal formats?

I can understand incompatibility for the sake of actual hardware, like not having gl_FragDepth (since that would probably screw over PowerVR's tile-based renderer), or the lack of GL_RED and RG textures. But this texture thing is so arbitrary; it doesn't make sense that there would be a specific hardware need to do it this way.

kRogue
06-03-2011, 10:11 PM
What? That's absolutely ridiculous. The whole point of having two sets of parameters was that one set was for the texture's internal format (which could be sized) and the other described only the format of the data that the user was providing. Is this here to avoid copying data or something? And even if it was, couldn't they just say that the transfer format and internal format have to match, that you have to use sized internal formats?


Sucks, doesn't it? Here is the man page: http://www.khronos.org/opengles/documentation/opengles1_0/html/glTexImage2D.html and a link to the GLES2 man pages: http://www.khronos.org/opengles/sdk/docs/man/. Notice that under GLES2 both internalFormat and format must be the same. The behavior of glTexSubImage2D, I think, should do conversions for you... but you know embedded hardware: don't count on it :whistle:



I can understand incompatibility for the sake of actual hardware, like not having gl_FragDepth (since that would probably screw over PowerVR's tile-based renderer),


No more than discard does, which is core in GLES2 (as there is no alpha test). On another note, having clip planes would help most embedded hardware (Mali, SGX and more), but they were taken out of GLES2 with the line "you can do discard instead", which in light of the SGX architecture (and to a lesser extent the ARM Mali architecture) is absolutely insane.

kyle_
06-04-2011, 11:39 AM
Notice that under GLES2 both internalFormat and format must be the same. The behavior of glTexSubImage2D, I think, should do conversions for you... but you know embedded hardware: don't count on it :whistle:

I, on the other hand, think it shouldn't. In fact I like how the GLES guys axe features from GL and try to avoid the feature creep that happened in 'big' GL for the sake of so-called completeness.
This is supposed to be a low-level API, so if they require the formats to match, it's probably because the conversion would be done in software on most implementations, which would be silly. They are stuck with the API, so it may not be very pretty. Also note that it's much easier to add features than to take them back (see 'big' GL).
Then again, I don't have much experience with GLES, so it's easy for me to say that ;)

kRogue
06-05-2011, 03:45 PM
I, on the other hand, think it shouldn't. In fact I like how the GLES guys axe features from GL and try to avoid the feature creep that happened in 'big' GL for the sake of so-called completeness.
This is supposed to be a low-level API, so if they require the formats to match, it's probably because the conversion would be done in software on most implementations, which would be silly.


On the general issue of feature creep I agree, but on the matter of format conversions I disagree, and here is why: format conversion is done often, and most people's first attempt at a format conversion is usually pretty poor. One example that really bites is half-float conversion (there is a white paper floating around on a nice way to do it, but it gives an idea of how icky format conversions can be to do well). Additionally, over in embedded land where ARM often rules, the features available on the CPU vary dramatically (NEON being one example), and those CPU features determine what is a reasonable way to convert data. One can definitely argue that there should be no need to convert most image data, as it should already be in the format the platform wants, but in a significant number of situations texture data is generated at run time (there are cases far beyond just image data, where the data for the texture cannot be known ahead of time). Compounding the CPU issue, a fair amount of hardware out there has extra bits to do format conversions fast (at low power consumption, without using the CPU).
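As a taste of why half-float conversion is the canonical example, here is a minimal scalar float-to-half converter in C. This sketch rounds toward zero and drops NaNs and denormals; a production version must handle those cases too, and the per-element branching is exactly the kind of loop that NEON or dedicated SoC hardware speeds up:

```c
#include <stdint.h>
#include <string.h>

/* Convert a 32-bit float to a 16-bit half float (IEEE 754 binary16).
 * Round toward zero; NaN collapses to Inf and denormals to zero,
 * which a real implementation would need to handle. */
static uint16_t float_to_half(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);  /* type-pun without UB */

    uint16_t sign = (uint16_t)((bits >> 16) & 0x8000u);
    int32_t  exp  = (int32_t)((bits >> 23) & 0xFFu) - 127 + 15; /* re-bias */
    uint32_t mant = bits & 0x007FFFFFu;

    if (exp >= 31)          /* overflow or Inf -> Inf */
        return (uint16_t)(sign | 0x7C00u);
    if (exp <= 0)           /* underflow -> signed zero (denormals dropped) */
        return sign;
    return (uint16_t)(sign | ((uint32_t)exp << 10) | (mant >> 13));
}
```

Even this simplified version has three branches per element; getting it both correct and fast on a given ARM core is the sort of work an implementation is better placed to do than every application.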

My take is that if a bit of functionality is often used, it is epic silliness for the platform to not provide it in some standardized form.

I still stand by my claim: GLES2 took the axe to way too much (read-back of texture and buffer object data being one example, along with mapping buffer objects). That buffer object mapping is not even core in GLES2, and that the extension is write-only, is truly hideously offensive, since many, many of these platforms use a unified memory architecture anyway. Axing clip planes was also just downright *bad*, as it dropped a significant performance-enhancement possibility on the PowerVR SGX (that GPU family takes a major hit when discard is used), and to a lesser extent the ARM Mali, and in truth likely all tile-based solutions.

Alfonse Reinheart
06-05-2011, 04:46 PM
Compounding the CPU issue is that a fair amount of hardware out there as extra bits to do format conversions fast (at low power consumption, not using the CPU).

My take is that if a bit of functionality is often used, it is epic silliness for the platform to not provide it in some standardized form.

The problem here is a combination of factors.

First, OpenGL ES wanted to at least pretend to be API compatible with OpenGL. I have no idea why; if they weren't going to be serious about it and get it right, then they shouldn't have done it at all.

Second, OpenGL ES tries very hard not to lie about what is and isn't supported in hardware. If you can do it through ES, then the hardware can do it. To do what you're suggesting would either require that all implementations provide conversions whether their hardware can support them or not, or that the implementation provides a way to ask what conversions are possible.

We're talking about embedded systems here. Memory is very precious. You don't want to load up an additional 500KB of code just to implement conversions that nobody will use. So the first alternative is out; you can't make them implement conversions even if their hardware can't support them.

The second option is probably one they should have taken: a query mechanism for conversions. This could be based on proxy textures or some such. Of course, this violates compatibility with desktop GL. Even though they already broke compatibility with desktop GL with their current texture upload code.


Axing clip planes was also just downright *bad*, as it dropped a significant performance-enhancement possibility on the PowerVR SGX (that GPU family takes a major hit when discard is used), and to a lesser extent the ARM Mali, and in truth likely all tile-based solutions.

If they can't do discard very well, what makes you think their hardware would be better capable of clip planes? Aren't clip planes usually implemented nowadays as discard-style operations?

OpenGL ES doesn't have the luxuries that desktop OpenGL has. There's no D3D as an alternative for people who want lean-and-mean. It's GL ES, or it's proprietary APIs. And nobody wants the latter. So if the dominant hardware vendor can't do X, then X doesn't get into the spec. Desktop OpenGL can have pie-in-the-sky features, and if one of the vendors can't support it, screw them (see ATI's pre-HD hardware and NPOTs).

You can't say "screw them" to PowerVR in the mobile space because they own the mobile space.

kRogue
06-05-2011, 05:30 PM
We're talking about embedded systems here. Memory is very precious. You don't want to load up an additional 500KB of code just to implement conversions that nobody will use. So the first alternative is out; you can't make them implement conversions even if their hardware can't support them.

Um, 500KB for conversion code? I don't think so; nowhere near that is needed. As for the memory argument: these little doodads that run OpenGL ES2 typically have something like 256MB of RAM available. Lastly, conversions can be quite common, especially if you want to use half-float textures *gee*. The reason for arguing that the GLES implementation should handle it is that as soon as one wishes to use half-floats, one must do a conversion. Considering that the optimal way to do a conversion is tied tightly to the device (the CPU plus extra bits of the SoC), it is clear that it is better for the GLES implementation to do the conversion. Just so you know, some GLES implementations have quite peculiar ways of implementing glGenerateMipmap (I had one such prototype that essentially did it on the CPU).



If they can't do discard very well, what makes you think their hardware would be better capable of clip planes? Aren't clip planes usually implemented nowadays as discard-style operations?


Er, you really need to read up on what "tile-based renderer" means. In a nutshell, for tile-based renderers it usually works something like this:

1. The display is broken into a grid; each box (called a tile) is typically 8-32 pixels by 8-32 pixels in size, and each tile has a "polygon list".
2. Calling glDrawElements/glDrawArrays does NOT trigger any effect on the framebuffer. Instead, essentially only the vertex shader is run, and the GPU determines, for each triangle (or other primitive), which tiles are hit by that primitive; "something" is added to each hit tile's polygon list.
3. At eglSwapBuffers, the tiles are walked, finally doing the actual rasterization. The key bit here is that rasterization is NOT to the framebuffer; it is done on-GPU in SRAM. Once a tile is rasterized, its contents are copied out to the framebuffer.

One of the EGL bits about preserving framebuffer contents from one frame to the next makes sense here: by not caring about the previous contents, the GLES implementation does not need to copy the framebuffer into SRAM before doing its jazz. It also explains why doing a clear is a big deal on these gizmos. Additionally, FBO switches under tile-based renderers are expensive, and the strategy for using FBOs is very different than for traditional renderers.
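The binning stage can be illustrated with a toy calculation: given a primitive's screen-space bounding box, count the tiles whose polygon lists it lands in. Real hardware refines this with edge tests, and `tiles_touched` is an illustrative name, not a real API:

```c
/* Count the tiles overlapped by an axis-aligned screen-space bounding
 * box, for a grid of `tile` x `tile` pixel tiles.  Coordinates are
 * assumed non-negative and inclusive, as for pixel extents. */
static int tiles_touched(int min_x, int min_y, int max_x, int max_y, int tile)
{
    int tx0 = min_x / tile, ty0 = min_y / tile;  /* first column/row hit */
    int tx1 = max_x / tile, ty1 = max_y / tile;  /* last column/row hit  */
    return (tx1 - tx0 + 1) * (ty1 - ty0 + 1);
}
```

This also shows why culling work early pays off so well on these GPUs: a primitive rejected before binning never touches any tile's polygon list at all.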

From the above one can see that clip planes can be a big boon, as they can cull out tiles completely. Moving along, the PowerVR SGX does something interesting in that 3rd step: it does it in two passes. First it rasterizes only to the depth/stencil buffer, tagging each pixel with which fragment shader to run; then it walks the pixels of the tile, finally executing the fragment shaders (this is why they call their jazz a deferred renderer). Naturally, discard totally borks this, because it means the fragment shader must be run, as does interleaving blending with opaque operations that write to depth or stencil. For SGX, clip planes would be great because the decision of whether or not to clip does not need the fragment shader to run, and thus would keep the SGX renderer happier.

Also, even on the desktop, clipping is not usually implemented via discard. Given the choice between faking gl_ClipDistance via discard and using gl_ClipDistance directly, you can be sure that gl_ClipDistance will win out, especially with nasty, expensive fragment shaders. Try it out and you will see :D
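The gl_ClipDistance semantics can be shown in miniature: the vertex stage outputs the signed distance from each vertex to the plane, and the rasterizer culls fragments where the interpolated value goes negative, without ever running the fragment shader. A discard-based fake has to run the shader just to make that decision. A sketch of the per-vertex computation:

```c
/* Signed distance from a homogeneous vertex to a clip plane, i.e. the
 * dot(plane, vertex) a vertex shader would write to gl_ClipDistance[i].
 * Fragments where the interpolated value is negative are culled. */
static float clip_distance(const float plane[4], const float vert[4])
{
    return plane[0] * vert[0] + plane[1] * vert[1]
         + plane[2] * vert[2] + plane[3] * vert[3];
}
```

For a plane like {1, 0, 0, 0} ("keep x >= 0"), the sign of this value alone decides the fragment's fate, which is why fixed-function clipping stays cheap no matter how expensive the fragment shader is.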




OpenGL ES doesn't have the luxuries that desktop OpenGL has. There's no D3D as an alternative for people who want lean-and-mean. It's GL ES, or it's proprietary APIs. And nobody wants the latter. So if the dominant hardware vendor can't do X, then X doesn't get into the spec. Desktop OpenGL can have pie-in-the-sky features, and if one of the vendors can't support it, screw them (see ATI's pre-HD hardware and NPOTs).

You can't say "screw them" to PowerVR in the mobile space because they own the mobile space.


No offense, but the above really makes me believe you do not have a clue what you are talking about. But here goes:
On Windows Phone 7, the API is D3D; OpenGL ES1/2 are not on the platform with any strength. Over at PowerVR, guess what APIs they advertise? Here goes: D3D9, D3D10 (yes, you read that correctly, D3D10 for the SGX545), OpenGL ES1, OpenGL ES2 and OpenVG. If you looked at the extension list of the SGX you would know how far off you are: SGX supports floating-point textures, half-float textures, NPOT textures, buffer object mapping, unsigned int indexing... And if you actually read my complaints about GLES2, none of them are hardware issues in the embedded world: read-back support for textures, read-back support for buffer objects, better buffer object mapping. Each of these is HECK-A-EASY for most of these platforms because, *drum roll*, they usually use a unified memory model!

Alfonse Reinheart
06-05-2011, 06:08 PM
Lastly, conversions can be quite common, especially if you want to use half-float textures *gee*. The reason for arguing that the GLES implementation should handle it is that as soon as one wishes to use half-floats, one must do a conversion.

You could say the same thing about any form of compression: that "as soon as one wishes to use compression, then you must do a conversion." That's not true for compression or half-floats. You can do the compression as an offline process; the file can be stored in compressed form, or as half-floats.

And if you're generating the data yourself, then you have the ability to generate it in the format that you want to send to GL. Will it be as fast as possible? Perhaps not, depending on bandwidth.

I'm not saying that this wouldn't be useful. The thing you have to understand is that mobile platforms are all about what is necessary, not what is useful. They have many tradeoffs, and one of them is conversion. It's an operation that most developers don't need, so there's no reason to force all implementations to support it. Throwing in features that 5% or fewer of users would actually use is how things like picking and selection wound up in OpenGL.


From the above one can see that clip planes can be a big boon, as they can cull out tiles completely.

What I meant was that:

1: You're assuming that clip-planes would be actually supported in the hardware. Whatever boon they might be for performance doesn't matter if the hardware simply can't do it.

2: You're assuming that clip-planes would actually clip the geometry, rather than cull fragments. Remember: hardware tries to avoid clipping geometry at all costs. They try to only clip against the front plane, and even then only when the triangle crosses the camera-space Z origin (which means a clip-space W of zero). And even then, they may have clever ways of avoiding clipping.

Could the PowerVR implement clip planes with actual geometry clipping? Certainly. Does it? If not, then this entire conversation, and your wish, is academic because what you want is impossible. If OpenGL ES required it but PowerVR didn't implement it, what good is OpenGL ES's requirement?


Also, even on the desktop, clipping is not usually implemented via discard. Given the choice between faking gl_ClipDistance via discard and using gl_ClipDistance directly, you can be sure that gl_ClipDistance will win out, especially with nasty, expensive fragment shaders. Try it out and you will see

Just because it's not part of the fragment shader doesn't mean it isn't a rasterization operation rather than a geometric one. Graphics hardware does not clip triangles against most of the frustum, yet it maintains reasonable performance. Those fragments are culled by the rasterizer, if they're even generated at all.