Some Ideas

It is time for a platform-independent context creation API. This would allow non-rendering commands to be implemented as part of the context, such as capability queries, GPU settings/features, video memory, etc.

So the proposal is to take the non-rendering calls out of the core and make them part of a separate core specification for context management.

Another proposal I would like to see in the next versions is deferred-rendering display lists that support multi-threading.

Last but not least, PLEASE, some function names are either far too long or unclear about what they do. I’m not giving examples, but please think about better names.

And let it be OpenGL 4.5.

Thank You!

That has been the main occupation of the “Platform TSG” since 2006 (EGL API).

The functions for querying capabilities are so rare in OpenGL, and also very vendor specific, that I’m glad they even exist.

Rendering calls make up just a tiny fraction of the OpenGL API. There are state setting and reading commands, buffer management, etc. Everything that is context related must be bound to a particular context. What functions are you referring to?

On the other hand, there are other APIs for handling GPU state that are totally unrelated to OpenGL. The problem is that they are vendor dependent. That’s why I asked to have access to some parameters through the OpenGL API.

Can you explain this proposal in a little more detail?

Maybe for the functions that will be added later on, otherwise the backward compatibility will be broken.

We are now at OpenGL 4.2. So you gave them from a year and a half up to three years to fulfill your wishes. :)

One definite requirement is a way to determine (1) did something drop you back to software emulation, and (2) what was that something. Because right now - beyond the extreme case where you get under 1 FPS - you have no way of knowing. Some fallbacks can be incredibly subtle, manifesting as a few percent perf loss that you as a programmer may not immediately recognise as having been a fallback, but those few percents can mount up and turn a program that should be soaring into quite a sluggish thing.

I understand the reason why the fallback exists (OpenGL makes no promises about hardware acceleration, blah blah blah) but without a way of querying for info about when it happens it’s a complete pain in the butt.
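To show how coarse the available check is, here is a minimal sketch of about the only test a program can do today: compare the renderer string against names of known software implementations. The function name `looksLikeSoftwareRenderer` and the list of names are assumptions made for illustration; in a real program the string would come from `glGetString(GL_RENDERER)`. It only catches the case where the whole context is a software implementation, and it says nothing about a single pipeline stage quietly falling back, which is exactly the gap described above.

```cpp
#include <cassert>
#include <cstring>

// Returns true when the renderer string contains the name of a known
// software implementation. The list is illustrative, not exhaustive.
bool looksLikeSoftwareRenderer(const char* renderer)
{
    assert(renderer != nullptr);

    static const char* const kSoftwareNames[] = {
        "GDI Generic",  // Microsoft's software OpenGL implementation
        "llvmpipe",     // Mesa software rasterizer
        "softpipe",     // Mesa reference software rasterizer
    };

    for (const char* name : kSoftwareNames) {
        if (std::strstr(renderer, name) != nullptr) {
            return true;
        }
    }
    return false;
}
```

A caller would pass the renderer string once after context creation and warn the user or log the result; anything finer grained would need new API.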

Some Ideas

If you’re going to propose some ideas, could you at least do some searching in the forum and not propose ideas that have already been proposed?

Can you explain this proposal in a little more detail?

He’s talking about these things.

Some fallbacks can be incredibly subtle, manifesting as a few percent perf loss that you as a programmer may not immediately recognise as having been a fallback, but those few percents can mount up and turn a program that should be soaring into quite a sluggish thing.

If you only drop a “few percent perf,” you did not drop back to software emulation. There is a difference between “software rendering” (which is virtually impossible to get in core 3.x) and “not the fast path”.

It depends on which part of the pipeline falls back.

If you’re going to propose some ideas, could you at least do some searching in the forum and not propose ideas that have already been proposed?

No.

It depends on which part of the pipeline falls back.

I thought that the shader would fail to compile with “exceeded varying” or “exceeded ALU instruction” or “nested loops and if are bad”.

Another proposal I would like to see in the next versions is deferred-rendering display lists that support multi-threading.

It can be done. I managed to write such a function: you just have to allocate an executable memory page and then fill it with machine-code instructions that do nothing but call OpenGL functions. Any thread can do that. Your OpenGL thread then calls this memory page as if it were a C function.
It works fine, and I see better performance for scenes that require a lot of OpenGL API calls.

PS: Headache guaranteed ;)

I managed to write such a function: you just have to allocate an executable memory page and then fill it with machine-code instructions that do nothing but call OpenGL functions.

Except that you need more than just function pointers. You need the parameters to execute them.

Also, even if you added parameter data to the memory, all you’re doing is more or less re-implementing display lists. What D3D deferred contexts give you is the ability to segregate one set of state from another.

For example, if I put some rendering commands in a display list, I can change blend modes between calls to that list. The first display list call renders opaquely, the second renders with translucency. You cannot do that with deferred contexts. They store all of the rendering state within themselves. This ensures that when you set up a series of rendering commands, it will come out the other end exactly as you wanted it. It renders with all of the state you used.

That’s why it is a deferred context, not merely deferred rendering. All OpenGL needs to support this is the ability to create an HGLRC with a special attribute that makes it deferred relative to a given non-deferred HGLRC. wgl/glXCreateContextAttribsARB can handle the API, so you don’t need a special function for it. Commands given via a deferred context are not executed immediately; instead, they sit there and wait. There would need to be a new wgl/glX command to execute the commands of the deferred context on the main one. And that’s all.

There would need to be some adjustment on the deferred context side. Certain functions couldn’t be used in those contexts (glFlush, glFinish, etc). But that’s about it.

Also, even if you added parameter data to the memory, all you’re doing is more or less re-implementing display lists. What D3D deferred contexts give you is the ability to segregate one set of state from another.

Of course with parameters. I don’t know of many OpenGL functions without arguments that are useful for drawing. The OpenGL function pointers are stored in a struct filled in with wglGetProcAddress (or the equivalent on other OSes). We only need the following instructions:

  • push: to pass arguments,
  • call: to call function pointers,
  • ret: to end the function

And an OpenGL display list can only be compiled on the OpenGL thread. My solution doesn’t care about that; it only writes machine-code instructions into memory. So it’s exactly like the deferred context of D3D.
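To make the three instructions concrete, here is a minimal sketch of a byte emitter, assuming 32-bit x86 and a calling convention where the callee cleans up the stack (as with APIENTRY entry points on 32-bit Windows). The class `CallEmitter` and its methods are hypothetical names invented for this illustration. It only fills a byte buffer and never executes it; running the result would additionally require copying the bytes into memory that is marked executable, and 64-bit targets pass arguments in registers, so the encoding there is different.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Appends 32-bit x86 machine code to a byte buffer. Nothing is executed here.
class CallEmitter {
public:
    // Emits "push imm32" for each argument, last argument first, then
    // "mov eax, imm32" and "call eax" for the function address.
    void emitCall(std::uint32_t functionAddress,
                  const std::vector<std::uint32_t>& args)
    {
        assert(!finished_);  // nothing may follow the final ret
        for (auto it = args.rbegin(); it != args.rend(); ++it) {
            code_.push_back(0x68);  // push imm32
            emitImm32(*it);
        }
        code_.push_back(0xB8);      // mov eax, imm32
        emitImm32(functionAddress);
        code_.push_back(0xFF);      // call eax
        code_.push_back(0xD0);
    }

    // Emits "ret" to end the generated function.
    void emitReturn()
    {
        code_.push_back(0xC3);
        finished_ = true;
    }

    const std::vector<std::uint8_t>& bytes() const { return code_; }

private:
    // Writes a 32-bit value in little-endian byte order.
    void emitImm32(std::uint32_t value)
    {
        for (int shift = 0; shift < 32; shift += 8) {
            code_.push_back(static_cast<std::uint8_t>((value >> shift) & 0xFF));
        }
    }

    std::vector<std::uint8_t> code_;
    bool finished_ = false;
};
```

Float arguments would have to be pushed as their 32-bit bit patterns, and pointer arguments are baked in as raw addresses, so the data they point to has to stay alive until the generated code has run.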

And an OpenGL display list can only be compiled on the OpenGL thread.

Since when? You’ve been able to share display lists between contexts for over a decade. You can make a display list in one context and pass it over to a shared context for rendering. This works today.

My solution doesn’t care about that; it only writes machine-code instructions into memory. So it’s exactly like the deferred context of D3D.

Except that D3D doesn’t do that. D3D contexts are basically attachment points for objects. Therefore, since many objects are more or less immutable, you can break any rendering operation down to attaching a set of objects and drawing with them. Because the deferred context is a separate context, the set of attached objects is separate.

Therefore, the set of rendering commands on one context cannot affect the outcome of another context.

OpenGL isn’t like that. There is still plenty of global state out there. Unless a display list resets all state, display lists will be affected by previously set global state. The only way to ensure that the two contexts don’t affect one another is to make them two separate contexts.
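The difference can be sketched with a toy model that needs no real context. Everything here is a stand-in invented for illustration: `FakeContext` holds one piece of global state, `CommandList` replays recorded calls and inherits whatever state is current, and `StatefulList` carries its own state, loosely in the way a deferred context would. It is a simulation of the argument, not of how any driver works.

```cpp
#include <functional>
#include <string>
#include <vector>

// Stand-in for a context's global state; real OpenGL has far more of it.
struct FakeContext {
    bool blendEnabled = false;
    std::vector<std::string> log;  // how each draw actually came out

    void draw() { log.push_back(blendEnabled ? "translucent" : "opaque"); }
};

using Command = std::function<void(FakeContext&)>;

// A plain command list: replays calls under whatever state is current.
struct CommandList {
    std::vector<Command> commands;

    void execute(FakeContext& ctx) const
    {
        for (const auto& command : commands) {
            command(ctx);
        }
    }
};

// A list that carries its own state and restores the caller's afterwards.
struct StatefulList {
    bool blendEnabled = false;
    CommandList list;

    void execute(FakeContext& ctx) const
    {
        const bool saved = ctx.blendEnabled;
        ctx.blendEnabled = blendEnabled;  // apply the list's own state
        list.execute(ctx);
        ctx.blendEnabled = saved;         // leave the caller's state untouched
    }
};
```

Replaying the same `CommandList` before and after someone flips `blendEnabled` gives two different results, while the `StatefulList` draws the same way every time.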

So your simple “store commands in a buffer” just does not work for OpenGL.

I think I wasn’t clear enough about what my solution does. Here is a made-up example in C/C++.

This is a “normal” drawing function on the OpenGL thread:


glUniform4fv (16, 9, (const float *)Colors);

for ( int i = 0 ; i < BodyCount ; i ++ )
{
    // One set of bone matrices per body
    glUniformMatrix4fv (0, 4, GL_FALSE, (const float *)BonesMatrix[i]);
    glDrawArrays (GL_TRIANGLES, 0, TrianglesCount);
}

I replace it with these commands on another thread:


GLoffline->Uniform4fv (16, 9, (const float *)Colors);

for ( int i = 0 ; i < BodyCount ; i ++ ) // BodyCount = 2 for example
{
    GLoffline->UniformMatrix4fv (0, 4, GL_FALSE, (const float *)BonesMatrix[i]);
    GLoffline->DrawArrays (GL_TRIANGLES, 0, TrianglesCount); // TrianglesCount = 727 for example
}

Each function of the GLoffline object only appends the machine-code instructions that push the arguments and call the right OpenGL function. It doesn’t execute the OpenGL command itself; otherwise the program would crash, of course, because the call wouldn’t be running on the OpenGL thread ;). It also appends the ret instruction at the end for the return. The resulting function looks like this:


void dynamicFunc (void)
{
    // Arguments are baked in as constants; the pointers are raw addresses
    glUniform4fv (16, 9, (const float *)0x56be79c4);

    glUniformMatrix4fv (0, 4, GL_FALSE, (const float *)0x56be7000);
    glDrawArrays (GL_TRIANGLES, 0, 727);

    glUniformMatrix4fv (0, 4, GL_FALSE, (const float *)0x56be7800);
    glDrawArrays (GL_TRIANGLES, 0, 727);

    return;
}

As you can see, this function does not compute, loop, or test anything. It contains only OpenGL calls whose arguments are ready to be pushed onto the CPU stack before calling the OpenGL function pointers, and the function is then executed on the OpenGL thread. There are none of the problems you get with display lists (which are deprecated) on another OpenGL context.

That is simply how it works: multithreaded submission.

This example doesn’t really need such a technique because it’s so simple, but rendering that requires a lot of OpenGL calls can gain performance. (It works very nicely on CPUs with 4 or more cores.)
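For readers who want the record-then-replay shape without writing machine code, here is a minimal portable sketch. It is not the executable-page approach described above: it swaps in a plain buffer of closures. `OfflineRecorder` and `fakeDrawArrays` are hypothetical names, and `fakeDrawArrays` is a stand-in so the sketch runs without a context. Thread creation is left out; the assumption is that one worker fills a recorder and hands it over only after recording has finished.

```cpp
#include <functional>
#include <vector>

// Records calls as closures and replays them later, in order.
class OfflineRecorder {
public:
    // Stores the function and a copy of its arguments; nothing is called yet.
    template <typename Fn, typename... Args>
    void record(Fn fn, Args... args)
    {
        commands_.push_back([=]() { fn(args...); });
    }

    // Runs every recorded call. Meant to be called on the OpenGL thread.
    void execute() const
    {
        for (const auto& command : commands_) {
            command();
        }
    }

private:
    std::vector<std::function<void()>> commands_;
};

// Stand-in for a GL entry point: logs the count instead of drawing.
std::vector<int> g_drawLog;

void fakeDrawArrays(int mode, int first, int count)
{
    (void)mode;
    (void)first;
    g_drawLog.push_back(count);
}
```

With a real context, the worker would call something like `rec.record(glDrawArrays, GL_TRIANGLES, 0, TrianglesCount)` and the OpenGL thread would later call `rec.execute()`. As with the baked-in addresses above, any pointer argument is copied as a pointer only, so the data behind it must outlive the replay.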

Do you understand what I mean?

Do you understand what I mean?

I understood what you meant before. Just because someone disagrees with your solution doesn’t mean they don’t understand it.

What you have proposed is essentially a display list. The only difference is that rendering commands don’t package the rendered data, so you can in theory change what gets rendered by changing buffer objects and such.

It still provides no assurance that state set by other calls will not affect these calls. It also provides no means for doing error checking, doubly so since you can’t verify state beforehand.