Ork: a new object-oriented API on top of OpenGL



Eric Bruneton
10-11-2010, 05:47 AM
Ork (http://ork.gforge.inria.fr), for OpenGL Rendering Kernel, provides an object-oriented C++ API on top of OpenGL. Using Ork can greatly simplify the implementation of OpenGL applications. For instance, suppose that you want to draw a mesh in an offscreen framebuffer, with a program that uses a texture. Assuming that these objects are already created, with the OpenGL API you need something like this:

glUseProgram(myProgram);
glActiveTexture(GL_TEXTURE0 + myUnit);
glBindTexture(GL_TEXTURE_2D, myTexture);
glUniform1i(glGetUniformLocation(myProgram, "mySampler"), myUnit);
glBindBuffer(GL_ARRAY_BUFFER, myVBO);
glVertexAttribPointer(0, 4, GL_FLOAT, false, 16, 0);
glEnableVertexAttribArray(0);
glBindFramebuffer(GL_FRAMEBUFFER, myFramebuffer);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
With the Ork API you simply need two steps (and the first one does not need to be repeated before each draw, unless you want a different texture for each draw):

myProgram->getUniformSampler("mySampler")->set(myTexture);
myFramebuffer->draw(myProgram, *myMesh);
The Ork API fully covers the OpenGL 3.3 core profile, and partially covers the OpenGL 4.0 and 4.1 core profile APIs (tessellation shaders are supported, but uniform subroutines, binary shaders and programs, pipeline objects, separable shaders, and multiple viewports are currently not supported). Ork has just been released as an Open Source project, under the LGPL license.

Groovounet
10-11-2010, 06:21 AM
It looks interesting, pretty code and everything.

Once again, there is something I can't understand: why do so many people use a string name to set a uniform? Even with uniform locations, setting uniforms is the most expensive operation in real-scale software, simply because there are so many of them.
Adding a string search per glUniform* call is an even stronger penalty.

Eric Bruneton
10-11-2010, 08:12 AM
It looks interesting, pretty code and everything.

Once again, there is something I can't understand: why do so many people use a string name to set a uniform? Even with uniform locations, setting uniforms is the most expensive operation in real-scale software, simply because there are so many of them.
Adding a string search per glUniform* call is an even stronger penalty.

Uniforms are represented with Uniform objects (which encapsulate the uniform location). It is explicitly recommended to store the reference to the Uniform instead of looking it up each time you want to use it (see the "Important note" in 3.2.1 here (http://ork.gforge.inria.fr/v3.1/index.html#sec_shaders)).
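
The cost difference between looking a uniform up by name on every use and keeping a handle to it can be sketched without any GL context. The following is a minimal model, not Ork's implementation: MockProgram and CachedUniform are hypothetical names, and the "program" only counts how many string lookups it is asked to perform.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a linked GL program: it only models the
// name -> location table and counts how many string lookups are made.
class MockProgram {
public:
    MockProgram() : lookups_(0) {
        locations_["mySampler"] = 0;
        locations_["modelView"] = 1;
    }

    // Models glGetUniformLocation: a string search, -1 if the name is unknown.
    int getUniformLocation(const std::string& name) {
        ++lookups_;
        auto it = locations_.find(name);
        return it == locations_.end() ? -1 : it->second;
    }

    int lookupCount() const { return lookups_; }

private:
    std::unordered_map<std::string, int> locations_;
    int lookups_;
};

// Hypothetical handle that resolves the name once and keeps the location.
class CachedUniform {
public:
    CachedUniform(MockProgram& program, const std::string& name)
        : location_(program.getUniformLocation(name)), value_(0) {
        assert(location_ >= 0 && "uniform name not found in program");
    }

    // Models glUniform1i(location, v): no string search is involved.
    void set(int v) { value_ = v; }
    int location() const { return location_; }
    int value() const { return value_; }

private:
    int location_;
    int value_;
};

// Sets the uniform `frames` times through one cached handle and returns
// the number of string lookups the program had to perform.
int lookupsWithCachedHandle(int frames) {
    MockProgram program;
    CachedUniform sampler(program, "mySampler");
    for (int i = 0; i < frames; ++i) {
        sampler.set(i);
    }
    return program.lookupCount();
}

// Same work, but resolving the name again before every set.
int lookupsWithoutCache(int frames) {
    MockProgram program;
    for (int i = 0; i < frames; ++i) {
        int location = program.getUniformLocation("mySampler");
        (void)location;  // a real caller would pass this to glUniform1i
    }
    return program.lookupCount();
}
```

In this model, lookupsWithCachedHandle(1000) performs a single string search while lookupsWithoutCache(1000) performs one per frame, which is the penalty the recommendation is meant to avoid.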

pjcozzi
10-12-2010, 12:03 PM
Once again, there is something I can't understand: why do so many people use a string name to set a uniform?
By so many people I bet you are including me ;).

Eric hit the nail on the head; just find the uniform by name once after program compiling and linking, and cache it. The search isn't going to cost anything compared to the compile/link. Of course, immediately querying a program for its uniforms after linking is going to prohibit driver optimizations that try to compile/link on another thread but I don't know what Ork is doing behind the scenes. I personally just query uniforms, etc. right after linking.

Eric, very nice job with this project. The documentation is great!

Regards,
Patrick

Groovounet
10-13-2010, 05:45 AM
Of course, immediately querying a program for its uniforms after linking is going to prohibit driver optimizations that try to compile/link on another thread but I don't know what Ork is doing behind the scenes.

Really, you think that it could prevent some optimizations? I don't see how but maybe. The ARB_get_program_binary extension goes this way at least.

Chances are that most developers do so, so why show otherwise? Why show something that the documentation says not to do?

pjcozzi
10-13-2010, 07:06 AM
Really, you think that it could prevent some optimizations?
I don't want to take this thread off the topic of Ork but I am referring to Pierre Boudier's post here (http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=282529#Post282529).

Regards,
Patrick

Alfonse Reinheart
10-17-2010, 05:39 PM
With the Ork API you simply need two steps (and the first one does not need to be repeated before each draw, unless you want a different texture for each draw):

I'm not sure this is a good idea in terms of performance. Essentially, you have abstracted away the notion of texture image units. That is, the fact that programs don't have textures bound to them; they instead refer to texture image units.

Consider the possibility of creating certain conventions for texture image units. For example, one could always bind the shadow map to texture unit number 12. That way, every program that uses a shadow map just uses image unit 12, and you never have to set the uniform state again.

The way you're doing it, every program that uses a shadow map may be using entirely different texture image units for it. So on every program state change, you have to rebind both the uniform and the texture object.

Consider the case of deferred rendering, where the lighting passes all use the same texture sets. There may be dozens of different lighting and material shaders, but they all use the exact same texture objects. Why have all of those state changes when the underlying API needs precisely none of them?

At the very least, there should be some way to have these kinds of conventions associated with programs.
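
The effect of a unit convention can be illustrated with a small model that needs no GL context. This is only a sketch under stated assumptions: MockContext, MockProgram, the unit count of 16 and the texture name 7 are all hypothetical, and the model counts texture binds for a single pass over the programs rather than measuring real driver cost.

```cpp
#include <array>
#include <vector>

const int kUnitCount = 16;        // assumed number of texture image units
const int kShadowMapUnit = 12;    // hypothetical project-wide convention
const int kShadowMapTexture = 7;  // arbitrary texture object name

// Context state: which texture is bound to each unit (0 means none).
struct MockContext {
    std::array<int, kUnitCount> boundTexture{};
    int textureBinds = 0;

    // Models glActiveTexture + glBindTexture, skipping redundant binds.
    void bindTexture(int unit, int texture) {
        if (boundTexture[unit] != texture) {
            boundTexture[unit] = texture;
            ++textureBinds;
        }
    }
};

// Program state: the unit its shadow sampler uniform refers to.
struct MockProgram {
    int shadowSamplerUnit;
};

// Draws once with each program and returns how many texture binds were
// needed to make the shadow map visible to all of them.
int bindsForPass(const std::vector<MockProgram>& programs) {
    MockContext context;
    for (const MockProgram& program : programs) {
        context.bindTexture(program.shadowSamplerUnit, kShadowMapTexture);
        // the draw call for this program would go here
    }
    return context.textureBinds;
}

// Every program follows the convention and samples from kShadowMapUnit.
int bindsWithSharedUnit(int programCount) {
    std::vector<MockProgram> programs(programCount, MockProgram{kShadowMapUnit});
    return bindsForPass(programs);
}

// Each program was handed a different unit (0, 1, 2, ...).
int bindsWithDistinctUnits(int programCount) {
    std::vector<MockProgram> programs;
    for (int i = 0; i < programCount; ++i) {
        programs.push_back(MockProgram{i % kUnitCount});
    }
    return bindsForPass(programs);
}
```

With eight programs, the shared-unit case needs one bind for the whole pass, while the distinct-unit case needs one bind per program.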

Similarly, having the draw function as part of the framebuffer gives the wrong idea about the performance characteristics of drawing to different framebuffers. It may be object oriented, but it suggests that changing which framebuffer you're drawing to is a trivial change with no performance implications. If instead you make the current framebuffer some kind of state, then you more strongly suggest that changing the state incurs a performance penalty.

Eric Bruneton
10-19-2010, 11:09 AM
I'm not sure this is a good idea in terms of performance. Essentially, you have abstracted away the notion of texture image units.

Right, one of the main goals was to abstract image units away.


Consider the possibility of creating certain conventions for texture image units. For example, one could always bind the shadow map to texture unit number 12. That way, every program that uses a shadow map just uses image unit 12, and you never have to set the uniform state again.


At the very least, there should be some way to have these kinds of conventions associated with programs.

I think this could be easily done with a "setPreferredUnit" or "setUnit" method in the Texture class.


Similarly, having the draw function as part of the framebuffer gives the wrong idea about the performance characteristics of drawing to different framebuffers. It may be object oriented, but it suggests that changing which framebuffer you're drawing to is a trivial change with no performance implications. If instead you make the current framebuffer some kind of state, then you more strongly suggest that changing the state incurs a performance penalty.

Of course, users should have some knowledge of OpenGL to use Ork efficiently. But users always have to know the "internals" of any library to use it in the most efficient way (this is even true for C++ itself, or for the STL).

Alfonse Reinheart
10-19-2010, 04:19 PM
I think this could be easily done with a "setPreferredUnit" or "setUnit" method in the Texture class.

Why would it be in the texture class? It would make more sense to put this in the shader(s), since they're the ones who need to decide where these things go.


of course the users should have some knowledge about OpenGL to use Ork in an efficient way.

You're making an API whose primary design is to abstract away OpenGL. Why would you expect users of it to have detailed knowledge of how OpenGL works to be able to use it? If they had detailed knowledge of OpenGL, they wouldn't need Ork.


But users always have to know the "internals" of any library to use it in the most efficient way (this is even true for C++ itself, or for the STL)

True, but the STL doesn't make it easier to do things the wrong way than to do them the right way. std::list does not have operator[] because it would be prohibitively expensive to implement. It would give the user the illusion that it is OK to access elements by index. Similarly, std::list iterators are not random access iterators, again suggesting that directly adding numbers to them isn't the most effective way to use them.

The point is that the API should encourage correct use and discourage incorrect use.
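
The std::list point can be checked directly in standard C++. The sketch below uses only the standard library; nthElement and sampleList are hypothetical helper names introduced for illustration.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

using ListCategory =
    std::iterator_traits<std::list<int>::iterator>::iterator_category;
using VectorCategory =
    std::iterator_traits<std::vector<int>::iterator>::iterator_category;

// The iterator category encodes what is cheap: list iterators are not
// random access, vector iterators are.
static_assert(!std::is_base_of<std::random_access_iterator_tag, ListCategory>::value,
              "std::list iterators are not random access");
static_assert(std::is_base_of<std::random_access_iterator_tag, VectorCategory>::value,
              "std::vector iterators are random access");

// Reaching the n-th element of a list has to be spelled out as a walk;
// `values[n]` and `values.begin() + n` do not compile for std::list.
// Precondition: n < values.size().
int nthElement(const std::list<int>& values, std::size_t n) {
    auto it = std::next(values.begin(), static_cast<std::ptrdiff_t>(n));
    return *it;  // std::next took n single steps to get here
}

// Small fixed list used to exercise nthElement.
std::list<int> sampleList() {
    return {10, 20, 30, 40};
}
```

The linear walk is still possible, but the caller has to ask for it explicitly, which is the sense in which the interface discourages the expensive usage.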

Groovounet
10-19-2010, 04:42 PM
Ouhhhhhh as harsh as usual Alfonse :p

OpenGL is crazy hard to abstract, especially compared to Direct3D. I think this is due to the serious lack of consistency in OpenGL. Damn, why does everything have to invent new rules for every feature when existing conventions are already present in the OpenGL specification?

My latest example in this regard: the subroutine API... WHAT THE [censored]. Why does it have a shader type parameter? Why does it have such strange rules about where its state belongs (half context, half program, in summary)? And if those conventions are the right ones, why don't uniform blocks and uniform variables follow the same rules?

Sometimes (and actually quite often!) when I start to think about OpenGL, my brain just ends up in a deadlock.

I dare anyone to tell me they are using all the strength of the VAO in a decent-size project! It has barely any XD

kRogue
10-20-2010, 02:42 AM
My latest example in this regard: the subroutine API... WHAT THE [censored]. Why does it have a shader type parameter? Why does it have such strange rules about where its state belongs (half context, half program, in summary)? And if those conventions are the right ones, why don't uniform blocks and uniform variables follow the same rules?


We might as well write down the UBO vs Subroutine usage... I confess that reading the spec and getting it right is not exactly a walk in the park.. if I got this wrong, PLEASE someone correct it.

Here goes, shader:




uniform BlockName
{
mat4 classic;
vec2 baroque;
int jazz;
float blues;
int punk;
int techno;
} InstanceName;

.
.
.

acid_music=mix_music(InstanceName.punk, InstanceName.classic);



C code:


block_index=glGetUniformBlockIndex(program, "BlockName");
.
.
.
//or use glBindBufferRange to use a range of some_buffer_object
//bind the buffer object to "block_binding_point", this is like
//glBindMultiTextureEXT(GL_TEXTURE0+unit, GL_TEXTURE_SOMETHING (like 2D), texture_name)
glBindBufferBase(GL_UNIFORM_BUFFER, block_binding_point, some_buffer_object);

//instruct program to use the buffer object at "block_binding_point" as the source
//of the uniform block
//this is analogous to glUniform1i when the uniform is a sampler type
glUniformBlockBinding(program, block_index, block_binding_point);



Getting that from reading the GL spec takes some time: glGetUniformBlockIndex is like "uniform location, but for uniform blocks" and glUniformBlockBinding is like "glUniform" for uniform blocks, saying what binding point to use. Sickly, NVIDIA's GL_NV_shader_buffer_load is not only easier to use but more flexible (though it might have slower access than uniform blocks :o ). However, taking a gander, uniform buffer object usage is the same as for texture units: glUniformBlockBinding determines which "unit" to use and glBindBufferBase/Range puts the buffer object at a "unit".
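
The two-hop indirection described above (block index to binding point, binding point to buffer) can be modelled in plain C++. This is a sketch only: MockContext, MockProgram and the values 3 and 42 are made up for illustration, and the methods merely mirror the roles of glBindBufferBase, glGetUniformBlockIndex and glUniformBlockBinding rather than calling them.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Context state: the buffer attached to each indexed uniform buffer binding point.
struct MockContext {
    std::map<unsigned, unsigned> bufferAtBindingPoint;

    // Models glBindBufferBase(GL_UNIFORM_BUFFER, bindingPoint, buffer).
    void bindBufferBase(unsigned bindingPoint, unsigned buffer) {
        bufferAtBindingPoint[bindingPoint] = buffer;
    }
};

// Program state: active uniform blocks and the binding point each one uses.
struct MockProgram {
    std::vector<std::string> blockNames;      // position in the vector = block index
    std::vector<unsigned> blockBindingPoint;  // one entry per block, initially 0

    explicit MockProgram(const std::vector<std::string>& names)
        : blockNames(names), blockBindingPoint(names.size(), 0u) {}

    // Models glGetUniformBlockIndex; returns -1 where GL returns GL_INVALID_INDEX.
    int getUniformBlockIndex(const std::string& name) const {
        for (std::size_t i = 0; i < blockNames.size(); ++i) {
            if (blockNames[i] == name) return static_cast<int>(i);
        }
        return -1;
    }

    // Models glUniformBlockBinding(program, blockIndex, bindingPoint).
    void uniformBlockBinding(int blockIndex, unsigned bindingPoint) {
        blockBindingPoint.at(static_cast<std::size_t>(blockIndex)) = bindingPoint;
    }
};

// Follows both hops: block index -> binding point -> buffer (0 if nothing is bound).
unsigned bufferSeenByBlock(const MockContext& context, const MockProgram& program,
                           int blockIndex) {
    unsigned bindingPoint =
        program.blockBindingPoint.at(static_cast<std::size_t>(blockIndex));
    auto it = context.bufferAtBindingPoint.find(bindingPoint);
    return it == context.bufferAtBindingPoint.end() ? 0u : it->second;
}

// Mirrors the C code above with made-up values: buffer 42 at binding point 3.
unsigned exampleLookup() {
    MockProgram program(std::vector<std::string>(1, "BlockName"));
    MockContext context;
    int blockIndex = program.getUniformBlockIndex("BlockName");
    context.bindBufferBase(3u, 42u);
    program.uniformBlockBinding(blockIndex, 3u);
    return bufferSeenByBlock(context, program, blockIndex);
}
```

The binding point plays the role of the texture unit: the program stores which point a block uses, the context stores which buffer sits at that point, and either side can change without touching the other.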


Now for subroutines,



//C analogue would be typedef float (*my_function)(in float a0, out float b0, in vec4 c0)
//if the words in/out made any sense in C
//note that it is not quite a typedef as identical return and argument
//types give rise to unique types.
subroutine float my_function_type(in float a0, out float b0, in vec4 c0);


//the subroutine(my_function_type) warns the compiler that
// f1 can be used as a function pointer for the function pointer type my_function_type
subroutine(my_function_type) float f1(in float a0, out float b0, in vec4 c0)
{
return something;
}


subroutine(my_function_type) float f2(in float a0, out float b0, in vec4 c0)
{
return something_else;
}

//declare a uniform function pointer
subroutine uniform my_function_type function_pointer;


.
.
.
.
.
float giggles, jjokes;
jjokes=function_pointer(4.0, giggles, vec4(3.0, 2.0, 1.0, 4.0) );



and now the C code:



subroutine_location=glGetSubroutineUniformLocation (program, GL_SOME_SHADER_TYPE, "function_pointer");
f1_index=glGetSubroutineUniformLocation(program, GL_SOME_SHADER_TYPE, "f1");
f2_index=glGetSubroutineUniformLocation(program, GL_SOME_SHADER_TYPE, "f2");
.
.
.
.
.
GLuint choice[]={ }; //make sure the length of choice equals the GL_ACTIVE_SUBROUTINE_UNIFORM_LOCATIONS value queried with glGetProgramStageiv

//choose:
choice[subroutine_location]= f1_index;
glUniformSubroutinesuiv(GL_SOME_SHADER_TYPE, sizeof(choice)/sizeof(GLuint), choice);

//but beware: glUseProgram will "reset" whatever choice you made with the above!


I freely admit that having glUseProgram reset the value (and, for that matter, glUniformSubroutinesuiv not taking the program name as a parameter) made me do an epic "what the?" The justification was that the same program in use across different contexts could use different subroutine uniform values. This gives the same functionality as using the same program in different contexts with different textures and buffer objects bound. I also confess: what a corner case, since all other uniform values are program state anyway. I also freely admit that the subroutine typing, as is, is wonky: I would *think* that the return type and argument list would completely determine the type, but as of now declaring two subroutine types with the same argument types and return type gives two distinct types, and for a fixed function to be usable as either, one needs to list both in the subroutine(...) prefix declaration. Ick.
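
The reset behaviour described above can be sketched as a small state model. This is an illustration, not GL: MockContext and MockProgram are hypothetical, and the reset is modelled as "index 0 for every subroutine uniform", whereas the post only says that the earlier choice is lost.

```cpp
#include <cstddef>
#include <vector>

// Program state in this model: just the number of subroutine uniform locations.
struct MockProgram {
    std::size_t subroutineUniformLocations;
};

// Context state: the subroutine index chosen for each subroutine uniform location.
class MockContext {
public:
    // Models glUseProgram: any earlier selection is discarded.
    // Assumption: the default is modelled as index 0 for every location.
    void useProgram(const MockProgram& program) {
        selection_.assign(program.subroutineUniformLocations, 0u);
    }

    // Models glUniformSubroutinesuiv for one stage: the whole array is set at
    // once, and a wrong-sized array is ignored in this model.
    void uniformSubroutines(const std::vector<unsigned>& indices) {
        if (indices.size() == selection_.size()) {
            selection_ = indices;
        }
    }

    unsigned selected(std::size_t location) const { return selection_.at(location); }

private:
    std::vector<unsigned> selection_;
};

// Chooses subroutine index 1, then binds the program again without re-applying.
unsigned selectionAfterRebind() {
    MockProgram program{1};
    MockContext context;
    std::vector<unsigned> choice(1);
    choice[0] = 1u;
    context.useProgram(program);
    context.uniformSubroutines(choice);
    context.useProgram(program);  // the choice made above is gone
    return context.selected(0);
}

// Same sequence, but the choice is re-applied after the second bind.
unsigned selectionAfterReapply() {
    MockProgram program{1};
    MockContext context;
    std::vector<unsigned> choice(1);
    choice[0] = 1u;
    context.useProgram(program);
    context.uniformSubroutines(choice);
    context.useProgram(program);
    context.uniformSubroutines(choice);
    return context.selected(0);
}
```

In the model, the selection survives only if it is re-issued after every program bind, which is why the choice behaves like context state rather than program state.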


In their defense, to get the same pattern as found for uniform blocks and textures, it would have needed this:




//create subroutine resource
GLuint subroutine;
subroutine=glCreateSubroutine(subroutine_source_code);

//bind the subroutine:
glBindSubroutine(GL_SUBROUTINE, binding_point, subroutine);

//use that in our GLSL:
glUniformSubroutine(program, subroutine_location, binding_point);



but I'd imagine that would have been practically useless as then the subroutine would only be able to access the arguments and its own (private) local variables.

pjcozzi
10-20-2010, 06:02 AM
Similarly, having the draw function as part of the framebuffer gives the wrong idea about the performance characteristics of drawing to different framebuffers. It may be object oriented, but it suggests that changing which framebuffer you're drawing to is a trivial change with no performance implications.
Since we don't know how Ork is implemented, it is possible that it defers submitting GL calls until the end of the frame (perhaps it is even split into two threads: one that fills internal command lists, and another that issues GL calls from the list, similar to Quake 4 (http://mrelusive.com/publications/presentations/2008_gdc/GDC%2008%20Threading%20QUAKE%204%20and%20ETQW%20Final.pdf)). In this case, the implementation could render sorted by frame buffer, and therefore "changing frame buffers" in client code is cheap.

It is unlikely that Ork is implemented this way now, but by abstracting GL, it has the option to do so in the future, just like it could switch from selectors to direct state access without affecting client code.
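
The deferred-submission idea can be sketched as a command list that is sorted by framebuffer before being replayed. This is not how Ork is known to work; DrawCommand and the helper functions below are hypothetical, and the model only counts how many framebuffer binds a replay would need.

```cpp
#include <algorithm>
#include <vector>

// One recorded draw: which framebuffer it targets and which mesh it draws.
struct DrawCommand {
    int framebuffer;  // assumed to be a non-negative identifier
    int mesh;
};

// Replays the list in order and counts how often the target framebuffer
// changes (the first bind counts as a change).
int countFramebufferSwitches(const std::vector<DrawCommand>& commands) {
    int switches = 0;
    int current = -1;  // nothing bound yet
    for (const DrawCommand& command : commands) {
        if (command.framebuffer != current) {
            current = command.framebuffer;
            ++switches;
        }
    }
    return switches;
}

// Groups commands by framebuffer; stable_sort keeps the submission order
// of the draws that target the same framebuffer.
std::vector<DrawCommand> sortedByFramebuffer(std::vector<DrawCommand> commands) {
    std::stable_sort(commands.begin(), commands.end(),
                     [](const DrawCommand& a, const DrawCommand& b) {
                         return a.framebuffer < b.framebuffer;
                     });
    return commands;
}

// Client code that alternates between two framebuffers on every draw.
std::vector<DrawCommand> interleavedFrame() {
    return {{1, 10}, {2, 11}, {1, 12}, {2, 13}, {1, 14}};
}
```

Replaying interleavedFrame() as submitted needs five framebuffer binds, while the sorted list needs two. The reordering is only valid when draws to one framebuffer do not depend on results rendered to another in between.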


The point is that the API should encourage correct use and discourage incorrect use.
Stellar advice. I've also heard this phrased "easy to use correctly and hard to use incorrectly."

Regards,
Patrick

Chris Lux
10-20-2010, 06:10 AM
I look at the subroutine problem like this: Instead of calling glUseProgram(prog_id), which is context state, we call glUseProgramInstance(prog_id, instance_id). Where the latter call is my view of glUseProgram and glUniformSubroutine combined, which are both context state. If the program uses subroutines we _have_ to select a program instance according to some uniform selection vector or we get undefined behavior in the subroutine selection.

I do not like it, but with the view described above I can live with it.

Alfonse Reinheart
10-20-2010, 11:47 AM
Since we don't know how Ork is implemented, it is possible that it defers submitting GL calls until the end of the frame (perhaps it is even split into two threads: one that fills internal command lists, and another that issues GL calls from the list, similar to Quake 4). In this case, the implementation could render sorted by frame buffer, and therefore "changing frame buffers" in client code is cheap.

The problem is that the API documentation doesn't state that this is possible. It strongly implies that rendering commands will be processed in the order they are submitted, rather than in an arbitrary or optimal order. The fact that the API is a leaky abstraction (providing features to allow you to back-door the abstraction and make GL commands directly) only helps encourage this view.

If the code later takes advantage of the abstraction, then a compatibility problem is created for all applications that use Ork.

Groovounet
10-20-2010, 12:52 PM
@kRogue: Just a little error in your subroutine code: We get subroutine index with glGetSubroutineIndex, in your code you wrote glGetSubroutineUniformLocation.

When I was talking about subroutines vs. uniform blocks vs. uniform variables, I was thinking at a "higher" level: how everything comes together inside a program, how state changes are ordered, and so on. For instance, subroutine query functions require the program stage, but uniform block and uniform variable queries do not... why? Oo And with separate programs it's even more "Oo". Subroutine uniforms are half context state, half program state, while uniform blocks and variables are program state. The same goes for the definitions and uses of index and location across uniform variables, blocks and subroutines...

Another, different example: all glBind* functions set context state... all except glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ...), which is VAO state... Really? Why? Oo. OK, the VAO stuff doesn't make sense, so why not; it fits perfectly with this mess.

Lack of consistency takes many forms, and more and more of them since OpenGL 3.0. I am pretty sure it's going to become a bigger and bigger topic. It's still manageable but... challenging!

kRogue
10-21-2010, 11:51 AM
@kRogue: Just a little error in your subroutine code: We get subroutine index with glGetSubroutineIndex, in your code you wrote glGetSubroutineUniformLocation.


AHRRRR.. I HATE typos... and I can't edit the post. Well, here is to hoping that those who read my typo-code will see your correction.. I hate typos.

As for the inconsistency of GL... ah.. nothing like evolutionary scarring...

For what it is worth: I have always, always hated that the name space for uniforms was common to all shader stages; now mix in separate shader objects and it is.. ickier... I suspect that the need for specifying the shader type in the glGetSubroutineUniformLocation and glGetSubroutineIndex functions is there because one gives an index of what to use, and not a value... and I would bet, at least someone else's money, that other uniforms are duplicated in terms of "where" they are stored for each shader stage they appear in.... at any rate, that uniform names (except for subroutines!) all live in the same namespace has always irked me, A LOT. But that cat was let out of the bag nearly 6 years ago in GL2.0... sighs...

and oh yes, the function name "glGetUniformBlockIndex" is one big insane wtf for me. I figured it should have been named "glGetUniformBlockLocation"... ehh.. I am probably whining right now.

Giggles: VAO's. My bet is that the thought behind VAO was one VAO per mesh, which meant GL_ELEMENT_ARRAY_BUFFER and GL_ARRAY_BUFFER state (and the glVertexAttribPointer state).... but it all got wonky didn't it since, like you said, glBindBuffer is context state anywhere else.... here is to hoping that some kind of DSA comes to core that sorts it out, eh? Oh yea, my "use" of VAO is just making and binding one at startup... I am an NVIDIA bindless fiend anyways. When red fruits come to core, that is VAO.

Want to know what gives me biggest headaches: in compatibility profile knowing exactly what state is part of a displaylist... I am always paranoid of missing/adding something.

The GL API's inconsistency all comes down to its evolutionary nature/growth/whatever bloody appendix. As we all know, reading the spec really brings the evolutionarily scarred nature of GL into focus.. the thing has the same layout as version 1.0, and some sections (particularly texturing) are almost word for word the same... Methinks it is that way because it was added to and never, um, for lack of a better word, refactored. I freely admit that I am no editor :confused: But at any rate the spec has been added to from the beginning.. that it still works after such a long time, given how much stuff has advanced, is a testament to how good the original GL spec was... it definitely shows its appendix when you see something like



The maximum border width bt is 0. If border is less than zero, or greater than
bt, then the error INVALID_VALUE is generated.

and then using the value all over the place :cool: though it must be zero (in GL spec defense the compatibility profile just changes the 0 to 1 and then has exact same text).

kyle_
11-20-2010, 05:33 AM
Another, different example: all glBind* functions set context state... all except glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ...), which is VAO state... Really? Why? Oo. OK, the VAO stuff doesn't make sense, so why not; it fits perfectly with this mess.
Wait ... what?
I sure am glad I couldn't be bothered to grok VAOs.


Lack of consistency takes many forms, and more and more of them since OpenGL 3.0.
At least there is one thing OpenGL manages to do consistently ;)

Stephen A
11-20-2010, 03:32 PM
It looks interesting, pretty code and everything.

Once again, there is something I can't understand: why do so many people use a string name to set a uniform? Even with uniform locations, setting uniforms is the most expensive operation in real-scale software, simply because there are so many of them.
Adding a string search per glUniform* call is an even stronger penalty.

String-based lookups don't have to be slow. (http://www.humus.name/index.php?page=Comments&ID=296)