RFC: buffer description language

Two interesting problems are not addressed by the EXT_render_target spec:
- how to choose the “optimal” internal format
- how to specify buffers that are “built-in” to the render target object

Addressing the first item requires some information about the various configurations that a buffer will be used in; the second just needs a mechanism for specifying such buffers.
I think the idea of a buffer description language and an API for using it layers nicely on the EXT_render_target spec and provides this useful functionality.
Consider the following:

!!BDL1.0
buffer2d colortex {
  format = RGBA8;
  is_color_buffer = true;
  is_texture = true;
};

buffer2d colortexfp16 {
  format = RGBA16F;
  is_color_buffer = true;
  is_texture = true;
};

framebuffer main_rtt {
  color_buffer = colortex | colortexfp16;

  depth_buffer = {
    format = DEPTH_COMPONENT24;
  };
};

The above describes a couple of renderable texture types, and states that textures of these types both need to be renderable as the color buffer in a FRAMEBUFFER-style render target that contains its own built-in 24-bit depth buffer.

Other buffers, properties, and framebuffer configurations could be specified, but this gives the general idea.

Then in order to use this description, one would do something like:

glBufferDescription( "!!BDL1.0 …" );
GLenum colortex_internal_format = getBufferInternalFormat("colortex");
GLenum colortexfp16_internal_format = getBufferInternalFormat("colortexfp16");
GLuint main_rtt_config = getRenderTargetConfig("main_rtt");

The appropriate internal format queried above would then be passed in as the internalFormat parameter when specifying the texture image with TexImage2D().

Likewise, the render target config would be specified via glRenderTargetParameteri() with a pname like RENDER_TARGET_CONFIG.
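
To make that concrete, something like the following, where all of the entry points and enums are still the hypothetical ones sketched above, and the texture size and the target enum passed to glRenderTargetParameteri() are arbitrary guesses:

// uses the values queried above; everything named here is hypothetical
GLuint tex;
glGenTextures( 1, &tex );
glBindTexture( GL_TEXTURE_2D, tex );

// let the driver-chosen internal format flow into the texture allocation
glTexImage2D( GL_TEXTURE_2D, 0, colortex_internal_format,
              512, 512, 0, GL_RGBA, GL_UNSIGNED_BYTE, NULL );

// tell the currently bound render target to use the "main_rtt" configuration
glRenderTargetParameteri( RENDER_TARGET, RENDER_TARGET_CONFIG, main_rtt_config );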

I don’t think most developers will find it difficult to use the EXT_render_target API with great acceleration as long as they are somewhat careful to stay on “the beaten path”, but this mechanism would allow the driver to pick buffer formats based on all the relevant information.

How do others feel about this approach?

Thanks -
Cass

edit: nothing major…

This is a great idea!

But how will the driver choose the “best” buffer?
For example:

color_buffer = colortex | colortexfp16;

How does the driver know which buffer to choose?

Why not add dimensions to the buffers, like:

buffer2d colortex {
  format = RGBA8;
  is_color_buffer = true;
  is_texture = true;
  size_x = 512;
  size_y = 512;
};

And something like:

GLchar* col_buffer = glGetChosenColorBuffer();
GLint tex = glCreateTexture(col_buffer, GL_USE_INTERNAL_FORMAT | GL_USE_BUFFER_DIMENSIONS);

I should point out that this idea is not intended to go into EXT_render_target (which would delay it), but rather something that would be layered on top of it.

As for whether width/height make sense, I’m not sure. That doesn’t add much value as far as I can tell, especially if you wanted to use the same RT to draw to textures of different sizes.

The answer to the question about the “best” format is that the driver gets to look at all the possible combinations that a buffer will be used in and figure out which actual internal format works best for the hardware you’re on…

Does that make sense?

Thanks -
Cass

Originally posted by Corrail:
But how will the driver choose the “best” buffer?
For example:

color_buffer = colortex | colortexfp16;

How does the driver know which buffer to choose?

I believe the idea is that when you state “color_buffer = colortex | colortexfp16” you are stating that the buffer will be used for RGBA8 or RGBA16F rendering.

I don’t think most developers will find it difficult to use the EXT_render_target API with great acceleration as long as they are somewhat careful to stay on “the beaten path”, but this mechanism would allow the driver to pick buffer formats based on all the relevant information.
That’s just it. I’m not sure that this is a “problem” that needs to be solved.

How far off “the beaten path” do you need to go before the driver starts to need more information? Certainly, if you bound a floating-point texture, it should render in that format, yes? If you bound a luminance texture, it may need to do some copying if it can’t natively render to it. But what are the cases where the driver needs to be told what framebuffer format to use?

i’ve been thinking about this issue for quite a long time…

in c/c++, you can make arrays of structs with arbitrary content. in graphics, you can only create arrays of structs of certain layouts and formats, namely GL_RGBA8, GL_LUMINANCE16, etc… all different predefined types.

it would be really cool to instead be able to define your own types that could be used in the buffers/arrays… but the way to describe it should be the same as c (so, with some macro-trickery, you would only have to write it once).

reason?

think about the dx9 way of describing how you actually have laid out the vertex struct in the vertex buffer. this huge declaration array is a big piece of code, and redundant => useless.
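
to show the kind of redundancy i mean, here is the same layout written twice in plain dx9 (just a typical example i made up):

#include <d3d9.h>

// the layout, once as a c struct...
struct MyVertex {
    float pos[3];
    float normal[3];
    float uv[2];
};

// ...and once again as the dx9 vertex declaration for the very same thing
D3DVERTEXELEMENT9 decl[] = {
    { 0,  0, D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_POSITION, 0 },
    { 0, 12, D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_NORMAL,   0 },
    { 0, 24, D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_TEXCOORD, 0 },
    D3DDECL_END()
};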

being able to describe any sort of texture-/vertex-buffer format would be too cool (especially, again, in my case, for hw-accelerated raytracing :D)

I’m not sure that this is a “problem” that needs to be solved.

I’m inclined to agree with Korval. Besides, even if you can make the case for the need, the idea of another scripted entry into the api makes me a little nervous; is it an argument against further function bloating of the api? Will we end up with a slew of scripts? Hmmmm, this one is a puzzler.

In any case, if you proceed with the language approach, perhaps a larger framework might be nice, something very extensible, like a generic “description/attribute/specification language” on which a buffer language could be based, among other things such as vertex buffer descriptions, as davepermen suggests.

It seems that an app can request render-to-texture with a format that is not accelerated for rendering. To avoid this problem, the driver should provide the best texture format for that.

Maybe it’s better to do something like glChooseTextureFormat(params). The driver should choose the best texture format for render targets and texturing. The params should be some kind of description of the color, depth, and stencil bits the app needs, and glChooseTextureFormat should return the best internal texture format.
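
Something like this, where every name is made up just to show the idea:

// hypothetical attribute list, in the spirit of wglChoosePixelFormatARB
GLint attribs[] = {
    CHOOSE_COLOR_BITS,   32,
    CHOOSE_DEPTH_BITS,   24,
    CHOOSE_STENCIL_BITS,  8,
    0                    // terminator
};

// hypothetical entry point: the driver returns the internal format it can
// actually render to and texture from
GLenum best_format = glChooseTextureFormat( attribs );
glTexImage2D( GL_TEXTURE_2D, 0, best_format, 512, 512, 0,
              GL_RGBA, GL_UNSIGNED_BYTE, NULL );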

The following issue from the spec claims that the app can choose any kind of texture format and the driver should deal with the acceleration problem. This means the driver may do an internal data copy, which can slow down rendering.

(26) What happens if an RGBA4 texture is used as a render target
buffer, but the hardware does not support rendering to RGBA4?
    The implementation is allowed to render to a surface with more
    precision than the texture's requested format.
    The texture object's format remains RGBA4.  The driver is allowed
    to copy the texture contents from the RGBA4 surface to a surface
    with a format that can be rendered to, for example RGBA8, and then
    copy back from the RGBA8 surface to the RGBA4 texture.  The actual
    surface format used for rendering must depend only on the
    (requested) formats of the texture and the other render target
    buffers.
    In this example, querying TEXTURE_RED_BITS will return 4 because
    the texture's requested format is RGBA4.  But querying RED_BITS
    while the render target object is bound will return 8.  The
    discrepancy is an indication to the application that a copy is
    happening behind the scenes.

And one more question… Is texture size relevant for accelerated rendering?

yooyo

I don’t quite get it, yet. Perhaps you could enlighten me about what the driver-side benefit would be?

Current theory:
Users of EXT_render_target are encouraged to specify a multitude of buffers and usage models up front, instead of requesting resources in a more JIT-like fashion.
This will help the driver make better choices about memory allocation, i.e. use the appropriate memory type (is this an option for drawable buffers?), and keep down fragmentation (and copies to resolve such fragmentation) over the lifetime of these objects.

Is this the deal? Care to add anything?

I have to admit, in the first couple of minutes of reading this I was sort of torn between feeling ‘this is a great idea’ and ‘this is overkill for configuring a buffer’. I mean, after all you could send structs filled with this kind of data in the form of symbolic constants or enums to the driver to accomplish the same, right?

Wrong. While the same could be achieved in the ‘traditional’ way, having a language makes it easier to extend buffer descriptions without having to make API changes, or even recompile your project. If anyone else thinks direct API calls with structures or arrays of values to describe buffers would be the better way, think of requesting a certain framebuffer format from WGL (yuck!). I’ll prefer a clear, structured language like this any day, thank you.

The only drawback I can see to having an actual language describing buffers is that it can make it a slight pain in the rear to configure buffers at runtime: constructing a string depending on, say, your game’s configuration versus the hardware’s capabilities and sending it to the driver, versus filling structs with variables (see the sketch below).
Other than this single (and probably easily disputed) point, I say: go for it. It’s a good idea: it decouples additional hardware capabilities, as far as buffer caps are concerned, from the application’s code, which is a good direction to steer in.
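
To make that runtime-construction point concrete, it could end up looking something like this (glBufferDescription is the hypothetical entry point from the proposal, and use_fp16 is just a made-up app setting):

#include <stdio.h>

// build the BDL string from the application's configuration at runtime
char bdl[256];
snprintf( bdl, sizeof( bdl ),
          "!!BDL1.0\n"
          "buffer2d colortex {\n"
          "  format = %s;\n"
          "  is_color_buffer = true;\n"
          "  is_texture = true;\n"
          "};\n",
          use_fp16 ? "RGBA16F" : "RGBA8" );

glBufferDescription( bdl );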

Reading through your example again, Cass, one question: how would this tie in with multiple render targets? I can imagine how it would probably work (one BDL-string per render target, and ‘bound object’ methodology for the API calls to work on the currently ‘active’ target?), but I’d like to make sure I’m understanding your thoughts behind this correctly instead of making assumptions.

Originally posted by Korval:
That’s just it. I’m not sure that this is a “problem” that needs to be solved.

I’m actually glad to hear you say this.

I agree that most apps will not need to bother with this kind of mechanism.


How far off “the beaten path” do you need to go before the driver starts to need more information? Certainly, if you bound a floating-point texture, it should render in that format, yes? If you bound a luminance texture, it may need to do some copying if it can’t natively render to it. But what are the cases where the driver needs to be told what framebuffer format to use?

You’d usually need to go pretty far off the beaten path for most modern hardware.

Hardware often has non-orthogonalities like the inability to render to 32b color with a 16b depth buffer. Specifying the usage up front would allow the driver to guide the app to choose the “most accelerated” format based on the app’s intended usage.

It might also provide a good way for the driver to tell an app that it couldn’t find a set of formats that it considered “good”. For example, if you asked about rendering to fp textures on a GeForce2.

Thanks -
Cass

edit: fix ubb code

Originally posted by zeckensack:
I don’t quite get it, yet. Perhaps you could enlighten me about what the driver-side benefit would be?

Sorry! I probably should have discussed the motivation for this a little more.


Current theory:
[…]
Is this the deal? Care to add anything?

Not exactly. This isn’t really about buffer allocation; it’s about buffer internal format selection. For example, if your buffer description indicated rendering to an RGB565 texture, the driver might tell you the best internal format for that buffer is RGB8 if it didn’t support RGB565 rendering.
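
In terms of the hypothetical query from my first post (with a made-up description name here):

// "colortex565" would be a buffer2d description with format = RGB565
GLenum fmt = getBufferInternalFormat( "colortex565" );

// on hardware that can't render to RGB565, the driver is free to answer
// RGB8 here, and the app then allocates its texture with that format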

Thanks -
Cass

The above describes a couple of renderable texture types, and states that textures of these types both need to be renderable as the color buffer in a FRAMEBUFFER-style render target that contains its own built-in 24-bit depth buffer.
So that was describing 2 textures that are sharing a depth buffer?

framebuffer main_rtt? This part seems to be saying it’s a single entity.

And also, we don’t have much control over the framebuffer from here. That exists in the wgl/glx realm.
EXT_render_target should be more about render to texture than a generalization over all buffer types.

So what happened?

Not two textures. Two types of textures (one type is RGBA8 and the other type is RGBA16F).

“main_rtt” is the name of a framebuffer configuration described above. You could gen as many render targets as you like, but telling it to use the configuration of “main_rtt” means that the render target contains its own depth buffer, and it can be used for rendering to either RGBA8 or RGBA16F textures.

I’m not sure what you mean in your last sentence. EXT_render_target is mostly about rendering to texture, but its framework is open enough to support rendering to other things in the future.

Thanks -
Cass

OK, it’s a template that can be used to create multiple targets.

either RGBA8 or RGBA16F textures.
I don’t see the use of this, since EXT_render_target suggests that we can create as many “color” buffers as we like and as many “depth” buffers as we like, combine them (almost) however we like, and render to them.

In the spirit of the example:

create an RGBA8 texture
create an RGBA16F texture
create a depth24 texture

and have fun…

If the driver wants to group these 3, then no problem there.

Do you have an example that would lead to poor performance or that would not be supported by the hw?

I was thinking this was some kind of plan to kill wgl and allow us to create a GL context without a framebuffer, and then use EXT_render_target to specify a framebuffer format. Never mind.

Hi V-man,

My example was probably not the best, but that’s because the situation would not come up very often.

If the color formats that were chosen were RGB565 and RGB8, then there exists a class of hardware that can’t render to both of these formats with the same depth buffer (32bpp color with 32bpp depth/stencil and 16bpp color with 16bpp depth are the only valid combinations). On such hardware, you would like the driver to tell you “really use RGB8” for the RGB565 texture.

I don’t want to make too big a deal of this, but for some hardware with non-orthogonalities, and for some apps that want to combine buffers in unusual ways, this would help them stay on the fast path. Otherwise, they might not have enough direct information to do so. Most apps won’t need to do this; however, most apps may wish to use RTs that have “built-in” buffers for depth/stencil/accum, etc.

Thanks -
Cass

That’s the part I don’t understand. I know that various combinations of render targets may force the driver to create a temporary framebuffer and then copy the texture when needed. However, the thing I don’t understand is why the driver has to be told what to do.

Drivers are constantly doing this kind of stuff behind our backs. If hardware doesn’t support RGB565 at all, then the driver just internally promotes it to RGB8. It seems pretty reasonable to me that the drivers should simply do the right thing behind our backs. As long as there is only one right answer, there is no need for us to have to inform the driver about the choice.

Perhaps provide a set of hints instead?

Originally posted by Korval:
That’s the part I don’t understand. I know that various combinations of render targets may force the driver to create a temporary framebuffer and then copy the texture when needed. However, the thing I don’t understand is why the driver has to be told what to do.

Drivers are constantly doing this kind of stuff behind our backs. If hardware doesn’t support RGB565 at all, then the driver just internally promotes it to RGB8. It seems pretty reasonable to me that the drivers should simply do the right thing behind our backs. As long as there is only one right answer, there is no need for us to have to inform the driver about the choice.
I don’t disagree with your comment, and originally, we just said that the driver may change the internal format.

However, there were some criticisms raised by other driver implementors we talked with. In the GL, the internal format of a texture can’t change once it is established. There was concern that introducing a variance like this would be a Bad Idea.

Rather than argue about it (and deal with the corresponding delay), we just took that language out and said that rendering to some internal formats may be less efficient due to copying.

Would you have preferred the ‘variance’ solution?

Thanks for the feedback, Korval. This is really helpful.

Cass