Increased number of texture color channels

While working on some full scene lighting code, it occurred to me that there were some things that could be sped up.

Right now, I have 4 different textures, each the same size as the view. One has the diffuse color of all world objects in it. One has normals. Another has the color of 1 light, with occlusion factored in. A fourth has a specular factor in the red channel.

Multiple render targets are used to construct the images (except the light image) at the beginning of the rendering pass, and then all the textures are sampled (per light) at the end of the pass.
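For concreteness, the construction pass’s fragment shader looks roughly like this (just a sketch; the varying and uniform names are placeholders, GLSL 1.20 with gl_FragData):

// Construction pass: one write per attached render target.
varying vec2 v_texCoord;
varying vec3 v_normal; // view-space normal
uniform sampler2D u_diffuseMap;
uniform float u_shininess;

void main()
{
    gl_FragData[0] = texture2D(u_diffuseMap, v_texCoord);        // diffuse color
    gl_FragData[1] = vec4(normalize(v_normal) * 0.5 + 0.5, 0.0); // normal, packed to [0,1]
    gl_FragData[2] = vec4(u_shininess, 0.0, 0.0, 0.0);           // specular factor in red
}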

Nothing new. Other people do similar stuff.

It just seems silly that we need to stack so many textures in order to store all the stuff we find useful per pixel.

Instead of storing the data in multiple render targets, it would be nice to just create a texture with an arbitrary number of float channels in it, where people could store whatever they want into any channel. Including depth. Let depth culling be something people can deal with by hand.

For example, what if the rendering target could store, per pixel:

Diffuse Red color
Diffuse Blue color
Diffuse Green color
Normal Vector X
Normal Vector Y
Normal Vector Z
Position Point X
Position Point Y
Position Point Z
Depth from camera
Specular “shininess” factor
Some arbitrary mask percentage
The kitchen sink
My left sock
My right sock

Then, during the light rendering pass(es), we wouldn’t need our fragment shader to access multiple textures.

A single call to texture2D() in the fragment shader would give back a pointer to the saved pixel and all its values.

Performing a trivial reject at the beginning, by comparing depth and calling “discard;” if we want to, would be up to the fragment program.
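In made-up GLSL, the whole light-pass fetch might look like this (vecFoo, my_fat_sampler and u_farPlane are invented; the channel indices follow the wish list above, and none of this exists today):

// Hypothetical: one fetch returns every per-pixel channel.
vecFoo px = texture2D(my_fat_sampler, screenCoord);

// Trivial reject by hand: channel 9 is "depth from camera" in the list above.
if(px[9] >= u_farPlane)
{
    discard; // nothing was drawn here, so skip the lighting math
}

vec3 diffuse = vec3(px[0], px[1], px[2]);
vec3 normal = vec3(px[3], px[4], px[5]);
vec3 position = vec3(px[6], px[7], px[8]);
// ... per-light shading with the unpacked values ...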

A single call to texture2D() in the fragment shader would give back a pointer to the saved pixel and all its values.

First, pointer=bad. Particularly for shaders. The best you could hope for is a pre-defined struct.

Second, what about interpolation? You’ll want to interpolate differently for the different kinds of values. Normals, for example, will need to be renormalized, masks shouldn’t be interpolated at all, etc.

I said pointer since I couldn’t say vec3 or vec4 or any existing defined type. Obviously there would need to be a new array type for something like this.

Yes, the result would have to store an interpolated answer. All channels would be interpolated. The same. Even vector normals. Sure, the interpolated answer would not be mathematically correct, but it would be close enough that the eye wouldn’t notice. If the programmer wanted to be strict about it (and was willing to spend the GPU cycles) he could normalize the interpolated result after he asks for it.
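Something like this, reusing the hypothetical vecFoo type (channels 3-5 holding the normal, per the wish list above):

vecFoo sample = texture2D(my_texture_sampler, SomeCoordinate);
vec3 n = vec3(sample[3], sample[4], sample[5]); // interpolated, so slightly denormalized
n = normalize(n); // spend the cycles to repair it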

As for masks, well, it depends on what kind of mask it is. If you want something like an alpha transparency blend, then the interpolation is what you want anyway. If you are looking for binary masking, then the fragment program can decide how to interpret the interpolated answer. Example:

vecFoo sample = texture2D(my_texture_sampler, SomeCoordinate);

// position 9 in the interpolated pixel sample stores my mask.
// I wrote 1.0 for transparent pixels and 0.0 for masked pixels.

if(sample[9] < 0.5)
{
    // fragment masked out
    discard;
}

// else, keep going and draw the fragment

For an alpha blend, it would be nice to be able to read from the fragment you are about to write to. Then just mix the existing fragment with whatever you come up with, using the alpha value.

Basically, I am saying that depth testing and alpha blending should just be things people actively do in fragment shaders, rather than passively.
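Standard GLSL gives a fragment shader no way to read the destination pixel, but assuming some hypothetical built-in existed for it (call it gl_LastFragData), active alpha blending would be short:

uniform sampler2D u_colorMap;
varying vec2 v_texCoord;

void main()
{
    vec4 src = texture2D(u_colorMap, v_texCoord);
    vec4 dst = gl_LastFragData[0]; // hypothetical read of the pixel about to be overwritten
    gl_FragColor = vec4(mix(dst.rgb, src.rgb, src.a), 1.0); // blend done actively in the shader
}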

What problem is this supposed to solve?
Too many tex instructions in the fragment shader?

Then there is the whole mechanism of texture fetching that would need to be changed, starting from memory into the texture cache, from cache to sampler, and from sampler to register.

The hardware as a whole would need to be changed to allow for textures with an arbitrary (or very high) number of channels, and I don’t think there would be any difference, either in performance or in ease of use.

Now what in my opinion needs to be changed are the texture formats “available” today and the restrictions on RTT (render-to-texture).

For example, we have 8-, 16- and 32-bit single-channel textures (LUMINANCE), but their support is so bad that they are pretty much unusable (nVidia gives you an error when uploading a 32-bit luminance texture through PBOs, since the driver expects an RGBA image, which it then converts into a single-channel image, killing any speed increase from PBOs).

Also, dual-channel formats are pretty much non-existent, and RGB textures are usually stored internally as RGBA, which makes them, in a way, non-existent too.

In my opinion, the available texture formats need to be revised and implemented properly.

And to be able to use them properly, RTT restrictions need to be relaxed. For example, AFAIK even high-end hardware does not allow you to bind textures of different formats to multiple render targets. So either you do several render passes to fill your buffers, or you somehow put everything into RGBA float16 textures (usually the least common denominator that works).

Jan.

I guess texture buffer objects could be used for this purpose.

I had a check, and actually it isn’t as easy as I thought, but it could still work. The issue with your proposal is that it doesn’t fit current hardware.

This seems like deferred shading, so you probably don’t need mipmaps or filtering… A basic texture fetch will do the trick, but data can’t be packed with more than 4 components in a texture buffer object; you will need multiple texelFetchBuffer calls to restore your pixel structure.
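For example, the unpacking might look like this (sketch only; u_width and the four-vec4s-per-pixel layout are assumptions):

#extension GL_EXT_gpu_shader4 : enable

uniform samplerBuffer u_gbuffer; // all per-pixel data in one flat buffer
uniform int u_width; // screen width, to linearize gl_FragCoord
const int VEC4S_PER_PIXEL = 4;

void main()
{
    int base = (int(gl_FragCoord.y) * u_width + int(gl_FragCoord.x)) * VEC4S_PER_PIXEL;
    vec4 diffuse = texelFetchBuffer(u_gbuffer, base + 0);
    vec4 normal = texelFetchBuffer(u_gbuffer, base + 1);
    vec4 position = texelFetchBuffer(u_gbuffer, base + 2);
    vec4 misc = texelFetchBuffer(u_gbuffer, base + 3); // depth, specular, mask...
    // ... lighting math goes here ...
    gl_FragColor = diffuse; // placeholder output
}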

I don’t think that a texture unit is required (but I’m not sure!), so maybe with an extension that updates texture buffer objects we could get a feature making it possible to format a buffer in a shader and get this packed data in a single ‘fetch’.

In OpenGL 3 terms, I think you should distinguish between “images” and “buffers”. Buffers are raw data; that is where you could get your custom structure type. There may be some interesting investigation around render-to-vertex-buffer, but textures aren’t needed to save pixels anyway, so it is possible to read your pixel data from a single buffer. I don’t think it is possible to write to a single buffer, though… Pretty much a useless idea, actually.

You forgot to mention why you think this would improve performance. Cache coherence? Fewer instructions per shader?

Let depth culling be something people can deal with by hand.

Throwing away all the advantages of early Z optimizations, depth compression, and multisampling is not a good idea.