Subset of blending modes for 32-bit integer render targets

Having partial support for some of the blending functionality would make certain algorithms easier to implement. Logic operations work for 32-bit integers; however, they don't support useful things like add, min, or max.

So having support for the {add, subtract, min, max, reverse subtract} blend equations with the {one, zero, source (color/alpha), destination (color/alpha), constant (color/alpha)} blend functions would make things orthogonal.
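To make the request concrete, here is a rough sketch (C, assuming a generic GL loader) of what that would let you write; the R32UI target and the additive setup are just illustrative, and under the current spec the blend state below is simply ignored for integer color buffers:

```c
#include <GL/glew.h>   /* assumption: any loader exposing GL 3.x entry points */

/* Hypothetical sketch: set up additive "blending" into an unsigned-integer
 * render target, e.g. for overdraw counting.  Current OpenGL bypasses the
 * blend stage for integer color buffers, so these state settings have no
 * effect today; this is what the requested feature would make meaningful. */
static GLuint setup_overdraw_counter(GLsizei width, GLsizei height)
{
    GLuint tex, fbo;

    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_R32UI, width, height, 0,
                 GL_RED_INTEGER, GL_UNSIGNED_INT, NULL);

    glGenFramebuffers(1, &fbo);
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                           GL_TEXTURE_2D, tex, 0);

    /* Desired behaviour: every fragment adds its output (uvec4(1)) to the target. */
    glEnable(GL_BLEND);
    glBlendEquation(GL_FUNC_ADD);      /* or GL_MIN, GL_MAX, GL_FUNC_SUBTRACT, ... */
    glBlendFunc(GL_ONE, GL_ONE);

    return fbo;
}
```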

I bumped into this while wanting to count overdraw into an integer render target, but it didn't work :frowning:

I agree that having 1-src and 1-dest might be less meaningful.

For counting overdraw you can use the stencil buffer.
Anyway, I agree that your idea would be a nice feature to have.
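For reference, a minimal sketch of the stencil-based approach, assuming an 8-bit stencil attachment (counts clamp at 255 with GL_INCR); names are illustrative:

```c
#include <GL/glew.h>

/* Sketch: count overdraw in the stencil buffer instead of a color target.
 * Assumes the framebuffer has an 8-bit stencil attachment. */
static void draw_with_overdraw_count(void)
{
    glClearStencil(0);
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT | GL_STENCIL_BUFFER_BIT);

    glEnable(GL_STENCIL_TEST);
    glStencilFunc(GL_ALWAYS, 0, 0xFF);        /* every fragment passes the stencil test */
    glStencilOp(GL_KEEP, GL_INCR, GL_INCR);   /* increment on depth fail and on pass    */

    /* ... draw the scene ... */

    glDisable(GL_STENCIL_TEST);
}

/* Reading the counts back for inspection on the CPU: */
static void read_overdraw(GLint x, GLint y, GLsizei w, GLsizei h, GLubyte *out)
{
    glReadPixels(x, y, w, h, GL_STENCIL_INDEX, GL_UNSIGNED_BYTE, out);
}
```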

I can, but I cannot access the stencil value in the shader by reading from a depth-stencil texture; that only gives me depth, and the stencil is implicitly set to zero.

How about being able to specify an N in place of a 1 in those equations?

Nothing inherently FP about those equations that I can see.

Sigh, I wonder why OpenGL doesn’t allow changing the texture format to some other one which has the same number of bits per pixel. It really is a sampler state and should be treated as such. The driver internally changes the format when it needs to copy, e.g., a depth-stencil texture, and it’s quite a common thing (it’s actually the only way to “detile” a texture on ATI R500).

OK, granted, you can do something like that in OpenGL using a PBO. Just copy your depth-stencil texture into a pixel buffer object and use it as the source for another texture with a format of your choice. If it doesn’t work, report a bug to your vendor; they should fix it, considering that even open source drivers can handle that with ease.
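Something along these lines, assuming a DEPTH24_STENCIL8 texture; which channel of the resulting RGBA8 texture ends up holding the stencil byte is exactly the kind of thing that isn’t pinned down (see below):

```c
#include <GL/glew.h>

/* Sketch of the suggested PBO round-trip: read a DEPTH24_STENCIL8 texture
 * into a pixel buffer object, then re-upload the same bytes as RGBA8. */
static GLuint reinterpret_depth_stencil(GLuint depthStencilTex,
                                        GLsizei width, GLsizei height)
{
    GLuint pbo, rgbaTex;
    GLsizeiptr size = (GLsizeiptr)width * height * 4;   /* 32 bits per pixel */

    /* 1. Pack the depth-stencil texels into a PBO. */
    glGenBuffers(1, &pbo);
    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
    glBufferData(GL_PIXEL_PACK_BUFFER, size, NULL, GL_STREAM_COPY);

    glBindTexture(GL_TEXTURE_2D, depthStencilTex);
    glGetTexImage(GL_TEXTURE_2D, 0, GL_DEPTH_STENCIL,
                  GL_UNSIGNED_INT_24_8, (void *)0);      /* offset into the PBO */
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);

    /* 2. Re-use the same bytes as the source of an RGBA8 texture. */
    glGenTextures(1, &rgbaTex);
    glBindTexture(GL_TEXTURE_2D, rgbaTex);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, width, height, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, (void *)0);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);

    glDeleteBuffers(1, &pbo);
    return rgbaTex;
}
```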

I wonder why OpenGL doesn’t allow changing the texture format to some other one which has the same number of bits per pixel.

Because that would imply that the actual format of the texture is the one you asked for, which is not guaranteed by the spec.

For example, an implementation is allowed to make an RGB8 format take up 32 bits of space per pixel.

Furthermore, even if you could, what data would you get back? The internal format doesn’t say how the 24 bits of a depth value are stored (in terms of endianness, for example). If you convert a DEPTH24_STENCIL8 texture into an RGBA8, what do you get for G? Which byte of the 24 bits of depth do you get? Will it work the same on all hardware?

It’s far more reasonable to simply say that, if they’re going to let people write to the stencil buffer in a shader, it stands to reason that they should also be able to read stencil data from textures.

it’s actually the only way to “detile” a texture on ATI R500

What is “detiling”?

Probably meant converting from a tiled format to a linear format, usually for CPU access; something you need to do on ReadPixels, for instance.

The spec can be extended to address that, and endianness shouldn’t be a problem, as the hardware I know of supports bit swapping for both pixel reads and writes. The internal format shouldn’t matter, since it’s internal.

Yes, this feature would be quite handy in conjunction with GL_AMD_shader_stencil_export.
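For context, the shader side of that extension is tiny. A sketch, with the GLSL embedded as a C string ready to hand to glShaderSource(); the version and values are illustrative:

```c
/* Sketch: a fragment shader using GL_AMD_shader_stencil_export writes the
 * per-fragment stencil reference value through the built-in output
 * gl_FragStencilRefAMD. */
static const char *stencil_export_fs =
    "#version 150\n"
    "#extension GL_AMD_shader_stencil_export : require\n"
    "out vec4 color;\n"
    "void main()\n"
    "{\n"
    "    gl_FragStencilRefAMD = 7;   // per-fragment stencil value\n"
    "    color = vec4(0.0);\n"
    "}\n";
```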

Pierre is right. Usually you want textures to be tiled for better cache utilization, but they’re specified in a linear form in OpenGL. So, since hardware supports both linear and tiled textures, a driver developer can use the “blit” operation (implemented using a textured quad and a pass-through vertex and pixel shader) to copy between linear and tiled textures. This blit is part of every texture transfer in OpenGL, i.e. TexImage, GetTexImage, ReadPixels, and maybe even DrawPixels. In my opinion, this is the reason textures must be copied into PBOs (which is effectively the blit) instead of obtaining the buffer backing the texture directly.

The spec can be extended to address that, and endianness shouldn’t be a problem, as the hardware I know of supports bit swapping for both pixel reads and writes. The internal format shouldn’t matter, since it’s internal.

My point is that it breaks the abstraction and limits what IHVs can actually do with their hardware.

The whole point of pretending that a DEPTH_STENCIL texture is an RGBA8 texture is to make accessing it faster. If performance didn’t matter, you would simply copy the texel data yourself. Therefore, you would only get a performance gain if the hardware were capable of using one internal format as another.

Do you know if all hardware can do that for all formats? Would there be format combinations it can’t handle in a performant manner?

No; it’s simply not worthwhile to bother with for something that should have been specified correctly (getting stencil data in a shader) to begin with.

Usually you want textures to be tiled for better cache utilization, but they’re specified in a linear form in OpenGL. So, since hardware supports both linear and tiled textures, a driver developer can use the “blit” operation (implemented using a textured quad and a pass-through vertex and pixel shader) to copy between linear and tiled textures. This blit is part of every texture transfer in OpenGL, i.e. TexImage, GetTexImage, ReadPixels, and maybe even DrawPixels. In my opinion, this is the reason textures must be copied into PBOs (which is effectively the blit) instead of obtaining the buffer backing the texture directly.

The reason you can’t map a texture is that the format of that texture, the memory layout and arrangement of pixels, is implementation-dependent. Is it tiled? Would it not be tiled for cubemaps or rectangle textures? What tiling scheme is used? Implementations are only free to choose these things if the user can’t directly access the data.

Buffer objects are unformatted memory; as such, the user can directly access and modify them. OpenGL guarantees that the bytes you set will still be set exactly as you set them, until you perform some operation that changes these bytes.
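A small illustration of that guarantee; this is only a sketch, and the usage hint and names are arbitrary:

```c
#include <string.h>
#include <GL/glew.h>

/* Illustration: buffer objects are unformatted memory, so the bytes you
 * store come back exactly as written (until some GL operation changes
 * them).  No equivalent mapping path exists for textures. */
static void buffer_bytes_roundtrip(void)
{
    const unsigned char src[4] = { 0xDE, 0xAD, 0xBE, 0xEF };
    unsigned char dst[4] = { 0 };
    GLuint buf;

    glGenBuffers(1, &buf);
    glBindBuffer(GL_ARRAY_BUFFER, buf);
    glBufferData(GL_ARRAY_BUFFER, sizeof src, src, GL_STATIC_DRAW);

    /* Map and read the bytes back; they are bit-for-bit what we uploaded. */
    void *p = glMapBuffer(GL_ARRAY_BUFFER, GL_READ_ONLY);
    memcpy(dst, p, sizeof dst);
    glUnmapBuffer(GL_ARRAY_BUFFER);

    glDeleteBuffers(1, &buf);
}
```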

And I seriously doubt they’re doing a blit operation that involves actual shaders. A simple DMA that swizzles the texture in transit would work just fine and also not interfere with current rendering.

None of the D3D9-era ATI GPUs have this feature; not sure about the newer ones. NVIDIA’s G80 seems to have it, at least its hardware interface suggests so, but you never know for sure…