Custom Bilinear filtering w textureGather problem

menzel
03-07-2012, 06:12 AM
Hello,

I'm trying to implement bilinear texture filtering as it's done with GL_LINEAR. I fetch the needed texels with textureGather, calculate the blending factors from the texture coordinate, and blend the texels according to the formulas in the GL spec (at least, that was my intention).

The problem is that I get strange artifacts with blend values near 1.0. Playing around with this, I found out that I have to shift the texture coordinates by a small offset to exactly mimic the hardware filtering of my GeForce GTX 580. However, I can't figure out _why_ this magic offset of 1/512 has to be used (it is the same for all power-of-two texture sizes).

This is my function:

vec4 textureBilinear( in sampler2D tex, in vec2 coord, in float useOffset )
{
    // get texture size in pixels:
    vec2 colorTextureSize = vec2( textureSize(tex, 0) );

    // gather one channel at a time from all four surrounding texels:
    vec4 red   = textureGather( tex, coord, 0 );
    vec4 green = textureGather( tex, coord, 1 );
    vec4 blue  = textureGather( tex, coord, 2 );
    vec4 alpha = textureGather( tex, coord, 3 );

    // reassemble the four texels (gather order: x = (i0,j1), y = (i1,j1), z = (i1,j0), w = (i0,j0)):
    vec4 c01 = vec4( red.x, green.x, blue.x, alpha.x );
    vec4 c11 = vec4( red.y, green.y, blue.y, alpha.y );
    vec4 c10 = vec4( red.z, green.z, blue.z, alpha.z );
    vec4 c00 = vec4( red.w, green.w, blue.w, alpha.w );

    // calculate the sub-pixel texture coordinate:
    float strangeOffset = useOffset * 1.0/512.0; // = 0.00195313;
    vec2 filterWeight = fract( coord*colorTextureSize - 0.5 + strangeOffset );

    // bi-linear mixing:
    vec4 temp0 = mix( c01, c11, filterWeight.x );
    vec4 temp1 = mix( c00, c10, filterWeight.x );
    return mix( temp1, temp0, filterWeight.y );
}

Attached is an image with four panels. Upper left: the result of textureBilinear(mySampler, coord, 0.0), i.e. without the magic offset. Lower left: the same with the offset. Upper right: the difference between the function without the offset and the hardware texture lookup (values amplified to make the artifacts visible). Lower right: the same difference with the offset.

Everything looks fine with the offset, but I can't figure out why I need it (did I overlook some part of the spec?). I can't just hack a magic number into my code: if it's an NVIDIA problem, I would get artifacts on proper implementations. And if the offset in fact depends on some texture properties, the code would only work with my set of test textures...

In the long run I want to implement a special variant of texture filtering where the problem with this offset also applies, so 'just use the hardware filter' is sadly not an option.

Any ideas? Thanks.

kyle_
03-07-2012, 11:49 AM

menzel
03-07-2012, 12:29 PM
So four texelFetch calls and no swizzling. That should also work, and it gives me a bit more control over the process.

textureGather sounded like the natural solution, and in a scenario where I only need to filter two or three channels I need fewer textureGather operations than texelFetch operations (but maybe the other texels are in the cache after the first fetch anyway).

I'll try the texelFetch idea and see what it does to the image quality and performance.

But I would still like to know what the underlying problem is (curiosity). I remembered that the GeForce filters with fixed-point numbers of 8-bit precision for the blending parameters (filterWeight here, alpha/beta in some other sources). You can provoke situations where hardware filtering gets blocky while doing it on your own in a shader stays smooth. The 1/512 offset is exactly half the difference between two representable numbers in NVIDIA's hardware filtering. So I added 1/512 of a texel width to the texture coordinate I fed into the textureGather function and got the same result as the hardware. It must be related to that, and applying the offset directly to the coordinate to gather from seems more correct.
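One way to picture the fixed-point explanation above is a small CPU-side model in C. It is only a sketch under an assumption the thread does not confirm: that the hardware snaps the texel-space coordinate to the nearest 1/256 before choosing which 2x2 footprint textureGather returns. The function names (ifloor, hw_footprint, implied_footprint, shader_weight) are made up for illustration.

```c
#include <assert.h>

/* floor() for doubles in int range, written out so the sketch needs no libm. */
static int ifloor(double x) {
    int i = (int)x;
    return (x < (double)i) ? i - 1 : i;
}

/* ASSUMED hardware model: snap u (texel space, i.e. coord*size - 0.5) to the
   nearest multiple of 1/2^frac_bits, then use the integer part as the index
   of the lower texel of the footprint along this axis. */
static int hw_footprint(double u, int frac_bits) {
    assert(frac_bits >= 0 && frac_bits < 24);
    double scale = (double)(1 << frac_bits);
    int fixed = ifloor(u * scale + 0.5);      /* round to the nearest step */
    return ifloor((double)fixed / scale);
}

/* Footprint that the shader's weight silently assumes: floor(u + offset). */
static int implied_footprint(double u, double offset) {
    return ifloor(u + offset);
}

/* Weight as computed in the shader: fract(u + offset). */
static double shader_weight(double u, double offset) {
    double v = u + offset;
    return v - (double)ifloor(v);
}
```

In this model, for u = 3.999 and 8 fractional bits, hw_footprint gives texel 4 while the unshifted shader weight still belongs to texel 3 (weight close to 1.0 applied to the wrong texel pair). With an offset of 1/512 both switch to the next footprint at the same coordinate, which would match the artifacts near blend values of 1.0 disappearing.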

Still, if anyone has a deeper understanding of what's going on, I would appreciate some more details.

aqnuep
03-07-2012, 08:38 PM
Why not bind the same texture object with two different sampler objects instead? That way one texture unit can use bilinear filtering while the other uses nearest filtering:

// bind your texture to texture unit #0
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, myTexture);

// bind your bilinear sampler to texture unit #0
glBindSampler(0, bilinearSampler);

// bind your texture to texture unit #1 too
glActiveTexture(GL_TEXTURE1);
glBindTexture(GL_TEXTURE_2D, myTexture);

// bind your nearest sampler to texture unit #1
glBindSampler(1, nearestSampler);

menzel
03-08-2012, 12:16 AM
aqnuep: I actually did that while debugging. As I wrote: in the long run I want a specific variant of bilinear filtering that matches my use case (the hardware filter doesn't), but first I wanted to understand how to replicate the result of the hardware filter. This gives me a rough measurement of the performance penalty and helps me understand the textureGather logic better.

Dark Photon
03-08-2012, 04:18 AM
I'm trying to implement bilinear texture filtering as it's done with GL_LINEAR.
...with anisotropy = 1.0.

That's something I discovered recently. Anisotropy > 1 combined with uneven texture derivatives (looking at the surface at an angle) will cause GL_LINEAR texture filtering to integrate over a larger area of the texture map, even without mipmaps.

menzel
03-08-2012, 04:41 AM
Dark Photon: yes, I know. Even when filtering with GL_NEAREST, anisotropic filtering will sample multiple times. I don't see a use case, but it's actually what you would expect when switching AF on.

I have now implemented my filter with texelFetch. Ignoring texture wrapping, I get the same performance when I need RGBA. That will work for my use case.

Unless anyone has deeper insight into the 1/512 offset and can confirm that it's related to the 8-bit filtering precision of Fermi chips, or can provide a better explanation, consider this problem solved.
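For readers who want to check the four-fetch approach without a GL context, here is a CPU-side reference sketch in C. It follows the usual GL_LINEAR formulas (u = s*width - 0.5, lower texel floor(u), weight frac(u)) and, like the filter described above, ignores wrapping. The Image type and the fetch and bilinear names are hypothetical, and the image is single-channel to keep it short; it is not the shader actually used in this thread.

```c
#include <assert.h>

/* Hypothetical single-channel image, row-major, width*height floats. */
typedef struct {
    int width, height;
    const float *texels;
} Image;

/* floor() for floats in int range, written out so the sketch needs no libm. */
static int ifloorf(float x) {
    int i = (int)x;
    return (x < (float)i) ? i - 1 : i;
}

/* Stand-in for texelFetch: integer coordinates, no filtering, no wrapping. */
static float fetch(const Image *img, int x, int y) {
    assert(x >= 0 && x < img->width && y >= 0 && y < img->height);
    return img->texels[y * img->width + x];
}

/* Bilinear lookup at normalized coordinates (s, t) built from four fetches.
   The whole 2x2 footprint must lie inside the image (no wrap, no clamp). */
static float bilinear(const Image *img, float s, float t) {
    float u = s * (float)img->width  - 0.5f;
    float v = t * (float)img->height - 0.5f;
    int i0 = ifloorf(u);
    int j0 = ifloorf(v);
    float a = u - (float)i0;   /* horizontal weight */
    float b = v - (float)j0;   /* vertical weight   */

    float row0 = (1.0f - a) * fetch(img, i0, j0)     + a * fetch(img, i0 + 1, j0);
    float row1 = (1.0f - a) * fetch(img, i0, j0 + 1) + a * fetch(img, i0 + 1, j0 + 1);
    return (1.0f - b) * row0 + b * row1;
}
```

With a 2x2 image holding 0, 1, 2, 3, a lookup at (0.5, 0.5) lands exactly between the four texel centers and returns their average, 1.5.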

Thanks all.

kyle_
03-08-2012, 10:23 AM
ignoring texture wrapping I get the same performance when I need RGBA. That will work for my use case.

Care to share info about the perf. impact for smaller (component-wise) formats?

tksuoran
03-09-2012, 02:49 AM
In my case AF is not enabled, I have an AMD card, and the offset is 1/8192 instead. Could this be related?
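If the offset really is half of one fixed-point weight step, as suggested earlier in the thread, the arithmetic relating the two reported values can be sketched in C as below. That 1/8192 corresponds to 12 fractional bits on AMD hardware is only what this arithmetic would imply, not something confirmed here; the function names are hypothetical.

```c
#include <assert.h>

/* Half of one step of a fixed-point weight with frac_bits fractional bits:
   the step is 1/2^frac_bits, so half a step is 1/2^(frac_bits + 1). */
static double half_step(int frac_bits) {
    assert(frac_bits >= 0 && frac_bits < 30);
    return 1.0 / (double)(1L << (frac_bits + 1));
}

/* Inverse: number of fractional bits whose half step equals the offset.
   Only meaningful when offset is an exact power of two in (0, 0.5]. */
static int frac_bits_for_half_step(double offset) {
    assert(offset > 0.0 && offset <= 0.5);
    int bits = -1;
    while (offset < 1.0) {   /* count doublings needed to reach 1.0 */
        offset *= 2.0;
        bits++;
    }
    return bits;
}
```

Under that reading, 1/512 corresponds to 8 fractional bits and 1/8192 to 12.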