Custom filtering (max instead of average)

Hi everyone, for a very specific application I need to do a custom filtering operation on textures.

I have vertical lines of non-constant height that are displayed in 2D on the screen, possibly at a different height. The lines use a texture, and when minifying I want this texture filtered with what could be called GL_MAX instead of GL_LINEAR (which performs an average, and that is precisely what I don’t want), and with GL_NEAREST when magnifying. GL_NEAREST is no problem, but I don’t know how to do the MAX filter. The texture is in a float alpha format and is created dynamically from distant data each frame. I’m not using any mipmaps at the moment, and I would prefer to keep it that way (it avoids generating mipmaps each frame).

I run on a 7800 GT, so I have access to SM3.0 and all. The only thing I can think of right now is to use a shader with multiple texture fetches to do that max operation, but the problem is that the height may be subject to change dynamically.

Is there any kind of branching that would let me avoid developing a custom shader generator? What’s more, I need maximum speed…

Thanks for your thoughts,

SeskaPeel.

The derivative instructions (dFdx/dFdy in GLSL) ought to help in this case. I’m not certain that computing the mipmaps would be slower, for the following reasons (which are all really the same thing):

  1. simpler texture fetch (NEAREST_MIPMAP_NEAREST)
  2. completely regular resampling (constant ds/dx)
  3. probably don’t need to compute all mip levels
  4. no aliasing, coherent texture accesses

I guess it depends on the number of texels vs. the number of texture fetches. Put another way: is it worthwhile to precompute the filter?
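To make the trade-off concrete, here is a rough CPU-side sketch in Python (not shader code) of what the derivative tells you. It assumes a 1D resampling along the line, and the function names are made up for illustration. The footprint is the number of source texels that fall under one destination pixel, and the mip level is the first one whose texels are at least that large.

```python
import math

def footprint_texels(src_height, dst_height):
    """Source texels covered by one destination pixel (the constant ds/dx)."""
    return src_height / dst_height

def mip_level_for(src_height, dst_height):
    """First mip level whose texels span at least one pixel footprint."""
    ratio = footprint_texels(src_height, dst_height)
    if ratio <= 1.0:
        return 0  # magnification: stay on the base level, NEAREST
    return math.ceil(math.log2(ratio))

print(footprint_texels(1024, 256), mip_level_for(1024, 256))
```

Note that a footprint can still straddle two texels of the chosen level, so a conservative lookup would take the max of both neighbours rather than a single nearest fetch.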

-Won

I have never looked at these derivatives.

For the mipmaps, yes, not all of them would have to be computed. But I’ll have to compute them in software (no SGIS_generate_mipmap), to preserve the MAX across the mip levels. This is a solution, but I fear it could be costly for the CPU, and the aim of this application is to put as much computation as possible on the GPU (the goal is 20% CPU usage).

What’s more, these vertical lines are rendered in blocks (1 to 10 lines or so are packed into a texture). So I’ll have to enable anisotropy so that no horizontal filtering is applied, but this should be OK.

And when I said the destination height was not constant, I meant that it can change at any time during the application’s lifetime, and the rendering has to adjust to that on the fly. The source lines can change too, but this is not expected to be frequent.

Any more help with this?

SeskaPeel.

the max operation needed during mipmap creation is very cheap, as you sample exactly 4 pixels for every output pixel, on an integer grid. the complexity compared to arbitrary scaling is even lower, as you calculate the mip levels recursively…

the transform on the GPU, although heavily accelerated, is complex because the number of samples taken may vary heavily (unless you have exact margins, e.g. allow at most one to two times magnification). without mipmaps, in fact, there is NO guarantee the fragment shader can account for all sampled texels (unless its program is very complicated and uses long conditional loops).

so mipmaps give more safety and are not that bad in terms of cpu cost, i think, because mipmap generation is not an arbitrary “scaling” but a very simple one.
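a minimal sketch of that recursion in Python, as a CPU reference only. it assumes power-of-two dimensions and a texture stored as a list of rows; the names max_downsample and build_max_mips are invented for this example.

```python
def max_downsample(level):
    """Halve a 2D grid (list of rows), keeping the max of each 2x2 block."""
    h, w = len(level), len(level[0])
    out = []
    for y in range(0, h, 2):
        y1 = min(y + 1, h - 1)  # clamp so a dimension of 1 still works
        row = []
        for x in range(0, w, 2):
            x1 = min(x + 1, w - 1)
            row.append(max(level[y][x], level[y][x1],
                           level[y1][x], level[y1][x1]))
        out.append(row)
    return out

def build_max_mips(base, max_levels=None):
    """Build the chain recursively: each level is derived from the previous one."""
    chain = [base]
    while len(chain[-1]) > 1 or len(chain[-1][0]) > 1:
        if max_levels is not None and len(chain) > max_levels:
            break  # stop early when only a few levels are needed
        chain.append(max_downsample(chain[-1]))
    return chain

base = [[1, 5, 2, 0],
        [3, 4, 9, 1],
        [0, 0, 7, 8],
        [6, 2, 1, 1]]
for level in build_max_mips(base):
    print(level)
```

each level reads only the previous one, so the total work is about one third more than a single pass over the base level.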

Thanks dronus for your explanations.
The aim is really to save as much CPU as possible, not to get the best frame rate ever (well, a minimum has to be met).

There are no exact margins. There should not be a big ratio between the heights, but it could happen in some rare cases, so the application has to handle all possibilities.

I was thinking of sampling the texture manually in the shader, as many times as needed, with a loop. I’m not experienced at all with SM 3 and branching, so I don’t know how costly it can be, nor whether it is complicated to implement. I’m supposed to render two 1600*1200 screens on a single 7800 GT or 6800 Ultra graphics card. If the shader is too complex it could become a bottleneck, but if it leaves 60 fps and 20% CPU, then it’s a win.

I could compute the mipmaps in hardware (rendering to the screen at half resolution, sampling 4 times, computing the max, then grabbing the result, uploading the mip level, then rendering again, etc.), so I wonder which method would be most efficient. I should point out that I get new data each and every frame, so the mip computation will happen each and every frame too, for the new data.

Oh… and I’m using rectangle textures. I have never used rectangle textures with mipmaps; is there a problem with this combination?
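To pin down what such a loop would have to compute, here is a CPU reference sketch in Python (not shader code) for a single line stored as a 1D list. The function name and the integer span arithmetic are my own assumptions. In a real shader the loop bounds would have to come from a uniform or from the derivatives.

```python
def max_filter_column(src, dst_height):
    """Resample one line: max when minifying, nearest when magnifying."""
    src_height = len(src)
    out = []
    for y in range(dst_height):
        if dst_height >= src_height:
            # Magnification: nearest texel under the pixel centre.
            t = int((y + 0.5) * src_height / dst_height)
            out.append(src[min(t, src_height - 1)])
        else:
            # Minification: max over every texel the pixel overlaps.
            first = (y * src_height) // dst_height
            last = -(-((y + 1) * src_height) // dst_height)  # ceiling division
            out.append(max(src[first:last]))
    return out

line = [0, 1, 9, 2, 3, 4, 0, 7]
print(max_filter_column(line, 4))
print(max_filter_column(line, 3))
print(max_filter_column([5, 6], 4))
```

The number of texels read per output pixel grows with the height ratio, which is why the cost of the looped version depends on how extreme the minification gets.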

Thanks for your input,
SeskaPeel.