Intel driver texture lookup limitation

maxijac · February 6, 2011, 4:39am

Hi!

I’m currently working on shaders through Cg.
I’m on an old Intel i915 card that seems to support only ARB programs.
My fragment shader implements a filtering algorithm on a texture, which means it performs several texture lookups. My driver (open-source Mesa 7.10) does not like that and reports that I used 5 indirect texture lookups out of a maximum of 4.
Indirect? I googled it, and it seems to mean taking the coordinates for a texture lookup from the result of another lookup. That’s not what I’m doing, is it?

I’m doing texRECT(texture, float2(coords.x+1, coords.y+1)) and so on (I actually use a 3x3 filter).

I tried binding the texture as TEXUNIT0 instead of passing it as a uniform, but I get the same result.

So why is that considered an indirect lookup?

Thank you for your attention.

Rosario_Leonardi · February 6, 2011, 6:03am

From mesa/src/gallium/driver/i915/i915_fpc_translate.c


static void
i915_fini_compile(struct i915_context *i915, struct i915_fp_compile *p)
{
   struct i915_fragment_shader *ifs = p->shader;
   unsigned long program_size = (unsigned long) (p->csr - p->program);
   unsigned long decl_size = (unsigned long) (p->decl - p->declarations);

   if (p->nr_tex_indirect > I915_MAX_TEX_INDIRECT)
      i915_program_error(p, "Exceeded max nr indirect texture lookups");

And i915_reg.h defines I915_MAX_TEX_INDIRECT as 4.

** Ohh… I love opensource **

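Hardware has limits on the number of instructions, the number of texture accesses, and the number of indirect texture accesses (lookups whose coordinates are computed in the shader, for example by adding an offset). For ARB fragment programs you can query these limits with

glGetProgramivARB(GL_FRAGMENT_PROGRAM_ARB, GL_MAX_PROGRAM_TEX_INSTRUCTIONS_ARB, &value)
glGetProgramivARB(GL_FRAGMENT_PROGRAM_ARB, GL_MAX_PROGRAM_TEX_INDIRECTIONS_ARB, &value)

So, sorry, your hardware can’t do that. Try another technique (multiple passes or lower quality).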

maxijac · February 6, 2011, 6:21am

OK, so accessing with an offset is considered indirect. Too bad

Thanks a lot.

ZbuffeR · February 6, 2011, 9:07am

Use ‘multitexturing’ with the same texture but slightly offset texcoords, so that each texRECT uses its own texcoord directly, without any indirection. It may even end up faster.

arekkusu · February 7, 2011, 8:28pm

Really the limitation is on the number of indirection phases, not total indirections. The shader compiler should re-order the instructions so that all of the temporary coordinates are computed in a batch, and then used in a batch, to minimize the number of phases.
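To make the phase idea concrete, here is a minimal sketch in C of how phases could be counted. It assumes a deliberately simplified rule: a texture instruction whose coordinate is a temporary already written in the current phase starts a new phase. The Instr type, count_phases, and the three sample programs are hypothetical names made up for illustration; this is not Mesa's code and not necessarily how the i915 driver counts.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TEMPS 32
#define SRC_IS_INPUT (-1)  /* operand comes straight from an interpolated input */

typedef enum { OP_ALU, OP_TEX } OpKind;

typedef struct {
    OpKind kind;
    int dst;  /* temporary written by the instruction */
    int src;  /* temporary read, or SRC_IS_INPUT */
} Instr;

/* Simplified model (an assumption, not the driver's logic): a TEX whose
 * coordinate is a temporary already written in the current phase starts
 * a new phase, and the set of written temporaries is then cleared. */
int count_phases(const Instr *prog, size_t n)
{
    unsigned char written[MAX_TEMPS] = {0};
    int phases = 1;  /* a program without dependent lookups has one phase */

    for (size_t i = 0; i < n; i++) {
        const Instr *in = &prog[i];
        assert(in->dst >= 0 && in->dst < MAX_TEMPS);
        assert(in->src >= SRC_IS_INPUT && in->src < MAX_TEMPS);

        if (in->kind == OP_TEX && in->src != SRC_IS_INPUT && written[in->src]) {
            phases++;
            memset(written, 0, sizeof written);
        }
        written[in->dst] = 1;
    }
    return phases;
}

/* Three taps, each coordinate computed right before its lookup. */
const Instr interleaved[] = {
    {OP_ALU, 0, SRC_IS_INPUT}, {OP_TEX, 3, 0},
    {OP_ALU, 1, SRC_IS_INPUT}, {OP_TEX, 4, 1},
    {OP_ALU, 2, SRC_IS_INPUT}, {OP_TEX, 5, 2},
};

/* Same taps, all coordinates computed first, then all lookups. */
const Instr batched[] = {
    {OP_ALU, 0, SRC_IS_INPUT}, {OP_ALU, 1, SRC_IS_INPUT}, {OP_ALU, 2, SRC_IS_INPUT},
    {OP_TEX, 3, 0}, {OP_TEX, 4, 1}, {OP_TEX, 5, 2},
};

/* Lookups that use interpolated coordinates directly. */
const Instr direct[] = {
    {OP_TEX, 3, SRC_IS_INPUT}, {OP_TEX, 4, SRC_IS_INPUT}, {OP_TEX, 5, SRC_IS_INPUT},
};
```

Under this model the interleaved ordering costs one extra phase per tap, the batched ordering costs a single extra phase no matter how many taps there are, and lookups that read interpolated coordinates directly stay in the first phase.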

kRogue · February 8, 2011, 1:06pm

Adding to what ZbuffeR is saying:

If the above is really what appears in your shader, create an additional varying for each texture coordinate:

VertexShader (GLSL, you’ll need to convert to the correct Cg code):


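varying vec2 texcoord0, texcoord1, texcoord2, texcoord3, texcoord4; // ... one per tap
// Offsets to the neighbouring texels; supply them as uniforms or constants.
uniform vec2 offset_constant1, offset_constant2, offset_constant3, offset_constant4;

void
main(void)
{
  gl_Position = ftransform();
  texcoord0 = gl_MultiTexCoord0.xy; // or whatever your base coordinate is
  texcoord1 = texcoord0 + offset_constant1;
  texcoord2 = texcoord0 + offset_constant2;
  texcoord3 = texcoord0 + offset_constant3;
  texcoord4 = texcoord0 + offset_constant4;
  // ... and so on for the remaining taps
}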

FragmentShader:


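#extension GL_ARB_texture_rectangle : enable

varying vec2 texcoord0, texcoord1, texcoord2; // ... one per tap
uniform sampler2DRect rect_tex;

void
main(void)
{
   vec4 tex0, tex1, tex2; // ... one per tap

   // Each lookup reads an interpolated coordinate directly, so none is dependent.
   tex0 = texture2DRect(rect_tex, texcoord0);
   tex1 = texture2DRect(rect_tex, texcoord1);
   tex2 = texture2DRect(rect_tex, texcoord2);
   // ... remaining taps, then combine the samples into gl_FragColor
}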

You are almost guaranteed that this will run faster (letting the hardware do the interpolation is typically cheaper than doing an extra operation in the fragment shader).
