I’m currently working on shaders through Cg.
I’m on an old Intel i915 card that seems to support only ARB programs.
My fragment shader implements a filtering algorithm on a texture, which means it performs several texture lookups. My driver (open-source Mesa 7.10) does not like that and tells me I did 5 out of 4 indirect texture lookups. Indirect? I googled it, and it seems to mean using the result of one texture lookup as the coordinates for another. That’s not what I am doing, is it?
I’m doing texRECT(texture, float2(coords.x+1, coords.y+1)) and so on (actually, I use a 3x3 filter).
I tried passing the texture as TEXUNIT0 instead of as a uniform, but I get the same error.
From mesa/src/gallium/driver/i915/i915_fpc_translate.c
static void
i915_fini_compile(struct i915_context *i915, struct i915_fp_compile *p)
{
   struct i915_fragment_shader *ifs = p->shader;
   unsigned long program_size = (unsigned long) (p->csr - p->program);
   unsigned long decl_size = (unsigned long) (p->decl - p->declarations);
   if (p->nr_tex_indirect > I915_MAX_TEX_INDIRECT)
      i915_program_error(p, "Exceeded max nr indirect texture lookups");
And i915_reg.h defines I915_MAX_TEX_INDIRECT as 4.
** Ohh… I love opensource **
Hardware has limits on the number of instructions, the number of texture accesses, and the number of indirect texture accesses (accesses to the texture with an offset). You can retrieve this number with
Use ‘multitexturing’ with the same texture but slightly offset texcoords, so that each texRECT uses its own texcoord directly, without any indirection. It may even end up faster.
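To make the suggestion above concrete, here is a minimal host-side sketch in C of how the nine offset coordinates for a 3x3 kernel could be precomputed per vertex. The function name fill_3x3_tap_coords and the TAPS constant are made up for illustration; the only thing taken from the thread is that texRECT coordinates are in texels, so neighbours sit one unit away.

```c
#include <assert.h>

#define TAPS 9  /* 3x3 kernel; hypothetical constant for this sketch */

/* Fill out[k] with the texel-space coordinates of the k-th tap of a
 * 3x3 kernel centred on (s, t). Rectangle-texture coordinates are
 * unnormalised, so each neighbour is exactly +/-1 texel away.
 * Taps are written row by row, so out[4] is the centre. */
void fill_3x3_tap_coords(float s, float t, float out[TAPS][2])
{
    int k = 0;
    for (int dy = -1; dy <= 1; dy++) {
        for (int dx = -1; dx <= 1; dx++) {
            out[k][0] = s + (float)dx;
            out[k][1] = t + (float)dy;
            k++;
        }
    }
    assert(k == TAPS);
}
```

Each pair would then be submitted as its own texture-coordinate set (for example with glMultiTexCoord2f(GL_TEXTURE0 + k, ...)), and the fragment program would read each interpolated set directly. How many sets are available depends on the hardware (GL_MAX_TEXTURE_COORDS), so on a card with fewer than nine, the remaining taps would still have to be computed in the shader.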
Really, the limitation is on the number of indirection phases, not the total number of indirections. The shader compiler should reorder the instructions so that all of the temporary coordinates are computed in one batch and then used in one batch, to minimize the number of phases.
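The difference between phases and total lookups can be sketched with a toy counter in C. This is a simplified model based on my reading of the counting rule in the ARB_fragment_program spec (a texture instruction starts a new phase if its coordinate temporary was already written in the current phase, or if its result temporary was already touched by an ALU instruction in the current phase). It is not the i915 driver's actual accounting, and the instr type and helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_TEMPS 32

typedef enum { OP_ALU, OP_TEX } op_kind;

/* Toy instruction: writes temp `dst`, reads temp `src`
 * (src == -1 means the source is an interpolated input). */
typedef struct { op_kind kind; int dst; int src; } instr;

/* Count indirection phases under the simplified rule described above. */
int count_tex_indirections(const instr *prog, int n)
{
    bool written[MAX_TEMPS];      /* temps written in the current phase */
    bool alu_touched[MAX_TEMPS];  /* temps read or written by ALU ops   */
    int phases = 1;
    memset(written, 0, sizeof written);
    memset(alu_touched, 0, sizeof alu_touched);

    for (int i = 0; i < n; i++) {
        const instr *in = &prog[i];
        assert(in->dst >= 0 && in->dst < MAX_TEMPS);
        assert(in->src >= -1 && in->src < MAX_TEMPS);
        if (in->kind == OP_TEX) {
            bool coord_dep = in->src >= 0 && written[in->src];
            if (coord_dep || alu_touched[in->dst]) {
                phases++;  /* dependent lookup: start a new phase */
                memset(written, 0, sizeof written);
                memset(alu_touched, 0, sizeof alu_touched);
            }
            written[in->dst] = true;
        } else {
            if (in->src >= 0)
                alu_touched[in->src] = true;
            alu_touched[in->dst] = true;
            written[in->dst] = true;
        }
    }
    return phases;
}

/* compute coord, look up, compute coord, look up, ... */
int build_interleaved(instr *out, int taps)
{
    int n = 0;
    for (int k = 0; k < taps; k++) {
        out[n++] = (instr){ OP_ALU, k, -1 };
        out[n++] = (instr){ OP_TEX, taps + k, k };
    }
    return n;
}

/* compute all coords first, then do all lookups */
int build_batched(instr *out, int taps)
{
    int n = 0;
    for (int k = 0; k < taps; k++)
        out[n++] = (instr){ OP_ALU, k, -1 };
    for (int k = 0; k < taps; k++)
        out[n++] = (instr){ OP_TEX, taps + k, k };
    return n;
}
```

Under this model, a 9-tap program built with build_interleaved counts 10 phases, while the same work built with build_batched counts 2, which is why instruction ordering matters more than the number of lookups.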
You are almost guaranteed that this will run faster (letting the hardware do the interpolation is typically cheaper than doing an extra operation in the fragment shader).