Intel driver texture lookup limitation



maxijac
02-06-2011, 05:39 AM
Hi !

I'm currently working on shaders through Cg.
I'm on an old Intel i915 card that seems to support only ARB programs.
My fragment shader implements a filtering algorithm on a texture, which means it performs several texture lookups. My driver (open-source Mesa 7.10) does not like that and reports 5 indirect texture lookups where at most 4 are allowed.
Indirect? I googled it, and it seems to mean using the result of one lookup as the coordinates for another lookup. That's not what I am doing, is it? :confused:

I'm doing texRECT(texture, float2(coords.x+1, coords.y+1)) and so on. (actually, I use a 3x3 filter)

I tried binding the texture as TEXUNIT0 instead of passing it as a uniform, but I get the same error.


So, why is that considered indirect lookup ?

Thank you for your attention :)

Rosario Leonardi
02-06-2011, 07:03 AM
From mesa/src/gallium/drivers/i915/i915_fpc_translate.c:


static void
i915_fini_compile(struct i915_context *i915, struct i915_fp_compile *p)
{
   struct i915_fragment_shader *ifs = p->shader;
   unsigned long program_size = (unsigned long) (p->csr - p->program);
   unsigned long decl_size = (unsigned long) (p->decl - p->declarations);

   /* Reject programs that need more indirection phases than the hardware allows. */
   if (p->nr_tex_indirect > I915_MAX_TEX_INDIRECT)
      i915_program_error(p, "Exceeded max nr indirect texture lookups");

   /* ... (excerpt ends here) */

And i915_reg.h defines I915_MAX_TEX_INDIRECT as 4.

** Ohh.. I love opensource **

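Hardware has limits on the number of instructions, on texture accesses, and on indirect texture access (access to the texture with an offset). For ARB fragment programs you can retrieve these limits with glGetProgramivARB:

GLint max_tex_instructions, max_tex_indirections;
glGetProgramivARB(GL_FRAGMENT_PROGRAM_ARB, GL_MAX_PROGRAM_TEX_INSTRUCTIONS_ARB, &max_tex_instructions);
glGetProgramivARB(GL_FRAGMENT_PROGRAM_ARB, GL_MAX_PROGRAM_TEX_INDIRECTIONS_ARB, &max_tex_indirections);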

So, sorry your hardware can't do that. Try another technique (multiple passes or lower quality).

maxijac
02-06-2011, 07:21 AM
indirect texture access(access to the texture with an offset)
OK, so accessing with an offset is considered indirect. Too bad :(

Thank you a lot ;)

ZbuffeR
02-06-2011, 10:07 AM
Use 'multitexturing' with the same texture but slightly offset texcoords, so that each texRECT uses its own texcoord directly, without any indirection. It may even end up faster.

arekkusu
02-07-2011, 09:28 PM
So, sorry your hardware can't do that

Really the limitation is on the number of indirection phases, not total indirections. The shader compiler should re-order the instructions so that all of the temporary coordinates are computed in a batch, and then used in a batch, to minimize the number of phases.
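
To make the phase idea concrete, here is a minimal C sketch of a phase counter. It uses a simplified rule, loosely modelled on the way ARB_fragment_program describes texture indirections: a texture instruction starts a new phase when its coordinate is a temporary written earlier in the current phase, or when its destination temporary was already touched in the current phase. The struct layout, function names and the rule itself are illustrative assumptions, not the actual Mesa or i915 logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_TEMPS 32
#define NO_TEMP   (-1)  /* operand is not a temporary, e.g. an interpolated texcoord */

enum op_kind { OP_ALU, OP_TEX };

/* Hypothetical, heavily simplified instruction: one source, one destination. */
struct instr {
    enum op_kind kind;
    int src_temp;  /* temporary read, or NO_TEMP */
    int dst_temp;  /* temporary written */
};

/* Count indirection phases under the simplified rule described above.
 * A program without dependent lookups counts as a single phase. */
int count_tex_phases(const struct instr *prog, size_t n)
{
    bool read_in_phase[MAX_TEMPS] = { false };
    bool written_in_phase[MAX_TEMPS] = { false };
    int phases = 1;

    for (size_t i = 0; i < n; i++) {
        const struct instr *in = &prog[i];
        assert(in->dst_temp >= 0 && in->dst_temp < MAX_TEMPS);
        assert(in->src_temp >= NO_TEMP && in->src_temp < MAX_TEMPS);

        if (in->kind == OP_TEX) {
            bool dependent = in->src_temp != NO_TEMP &&
                             written_in_phase[in->src_temp];
            bool reuses_dst = read_in_phase[in->dst_temp] ||
                              written_in_phase[in->dst_temp];
            if (dependent || reuses_dst) {
                /* Start a new phase and forget the old phase's register use. */
                phases++;
                memset(read_in_phase, 0, sizeof read_in_phase);
                memset(written_in_phase, 0, sizeof written_in_phase);
            }
        }
        if (in->src_temp != NO_TEMP)
            read_in_phase[in->src_temp] = true;
        written_in_phase[in->dst_temp] = true;
    }
    return phases;
}

/* "Compute offset coordinate, sample, compute, sample, ..." ordering. */
size_t build_interleaved(struct instr *out, int taps)
{
    size_t n = 0;
    for (int t = 0; t < taps; t++) {
        out[n++] = (struct instr){ OP_ALU, NO_TEMP, 2 * t };
        out[n++] = (struct instr){ OP_TEX, 2 * t, 2 * t + 1 };
    }
    return n;
}

/* All offset coordinates first, then all samples. */
size_t build_batched(struct instr *out, int taps)
{
    size_t n = 0;
    for (int t = 0; t < taps; t++)
        out[n++] = (struct instr){ OP_ALU, NO_TEMP, t };
    for (int t = 0; t < taps; t++)
        out[n++] = (struct instr){ OP_TEX, t, taps + t };
    return n;
}
```

Under this model, four interleaved taps need five phases, the same four taps batched need two, and lookups that use an interpolated texcoord directly stay within a single phase.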

kRogue
02-08-2011, 02:06 PM
Adding to what ZbuffeR is saying:



I'm doing texRECT(texture, float2(coords.x+1, coords.y+1)) and so on. (actually, I use a 3x3 filter)


If the above is really what appears in your shader, create an additional varying for each texture coordinate:

VertexShader (GLSL, you'll need to convert to the correct Cg code):


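// One varying per filter tap.
varying vec2 texcoord0, texcoord1, texcoord2, texcoord3, texcoord4; // ...

void
main(void)
{
    texcoord0 = whatever;                      // placeholder: the base texture coordinate
    texcoord1 = texcoord0 + offset_constant1;  // offset_constantN: constant vec2 tap offsets
    texcoord2 = texcoord0 + offset_constant2;
    texcoord3 = texcoord0 + offset_constant3;
    texcoord4 = texcoord0 + offset_constant4;
    // ...
}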


FragmentShader:


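varying vec2 texcoord0, texcoord1, texcoord2; // ...
uniform sampler2DRect rect_tex;

void
main(void)
{
    vec4 tex0, tex1, tex2; // ...

    // texture2DRect (GL_ARB_texture_rectangle) takes float coordinates;
    // texelFetch expects integer texel coordinates and would not compile here.
    tex0 = texture2DRect(rect_tex, texcoord0);
    tex1 = texture2DRect(rect_tex, texcoord1);
    tex2 = texture2DRect(rect_tex, texcoord2);
    // ...
}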



You are almost guaranteed that this will run faster (letting the hardware do the interpolation is typically cheaper than doing an extra operation in the fragment shader).
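
Moving the offset into the vertex stage does not change which texel gets sampled, as long as the offset is constant and the interpolator computes a weighted average whose weights sum to 1. Here is a small C sketch of that argument; the vec2 type, the function names and the plain barycentric interpolation are assumptions made for illustration, not how any particular GPU implements it.

```c
#include <assert.h>

struct vec2 { float x, y; };

struct vec2 vec2_add(struct vec2 a, struct vec2 b)
{
    struct vec2 r = { a.x + b.x, a.y + b.y };
    return r;
}

/* Weighted average of a per-vertex attribute across a triangle.
 * The weights w[0..2] are assumed to sum to 1. */
struct vec2 interpolate(const struct vec2 v[3], const float w[3])
{
    float sum = w[0] + w[1] + w[2];
    assert(sum > 0.999f && sum < 1.001f);
    struct vec2 r = {
        w[0] * v[0].x + w[1] * v[1].x + w[2] * v[2].x,
        w[0] * v[0].y + w[1] * v[1].y + w[2] * v[2].y
    };
    return r;
}

/* Path A: interpolate the base coordinate, add the offset per fragment. */
struct vec2 offset_in_fragment(const struct vec2 base[3], const float w[3],
                               struct vec2 offset)
{
    return vec2_add(interpolate(base, w), offset);
}

/* Path B: add the offset per vertex, let the interpolator deliver the result. */
struct vec2 offset_in_vertex(const struct vec2 base[3], const float w[3],
                             struct vec2 offset)
{
    struct vec2 shifted[3];
    for (int i = 0; i < 3; i++)
        shifted[i] = vec2_add(base[i], offset);
    return interpolate(shifted, w);
}
```

Both paths produce the same coordinate, but in path B the fragment stage receives it ready to use, so no fragment-side arithmetic sits between the interpolated input and the lookup.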