Multiple sampler2DShadow reads problem - help

Some time ago I wrote a program demonstrating shadow mapping (using GLSlang). It worked OK. However, I decided to play with soft shadows. I want to sample the shadow map 5 times, but only the first call works - the subsequent calls return 1.0.
I tried both sampling from a single texture unit and sampling from 5 texture units (containing identical textures). Basically, it seems that

float f0 = shadow2DProj(ShadowTex,TexCoord).x;
float f1 = shadow2DProj(ShadowTex1,TC1).x;
float f2 = shadow2DProj(ShadowTex2,TC2).x;
float f3 = shadow2DProj(ShadowTex3,TC3).x;
float f4 = shadow2DProj(ShadowTex4,TC4).x;
f = f0+f1+f2+f3+f4;

and

float f0 = shadow2DProj(ShadowTex,TexCoord).x;
float f1 = shadow2DProj(ShadowTex,TC1).x;
float f2 = shadow2DProj(ShadowTex,TC2).x;
float f3 = shadow2DProj(ShadowTex,TC3).x;
float f4 = shadow2DProj(ShadowTex,TC4).x;
f = f0+f1+f2+f3+f4;

behave the same as

float f0 = shadow2DProj(ShadowTex,TexCoord).x;
float f1 = 1.0;
float f2 = 1.0;
float f3 = 1.0;
float f4 = 1.0;
f = f0+f1+f2+f3+f4;

and not equal to

float f0 = shadow2DProj(ShadowTex,TexCoord).x;
float f1 = f0;
float f2 = f0;
float f3 = f0;
float f4 = f0;
f = f0+f1+f2+f3+f4;

Each of the texture instructions works, but only when it is the first one.

Any help will be appreciated…

I have a Radeon 9800 with Catalyst 4.8 drivers.

I don’t know whether I worded anything wrongly, broke some rules or something (if so, then sorry), but I’ve been waiting for a reply for quite a while.
Should I post the entire shader?
Could it be a driver bug? If so, is there a way to work around it, and what is it?

If you post the whole shader I will try it ;)

ATi does not natively support shadow textures; they actually do multiple texture reads and compares for each shadow2DProj access. It is entirely possible that the compiler gets confused when you do a lot of shadow accesses, since it isn’t compiling it into a single opcode.

Even once they fix the bug in their drivers, compiling that many shadow functions may well push you over the fragment program instruction count limit.

Ffelagund >> that’s the fragment shader:
(I know it’s rather messy, but it was changed many times to see what’s wrong)
varying vec4 TexCoord;
varying vec3 Normal;
varying vec3 Position;
varying vec3 LightVec;
uniform vec3 LightPosition;
uniform sampler2DShadow ShadowTex;
uniform sampler2DShadow ShadowTex1;
uniform sampler2DShadow ShadowTex2;
uniform sampler2DShadow ShadowTex3;
uniform sampler2DShadow ShadowTex4;
varying vec4 TC1;
varying vec4 TC2;
varying vec4 TC3;
varying vec4 TC4;
void main()
{
    // Diffuse and specular terms
    vec3 tNormal = normalize(Normal);
    vec3 tLightVec = normalize(LightVec);
    float brightness = clamp(dot(tLightVec,tNormal),0.0,1.0);
    float shine = dot(normalize(Position),normalize(reflect(tLightVec,tNormal)));
    shine = pow(clamp(shine,0.0,1.0),24.0);

    // Distance attenuation
    float c;
    float d;
    d = clamp(distance(LightPosition,Position),0.0,1000.0);
    d = sqrt(pow(1.0-d/1000.0,1.0));
    c = max((brightness*0.6+shine*0.4)*d,0.01);
    if (dot(Normal,LightVec) > 0.0)
    {
        float f = 0.0;
        // f = shadow2DProj(ShadowTex,TC1).x;
        float f0 = shadow2DProj(ShadowTex,TexCoord).x;
        float f1 = shadow2DProj(ShadowTex1,TC1).x;
        float f2 = shadow2DProj(ShadowTex2,TC2).x;
        float f3 = shadow2DProj(ShadowTex3,TC3).x;
        float f4 = shadow2DProj(ShadowTex4,TC4).x;
        f = f0+f1+f2+f3+f4;

        // Average the five samples and apply lighting
        f = f*c*0.2;
        gl_FragColor = vec4(f,f,f,1.0);

        //gl_FragColor = shadow2DProj(ShadowTex,TexCoord)*c;
    }
    else
    {
        gl_FragColor = vec4(0.01,0.01,0.01,1.0);
    }
}

Korval >>
I also tried sampling only 2 times, with the same effect :(
Do you think switching to ARB_fragment_program (rewriting the shader in assembly) would help?

Anyway, thanks for answers!

I’ve had no problems with multiple samples on my Radeon 9700 Pro. The difference is that I’m not using the shadow* functions and depth textures; I’m using regular RGBA8 textures or floating point textures.

Code snippet:

float OffsetDepthLookup(const float _Depth, const vec2 _Offset)
{
	return float(_Depth < texture2DProj(uShadowMap, vec4(vProjTexCoord.xy + _Offset * vProjTexCoord.w, vProjTexCoord.z, vProjTexCoord.w)).r);
}

float ComputeShadow(const float _Depth)
{
	float
	Intensity  = OffsetDepthLookup(_Depth, vec2(-ShadowMapTexelSize, -ShadowMapTexelSize));
	Intensity += OffsetDepthLookup(_Depth, vec2( ShadowMapTexelSize, -ShadowMapTexelSize));
	Intensity += OffsetDepthLookup(_Depth, vec2( ShadowMapTexelSize,  ShadowMapTexelSize));
	Intensity += OffsetDepthLookup(_Depth, vec2(-ShadowMapTexelSize,  ShadowMapTexelSize));

	return Intensity * 0.25;
}
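For anyone comparing this with the shadow2DProj path, here is a minimal CPU-side sketch in C of what the four-tap lookup above computes: each tap is compared against the fragment depth first, and only the comparison results are averaged. The names depth_at, lit_at and pcf4, the row-major float array and the integer texel coordinates are assumptions made for illustration; the shader itself samples uShadowMap with projective coordinates.

```c
/* CPU reference for a 4-tap percentage-closer filter.
   All names and the row-major float depth map are hypothetical. */

/* Depth stored at texel (x, y); coordinates are clamped to the map edge. */
static float depth_at(const float *map, int w, int h, int x, int y)
{
    if (x < 0) x = 0;
    if (x >= w) x = w - 1;
    if (y < 0) y = 0;
    if (y >= h) y = h - 1;
    return map[y * w + x];
}

/* 1.0 if the fragment is nearer to the light than the stored depth, else 0.0. */
static float lit_at(const float *map, int w, int h, int x, int y, float depth)
{
    return depth < depth_at(map, w, h, x, y) ? 1.0f : 0.0f;
}

/* Average of four diagonal taps around (x, y), one texel away each. */
static float pcf4(const float *map, int w, int h, int x, int y, float depth)
{
    float sum = lit_at(map, w, h, x - 1, y - 1, depth)
              + lit_at(map, w, h, x + 1, y - 1, depth)
              + lit_at(map, w, h, x + 1, y + 1, depth)
              + lit_at(map, w, h, x - 1, y + 1, depth);
    return sum * 0.25f;
}
```

Averaging the stored depths first and comparing once would give a hard 0/1 result again, so the order (compare, then average) is the part that produces the soft edge.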

Sunray >> thanks for the idea!
However I have a few questions:
Is uShadowMap in your code a floating point texture with 32 bits per component (and depth in r)? How did you create and render it? Could you point me to the extension you used? I think that would really solve all my problems. Thanks again!

uShadowMap in your code is a floating point texture with 32 bits per component (and depth in r)?
Either a float texture with 16 bits per channel, or depth packed into an RGBA8 texture.

How did you create and render it? Could you point to the extension you used?
Easy. It’s simply a shader that outputs the length of the vector from the POV to the fragment.
Let v be a varying vec3. Then, for each vertex, v = POV - VertexPosition. And for each fragment, compute the depth by calling dot(v,v) for a squared distance (faster), or by calling length(v).
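As a rough illustration of the per-fragment step just described, here is a CPU-side sketch in C of the squared-distance variant. The Vec3 type, the function names and the max_range parameter are assumptions, not part of the shader above; dividing by max_range squared is only one possible way to keep the value within [0, 1] when the target is a clamped RGBA8 buffer.

```c
/* Hypothetical 3-component vector, standing in for a GLSL vec3. */
typedef struct { float x, y, z; } Vec3;

/* dot(v, v) with v = pov - position: squared distance, no square root. */
static float squared_distance(Vec3 pov, Vec3 position)
{
    float dx = pov.x - position.x;
    float dy = pov.y - position.y;
    float dz = pov.z - position.z;
    return dx * dx + dy * dy + dz * dz;
}

/* Squared distance scaled into [0, 1]; max_range is an assumed scene bound.
   Anything farther than max_range is clamped to 1.0. */
static float normalized_depth(Vec3 pov, Vec3 position, float max_range)
{
    float d = squared_distance(pov, position) / (max_range * max_range);
    return d > 1.0f ? 1.0f : d;
}
```

Squaring preserves ordering for non-negative distances, so the shadow test still works as long as the value stored in the map and the value compared against it are computed the same way.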

Hope that helps!

I am afraid there was a little misunderstanding… I know how to compute distance/depth and so on - I only wanted to know how you encoded it into a texture. I mean

  1. how you pack/unpack a float into 8-bit RGBA
    and/or
  2. how you allocate a color buffer that can be filled with floats (16/32 bit) and then read into a float texture.

I searched a little bit and found:

  1. how to pack a float [0<>1] into RGBA8
  2. the extensions WGL_ATI_pixel_format_float and GL_ATI_texture_float

Option 1 is a bit problematic, but I will think about rescaling the distance values to fall within the [0<>1] bounds. However, if you know a better way to pack floats (without clamp), then I am listening :)

For option 2 I will have to abandon the dglopengl.pas CreateRenderingContext routine :(. Is it (WGL_TYPE_RGBA_FLOAT_ATI) much slower than a regular 8-bit-per-channel color buffer?

Sorry that I cannot test it right now, but my Radeon is burdened with converting VHS to DVD and running my crashy programs would cause <at least> dropped frames :( I will get down to it tomorrow…
By the way >> I have already copied the movie from this year’s trip round Scandinavia. You have a beautiful landscape there…

Thanks again!

packing a float into an RGBA is pretty easy, straight from my shadowing fragment program :

const vec4 packFactors = vec4( 1.0, 256.0, 65536.0, 16777216.0 );
void main()
{
	gl_FragColor = vec4(fract(packFactors*gl_FragCoord.z));
}

unpacking is just a matter of doing the following :

const vec4 extract = vec4( 1.0, 0.00390625, 0.0000152587890625, 0.000000059604644775390625 );
void main()
{
	vec4 shadowValue = texture2D(shadowMap, projectiveBiased.xy);
	float shadow = dot(shadowValue,extract);
	... more code ...
}

hope that’s some help :)

edit: the constants are 1/<something> formulas, but I can’t remember what they are. They are in another post somewhere on here that I found while looking up how to do it; the GLSL code itself is a direct port of the code from a Humus demo

I didn’t try the method through shaders, but I did it in my main program (encoded it into a TVector4f, wrote it to the framebuffer, read it back and unpacked it). The result is that when I try to encode 0.666 I get 0.668627 (0.5 -> 0.50196; 1 -> 0; 0.75 -> 0.7490196). The error isn’t too big, but I think I will play with float32 buffers and float32 textures - they have better resolution and no clamp:
(0.666 -> 0.6659927; 0.5 -> 0.5; 1 -> 1; 654.321 -> 654.32031).
However, I have another problem:
I initialize a float pbuffer (through wgl) and it seems to work (I don’t use shaders right now, I have a small program that reads/writes the buffer with drawpixels/readpixels), but when I want to check whether it is a float buffer like this:
glGetBooleanv(GL_RGBA_FLOAT_MODE_ATI,@Success);

if Success then
  ShowMessage('Works')
else
  ShowMessage('Screwed up again');

the second message is shown :(
It is not really a problem unless I want my programs to run on other machines.

I think now everything will be OK, I will have to write another shadowmapping program, cause the old one has survived so many modifications that it looks like Frankenstein.

Oh, the constants seem to be 1/2^0, 1/2^8, 1/2^16…, so just the inverses of packFactors
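A minimal CPU-side sketch in C of why the round trip above drifts (0.5 coming back as 0.50196, 1 as 0). It assumes the RGBA8 buffer rounds each channel to the nearest n/255. With the 1/256/65536 factors the first channel already holds the whole value rounded to 8 bits, and the lower channels are then added on top of that rounded value. The second pair of functions is a commonly used variant with base-255 factors that subtracts from each channel what the next one carries; it is not from this thread, and all function names here are made up for illustration.

```c
/* CPU simulation of packing a depth in [0, 1) into four 8-bit channels.
   All names are hypothetical; doubles are used so the only error shown
   is the 8-bit channel rounding. */

/* Fractional part of a non-negative value. */
static double fract_pos(double x) { return x - (double)(long long)x; }

/* Assumed behaviour of an RGBA8 target: round a channel to n/255. */
static double quantize8(double x) { return (double)(int)(x * 255.0 + 0.5) / 255.0; }

/* Scheme from the thread: fract(depth * (1, 256, 65536, 16777216)). */
static void pack_naive(double depth, double rgba[4])
{
    static const double f[4] = { 1.0, 256.0, 65536.0, 16777216.0 };
    for (int i = 0; i < 4; ++i)
        rgba[i] = quantize8(fract_pos(depth * f[i]));
}

static double unpack_naive(const double rgba[4])
{
    static const double f[4] = { 1.0, 256.0, 65536.0, 16777216.0 };
    double sum = 0.0;
    for (int i = 0; i < 4; ++i)
        sum += rgba[i] / f[i];
    return sum;
}

/* Variant: base-255 factors, and each channel gives up the part that the
   next channel stores, so the first three channels are exact multiples of 1/255. */
static void pack_carry(double depth, double rgba[4])
{
    static const double f[4] = { 1.0, 255.0, 65025.0, 16581375.0 };
    double e[4];
    for (int i = 0; i < 4; ++i)
        e[i] = fract_pos(depth * f[i]);
    for (int i = 0; i < 3; ++i)
        e[i] -= e[i + 1] / 255.0;
    for (int i = 0; i < 4; ++i)
        rgba[i] = quantize8(e[i]);
}

static double unpack_carry(const double rgba[4])
{
    static const double f[4] = { 1.0, 255.0, 65025.0, 16581375.0 };
    double sum = 0.0;
    for (int i = 0; i < 4; ++i)
        sum += rgba[i] / f[i];
    return sum;
}
```

Both schemes still wrap at exactly 1.0 because of fract, so the input has to stay below 1 either way.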

Thanks again for all your help and time devoted. See you later:)

Originally posted by Lurker_pas:
Ffelagund >> that’s the fragment shader:
(I know it’s rather messy, but it was changed many times to see what’s wrong)
.
.
.

Anyway, thanks for answers!
Hello Lurker,
I was unable to test your shader. At the moment my card only has beta drivers, and unfortunately your fragment shader crashes the application (the only one that does so at the moment).
I’ve reported a bug about that, so maybe they will fix it soon.

I didn’t have much time recently and additionally I had problems with Internet connection so I abandoned my program and this forum. However, I am back online today and I have found new 4.9 ATI drivers. The code I posted works now exactly as it should - so it was a driver bug…

I also found the GL2.0 spec and I am pretty disappointed with the lack of a render_target extension, as I am now struggling with pbuffers and render_texture.

Again, many thanks to all.
