Multisampling shadow mapping

In my last thread, I tried to use PCF to smooth out shadows. Thanks to everyone who replied, I finally got it working.

In this thread, I will describe a problem I ran into with multisampled shadow mapping.

I followed a similar example in the OpenGL SuperBible; here is the result of my multisampled shadow.

In the c++ code, the multisampling texture object is defined as:


glActiveTexture(GL_TEXTURE0);
glGenTextures(1, &depthTexture);
glBindTexture(GL_TEXTURE_2D_MULTISAMPLE, depthTexture);
	
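// Note: these four calls target GL_TEXTURE_2D, not the multisample texture
// bound above. Multisample textures have no wrap or filter state, so these
// calls do not affect depthTexture.
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);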
glTexImage2DMultisample(GL_TEXTURE_2D_MULTISAMPLE, SAMPLE_COUNT, GL_DEPTH_COMPONENT32, 
TEX_WIDTH, TEX_HEIGHT, GL_TRUE);
	
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, 
GL_TEXTURE_2D_MULTISAMPLE, depthTexture, 0);

SAMPLE_COUNT is a constant giving the number of samples taken per pixel, which is 32.
I use fixed sample locations so that I can compute the weight for each sample ahead of time. Notice that the last argument to glTexImage2DMultisample is GL_TRUE.
GL_TEXTURE_MAG_FILTER has been set to GL_NEAREST.

Then I calculate the weight for each sample based on its location relative to the pixel center. The code is similar to the source code that comes with the SuperBible, so I will just list it here:


float positions[SAMPLE_COUNT * 2];
for(int i = 0; i < SAMPLE_COUNT; ++i)
    glGetMultisamplefv(GL_SAMPLE_POSITION, i, &positions[i*2]);
		
float invertedSampleDistances[SAMPLE_COUNT];
float maxDist = 1.0f;

for(int i=0; i<SAMPLE_COUNT; i++)
{
    double xDist = positions[i*2  ]-0.5;
    double yDist = positions[i*2+1]-0.5;
    invertedSampleDistances[i] = maxDist - sqrt(xDist*xDist + yDist*yDist);
}
    
float totalWeight = 0.0f;
for(int j=0; j<SAMPLE_COUNT; j++)
    totalWeight += invertedSampleDistances[j];
    
// Invert to get the factor used for each sample; the sum of
// all sample weights is always 1.0
float perSampleFactor = 1.0f / totalWeight;
for(int j=0; j<SAMPLE_COUNT; j++)
    sampleWeights[j] = invertedSampleDistances[j] * perSampleFactor;
    
glGenBuffers(1, &sampleWeightBuf);
glBindBuffer(GL_TEXTURE_BUFFER, sampleWeightBuf);
glBufferData(GL_TEXTURE_BUFFER, sizeof(float)*SAMPLE_COUNT,
sampleWeights, GL_DYNAMIC_DRAW);
glBindBuffer(GL_TEXTURE_BUFFER, 0);
    
glActiveTexture(GL_TEXTURE1);
glGenTextures(1, &texBOTexture);
glBindTexture(GL_TEXTURE_BUFFER, texBOTexture);
glTexBuffer(GL_TEXTURE_BUFFER, GL_R32F, sampleWeightBuf); 
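For anyone who wants to check the weighting without a GL context, here is a rough standalone sketch of the same calculation. The function name computeSampleWeights and the use of std::vector are my own assumptions, not part of the SuperBible code; the positions are assumed to be laid out as x0, y0, x1, y1, ... in the range [0, 1], as glGetMultisamplefv returns them.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original code): turns sample positions
// in [0,1]^2 into weights that fall off linearly with distance from the
// pixel center (0.5, 0.5) and sum to 1.0.
std::vector<float> computeSampleWeights(const std::vector<float>& positions)
{
    assert(!positions.empty() && positions.size() % 2 == 0);
    const std::size_t sampleCount = positions.size() / 2;
    const float maxDist = 1.0f;  // same constant as in the listing above

    std::vector<float> weights(sampleCount);
    float totalWeight = 0.0f;
    for (std::size_t i = 0; i < sampleCount; ++i) {
        const float xDist = positions[i * 2]     - 0.5f;
        const float yDist = positions[i * 2 + 1] - 0.5f;
        // Closer to the center means a larger (un-normalized) weight.
        weights[i] = maxDist - std::sqrt(xDist * xDist + yDist * yDist);
        totalWeight += weights[i];
    }

    // Normalize so that all weights add up to 1.0.
    for (float& w : weights)
        w /= totalWeight;
    return weights;
}
```

With two samples, one at the center and one at the left edge of the pixel, the un-normalized weights are 1.0 and 0.5, so the center sample ends up with two thirds of the total.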

Then comes the fragment shader.


#version 150
#extension GL_EXT_gpu_shader4 : enable

out vec4 oColor;

uniform sampler2DMS depthTex;
uniform vec4 lightColor;
uniform samplerBuffer sampleWeight;

smooth in vec4 texPos;

void main(void)
{
	vec4 newTexPos = texPos / texPos.w;
	
	vec2 iTmp = textureSize(depthTex);
	vec2 tmp = floor(iTmp * newTexPos.xy);
	
	float weight = 0;
	for(int i = 0; i < 32; ++i)
	{
		float sampleDepth = texelFetch(depthTex, ivec2(tmp), i).r;
		if(newTexPos.z > sampleDepth)
			weight += texelFetchBuffer(sampleWeight, i).r;
		
			
	}
	
	oColor = lightColor * (1.0 - weight);
		
	oColor.a = 1.0;
}

This generates the result posted earlier. Really disappointing.

texelFetch(depthTex, ivec2(tmp), i)

This will fetch from the same texture coordinate 32 times. After all, your depth texture almost certainly only has one mipmap, right? I’m guessing what you really wanted was to use texelFetchOffset.

Granted, that’s not your problem. But it isn’t helping.

Really disappointing.

Your weighting scheme completely disregards the location of the texture access. You don’t adjust your weights based on how close the texture coordinate is to the adjacent texels; you pick a fixed, integer value as your texture coordinate. You discard information about how close the texture coordinate is to its neighboring texels.

This all means that each fragment that fetches from the same base texel will compute the same value. Hence the blockiness.
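To make the snapping concrete, here is a small CPU-side sketch of what ivec2(floor(size * uv)) does. The helper name snapToTexel and the 512x512 size are assumptions for illustration only.

```cpp
#include <cassert>
#include <cmath>

struct TexelCoord { int x; int y; };

// Hypothetical helper mirroring ivec2(floor(textureSize * uv)) from the
// shader: the fractional position inside the texel is thrown away.
TexelCoord snapToTexel(float u, float v, int texWidth, int texHeight)
{
    assert(texWidth > 0 && texHeight > 0);
    return { static_cast<int>(std::floor(u * texWidth)),
             static_cast<int>(std::floor(v * texHeight)) };
}
```

With a 512-wide map, u = 0.2501 and u = 0.2515 both land on texel 128, so both fragments read the same samples and get the same shadow value; only at u = 0.2520 does the lookup jump to texel 129.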

Also, try to use some better variable names. tmp should never be used.

Thank you very much.

But texelFetchOffset does not accept a sampler2DMS argument.

For


gvec4 texelFetch(gsampler2DMS sampler,
                 ivec2        P,
                 int          sample);

I am counting on the second argument to specify a certain texel, and the third argument to specify the sample in the texel.

I didn’t see that you were using a multisample sampler.

But that doesn’t change the fact that you will be computing the same value for every fragment that happens to compute the same integer texture coordinate. So you’ll get a blocky look.

You cannot use a fixed set of weights like that. You have to pick from the nearest texels and average the values together. So basically, you have to do this 4 times and average the results based on where you’re trying to sample it. And even then, the fixed set of weights really isn’t helpful. Unless you have access to the actual location of the samples, it’s better to just weigh them all equally.
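As a rough CPU-side sketch of that idea (the names shadowedFraction, blendShadow and mixf are made up for illustration, and every sample is weighted equally): compute the shadowed fraction for each of the four nearest texels, then blend those four numbers by the fractional position of the lookup.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Linear interpolation, same meaning as GLSL mix().
float mixf(float a, float b, float t) { return a + (b - a) * t; }

// Hypothetical helper: fraction of one texel's samples that are closer to
// the light than the fragment, with all samples weighted equally.
float shadowedFraction(const std::vector<float>& sampleDepths, float fragmentDepth)
{
    assert(!sampleDepths.empty());
    int shadowedCount = 0;
    for (float depth : sampleDepths)
        if (fragmentDepth > depth)
            ++shadowedCount;
    return static_cast<float>(shadowedCount) / static_cast<float>(sampleDepths.size());
}

// Hypothetical helper: bilinear blend of the four per-texel fractions.
// Order of 'shadowed': texel offsets (0,0), (1,0), (0,1), (1,1).
// fracX/fracY are the fractional parts of the unnormalized texture coordinate.
float blendShadow(const std::array<float, 4>& shadowed, float fracX, float fracY)
{
    assert(fracX >= 0.0f && fracX <= 1.0f && fracY >= 0.0f && fracY <= 1.0f);
    const float row0 = mixf(shadowed[0], shadowed[1], fracX);
    const float row1 = mixf(shadowed[2], shadowed[3], fracX);
    return mixf(row0, row1, fracY);
}
```

Because the blend factors change continuously as the lookup moves across a texel, neighbouring fragments no longer all get the same value, which is what removes the blocks.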

Thank you very much. I changed my code as follows:


#version 150
#extension GL_EXT_gpu_shader4 : enable

out vec4 oColor;

uniform sampler2DMS depthTex;
uniform vec4 lightColor;
uniform samplerBuffer sampleWeight;

smooth in vec4 texPos;

void main(void)
{
	vec4 newTexPos = texPos / texPos.w;
	vec2 iTmp = vec2(512, 512);
	vec2 tmp = floor(iTmp * newTexPos.xy);
	vec2 lerp = fract(iTmp * newTexPos.xy);

	float weight[4], sum = 0;
	for(int i = 0; i < 4; ++i)
		weight[i] = 0;
	for(int i = 0; i < 32; ++i)
	{
		float sampleDepth = texelFetch(depthTex, ivec2(tmp), i).r;
		if(newTexPos.z > sampleDepth)
			weight[0] += 1.0/32.0;
	}
	for(int i = 0; i < 32; ++i)
	{
		float sampleDepth = texelFetch(depthTex, ivec2(tmp) + ivec2(1, 0), i).r;
		if(newTexPos.z > sampleDepth)
			weight[1] += 1.0/32.0;
	}
	for(int i = 0; i < 32; ++i)
	{
		float sampleDepth = texelFetch(depthTex, ivec2(tmp) + ivec2(0, 1), i).r;
		if(newTexPos.z > sampleDepth)
			weight[2] += 1.0/32.0;
	}
	for(int i = 0; i < 32; ++i)
	{
		float sampleDepth = texelFetch(depthTex, ivec2(tmp) + ivec2(1, 1), i).r;
		if(newTexPos.z > sampleDepth)
			weight[3] += 1.0/32.0;
	}
	sum = mix(mix(weight[0], weight[1], lerp.x),
		  mix(weight[2], weight[3], lerp.x),
		  lerp.y);
	
	oColor = lightColor * (1.0 - sum);
	
	oColor.a = 1.0;
}

I applied PCF the same way as in my previous thread, and the result is now acceptable. It is even better than the result in my previous thread.

It seems that the only benefit of multisampling is a better result for the same kernel size.