Weird GLSL Issue

toneburst · April 5, 2008, 10:03am

Can anyone see any obvious errors in the code below?
I’ve been through it and can’t see any problems. However, it renders properly on some GPUs but not on others:

Vertex Shader:

/*
Spherical Harmonics code Paul Bourke, adapted from
http://local.wasp.uwa.edu.au/%7Epbourke/surfaces_curves/sphericalh/

Normal calculation concept tonfilm
http://tonfilm.blogspot.com/2007/01/calculate-normals-in-shader.html

Lighting calculation from OpenGL Red Book, via www.lighthouse3d.com

GLSL assistance and general encouragement Memo

GLSL implementation alx @ toneburst, 2008
*/

/////////////////////
////  CONSTANTS  ////
/////////////////////

#define TWOPI 6.28318531
#define PI    3.14159265

/////////////////////
//// TWEAKABLES  ////
/////////////////////

// Pre-Transform controls
uniform vec4 TT_0;		// TT_0(X) = Pre-Scale X (range 0.0 > 1.0)
uniform vec4 TT_1;		// TT_1(Y) = Pre-Scale Y (range 0.0 > 1.0)
uniform vec4 TT_2;
uniform vec4 TT_3;		// TT_3(X),(Y) = Pre-Translate X,Y (range 0.0 > 1.0)

// Spherical Harmonics controls (range 0.0 to 10.0)
uniform float M0,M1,M2,M3,M4,M5,M6,M7;

// Light position
uniform vec3 LightPosition;

/////////////////////
////  VARYINGS   ////
/////////////////////

// Passes result of shading calculation to Fragment Shader
varying float colpos;

/////////////////////
////  FUNCTIONS  ////
/////////////////////

// The actual Spherical Harmonics formula (operates on Spherical coordinates)
vec3 sphericalHarmonics(float theta, float phi, float m0,float m1,float m2,float m3,float m4,float m5,float m6,float m7)
{
	vec3 point;
	float r = 0.0;
	r += pow(sin(m0*phi),m1);
	r += pow(cos(m2*phi),m3);
	r += pow(sin(m4*theta),m5);
	r += pow(cos(m6*theta),m7);
	
	point.x = r * sin(phi) * cos(theta);
	point.y = r * cos(phi);
	point.z = r * sin(phi) * sin(theta);

	return point;
}

/////////////////////
////  MAIN LOOP  ////
/////////////////////

void main()
{
	// Create pre-transform matrix from uniform vec4s
	mat4 TT = mat4(TT_0,TT_1,TT_2,TT_3);
	
	// Get vertex coordinates (cartesian)
	vec4 vertex = gl_Vertex;
	
	// Initial vertex position pre-transformed
	vertex = TT * vertex;
	
	// Spherical coordinates to send to Spherical Harmonics function
	float theta = (vertex.x + 0.5) * TWOPI;	// set range to 0 > TWOPI
	float phi = (vertex.y + 0.5) * PI;		// set range 0 > PI
	
	// Spherical Harmonics function
	vertex.xyz = sphericalHarmonics(theta, phi, M0, M1, M2, M3, M4, M5, M6, M7);
		
	// Shading calculation
	colpos = length(vertex.xyz + LightPosition);

	// Transform vertex by modelview and projection matrices
	gl_Position = gl_ModelViewProjectionMatrix * vertex;
	
	// Forward texture coordinates after applying texture matrix
	gl_TexCoord[0] = gl_TextureMatrix[0] * gl_MultiTexCoord0;
}

Fragment Shader:

/////////////////////
//// TWEAKABLES  ////
/////////////////////

// Base color
uniform vec4 Color;
// Lighting range. Range 0.1 > 1.0 (use exponential control)
uniform float LightRange;

/////////////////////
////  VARYINGS   ////
/////////////////////

// Shading calculation from Vertex Shader
varying float colpos;

/////////////////////
////  TEXTURES   ////
/////////////////////

// Shading lookup table input
uniform sampler2D LUT;
// Lookup y-position
uniform float LUT_Y;

// Surface texture input
uniform sampler2D TileImg;
// Surface texture scale
uniform vec2 Tile;

/////////////////////
////  MAIN LOOP  ////
/////////////////////

void main()
{
	// Fake lighting shading with Lookup Table
	float lookupX = clamp((1.0-LightRange) * colpos,0.0,0.999);
	// Lookup shade across x-axis of LUT
	vec4 shade = texture2D(LUT, vec2(lookupX,LUT_Y));

	// Surface tiling texture
	vec2 xy = gl_TexCoord[0].xy;
	vec2 phase = fract(xy / Tile);
	vec4 texTile = texture2D(TileImg,phase);

	// Output color compute
	//if (texTile.a == 0.0) {
		//discard;
	//} else {
		gl_FragColor = Color * shade;// * texTile;
	//}
}

It’s supposed to render like this (and does on the ATI Radeon X1600 on my MacBook Pro):

But on the more powerful NVIDIA GeForce 8800 GT on my MacPro, it renders like this:

As you can see, big chunks of the mesh are missing.

alx
http://machinesdontcare.wordpress.com

Seth_Hoffert · April 5, 2008, 10:12am

It seems to me like it’s an issue with how pow() is implemented on some GPUs.

From page 57 of the GLSL specification,

genType pow (genType x, genType y)

Returns x raised to the y power, i.e., x^y
Results are undefined if x < 0.
Results are undefined if x = 0 and y <= 0.

So basically, when your trig calls go negative, you (correctly) get an undefined result on your NVIDIA card.

Since your y is an integer (according to the URL), you could do something like this:

If x < 0 and y is even, then use abs(x) for the first argument.
If x < 0 and y is odd, then use abs(x) for the first argument and negate the result of the pow().
If x >= 0, then just use the result of the pow().

Funnily enough, I’ve had to do something like this before. Here’s the code I came up with:

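float pow2(in float x, in int y)
{
    // s is 1.0 for x > 0.0 and 0.0 for x < 0.0
    float s = (sign(x) + 1.0) / 2.0;
    float p = pow(abs(x), float(y));
    // For x < 0.0: negate the result when y is odd, keep it when y is even
    return (1.0 - s) * (p * float(1 - 2 * (y % 2))) + s * p;
}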

toneburst · April 5, 2008, 10:21am

Wow, thanks for getting back to me so quickly HexCat!

I’ll give that code a try!

alx

toneburst · April 5, 2008, 10:42am

Ah… the only problem with that code is y also needs to be a float rather than an integer…

alx

Seth_Hoffert · April 5, 2008, 10:48am

Hrm, are you sure about that? The URL at the top of your code uses integers (but casts them into doubles for libmath’s pow()).

toneburst · April 5, 2008, 10:58am

This is true.
I’ve been using float so I can get smoother morphs between forms. Non-integer values result in non-closed forms, but I don’t mind that, actually.

I’ve just tried your function in place of the standard pow, but with y as a float.

float pow2(float x, float y)
{
    float s = (sign(x) + 1.0) / 2.0;
    return (1.0 - s) * (pow(abs(x), y) * (2.0 * (y * 2.0) - 1.0)) + s * pow(abs(x), y);
}

Which seems to do the trick, though the mesh is now much bigger. I’ll have to compensate for that. That’s cool though- just a question of tweaking the ranges of some controls. I can actually restrict the M0-M7 controls to integer values if you think it makes sense to do so.

Thanks a lot for the advice- I’d never have worked that out on my own!!!

alx

Seth_Hoffert · April 5, 2008, 11:00am

Ah, cool. Glad you got it to work.

toneburst · April 5, 2008, 11:02am

Me too!
Thanks a lot, once again.

alx

Seth_Hoffert · April 5, 2008, 11:30am

Hmm, the large mesh may be because you replaced my modulo % operator with a multiplication, but switching it back to % limits which hardware this’ll run on. So, I’ve come up with an improved version:

float pow2(float x, float y)
{
    float s = (sign(x) + 1.0) / 2.0;
    // mod() stands in for %; the sign factor is -1.0 for odd y and +1.0 for even y
    return mix(pow(abs(x), y) * float(1 - 2 * int(mod(y, 2.0))), pow(abs(x), y), s);
}

This ought to produce the correct results.
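
To see why the multiplication blows up the mesh, compare the two factors that get applied when x is negative. A quick sketch (both function names are made up for illustration, and this is not meant to be pasted into the shader as-is):

```glsl
// Factor from the multiplication variant: 2.0 * (y * 2.0) - 1.0 = 4y - 1.
// It grows with y: y = 3.0 gives 11.0, y = 10.0 gives 39.0.
float factorMul(float y)
{
    return 2.0 * (y * 2.0) - 1.0;
}

// Factor built on mod(): floor(mod(y, 2.0)) is 0.0 or 1.0 for y >= 0.0,
// so the result is always +1.0 or -1.0 and never rescales pow(abs(x), y).
float factorMod(float y)
{
    return 1.0 - 2.0 * floor(mod(y, 2.0));
}
```

So with the multiplication in place, every negative trig term is scaled up by 4y - 1 instead of just having its sign adjusted.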

toneburst · April 5, 2008, 11:44am

Actually… it’s not quite nailed it.
I’ve just been doing some A/B comparisons, and the output of the modified shader is actually VERY different from the old one on the laptop (where the old code ‘worked’). I’ve also noticed there seem to be problems with the tiling texture that weren’t present before (a hairline gap appearing around the edge of each tile).

In terms of the forms produced, I think I prefer the results from the old code running on the laptop.
I’m starting to think the results from the ATI card are definitely less mathematically ‘correct’, and further from the example renderings in Paul Bourke’s site. I’m getting a far greater range of mesh sizes than I was with the old code, too, which Paul mentions on his page, so I think with your code in place, I’m getting something closer to the correct result. I’m using these formulae for their abstract shapes rather than as any sort of scientific visualisation though, so ultimately, it’s all about what they look like.

Do you think it would be possible to work out what the ATI card (that rendered the original code without holes appearing in the mesh) is doing with the POW function and values of zero or below, that the NVIDIA one isn’t, and somehow emulating that in the code so they both behave in the same way?

alx

toneburst · April 5, 2008, 2:00pm

HexCat, you’re a genius!!!

I’m afraid I missed the post with your revised function completely, so apologies for my last reply.

I’ve slotted the updated version into my code, and now it seems to work almost the same as the old code. There’s one slight difference: with some values of the input M variables, the resulting mesh jumps between one state and the next, where in the original code it moves smoothly from one form to another. This doesn’t affect all controls, but is especially obvious on M3, M5 and M7.

Any idea what might be going on?

alx

Seth_Hoffert · April 5, 2008, 2:11pm

Hmm, I noticed this too. I think I need to figure out how to rewrite it to basically find the real part of a complex number - right now it’s ignoring that. This should be fun.

toneburst · April 5, 2008, 2:16pm

Hmmm… I wish I found maths fun
If you could get it to work though, that would be brilliant!!
I’ve got a good incentive to try though. I’m pretty-much a 3D graphics newbie, but I love the things I can do with just a few lines of GLSL!

EDIT
… and a lot of help from people who actually understand this stuff

alx

Seth_Hoffert · April 5, 2008, 3:02pm

Give this one a try:

float pow2(in float x, in float y)
{
    float s = (sign(x) + 1.0) / 2.0;
    return pow(abs(x), y) * (s + (1.0 - s) * cos(3.141592654 * y));
}

Or, the cleaner version:

float pow2(in float x, in float y)
{
    if (x >= 0.0)
    {
        return pow(x, y);
    }
    else
    {
        return pow(-x, y) * cos(3.141592654 * y);
    }
}

Note that neither version handles pow(0,0).

What I did was assume that ATI’s implementation was returning the real portion of a complex result from exponentiation, then I derived it:


If x > 0
x^y = exp(y * ln(x))
    = pow(x, y)

If x < 0
ln(x) = ln(-x) + pi*i
x^y = exp(y * (ln(-x) + pi*i))
    = exp(y * ln(-x) + y * pi * i)
    = exp(y * ln(-x)) * exp(y * pi * i)
    = pow(-x, y) * exp(y * pi * i)
    = pow(-x, y) * (cos(y * pi) + i * sin(y * pi))

real => pow(-x, y) * cos(y * pi)
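
The same derivation can be written as a small GLSL function that keeps both parts of the complex result. This is only a sketch: cpow is a made-up name, packing the result as vec2(real, imaginary) is an arbitrary convention, and whether ATI’s pow() really does anything like this is still a guess.

```glsl
// Hypothetical helper (not a GLSL built-in): principal value of x^y for real x and y.
// Returns vec2(real part, imaginary part).
// As with pow2, a base of 0.0 with an exponent <= 0.0 is still undefined.
vec2 cpow(in float x, in float y)
{
    float mag = pow(abs(x), y);            // exp(y * ln|x|)
    if (x > 0.0)
    {
        return vec2(mag, 0.0);             // purely real for a positive base
    }
    float angle = 3.141592654 * y;         // from ln(x) = ln(-x) + pi*i
    return vec2(mag * cos(angle), mag * sin(angle));
}
```

Taking cpow(x, y).x gives the same value as the pow2 versions above. For a negative base and integer y the imaginary part is zero up to rounding, and the real part just flips sign with the parity of y.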

toneburst · April 5, 2008, 3:17pm

Hmm… it’s so close… but not quite there.
The problem still seems to be with parameters M5 and M7. Adjusting these inputs seems to produce a result that oscillates between two values, rather than moving smoothly from one to the other. Very odd…

I really appreciate the time you’re putting in to this…

EDIT
lol- you edited your previous reply as I was writing mine!
I was wondering if it wouldn’t be easier to use conditionals. Unfortunately, the latest version still exhibits the same problem.

Very odd…

alx

Seth_Hoffert · April 5, 2008, 3:24pm

This one might be silly, but what about just trying pow(abs(x), y)?
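
Dropped into the sphericalHarmonics() function from the first post, that would look roughly like the following. It’s an untested sketch; the abs() calls are the only change, and they simply throw away the sign of each trig term rather than reproducing whatever the ATI card does.

```glsl
vec3 sphericalHarmonics(float theta, float phi, float m0, float m1, float m2, float m3, float m4, float m5, float m6, float m7)
{
	vec3 point;
	float r = 0.0;
	// abs() keeps every base non-negative, so pow() stays in its defined range
	// (a base of exactly 0.0 with an exponent <= 0.0 is still undefined)
	r += pow(abs(sin(m0 * phi)), m1);
	r += pow(abs(cos(m2 * phi)), m3);
	r += pow(abs(sin(m4 * theta)), m5);
	r += pow(abs(cos(m6 * theta)), m7);

	// Spherical to cartesian, unchanged from the original
	point.x = r * sin(phi) * cos(theta);
	point.y = r * cos(phi);
	point.z = r * sin(phi) * sin(theta);

	return point;
}
```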

toneburst · April 5, 2008, 3:27pm

Let me try that one.

alx

toneburst · April 5, 2008, 3:46pm

Wellll… that definitely smoothes out those steps. The results look quite different, mind you, so I guess this isn’t the method ATI are using.

Seems to work on both GPUs though, so I’m happy to go with it

Thanks once again for all the time you’ve put into this. I really appreciate it!

alx
