GLSL madness on ATI GPUs

Hello community,

I have some really weird problems with this (very simple) GLSL fragment shader, whose purpose is to “rotate” point sprites.

I’m developing an old-school shoot-’em-up game and wanted to use point sprites for the bullets and similar stuff. But since I need rotation for some of the bullets (missiles), I came up with these shaders to modify the point sprite’s texture coordinates depending on the projectile’s flight direction.

Since the game logic is 2D, I send the position and direction as one 4D vertex to the GPU (vec4=(pos_x,pos_y,dir_x,dir_y)). Additionally, vertex colors are transmitted.

The vertex shader extracts the rotation information, “repairs” the position (z=0, w=1) and passes the vertex color through.
The fragment shader then uses the rotation information to modify gl_PointCoord, does the texture look-up and modulates the result with the vertex color.

Nothing fancy at all, I’d say. But:

  1. When I use a user-defined varying mat4 to transmit the rotation info to the fragment shader, the values somehow get interpolated on their way. Interpolated with what, I keep asking myself? It’s just one vertex.
    Anyway, I could “solve” this problem by using a vec2 instead, which is the better choice anyway.

  2. Now the really funny part: on my Radeon HD3850 (last week’s driver) the texture lookup is totally corrupted whenever I access gl_Color in the fragment shader. If I don’t use it, everything is fine. It looks as if the driver performs wrong optimizations; reformulating the code doesn’t make any difference.

Here are the shaders. They even work on an old Radeon 9600 (though the texture comes out flipped…) and a GeForce 8400M.

Vertex-shader:


varying vec2 v_rot;
void main(void) {
  gl_FrontColor=gl_Color;
  vec4 l_position=gl_Vertex;
  v_rot=normalize(l_position.zw);
  l_position.z=0.0;
  l_position.w=1.0;
  gl_Position=gl_ModelViewProjectionMatrix*l_position;
}

Fragment-shader:


uniform sampler2D tex;
varying vec2 v_rot;
void main(void) {
  vec4 l_uv=vec4(0.0,0.0,gl_PointCoord.xy);
  l_uv.zw-=vec2(0.5,0.5);
  l_uv.x=l_uv.z*v_rot.x;
  l_uv.y=l_uv.w*v_rot.x;
  l_uv.x-=l_uv.w*v_rot.y;
  l_uv.y+=l_uv.z*v_rot.y;
  l_uv.xy+=vec2(0.5,0.5);
  // remove *gl_Color from the next line
  // and it'll work on HD3850 (of course no different colors anymore)
  gl_FragColor=texture2D(tex,l_uv.xy)*gl_Color;
}
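
For reference, v_rot holds (cos θ, sin θ) of the flight direction, so all the fragment shader does is rotate gl_PointCoord around the sprite centre. The same math written with a mat2, purely as a restatement of the shader above (not a fix):

uniform sampler2D tex;
varying vec2 v_rot;
void main(void) {
  // rotation matrix built from (cos, sin); columns are (cos, sin) and (-sin, cos)
  mat2 l_rot=mat2(v_rot.x, v_rot.y, -v_rot.y, v_rot.x);
  vec2 l_uv=l_rot*(gl_PointCoord.xy-vec2(0.5))+vec2(0.5);
  gl_FragColor=texture2D(tex,l_uv)*gl_Color;
}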

Here’s how it should look:

And here’s the HD3850’s output:

Anyone any idea?

I don’t see anything wrong in your snippets.

If you strongly suspect a bug please file a report so we can get these things fixed.

You can send bug reports to devrel ‘at’ amd.com.

Or, if you prefer, you can submit a report via this form (I did anyway):
http://ati.supportcenteronline.com/ics/survey/survey.asp?deptID=894&surveyID=486

I’ll prepare a little commented test-app and send it over.

Here’s some information related to your problem. Last year, I tested GLSL support on the X1950 and the HD-series ATI cards. The HD series had some bugs that the previous cards did not have. After some investigation, it turned out that the HD series uses another OpenGL dll (i.e. not ATIOGLXX.dll). Maybe they started fresh on their OpenGL driver for the HD cards and it isn’t quite up to par…

As for your specific problem, try using only user-defined varyings instead of the built-in gl_FrontColor. Maybe you’ll get better results.

Another thing to keep in mind: point sprites are deprecated in OpenGL 3, so you might want to avoid them. If you used regular quads, you could also do the rotation per vertex instead of per pixel (see the sketch below).
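
A minimal sketch of that per-vertex variant, assuming each bullet is drawn as a textured quad and the flight direction comes in as a generic attribute (the attribute name “dir” is just for illustration, not code from this thread): rotate the texture coordinates in the vertex shader and keep the fragment shader a plain fetch.

// Vertex shader
attribute vec2 dir;        // normalized flight direction (cos, sin); hypothetical attribute
varying vec2 v_uv;
void main(void) {
  gl_FrontColor=gl_Color;
  vec2 c=gl_MultiTexCoord0.st-vec2(0.5);      // centre the quad's texcoords
  v_uv=vec2(dir.x*c.x-dir.y*c.y,              // rotate once per vertex...
            dir.y*c.x+dir.x*c.y)+vec2(0.5);   // ...instead of once per fragment
  gl_Position=gl_ModelViewProjectionMatrix*gl_Vertex;
}

// Fragment shader
uniform sampler2D tex;
varying vec2 v_uv;
void main(void) {
  gl_FragColor=texture2D(tex,v_uv)*gl_Color;
}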

Good luck!

Using user-defined varyings instead of gl_FrontColor doesn’t change anything. The problem appears whenever I just touch gl_Color.

Deprecated in OGL 3.0? Since when?
Quads are deprecated but not point-sprites, as far as I know.

When I look into the specs, point sprites seem well supported:
Specs 3.0

AFAIK quads are deprecated and point sprites are not.
And in any case you’d still have to do the texture rotation in the shader; the fixed-function pipeline won’t do that for you.

Also, ironically (and I’m not trying to be picky here, bertgp), I remember that a while back on the Mac OpenGL lists the only way to get around a certain point-sprite problem was to misuse fixed-function registers to sneak data into the fragment shader!! :wink:

I have a rotating point-sprite shader working on ATI hardware. However, it was fraught with problems, and even now certain numbers of point sprites will cause a very bad GPU lockup. Otherwise I’d post it here for everyone.

Basically ATI point sprites are still a little buggy here and there from my experience.

As has been said, do file a bug report.

If you want… Try this approach in your shaders…

Vertex:

// Texture rotations
gl_TexCoord[0] = gl_MultiTexCoord0;
float aa = angle * (gl_Color.y - 0.5);   // Spin in colour data.
float cc = cos(aa);
float ss = sin(aa);
mat01 = vec2(cc, -ss);
mat02 = vec2(ss, cc);

Frag:

	
vec2 st_centered = gl_TexCoord[0].st * 2.0 - 1.0;
vec2 st_rotated = vec2(dot(st_centered, mat01), dot(st_centered, mat02));
vec2 rot = st_rotated * 0.5 + 0.5;
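
For completeness, here is how those two excerpts might fit together as a self-contained pair. The declarations, the angle uniform and the texture lookup at the end are my guesses at what the snippets assume, and GL_COORD_REPLACE must be enabled so that gl_TexCoord[0] carries the per-fragment point coordinate:

// Vertex shader
uniform float angle;    // assumed: maximum spin, in radians
varying vec2 mat01;     // first row of the 2x2 rotation matrix
varying vec2 mat02;     // second row
void main(void) {
  gl_TexCoord[0]=gl_MultiTexCoord0;
  float aa=angle*(gl_Color.y-0.5);   // spin amount rides in the colour data
  float cc=cos(aa);
  float ss=sin(aa);
  mat01=vec2(cc,-ss);
  mat02=vec2(ss,cc);
  gl_Position=gl_ModelViewProjectionMatrix*gl_Vertex;
}

// Fragment shader
uniform sampler2D tex;
varying vec2 mat01;
varying vec2 mat02;
void main(void) {
  vec2 st_centered=gl_TexCoord[0].st*2.0-1.0;    // [0,1] -> [-1,1]
  vec2 st_rotated=vec2(dot(st_centered,mat01),dot(st_centered,mat02));
  vec2 rot=st_rotated*0.5+0.5;                   // back to [0,1]
  gl_FragColor=texture2D(tex,rot);               // the lookup itself is my addition
}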

Thanks for the comments!
But nothing works on the HD3850.
I really tried everything. For instance:

  • (Ab)using built-in variables like gl_TexCoord or gl_SecondaryColor (basically anything available) to “sneak” in my data. Sometimes the texture lookup then works in the FS, but when it does, the color gets corrupted: fragments are then colored depending on the rotation data…

  • Different types, combinations, orders of varyings. Same results.

Perhaps it would work if I don’t send the color via glColorPointer. But that’s not an option for me.

Are you sure? As I said, it works on my old Radeon 9600 (and is quite fast). And without the color it also works on the HD3850.

Anyway, I’ll send a bug report. Had no time yesterday to prepare one, but today I’ll go for it.

Heh. Like I said… It “works”. But I know what not to do with it!!
Hence it not being code I can share…

I am surprised you cannot get around this by totally dumping any fixed pipeline variables and using all your own in the shaders. But then again…

Good luck… ATI and Point Sprites is always fun!!
Having said that I am migrating to NVidia this month and am sure I’ll hit all their particular problems with my ATI code!!!

You are both right, I got my facts mixed up.

I said quad when I meant two triangles making up a quad. I should have been clearer about that.

As for sprites being deprecated, I mixed up different documents and specs. Sorry about that! From the Nvidia GPU programming guide for G80 :

In contrast to all the performance hazards of using the GS, one case generally will run very well, and be simple to implement. This case is point sprites. Given that the point sprite fixed function capability has been removed in DirectX 10, you can now simply generate a primitive from a single input vertex.
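
For context, the idea in that quote is that on DX10-class hardware a small geometry shader can expand each point into a textured quad, replacing the fixed-function point-sprite path. A minimal sketch in the later GL 3.2 / GLSL 1.50 syntax; the half_size uniform and the clip-space offsets are simplifications of mine, suitable for a 2D/orthographic setup like this game:

#version 150
layout(points) in;
layout(triangle_strip, max_vertices=4) out;
uniform vec2 half_size;   // hypothetical: sprite half-extent in clip space
out vec2 uv;              // texture coordinate for the fragment shader
void main(void) {
  vec4 c=gl_in[0].gl_Position;
  uv=vec2(0.0,0.0); gl_Position=c+vec4(-half_size.x,-half_size.y,0.0,0.0); EmitVertex();
  uv=vec2(1.0,0.0); gl_Position=c+vec4( half_size.x,-half_size.y,0.0,0.0); EmitVertex();
  uv=vec2(0.0,1.0); gl_Position=c+vec4(-half_size.x, half_size.y,0.0,0.0); EmitVertex();
  uv=vec2(1.0,1.0); gl_Position=c+vec4( half_size.x, half_size.y,0.0,0.0); EmitVertex();
  EndPrimitive();
}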

Just sent a bug report to AMD/ATI. Let’s see what happens.
I’ll keep you informed.

Thanks and greets!

ATI’s support cannot reproduce it.
So if you have an HD3850 or a similar GPU, it would be kind of you to try it out yourself and post some feedback.

My specs:
XP Pro SP3
Catalyst 9.1 (earlier version didn’t work either)
Radeon HD3850 (AGP version)

Here is an archive containing an EXE and source-code (VC2008 project). Stripped to the core. Look at the beginning of main.cpp for detailed information.

point sprite bug

Thanks!

Just installed Catalyst 9.2:
bug still remains.

Hi Inquisitor, this may be digressing a bit, but I’m curious about your rendering effect. Do these sprites overlap in the order of the vertex array? I mean, is it possible to insert a newly arriving sprite behind the already-rendered sprites?

Yes. Smaller indices are drawn first.

Of course. The simplest solution would be something like this:

// shift everything down one slot; the oldest sprite falls off the end,
// and the new sprite goes to index 0, so it is drawn first (i.e. behind the rest)
::memmove(&sprites[1],&sprites[0],sizeof(sprite)*(SPRITE_COUNT-1));
sprites[0]=new_sprite;

Or you could modify the whole thing to use the z-buffer. There are many ways.

Nothing useful from AMD/ATI.
Interestingly enough, since I’ve had no trouble reproducing it on some friends’ Radeons.

Anyway, I used AMD’s GPU ShaderAnalyzer to see what garbage the compiler produces.
I got the following results:

  • all works fine as long as the texture access is done with R0
  • when R1, R2 or R3 are used it produces the known artifacts

Trashy result (with *gl_Color at the end):
SAMPLE R1, R1.xyxx, t0, s0

Good result (without *gl_Color):
SAMPLE R0, R0.xyxx, t0, s0

Here’s the current fragment shader:


uniform sampler2D tex;
varying vec2 v_rot;
void main(void)
{
	vec4 l_uv=vec4(0.5,0.5,gl_PointCoord.x-0.5,gl_PointCoord.y-0.5);
	l_uv.x+=v_rot.x*l_uv.z;
	l_uv.y+=v_rot.x*l_uv.w;
	l_uv.y+=v_rot.y*l_uv.z;
	l_uv.x-=v_rot.y*l_uv.w;
	gl_FragColor=texture2D(tex,l_uv.xy)*gl_Color;
	// use the line below instead to get it to work
	// gl_FragColor=texture2D(tex,l_uv.xy);
}

Here’s the complete disassembly:


; Trash
00 ALU: ADDR(32) CNT(6) 
      0  x: ADD         R127.x,  |R2.w|, -0.5      
         y: ADD         R127.y,  |R2.z|, -0.5      
      1  z: MULADD      R123.z,  R1.x,  PV0.x,  0.5      
         w: MULADD      R123.w,  R1.x,  PV0.y,  0.5      
      2  x: MULADD      R1.x, -R1.y,  R127.x,  PV1.w      
         y: MULADD      R1.y,  R1.y,  R127.y,  PV1.z      
01 TEX: ADDR(48) CNT(1) VALID_PIX 
      3  SAMPLE R1, R1.xyxx, t0, s0
02 ALU: ADDR(38) CNT(4) 
      4  x: MUL         R0.x,  R0.x,  R1.x      
         y: MUL         R0.y,  R0.y,  R1.y      
         z: MUL         R0.z,  R0.z,  R1.z      
         w: MUL         R0.w,  R0.w,  R1.w      
03 EXP_DONE: PIX0, R0
END_OF_PROGRAM


; Good
00 ALU: ADDR(32) CNT(6) 
      0  x: ADD         R127.x,  |R1.w|, -0.5      
         y: ADD         R127.y,  |R1.z|, -0.5      
      1  z: MULADD      R123.z,  R0.x,  PV0.x,  0.5      
         w: MULADD      R123.w,  R0.x,  PV0.y,  0.5      
      2  x: MULADD      R0.x, -R0.y,  R127.x,  PV1.w      
         y: MULADD      R0.y,  R0.y,  R127.y,  PV1.z      
01 TEX: ADDR(48) CNT(1) VALID_PIX 
      3  SAMPLE R0, R0.xyxx, t0, s0
02 EXP_DONE: PIX0, R0
END_OF_PROGRAM

So this leads me to the following questions:

  • Does anybody know of a way to force the compiler to use specific registers?
  • Is there any NOP GLSL sequence that is known to make the ATI compiler pick different registers?
  • Is there a way to turn the optimizer off? (A hedged attempt at the last two follows below.)
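
For what it’s worth, the GLSL spec does define #pragma optimize(off) and #pragma debug(on), though implementations are free to ignore them, so I can’t say whether Catalyst honours them. A hedged sketch combining that with pulling gl_Color into a local up front (plain reformulations reportedly don’t help, so treat this purely as an attempt to nudge the register allocator):

#pragma optimize(off)
#pragma debug(on)

uniform sampler2D tex;
varying vec2 v_rot;
void main(void)
{
	vec4 l_col=gl_Color;     // read the colour into a local as early as possible
	vec4 l_uv=vec4(0.5,0.5,gl_PointCoord.x-0.5,gl_PointCoord.y-0.5);
	l_uv.x+=v_rot.x*l_uv.z;
	l_uv.y+=v_rot.x*l_uv.w;
	l_uv.y+=v_rot.y*l_uv.z;
	l_uv.x-=v_rot.y*l_uv.w;
	gl_FragColor=texture2D(tex,l_uv.xy)*l_col;
}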

Thanks in advance!

ATI have always been pretty good at getting back to me on stuff like this.
If you can show them a specific problem, which you seem to have well documented here, they seem only too happy to deal with it.

Have you forwarded this latest info to them?

Hi. I’m new to these forums but found this thread after having some difficulties with OpenGL in an Autodesk Maya plugin. My problem sounds extremely similar…

If I use gl_Color or any of the non-essential OpenGL built-ins (gl_Vertex and gl_FragColor are OK, though), then I run into problems. The shader will work at first, but after certain other rendering calls it will suddenly fail. In my case it is certain Maya rendering operations that somehow break MY shaders. The problem appears to be in the linking of attributes/varyings between the mesh data and the vertex shader, and between the vertex shader and the fragment shader.

A typical artifact I’ll see is the position being used as UVs, or the colour used as a normal etc. This problem ONLY occurs on HD ATI cards (tested various 38xx and 48xx cards), older ATI cards and all NVidia cards do not have this problem, which matches what was said earlier about them using a new OpenGL dll for these cards. It’s as though some internal state tracking is bugged and the driver ends up patching the vertex data incorrectly.

I worked around my problem by using generic vertex attribute pointers for all my attributes and not referencing gl_Color/gl_MultiTexCoord/gl_Normal etc.
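
In case it helps anyone else landing here (the OP has said he wants to keep glColorPointer, so this isn’t aimed at him): a minimal sketch of that kind of workaround, with everything fed through generic attributes. The attribute name and the client-side calls in the comments (glBindAttribLocation before linking, glVertexAttribPointer instead of glColorPointer) are my own illustration, not code from this thread.

// Vertex shader: no gl_Color / gl_MultiTexCoord*, only generic attributes
attribute vec4 a_color;   // client side: glBindAttribLocation(prog,1,"a_color") + glVertexAttribPointer(1,...)
varying vec4 v_color;
varying vec2 v_rot;
void main(void) {
  vec4 l_position=gl_Vertex;            // gl_Vertex itself seems to be safe
  v_rot=normalize(l_position.zw);
  l_position.z=0.0;
  l_position.w=1.0;
  v_color=a_color;
  gl_Position=gl_ModelViewProjectionMatrix*l_position;
}

// Fragment shader: only user-defined varyings
uniform sampler2D tex;
varying vec4 v_color;
varying vec2 v_rot;
void main(void) {
  vec2 c=gl_PointCoord.xy-vec2(0.5);
  vec2 uv=vec2(v_rot.x*c.x-v_rot.y*c.y,
               v_rot.y*c.x+v_rot.x*c.y)+vec2(0.5);
  gl_FragColor=texture2D(tex,uv)*v_color;
}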

ATI might not have been able to reproduce the problem because it only fails after certain OTHER GL calls (which I can’t pin down because they happen inside Maya). So my test case is pretty complex and probably of no use to them in reproducing the problem :-/

Ali Brown

Indeed, sounds like my problem.

Yes.

Not an option for me.

Yes, in my test-app this case is highlighted (so they should be able to reproduce it if they knew what they were doing…). In fact, the usage of

glTexEnvi(GL_POINT_SPRITE_ARB,GL_COORD_REPLACE_ARB,GL_TRUE);

results in buggy behaviour while

glTexEnvi(GL_POINT_SPRITE_ARB,GL_COORD_REPLACE_ARB,GL_FALSE);

results in a correctly working shader… Man, I really would like to see this trashy driver-spaghetti-code…

I gave up contacting ATI support. I sent well-documented source code (really small, stripped to the core) and described the problem hundreds of times, only to find out they didn’t understand any of it and aren’t even able to follow the usage of #define in simple C++ source code.

When you check the Catalyst feedback forum you can easily see that I’m not the only one who thinks ATI’s driver support is just incompetent.

I haven’t checked the latest 9.3 driver. I just read the release notes, and since they don’t mention anything relevant I bet the bug remains (like trillions of others).

Could you please check out this test-app?
ATI bug app

Out of interest, why can’t you just get rid of gl_Color and interpolate your own generic ‘varying’?

I ran your sample program on mine and the results were odd. The first define (SHADER_UV_ERROR) caused the particles to appear much too small compared to your example screenshots in this thread. After looking more closely, it seems the UVs are messed up (only the top-right few pixels of the quad show the texture). Using either of the other defines didn’t show anything on screen at all for me (the UVs were messed up in a different way, such that it always accesses a transparent part of the texture).

So I can’t get the correct results from the test-app at all :-/

Ali Brown

Stuffing gl_Color into a varying of my own in the VS doesn’t make any difference. And, well, I just don’t want to use another interface to transfer my colors to the shaders;
I want glColorPointer.

Thanks for checking it out. The messed-up UVs are the same effect I get; only the actual resulting trashy shape differs sometimes.

It didn’t even work without using gl_Color? Great… Welcome to ATI point-sprite/GLSL hell!