PDA

View Full Version : GLSL performance problems



Ketracel White
09-29-2005, 10:27 AM
I am trying to implement some simple radial fog using GLSL but I am running into performance issues. Depending on the complexity of a scene I get a performance hit of 30-40% on my Geforce 6800.

These are the shaders I am using:


static const char * vshader =
    "varying vec4 position;\n"

    "void main()\n"
    "{\n"
    "    gl_BackColor = gl_FrontColor = gl_Color;\n"
    "    gl_TexCoord[0] = gl_TextureMatrix[0] * gl_MultiTexCoord0;\n"
    "    gl_Position = position = ftransform();\n"
    "}\n"
;

static const char * fshader =
    "varying vec4 position;\n"
    "uniform vec3 fogcolor;\n"
    "uniform float fogdensity;\n"
    "uniform vec3 camera;\n"
    "uniform sampler2D tex;\n"

    "void main()\n"
    "{\n"
    "    vec4 texel = texture2D(tex, gl_TexCoord[0].st) * gl_Color;\n"

    "    float factor = exp(-fogdensity * distance(position.xyz, camera));\n"
    "    vec3 fogged = fogcolor * (1.0 - factor) + texel.rgb * factor;\n"

    "    gl_FragColor = vec4(fogged, texel.a);\n"
    "}\n"
;

The biggest hit comes from the 'distance' call, but I can't find a way to get rid of it. Any ideas?

execom_rt
09-29-2005, 10:57 AM
Use the gl_FogFragCoord instead:

VS:

float FogEyeRadial(vec4 Rh)
{
    vec4 Re = Rh / Rh.w;
    return length(Re.xyz);
}

void main()
{
    ...
    gl_FogFragCoord = FogEyeRadial(gl_ModelViewMatrix * gl_Vertex);
    ...
}

And in your fragment shader (optional):

gl_FragColor = mix(gl_Fog.color, c0, clamp((gl_Fog.end - gl_FogFragCoord) * gl_Fog.scale, 0.0, 1.0));

with c0 the final texel color before 'fogging'. (GLSL has no saturate(); clamp to [0.0, 1.0] instead.)

Drawback: GLSL fog is buggy on ATI.

Ketracel White
09-29-2005, 11:34 AM
Doesn't that do a linear interpolation between vertices? That's not really what I want and if it doesn't work on ATI it is of no help.

The main reason I developed this code is because 'normal' fog is totally screwed up on ATI cards.

dronus
09-29-2005, 01:19 PM
Usually fog is soft enough that calculating it per vertex is sufficient, unless you have very few, really large polys. It's always a good idea to split overly large polys into smaller ones when fogging and lighting per vertex; only if you want maximum quality should you shade and fog per fragment.
The ATI problem can maybe be fixed by doing your self-made distance computation inside the vertex shader and passing it through a self-defined varying instead of the built-in gl_FogFragCoord.
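A sketch of that suggestion (untested): the uniform 'camera' is assumed to be supplied in the same space as gl_Vertex, and the varying name is made up:

```glsl
// Per-vertex radial distance through a user-defined varying,
// sidestepping the buggy gl_FogFragCoord path on ATI.
varying float fogdist;
uniform vec3 camera;   // assumed: camera position in object space

void main()
{
    gl_FrontColor = gl_Color;
    gl_TexCoord[0] = gl_TextureMatrix[0] * gl_MultiTexCoord0;
    fogdist = distance(gl_Vertex.xyz, camera);   // computed once per vertex
    gl_Position = ftransform();
}
```

The fragment shader then just reads the interpolated value, e.g. `float factor = exp(-fogdensity * fogdist);`, instead of calling distance() per fragment.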

Ketracel White
09-29-2005, 01:52 PM
The problems I am having are caused by large geometry. Splitting it into smaller chunks would not only be a major chore but would also inflate the vertex count to proportions where it would cause noticeable performance hits of its own.

If that is the only option I'll leave the shader as it is but add an option to use normal depth based fog instead.

zed
09-29-2005, 04:35 PM
Instead of distance() use dot(v, v). The values are larger, so you will need to alter your equation, but it's faster since it avoids the square root.
Also, I find linear fog gives a good enough result :)
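Applied to the original fragment shader, that idea might look like the sketch below. Since dot() yields the squared distance, the curve becomes GL_EXP2-style fog, exp(-(density * d)^2), rather than GL_EXP-style, and fogdensity is assumed to be squared on the CPU beforehand:

```glsl
vec3 v = position.xyz - camera;
// dot(v, v) is the squared distance; no sqrt as in distance().
// exp(-density^2 * d^2) is exactly the classic GL_EXP2 fog curve,
// so 'fogdensity' here is assumed to hold density * density.
float factor = exp(-fogdensity * dot(v, v));
vec3 fogged = fogcolor * (1.0 - factor) + texel.rgb * factor;
```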

Humus
09-29-2005, 07:22 PM
Exchange exp() with exp2() and premultiply fogdensity by 1.442695 (that's 1.0 / ln(2.0)); since exp(x) == exp2(x * 1.442695) and exp2 is the native instruction, folding the constant into the uniform saves a multiply.

I'd also express the lerp as mix() rather than typing out the math. Not sure if it makes any difference, though.
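Both tweaks applied to the original fragment shader would look roughly like this sketch (assuming fogdensity is premultiplied by 1.442695 on the CPU):

```glsl
// exp(x) == exp2(x * 1.442695); the constant is folded into fogdensity
// on the CPU, removing one multiply from the shader.
float factor = exp2(-fogdensity * distance(position.xyz, camera));
// mix(a, b, t) == a * (1.0 - t) + b * t -- same math as before.
vec3 fogged = mix(fogcolor, texel.rgb, factor);
gl_FragColor = vec4(fogged, texel.a);
```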

execom_rt
09-29-2005, 10:52 PM
Anyway, I find it a bit stupid to write a shader just for that.
Also, all nVidia video cards have accelerated radial fog (GL_NV_fog_distance).

Maybe you should use GL_NV_fog_distance on nVidia, or write a vertex program using GL_ARB_vertex_program, which works even on ATI and will be faster than GLSL and more 'reliable'.

About the bugs in GLSL, just read this post (http://www.v3x.net/forums/viewtopic.php?t=309) about the current problems on ATI.

Of course, it depends on what you want: ease of writing, performance, or compatibility (pick only one).

Ketracel White
09-30-2005, 01:56 AM
Originally posted by execom_rt:
Anyway, I find it a bit stupid to write a shader just for that.
Also, all nVidia video cards have accelerated radial fog (GL_NV_fog_distance).

Maybe you should use GL_NV_fog_distance on nVidia.
I already tried that. It doesn't work with my geometry. All this extension seems to do is create vertex-based distance fog, and my geometry isn't suitable for that without extensive preprocessing.



or write a vertex program using GL_ARB_vertex_program, which works even on ATI and will be faster than GLSL and more 'reliable'.
Can anyone help me with that? I'm sorry to say I can't get a grasp on the assembler-like language. :( But it is what I feared: GLSL is too new to be a reliable means of doing things.

Humus
09-30-2005, 01:08 PM
Originally posted by execom_rt:
Anyway, I find it a bit stupid to write a shader just for that.
In this day and age I would consider it highly normal and desirable to use shaders for fog.

Humus
09-30-2005, 01:18 PM
Originally posted by Ketracel White:
Can anyone help me with that? I'm sorry to say I can't get a grasp on the assembler-like language. :( But it is what I feared: GLSL is too new to be a reliable means of doing things.
First of all, ARB_vp won't help, as you've already ruled out vertex fog yourself. Maybe ARB_fp. Second, your other work besides fog is just a texture lookup and a multiplication. Given that, I'm surprised it doesn't slow you down more than the 30-40% you mentioned. I doubt using ARB_fp will improve much, if anything. Did you try my exp2() suggestion? It should improve performance. Another suggestion would be to interpolate the view vector instead of interpolating the position and computing the view vector in the fragment shader. That should cut it down another instruction.
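The view-vector suggestion might look like this sketch (assuming 'camera' is supplied in the same space as gl_Vertex; the varying name is made up):

```glsl
// Vertex shader: interpolate the view vector itself...
varying vec3 viewvec;
uniform vec3 camera;   // assumed: camera position in object space

void main()
{
    gl_BackColor = gl_FrontColor = gl_Color;
    gl_TexCoord[0] = gl_TextureMatrix[0] * gl_MultiTexCoord0;
    viewvec = gl_Vertex.xyz - camera;   // the subtraction moves to the VS
    gl_Position = ftransform();
}

// ...so the fragment shader needs only length(), not a subtraction + length:
//     float factor = exp2(-fogdensity * length(viewvec));
```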

execom_rt
09-30-2005, 03:50 PM
So for you, there are three solutions:

1) Write a fixed-pipeline solution. It will always be faster than using the programmable pipeline, especially if you are just doing fog + simple lighting. This works on all hardware (ATI, nVidia and others).

2) Write a GLSL vertex shader only, which computes the fog coordinate, and use the fixed pipeline for the per-pixel fogging. Remember that you can use a GLSL vertex shader without a fragment shader: it works fine, it is as fast as the fixed pipeline, and it works even on a GeForce 4MX.

The only drawback is that on ATI your GLSL shader will be forced into software rendering as soon as the driver detects that you are writing to the fog fragment coordinate (even if linking reports that the shader will run in hardware). Or, at worst, unexpected results will occur.

Note that on MacOS X it works fine, as their GLSL implementation for ATI is far better than the one available on Windows (so it's not a hardware limitation, just a software problem).

Writing the code using ARB_vertex_program would work on both ATI and nVidia, but it requires some changes to your code, so I wouldn't recommend it. Actually, on nVidia a GLSL program is converted into NV_vertex_program internally, and it is possible to convert a GLSL program into an ARB vertex program quite easily.

3) Write a vertex shader + fragment shader (your current solution). It's the slowest, but has the best quality. It works only on ATI and nVidia. You can use the tweak from Humus to speed the shader up a bit.

So the best thing to do:
If running on nVidia or on MacOS X (if you plan to port your code to the Mac), use the vertex-shader-only solution for maximum performance; if running on ATI, use your current solution. It will render exactly the same, except that on ATI/Windows it will use a slower but compatible path.
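A sketch of solution 2: a vertex-only GLSL program that writes gl_FogFragCoord and leaves the per-pixel blend to fixed-function fog (set up with the usual glFogi/glFogf calls on the CPU side). Note the ATI/Windows caveat above about writing gl_FogFragCoord:

```glsl
// Vertex shader only -- no fragment shader attached, so texturing and the
// fog blend itself run on the fixed pipeline using this coordinate.
void main()
{
    gl_FrontColor = gl_Color;
    gl_TexCoord[0] = gl_TextureMatrix[0] * gl_MultiTexCoord0;
    vec4 eyePos = gl_ModelViewMatrix * gl_Vertex;
    gl_FogFragCoord = length(eyePos.xyz);   // radial eye-space distance
    gl_Position = ftransform();
}
```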

Ketracel White
09-30-2005, 11:02 PM
Originally posted by execom_rt:
So for you, there are three solutions:

1) Write a fixed-pipeline solution. It will always be faster than using the programmable pipeline, especially if you are just doing fog + simple lighting. This works on all hardware (ATI, nVidia and others).
The problem is: it does not! Fog looks totally screwed up on ATI cards, as if the driver isn't preparing the fog coordinates properly. This is only noticeable with rather large polygons that get clipped by the screen's borders, which is why I started this approach in the first place.
Furthermore, on NVidia regular fog is visibly inferior to shader-based fog; the banding artifacts can become quite visible.



2) Write a GLSL vertex shader only, which computes the fog coordinate, and use the fixed pipeline for the per-pixel fogging. Remember that you can use a GLSL vertex shader without a fragment shader: it works fine, it is as fast as the fixed pipeline, and it works even on a GeForce 4MX.
Which still doesn't remove the problem I am having on ATI.



The only drawback is that on ATI your GLSL shader will be forced into software rendering as soon as the driver detects that you are writing to the fog fragment coordinate (even if linking reports that the shader will run in hardware). Or, at worst, unexpected results will occur.
That is one of the reasons I completely circumvented the fog variables.



Note that on MacOS X it works fine, as their GLSL implementation for ATI is far better than the one available on Windows (so it's not a hardware limitation, just a software problem).
Interesting. What's up with ATI and their stupid driver bugs anyway? I've never seen an NVidia driver screw up basic features like fog.


So the best thing to do:
If running on nVidia or on MacOS X (if you plan to port your code to the Mac), use the vertex-shader-only solution for maximum performance; if running on ATI, use your current solution. It will render exactly the same, except that on ATI/Windows it will use a slower but compatible path.
On NVidia I can just as easily use the fixed pipeline. The only drawback is that it doesn't look as good, but aside from that it works fine. I just would have liked the nicer look of the shader without the performance hit.
On ATI I really wouldn't bother anymore if regular fog actually worked for them.

Right now it seems that GLSL is quite far ahead of the capabilities of current hardware. Half the stuff I tried slowed rendering down even more than the rather condensed code I posted at the start of the thread. It's no fun developing such stuff if you can't use the really interesting features. :(

canuckle
10-06-2005, 09:12 PM
The problems I am having are caused by large geometry. Splitting it into smaller chunks would not only be a major chore but would also inflate the vertex count to proportions where it would cause noticeable performance hits of its own.
Are you sure about the performance hit? My guess is that your program is so fill-rate limited that moving instructions from the fragment program to the vertex program would speed things up even if you added a lot more vertices.

Although most would agree fog is probably not a good enough reason to add a lot of vertices to your geometry.