I’ve written my first per-pixel lighting shader, and it’s not optimised. I’ve looked at it and tried moving some things to the vertex shader, but I lose visual quality.
First, you don’t need to normalize the vectors in the vertex shader, since you will do it in the fragment shader anyway: they WILL lose unit length when interpolated for each fragment, so you can save that work.
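A minimal GLSL sketch of that setup (the variable names here are just illustrative, not from the original shader):

```glsl
// --- vertex shader: pass the vectors through WITHOUT normalizing ---
varying vec3 normal, lightVec;
uniform vec3 lightPosEye;        // assumed: light position in eye space

void main()
{
    vec3 eyePos = vec3(gl_ModelViewMatrix * gl_Vertex);
    normal   = gl_NormalMatrix * gl_Normal;  // no normalize() here
    lightVec = lightPosEye - eyePos;         // no normalize() here
    gl_Position = ftransform();
}

// --- fragment shader: normalize once per fragment, after interpolation ---
varying vec3 normal, lightVec;

void main()
{
    vec3 N = normalize(normal);   // interpolated vectors come out shorter
    vec3 L = normalize(lightVec); // than unit length, so fix them here
    gl_FragColor = vec4(vec3(dot(N, L)), 1.0);
}
```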
The second point is the if-statement. Unless you have an SM3.0-compliant card you won’t benefit from it, and it can even be slower than just calculating the specular variable for every fragment.
If you want high speed, do what zed says and normalize the vectors only in the vertex shader.
If you don’t want the resulting loss of quality, normalize them in the fragment shader; then you can get rid of the normalizations in the vertex shader, like wizzo says.
But there is another big optimisation:
a dot product can never become more than 1.0. That’s why you don’t need to clamp it. Use max(dot(), 0.0) instead, so it can’t become less than 0.0.
Just to correct splat: a dot product of normalized vectors is always in the range [-1, 1]. I guess he meant the same thing, though.
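In GLSL that diffuse term looks like this (a sketch; lightColor and baseColor are placeholder names):

```glsl
// N and L are assumed to be unit length, so dot(N, L) is already <= 1.0;
// max() only has to cut off the negative (back-facing) side.
float NdotL = max(dot(N, L), 0.0);
vec3 diffuse = NdotL * lightColor * baseColor;
```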
You can try speeding your shader up by using a normalization cubemap. On non-SM3 hardware, and without using half precision, there is no fast native normalize instruction AFAIK.
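A sketch of the cubemap trick, assuming a cube map (here called normCube) whose texels store the normalized version of their own lookup direction, range-compressed into [0, 1]:

```glsl
uniform samplerCube normCube;  // precomputed normalization cube map (assumed)
varying vec3 lightVec;         // unnormalized, interpolated from the VS

void main()
{
    // the lookup vector doesn't have to be unit length; the texel
    // already holds the normalized direction, so just decode it
    vec3 L = textureCube(normCube, lightVec).rgb * 2.0 - 1.0;
    // ... use L in the lighting math instead of normalize(lightVec)
    gl_FragColor = vec4(L, 1.0);
}
```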
Another hint: if you have an NVIDIA card, use half! Its range is limited (max around ±65504) but it’s fine for unit-length vectors, scalars, and colors (texture values).
Originally posted by splat:
But there is another big optimisation:
a dot product can never become more than 1.0. That’s why you don’t need to clamp it. Use max(dot(), 0.0) instead, so it can’t become less than 0.0.
I am also trying to optimize my per-pixel lighting, and for me (GeForce FX 5800) using max(dot(), 0.0) is slower than using clamp(dot(), 0.0, 1.0).
Probably the clamp turns into a dot instruction with the saturation modifier? (If I remember correctly, there is a _sat modifier.)
Using a normalization cubemap is a big performance boost on a GeForce FX. Or it is for me, anyway…