GL ES: Fragment shader optimization

Summary:
I get an FPS slowdown as soon as I try to tint the sprites (i.e., multiply the texture with a color in the fragment shader).

Details:

Hardware: iPod touch 4

I am using a 64×64 PNG texture with an alpha channel (a smiley face with a drop shadow), rendered with glEnable(GL_BLEND).

I am drawing 700 sprites on the screen using glDrawArrays, and yes, I am batching all of them into a single draw call. The vertex data structure looks like this:


    struct Vertex {
        float Position[2];   // x, y
        float Color[4];      // r, g, b, a (per-sprite tint)
        float Texture[2];    // u, v
    };

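For reference, the interleaved attributes are bound along these lines (a sketch, not my exact code; vertexBuffer and the attribute-location variables, e.g. from glGetAttribLocation, are assumed):

    // Sketch: interleaved attribute setup for the Vertex struct above.
    // (offsetof comes from <stddef.h>; the slot variables are illustrative.)
    glBindBuffer(GL_ARRAY_BUFFER, vertexBuffer);
    glEnableVertexAttribArray(positionSlot);
    glVertexAttribPointer(positionSlot, 2, GL_FLOAT, GL_FALSE,
                          sizeof(struct Vertex), (void *)offsetof(struct Vertex, Position));
    glEnableVertexAttribArray(colorSlot);
    glVertexAttribPointer(colorSlot, 4, GL_FLOAT, GL_FALSE,
                          sizeof(struct Vertex), (void *)offsetof(struct Vertex, Color));
    glEnableVertexAttribArray(texCoordSlot);
    glVertexAttribPointer(texCoordSlot, 2, GL_FLOAT, GL_FALSE,
                          sizeof(struct Vertex), (void *)offsetof(struct Vertex, Texture));
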
Yes, I am sending a colour with each vertex, because I need to selectively tint some sprites but not others. This is the fragment shader I am using:


    varying lowp vec2 TexCoord;
    uniform sampler2D TextureSampler;
                                            
    void main(void)
    {
        gl_FragColor = texture2D( TextureSampler, TexCoord );
    }

Up to this point it works GREAT, giving me a full 60 FPS!

BUT

As soon as I change the fragment shader to the following (to allow tinting):


    varying lowp vec4 DestinationColor;
    varying lowp vec2 TexCoord;
    uniform sampler2D TextureSampler;
                                             
    void main(void)
    {
        gl_FragColor = texture2D( TextureSampler, TexCoord ) * DestinationColor;
    }

The performance drops to 47 FPS from this single change alone (just the multiplication by ONE vector). FPS was measured using Xcode Instruments and OpenGL Detective. Any ideas what is going on?

Thanks.

Try using glDrawElements with an appropriate index list.
Also try increasing the complexity of the fragment shader (e.g. by multiplying with one more uniform vec4) to see whether the performance continues to scale down.
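A sketch of what the indexed version might look like for quad sprites (SPRITE_COUNT and the corner ordering are illustrative assumptions, not from the original post):

    // Illustrative: 4 shared vertices + 6 indices per quad instead of 6
    // independent vertices, which cuts vertex-fetch bandwidth.
    // Assumes each sprite's corners are ordered: top-left, top-right,
    // bottom-left, bottom-right. 700 sprites * 4 = 2800 vertices, which
    // still fits in GL_UNSIGNED_SHORT indices.
    enum { SPRITE_COUNT = 700 };
    GLushort indices[SPRITE_COUNT * 6];
    for (int i = 0; i < SPRITE_COUNT; ++i) {
        GLushort base = (GLushort)(i * 4);
        indices[i * 6 + 0] = base + 0;
        indices[i * 6 + 1] = base + 1;
        indices[i * 6 + 2] = base + 2;
        indices[i * 6 + 3] = base + 2;
        indices[i * 6 + 4] = base + 1;
        indices[i * 6 + 5] = base + 3;
    }
    glDrawElements(GL_TRIANGLES, SPRITE_COUNT * 6, GL_UNSIGNED_SHORT, indices);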

My guess is that because the Color vertex attribute is not being used, the compiler is optimizing it out and it’s not being fetched. Try multiplying by a constant instead of a varying; I expect you’ll see performance remain roughly the same (60 FPS). The extra attribute may be pushing you into memory-bandwidth limitations.
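A sketch of that experiment (your shader with the varying swapped for an arbitrary constant; everything else unchanged):

    varying lowp vec2 TexCoord;
    uniform sampler2D TextureSampler;

    void main(void)
    {
        // Constant tint instead of the varying: the compiler can drop the
        // Color attribute fetch entirely, isolating the bandwidth cost.
        gl_FragColor = texture2D(TextureSampler, TexCoord) * vec4(0.5, 0.5, 0.5, 1.0);
    }

If the frame rate returns to 60 with the constant in place, the cost is in the attribute/varying traffic rather than in the multiply itself.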

It would also be helpful to see the vertex shader.

The makers of PowerVR hardware recommend against the use of alpha testing; it interferes with hardware optimizations that take place behind the scenes. Search for the PowerVR Performance Recommendations document.

I assume that they are referring to glEnable(GL_BLEND). If you use it, you are going to see a dramatic drop in performance, and everything you add on top will make matters much worse than you’d normally expect.

Also, in the PowerVR SDK shaders I’ve noticed that they use a precision qualifier for the texture samplers:

uniform lowp sampler2D TextureSampler; instead of
uniform sampler2D TextureSampler;

I assume that if you leave the lowp out of the statement, the driver will set a default value, whatever it may be. If that default value is mediump or highp, then maybe your dramatic drop in FPS is due to the driver converting from highp to lowp.
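For what it’s worth, the GLSL ES 1.00 spec already defaults sampler2D to lowp in fragment shaders, but being explicit costs nothing. A sketch of both spellings:

    // Set the default precision for every sampler2D in this shader...
    precision lowp sampler2D;
    uniform sampler2D TextureSampler;

    // ...or qualify the single declaration directly instead:
    // uniform lowp sampler2D TextureSampler;

    varying lowp vec2 TexCoord;

    void main(void)
    {
        gl_FragColor = texture2D(TextureSampler, TexCoord);
    }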

47 FPS seems to be the number that the iPhone 4 will drop down to when under load.

Why are you passing color as an attribute instead of a uniform?

Can you post your vertex shader?

[QUOTE=marcClintDion;1251377]The makers of PowerVR hardware recommend against the use of alpha testing; it interferes with hardware optimizations that take place behind the scenes. Search for the PowerVR Performance Recommendations document.

I assume that they are referring to glEnable(GL_BLEND).[/QUOTE]

I assume that they were referring to alpha testing, i.e. glEnable(GL_ALPHA_TEST) and glAlphaFunc().

With blending, fragments always update the depth buffer even if they leave the colour buffer unchanged (when alpha is zero). Alpha-testing causes fragments to be conditionally discarded based upon the alpha value, and discarded fragments don’t update the depth buffer.
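Worth noting: OpenGL ES 2.0 removed GL_ALPHA_TEST and glAlphaFunc() entirely, so on this hardware an alpha test has to be written as a discard in the fragment shader. A sketch (the 0.5 threshold is arbitrary):

    varying lowp vec2 TexCoord;
    uniform sampler2D TextureSampler;

    void main(void)
    {
        lowp vec4 color = texture2D(TextureSampler, TexCoord);
        // Emulated alpha test: nearly transparent fragments are discarded
        // and never touch the depth buffer.
        if (color.a < 0.5)
            discard;
        gl_FragColor = color;
    }

On PowerVR, discard itself defeats the deferred hidden-surface-removal optimizations, which is exactly why the Performance Recommendations advise against alpha testing.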
