OpenGL ES 2 shader on iPad



billconan
07-30-2010, 11:37 AM
Hello everyone,

I'm using GLES2 for my iPad project. Right now I'm working on duplicating the OpenGL fixed-function pipeline, but I've run into some problems.

I want to support more lights in the scene, so I store all the lights in an array and group them by type within the array (directional, point and spot lights).

So my lighting function has three loops (simplified pseudocode):

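void computeLights()
{
    // Lights are grouped by type; each group occupies
    // [beginOfX, beginOfX + numOfX) in the light array.
    for (int i = beginOfDirectionalLights; i < beginOfDirectionalLights + numOfDirectionalLights; ++i)
    {
        ComputeDirectionalLight(...);
    }

    for (int i = beginOfSpotLights; i < beginOfSpotLights + numOfSpotLights; ++i)
    {
        ComputeSpotLight();
    }

    for (int i = beginOfPointLight; i < beginOfPointLight + numOfPointLights; ++i)
    {
        ComputePointLight();
    }
}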




This shader code works perfectly on the iPad simulator, but on the iPad device I couldn't see anything.

I tested each of the three light functions and also my input values. They are all correct.

However, if I remove two of the three loops, the shader starts to work on the iPad.

It seems that the OpenGL ES 2 shader compiler on the iPad doesn't support three loops inside one function.

I'm aware that a shader cannot be too large (complex), otherwise the registers will be used up.

But if that were the case, I should get some compile error, and I got nothing.

So right now I've changed my code into this version, which works but is pretty slow:


const int numLights = 8;

void computeLights()
{
    for (int i = 0; i < numLights; ++i)
    {
        // get light i from the light array; currentLightType is its type

        if (currentLightType == DirectionalLight)
        {
            ComputeDirectionalLight();
        }
        else if (currentLightType == SpotLight)
        {
            ComputeSpotLight();
        }
        else if (currentLightType == PointLight)
        {
            ComputePointLight();
        }
        else if (currentLightType == EmptyLight)
        {
            // do nothing here
        }
    }
}





The problem with this code is that it runs very slowly even when all lights are of the EmptyLight type.

Because GLES2 does not allow a dynamic uniform value to set the loop count, I fixed the number of lights at 8 and loop over the entire light array no matter how many lights are defined. Whether a slot is empty, as well as the type of a real light, is determined by the value "lightType".

I hard-coded every light to be empty, but the performance is still low.

Does this have something to do with loop unrolling? Is there any solution to this?

thank you very much in advance.

kRogue
07-31-2010, 12:07 AM
Is that in the vertex or fragment shader? The GLSL spec for GL ES 2 explicitly says that, in a fragment shader, a vendor does not need to support uniform array access if the index cannot be determined at compile time, i.e.:



varying float I;
uniform vec3 array[10];
.
.
.
v = array[int(I)];   // index depends on a varying, so it is not known at compile time



does not need to be supported in a fragment shader.

Also, doing if/else in a fragment shader: typically the GPU will execute all branches and then choose the correct result.

There are some tricks though:
(1) You can do spot lights and point lights with the same code. Compute d=max(0.0, dot(L,N)) as always, then compute a=smoothstep(cosCutoffStart, cosCutOff, dot(-L, D)) [L=normalized vector from pixel to light, D=direction vector of the light] and use a as the spot factor. For point lights, set both cosCutoffStart and cosCutOff to values below -1 (with cosCutoffStart < cosCutOff), so that a is 1 in every direction. If you do not want a smooth cutoff, use step(cosCutOff, dot(-L, D)) instead.

(2) a directional light can be realized as a point light: have each light position stored as a vec4 (x,y,z,w). Non-directional lights have w=1.0 and (x,y,z) the position. Directional lights have w=0 and (x,y,z) normalized.

This will avoid all the ifs.
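To make tricks (1) and (2) concrete, here is a rough CPU-side sketch in C++ rather than real shader code, so the arithmetic can be checked off-device. The names Vec3, Light and lightIntensity are made up for illustration, and the struct layout is only an assumption about how the light data might be packed. The helper functions mirror the GLSL built-ins of the same name.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator-(Vec3 a) { return {-a.x, -a.y, -a.z}; }
inline Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 normalize(Vec3 a) { return a * (1.0f / std::sqrt(dot(a, a))); }

// Same definition as GLSL smoothstep(edge0, edge1, x).
inline float smoothstep(float edge0, float edge1, float x) {
    float t = std::min(1.0f, std::max(0.0f, (x - edge0) / (edge1 - edge0)));
    return t * t * (3.0f - 2.0f * t);
}

// Hypothetical light record: one layout for all three light types.
struct Light {
    Vec3 position;         // w = 1: position; w = 0: normalized direction towards the light
    float w;
    Vec3 spotDirection;    // D, the direction the light points in
    float cosCutoffStart;  // cosine where the spot cone starts to fade in
    float cosCutOff;       // cosine where the spot cone reaches full strength
};

// Diffuse term times spot factor, with no branch on the light type.
inline float lightIntensity(const Light& light, Vec3 P, Vec3 N) {
    // With w = 0 the surface position drops out and position acts as a direction.
    Vec3 L = normalize(light.position - P * light.w);
    float d = std::max(0.0f, dot(L, N));
    float a = smoothstep(light.cosCutoffStart, light.cosCutOff, dot(-L, light.spotDirection));
    return d * a;
}
```

A directional or point light would be set up with both cutoffs below -1 (for example -3 and -2), so the spot factor is always 1; a spot light uses the cosines of its outer and inner cone angles.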

By the way, the PowerVR SGX 535's GLSL compiler is nowhere near as advanced as the NVIDIA GLSL compiler you are likely using on your Mac (it does not move instructions around to hide latency, etc.). For that matter, that GPU needs to be used differently to get optimal performance: go to the Imagination Technologies website, register (for free) and get their SDK examples. Apple also has quite a few docs (mostly replicating the Imagination Technologies docs) on what hurts performance on that GPU [discard, for one, hurts performance on that GPU].

Xmas
08-02-2010, 04:27 AM
billconan wrote: "because GLES2 does not allow to use a uniform dynamic value to specify the loop times"

That's not quite correct. GLSL ES 1.00 does not mandate support for non-const loop bounds, but they're not disallowed either. They should in fact be supported on the iPad. The same is true for arbitrary array indexing in both vertex and fragment shaders.
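For drivers that only accept the constant loop bounds the spec guarantees, a pattern often used is to keep the compile-time bound and leave the loop early once the active light count is reached. The sketch below shows the control flow in C++ on the CPU; MAX_LIGHTS, shadeLight and the iterations counter are made-up names for illustration, and whether the early exit actually saves time on a given GPU depends on its compiler.

```cpp
const int MAX_LIGHTS = 8;  // compile-time bound, matching the fixed-size light array

// Hypothetical stand-in for the per-light shading work.
inline float shadeLight(const float* intensities, int i) { return intensities[i]; }

// numActiveLights plays the role of a uniform set by the application.
// iterations only exists so the sketch can report how much work was done.
inline float computeLights(const float* intensities, int numActiveLights, int* iterations) {
    float total = 0.0f;
    *iterations = 0;
    for (int i = 0; i < MAX_LIGHTS; ++i) {
        if (i >= numActiveLights) break;  // skip the unused tail of the array
        total += shadeLight(intensities, i);
        ++*iterations;
    }
    return total;
}
```

This relies on the active lights being packed at the front of the array, so no per-light "empty" check is needed.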

However, depending on your application's requirements, it may be better to simply generate an optimised vertex shader for each combination of light types that actually gets used.
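One way to generate such variants, sketched here under assumptions, is to build a small #define preamble on the application side and compile it together with a shared shader body. The macro names and the buildLightPreamble function are hypothetical; the shader body would use the macros as constant loop bounds. glShaderSource accepts an array of strings, so the preamble and the body can be passed as two separate strings (if the body starts with a #version line, that line has to stay first).

```cpp
#include <string>

// Builds the preamble for one combination of light counts.
// The macro names are illustrative; the shader body must use the same ones.
inline std::string buildLightPreamble(int numDirectional, int numSpot, int numPoint) {
    std::string preamble;
    preamble += "#define NUM_DIRECTIONAL_LIGHTS " + std::to_string(numDirectional) + "\n";
    preamble += "#define NUM_SPOT_LIGHTS " + std::to_string(numSpot) + "\n";
    preamble += "#define NUM_POINT_LIGHTS " + std::to_string(numPoint) + "\n";
    return preamble;
}
```

The preamble string also works as a cache key, so each combination is compiled once and reused.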
