Shader Loading Uniforms Expensive. Alternatives

Hello, I am currently working on an OpenGL project written in C++. At the moment I have a fairly solid scene that loads a bunch of things: terrains, models, animated models, a skybox, etc. Recently I benchmarked the project and found that loading uniforms can be quite expensive. Currently I use one shader per structure type, so:

  • For Terrains I use 1 shader (vertex + fragment)
  • For Models I use 1 shader (vertex + fragment)
  • Etc.

So for multiple models to share one shader, I have to re-upload the uniform data and bind the VAO for each model every frame before the model is rendered. But according to the benchmark, the per-frame uniform loading takes about 40% of the time, which is quite a lot.

My question is: should I use a different shader for each model and load the data that won’t change into each shader ONCE before the main loop? Of course there will be some special cases, which can be handled when needed. Is this good practice? Is this the way to go?

Should I use a different shader for each model and load the data that won’t change into each shader ONCE before the main loop?

If those shaders are the same shader (i.e. you’d be building multiple copies of the same program), no. Changing which program object is bound is far more expensive in OpenGL than changing uniform state.

First, you seem to be benchmarking CPU time. While that is certainly important, it’s not the most critical thing. Also, you report your CPU time as being “40%~”. Well, what is the time in absolute numbers? Because if it’s 1ms, and your total CPU time is 2.5ms, then you’re probably OK.

Second, it’s not clear whether what you’re seeing represents good use of the API or a pathological use of it. Are you calling glGetUniformLocation every time you want to change uniforms? If so, that would explain quite a lot.

Third, if all else fails, upgrade to UBOs. That way, it’s just one bind to change all the uniform values. You can map the buffer once and update uniforms for lots of objects at once, then bind sections of the buffer when it comes time to render.
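
A rough sketch of the CPU side of the UBO approach, under some assumptions: `MaterialBlock`, `alignUp` and `packBlocks` are hypothetical names, the struct layout is assumed to match a std140 uniform block in the shader, and the alignment value would in practice be queried via GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT. The GL calls themselves are left as comments because they need a live context.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical per-object data; the layout is assumed to match a std140
// uniform block declared in the shader (vec4, vec4, float + padding).
struct MaterialBlock {
    float diffColor[4];
    float specColor[4];
    float specFactor;
    float pad[3];
};

// Round `size` up to the next multiple of `alignment`.
// `alignment` would be the value of GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT.
inline std::size_t alignUp(std::size_t size, std::size_t alignment) {
    assert(alignment > 0);
    return (size + alignment - 1) / alignment * alignment;
}

// Pack every object's block into one staging buffer with an aligned stride.
// Object i then lives at byte offset i * strideOut.
inline std::vector<unsigned char> packBlocks(const std::vector<MaterialBlock>& blocks,
                                             std::size_t alignment,
                                             std::size_t& strideOut) {
    strideOut = alignUp(sizeof(MaterialBlock), alignment);
    std::vector<unsigned char> staging(blocks.size() * strideOut, 0);
    for (std::size_t i = 0; i < blocks.size(); ++i)
        std::memcpy(staging.data() + i * strideOut, &blocks[i], sizeof(MaterialBlock));
    return staging;
}

// With a context available, the staging data would be uploaded in one go
// (glBufferData / glBufferSubData, or through a mapped pointer), and before
// drawing object i one would call something like:
//   glBindBufferRange(GL_UNIFORM_BUFFER, bindingIndex, ubo,
//                     i * stride, sizeof(MaterialBlock));
```

The point of the aligned stride is that each object's slice can be bound on its own, so switching objects costs one range bind instead of a series of individual uniform uploads.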

First, yes, that was my idea: to build multiple copies of the same shader. I don’t see why this is expensive, since on every draw call in the renderer I already have shader->start and shader->stop around the VAO binding and drawing. I am using a Renderer class, and rendering two models looks like, for instance: Renderer->(Tree), Renderer->(House).
Should I limit my use of shader->start and shader->stop for shaders that are the same?
For example: start the shader, draw all models that use it, stop the shader, and move on …?
Next, I am not calling glGetUniformLocation; I have cached the uniform IDs before the main loop.
I have not tried UBOs.

The scene usually renders in approximately 5 ms, so 40% would be somewhere around 2.2 ms.



void Model::ShaderLoadUniforms()
{
	// One sampler uniform per texture; each lookup builds a string name such as "texture[0]".
	for (int i = 0; i < ModelData->textureNumber(); i++)
		activeShader->loadUniform1i(GL_TEXTURE0 + i, i, ModelData->getTbo()[i]->TBO_Id(), activeShader->getVariableID("texture[" + to_string(i) + "]"));

	// Per-material colors and specular factor, again looked up by string name.
	for (int i = 0; i < ModelData->getMtlProperty().size(); i++)
	{
		activeShader->loadUniform4fv(&ModelData->getMtlProperty()[i].diffuseColor[0], activeShader->getVariableID("diffColor[" + to_string(i) + "]"));
		activeShader->loadUniform4fv(&ModelData->getMtlProperty()[i].specularColor[0], activeShader->getVariableID("specColor[" + to_string(i) + "]"));
		activeShader->loadUniform1f(ModelData->getMtlProperty()[i].specularFactor, activeShader->getVariableID("specFactor[" + to_string(i) + "]"));
	}
	activeShader->loadUniform1f((float)ModelData->isFakeLight(), activeShader->getVariableID("fakeLight"));
}


This runs every frame for each model to upload its uniforms. According to the benchmark, the getVariableID function is causing most of the problems: ~35% of CPU time.



inline GLuint BasicShader::getVariableID(string variable)
{
	if (variable == "s_vPosition")
		return this->s_vPosition;
	else if (variable == "s_vNormal")
		return this->s_vNormal;
	else if (variable == "s_vTexCoord")
		return this->s_vTexCoord;
	else if (variable == "mTranslate")
		return this->mTranslate;
	else if (variable == "mRotate")
		return this->mRotate;
	else if (variable == "mScale")
		return this->mScale;
	else if (variable == "mProjection")
		return this->mProjection;
	else if (variable == "mView")
		return this->mView;
	else if (variable == "fakeLight")
		return this->fakeLight;
	else if (variable == "fogColor")
		return this->fogColor;
	else if (variable == "clipPlane")
		return this->clipPlane;
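	else if (variable.substr(0, variable.find('[')) == "diffColor")
	{
		// substr takes (position, length), so pass the number of characters between '[' and ']'.
		size_t open = variable.find('[');
		int dindex = atoi(variable.substr(open + 1, variable.find(']') - open - 1).c_str());
		return this->diffColor[dindex];
	}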
//.........
}

Now that I look at it, I can see this function is quite expensive. I should find a more efficient way to do this.
I use an abstract class called StaticShader, from which ModelShader, TerrainShader, etc. are derived. Each of them implements the virtual function getVariableID.

EDIT: I would still like to use the structure *abstrShader = ModelShdr, but still be able to access the ModelShader uniform locations (somehow).

EDIT2: What if, instead of a string parameter, I use an enum plus an index when the variable is an array?
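
One way the enum + index idea could look, as a minimal sketch rather than the actual code from this project. The names `Uniform`, `UniformTable` and `kMaxMaterials` are made up for illustration, and locations are stored as plain `int` with -1 meaning "not found", which is the value glGetUniformLocation returns for an unknown name.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical uniform identifiers: one enumerator per uniform or uniform array.
enum class Uniform : std::size_t { mScale, mView, fakeLight, diffColor, specColor, Count };

constexpr std::size_t kMaxMaterials = 8;  // assumed array size in the shader

class UniformTable {
public:
    // Start with every slot marked as "not found".
    UniformTable() {
        for (auto& row : locations_) row.fill(-1);
    }

    // Called once after linking, with the value glGetUniformLocation returned.
    void set(Uniform u, std::size_t index, int location) {
        assert(index < kMaxMaterials);
        locations_[static_cast<std::size_t>(u)][index] = location;
    }

    // Per-frame lookup: two array indexings, no string handling.
    int get(Uniform u, std::size_t index = 0) const {
        assert(index < kMaxMaterials);
        return locations_[static_cast<std::size_t>(u)][index];
    }

private:
    std::array<std::array<int, kMaxMaterials>,
               static_cast<std::size_t>(Uniform::Count)> locations_;
};
```

A call such as `table.get(Uniform::diffColor, 2)` would then replace `getVariableID("diffColor[2]")`, with no string construction or comparison on the per-frame path.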

Next, I am not calling glGetUniformLocation; I have cached the uniform IDs before the main loop.

Given the code you’re using, glGetUniformLocation would probably be an improvement ;)

The general idea is that you shouldn’t be using string names for uniform look up per-frame, whether you’re using glGetUniformLocation or your own mapping table from string to uniform location. Instead, you should have actual variables in an actual C/C++ data structure. So instead of:


shader.getVariableID("mScale");

You have:


shader.mScale;
shader.diffColor[2];

Of course, this means that you have to hard-code the name mapping. But then, you’re already doing that, since you’re using a chain of if/else string comparisons instead of a std::unordered_map or other hash table.
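
A small sketch of what that data structure might look like. `ModelShaderLocations`, `LocationLookup` and `kMaxMaterials` are hypothetical names, and the lookup callback is only a stand-in for a call such as glGetUniformLocation(program, name), so that the example runs without a GL context.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Stand-in for a call such as glGetUniformLocation(program, name);
// injected so the sketch does not need a GL context.
using LocationLookup = std::function<int(const std::string&)>;

constexpr int kMaxMaterials = 4;  // assumed size of the shader's arrays

// Uniform locations as plain data members (names follow the thread's shader).
struct ModelShaderLocations {
    int mScale = -1;
    int fakeLight = -1;
    int diffColor[kMaxMaterials] = {-1, -1, -1, -1};

    // String names are used here only, once, after the program is linked.
    void resolve(const LocationLookup& lookup) {
        assert(lookup != nullptr);
        mScale = lookup("mScale");
        fakeLight = lookup("fakeLight");
        for (int i = 0; i < kMaxMaterials; ++i)
            diffColor[i] = lookup("diffColor[" + std::to_string(i) + "]");
    }
};
```

After `resolve` has run once, per-frame code reads `locations.mScale` or `locations.diffColor[2]` directly, which is just a member access.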

I use an abstract class called StaticShader, from which ModelShader, TerrainShader, etc. are derived. Each of them implements the virtual function getVariableID.

This is where you run into a problem with runtime polymorphism. You shouldn’t be using that in rendering code, not if performance is a concern. Or at least, you should be using it a lot less than you are.

Really, what you should have is a virtual function that sets that shader’s particular uniforms. So instead of making dozens of virtual calls per object, you only make one. It’s ShaderLoadUniforms that should be virtual, not getVariableID.
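
To illustrate, here is a minimal sketch with one virtual call per object. The class names, the reduced `ModelData`, and the `UniformLog` recorder (which stands in for glUniform* calls so the example runs without a context) are all assumptions made for the sketch, not the poster's actual classes.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-model data, reduced to two fields for the sketch.
struct ModelData {
    float specFactor;
    bool fakeLight;
};

// Records uniform writes instead of calling glUniform*, so the sketch runs
// without a GL context.
struct UniformLog {
    std::vector<int> locations;
    void set1f(int location, float /*value*/) { locations.push_back(location); }
};

class AbstractShader {
public:
    virtual ~AbstractShader() = default;
    // The single virtual entry point: one call per object per frame.
    virtual void loadUniforms(const ModelData& data, UniformLog& out) const = 0;
};

class ModelShader final : public AbstractShader {
public:
    ModelShader(int specFactorLoc, int fakeLightLoc)
        : specFactor_(specFactorLoc), fakeLight_(fakeLightLoc) {
        assert(specFactorLoc >= 0 && fakeLightLoc >= 0);
    }

    void loadUniforms(const ModelData& data, UniformLog& out) const override {
        // Inside the override, locations are ordinary members: no further
        // virtual calls and no name lookups.
        out.set1f(specFactor_, data.specFactor);
        out.set1f(fakeLight_, data.fakeLight ? 1.0f : 0.0f);
    }

private:
    int specFactor_;
    int fakeLight_;
};
```

The renderer can still hold shaders through `AbstractShader` pointers; it just pays for one dynamic dispatch per object instead of one per uniform.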

That’s an idea. I reworked the function so that it now takes an enum and an index (for arrays) as parameters. getVariableID dropped from 35-40% to 0.2-0.3% usage, which is huge. Still, if it weren’t a virtual function, how much better would the performance get?
Strangely enough, GLFW swap buffers takes up 30-40% (it did even before the change). Is that normal?

EDIT: The reason I use an abstract shader is to benefit from polymorphism, and to be able to do something like vector<AbstrShaderGroup>, so that I can keep all shaders in one vector and load shared data by looping through them (like the view matrix for the camera, which should be updated in each shader).