Shader optimization

Hi all,

I found an interesting thing: if you get the assembly code of your shader with something like:


   // Query how many program binary formats the driver exposes, and which ones
   GLint formats = 0;
   glGetIntegerv(GL_NUM_PROGRAM_BINARY_FORMATS, &formats);
   GLint *binaryFormats = new GLint[formats];
   glGetIntegerv(GL_PROGRAM_BINARY_FORMATS, binaryFormats);

   #define GL_PROGRAM_BINARY_LENGTH 0x8741
   GLint len = 0;
   glGetProgramiv(yourShader, GL_PROGRAM_BINARY_LENGTH, &len);
   char* binary = new char[len];
   // Note: the format argument is an output parameter written by the driver
   glGetProgramBinary(yourShader, len, NULL, (GLenum*)binaryFormats, binary);
   glUseProgram(0);
   FILE* fp = fopen(name.c_str(), "wb");
   fwrite(binary, len, 1, fp);
   fclose(fp);

It’s possible to look at memory limitations in the dumped assembly:


...
TEMP lmem[12];
....

In this case, the shader needs to use lmem, which is global GPU memory ===> SLOW!

So use this to optimize your shader’s VRAM usage!
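
For convenience, here is a minimal sketch (the function name reportLocalMemory is just illustrative, and it assumes the driver returns NVIDIA’s textual assembly in the binary, which is not guaranteed) that scans a dumped file for lmem declarations:


   // Sketch: scan a dumped program binary for "lmem" declarations.
   // Assumes the blob contains textual assembly, which is
   // implementation-dependent and not guaranteed by the spec.
   #include <cstdio>
   #include <fstream>
   #include <string>

   void reportLocalMemory(const std::string& path)
   {
       std::ifstream file(path, std::ios::binary);
       std::string line;
       while (std::getline(file, line))
       {
           // Lines such as "TEMP lmem[12];" indicate local (global GPU) memory use
           if (line.find("lmem") != std::string::npos)
               std::printf("local memory declaration: %s\n", line.c_str());
       }
   }
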

That’s an interesting statement.

What makes you say that assembly-language TEMPoraries are stored in global device memory rather than in registers or shared memory? Do you have a reference you could post a link to?

If they were stored in device global memory, that would surprise me a bit. The fact that there are queries for the maximum number of temporaries supported (MAX_PROGRAM_{,NATIVE}_TEMPORARIES_ARB), and that on modern GPUs the total space allowed here is still relatively small, seems to suggest these are kept somewhere other than device global memory. Given the sizes returned, they may be kept in shared memory, but that’s a guess.
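
For reference, those limits can be queried through the old ARB program API; a minimal sketch, assuming the GL_ARB_fragment_program extension is available and glext.h is included:


   // Sketch: query the temporaries limits from the ARB program API
   // (requires the GL_ARB_fragment_program extension).
   GLint maxTemps = 0, maxNativeTemps = 0;
   glGetProgramivARB(GL_FRAGMENT_PROGRAM_ARB,
                     GL_MAX_PROGRAM_TEMPORARIES_ARB, &maxTemps);
   glGetProgramivARB(GL_FRAGMENT_PROGRAM_ARB,
                     GL_MAX_PROGRAM_NATIVE_TEMPORARIES_ARB, &maxNativeTemps);
   printf("TEMPs: %d (native: %d)\n", maxTemps, maxNativeTemps);
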

if you get the assembly code of your shader with something like :

Thanks for the tip! That’s interesting.

What I’ve done a lot in the past to generate ASM from GLSL is to use NVidia’s Cg compiler to compile the GLSL to ASM from the command-line. For instance:


> cgc -oglsl -strict -glslWerror -nocode -profile gp5vp vert.glsl
> cgc -oglsl -strict -glslWerror -nocode -profile gp5fp frag.glsl

However, they haven’t updated this in 5 years (it’s legacy and no longer supported), so it won’t compile GLSL shaders using more recent GLSL syntax and extensions. But it compiles most shaders out there just fine. Your solution is better though since it presumably still works with the latest GLSL syntax and extensions.

A word of warning: with glGetProgramBinary the binary format is allowed to be implementation-dependent, so any inferences you draw from the results on one implementation may not necessarily hold for others.

Yes, this is driver-dependent, but it’s a nice way to see if a shader is memory limited!
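
Since other drivers may return an opaque binary rather than readable assembly, a rough sanity check before parsing might look like this (just a sketch, not a spec-guaranteed test):


   // Sketch: crude check that the returned blob is mostly printable text
   // before treating it as assembly; an opaque driver binary would fail this.
   #include <cctype>
   #include <cstddef>

   bool looksLikeText(const char* data, size_t size)
   {
       if (size == 0) return false;
       size_t printable = 0;
       for (size_t i = 0; i < size; ++i)
           if (std::isprint((unsigned char)data[i]) || std::isspace((unsigned char)data[i]))
               ++printable;
       return printable > size * 9 / 10;   // > 90% printable characters
   }
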
