Shaders optimization



__bob__
05-22-2017, 03:55 AM
Hi all,

I found an interesting thing: if you get the assembly code of your shader with something like:


// GL_PROGRAM_BINARY_LENGTH comes from ARB_get_program_binary (core in GL 4.1)
#define GL_PROGRAM_BINARY_LENGTH 0x8741

// Query the binary's size, then retrieve it.
GLint len = 0;
glGetProgramiv(yourShader, GL_PROGRAM_BINARY_LENGTH, &len);
char* binary = new char[len];
GLenum binaryFormat = 0; // output: the driver writes the format it used
glGetProgramBinary(yourShader, len, NULL, &binaryFormat, binary);

// Dump it to disk so the embedded assembly text can be inspected.
FILE* fp = fopen(name.c_str(), "wb");
fwrite(binary, len, 1, fp);
fclose(fp);
delete[] binary;

It's possible to look at memory limitations:



...
TEMP lmem[12];
....

In this case, the shader needs to use lmem, which is global GPU memory ===>>>> SLOW!!!

So use it to optimize your shader's VRAM usage!!!

Dark Photon
05-22-2017, 07:28 AM
TEMP lmem[12];

In this case, the shader needs to use lmem, which is global GPU memory ===>>>> SLOW!!!

That's an interesting statement.

What makes you say that assembly-language TEMPoraries are stored in global device memory rather than in registers or shared memory? Do you have a reference you could post a link to?

If they were stored in device global memory, that would surprise me a bit. There are queries for the maximum number of these that are supported (MAX_PROGRAM_{,NATIVE}_TEMPORARIES_ARB), and even on modern GPUs the total space allowed here is still relatively small, which seems to suggest they are kept somewhere else besides device global memory. Given the sizes returned, they may be kept in shared memory, but that's a guess.


if you get the assembly code of your shader with something like:

Thanks for the tip! That's interesting.

What I've done a lot in the past to generate ASM from GLSL is to use NVidia's Cg (https://developer.nvidia.com/cg-toolkit-download) compiler to compile the GLSL to ASM from the command-line. For instance:



> cgc -oglsl -strict -glslWerror -nocode -profile gp5vp vert.glsl
> cgc -oglsl -strict -glslWerror -nocode -profile gp5fp frag.glsl


However, they haven't updated this in 5 years (it's legacy and no longer supported), so it won't compile GLSL shaders using more recent GLSL syntax and extensions. But it compiles most shaders out there just fine. Your solution is better though since it presumably still works with the latest GLSL syntax and extensions.

mhagain
05-22-2017, 10:40 AM
A word of warning: with glGetProgramBinary the binary format is allowed to be implementation-dependent, so any inferences you draw from the results on one implementation may not hold for others.

__bob__
06-26-2017, 01:11 AM
Yes, this is driver-dependent, but it's a nice way to see if a shader is memory-limited!