Is there anywhere to find more detail on how shaders are compiled? I’d like to be able to tell how many instructions each statement will compile to. I know that the CUDA docs detail the performance and instruction count of some of their functions, but I cannot seem to find anything comparable for shaders. I’m looking at NVIDIA chips primarily. I currently only have the gist of what to avoid, like sqrt and normalize, and that I should exploit MAD instructions, but I’d like to be able to quantify their performance.
Also, is there any documentation on how memory should be aligned, what vertex sizes are best, or which memory uniforms, locals, and constants are stored in? I know from CUDA that branching should not be divergent within a warp, and I read recently that this is similar in GLSL: spatially local fragments should follow the same path. Where do people get this information?
You can grab NVIDIA’s Cg and use cgc (the command-line tool) to generate the NV asm shader for your source. Very useful in diagnosing perf problems when the GLSL compiler isn’t doing something you think it should be.
Besides the asm source, the last line of the output gives you a summary of the number of asm instructions and the number of R-regs used. E.g.:
23 instructions, 3 R-regs
You can also compile geometry and tessellation shaders with this (using different profiles) in Cg 3, though I haven’t actually done so.
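If you want to track those numbers across shader revisions, something like the following could pull them out of the generated asm text. This is only a rough sketch: the function name parse_cgc_summary is made up for illustration, and the pattern assumes the summary always looks like the single example above ("N instructions, M R-regs"), which may not hold for every profile or Cg version.

```python
import re

# Assumed format, based only on the one example summary line above:
# "<N> instructions, <M> R-regs". Other profiles may print more fields.
SUMMARY_RE = re.compile(r"(\d+)\s+instructions,\s+(\d+)\s+R-regs")


def parse_cgc_summary(asm_text):
    """Return (instructions, r_regs) from the last summary line, or None.

    asm_text is the text output you saved from the compiler; this helper
    does not run the compiler itself.
    """
    matches = SUMMARY_RE.findall(asm_text)
    if not matches:
        return None
    # Use the last match, since the summary sits at the bottom of the output.
    instructions, r_regs = matches[-1]
    return int(instructions), int(r_regs)


if __name__ == "__main__":
    sample = "...asm listing here...\n23 instructions, 3 R-regs\n"
    print(parse_cgc_summary(sample))
```

Feeding it the saved output from two versions of a shader gives you a quick way to see whether a change moved the instruction or register count.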
There’s also the old NVShaderPerf. It’s Windows-only, and I don’t think it has been updated for any GPUs more recent than G80 (GeForce 8) or for recent GLSL versions (it works for GLSL 1.2, IIRC), but it gives you slightly different info. Example: