What is the advantage (in OGL 3.0) of being able to attach several (compiled) shaders to a single program object?
A shader object doesn't have to contain the complete code for a programmable stage. It's the same reason you don't necessarily put all of your C/C++ code in one file.

A shader can be a set of utility functions that are included in every program you link. You then let the linker's dead-code elimination remove the stuff it doesn't need. And, in a reasonable implementation of this construct, you won't need to do the compilation work multiple times.
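
To make the utility-shader idea concrete, here is a rough sketch in Python of what link-time dead-code elimination amounts to. It is only a toy model, not how any driver actually does it: each "shader object" is just a dict mapping a function name to the functions it calls, and every name in it (link, fog, clamp01, unpack_normal, phong) is made up for illustration.

```python
# Toy model of linking several shader objects into one program.
# Each "shader object" maps a function name to the set of functions it calls.
# All names here are hypothetical; no real GL or GLSL API is involved.

def link(shader_objects, entry="main"):
    """Merge shader objects and keep only functions reachable from entry."""
    symbols = {}
    for obj in shader_objects:
        for name, callees in obj.items():
            if name in symbols:
                raise ValueError(f"duplicate definition of {name}")
            symbols[name] = callees
    if entry not in symbols:
        raise ValueError(f"no {entry} function in any attached shader")

    # Walk the call graph from the entry point; anything never visited is dead.
    live, stack = set(), [entry]
    while stack:
        name = stack.pop()
        if name in live:
            continue
        if name not in symbols:
            raise ValueError(f"unresolved reference to {name}")
        live.add(name)
        stack.extend(symbols[name])
    return {name: callees for name, callees in symbols.items() if name in live}


# A utility shader attached to every program, whether or not it is all used.
utility_shader = {
    "fog": {"clamp01"},
    "clamp01": set(),
    "unpack_normal": set(),
    "phong": {"clamp01"},
}
main_shader = {"main": {"fog"}}

program = link([utility_shader, main_shader])
print(sorted(program))  # ['clamp01', 'fog', 'main']
```

Here unpack_normal and phong get dropped because nothing reachable from main calls them, while the utility shader itself would only have needed to be compiled once.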

But that being said, the translation from an imperative bytecode to SSA form takes only a small fraction of the compile process, so it really isn't worth it.
That kinda raises a question.

If we have the following compiler stages:

  • Compile source into SSA
  • Perform dead-code removal/inlining
  • Convert SSA to machine code


Where is the glslang compiler performance going? I mean, is compiling a C-like language into SSA form really that time-consuming? Even my fairly old computer can compile a several-thousand line .cpp file in less than a second, and it's doing optimizations, inlining, and all sorts of other stuff. Obviously if I start instantiating a bunch of templates, it takes longer, but straight C++ is pretty fast in terms of compilation. And glslang is much simpler.

Is it the dead-code removal?

Or is it simply that IHVs haven't prioritized the performance of their glslang compilers? I mean, we all know about nVidia's silly "double-compile" in their glslang. But is it simply that all IHVs have one part-timer working on their glslang compiler, such that after 2-3 years they still don't have decent implementations or compiler performance?

Maybe we should find some way of putting pressure on the IHVs. I mean, GL 3.0 has to do that, since glslang is required to do anything. But GL 3.0 adoption will be impacted by the quality of 3.0 implementations. And the quality will be impacted by the adoption. Etc.

Just for the sake of correctness, none of the above is true.
Such a fine argument you have made.

If what you say is true, then GPUs are in fact functional programming devices. Also, doing register allocation in C-style parsing is not a solved problem. And what Zengar was talking about is not functional programming.