I’ve never spoken to John; I’ve sent him a few short emails now and then, but never gotten a reply.
Gorg, you are taking a very limiting view of what a “shader” is. It is absolutely essential for the app to know and specify how a given series of computations will be implemented. For example, whether they happen at the vertex or the fragment level, whether they occur at IEEE float precision or some sort of fixed-point precision, whether they are done with a texture lookup or with real math operations, or whether you implement specular exponents using a lookup table, repeated squaring, or a power function. Apps most definitely need to specify which technique they desire. A “shader” is not just a description of the desired result, “specular bumpmapping with constant specular exponent” or some such. At the API level, a shader needs to consist of a concrete sequence of operations that compute a result. The driver does not have the semantic information about why this particular computation is being done. Remember, for one, that graphics hardware can be used for non-graphics computation; someone would be rather angry if a driver substituted a texture lookup for a power function in a scientific app.
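To make the specular-exponent example concrete, here is a minimal C sketch of my own (nothing from any proposal; the function names and the 256-entry table are made up for illustration) of three ways an implementation might evaluate a constant exponent of 8. They differ in both precision and cost, which is exactly why the app has to be the one to choose:

    #include <math.h>

    /* Full-precision power function. */
    float spec_pow(float ndoth)
    {
        return powf(ndoth, 8.0f);
    }

    /* Repeated squaring for a constant exponent: x^8 = ((x^2)^2)^2. */
    float spec_square(float ndoth)
    {
        float x2 = ndoth * ndoth;
        float x4 = x2 * x2;
        return x4 * x4;
    }

    /* Lookup table, standing in for a texture lookup: limited
       resolution and precision, but potentially much cheaper.
       table[i] would be precomputed as powf(i / 255.0f, 8.0f). */
    float spec_table(const float table[256], float ndoth)
    {
        int i = (int)(ndoth * 255.0f + 0.5f);
        if (i < 0)   i = 0;
        if (i > 255) i = 255;
        return table[i];
    }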
In the example of a “normalize”, sure, several approaches should exist. You could implement it using DP3/RSQ/MUL, or you could implement it as a cubemap texture, and that cubemap texture could exist in widely varying resolutions and precisions, with nearest, linear, or fancier filtering. But the onus falls clearly on the application not to just say “normalize” and expect the vector to come out normalized, but to clearly specify whether it needs a full (e.g.) IEEE float normalize or whether it can live with a texture lookup.
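For illustration (again my own sketch, not proposal text), the full-precision path is just the DP3/RSQ/MUL sequence written out in C; a cubemap normalize would replace all of this with a single texture fetch indexed by the unnormalized direction:

    #include <math.h>

    typedef struct { float x, y, z; } vec3;

    /* Full-precision normalize, mirroring DP3/RSQ/MUL. */
    vec3 normalize_full(vec3 v)
    {
        float d   = v.x * v.x + v.y * v.y + v.z * v.z;  /* DP3 */
        float inv = 1.0f / sqrtf(d);                    /* RSQ */
        vec3 r = { v.x * inv, v.y * inv, v.z * inv };   /* MUL */
        return r;
    }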
I expect that – in practice – artists will be writing different shaders for different generations of hardware. That’s not going away, no matter what. On each particular piece of hardware, not only might you pick a different shader, but you might compile that shader to different low-level operations and a different low-level API.
Evan, I think you have it backwards on who can do multipass more easily. The vast majority of multipass scenarios are handled in the real world by doing the first pass with (say) DepthFunc(LESS), DepthMask(TRUE), and later passes with DepthFunc(EQUAL), DepthMask(FALSE). This is a very nice way for an app to implement multipass. But this method is completely out of bounds for a driver implementing multipass, because splitting a DepthFunc(LESS), DepthMask(TRUE) pass into multiple passes, with the later ones using DepthFunc(EQUAL), DepthMask(FALSE), does not produce the same results! In particular, if several fragments have the same X,Y,Z triplet, you get double blending. Again, the app is the one that has the semantic information about whether this sort of thing will work or not.
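For reference, the idiom looks something like this in plain OpenGL 1.x; draw_scene and num_passes are hypothetical placeholders for whatever the app actually does. The app can only set this up because it knows whether its geometry can produce several fragments at the same X,Y,Z, and therefore whether the EQUAL trick is safe; a driver has no way to know that:

    #include <GL/gl.h>

    void draw_multipass(void (*draw_scene)(int pass), int num_passes)
    {
        int pass;

        /* First pass: ordinary depth test, depth writes enabled. */
        glDepthFunc(GL_LESS);
        glDepthMask(GL_TRUE);
        draw_scene(0);

        /* Later passes: depth writes off, accept only exact depth
           matches, blend each contribution on top of the first. */
        glDepthFunc(GL_EQUAL);
        glDepthMask(GL_FALSE);
        glEnable(GL_BLEND);
        glBlendFunc(GL_ONE, GL_ONE);
        for (pass = 1; pass < num_passes; ++pass)
            draw_scene(pass);

        /* Restore defaults. */
        glDisable(GL_BLEND);
        glDepthMask(GL_TRUE);
        glDepthFunc(GL_LESS);
    }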
I am very skeptical of the practicality of F-buffer approaches [i.e. capturing per-fragment intermediate results to a buffer so that a driver can split a shader across passes].
If GL2 were merely “forward-looking”, I’d be all for that. But I think in practice it is (as currently proposed) a very worrisome direction for OpenGL to be heading. The proposals generally do not reflect my vision of where graphics should be 5 years from now. In fact, they fail to reflect it to such an extent that I think I would seriously consider finding a different line of employment [i.e. other than working on OpenGL drivers].
Again, I think it is completely wrong to be talking about how people are going to stop writing for particular pieces of hardware. You may be able to write a piece of software that works on any given piece of hardware, but that completely fails to solve the problem of graceful degradation of image quality between different hardware platforms. It does you no good to write a single shader that runs on all platforms if it runs at 30 fps on the newest hardware and 0.1 fps on all the old hardware! Instead, I see future applications and engines playing out a careful balancing act between image quality and framerate: ideally, your app maintains a roughly constant framerate on all hardware, insofar as that is possible, while the complexity of the shading or the scene in general changes.
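One hypothetical sketch of that balancing act in C (every name here is invented for illustration; a real engine would be far more sophisticated): each frame, nudge a shader tier up or down based on a smoothed frame time, so the framerate stays roughly constant while the shading complexity varies:

    typedef enum { TIER_LOW, TIER_MEDIUM, TIER_HIGH } shader_tier;

    /* Move the shading tier toward a target frame time, with a
       dead band so the choice doesn't oscillate every frame. */
    shader_tier pick_tier(shader_tier current, float avg_frame_ms)
    {
        const float target_ms = 1000.0f / 60.0f;   /* aim for 60 fps */

        if (avg_frame_ms > target_ms * 1.25f && current > TIER_LOW)
            return current - 1;    /* too slow: simplify the shading */
        if (avg_frame_ms < target_ms * 0.75f && current < TIER_HIGH)
            return current + 1;    /* headroom: enrich the shading */
        return current;
    }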
Some people seem to think that these problems are going to get solved by the API. I disagree. I think they will be solved via a very careful interplay between the driver and a higher-level game engine/scene graph module, and also with some effort on the part of app developers and artists. Scalability takes work on everyone’s part. The important thing is to draw the right lines of division of labor.
In thinking about GL2, I’m reminded of Fred Brooks’s warning in The Mythical Man-Month about the second-system effect. Everyone wants the second system to be everything for everyone…