NVidia to deprecate Cg in favour of GLslang?

I’ve noticed rumours of this on the Beyond3d forum. http://www.beyond3d.com/forum/viewtopic.php?t=7268

Here’s a relevant quote from the DirectX mailing list http://discuss.microsoft.com/SCRIPTS/WA-MSD.EXE?A2=ind0308a&L=directxdev&D=1&F=&S=&P=5807

And also please note that NVIDIA’s line seems to now be that Cg is ‘almost legacy’ (my words, not theirs). They were asked at SIGGRAPH which language to prefer and in open conference stated that HLSL is the preferred language to use on DirectX and OpenGL’s High Level shading language is to be preferred when using OpenGL.

Assuming that this is the company line, I guess this whole issue will therefore go away soon and we can return to a more pure technical discussion.

I agree that tools with Cg integrated will remain as an issue since it will clearly take some little time before they are all upgraded to use HLSL etc, but realistically I believe that the language war is pretty much over.

[I’m sure someone from NVIDIA will be quick to correct me if I have wrongly represented their position.]

Thanks,

Richard “7 of 5” Huddy
European Developer Relations Manager, ATI

Can anyone who attended SIGGRAPH 2003 confirm?

Cg is not being deprecated in favor of GLslang.

NVIDIA does recommend that developers working in a pure DirectX development environment use Microsoft’s HLSL.

NVIDIA is 100% committed to OpenGL, and will fully support the OpenGL Shading Language. It is worth noting that NVIDIA is one of only two vendors to sign the OpenGL Shading Language contributor agreement. In addition, NVIDIA has contributed more than 25% of the code in the reference compiler and pre-parser.

We are also pleased to announce that we will be offering a new profile for Cg, so that developers can continue to use the Cg Shading Environment on any hardware that supports the OpenGL Shading Language. This will also allow users to take advantage of high-level Cg features such as CgFX and subshader interfaces.

NVIDIA is dedicated to providing the best graphics development experience, regardless of which shading language, API or operating system you choose.

In the future, if anyone has any questions regarding NVIDIA’s products or strategy, please feel free to ask us directly.

Thanks,

Simon Green
NVIDIA Developer Relations

Thanks for the response. It’s always good to get rumours cleared up quickly.

I wouldn’t say the situation is clear yet, as there is still one very low-level, very practical issue unresolved:

The only way to get good performance from current nv3x HW is to have precision control in the fragment shader - which GLslang lacks.

It will be interesting to see whether nVidia will provide an appropriate extension to GLslang. Notice that half, fixed, hvec2, etc. are currently reserved keywords in the GLslang spec.

Without such an extension, NV FP would remain the only really effective interface to nv3x, and users of GLslang’s fragment shaders would have to accept only 25%-50% of the potential performance. This could be a serious obstacle to the practical usability of GLslang.
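
For context, this is roughly the kind of per-instruction precision control NV FP (NV_fragment_program) exposes: the R/H/X instruction suffixes select fp32/fp16/fx12 arithmetic. The program text below is only an illustrative sketch held in a C string, written from memory rather than tested:

static const char nvfp_half_sketch[] =
  "!!FP1.0\n"
  "TEX  H0, f[TEX0], TEX0, 2D;          # fetch into a half-precision register\n"
  "MULH H0, H0, {0.5, 0.5, 0.5, 0.5};   # MULH = multiply at fp16 precision\n"
  "MOVH o[COLH], H0;                    # write the half-precision colour output\n"
  "END\n";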

Exactly how serious it is, you may judge for yourself after reading this and this (just ignore the noise, and look for messages posted by thepkrl).

Originally posted by simongreen:
NVIDIA is 100% committed to OpenGL, and will fully support the OpenGL Shading Language. It is worth noting that NVIDIA is one of only two vendors to sign the OpenGL Shading Language contributor agreement. In addition, NVIDIA has contributed more than 25% of the code in the reference compiler and pre-parser.

Hi Simon,

It’s great to hear about NVIDIA’s strong support for the OpenGL Shading Language as expressed in your post!
A small clarification on the IP issues: NVIDIA signed a contributor agreement covering the Cg specification when they brought it to the ARB last year, just as 3Dlabs signed a contributor agreement covering the glslang specification and the associated OpenGL extensions they brought to the table. These were much appreciated actions by both companies and made the ARB as a whole more comfortable in moving forward developing the OpenGL Shading Language.

Now that the Shading Language has been approved by the ARB and the required IP exclusion period has passed without claims, everyone in the ARB has agreed to offer up necessary IP for implementing the Shading Language.

Jon Leech
OpenGL ARB Secretary
Silicon Graphics, Inc.

I’m also very happy to hear that NVIDIA is supporting (and developing) glslang. NVIDIA’s support is obviously important to the success of glslang.

The only way to get good performance from current nv3x HW is to have precision control in the fragment shader - which GLslang lacks.

glslang certainly isn’t alone in that. D3D’s HLSL doesn’t have it either. In fact, the only hardware shader languages that do are produced by nVidia.

The fundamental problem isn’t with implementing them on hardware that supports them; it’s implementing them on hardware that doesn’t. Should an implementation have to emulate wrap-around behavior for each math operation if the operation goes outside the specified range? How should the precision be defined? If they simply allow an implementation to use higher precision (without wrap-around), then a shader run on two different implementations can produce wildly different results.

And even if nVidia offers an extension, this doesn’t really fix the problem, as shaders written without the extension would still run slowly, and shaders written with the extension can’t compile without that extension.

Originally posted by Korval:
The fundamental problem isn’t with implementing them on hardware that supports them; it’s implementing them on hardware that doesn’t. Should an implementation have to emulate wrap-around behavior for each math operation if the operation goes outside the specified range? How should the precision be defined? If they simply allow an implementation to use higher precision (without wrap-around), then a shader run on two different implementations can produce wildly different results.
I think you are too demanding here; most such things can be left as “undefined results”. BTW, there is no defined wrap-around for any type in any low- or high-level shading language, even for int in GLslang.

Actually, these problems have already been solved in Cg: (quote from spec)

Cg’s primary numeric data types are float, half, and fixed. Fragment profiles are required to support all three data types, but may choose to implement half and fixed at float precision.
For the half type, there is actually no problem at all - it is merely a hint, like FASTEST vs. NICEST in ARB FP. It is the programmer’s responsibility to take care of overflow, but this applies even to the float type.

For the fixed type, the situation is a bit tougher. I think the programmer should not rely on implicit clamping to the type’s range (that’s the “undefined result”), but rather clamp explicitly if he wants to achieve it. In case the type is implemented with higher precision, the explicit clamp is necessary to get the correct result. In case the type is native, the explicit clamp to its range is a no-op, and can easily be optimized away by the compiler.
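
As a small sketch of that last point in plain C (the [-2.0, 2.0] bounds are an assumption based on Cg’s documented range for fixed, not something stated in this thread):

#include <math.h>

/* Emulating a saturating low-precision type by clamping explicitly.
 * On hardware where the type is native, the same clamp written in the
 * shader matches what the type does anyway and the compiler can drop it;
 * on higher-precision hardware it makes the intended saturation explicit
 * instead of relying on an undefined result. */
static float saturate_fixed(float x)
{
  return fminf(fmaxf(x, -2.0f), 2.0f);
}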

And even if nVidia offers an extension, this doesn’t really fix the problem, as shaders written without the extension would still run slowly, and shaders written with the extension can’t compile without that extension.

This problem is even easier to solve. Assuming ATI never exposes such an extension (which is rather likely, at least for the R3xx generation), you could do something like this:

char* my_header;
char* my_header_for_nVidia = "";
char* my_header_for_ATI =
  "#define half float\n"
  "#define hvec2 vec2\n"
  "#define fixed float\n"
  (...) ;

if ( the extension is supported )
  my_header = my_header_for_nVidia;
else
  my_header = my_header_for_ATI;

and then specify “my_header” as the first string in every glShaderSourceARB() call (it lets you specify multiple strings at once, and they will be concatenated by the driver).
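
For example (a minimal sketch; shader is assumed to come from glCreateShaderObjectARB() and my_shader_source to hold the actual shader text):

const GLcharARB* sources[2] = { my_header, my_shader_source };

/* Passing NULL for the lengths means both strings are null-terminated;
 * the driver concatenates them in order before compiling. */
glShaderSourceARB(shader, 2, sources, NULL);
glCompileShaderARB(shader);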

Wow, I’m almost optimistic for the future now. It is good to know NVIDIA intends to support glslang and has been so active in providing it.

There is some merit to this solution, but it is one that the ARB ultimately decided against. Observe Issue 68:

68) Should the language support explicit data types such as ‘half’ (16-bit floats) and ‘fixed’ (fixed precision clamped data type)?

ADDED on September 17, 2002.

It is common for high level languages to support multiple numeric data types, to allow programmers to choose the appropriate balance between performance and precision. For example, the C language supports the float and double data types, as well as a variety of integer data types. This same general consideration applies for a shading language.

For shading computations, precisions much lower than 32-bit floating point are often adequate. Until recently, most graphics hardware performed all shading computations in 9 or 10 bit fixed-point arithmetic. Lower-precision data types can be implemented with higher performance, especially when the data must travel off chip (e.g. texture data). For this reason, it is desirable to provide access to data types with precision of less than 32 bits in a hardware shading language.

Issue (33) discusses precision hints. Precision hints are less useful than additional data types, because precision hints do not allow function overloading by precision. Developers find it very convenient and useful to be able to have functions with the same names and argument lists with different precision data types.

It is also important to be able to specify the data type per variable (as opposed to per shader), because it is common for some computations (e.g. texture-coordinate computations) to require higher precision than others.

On the other hand, there is a desire to ensure that shaders are portable between different implementations. In order to achieve portability, implementations that don’t have native support for half will be penalized because they will have to clamp intermediate calculations to the appropriate precision. If these additional data types are hints that the compiler can choose to do the calculations at lower precision, then this leaves the ISV open to unintended clamping or overflow semantics, so different architectures can give very different results. The hint also implies that there is a well-specified way to convert between types under the hood, so function overload resolution gets more complicated and additional rules are needed to resolve ambiguities, unless all legal combinations of functions must be supplied. Specifying all legal combinations requires adding quite a large number of additional function types (dot product would need {float, half, fixed} * {float, half, fixed} * number of components, or 36 versions, vs 4 with only float).

If the additional data types are real types, then what can they be applied to? If it is to uniforms and attributes, then the different sizes now show up in the API, but half and fixed have no native support in C. If a half is followed by a float, does this mean a float has to start on a 16 bit boundary? What about the packing of fixed - the true size is undefined. If half and fixed are restricted to temporaries only, then this makes things easier, but the storage efficiency benefit is lost.

The OpenGL spec currently says “The maximum representable magnitude of a floating-point number used to represent positional or normal coordinates must be at least 2^32.” Should we introduce something that runs counter to this? s10e5 precision is inadequate for texture coordinates even for a 1k by 1k texture. It seems that half-floats open a door for precision issues to propagate throughout a shader.

RESOLUTION: Performance/space/precision hints and types will not be provided as a standard part of the language, but reserved words for doing so will be.

CLOSED: November 26, 2002.

Personally, I think the alignment issues are a non-issue because, well, there isn’t support for 24-bit floats in C either. As such, the implementation is going to have to translate for hardware that only offers 24 bits of precision. Because of this translation, it will likely never be the case that the user will pass some struct to the GL API that is immediately uploaded without translation. And, since shaders likely aren’t going to be working with bare pointers (and certainly not in the future), the in-shader alignment problems don’t matter.

The clamping/wrapping behavior is the real issue. The idea that one shader compiled for two different pieces of hardware will have vastly different results is not a good one.

[This message has been edited by Korval (edited 08-07-2003).]

Originally posted by Korval:

Personally, I think the alignment issues are a non-issue, because, well, there aren’t support for 24-bit floats in C either.

Not true. sizeof(float) is implementation dependent. Floats are usually 32-bit with x86 compilers; that doesn’t mean they couldn’t be 24-bit on another implementation.

The clamping/wrapping behavior is the real issue. The idea that one shader compiled for two different pieces of hardware will have vastly different results is not a good one.

I dunno. There’s no uniquely defined behavior for floats in C/C++, yet you don’t see software breaking down because of that(*)

Having multiple float sizes is a good thing. It gives programmers the flexibility to trade speed for precision (or vice versa). What if C only had doubles? or just floats? or just ints?
If you use 16-bit floats to address textures larger than 1k by 1k, then you get what you deserve. It’s just like using a char to index an array which might have more than 127 elements, and betting that all platforms have 32-bit chars.

(*) Stuff built by NASA excluded

Originally posted by Korval:
How should the precision be defined? If they simply allow an implementation to use higher precision (without wrap-around), then a shader run on two different implementations can produce wildly different results.

And even if nVidia offers an extension, this doesn’t really fix the problem, as shaders written without the extension would still run slowly, and shaders written with the extension can’t compile without that extension.

A solution would be to have a glHint saying whether we want min or max precision. Some hardware (ATI’s) would just ignore it; other hardware (NVidia’s) would use it to decide whether the shaders loaded from then on should use fp16 or fp24/32.

Then you could set this for each shader, knowing which works in 16 bit fp and which doesn’t.
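
A purely hypothetical sketch of what that could look like (the target token below is invented for illustration and exists in no GL spec or extension; glHint, GL_FASTEST and GL_NICEST are real):

/* Hypothetical target token -- placeholder name and value only. */
#define GL_FRAGMENT_PRECISION_HINT_XXX 0x0

glHint(GL_FRAGMENT_PRECISION_HINT_XXX, GL_FASTEST);  /* shaders compiled from now on may use fp16 */
/* ... compile the shaders known to be safe at low precision ... */

glHint(GL_FRAGMENT_PRECISION_HINT_XXX, GL_NICEST);   /* back to full precision for the rest */
/* ... compile the precision-sensitive shaders ... */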

Yep, but a hint affects the whole program, whereas you could have a function that sums two floats and returns a half. When the instruction count is high enough, that is a good thing too.

Korval:
The clamping/wrapping behavior is the real issue. The idea that one shader compiled for two different pieces of hardware will have vastly different results is not a good one.

Even without precision hints you can still get different results on ATI and nVidia HW due to the 24 vs 32 bit difference. Probably not “vastly” different, but I think that in computations for which low precision is appropriate, you wouldn’t get vastly different results with 24 vs 16 bits either. Generally, the possibility of getting different results is not an excuse IMO, because:

  1. Hints are optional.

If you want best performance - then you are free to use precision hints, just use caution.

If you want most uniform result across all HW - then just don’t use the hints. Ignore them, simple?

  2. Hints are explicit.

If a programmer uses the hint explicitly, then he must know what he’s doing, doesn’t he? In that case the compiler should simply assume that overflow never occurs for any data the shader will be given; otherwise results are undefined. If overflow can occur, then you shouldn’t have used precision-reducing hints in the first place.

(from the spec issue you quoted)
On the other hand, there is a desire to ensure that shaders are portable between different
implementations. In order to achieve portability, implementations that don’t have native support for half will be penalized because they will have to clamp intermediate calculations to the appropriate precision.
Requiring such clamping is an ill-conceived idea. Cg doesn’t do this. Enforcing clamping of a ‘hinted’ type just for the sake of portability is like enforcing portability of undefined results - it doesn’t make sense.

If we get a 16-bit-per-channel framebuffer pixel format, should GL clamp it to 8 bits to ensure portability? If your app relies on the rounding of framebuffer data to 8 bits, then you are exploiting an undefined result, and your app is already not portable.

Does any shading language require nVidia to clamp its 32-bit floats to 24 bits to ensure portability? None does. In just the same way, ATI wouldn’t be required to clamp its 24-bit floats to 16 bits.

It seems like a misconception resulting from perceiving the fixed/half/etc. types in a strict C/C++ sense (like short vs int vs byte, or float vs double) rather than taking them as the mere precision hints they should be.

[This message has been edited by MZ (edited 08-08-2003).]

One area where you can get very different results with 24-bit vs 32-bit precision is in computing texture coordinates, in preparation for a texture lookup. If there is insufficient precision, then the calculation may be off a bit, and thus you hit the wrong texel. For standard texture mapping this may not be a big deal. However, if the texture is being used as a table of indices, e.g. for an indirect lookup into another table, then this can lead to completely different results.
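
A rough back-of-the-envelope illustration of the effect (the s16e7 layout assumed for fp24 and the 1024-entry table are assumptions, not figures from this thread):

#include <stdio.h>
#include <math.h>

/* For a coordinate just below 1.0, one unit in the last place is about
 * 2^-24 at fp32, 2^-17 at fp24 (assuming s16e7) and 2^-11 at fp16 (s10e5).
 * Compared with the width of one texel in a 1024-entry lookup table,
 * a single fp16 rounding step is already half a texel, so an indirect
 * lookup can land on a different, unrelated table entry than it would
 * at higher precision. */
int main(void)
{
  const int    table_size = 1024;        /* assumed lookup-table size */
  const double texel      = 1.0 / table_size;

  printf("texel width          : %g\n", texel);
  printf("ulp near 1.0 / texel : fp32 %.4f  fp24 %.4f  fp16 %.4f\n",
         pow(2.0, -24) / texel, pow(2.0, -17) / texel, pow(2.0, -11) / texel);
  return 0;
}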

Some of the research out there involving scientific computations on GPUs run into these issues.

Eric

#pragma

If we get a 16-bit-per-channel framebuffer pixel format, should GL clamp it to 8 bits to ensure portability?

That’s a bit different. You allocate a framebuffer of a particular color depth. Either the implementation supports it (whether in hardware or through a software fallback) or it doesn’t. If it does, there is defined behavior for that color depth.

Does any shading language require nVidia to clamp its 32-bit floats to 24 bits to ensure portability? None does.

GL does specify the required precision, if a bit loosely. Something like ‘magnitudes of at least 2^32, and accurate to about 1 part in 10^5’ - not an actual quote. As such, both 24 and 32 bits are perfectly fine.

It seems like a misconception resulting from perceiving the fixed/half/etc. types in a strict C/C++ sense (like short vs int vs byte, or float vs double) rather than taking them as the mere precision hints they should be.

Or, as you pointed out in another thread, this could just be more politicking in the ARB. After all, it doesn’t hurt ATi or 3D Labs in the slightest to not have these facilities. If their hardware supports them later, they can be added in at that point. But, of course, nVidia needs them now.

“One area where you can get very different results with 24-bit vs 32-bit precision is in computing texture coordinates, in preparation for a texture lookup. If there is insufficient precision, then the calculation may be off a bit, and thus you hit the wrong texel. For standard texture mapping this may not be a big deal. However, if the texture is being used as a table of indices, e.g. for an indirect lookup into another table, then this can lead to completely different results.
Some of the research out there involving scientific computations on GPUs run into these issues.”

Are you sure that’s right? I mean, if you use texRECT lookups (from a TEXTURE_RECTANGLE_NV bound texture), the texture coordinate offsets are integer-based and bounded by the number of pixels. Is precision in addressing going to be lost in this case as well? I hardly think you’re going to have an address space that is much larger than 32 bits per dimension when dealing with a texture (I could be wrong here, go ahead and correct me if I am).

Originally posted by Korval:
Or, as you pointed out in another thread, this could just be more politicking in the ARB. After all, it doesn’t hurt ATi or 3D Labs in the slightest to not have these facilities. If their hardware supports them later, they can be added in at that point. But, of course, nVidia needs them now.

The whole reasoning lacks consistency. The exact same “issues” (most of them are actually non-issues) are present with the int type. Still, it’s included.

OTOH, half is made a reserved keyword, and NVidia already has an extension introducing half to OpenGL, so it’s almost a given that they will come up with another extension allowing half in GLslang soon. What the ARB basically did, then, is let NVidia decide on the few unclear details - and place the burden of “correctly” supporting it (i.e. using a header that #defines half as float for drivers that don’t support it) on the developer. Which is a stupid thing IMO. Everyone would have been better off if they had put it into the language. Though it’s really not that big an issue.