ARB_vertex_program2 when?



JD
04-15-2003, 07:38 AM
I hope all the IHVs get together and work on this. I understand that it's much faster to ignore everyone else and work on your own stuff, but one interface sure would be neat. How many of you are planning on using the NV_vertex_program2 extension? I'm not.

davepermen
04-15-2003, 07:52 AM
Me neither. The language sucks, a really ugly assembler, just made to force people to use Cg..

SirKnight
04-15-2003, 07:54 AM
I don't see any point to using it directly if we can use Cg and make the Cg compiler generate all that ugly code for us. :)

-SirKnight

Obli
04-15-2003, 08:24 AM
A bit off topic: does anyone have an idea why SIN/COS were left out of the spec? They are present in ARB_fp.

I feel the asm approach is the correct one - other extensions / higher level languages will build on this.

JD
04-15-2003, 12:04 PM
I don't have anything against asm shaders since the programs are usually very short, for games at least. I used D3D9 HLSL and asm, and even though HLSL is easier to use I felt OK using asm (it's more straightforward than HLSL). I won't use Cg but I might use glslang in the future. The shading language needs to be governed by the ARB and not by one company favoring its own way of thinking.

Humus
04-15-2003, 12:41 PM
I don't think there's much interest in an ARB_vertex_program2 extension right now. Most focus is on finishing the GL2 glslang, and once that's done no one will want an asm language anymore.

V-man
04-15-2003, 01:47 PM
Originally posted by davepermen:
Me neither. The language sucks, a really ugly assembler, just made to force people to use Cg..

What do you mean, it sucks and it's there to force people into Cg?

NV_vertex_program2 is just an extension of 1.0 and 1.1.

And it's not really different from using ARB_vertex_program, except for about 3 new features. It just adds looping, conditional execution and subroutine calls. Maybe a few other instructions too.

I have no real use for it at the moment but it's a logical progression.
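
To make it concrete, here is roughly what the new flow control looks like. This is only a rough sketch assuming the published NV_vertex_program2 grammar (a BRA with a condition-code test); the constant register assignments are arbitrary and it isn't meant as production code:

/* Rough sketch of NV_vertex_program2 flow control (assumed grammar, arbitrary
   register/constant assignments): a count-down loop that ARB_vertex_program
   could only express by unrolling. */
static const char vp2_loop[] =
    "!!VP2.0\n"
    "MOV   R1.x, c[8].x;\n"          /* c[8].x = iteration count               */
    "MOV   R2,   c[9];\n"            /* accumulator, seeded with one copy      */
    "loop:\n"
    "ADD   R2,   R2, c[9];\n"        /* loop body: keep accumulating           */
    "ADDC  R1.x, R1.x, -c[8].y;\n"   /* decrement (c[8].y = 1.0), update CC    */
    "BRA   loop (GT.x);\n"           /* branch back while the counter > 0      */
    "MOV   o[COL0], R2;\n"
    "DP4   o[HPOS].x, c[0], v[OPOS];\n"  /* usual transform, c[0..3] = MVP     */
    "DP4   o[HPOS].y, c[1], v[OPOS];\n"
    "DP4   o[HPOS].z, c[2], v[OPOS];\n"
    "DP4   o[HPOS].w, c[3], v[OPOS];\n"
    "END\n";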

cass
04-15-2003, 02:35 PM
Originally posted by Humus:
I don't think there's much interest in an ARB_vertex_program2 extension right now. Most focus is on finishing the GL2 glslang, and once that's done no one will want an asm language anymore.

There are a couple of nice things about having low-level interfaces:

- reduced driver complexity
- 3rd party shading language design

Driver quality has always been a dodgy issue for OpenGL implementors. Having a simple-to-implement, reliable interface is not a particularly bad idea.

Hardware shading languages are still a relatively new thing, and we should expect them to change (perhaps significantly) as hardware becomes more general and capable and we learn the most natural programming models. Allowing 3rd party development of shading languages is a good idea, and a flexible, powerful, low-level interface is usually the best target for a shading system.

I'm all for high-level shading. I just don't think it's time to do away with low-level interfaces.

Thanks -
Cass

jwatte
04-15-2003, 07:08 PM
vertex_program2 is needed for a few things which most current hardware can support:

Subroutines
Conditionals
Jumps
Loops

fragment_program2 could conceivably add at least a little bit of predication without breaking on current hardware, but fragment_program feels more complete than vertex_program.

nystep
04-15-2003, 11:41 PM
Hi,


I plan to use ARB_vertex_program2, since I prefer the assembler approach to the high-level language approach for programming the video card.

The current complexity of GPUs doesn't really justify the use of a higher-level language, in my eyes at least.


regards,

sanjo
04-16-2003, 12:02 AM
That's the same as asm vs. C/C++.
I think HLSL is really nice: less work, fewer errors... at least it can be used for fast prototyping of shaders. If exec speed isn't fast enough and you have the time, go back to asm and hack around.

Mazy
04-16-2003, 12:22 AM
If an ARB HLSL becomes part of the driver, it doesn't have to compile to arb_vp first and then to the internal format; it can just skip the middle layer. That's good, it makes the internal implementation more flexible. For instance, vertex*matrix always has to be done in 4 operations nowadays; if the chip has a single instruction for that, it will go unused until the ARB ratifies a new VP spec. With an HLSL the driver could use it immediately (see the sketch below).

But until then I will use arb_vp, and I would like to see a vp2 spec with branches and such.
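
For reference, that transform always comes out as four DP4s through arb_vp. Here is a minimal sketch; it assumes the ARB_vertex_program entry points have already been fetched (via GL_GLEXT_PROTOTYPES or wglGetProcAddress-style setup):

#include <string.h>
#include <GL/gl.h>
#include <GL/glext.h>   /* ARB_vertex_program tokens; entry points assumed resolved elsewhere */

/* The canonical modelview-projection transform in ARB_vertex_program: always
   four DP4s, even if the chip could do vec4 * mat4 in one native instruction. */
static const char transform_vp[] =
    "!!ARBvp1.0\n"
    "ATTRIB pos    = vertex.position;\n"
    "PARAM  mvp[4] = { state.matrix.mvp };\n"
    "DP4 result.position.x, mvp[0], pos;\n"
    "DP4 result.position.y, mvp[1], pos;\n"
    "DP4 result.position.z, mvp[2], pos;\n"
    "DP4 result.position.w, mvp[3], pos;\n"
    "MOV result.color, vertex.color;\n"
    "END\n";

static void load_transform_program(void)
{
    GLuint prog;
    glGenProgramsARB(1, &prog);
    glBindProgramARB(GL_VERTEX_PROGRAM_ARB, prog);
    glProgramStringARB(GL_VERTEX_PROGRAM_ARB, GL_PROGRAM_FORMAT_ASCII_ARB,
                       (GLsizei)strlen(transform_vp), transform_vp);
    glEnable(GL_VERTEX_PROGRAM_ARB);
}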

Zak McKrakem
04-16-2003, 03:56 AM
Back to the original question. Does anybody know the status of ARB_vp2?
In the December meeting notes: "Pat will convene a new working group..."
But in the March notes: "The low-level instruction set is not complete (e.g. supporting looping/branching constructs). There's still interest in reviving the vertex_program WG and doing this work."
So it seems that nothing has been done.

Humus
04-16-2003, 12:51 PM
Originally posted by cass:
There are a couple of nice things about having low-level interfaces:

- reduced driver complexity
- 3rd party shading language design

Driver quality has always been a dodgy issue for OpenGL implementors. Having a simple-to-implement, reliable interface is not a particularly bad idea.

Hardware shading languages are still a relatively new thing, and we should expect them to change (perhaps significantly) as hardware becomes more general and capable and we learn the most natural programming models. Allowing 3rd party development of shading languages is a good idea, and a flexible, powerful, low-level interface is usually the best target for a shading system.

I'm all for high-level shading. I just don't think it's time to do away with low-level interfaces.

Thanks -
Cass

Well, I think it's time to do away with it. The sooner the better. The bad things about maintaining asm languages by far outweigh the good things. Not only will it just add up to the same huge mess as all the fixed-function stuff did over the years as technology progresses, but it will also put artificial restraints on how hardware is designed, which sucks. We should have learned from x86 that asm middle layers just prevent innovation, cause trouble and limit performance. I don't think we should repeat that for GPUs.

cass
04-16-2003, 01:14 PM
Originally posted by Humus:
Well, I think it's time to do away with it. The sooner the better. The bad things about maintaining asm languages by far outweigh the good things. Not only will it just add up to the same huge mess as all the fixed-function stuff did over the years as technology progresses, but it will also put artificial restraints on how hardware is designed, which sucks. We should have learned from x86 that asm middle layers just prevent innovation, cause trouble and limit performance. I don't think we should repeat that for GPUs.

What are the bad things about a low-level interface?

Note that I'm not saying a "limited" interface. Just "low-level".

Humus
04-17-2003, 01:03 PM
1) As I said, it adds up to a huge mess over time. Look at DirectX: we have ps1.1, ps1.2, ps1.3, ps1.4, ps2.0, and you must support all these interfaces. Where will we be in 3 or 4 years? ps3.0, ps3.1, ps4.0, ps4.1, ps5.0, ps6.0...? All of which must be supported.

2) It limits innovation. Hardware will have to be designed around a certain set of instructions. You're kicking a huge level of optimization out of the driver's sight. If you for instance have special hardware for executing sines and cosines you should be able to use it, something that will not be feasible if you're fed a low-level Taylor series, especially if the compiler has tried to optimize and reschedule instructions.

3) It limits performance. Look at DX9 HLSL today. If I compile a shader using fancy swizzling, say .xzzy, and use ps2.0 as a target, then your beloved GFFX, which should be able to do that in a single instruction, will have to run several instructions for that swizzle due to hardware limitations that exist on ATI boards, all because of this middle layer and that pesky ps2.0 target that is designed around the smallest universal set of functionality.

Korval
04-17-2003, 01:15 PM
As I said, it adds up to a huge mess over time. Look at DirectX: we have ps1.1, ps1.2, ps1.3, ps1.4, ps2.0, and you must support all these interfaces. Where will we be in 3 or 4 years? ps3.0, ps3.1, ps4.0, ps4.1, ps5.0, ps6.0...? All of which must be supported.

I don't imagine that ARB_vertex_program_2 is going to do much more than add instructions. It's not an issue of supporting an interface then; it's simply a matter of whether or not you compile opcodes.

As for the ps1.* mess, blame Microsoft. Per-fragment programs didn't even really exist at that time; it was more like a more flexible fixed-function pipe. Microsoft wanted to turn it into some kind of assembly, rather than recognise it for what it was.

ps3.0 won't look much different from ps2.0. Mostly new opcodes.


It limits innovation. Hardware will have to be designed around a certain set of instructions. You're kicking a huge level of optimization out of the driver's sight. If you for instance have special hardware for executing sines and cosines you should be able to use it, something that will not be feasible if you're fed a low-level Taylor series, especially if the compiler has tried to optimize and reschedule instructions.

Which is why you have new revisions of your low-level language. All it adds are new instructions.

You haven't pointed out why these have to go into my drivers (and come out of driver development time). The Cg method is to compile to some assembly language that can then be further compiled into a program object. Since you can pick and choose which target you want, you don't much need to worry about the underlying hardware issues.

If it can stand alone, it should stand alone. And I don't want driver developers spending their valuable time writing a C compiler when they could be getting me better performance.

jwatte
04-17-2003, 02:50 PM
If I compile a shader using fancy swizzling, say .xzzy, and use ps2.0 as a target, then your beloved GFFX, which should be able to do that in a single instruction, will have to run several instructions for that swizzle due to hardware limitations that exist on ATI boards


I hope you're not assuming that all hardware actually runs the "assembly" code just as it's written? Think of the shader "assembly" code as the logical equivalent of a Java Virtual Machine bytecode. The driver compiles to whatever the hardware is doing at the time of load. It can "easily" be made to detect a swizzle to temporary preceding an operation on that temporary, and turn that into a single "native instruction" if that's what the hardware provides.
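
For instance (a sketch written in ARB_fragment_program syntax just to show the pattern; the register names are arbitrary), the expanded and fused forms below compute the same thing, and a driver is free to emit whichever its hardware prefers:

/* The swizzle pattern described above, in ARB_fragment_program syntax.  A
   restricted target would emit the MOV + ADD pair; hardware with arbitrary
   source swizzles can fuse them into the single ADD shown in the comment. */
static const char swizzle_fp[] =
    "!!ARBfp1.0\n"
    "TEMP t;\n"
    "MOV t, fragment.color.xzzy;\n"                  /* swizzle into a temp...  */
    "ADD result.color, t, fragment.texcoord[0];\n"   /* ...then operate on it   */
    /* fused form:  ADD result.color, fragment.color.xzzy, fragment.texcoord[0]; */
    "END\n";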

Humus
04-17-2003, 03:44 PM
Originally posted by Korval:
Which is why you have new revisions of your low-level language. All it adds are new instructions.

Which is what x86 did over the years too: add new instructions. And it sure is a fatass, inefficient mess today.

What instructions do we want? It may be clear today, but in 5 years it will most likely be different. Do we want vector instructions or many independent scalar processors? Do we want attributes and constants to be limited to four components? Things will change, and we'll get a fatass mess that will haunt us for many years thereafter.


Originally posted by Korval:
You haven't pointed out why these have to go into my drivers (and come out of driver development time). The Cg method is to compile to some assembly language that can then be further compiled into a program object. Since you can pick and choose which target you want, you don't much need to worry about the underlying hardware issues.

I have pointed it out already. Compiling to a target and then loading that shader is not only inefficient, it will also cause a huge mess over the years. Not only will we need to create new assembler versions, we will also need to update the compiler, and we need to update optimizers for each shader version we create. We also lose the ability to optimize at a high level, which will further limit IHV innovation. We won't get the maximum out of our hardware, and driver writers will have to spend MORE time trying to reverse-engineer low-level assembly back into its high-level meaning than they would if they had direct access to the shader itself.

Plus, why should I have to care about targets? Give me one valid reason. What I care about is the hardware; I want to get the max out of it and out of future hardware. I don't want to have to detect the ps shader version and try to select the best combination of what the compiler supports and what versions the driver supports.

To turn the question around, give me one valid reason why the driver should NOT have access to the high-level shader?


Originally posted by Korval:
If it can stand alone, it should stand alone. And I don't want driver developers spending their valuable time writing a C compiler when they could be getting me better performance.

By taking the approach of compiling against a target pixel shader version instead of targeting the underlying hardware, you have already lost performance. I don't want driver writers to have to spend their valuable time trying to figure out high-level semantics by reverse-engineering low-level assembler.

Humus
04-17-2003, 04:01 PM
Originally posted by jwatte:
I hope you're not assuming that all hardware actually runs the "assembly" code just as it's written?

Which is the actual point. It doesn't represent the hardware. Today it may, slightly, but in the future it won't. Just as the x86 instruction set does not represent the executing hardware: the CISC instructions are decoded into micro-ops and executed on RISC cores. And the price we pay is in transistors, power consumption, heat, and non-optimal execution of code. Why not learn from history? Why repeat the x86 debacle all over again?


Originally posted by jwatte:
Think of the shader "assembly" code as the logical equivalent of a Java Virtual Machine bytecode. The driver compiles to whatever the hardware is doing at the time of load.

Which is an inefficient model, and proven to be so by compilers that compile Java code directly into native platform code. Despite the loads of effort Sun has put into it, their virtual machine will never run as fast as Java code compiled directly to hardware code. The VM already runs lots of code natively and tries to match the bytecode to native code whenever possible, but since the bytecode does not represent the underlying hardware we aren't getting anywhere near the performance we could get. Not only that, but I can guarantee that the amount of time Sun has spent optimizing the bytecode compiler and the VM easily exceeds whatever effort the gcc team has put into optimizing its Java compiler.


Originally posted by jwatte:
It can "easily" be made to detect a swizzle to temporary preceding an operation on that temporary, and turn that into a single "native instruction" if that's what the hardware provides.

But why the double work? First the compiler splits it into many instructions, then the driver has to go through the shader, try to figure out where it can pack instructions back into one, and eventually (if successful) end up back where we started. It's a waste of time. What if the compiler tries to be smart and reschedules instructions in an order that it thinks will benefit some hardware? Will the driver still find those swizzles it can pack into one? Swizzles are also fairly simple. Will the driver be able to detect trigonometric functions expressed as series of adds and muls? Powers? Exponentials? It goes on.

Korval
04-17-2003, 06:04 PM
The ultimate problem with the futurist approach is that I'm working today, not 5 years from now.

Eventually, using assembly language shaders will become somewhat prohibitive. Though I doubt that the common vertex or fragment program will ever be particularly huge, you will want to do more with your programs and take advantage of hardware that allows you to do more.

That being said, that day isn't today, and it isn't tomorrow, and it isn't next year. At best, for vertex programs, it's 2 years from now before the hardware becomes complex enough to virtually require a high-level language. Fragment programs will take longer. Not to mention, if you open up a new programmable domain (command processor!), that's going to take 3-5 years before it really needs C-like complexity.

As an aside, personally, I don't want to see C-like complexity in vertex programs until and unless they start running off into memory and accessing structs out of that memory. The same goes for fragment programs; until you can start accessing an arbitrary struct out of a texture rather than 4 floats, there's no need.

In any case, what do we do in the meantime? Well, while C-like complexity isn't necessary, for NV_vertex_program_2's programs, it is kinda nice to have around. It isn't something you want driver writers to spend time on, but it's something that could make a good stand-alone.

OK, so you have a language like Cg which is compiled into a low-level language, which is then fed into another compiler. However, in the future, when C-like complexity is required to really do anything, you can't have this middle-man approach getting in the way of progress, right?

Therefore, you have "native" program targets. Rather than binding a program with the vertex/fragment_program_ARB target, you say, "native_vertex/fragment_program_ARB". The "native" means that it is in the precise form the driver wants it. Now, of course, this requires the "standalone" language to be bound to the driver. Unlike Cg, which returns a character string that is readable assembly, this function simply returns a binary block of data. This data is passed to the driver under the "native" target. And the data is in the representation that is best for the hardware.

This sort of thing sounds very much like glu*-type functionality. The initial versions look a lot like Cg: they have a target parameter. However, once the "native" target is supported (i.e., driver developers get the time to write a good compiler), people start using that exclusively. After a while, you can retire the ARB_*_program_* functionality and allow each vendor to expose their own low-level programs if they want to (much as people today can still code in assembly).

By using this approach, you can still have what you will need in 5 years, but you haven't wasted driver development time today to do it.

One nice thing about x86 is that the core hasn't changed in years. Imagine having to write a C compiler knowing full well that the language it compiles into, as well as all the little speed tricks you use on it, will change next year? At least CPU dev cycles are 2-3 years; you're looking at 9-12 months before you have to have a new compiler for your next-gen graphics card.

In 5 years, this will have stabilized for the most part. The optimal path (for a particular vendor) will be found to allow for fast vertex processing. And all the wrinkles with F-buffers will be ironed out, so per-fragment programmability will be dealt with. At that point, the underlying opcodes will become stable enough to build compilers on, knowing that they will only change every few years.

Either that, or the middle-man approach is always required because the underlying opcodes never become stable enough to spend the time to write a C compiler for.

jwatte
04-17-2003, 06:51 PM
Let's think about this from another point of view:

Driver writers are already working hard to keep up with correctness and performance requirements for new hardware, given the breathtaking speed of progress we're seeing today.

A high-level language is something very complex. Do you really think that right now is the right time to add yet another layer of complexity to the driver programmers' jobs?

I'd be ecstatic if simply all currently announced GL specifications and extensions were implemented with no bugs and high performance by the 8 most common hardware vendors today. Meanwhile, I haven't used a compiler in years that I haven't gotten some internal error or codegen bug out of at some point. Do I want that in my OpenGL driver as well? No thanks!

If you think that's the right thing to do, perhaps you should build your own chip and write your own drivers for it (j/k :-)

evanGLizr
04-17-2003, 07:05 PM
I think Humus has made a perfect exposition of my thoughts on this matter. There's no need for a low-level interface; it doesn't add any value:
- The low-level interface never matches the hardware (the hardware is too complex, may not allow certain undocumented combinations of instructions, optimisation is always needed on the low-level program, etc). My point is that the driver needs to parse, optimise and generate code for the low-level program anyway, so the programmer could just as well be coding the shader in a high-level language.
- The low-level interface precludes or places a burden on certain optimisations (transcendental or trig functions expanded to Taylor or Newton-Raphson in the low-level interface are a good example of that).

What I think is really missing is a "binary" form of glslang and an API function to load these binary programs. Note this wouldn't be a low-level interface, but almost a one-to-one mapping of glslang (operator- and type-wise) to a binary token stream, with the difference that - say - only one destination and two operands would be allowed per instruction (so you would effectively have more instructions in the token stream than in the string program).
The token stream would preserve all the type and operator richness of glslang (but without some complexities like operator overloading).

Having this binary form of glslang instructions would give several benefits:
- Applications can generate shaders on the fly more efficiently. The shaders would be more compact than the verbosity of a string interface, and the driver doesn't need to perform parsing & validation on the token stream loading interface. For this purpose, the instruction format would be set as an OpenGL standard, with a C header file with the structures necessary for people to parse/generate shaders that way if app writers want to (see the sketch after this list).
- The compiler to token stream can be a standard part of the OpenGL sample source or even Mesa. This means that compiler error codes will be standard across drivers, and the effort needed to generate driver-specific compilers will be much less: barring optimisations, much of the work of developing a compiler is in the parser front end, coping with function overloading, compiler error catching & error recovery, etc.
- Anyone would be able to roll his own shader language that compiles to this token stream.
- It would also ease the task of creating software interpreters or debuggers for glslang.
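
To give an idea of the shape I have in mind, here is a purely hypothetical sketch of such a header; the names, fields and packing are invented just to illustrate, not a proposal for actual token values:

/* Purely hypothetical sketch of a binary glslang token stream header --
   invented names and layout, only meant to show the shape of the thing. */
typedef enum {
    TOK_OP_ADD, TOK_OP_MUL, TOK_OP_DOT, TOK_OP_SIN, TOK_OP_TEXTURE
    /* ... one token per glslang operator / built-in function ... */
} TokenOpcode;

typedef struct {
    unsigned int  index;          /* virtual register or symbol-table index    */
    unsigned char type;           /* float, vec2..vec4, mat4, sampler, ...     */
    unsigned char components[4];  /* per-component selects, types kept intact  */
} TokenOperand;

typedef struct {
    TokenOpcode  op;
    TokenOperand dst;             /* exactly one destination...                */
    TokenOperand src[2];          /* ...and at most two sources, so the stream */
                                  /* has more, but simpler, instructions       */
} Token;

/* A shader is then a Token array plus a symbol table that the driver can
   consume directly, with no string parsing or validation of its own. */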

Regarding the "I want it now" fever, the only thing I can say is: push your favourite graphics company to support glslang and you will be able to use jumps and loops in a way that you won't need to relearn in 6-months time.

cass
04-17-2003, 10:32 PM
Why don't we have "CPU drivers" and just do away with assembly altogether?

If CPU vendors got together and decided what that one language was, would C++ still be implemented as "c-front"? Would we have to convince CPU vendors to provide "native support" for all high level languages? How many high level languages are sufficient for CPUs?

Thanks for all the discussion on this topic - I think it's really interesting to hear the different perspectives.

Cass

JD
04-17-2003, 11:14 PM
Talk about a topic going off the deep end :)

V-man
04-18-2003, 07:36 AM
The issue of ASM shaders is not the same as the issue of asm for the x86.

The asm for the x86 is the way it is because we get to program for the chip directly.

For GPUs, we don't get to program the chip directly. The driver is like an interpreter.

Whether it's ASM shading or HLSL doesn't matter; the driver is still an interpreter (with differing complexity).

What matters is, will driver writers be able to make good drivers?
Let those driver writers suffer and make bug-free, up-to-date drivers.
That's all I ask for.

Humus
04-21-2003, 01:04 PM
Originally posted by Korval:
<long post>

Korval,
I suppose we have different perspectives on HLSLs. Having worked with both glslang and DX9 HLSL, I'm confident HLSLs are ready to go today, not just in a couple of years. In the last months I haven't written a single line of assembly shader code, and I don't miss it one bit. Using an HLSL really boosts your productivity over assembly language, and even though for most of the stuff I've done I could just as well have used assembly, I just don't want to.
I realize that people are still going to target current and previous generations for a number of years to come. But I don't think we need to expand the assembly path for this and future generations. I don't think anyone who's going to target the next generation of hardware will work in assembly, so this middle layer will become redundant.
Looking at how GL2 is coming along, it seems that the GL2 workgroup agrees: there's no target other than the underlying hardware in GL2. (Yay! I win! *snickers* ;))

Humus
04-21-2003, 01:10 PM
Originally posted by jwatte:
A high-level language is something very complex. Do you really think that right now is the right time to add yet another layer of complexity to the driver programmers' jobs?

Yes, I think so. I don't think it's a "very complex" task, at least not for parsing and getting a usable hardware shader out of it. If you factor in the work done on optimizations, I think you end up saving time if you go directly from an HLSL to the underlying hardware.
Right time or not, time is already being spent on this. (Yay! I win again! *snickers* :D)

cass
04-21-2003, 03:53 PM
Originally posted by Humus:
Yes, I think so. I don't think it's a "very complex" task, at least not for parsing and getting a usable hardware shader out of it. If you factor in the work done on optimizations, I think you end up saving time if you go directly from an HLSL to the underlying hardware.
Right time or not, time is already being spent on this. (Yay! I win again! *snickers* :D)

As a hardware abstraction (which OpenGL is), I think low level programming interfaces are still quite appropriate.

Designing *good* high-level compilers is a very complex task, despite what you might think. Future generation GPUs will likely have enhanced programming models that will make it even more complex.

As the shading environment gets more complex, the option of using an independent (and separately QA'd) compiler will be attractive to ISVs. Developers won't want to be QA'ing a new high-level shader compiler with every driver update.

Thanks -
Cass

Korval
04-21-2003, 07:01 PM
I don't think it's a "very complex" task, at least not for parsing and getting a usable hardware shader out of it. If you factor in work done on optimizations I think you end up saving time if you go directly from a HLSL to underlying hardware.

I can't claim expertise in the field of compilers by any stretch of the imagination, but I've written a full lexer/parser/interpreter for a language of approximately the complexity of Cg, and it was definitely something that would be considered complex. And that didn't even involve the equally complex task of taking the output of the parser, optimizing it for a given assembly language, and outputting valid assembly.

Writing an optimizing compiler isn't particularly easy. I would also like to point out that the skills it takes to do that have nothing to do with the skills it takes to write a hardware driver. Which means that hardware vendors will have to hire staff who are more familiar with compilers (more money) or train their current staff to write good compilers (costing both money and dev time).


Looking at how GL2 is coming along, it seems that the GL2 workgroup agrees: there's no target other than the underlying hardware in GL2

Which was hardly unexpected, considering that GL2 is the brainchild of 3DLabs. Their customers don't buy GeForce's or Radeon's.


Yay! I win!

That's the sad part about it. You see, you actually lose, but you don't even realize that you lose. You're going to get less stable drivers, or your next card is going to cost more. Or, your shader language will just not be optimized worth anything. You can't have it all; take your pick.

BTW, having looked at glslang, I much prefer Cg. Glslang looks like it was made by 3DLabs for 3DLabs hardware (ie, a scalar processor), rather than for a vector processor. Also, it doesn't seem to expose per-vertex color interpolators; it only has enough per-vertex interpolators for 8 texture coordinates.

Mark Kilgard
04-21-2003, 08:42 PM
Originally posted by Obli:
A bit off topic: does anyone have an idea why SIN/COS were left out of the spec? They are present in ARB_fp.

I think by "the spec", you mean ARB_vertex_program. Keep in mind that the ARB_vertex_program specification predated the ARB_fragment_program extension. Hardware in the timeframe of ARB_vertex_program's standardization did not support a vertex program SIN/COS instruction. DirectX 8 expected these instructions to be approximated by multi-instruction sequences that amounted to a Taylor series approximation. Because the quality of the approximation can vary, the ARB_vertex_program extension left it to programmers to implement their own approximations.

You can look at the code the Cg compiler generates when the Cg standard library cos and sin functions are used.

Note that NVIDIA's CineFX architecture (exemplified by GeForce FX) has SIN and COS instructions at both the vertex and fragment level. These are not compound instruction approximations, but a single fast instruction for each. These SIN and COS instructions are just as efficient as a DP4 or other instruction.

Use the NV_vertex_program2 extension (or the "vp30" profile for Cg users) to get at these instructions. If you use the sin & cos Cg runtime routines with the "vp30" profile, the compiler generates a single instruction.

By the way, the Cg runtime has a sincos routine if you need both the sin and cosine of a value.


I feel the asm approach is the correct one - other extensions / higher level languages will build on this.

I agree wholeheartedly. The problem with building a shading language into an OpenGL driver is that you are then at the mercy of whatever compiler bugs exist in the driver, which could be different from machine to machine. With a separate compiler, you can confirm the quality of the generated code and make sure your application uses a compiler runtime library that you distribute with your application, so you can be confident of its behavior.

- Mark

Mark Kilgard
04-21-2003, 10:15 PM
The shading language needs to be governed by the ARB and not by one company favoring its own way of thinking.
This is indeed an odd statement. As far as languages go, Cg and HLSL are pretty much different implementations from different companies of the same basic language. The shader examples Randy and I wrote for "The Cg Tutorial" compile with both Microsoft's HLSL compiler (fxc) and NVIDIA's Cg compiler (cgc). Feel free to verify this fact for yourself.

And if you carefully examine the 3Dlabs shading language proposal, you'll find that their proposal has the same basic C-like language structure as Cg and HLSL. It's crazy to think that there's a shader you can write in Cg or HLSL that could not be readily translated into the 3Dlabs shading language proposal and vice versa. The only difference is that the 3Dlabs shading language proposal ties your shader source code to "OpenGL and only OpenGL" whereas a shader written in Cg can readily straddle the API fence.

With Cg, the same shader program can be compiled for either OpenGL or Direct3D profiles.

Shaders, like models and textures, are part of the art path for 3D development. The look of a 3D art asset is determined by a combination of the 3D model, one or more shaders for the model, and associated textures. If I'm going to create an art asset, why do I want it tied to a particular 3D API? Would you really prefer an ARB-governed 3D file format or image format over an API-neutral file format? Why would you want a shading language that tied your art asset only with OpenGL?

glLoadOpenGLOnlyImageFileFormatARB anyone?

For makers of Digital Content Creation applications (3D modeling packages, etc), an ARB-sanctioned shading language that works only with OpenGL locks these vendors into producing programmable shading content that is tied to OpenGL. While these vendors are quite happy to use OpenGL because OpenGL is a stable, cross-platform, industrial-strength API, these DCC application vendors don't want to get stuck in a situation where WYSIWYG real-time 3D asset creation with programmable shading ties them to OpenGL when producing 3D content for both OpenGL- and Direct3D-based real-time rendering engines is a fact of life for the end-users of DCC applications.

Cg is winning as the common shading language of choice for the whole 3D industry for much the same reason OpenGL beat PEX (the old and dead X Consortium 3D API). OpenGL had the advantage of being a cross-platform 3D API that supported both Unix workstations running the X Window System and PCs running Microsoft Windows. One reason PEX lost was because it was tied to a minority window system. An ARB-governed shading language will suffer from much the same ills PEX did. PEX was designed by an entrenched standards committee where the background was window system issues, not 3D. This is much the way the ARB-governed shading language proposal is designed by 3D driver engineers (as opposed to design by a small focused group of shading and programming language compiler experts as in the case of Cg).

At its inception, OpenGL too was accused of being a plot by "one company" (SGI back then) when OpenGL was, like Cg is today, really a collaboration of a relatively small group of very talented engineers from different companies who were unencumbered by standards committee inertia.

Ultimately, the facts on the ground matter. Cg and HLSL represent two shipping implementations of essentially the same underlying shading language. You are now seeing lots of companies embracing Cg, even looking for it in job postings. It's cross-platform, so the same shader can work on a Windows PC, a Mac, and a Linux PC. Most importantly, all this exists today. Shaders written in Cg work on GPUs from both NVIDIA and ATI--and any other vendor that implements DirectX 8, DirectX 9, or OpenGL's ARB-approved vertex and fragment program extensions. Game developers are using it; researchers are using it; and DCC application vendors are using it. The Cg runtime libraries are redistributable at no cost. CgFX provides an API-neutral meta file format for combining Cg source code with non-programmable 3D state to fully encapsulate a shading technique in a single file. Cg has good documentation and a published tutorial. There's an open source parser for the Cg language available. The SIGGRAPH 2003 papers committee even accepted a paper about Cg and its design.

- Mark

harsman
04-22-2003, 07:44 AM
Well, Cg is far from vendor neutral. It's obvious that the Cg ARB_fp profile tries to minimize register usage, for example, which is a good fit for the GeForce FX but doesn't work well with the Radeon's limited dependent instructions. Cg also has fixed datatypes that enable a better fit to NV_fragment_program but have no hardware counterpart on the Radeons.

I'm also not convinced of the merits of a cross-API shading language at the level of Cg or HLSL. How many modern games and other end-user products actually have a rendering engine that can use either OpenGL or DirectX? Not that many. And if you want shaders to be a part of the art path you'll need an application-specific way to tag environment textures, ways to differentiate light shaders from surface shaders, and maybe even ways to store different shader LOD levels. All this means you will need a higher abstraction than the current shading languages anyway, so the language used to express the actual calculations seems less important.

If Cg wants to become the cross-platform standard for shading languages it needs to become an *open* standard not controlled by NVIDIA, and it needs to get support from other IHVs than NVIDIA. The only difference I see between NVIDIA's Cg and glslang is whether the compiler is part of the driver or not. Cg still advocates run-time compilation with a native profile to get the best performance. That's exactly the same thing as the OpenGL driver doing it, except it's done in a different DLL, the Cg compiler one. Having a standard lower-level-than-Cg format in OpenGL and putting the parser and higher-level stuff in the sample implementation is fine, but I still fail to see the great difference between that and having an open source Cg frontend and having IHVs write their own Cg backends/profiles (which none besides NVIDIA do currently).

cass
04-22-2003, 08:58 AM
Originally posted by harsman:
Well, Cg is far from vendor neutral. It's obvious that the Cg ARB_fp profile tries to minimize register usage, for example, which is a good fit for the GeForce FX but doesn't work well with the Radeon's limited dependent instructions.

This is an unfortunate limitation of the ARB_fp spec. Reconstructing the DAG and walking it to reduce dependent instructions (or to perform other simple optimizations) is something that the driver should do.



Cg also has fixed datatypes that enable a better fit to NV_Fragment_program but have no hardware counterpart on the Radeons.


So? The positive benefit is that Cg allows vendors to expose features of their hardware without breaking compatibility. In the ARBfp profile, the 'half' and 'fixed' types map to the same type as 'float' on ATI *and* NVIDIA hardware. Nobody is forcing you to use those data types, either.


I'm also not convinced of the merits of a cross API shading language at the level of cg or HLSL. How many modern games and other end user products actually have a rendering engine that can use either OpenGL or DirectX? Not that many.

Many software vendors use tools to create their art. For most tools vendors, if they had to choose between supporting only Direct3D or only OpenGL, they would choose Direct3D. That's where the market is. Cg allows them to choose both. And it allows them to continue to use whichever API they desire for the development of their tool, and still target the broadest market.




And if you want shaders to be a part of the art path you'll need a application specific way to tag environment textures, ways to differentiate light shaders from surface shaders and maybe even ways to store different shader LOD levels. All this means you will need a higher abstraction than the current shading languages anyway, so the language used to express the actual calculations performed seems less important.


I don't follow most of this. I agree, however that the level of abstraction will need to be raised. That is a certainty.


If Cg wants to become the cross-platform standard for shading languages it needs to become an *open* standard not controlled by NVIDIA, and it needs to get support from other IHVs than NVIDIA.


I don't disagree, but timing is everything. One of the chief advantages for Cg today (while it is rapidly developing) is that it is not bogged down in a standards committee.



The only difference I see between nvidias cg and glslang is if the compiler is part of the driver or not.


There are numerous differences, but the fact that the compiler is part of the driver is a very significant one. What if you had to upgrade Windows every time you wanted to get the latest service pack of Visual Studio?

There's also a significant difference in the level of abstraction between the two languages. Cg chooses a high level of abstraction, while glslang chooses a low level of abstraction. I think the impact of this decision will be more obvious as the programming models become more obvious.


cg still advocates using run time compilation with a native profile to get the best performance. That's exactly the same thing as if the OpenGL driver does it except it's done in a different dll, the cg compiler one. Having a standard lower level than cg format in Opengl and putting the parser and higher level stuff in the sample implementation is fine but I still fail to see what the great difference is between that and having a open source cg frontend and having IHVs write their own cg backends/profiles (which none beside nvidia do currently).

The fundamental difference is that the Cg model treats the compiler as a software product that is separable from the device driver. At some point ISVs will probably come to appreciate the advantages of this separation.

Thanks -
Cass


jwatte
04-22-2003, 09:42 AM
> the Cg model treats the compiler as a
> software product that is separable from
> the device driver

I'd like to call this out as the most important actual argument I get out of this thread.

Direct3D HLSL also has a compiler separate from the driver -- the compiler is part of the DirectX runtime, not part of the driver that each vendor provides.

If you put a C-style compiler into the driver, how many of the SiS, S3, Trident etc chips would support it? My guess is 0. Microsoft realized this a long time ago, and created a very rich runtime environment with DirectX, which wraps a potentially very sparse hardware-dependent implementation layer, to make creating drivers easier for the "low cost" vendors.

It's hard enough to get decent OpenGL support from smaller vendors as it is. Putting the compilation outside the driver seems like the right choice. (This of course assumes that taking ARB_vp/fp code and turning into GPU-executable code is reasonably simple for the driver.)

harsman
04-22-2003, 10:21 AM
This is an unfortunate limitation of the ARB_fp spec. Reconstructing the DAG and walking it to reduce dependent instructions (or to perform other simple optimizations) is something that the driver should do.

But if the driver needs to reconstruct it, don't you think that is a sign that it needs a higher-level representation? I'm sure the nvidia driver would be simpler if ARB_fp allowed some sort of per-register or per-instruction precision hint, given the FX's speed with 16-bit floats. I could just as well claim that the lack of support for multiple render targets in the ARB_fp spec is an unfortunate limitation, but it would sound a little strange given that only the Radeon supports it.


So? The positive benefit is that Cg allows vendors to expose features of their hardware without breaking compatibility. In the ARBfp profile, the 'half' and 'fixed' types map to the same type as 'float' on ATI *and* NVIDIA hardware. Nobody is forcing you to use those data types, either.

I was just showing a symptom of Cg being very targeted towards current nvidia hardware, not a forward-looking novel design by language experts unencumbered by design-by-committee issues (I doubt future chips will have lots of transistors devoted to fixed-point math, but what do I know). I just don't see this big difference between MS's HLSL, Cg, and glslang when it comes to elegance, simplicity, or ease of use. If anything, Cg looks like the clumsier one, with all the different compiler targets and their differing abilities. To be fair, this impression is from an earlier version; maybe the current one (1.0) has improved considerably. The DX8-level profiles seemed especially ugly and out of place.

You have convinced me that there are several big advantages to having a compiler separate from the driver, I'll give you that :-), but I still think the low-level interface to OpenGL needs to be higher level than the current ARB_fp one.


There's also a significant difference in the level of abstraction between the two languages. Cg chooses a high level of abstraction, while glslang chooses a low level of abstraction. I think the impact of this decision will be more obvious as the programming models become more obvious.

Can you give an example of this higher level of abstraction? I agree that abstraction is good, but I see a need for something much more abstract than Cg, something like CgFX on steroids. Coupling animation stuff like skinning to pixel shader setup stuff like TBN vectors will lead to nothing but trouble in the long run: you need a vertex shader for skinned bump-mapped models, one for morphing bump-mapped ones, etc., ad nauseam. Not that this should be part of OpenGL, mind you, but given your comments regarding DCC tools I thought it was appropriate. The kind of info artists work with in such a tool needs to be at a much higher level than Cg; for that you need something like a technical director.

Korval
04-22-2003, 10:31 AM
This is an unfortunate limitation of the ARB_fp spec. Reconstructing the DAG and walking it to reduce dependent instructions (or to perform other simple optimizations) is something that the driver should do.

And it is precisely that thinking that makes nVidia look monopolistic or unfair, even when they are not trying to be. Would it really be that hard for the Cg compiler to query the hardware as to the number of texture dependencies it allows and use a different optimizer if the dependencies are "low" (compared to an FX)?


Cg also has fixed datatypes that enable a better fit to NV_Fragment_program but have no hardware counterpart on the Radeons.

I'd like to point out that the 9500+ series cards don't have any type other than 'float'. And, assuming their hardware did, blame the ARB for not exposing that in the fp spec.


I don't disagree, but timing is everything. One of the chief advantages for Cg today (while it is rapidly developing) is that it is not bogged down in a standards committee.

The two current programmable domains are almost as feature-complete as they reasonably need to be. Maybe vertex programs need to start accessing arbitrary memory for things like matrices (at which point, I'll switch to Cg). Fragment programs obviously need looping and possibly some generalization here and there to an extent. However, by the next generation of hardware, the actual Cg language should be quite standardized in terms of functionality for per-vertex and per-fragment programs. Granted, the whole problem comes back when we open up a new programmable domain (Command Processor!), but a small standards committee (maybe 6 individuals total?) should be able to govern that domain.

dorbie
04-22-2003, 10:42 AM
Originally posted by cass:

One of the chief advantages for Cg today (while it is rapidly developing) is that it is not bogged down in a standards committee.


Oh the irony. This has a familiar refrain. You make it sound like Cg dodged a bullet when the ARB declined.

The one saving grace in all of this is that high-level shaders can target multiple platforms and therefore be vendor neutral (using the word very loosely). However, vendor neutral does not mean optimal. It's just not enough for someone to offer an option and, when it isn't optimal, point fingers at competing implementors and say it's their fault it isn't fast enough. Those implementors have other plans and a differing vision. They have no intention of supporting your 'standard' unless market conditions force them to. For all we know they are HAPPY that your shaders are less than optimal on their platform, because that impacts the viability of YOUR strategy, not theirs.

The ARB is the available forum for getting common functionality, sure you don't always get your way but the alternative is a tower of babel and a miserable game of multiple choice for people who choose to use OpenGL.

I'm still hoping that the dust settles on all of this w.r.t. a common HLSL, but it's looking increasingly forlorn.

Mark Kilgard
04-22-2003, 11:01 AM
Originally posted by jwatte:
> the Cg model treats the compiler as a
> software product that is separable from
> the device driver

I'd like to call this out as the most important actual argument I get out of this thread.

Direct3D HLSL also has a compiler separate from the driver -- the compiler is part of the DirectX runtime, not part of the driver that each vendor provides.

If you put a C-style compiler into the driver, how many of the SiS, S3, Trident etc chips would support it? My guess is 0. Microsoft realized this a long time ago, and created a very rich runtime environment with DirectX, which wraps a potentially very sparse hardware-dependent implementation layer, to make creating drivers easier for the "low cost" vendors.

It's hard enough to get decent OpenGL support from smaller vendors as it is. Putting the compilation outside the driver seems like the right choice. (This of course assumes that taking ARB_vp/fp code and turning into GPU-executable code is reasonably simple for the driver.)

I concur. The strength of Cg is that you can use a high-level language today with NVIDIA drivers, ATI drivers, and Mesa (all support one or more of the NV/ARB vertex/fragment program extensions) - and any other IHVs when they too implement the multi-vendor OpenGL program object extensions.

The OpenGL driver should expose the basic functionality for 3D hardware acceleration but not try to take on everything. That's what the assembly language interfaces provided by the OpenGL "program object" extensions do.

You wouldn't expect OpenGL to do image file format loading for you (and wait until all interesting OpenGL implementations provided the extension). You are quite happy to link with a file format loading library.

You've also got the option to write your own image file format library if you don't like the ones available. The cool thing about assembly language interfaces for vertex and fragment programmability is that you may well not like Cg, HLSL, or the ARB proposal. You might want to write shaders in a language that looks like Lisp, perhaps. That's fine. If you can translate your language to one of the available OpenGL "program object" extensions, you are in business.

The same could be said for scene graph support, etc. These software components, like a high-level GPU programming language, simply don't belong in a 3D graphics driver.

Unbundling the GPU programming language from the 3D driver is great for researchers or ISVs that want to implement domain-specific languages (say just for volume rendering or just for image processing).

It also means that if there are better optimizations in a future release of Cg, you can just link with the new Cg library and get the advantage on all your platforms. You don't have to wait until every IHV releases a driver that implements the optimization you are looking for.

The ARB really screwed itself by attempting to embed a shading language in the driver. They were warned against it, but the ARB as a whole insisted. If not for Cg, OpenGL users would really be suffering now without a standard GPU programming language for OpenGL. The fact that Cg supports the same language as Microsoft's HLSL implementation is icing on the cake since now your shaders are interoperable between major 3D APIs, a great thing.

The facts on the ground are much clearer now. Both Microsoft and NVIDIA managed to release production-quality GPU programming languages last year while the ARB is still without even an approved specification. It will be longer still until drivers show up that support the final specification. Separating the GPU programming language from the driver ended up being the right call for schedule reasons as well as all manner of technical reasons.

- Mark

evanGLizr
04-22-2003, 11:06 AM
Originally posted by jwatte:

Putting the compilation outside the driver seems like the right choice. (This of course assumes that taking ARB_vp/fp code and turning into GPU-executable code is reasonably simple for the driver.)


There we go again with the fallacy of "low-level-language means you don't need a compiler". You may not need a string parser because the driver receives a token stream from the Direct3D layer (which you do need with OpenGL programmable extensions, BTW), but you still need all the other parts of the compiler.
Generating code is always "reasonably simple", the key here is in generating efficient code for your architecture, and you need a compiler for that.


Originally posted by jwatte:
If you put a C-style compiler into the driver, how many of the SiS, S3, Trident etc chips would support it? My guess is 0.

Didn't those companies license OpenGL's source code from SGI? They could do exactly the same with the compiler part (from SGI or from any other third party - anyone up for new markets?).
You already have a sample glslang implementation available.


Originally posted by jwatte:
Microsoft realized this a long time ago, and created a very rich runtime environment with DirectX, which wraps a potentially very sparse hardware-dependent implementation layer, to make creating drivers easier for the "low cost" vendors.

Direct3D HLSL also has a compiler separate from the driver -- the compiler is part of the DirectX runtime, not part of the driver that each vendor provides.


In fact, Microsoft had better change the Direct3D model so driver writers can implement their compilers in userspace.

And last, an ICD is not a driver but a userspace DLL (Microsoft - what a surprise - decided to erroneously call it "driver"). AFAIK the Unix implementation of OpenGL is implemented in userspace shared object libraries as well.

In that sense the ICD model is much better suited for the current programmable trend than the current Direct3D model, where the compiler has to be implemented in a *real* driver (where it's cumbersome to use C++ with static initialisation, and where crashing or being resource-hungry means crashing the system, etc).

dorbie
04-22-2003, 11:19 AM
Mark,

Way back when Cg started out, NVIDIA stated they had NO intention of offering Cg to the ARB. When it looked like the result of this closed policy was going to be a competing standard from 3DLabs, NVIDIA hastily tried to offer Cg to nip it in the bud.

Sure, NVIDIA did a great job with Cg, but the lack of a common HLSL that does what you would call "the right thing" has as much to do with NVIDIA's closed-shop, secret-squirrel approach early on as it has to do with lack of consensus. Consensus is also tricky when one of the major vendors' view is simply: Cg good, anything else bad. When you're half of the only significant consensus required, don't bitch about delays reaching a consensus.

Well that's at least one alternative interpretation of events.

It's also worth pointing out that one of the major bones of contention over competing HLSLs is that Cg, for all the vaunted platform abstraction, does not offer enough of it. It is not write once, run on many; the shader complexity is seriously restricted by the target hardware. I'm not taking sides on this, everyone has weighed in on it, including Carmack & Matt. It is ironic that this is being cited as a benefit of *Cg* in this context, though.

JD
04-22-2003, 12:18 PM
ATI or any other IHV has no say in Cg, so they won't support it. I and many others here would rather use glslang because it's coming from the ARB, which for the most part has no conflict of interest. The fact that Cg was turned down by the ARB should give NVIDIA a heads-up that they need to concentrate on glslang and not Cg.

Let's not make another Microsoft out of NVIDIA. If we did, then we would be getting the GFFX instead of the 9800 Pro. Our market economy works only when there is competition. Take away competition and you can kiss your choices goodbye. How would NVIDIA like it if IBM didn't exist? They wouldn't.

The ARB was created so that both IHVs and ISVs have a say in how the 3D API should progress. I'm slowly beginning to believe Richard Huddy when he said that NVIDIA wants to control and dominate OpenGL. Let's not have another 3dfx and Glide, okay?

Mark Kilgard
04-22-2003, 12:29 PM
Originally posted by dorbie:
The ARB is the available forum for getting common functionality, sure you don't always get your way but the alternative is a tower of babel and a miserable game of multiple choice for people who choose to use OpenGL.

I'm still hoping that the dust settles on all of this w.r.t. a common HLSL, but it's looking increasingly forlorn.

The ARB is the available forum? What does that mean? Apparently it is entirely possible to design and implement, without the ARB, a cross-platform high-level GPU programming language that supports hardware from multiple vendors, on multiple operating systems, and with multiple 3D APIs.

The ARB recognized very early on that not everything belongs in the 3D driver. That was the point of the GLU library. It is also why the ARB never tried to standardize a scene graph library within the confines of the OpenGL API. Open Inventor and similar libraries were layered above OpenGL.

Why the ARB decided to embed a programming language within the OpenGL driver is beyond me. And why tie the shading language to OpenGL and only OpenGL?

The dust has pretty much settled. Cg provides a high-level GPU programming language that works for Windows, Linux, and MacOS X, for both OpenGL and Direct3D, for both the latest GPUs and older GPUs, and it's redistributable at no cost, has good documentation, good optimizations, an open source parser available, an accompanying meta format for non-programmable state and resources (CgFX), and continues to be improved.

Rather than looking "increasingly forlorn" about the prospect of a "common HLSL", I'd say that the dust has already settled and we've got one already.

- Mark

MZ
04-22-2003, 12:48 PM
Originally posted by cass:
There are a couple of nice things about having low-level interfaces:

- reduced driver complexity
- 3rd party shading language design

Originally posted by Mark Kilgard:
The cool thing about assembly language interfaces for vertex and fragment programmability is that you may well not like Cg, HLSL, or the ARB proposal. You might want to write shaders in a language that looks like Lisp perhaps. That's fine. If you can translate your language to one of the available OpenGL "program object" extensions, you are in business.

AMD and Intel release "Optimization Guide" documents for their CPUs, which are useful for both compiler development and hand-written assembler code.

Assuming your intentions about 3rd party compilers were honest, I'll ask:
Does nVidia intend to release "Optimization Guide" for their GPUs?

dorbie
04-22-2003, 12:59 PM
I think you've just illustrated exactly what NVIDIA means by consensus.

Why complain that the ARB is bogged down when it might be legitimate to say that it is the attitude you've just demonstrated towards "cooperation" that is the cause of the lack of progress on a common HLSL? Once more with feeling, NVIDIA is exactly half of any consensus people care about.

I was actually playing devil's advocate here, and you've jumped in with both jackboots and proved my worst misgivings 100% justified.

There remains an element of contradiction in saying that Cg offers increased abstraction over other options when it is the lack of abstraction and the tying of language features to hardware-specific profiles that is the key objection others (including Carmack) have w.r.t. Cg, more so than the tying of the compiler to the driver, although the issues may not be unrelated. The F-buffer seems to make the counter-objections to glslang moot.


[This message has been edited by dorbie (edited 04-22-2003).]

Humus
04-22-2003, 01:13 PM
Originally posted by cass:
Designing *good* high-level compilers is a very complex task, despite what you might think. Future generation GPUs will likely have enhanced programming models that will make it even more complex.

The difference between "works" and "good" is in the optimisations. Plain parse -> working executable is a fairly simple task - fairly simple relative to other driver tasks, that is. Once you get into optimisations the task gets way harder, of course, but it gets even harder still if you've lost the high-level semantics and need to figure them out from low-level assembly.

Humus
04-22-2003, 01:22 PM
Originally posted by Korval:
That's the sad part about it. You see, you actually lose, but you don't even realize that you lose. You're going to get less stable drivers, or your next card is going to cost more. Or, your shader language will just not be optimized worth anything. You can't have it all; take your pick.

BTW, having looked at glslang, I much prefer Cg. Glslang looks like it was made by 3DLabs for 3DLabs hardware (i.e., a scalar processor), rather than for a vector processor. Also, it doesn't seem to expose per-vertex color interpolators; it only has enough per-vertex interpolators for 8 texture coordinates.

Well, having actually worked with such drivers and given the alpha status I'm more than satisfied with the quality. It's more functional than GL_ARB_fp/GL_ARB_vp or even GL_ATI_fs was at this early stage.

As for the language itself, I don't see your connection to scalar processors. Color interpolators are there too, and there's no limit whatsoever on the number of interpolators. The good thing about glslang compared to DX9 HLSL and Cg is also that I don't need to care about which attribute goes where: I don't care if my normal is passed through texcoord0 or texcoord5; all I need to do is declare 'varying vec3 normal;' and the driver can pass it through whatever interpolator it thinks is best.
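For anyone who hasn't looked at the proposal yet, a minimal sketch of what that looks like, assuming syntax roughly along the lines of the 3Dlabs glslang papers (the shader pair itself is just illustrative):

// vertex shader: declare the varying and write it; no texcoord slot is ever named
varying vec3 normal;
void main()
{
    normal      = gl_NormalMatrix * gl_Normal;
    gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
}

// fragment shader: the same declaration picks up whatever interpolator the driver chose
varying vec3 normal;
void main()
{
    gl_FragColor = vec4(normalize(normal) * 0.5 + 0.5, 1.0);
}

How 'normal' gets routed to a hardware interpolator is left entirely to the implementation.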

Humus
04-22-2003, 01:27 PM
Originally posted by Mark Kilgard:
I think by "the spec", you mean ARB_vertex_program. Keep in mind that the ARB_vertex_program specification predated the ARB_fragment_program extension. Hardware in the timeframe of ARB_vertex_program's standardization did not support a vertex program SIN/COS instruction. DirectX 8 expected these instructions to be approximated by multi-instruction sequences that amounted to a Taylor series approximation. Because the quality of the approximation can vary, the ARB_vertex_program extension left it to programmers to implement their own approximations.

You just provided a real-world example where high-level semantics were lost and your hardware will not be utilized for exactly that reason. Which is precisely my point.


Originally posted by Mark Kilgard:
The problem with building a shading language into an OpenGL driver is that then you are at the mercy of whatever compiler bugs exist in the driver which could be different from machine to machine.

I fail to see how this is different from the current situation. We will always be at the mercy of driver quality. There are no guarantees whatsoever that an asm shader will compile correctly.

Humus
04-22-2003, 01:31 PM
Originally posted by Mark Kilgard:
With Cg, the same shader program can be compiled for either OpenGL or Direct3D profiles.

Shaders, like models and textures, are part of the art path for 3D development. The look of a 3D art asset is determined by a combination of the 3D model, one or more shaders for the model, and associated textures. If I'm going to create an art asset, why do I want it tied to a particular 3D API?

This is hardly an issue. Not only is the target audience for such portability very small, but if you plan on targeting both D3D and OpenGL, shader compatibility is probably the least of your worries.

Mark Kilgard
04-22-2003, 01:38 PM
Originally posted by dorbie:
way back when Cg started out, NVIDIA stated they had NO intention of offering Cg to the ARB. When it looked like the result of this closed policy was going to be a competing standard from 3DLabs, NVIDIA hastily tried to offer Cg to nip it in the bud.


NVIDIA did offer Cg as a starting proposal to the ARB at the appropriate time when proposals were being considered. The ARB, as a whole, turned NVIDIA's Cg proposal down unfortunately.

NVIDIA solicited a lot of input about the language from developers. The strength of the language and its implementation is a testament to all the feedback that was received and integrated. NVIDIA went quite far to offer hands-on sessions for developers at SIGGRAPH and other events so the Cg design team could find out what developers really wanted.

I'm not sure what you mean by "[NVIDIA] had NO intention of offering Cg to the ARB" when you subsequently say that, in fact, NVIDIA did what you claim NVIDIA had no intention of doing. I think NVIDIA's action of offering Cg up as a starting proposal for the ARB is stronger evidence of NVIDIA's true intent than your personal assessment of another's intent.


Originally posted by dorbie:
Sure NVIDIA did a great job with Cg, but the lack of a common HLSL that does what you would call "the right thing" has as much to do with NVIDIA's closed shop, secret squirrel approach to this early on as it has to do with lack of consensus. Consensus is also tricky when one of the major vendor's view is simply, Cg, good, anything else, bad. When you're half of the only significant consensus required, don't bitch about delays reaching a consensus.


Consider what actually happened. Cg had several beta releases, including public releases, all intended to solicit comment on Cg. NVIDIA got a lot of comments and bug reports and they were taken to heart and addressed.

As far as consensus, NVIDIA worked with Microsoft so that NVIDIA's Cg and Microsoft's HLSL implementations would share a common underlying language. It is straightforward to verify that NVIDIA's cgc compiler and Microsoft's fxc compiler can compile the exact same shader source code and generate functionally equivalent code. This wasn't an accident; it required substantial collaboration and consensus.

At the same time, NVIDIA didn't walk away from the ARB process. You'll find multiple NVIDIA engineers listed on the "Contributors" list for the glslang proposals. There are multiple instances of glslang being positively influenced by the experience and functionality of Cg.

However, in my personal opinion, I find that a lot of the major high-level decisions about glslang (trying to embed it in the driver, trying to tie it unnecessarily to OpenGL only, not providing any practical way to support existing GPUs too) make the proposal a poor choice for a common language for GPU programming.

The glslang process set up certain non-goals that were in direct opposition to the goals NVIDIA wanted to see in a GPU programming language. NVIDIA wanted to see a language that worked with both major APIs, that supported older GPUs as well as the latest GPUs, and provided a way for the language implementation to improve without continual driver upgrades. The ARB simply didn't value these goals. That left NVIDIA to 1) continue to participate in the ARB process despite a vote that didn't go NVIDIA's way, and 2) continue to develop Cg to address the goals that NVIDIA felt were important to achieve. In the end, Cg is here now. When glslang is eventually available, 3D developers can decide which GPU programming system best meets their needs. Ultimately, 3D programmers will decide.


Originally posted by dorbie:

It's also worth pointing out that one of the major bones of contention over competing HLSLs is that Cg, for all the vaunted platform abstraction, does not offer enough of it. It is not write-once, run-anywhere; the shader complexity is seriously restricted by the target hardware.


Actually, you'll find that Cg 1.1 includes a meta file format called CgFX that allows you to embed Cg source code along with the non-programmable state required to realize a given shading effect. This includes support for multiple passes and support for implementing different versions of the same effect (called techniques) in the same file so that you can select the implementation of the effect that best matches your hardware.

Now I'll agree that this is not total virtualization, but it does provide a pragmatic way to support multiple hardware generations in a common way. Lots of software developers have found the support provided by the format helpful. Additionally, CgFX and Microsoft's .fx file formats are compatible.

- Mark

Humus
04-22-2003, 01:39 PM
Originally posted by Mark Kilgard:
You wouldn't expect OpenGL to do image file format loading for you (and wait until all interesting OpenGL implementations provided the extension). You are quite happy to link with a file format loading library.

This is a silly argument. Loading an image from .tga instead of say .bmp does not provide any runtime advantages. Loading a HLSL shader instead of an assembler shader provides significant runtime advantages.

cass
04-22-2003, 01:47 PM
Originally posted by dorbie:
There remains an element of contradiction in saying that Cg offers increased abstraction over other options when it is the lack of abstraction and the tying of language features to hardware-specific profiles that is the key objection others (including Carmack) have w.r.t. Cg, more so than the tying of the compiler to the driver, although the issues may not be unrelated. The F-buffer seems to make the counter-objections to glslang moot.

Sorry if I was unclear. What I mean by "level of abstraction" is that the language specification is not filled with gl-centric keywords or special variables. In this sense it is also more abstract than the RenderMan shading language.

Cg sought to provide a high level of abstraction to the problems being solved by Cg code, but it did *not* seek to hide the limitations of the hardware on which Cg code was running.

If your code is sufficiently simple, it really is write-once, use-many. If you use features that require special hardware, fine. It's silly to pretend that a GeForce256 can do exactly the same things as a GeForceFX. Providing that level of abstraction (if it amounts to software fallback) doesn't really help anybody.

Thanks -
Cass

Humus
04-22-2003, 01:47 PM
Originally posted by Mark Kilgard:
Why the ARB decided to embed a programming language within the OpenGL driver is beyond me. Andy why tie the shading language to OpenGL and only OpenGL?

As if GL_ARB_vp/GL_ARB_fp are not languages embedded in the driver ...
The only difference is that they're not high-level. The better question would be: why does OpenGL bother with low-level hardware details in ARB_vp/ARB_fp that are likely to change in the future?

cass
04-22-2003, 01:56 PM
Originally posted by Humus:
As if GL_ARB_vp/GL_ARB_fp are not languages embedded in the driver ...
The only difference is that they're not high-level. The better question would be: why does OpenGL bother with low-level hardware details in ARB_vp/ARB_fp that are likely to change in the future?

Because OpenGL is a hardware abstraction. What you want is something higher level than OpenGL, it sounds like.

Humus
04-22-2003, 01:56 PM
Anyway, to complete my spamming spree here ... I think there is a solution to satisfy both worlds. My main concern is that high-level semantics are lost, which makes optimisations a sad business and severely limits hardware innovation.
If, however, a middle-layer language were designed that preserves high-level semantics, we could kick the parser out of the driver and open up a choice of languages. An assembly language is not such a language, though. One could come up with a standard middle language, which could even be binary, that simply defines expressions, datatypes and their relations. That oughta make everyone happy. http://www.opengl.org/discussion_boards/ubb/smile.gif

Humus
04-22-2003, 02:05 PM
Originally posted by cass:
Because OpenGL is a hardware abstraction. What you want is something higher level than OpenGL, it sounds like.

Last time I checked OpenGL was a graphics API and not a hardware abstraction layer. Confusing it with D3D I suppose?

[This message has been edited by Humus (edited 04-22-2003).]

cass
04-22-2003, 02:39 PM
Originally posted by Humus:
Last time I checked OpenGL was a graphics API and not a hardware abstraction layer. Confusing it with D3D I suppose?


Of course, it is a graphics API, but one of the guiding principles of its design was that it should represent an interaction of application software with graphics hardware.

To quote the OpenGL 1.4 spec, section 1.2, "OpenGL (for “Open Graphics Library”) is a software interface to graphics hardware."

This is why things as "simple" as concave polygon tessellation were not included in the OpenGL core.

GLU and other libraries were deemed the right place for software to talk to software.

Again, I'm not saying I think a higher level of abstraction is not useful. Quite the contrary. But I don't think the OpenGL driver is the best place to put that higher level abstraction.

Thanks -
Cass

dorbie
04-22-2003, 02:42 PM
Mark, history is not so important, but my reason for stating NVIDIA's intent back then is that it was their (your) expressed intent not to offer it to the ARB, not just some piece of Kremlinology. There was a clear U-turn around the time it looked like 3DLabs was gaining traction; that's still the way I read events. It's only worth mentioning when someone gets on their soapbox and complains about ARB inaction over this. The ARB may have adopted Cg if it had been floated sooner, IMHO. By unveiling it virtually complete AND keeping it proprietary you made the emergence of an alternative inevitable.

I too thought it was a great shame that the ARB never worked harder to arrive at a common API; I think there was intransigence on all sides and plenty of blame to spread around. When two ideas are so well developed, with fundamental design differences and competing commercial interests, agreement becomes virtually impossible.

I wouldn't expect IBM to cede their compiler development to Intel AND hand over control of the language, placing limits on the features it exposes. I don't expect ATI to do the equivalent with Cg. It doesn't matter how hard you push on the ramrod; now we've all got to live with the consequences.

I came in here saying it's not enough to simply offer something (mandate most of it with no real consultation with those who have the largest stakes in this) and blame your enemy for not supporting your strategy. That's where I'll leave it.

cass
04-22-2003, 03:26 PM
Originally posted by dorbie:
Mark, history is not so important, but my reason for stating NVIDIA's intent back then is that it was their (your) expressed intent not to offer it to the ARB, not just some piece of Kremlinology.

Angus,

Yes, but it's not as if this is duplicitous. Our vision has always been that high-level shading languages should exist on top of the low-level API, not in it. When it became clear that the ARB was going to put *something* into the low-level API, we offered a Cg-compatible language proposal. Again, this is consistent with our desire to have a cross-platform, cross-api shading language. OpenGL is not well served by reinforcing the 3D graphics API partition where it does not need to exist.

Thanks -
Cass

Korval
04-22-2003, 07:06 PM
This is a silly argument. Loading an image from .tga instead of say .bmp does not provide any runtime advantages.

Says who? Granted, a .tga isn't particularly useful, but I wouldn't mind having the facility for having a memory image of an OpenGL texture, where I can just give OpenGL a file handle and it can read it directly into AGP memory via async-reads. I, however, understand that it would be inappropriate for OpenGL to have said functionality in it.


Once you get into optimisations the task gets way harder of course, but it gets even harder if you've lost high-level semantics and need to figure it out from a low-level assembly semantics.

Alternatively, you just understand what the user wanted. So what if your assembly language, or internal language, had "sin" in it, but some old shader used a Taylor approximation. By the time "glslang" hits, the vast majority of these issues will have been ironed out.


I fail to see how this is different from the current situation. We will always be on the mercy of driver quality. There are no garantuees whatsoever that an asm shader will compile correctly.

You've never written an optimizing compiler, have you? This is hardly a trivial task. There's no need to add complexity to an already increasingly complex problem.


If, however, a middle-layer language were designed that preserves high-level semantics, we could kick the parser out of the driver and open up a choice of languages. An assembly language is not such a language, though. One could come up with a standard middle language, which could even be binary, that simply defines expressions, datatypes and their relations. That oughta make everyone happy.

And, precisely, how is that different from what ARB*program is now? Because it looks like assembly rather than some "expression tree"? The method of encoding is not the issue.

Also, Humus, you haven't pointed out precisely how tossing an optimizing C compiler into the driver doesn't complicate the drivers and take away driver development time. You seem to simply ignore that issue, as though it didn't matter.


I think that the point of the issue of Cg being "open" is the following. There are around 6 available profiles for Cg: two ARB ones, and 4 for various nVidia-only extensions. Granted, yes, Cg predates the ARB extensions, but that isn't really the point. The point is, why isn't ATi's EXT_vertex_shader or ATI_fragment_shader on that list? ATI_fragment_shader, for example, is quite versatile, and could be compiled to much as ARB_fragment_program is (with limited dependent texture accesses - too limited for the ARB spec). And EXT_vertex_shader is virtually identical to NV_vertex_program. Admittedly, yes, ATi had a very strange way of building and binding programs. However, I saw no push whatsoever to add profiles for these once ATi added more standardized binding methods.

Has anyone even considered adding platform-specific profiles for 3DLabs or Matrox (do they exist?) cards? The fact is, Cg binds you to nVidia cards. The ARB compiler even generates nVidia-friendly code.

The problem is that Cg was never originally presented as the high-level means of accessing shader hardware on video cards. It was originally presented as the high-level means for doing shading on nVidia cards. And it, quite frankly, remains so to this day. Oh, it's more other-platform friendly in that it outputs to standard assembly languages, but nVidia controls the compiler. Do I really expect nVidia, or any single corporate entity, to do something so far outside of their own best interests as play fair with a product that they solely control?

There was never a real question that I would ever see an ATi or any other vendor-specific profile for Cg. This is how the world works. It looks good for nVidia to pretend to be open about Cg while still controlling it to enough of an extent behind the scenes to keep themselves in control.

Which is why ATi, and many of nVidia's other competitors, did everything they could to keep a language that nVidia ultimately controls out of the ARB. Also, by binding the language to OpenGL 2.0, they ensure that it would take an nVidia-specific program extension to allow Cg to even work (unless nVidia finds a way to make Cg 'compile' to glslang). And how many people are going to want to have to write two code paths, one glslang, the other Cg, especially with the inroads ATi is making in the current market? In short, GL 2.0's glslang limits the usefulness of Cg as an OpenGL shading language.

To be honest, that would be perfectly fine with me if:

1) I didn't hate the glslang language itself, and
2) I didn't think that binding a high-level language into a graphics API was a terrible idea.

So, on the one hand, glslang will make it very difficult for Cg to gain a foothold in GL 2.0 territory. But, on the other hand, it damages driver quality (and glslang suxors).

[This message has been edited by Korval (edited 04-22-2003).]

tweakoz
04-23-2003, 08:54 AM
Originally posted by cass:
Angus,

[clip]

Thanks -
Cass


/////////////////////////////////////////////////////////////////

Well regardless of any arguments between the CG team and the glslang team, CG is going to support
glslang as a target - !!!! RIGHT?

I certainly (within my power) intend on supporting
CG/HLSL in our pipeline, simply for the fact that
it enables us to have wysiwyg functionality
in our Maya based Art Pipeline (to some extent anyway) and not have to redo the shaders for
PC targets (DX/GL).

The major practical issue I have with all the high level stuff however is that outside of prototyping
effects (on a PC) we still have to rewrite the equivalent code in VU assembly/VCL
(Playstation2)

It would be nice if NVidia teamed up with the VectorC people and made Cg work on PS2,
although that might be difficult (but not impossible) considering the PS2 has full branching/looping support and does not have the simple vertex stream limitations of current PC Gfx Architecture
(Although PS2 has NO fragment shader support - its all MultiPass).

Considering that the PS2 represents probably the
largest market share of 3D applications (even over DX) there is probably still some merit in getting CG on the PS2 (or even Consoles in General). Even though the PS2 is not the most glorious platform to work on, as a games developer, there is a reasonable chance you will have to support it to make money...... (not that making money matters to me, but it does to my employer...)

//////////////////////////////////////////////////////////////

Michael T. Mayers
Sr. Engineer
Jaleco Entertainment
mmayers@jaleco.com
michael@tweakoz.com

dorbie
04-23-2003, 09:28 AM
Cass, I'm not accusing anyone of being duplicitous, (revisionist perhaps). I have nothing but admiration for the work done on all sides. It's really incredible IMHO.

I just hoped for a better outcome.

tweakoz, I'm not on either team. I'm just observing from the sidelines.

Korval
04-23-2003, 11:09 AM
Well regardless of any arguments between the CG team and the glslang team, CG is going to support glslang as a target - !!!! RIGHT?

To the extent that you can compile C into Java... If the languages are sufficiently different, then it may not be reasonable to compile Cg into glslang.


It would be nice if NVidia teamed up with the VectorC people and made Cg work on PS2, although that might be difficult (but not impossible) considering the PS2 has full branching/looping support and does not have the simple vertex stream limitations of current PC Gfx Architecture
(Although PS2 has NO fragment shader support - its all MultiPass).

The difficulty is that PS2 vector units have to do things that Cg was never designed to even consider. The VU's have to perform the functions of a command processor (fetch the data from memory) and a vertex unit (transform the data). It's not necessarily reasonable to make Cg have these additional facilities.

Also, the way in which you fetch data for the VU's are very engine-specific. As such, each project would need to be able to modify the Cg compiler in order to support their methods for doing DMA's and so forth.


Considering that the PS2 represents probably the
largest market share of 3D applications (even over DX) there is probably still some merit in getting CG on the PS2 (or even Consoles in General). Even though the PS2 is not the most glorious platform to work on, as a games developer, there is a reasonable chance you will have to support it to make money...... (not that making money matters to me, but it does to my employer...)

Define "you". If "you" means "nVidia", they get their money from PC's and X-Boxes. Indeed, based on that, I seriously doubt you will ever see a PS2 Cg profile, as the X-Box is a compeditor to the PS2.

Humus
04-23-2003, 01:08 PM
Originally posted by Korval:
Alternatively, you just understand what the user wanted. So what if your assembly language, or internal language, had "sin" in it, but some old shader used a Taylor approximation. By the time "glslang" hits, the vast majority of these issues will have been ironed out.

"So what"? Wasn't it you who cried over driver developer efforts needed? The high-level semantics are extremely useful for the driver. Not only does is save loads of time to have that sin in there, but it also improves performance. I'm getting really confused over what you really want.


Originally posted by Korval:
You've never written an optimizing compiler, have you? This is hardly a trivial task. There's no need to add complexity to an already increasingly complex problem.

No, I haven't, but I've written several parsers and interpreters over the years. Regardless, I fail to see your point. Having the high-level semantics at hand makes it way easier to make optimisations. You can't lose by having more information available. Optimisation is the hard part, not the transformation from high-level semantics to low-level assembly. If there's going to be any middle layer, it had better preserve the high-level semantics.


Originally posted by Korval:
And, precisely, how is that different from what ARB*program is now? Because it looks like assembly rather than some "expression tree"? The method of encoding is not the issue.

The method of encoding is the exact issue. It's not the look; it's that information that is extremely valuable to the driver is lost in the process of going to assembly. If sine is expanded to a Taylor series, it's darn near impossible to recover that information.


Originally posted by Korval:
Also, Humus, you haven't pointed out precisely how tossing an optimizing C compiler into hardware doesn't complicate the drivers and take away driver development time. You seem to simply ignore that issue, as though it didn't matter.

How many times need I address this issue before you realise it?

Korval
04-23-2003, 05:29 PM
"So what"? Wasn't it you who cried over driver developer efforts needed? The high-level semantics are extremely useful for the driver. Not only does is save loads of time to have that sin in there, but it also improves performance. I'm getting really confused over what you really want.

The point is, stop trying to turn a Taylor series into a sine operation. Don't look for it, don't attempt to make it faster. Simply understand that the user expects the operations to be precisely what they are.

Since it is far too difficult to reclaim that information at the assembly level, don't spend the time trying.


Having the high-level semantics at hand makes it way easier to make optimisations. You can't lose by having more information available. Optimisation is the hard part, not the transformation from high-level semantics to low-level assembly.

My point is that, since you shouldn't attempt to reconstruct high-level semantics from the assembly language, writing an assembler for the low-level language is quite easy. Let's take this example:

Let's say I'm writing some vertex program for ARB_vertex_program. And I look for a "sin" opcode, but there isn't one. The first thought I have is, "Can I perform the same task without it?" If the answer is "no", then my next question is, "Can I approximate it well enough without a significant hit on performance?" And I would measure "significant hit" based on the number of opcodes it takes to perform the approximation. If I decide that the number of opcodes is simply too many, I declare that, on ARB_vertex_program hardware, I don't use the effect in question.
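(For concreteness, the kind of multi-instruction approximation being weighed here is a truncated series along the lines of

\sin(x) \;\approx\; x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!}

which, once the angle has been range-reduced to roughly [-\pi, \pi], works out to around half a dozen MUL/MAD instructions in Horner form. The exact series, term count and range reduction are the programmer's call; this is only a sketch.)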

Now, let's say I decide that a Taylor expansion is fast enough. If I run this program on hardware that actually supports an internal 'sin' operator, do I really expect it to figure out that I'm using a Taylor expansion and adjust my code accordingly? No; indeed, to do so may lead to doing the wrong thing. That aside, I, as the ISV in this case, have decided that spending the time to execute the code as written is OK. I do not expect the assembler to perform this task, and I am definitely not relying on it. I expect the assembler to convert the assembly into the internal representation, doing modest optimization where it can, but not trying to reconstruct my intent.

Sure, it'd be nice if it could figure out what I meant and do something reasonable. But, let's look at the alternative case.

Let's say that ARB_vertex_program has a "sin" opcode. I say, "Sounds nice, I'll use it." Now, the 8500 doesn't support that opcode, but let's say that it exposed this vp extension anyway, emulating it with a Taylor expansion.

Suddenly, I find that my vertex programs are running unacceptably (to me) slow on an 8500. Nothing in my vertex program code tells me that "sin" is the culprit, but it is. Why? Because I couldn't see the actual code the driver chose to use.

In that case, I need a way to tell which implementations of ARB_vertex_program will, and will not, Taylor-expand "sin". I'd need to know this regardless of whether it's ARB_vertex_program, Cg, or glslang (though with Cg, you can see what it compiles to). That's why it's nice to be able to simply query extensions; you know (or have a pretty good idea) what a card can and cannot do. One thing I don't like about ATi's implementation of ARB_fragment_program is that their cards don't actually support "sin" in hardware, so they have to emulate it without telling you. Not supporting "sin" may well be grounds for me not using effects that require "sin", so I should know beforehand.


The method of encoding is the exact issue. It's not the look, it's that information that is extremely valuable to the driver is lost in the process going to assembly.

So, what you want is a more complete assembly language than ARB_vertex_program to use as the "middle-ware"? Good; so do I. Which brings us back to the initial question of this thread, "When is vp2 coming out?"

I don't look at vp1 as the end; it is merely the beginning. Even NV_vertex_program2 is just another step (though I don't imagine there'll need to be too many more afterwards).


How many times need I address this issue before you realise it?

Your arguments tended to hinge on the assumption that the assemblers were doing things like looking for Taylor expansions or other such nonsense, instead of simply assembling the language into hardware opcodes and doing some instruction re-ordering/etc as needed. Since drivers are patently not doing so (and if they are, maybe ATi's driver people need to get their priorities in order?), your argument is moot. It is, therefore, far easier to write an assembler for ARB_vertex_program than a full-fledged optimizing compiler for something like Cg or glslang. It is certainly much easier to do it every 6 months when the hardware changes.

cass
04-23-2003, 08:09 PM
I agree with Korval on this point. The assembly languages should be a pretty accurate and thin hardware abstraction.

There will be different versions of them, and they will support "OPTION" mechanisms for extension. In the traditional OpenGL style, you can go vanilla for portability or use extensions for advanced/non-standard functionality.

The key is that the driver should not need advanced compiler algorithms to turn optimized assembly into efficient GPU microcode. Generating correct, optimized assembly is then no longer the responsibility of the driver.

You may not see this as a big deal today when many shaders look like

result = diffuse * texture2D(tex0);

but that's just the HelloWorld tip of the shading iceberg. When you have a full-blown object-oriented language, complete with polymorphism and generics and all the other language jargon, you may feel differently. You're kidding yourself if you think we won't get there -- soon.

What if WinXP systems only ran programs that were compiled *from source* *on that system* and with Visual Studio .NET? What if different WinXP systems had different versions of VS.NET, each with slightly different compiler bugs. Sound attractive?
Does putting a different optimizing high-level compiler into each driver sound like a very "forward looking" plan?

Gratuitous Dennis Miller quote: "Of course, that's just my opinion. I could be wrong." http://www.opengl.org/discussion_boards/ubb/smile.gif

Cass




[This message has been edited by cass (edited 04-23-2003).]

KRONOS
04-24-2003, 07:19 AM
What if WinXP systems only ran programs that were compiled *from source* *on that system* and with Visual Studio .NET? What if different WinXP systems had different versions of VS.NET, each with slightly different compiler bugs. Sound attractive?
Does putting a different optimizing high-level compiler into each driver sound like a very "forward looking" plan?

Do we have to start worrying about driver bugs now?

It does sound great. I can't see the big deal. Won't glslang code run everywhere?

It's like fp at a different level.
cass: all your arguments can be applied to fp. fp code still has to be compiled. Should we have the fp compiler outside of the driver too?

Really, I can't see the big deal with the compiler living inside the driver, I really can't...

cass
04-24-2003, 08:08 AM
Originally posted by KRONOS:
Do we have to start worrying about driver bugs now?

I'm sorry. I probably shouldn't have brought up driver complexity and reliability. Nobody really cares. http://www.opengl.org/discussion_boards/ubb/wink.gif



It does sound great. I can't see the big deal.

Have you ever developed a high-quality, high-performance, robust OpenGL driver? A high-quality, high-performance optimizing high-level language compiler? Ever tried to integrate the two and release them in lock-step?

If not, I can understand why you might not see the big deal.



It's like fp at a different level.
cass: all your arguments can be applied to fp. fp code still has to be compiled. Should we have the fp compiler outside of the driver too?

You're right - assembly is like C++ at a different level. Fundamentally they're both programming languages, but there's a huge difference in software complexity between a fully ISO-compliant optimizing C++ compiler and a straight assembler.

The assembly language provides the hardware abstraction. That's why the assembler should live in the driver. Also, assembly language is assembled, not compiled. As I said before, this is a huge difference.


Really, I can't see the big deal with the compiler living inside the driver, I really can't...

Just because you can't see it doesn't mean it isn't there. http://www.opengl.org/discussion_boards/ubb/smile.gif

Reminds me of that quote: "Just because you're paranoid doesn't mean they aren't really out to get you." http://www.opengl.org/discussion_boards/ubb/smile.gif

Thanks -
Cass

Edit: fix ubb markup

[This message has been edited by cass (edited 04-24-2003).]

tweakoz
04-24-2003, 08:19 AM
To the extent that you can compile C into Java... If the languages are sufficiently different, then it may not be reasonable to compile Cg into glslang.


There is one thing I learned about stuff like this in my
relatively short lifetime: when there is a will, there is a way (it's been proven time and time again) ....
With computers anyway.... ;>



The difficulty is that PS2 vector units have to do things that Cg was never designed to even consider. The VU's have to perform the functions of a command processor (fetch the data from memory) and a vertex unit (transform the data). It's not necessarily reasonable to make Cg have these additional facilities.

Also, the way in which you fetch data for the VU's are very engine-specific. As such, each project would need to be able to modify the Cg compiler in order to support their methods for doing DMA's and so forth.


I know that it is possible to write a CG backend for PS2, at least in the context in which I would want it - i.e. process 1 vertex; that backend would dump VCL code which I could paste into a higher-level VU renderer that handles quad buffering, instancing, etc.... I wouldn't even want the CG runtime for this,
just the compiler....

Since data is always 'pushed' into the VU (the VU can NOT say 'retrieve data from HOST mem'),
The VU with a couple of exceptions (curved surfaces, etc...) is still a STREAM processor.
As far as it is concerned it gets sent a serial
stream of vertices to be processed with a particular set of xform ops to be performed on them
which it then assembles into packets to be sent to the rasterizer ( 1 in, 1 out )

That matches to some degree the model of CG
which is that of the vertex stream...

This method would not require a change to the CG
spec, just profile specific options....



Define "you". If "you" means "nVidia", they get their money from PC's and X-Boxes. Indeed, based on that, I seriously doubt you will ever see a PS2 Cg profile, as the X-Box is a compeditor to the PS2.


It's in NVidia's best interest for CG to prosper,
And to do so, it needs industry acceptance.
It's not going to get that if they target it only for NVidia cards, and I don't honestly believe that's their strategy anyway, especially after reading this thread. And I'm not going to get into a long discussion about XBox vs PS2 - but the Xbox is not really a 'viable' competitor in the marketplace. There are 10X as many PS2's out there.
Even Microsoft would support the PS2 if they had no choice (if the XBox fails) and they thought they could make money doing it....

mtm

KRONOS
04-24-2003, 08:24 AM
I'm sorry. I probably shouldn't have brought up driver complexity and reliability. Nobody really cares.


Is that sarcasm?! I hope not... http://www.opengl.org/discussion_boards/ubb/wink.gif I do care about driver reliability, but I'd rather not have driver or compiler or any other bugs... That would be sweet! http://www.opengl.org/discussion_boards/ubb/smile.gif



Have you ever developed a high-quality, high-performance, robust OpenGL driver? A high-quality, high-performance optimizing high-level language compiler? Ever tried to integrate the two and release them in lock-step?


I always thought that the two would be separate, but apparently not... Like a glslang.dll and oglnt.dll kind of thing. For example, a driver release could just update one DLL and not the other. But this is just me; I never did develop a compiler or driver... http://www.opengl.org/discussion_boards/ubb/wink.gif

But I guess that a driver nowadays is a lot more complex than it was, say, 3 years ago? Even more complex than including a compiler in it today? Won't GL2 clean this all up?

But hey! I'm just a newbie! http://www.opengl.org/discussion_boards/ubb/smile.gif Reminds me of a saying we have here in my country: it looks like an ass looking at a palace (me being the ass) http://www.opengl.org/discussion_boards/ubb/wink.gif

[This message has been edited by KRONOS (edited 04-24-2003).]

Korval
04-24-2003, 12:48 PM
Since data is always 'pushed' into the VU (the VU can NOT say 'retrieve data from HOST mem'),
The VU with a couple of exceptions (curved surfaces, etc...) is still a STREAM processor.
As far as it is concerned it gets sent a serial
stream of vertices to be processed with a particular set of xform ops to be performed on them
which it then assembles into packets to be sent to the rasterizer ( 1 in, 1 out )

First, you don't need to put a manual carriage return at the end of a line; my web-browser is perfectly capable of doing that itself.

Second, that is simply how you use the PS2. Other people use the PS2 in different ways. To make a common Cg compiler that compiled to PS2 vector code would require a common facility for doing PS2 vertex processing. There is no common facility for vertex processing, so you can't really make a single Cg compiler for it. You would instead have to make multiple Cg compilers, one for each way people do vertex processing in the VU.

Lastly, maybe you didn't notice, but one of the primary purposes of a vertex program is to feed the fragment program data. Considering that you have to build your fragment program on a PS2 out of multiple passes (with really poor blending options, too), the output of a PS2 Cg compiler would have to involve some form of transparent multipass. And this transparent multipass could be done, on the VU, in any number of ways. But, regardless, the transparent multipass has to know the locations of textures and so forth so that it can set up state properly. Cg has no way to input that data.

The PS2 is simply not the kind of machine that could run a typical Cg program. Or, if it did, it would bind you to a particular way of sending vertex data, which could bind you to a particular way of building your meshes, which requires you to write new tools, etc. Not a very good idea.


It's in NVidia's best interest for CG to prosper

I wouldn't go that far.

It would be good for nVidia if Cg became really big. But, it would hardly kill nVidia if it didn't. And the gains for Cg's success aren't that huge for the company, compared to the possible gains for X-Box or X-Box 2 sales.

It is in nVidia's interests to see Cg succeed, but it isn't so important as to override current sales. After all, Cg doesn't actually make nVidia money directly; X-Boxes do.

Humus
04-24-2003, 01:14 PM
Originally posted by Korval:
The point is, stop trying to turn a Taylor series into a sine operation. Don't look for it, don't attempt to make it faster. Simply understand that the user expects the operations to be precisely what they are. Since it is far too difficult to reclaim that information at the assembly level, don't spend the time trying.

The problem is then that it limits innovation. Your precious new innovative hardware features won't get used, so why innovate?


Originally posted by Korval:
Suddenly, I find that my vertex programs are running unacceptably (to me) slow on an 8500. Nothing in my vertex program code tells me that "sin" is the culprit, but it is. Why? Because I couldn't see the actual code the driver choose to use.

This is in no way different that what happens if you use odd swizzles like .xxzy on R300s in ARB_fp. Or any different from when stuff go into software mode. Or when dependent texture reads on R200 is slow because it lacks a crossbar memory archetecture etc. Assembly languages does not solve these problems.


Originally posted by Korval:
So, what you want is a more complete assembly language than ARB_vertex_program to use as the "middle-ware"? Good; so do I. Which brings us back to the initial question of this thread, "When is vp2 coming out?"

No, I want a language that preserves high-level semantics so that hardware vendors can continue to innovate. Looks don't matter much, but the high-level semantics had better be there. Unless the assembly style is significantly revamped in the next revision, loads of precious information will still be lost.


Originally posted by Korval:
Your arguments tended to hinge on the assumption that the assembler's were doing things like looking for Taylor expansions or other such nonesense, instead of simply assembling the language into hardware opcodes and doing some instruction re-ordering/etc as needed. Since drivers are patently not doing so (and if they are, maybe ATi's driver people need to get their priorities in order?), your argument is moot. It is, therefore, far easier to write an assembler for ARB_vertex_program than a full-fledged optimizing compiler for something like Cg or glslang. It is certainly much easier to do it every 6 months when the hardware changes.

My argument is that IHVs will want to continue to innovate. In order for new innovations to be useful, they need to be utilized. If trying to recover high-level semantics is the only way to utilize your new hardware, then the IHV is left with the choice of either not innovating or spending a whole lot of time reverse-engineering the assembly.
(Hardware doesn't change every 6 months any more either, btw; 12 months is the norm today and it will get longer over the years to come.)

cass
04-24-2003, 01:34 PM
Humus,

You make very good arguments about the value of high-level shading languages. I agree. Do these high-level languages lose significant value if they live outside the driver?

Cass

Humus
04-24-2003, 01:39 PM
Originally posted by cass:
The key is that the driver should not need advanced compiler algorithms to turn optimized assembly into efficient GPU microcode. Generating correct, optimized assembly is then no longer the responsibility of the driver.

The problem is then that what is optimal assembly differs significantly between different hardware. On the GFFX you can do loads of smart stuff by using the fully general swizzles which may save you some instructions, which is something Cg takes full advantage of. But when you pass such a shader to the R9700 it results in horrible performance since many of these swizzles can result in like 4-6 instructions. This needn't be the case had the R9700 driver known the intent of the programmer.
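To make the swizzle point concrete, here is a minimal glslang-style sketch (purely illustrative; 'a' and 'b' are made-up varyings):

varying vec4 a;
varying vec4 b;
void main()
{
    // the source states the intent (an arbitrary component reordering);
    // each driver is free to map it to whatever its own hardware does cheaply
    gl_FragColor = a.wzyx * b;
}

Written at this level, a GFFX driver can keep the free swizzle, while an R9700 driver can reorder the computation itself instead of spending extra instructions emulating a swizzle pattern that was tuned for someone else's hardware.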


Originally posted by cass:
You may not see this as a big deal today when many shaders look like

result = diffuse * texture2D(tex0);

but that's just the HelloWorld tip of the shading iceberg. When you have a full-blown object-oriented language, complete with polymorphism and generics and all the other language jargon, you may feel differently. You're kidding yourself if you think we won't get there -- soon.

I don't see how splitting the optimizer and the driver will save nVidia any time. Unless you plan on dropping Cg soon, you'll have to continue to do both the driver-side work and the high-level compiler and optimizer. Basically, it would be exactly the same effort (or even slightly less effort) if the Cg optimizer were integrated in the driver. ATi would perhaps lose on it if they need to spend time writing an optimizer too. Given the competitive nature of graphics, ATi would be more or less forced to do that anyway, since otherwise their hardware would have to deal with the suboptimal code that Cg produces. Then we're in the scenario where I can choose the nVidia compiler and get bad performance on ATi, or use the ATi compiler and get bad performance on nVidia. Meanwhile, had the optimizer been integrated in the driver, this would not be a problem.


Originally posted by cass:
What if WinXP systems only ran programs that were compiled *from source* *on that system* and with Visual Studio .NET? What if different WinXP systems had different versions of VS.NET, each with slightly different compiler bugs. Sound attractive?
Does putting a different optimizing high-level compiler into each driver sound like a very "forward looking" plan?

Look at the open-source world. It constantly lives with the issues you picture, and it seems to work fine there. Sure, at times you need to squeeze in a #ifdef to work around some problems, but generally speaking, it's not a big deal. And again, assembly language does not automatically make driver bugs go away.

And yes, I would consider the open-source world way more forward-looking than the precompiled-binary world, which forces CPU makers to keep supporting loads of legacy code, 16-bit apps and instructions that no longer make sense.

dorbie
04-24-2003, 02:04 PM
Any high level shading language loses significant value if it is significantly less than optimal or insufficiently supported on important platforms.

A language outside the driver has huge appeal to a company trying to do an end run around the competition, but compilers written and optimized by the people with a vested interest in making each piece of hardware run quickly have an appeal to neutral observers who hope that they can *really* write once, run anywhere at some point in the future. The same goes for a language with features that aren't under the exclusive control of one party.

I don't really care where the compiler resides as long as it works (some need for JIT compilation is clear); I do care who provides it and who controls the language. The reasons seem obvious to me.

jwatte
04-24-2003, 03:46 PM
For what it's worth, in the mid-90s I worked on/with a compiler whose global optimizer ran AFTER code generation. Yes, a commercial compiler that you might have used, derived all semantics needed for optimization from looking at generated assembly code, and most people liked this compiler pretty well.

This was good, because it let the optimizer work totally separately from the parser AND from the code generator. (It was also bad, because there was a long-standing bug where "volatile" would be lost somewhere on the way ;-)

I really don't think the parser and codegen should go in the/each driver. But I said that before, so I'm just repeating myself at this point.

Korval
04-24-2003, 04:42 PM
No, I want a language that preserves high-level semantics so that hardware vendors can continue to innovate. Looks don't matter much, but the high-level semantics had better be there. Unless the assembly style is significantly revamped in the next revision, loads of precious information will still be lost.

Well, vp2 will, almost certainly, include opcodes for sin/cos/etc. So, what operations, in particular, are you concerned about? What is the nature of the "loads of precious information" that "will still be lost"?


My argument is that IHVs will want to continue to innovate. In order for new innovations to be useful, they need to be utilized. If trying to recover high-level semantics is the only way to utilize your new hardware, then the IHV is left with the choice of either not innovating or spending a whole lot of time reverse-engineering the assembly.

You seem to be of the opinion that the world will be stuck with ARB_vp1 forever. It won't; there'll be more extensions that reveal more of the hardware's abilities. Each extension adds new features.


The problem is then that what is optimal assembly differs significantly between different hardware. On the GFFX you can do loads of smart stuff by using the fully general swizzles which may save you some instructions, which is something Cg takes full advantage of. But when you pass such a shader to the R9700 it results in horrible performance since many of these swizzles can result in like 4-6 instructions. This needn't be the case had the R9700 driver known the intent of the programmer.

Which is why there should be an ATI_fragment_program extension (mirroring the hardware), with a corresponding Cg profile optimized for it.

The idea is that you frequently create new assembly-like extensions, whenever new hardware comes out, and update your Cg profiles accordingly.


Sure, at times you need to squeeze in a #ifdef to work around some problems, but generally speaking, it's not a big deal.

I don't consider a world "fine" if I have to change someone else's source code just to use it on my machine. And I'm sure Joe Gamer out there, who wouldn't know a #ifdef from a #pragma, doesn't consider it reasonable either. But that's a different issue.

cass
04-24-2003, 07:02 PM
Originally posted by dorbie:
A language outside the driver has huge appeal to a company trying to do an end run around the competition...

These conspiracy theories baffle me. Any company can develop external high-level shading language compilers - they don't even have to be hardware vendors.

It's absurd to suggest that any one company is inhibiting another from developing an external compiler that optimizes for their hardware.

Cass

PH
04-24-2003, 07:49 PM
Isn't the glslang compiler front-end code written by 3Dlabs, so that each vendor "only" needs to implement a code generator for their hardware?
And isn't one of the main benefits of having the compiler in the driver that it allows the driver to automatically multipass complex shaders?

cass
04-24-2003, 10:00 PM
Originally posted by PH:
... And isn't one of the main benefits of having the compiler in the driver that it allows the driver to automatically multipass complex shaders?

Paul,

Automatic multipass doesn't require a high-level language. ARBvp and ARBfp have plenty of flexibility to advertise outrageously large limits that might require multipass.

Automatic multipass is a very hard problem to solve generally, though (proper blending, managing order dependence, splitting complex vertex and complex fragment programs into simpler passes, kill, alpha test, depth replace). I don't think you'll see any high-quality implementations of it - honestly, it's easier and more robust to just design hardware support for resource virtualization.

Thanks,
Cass

Mazy
04-24-2003, 10:28 PM
Which is why there should be an ATI_fragment_program extension (mirroring the hardware), with a corresponding Cg profile optimized for it.

Isn't the idea with OpenGL that all hardware differences should be abstracted away, so that your code can run on many different cards with no change?
By saying that you need a 3rd-party utility at runtime in order to complete that, you are saying that we're moving that functionality out of the driver. If that's OK with vertex and fragment programs, then why not with other parts... and thus skip the OpenGL layer and let the 'extra' programs take care of the buffer and binding problems as well? Either the OpenGL layer should be constant for all hardware, or else we really don't need it. I'm not saying that a HLSL should be in the driver; I'm saying that we shouldn't rely on tens of different vendor-specific extensions for programs.

But I'm also for the idea of having one common frontend compiler (open source? and shared), so that vendors only implement the backend part (which should be almost the same as the ARB_fragment_program-to-internal-format conversion, or am I totally wrong?), giving the drivers access to the real source if they really want it.

[edit: a lot of spelling errors http://www.opengl.org/discussion_boards/ubb/smile.gif don't write posts before the first cup of coffee]

[This message has been edited by Mazy (edited 04-25-2003).]

PH
04-24-2003, 10:29 PM
Ok, that's what I meant, having complex high-level shaders be broken up into passes. I see this could be done by increasing the ARB_fp/vp limits. I agree that having this solved by hardware (F-buffer, for example) would be much better. I'm of course only worried about DX9-level hardware wrt. HL languages (GF3/4-level hardware is simple enough not to require HL shaders).

Well, I'm worried about how all of this will turn out http://www.opengl.org/discussion_boards/ubb/smile.gif.

Btw, when will the Cg compiler source be updated? And why not release source code for a real backend (profile) for, say, ARB-vp?

cass
04-24-2003, 10:55 PM
Don't worry - it'll all work itself out one way or another. http://www.opengl.org/discussion_boards/ubb/smile.gif

I haven't really been keeping track of the source release details of Cg.

pkaler
04-25-2003, 06:14 AM
I think part of the problem with Cg is perception. nVidia is in the same position with Cg as Sun is with Java: they both ask the community for input, but the final say is always that of nVidia and Sun.

That's frustrating and naturally makes users of the language suspicious.

While nVidia has good standing with the OpenGL community, MS does not. The community is wary of corporations it has been marginalized by.

That's a pretty paranoid argument. But to restate what cass had to say, "Just because you're paranoid doesn't mean they aren't really out to get you."

I prefer glslang. However, the compiler should sit outside the driver. I have an NV22-based card by MSI. I can choose between Detonator, MSI, or Mesa drivers. The choice is mine. It should be the same with the compiler.

The driver should remain in opengl32.dll and libGL.so. The compiler should be in glslang.dll and libGLslang.so.

There should be a choice to precompile programs into a Java-style byte-code or to compile at run time. The opengl32/libGL driver should be able to handle the byte-code.
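A minimal sketch of the split being argued for here (my_compile_to_arbvp is a hypothetical external compiler entry point, not a real library call, and in a real app the ARB_vertex_program functions would be fetched through the extension mechanism): the compiler lives outside the driver, and the driver only ever sees the low-level result, much like byte-code.

#include <string.h>
#include <GL/gl.h>
#include <GL/glext.h>

extern const char *my_compile_to_arbvp(const char *highLevelSrc); /* hypothetical */

void load_vertex_shader(const char *highLevelSrc)
{
    const char *asmText = my_compile_to_arbvp(highLevelSrc); /* external compile step */
    GLuint prog;
    glGenProgramsARB(1, &prog);
    glBindProgramARB(GL_VERTEX_PROGRAM_ARB, prog);
    /* the driver only parses the low-level text, never the high-level source */
    glProgramStringARB(GL_VERTEX_PROGRAM_ARB, GL_PROGRAM_FORMAT_ASCII_ARB,
                       (GLsizei)strlen(asmText), asmText);
}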

All of these issues have already been solved in the language/compiler world. The remaining issue is that we need to come to a consensus in the graphics community.

dorbie
04-25-2003, 07:10 AM
Cass, this isn't a conspiracy theory. There's no conspiracy; there's one company, in competition with others, acting in its own self-interest. The issue here is one of trust: why should people, and more specifically NVIDIA's competition, simply trust the good intentions of NVIDIA and cede control of the compiler and HLSL language spec to them? It is absolutely ludicrous to suggest that this is a good option, or to suggest that someone is a conspiracy nut for pointing out that this would be a bad outcome. (P.S. a common optimized option is inhibited when one party says "our way or the highway"; your suggestion at the outset was that performance issues were due to ATI not fully supporting NVIDIA's Cg strategy[!])

The perfect outcome for NVIDIA as an entity here is to dominate the graphics industry; the perfect outcome for ATI is for them to dominate. The perfect outcome as far as I'm concerned is for you to keep knocking lumps out of each other, producing great products that deliver great performance and functionality through common APIs. No offence, we just have different objectives, and that doesn't make me a conspiracy nut.

I do appreciate that your technical motives are pure. I'm sure you believe in the technical options you espouse and I'm not trying to impugn that. I know that feeling, when you state what you think with the best of intentions and people mistrust you because of who you work for, I've been there. I'm not suggesting that your posts are purely cynical and evangelical.

[This message has been edited by dorbie (edited 04-25-2003).]

barthold
05-15-2003, 12:58 PM
Well, I read this thread a bit late, but with great interest. Thanks for a good discussion. I would like to point out a few misconceptions I've read here.

* The low-level language constructs in the ARB_vertex_program or ARB_fragment_program specs are not an assembly language in the sense of 'opcodes' that translate directly to our native instruction set. Maybe they do for NVIDIA or ATI, but not for us. We have a complete compiler stack in our driver to support these extensions. Thus there already is a compiler in the OpenGL driver, which seems to scare some people.

* Making the step to re-use that same compiler to support glslang is a relatively small one. Yes, you will need a glslang parser, but that we already provide to anyone who wants it.

* Somewhere it was said that the fact that glslang is built into the GL API means that compilers and drivers now need to be released in lock step. This was compared to having to upgrade your whole OS when all you need is a newer version of Visual Studio .NET. Well, that is misleading. There's nothing that prevents an IHV from releasing their complete compiler in a separate DLL and de-coupling their releases of the driver DLL(s) from the compiler DLL(s). Doing so makes QA a lot simpler too, and is the reason we ship a separate glslang.dll with the Wildcat VP products today (see the sketch after these points).

* My perspective on the shading language history. 3Dlabs has been very open about their ideas for a high-level shading language (and other concepts, often loosely called OpenGL 2). We've shared our early thoughts in the form of a presentation at Siggraph 2001, follow-up presentations at ARB meetings and, of course, the white papers we published a year and a half ago. We did this with the express goal of soliciting early feedback from developers and generating excitement for OpenGL's future. I think we succeeded in doing that. However, by the time the ARB formed the ARB-GL2 working group (in July 2002), Cg had been announced and there were two competing proposals on the table: glslang and Cg. At that point I was pretty disappointed that there actually were two proposals, and not one proposal that we all had worked on from the beginning. That would have meant more resources working on the same end goal, which would have benefited OpenGL. Although Cg is still actively being developed by NVIDIA, several of their engineers have contributed (and argued :-)) to the ARB-gl2 working group, with the goal of making glslang and the three supporting extensions better. For that these individuals deserve credit.
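Picking up the separate-DLL point above, here is a Windows-flavoured sketch of what the decoupling means in practice; the DLL name matches the one mentioned, but the exported function name is purely hypothetical:

#include <windows.h>

/* hypothetical compiler export: high-level source in, driver-ready form out */
typedef const char *(*CompileGlslangFn)(const char *source);

CompileGlslangFn load_compiler(void)
{
    /* the compiler ships in its own DLL, decoupled from the driver DLLs */
    HMODULE dll = LoadLibraryA("glslang.dll");
    if (!dll)
        return NULL;
    /* "CompileGlslang" is a made-up export name, used only for illustration */
    return (CompileGlslangFn)GetProcAddress(dll, "CompileGlslang");
}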

Barthold

Korval
05-15-2003, 03:40 PM
We have a complete compiler stack in our driver to support these extensions. Thus there already is a compiler in the OpenGL driver, which seems to scare some people.

No, that's just your driver.

I'd also like to point out that this "compiler" can't be too terribly complex. At least, not compared to an optimizing compiler for C. Yes, maybe your hardware lacks the DOT3 opcode, but it's easy enough to convert that into 3 multiplies and 2 adds.
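For the record, that lowering is straightforward (a sketch of the arithmetic only, not anyone's actual driver code):

/* dot3 as the 3 multiplies and 2 adds described above */
static float dot3(const float a[3], const float b[3])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/* In ARB-style assembly the same lowering would look roughly like:
 *   MUL t, a, b;         # component-wise multiply
 *   ADD t.x, t.x, t.y;   # first add
 *   ADD r.x, t.x, t.z;   # second add
 */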

Unless, of course, you built a real CPU (or 2) into your graphics pipeline. At which point, I would say, "Shame on you for killing my performance like that". The vector-based solutions that nVidia and ATi provide are faster-performing and simply better for the kinds of computations that go on during vertex & fragment processing.

That's one of the reasons I don't like glslang; it tries very hard to hide the fact that you're passing texture coordinates and so forth that are bound to vector-values (and are undergoing perspective-correction).


Making the step to re-use that same compiler to support glslang is a relatively small one. Yes, you will need a glslang parser, but that we already provide to anyone who wants it.

Using various freely available tools, one can make a glslang parser in a day. The real work comes from the rest of the code, i.e. the optimizing compiler that changes for every hardware revision.


with the goal to make glslang and the three supporting extensions better.

What are these "three supporting extensions?"

Humus
05-15-2003, 09:46 PM
Originally posted by Korval:
No, that's just your driver.

I'd also like to point out that this "compiler" can't be too terribly complex. At least, not compared to an optimizing compiler for C. Yes, maybe your hardware lacks the DOT3 opcode, but it's easy enough to convert that into 3 multiplies and 2 adds.

Unless, of course, you built a real CPU (or 2) into your graphics pipeline. At which point, I would say, "Shame on you for killing my performance like that". The vector-based solutions that nVidia and ATi provide are faster-performing and simply better for the kinds of computations that go on during vertex & fragment processing.


What is a better hardware implementation is highly subjective. 3dlabs obviously thought having an array of independent scalar processors was better. It certainly has its benefits, but at a cost, as with everything else.


Originally posted by Korval:
That's one of the reasons I don't like glslang; it tries very hard to hide the fact that you're passing texture coordinates and so forth that are bound to vector-values (and are undergoing perspective-correction).

glslang:
varying vec3 texCoord;

Cg/DX9 HLSL:
float3 texCoord: TEXCOORD0;

I don't see the difference, except that glslang doesn't care which texcoord the parameter is passed in (which is a good thing IMO).


Originally posted by Korval:
Using various freely available tools, one can make a glslang parser in a day. The real work comes from the rest of the code, i.e. the optimizing compiler that changes for every hardware revision.

Work that nVidia is already doing for their hardware in Cg, and work that ATi and 3dlabs are obviously willing to take on. Work they will simply be forced to do anyway due to market conditions. The difference being that we don't need to deal with a different compiler for every vendor out there.


Originally posted by Korval:
What are these "three supporting extensions?"

GL_GL2_fragment_shader
GL_GL2_vertex_shader
GL_GL2_shader_objects
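(For context: in the ARB-suffixed form these eventually shipped as, the three extensions give a shader-objects style API. A minimal sketch using those later entry points, assuming vsSrc and fsSrc are GLcharARB strings holding glslang source; the GL2_ draft entry points differed in name, though not really in shape:)

GLhandleARB build_program(const GLcharARB *vsSrc, const GLcharARB *fsSrc)
{
    GLhandleARB vs = glCreateShaderObjectARB(GL_VERTEX_SHADER_ARB);
    GLhandleARB fs = glCreateShaderObjectARB(GL_FRAGMENT_SHADER_ARB);
    glShaderSourceARB(vs, 1, &vsSrc, NULL);  /* the driver receives the source... */
    glShaderSourceARB(fs, 1, &fsSrc, NULL);
    glCompileShaderARB(vs);                  /* ...and compiles it itself */
    glCompileShaderARB(fs);

    GLhandleARB prog = glCreateProgramObjectARB();
    glAttachObjectARB(prog, vs);
    glAttachObjectARB(prog, fs);
    glLinkProgramARB(prog);
    return prog;  /* activate later with glUseProgramObjectARB(prog) */
}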

velco
05-15-2003, 11:13 PM
Originally posted by Mark Kilgard:

The facts on the ground are much clearer now. Both Microsoft and NVIDIA managed to release production-quality GPU programming languages last year while the ARB is still without even an approved specification.


And this is exactly how it should be. Standards bodies should codify existing practice. If something is to be learned from computing history, it is that standards "invented" by a committee never get widespread acceptance.

While initially somewhat ... disturbed by nVidia's move with Cg, I now recognize that they did exactly the RightThing(tm) (unless, of course, they screw it up like Sun did with Java).

IMHO, a high-level shading language should not be at all a concern of ARB. Just like other programming languages, the proper place of standardization should be ISO and national bodies.

~velco

cass
05-16-2003, 04:25 AM
Originally posted by barthold:
* Somewhere it was said that the fact that glslang is built into the GL API means that compilers and drivers now need to be released in lock step. This was compared to having to upgrade your whole OS when all you need is a newer version of Visual Studio .NET. Well, that is misleading. There's nothing that prevents an IHV from releasing their complete compiler in a separate DLL and de-coupling their releases of the driver DLL(s) from the compiler DLL(s). Doing so makes QA a lot simpler too, and is the reason we ship a separate glslang.dll with the Wildcat VP products today.

Hi Barthold,

I don't think this was misleading. If an app developer does QA with a particular compiler, they may want to ship the app with that compiler. How do they do this? Do they need to ship with different glslang.dll files for each vendor? Will any 3DLabs glslang.dll work with any 3DLabs driver?

For a separate compiler (like Cg) that lives above the driver, the communication between the compiler and the driver is well-defined -- it just uses the OpenGL (or D3D) API. This makes it possible for the compiler to work with any version of any IHV driver. The compiler can also determine what target instruction set(s) are available.
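That last point is just the usual extension-string check. A minimal sketch (the profile names are Cg's published ones to the best of my knowledge, so treat them as an assumption):

#include <string.h>
#include <GL/gl.h>

/* pick a target for the external compiler based on what the driver exposes */
const char *pick_profile(void)
{
    const char *ext = (const char *)glGetString(GL_EXTENSIONS);
    if (ext && strstr(ext, "GL_NV_fragment_program"))
        return "fp30";     /* NV30-class fragment profile */
    if (ext && strstr(ext, "GL_ARB_fragment_program"))
        return "arbfp1";   /* cross-vendor ARB profile */
    return NULL;           /* no programmable fragment target available */
}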

Thanks -
Cass

dorbie
05-16-2003, 06:11 AM
Well, if you're correct and NVIDIA are prepared to provide a glslang parser for Cg, then the final outcome may not be so bad.

OK, it's work they're forced to do, but the alternative is thousands of independent developers being forced to do extra work in their applications.

The "facts on the ground" is a bit much, I do like the idea of an independent compiler, but the only real advantage is that you can get better baseline coverage faster, I think the orthogonality is a good thing in general. If that's all that was being said fine, but there is no 'standard' there, there's no more a Cg implementation available from ATI than there is a glslang available from NVIDIA. It really gets self fulfilling talking about the ARB being slow on this issue, the reason glslang is held up is Cg and the time spent trying to reach agreement this is intentional, the ARB want's to "do the right thing" as you put it, so does NVIDIA I think.

My biggest concern is that they cannot reach agreement. This disagreement is the most important "fact on the ground". We need some path that gets us a single, fast, open, supported HLSL standard, and that is worth the wait. It didn't look to me like it was going to happen until I read Humus' post.

[This message has been edited by dorbie (edited 05-16-2003).]

JD
05-16-2003, 01:43 PM
Cg didn't pass ARB voting, so using it is a moot point. Tech is moving forward and glslang is the next HLSL for us all. Let's just accept this and urge the ARB to hurry up with the specs so that IHVs can begin implementing it. Nvidia already actively cooperates with the glslang working group, so maybe by the end of this summer we get to see some glslang NV drivers. Then the whole Cg debacle will die and it will only be a bad memory, like Glide was. Here's to the future, clink...

MarcusL
05-16-2003, 02:07 PM
I don't mind using a C-like language to pass into the compiler, as that is low-level enough for most other styles of languages to compile to. If it's just variables and arithmetic, who really cares what else is going on? (Just to give an example, my thesis work was a particle-system compiler from a functional language to C++, and possibly to Cg in the future.) I really regard C and its workalikes as platform-independent assembly with syntactic sugar.

But having the compiler outside the driver is good, IMHO, as it allows anyone to write a kickass compiler to compete with the existing one(s). However, if the language is not standardized but rather owned by one company, it is quite risky to try to do that.

However, the best way would be to unite on _one_ shading language, standardize it through a committee, and have all the APIs use a layer which the compiler uses to send compiled programs to the card. Compiling from Cg to glslang seems trivial, so Cg seems like a nice choice, except for the standardization-by-committee part.

Nevertheless, we might need a few tests and tries in the beginning before we know what we want, so that in 5-10 years we get an ISO-standard shading language for GPUs.

(blabbering in the middle of the night http://www.opengl.org/discussion_boards/ubb/smile.gif )

JelloFish
05-16-2003, 02:56 PM
This topic certainly has hit many points. It seems like everyone is really dogging on the whole design-by-committee thing being so slow. But isn't that the whole point of design by committee? Slow down the process enough so that when something gets released, it's FINAL. Cg is great for now, but isn't it written into the Cg spec that different compilers will support different syntaxes? Cg is written in a way that it is not final, which is probably what the industry needs right now: a simple forum to try out all the different possibilities.

There were some points made about positives and negatives of low-level languages. My belief is that the one negative thing about low level is that it is usually associated with hardware specificity. Just look at register combiners, probably about as low level as you can get; the unfortunate thing is that their low-levelness ties them to a certain video card. As an end result, ISVs pretty much have to write their own shading language and have several different ways to run it on different hardware. Sure, the D3D8 pixel shader language didn't support every last thing you could want to do, but at least it could run on all the pieces of hardware for a certain generation.
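To illustrate how vendor-specific and verbose that low level gets, here is a partial NV_register_combiners sketch (only the input side of a single general combiner stage that modulates texture 0 with the primary color; the matching glCombinerOutputNV / glFinalCombinerInputNV calls are omitted, and GL/glext.h prototypes are assumed):

void setup_modulate_combiner(void)
{
    glEnable(GL_REGISTER_COMBINERS_NV);
    glCombinerParameteriNV(GL_NUM_GENERAL_COMBINERS_NV, 1);

    /* variable A = texture unit 0, variable B = primary (vertex) color */
    glCombinerInputNV(GL_COMBINER0_NV, GL_RGB, GL_VARIABLE_A_NV,
                      GL_TEXTURE0_ARB, GL_UNSIGNED_IDENTITY_NV, GL_RGB);
    glCombinerInputNV(GL_COMBINER0_NV, GL_RGB, GL_VARIABLE_B_NV,
                      GL_PRIMARY_COLOR_NV, GL_UNSIGNED_IDENTITY_NV, GL_RGB);
    /* ...and this is only one stage, on one vendor's hardware */
}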

At least register combiners are low level enough that future generations of hardware (not necessarily from nVidia) could easily implement the register combiner extension (assuming nVidia allows it).