
View Full Version : No New OGL Extensions. Waiting for DX9?



IT
09-04-2002, 05:35 PM
I just got my shiny new ATI 9700 Pro and contacted ATI about where I can get the info regarding new extensions that the 9700 exposes in OpenGL. (I don't care about drivers right now. Just the docs would be fine.)

Well, they won't be ready until maybe the end of this month.

It's curious, ain't it? Could it be that DX9 is delayed, and exposing new OGL extensions now would give early insight to those not in the DX9 beta?

Does Microsoft control OpenGL too (albeit indirectly)? Just curious.

NitroGL
09-04-2002, 05:37 PM
Originally posted by IT:
I just got my shiny new ATI 9700 Pro and contacted ATI about where I can get the info regarding new extensions that the 9700 exposes in OpenGL. (I don't care about drivers right now. Just the docs would be fine.)

Well, they won't be ready until maybe the end of this month.

It's curious, ain't it? Could it be that DX9 is delayed, and exposing new OGL extensions now would give early insight to those not in the DX9 beta?

Does Microsoft control OpenGL too (albeit indirectly)? Just curious.

I don't think it has anything to do with DX9 (I could be wrong though).

You could try emailing devrel@ati.com and ask for the ATI_fragment_program spec... As of right now, the extension string isn't even exposed.

IT
09-04-2002, 05:42 PM
They basically said the OpenGL specs haven't been released yet for the 9700. So I'm sure this comment includes ATI_fragment_program... I could be wrong though. :-)

Dan82181
09-04-2002, 06:00 PM
Well, you could take WordPad to the OpenGL driver and see if you can find any function entry points that you think might be part of ATI_fragment_program. That'll at least tell you if they've been working on it, or if they are still using it for internal testing only. It doesn't do you much good if you don't have the specs, because you don't know what the functions are looking for or what they return. The latest Catalyst (2.2) contains the hooks for the ARB_vertex_program extension, although the extension isn't listed in GL_EXTENSIONS. You can still call wglGetProcAddress(_SOME_ARB_VERTEX_PROGRAM_FUNCTION_STRING_HERE) and it will work, but I haven't really tried it out yet.

Edit->
Oh, and I have an 8500, not a 9k series
<-Edit

Dan

[This message has been edited by Dan82181 (edited 09-04-2002).]
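For readers following along, here is a minimal sketch in C of the usual guard around the situation described above: an entry point can be returned by the driver even though the extension is not listed in GL_EXTENSIONS, so the extension string is checked first. The helper names `has_extension` and `load_if_advertised`, and the `proc_loader` callback type, are hypothetical and not from any ATI or OpenGL header; on Windows the loader passed in would typically wrap wglGetProcAddress, and the extension list would come from glGetString(GL_EXTENSIONS).

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if `name` appears as a complete space-separated token in
 * `ext_list`, else 0. A plain strstr() is not enough, because
 * "GL_ARB_vertex_program" would also match inside a longer name such as
 * "GL_ARB_vertex_program2". */
int has_extension(const char *ext_list, const char *name)
{
    size_t len;
    const char *p;

    if (ext_list == NULL || name == NULL || *name == '\0')
        return 0;
    if (strchr(name, ' ') != NULL)      /* extension names contain no spaces */
        return 0;

    len = strlen(name);
    p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        int starts_token = (p == ext_list) || (p[-1] == ' ');
        int ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return 1;
        p += len;                       /* keep scanning after this partial hit */
    }
    return 0;
}

/* Hypothetical loader types, used so the sketch stays platform-neutral. */
typedef void (*gl_proc)(void);
typedef gl_proc (*proc_loader)(const char *symbol);

/* Only ask for an entry point when the extension string advertises the
 * extension; otherwise return NULL and let the caller fall back. */
gl_proc load_if_advertised(const char *ext_list, const char *ext_name,
                           const char *symbol, proc_loader loader)
{
    if (!has_extension(ext_list, ext_name))
        return NULL;
    return loader(symbol);
}
```

The point of the guard is that a non-NULL pointer alone says nothing about whether the feature behind it is implemented; the advertised extension string is the driver's actual promise.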

Humus
09-05-2002, 02:11 AM
I've tried that too and got all the entry points; however, it doesn't seem to work. I haven't tried any deep test, I just hex-edited NitroGL's ARB_vertex_program demo to see if it would run if I let it skip past the extension check. It ran, but didn't produce any output except for the framerate counter. So I guess it just isn't implemented yet.

ehart
09-05-2002, 07:17 AM
Expect more extensions to be made available soon. Fragment program, the one people are probably most interested in, has been in the driver for some time. It has just not been exported, as the ARB working group on the subject is submitting it for a vote on approval this month. It was not exposed to avoid creating two slightly incompatible extensions and adding to the tower of Babel problem many developers complain about in OpenGL. We have been working in good faith with the other vendors to get this done on this and other extensions. The reason no entrypoints show up in the driver is that it uses the program framework created by ARB_vertex_program.

Another reason that some of the capabilities are not yet exposed is that ATI does understand that it has a reputation for drivers, and we intend to change that. New functionality that has not received ample testing in the GL driver is not yet exposed.

Finally, I will see what I can do about getting specs out before a driver is available, but I don't think it will be possible due to the confusion it would create. Also, feel free to mail me or the devrel alias if there are certain features you would like to see demos on in the SDK.

-Evan

PH
09-05-2002, 08:07 AM
eHart,

Can you say whether ARB_fragment_program is compatible with "older" hardware like the 8500 and GeForce3 ? And how are scales and biases handled ? I hope it isn't like D3D's _bx style modifiers :).

Cab
09-05-2002, 09:27 AM
Originally posted by ehart:

Fragment program, the one people are probably most interested in, has been in the driver for some time. It has just not been exported, as the ARB working group on the subject is submitting it for a vote on approval this month. It was not exposed to avoid creating two slightly incompatible extensions and adding to the tower of Babel problem many developers complain about in OpenGL. We have been working in good faith with the other vendors to get this done on this and other extensions.


It is good to hear that.

IT
09-05-2002, 09:29 AM
Yes, thanks ehart!

Korval
09-05-2002, 09:32 AM
I had assumed that the 9700 would still support the older ATI_fragment_shader extension. Granted, it would allow for more textures, operations, and probably more passes, but it should still support that API (which, to be honest, is still very functional).

SirKnight
09-05-2002, 09:51 AM
Now of course I am speculating quite a bit here, but looking at how the ARB_vertex_program extension is supported by the Radeon 8500 and GeForce 3 level of hardware, I'm betting ARB_fragment_program will be supported by the same level of hardware too, at least for the first version, just like ARB_vertex_program. I could be way off but that is my guess anyway. :D

-SirKnight

ehart
09-05-2002, 10:00 AM
Before this speculation gets out of hand...

No, fragment program will not support previous generation parts. There is just so little commonality there that it is really hard to create a meaningful extension.

-Evan

knackered
09-05-2002, 10:36 AM
This isn't talk of an ARB_fragment_program, is it?

PH
09-05-2002, 11:38 AM
Knack,
Yes. The reason I asked was because of what was in the June ARB meeting notes. I was just wondering what they had decided. I'm looking forward to seeing the spec and I/we promise not to get confused :).

IT
09-05-2002, 12:35 PM
Originally posted by ehart:


No, fragment program will not support previous generation parts. There is just so little commonality there that it is really hard to create a meaningful extension.

-Evan

Good. A clean start. BTW, this 9700 smokes (not on fire, I mean fast). Good job ATI!

ehart
09-05-2002, 01:42 PM
I obviously can't post the ARB spec before it is approved. This spec has additional issues since there could be a minor change or 2 between now and approval.

I can say that it looks very much like ARB_vertex_program.

-Evan

PH
09-05-2002, 02:19 PM
Of course. You were talking about the possibility of getting ATI specific extensions out before drivers ? Well, I'm in no hurry regarding the specs but I would definitely like to know more about the 9700.
We don't have a lot of details regarding its capabilities. For example, we don't know how flexible floating point buffers/textures are ( the NV30 seems to have certain restrictions ). Anything unusual ( that you can talk about ) ?

Since ATI support the ARB_vertex_program extension, how does that work with VAO ? Do you have a new extension or is an updated VAO spec enough ?

MZ
09-05-2002, 06:00 PM
Originally posted by ehart:
No, fragment program will not support previous generation parts. There is just so little commonality there that it is really hard to create a meaningful extension.
No one needs any commonality for previous generation.
Really, you don't even have to *invent* anything new for previous generation.
Just make use of "OPTION ps_1_4" and adopt DX8 PS syntax.
This could coexist with language for new generation you are working on.
I can't understand what makes it impossible.

ehart
09-05-2002, 06:09 PM
Funny you should ask about the VAO and vertex program; we caught that just a couple of weeks ago. The plumbing was all there, just not an entrypoint to make it work, since the old ones were just slightly incompatible. It is scheduled to be available when the SDK is rolled out shortly.

So, I'll give you guys the breakdown on the 9700 capabilities to expect with the SDK. Floating point textures and render targets are supported. Floating point textures are limited to nearest filtering; other than that they are regular textures (1D, 2D, 3D, cubemap, etc.). When rendering to floating point buffers you cannot perform color operations after color sum in the OpenGL pipeline.

If there are other questions, I will try to answer them if I can.

-Evan

PH
09-05-2002, 07:18 PM
That's good news, thanks. Does 'regular textures' include EXT_texture_rectangle ? I mentioned the incompatibility of VAO and vertex program in a previous thread ( my suggestion was something like glAttributeArrayObjectATI ). I guess you ATI guys don't read all threads :).

I have one more question:

Since the 9700 is a DX9 part, how are multiple outputs from a fragment program done ? Will this be with some form of pack/unpack instructions ? You probably can't answer that right now if it's part of the fragment program spec.

ehart
09-05-2002, 07:30 PM
Sorry, don't have time to read all these threads, I have to write an SDK. ;)

Texture rectangle will be included in that list. There will be suggestions on which targets will be fastest included in a white paper with the SDK. Indeed the 9700 can handle multiple outputs from the fragment program. No special packing is required. I am not 100% sure that this feature will make the SDK release as it is more of a second tier feature. (You need float targets and fragment programs to make it usable, so its testing naturally falls behind those other two.)

-Evan

PH
09-06-2002, 05:15 AM
This is not strictly a question about the 9700 but how do you get maximum performance out of ATI_element_array ? In all my tests, the 8500 is slower with element arrays in AGP/video memory than in system memory. Others have had the same problem. Is this a driver problem, a hardware problem or is there something special that needs to be done ?

Dan82181
09-06-2002, 05:57 AM
PH,
I remember reading somewhere about there being possible problems when you put your index array into the VAO with your geometry data. I think you are supposed to create a separate VAO just for the index set. I also don't remember exactly which one it was, but I thought there was something in regard to the pointer type being either GL_UNSIGNED_SHORT or GL_UNSIGNED_INT that made a difference. Although I wonder about what master Yoda said, "Size matters not", but I think that only counts for the Force, not OpenGL :D . If you've already tried those, then I don't know what else to tell you. I personally don't use the element array extension; there's not enough flexibility with the glDrawElementsATI command compared with the regular glDrawElements.

About the fragment program extension...
What kind of source and destination modifiers are going to be the same/taken out/new? Also, the floating point targets will be nice for stuff like HDR, but what kind of performance penalty is there going to be (nothing specific, just something like hardly, somewhat, drastic, that kind of thing)? And will this OpenGL extension be able to utilize depth buffer properties (read/write depth buffer contents) and/or stencil properties (read/write stencil buffer contents)?

Dan

PH
09-06-2002, 06:15 AM
I do have the elements in a separate array ( a static one that's only written once ). Using UNSIGNED_SHORT is not possible with ATI element arrays, so it can't be that. You could use glDrawRangeElementArrayATI if you need more flexibility ( I use the range version of the standard functions too ). Maybe calling glArrayObjectATI lots of times is expensive with element arrays. Hmmm...

[This message has been edited by PH (edited 09-06-2002).]

PH
09-06-2002, 06:21 AM
Arrgh, you are correct. UNSIGNED_SHORT *is* allowed...time to benchmark that.

Nakoruru
09-06-2002, 06:21 AM
Since the entire pipeline is floating point, there should be no penalty for rendering to a floating point buffer, at least theoretically. The reason is that pixels have to go through the floating point pipeline whether they end up in a floating point buffer or an integer buffer.

The reality is more complicated: because floating point buffers can be so much bigger than integer buffers (up to 4x with a full 4-component, 32-bit-per-component floating point buffer), they may take up to 4x the memory bandwidth.

However, you should be able to render single element 32-bit floating point and double element 16-bit floating point at the same speed as a 32-bit integer buffer in a memory bandwidth limited situation.

It is imaginable that a 4 component half precision floating point buffer could run at the same speed in many cases. And even if it did not run full speed, it would probably still be faster than a GeForce 4 but with much higher quality.
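The arithmetic behind those ratios can be sketched in a few lines of C. This is only the bytes-per-pixel comparison from the post above, not a performance model; the function names `bytes_per_pixel` and `bandwidth_ratio_vs_rgba8` are made up for illustration, and the baseline is assumed to be a 32-bit integer buffer with four 8-bit components.

```c
/* Bytes written per pixel for a color buffer with `components` channels of
 * `bits_per_component` bits each, rounded up to whole bytes. */
unsigned bytes_per_pixel(unsigned components, unsigned bits_per_component)
{
    return (components * bits_per_component + 7u) / 8u;
}

/* Size of the given format relative to a 4 x 8-bit (32-bit) integer buffer.
 * In a purely bandwidth-limited case this is also the rough cost ratio. */
double bandwidth_ratio_vs_rgba8(unsigned components, unsigned bits_per_component)
{
    return (double)bytes_per_pixel(components, bits_per_component)
         / (double)bytes_per_pixel(4u, 8u);
}
```

Under these assumptions a 4 x 32-bit float buffer comes out at 4x the baseline, 1 x 32-bit and 2 x 16-bit float buffers at 1x, and a 4 x 16-bit half-float buffer at 2x, which is why the last case is the one described as only possibly running at full speed.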

IT
09-06-2002, 06:49 AM
How is branching handled (in both vertex and fragment programs)? Does branching effectively increase the number of ops allowed, or is the branching limit confined by a static number of ops.

For example, if I have a loop that contains 32 ops inside, can I only loop 4 times IF the maximum number of ops allowed is 128? Or can I loop 300 times with 128 ops per loop (which might be an effective 300*128 ops)? I'm guessing one can't loop forever.

I'm speaking about the 9700 here.

ehart
09-06-2002, 11:57 AM
On the buffer speed question, I haven't benchmarked one against the other, so I really can't say. The assertions that they should be roughly equivalent if you take into account bandwidth should be correct.

As for loops, the only portion that has flow control is the vertex shader. It is all static flow control, with the total number of specified instructions being 256 and the max executed being in the tens of thousands. (I believe the exact answer is 64k, but I might be off) The loops will unfortunately probably not make the initial SDK release. There is an ARB working group going on this under the name ARB_vertex_program2 or some such thing.

-Evan
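To make the two limits concrete, here is a rough sketch in C that checks a single loop against them. The 256 figure is the one given above; the 65536 figure is only the "64k" that Evan says he might be off on, so treat it as an assumption. The `loop_budget` struct and `check_loop` function are hypothetical, and any instruction cost of the loop control itself is ignored.

```c
/* Limits as quoted in the thread. The executed limit is an ASSUMPTION based
 * on the uncertain "64k" figure. */
#define MAX_STATIC_INSTRUCTIONS   256u
#define MAX_EXECUTED_INSTRUCTIONS 65536u

typedef struct {
    unsigned static_count;            /* instructions written in the program */
    unsigned long long executed_count;/* instructions actually run per vertex */
    int fits;                         /* 1 if both limits are respected */
} loop_budget;

/* Budget for a program with `ops_outside` instructions outside the loop and
 * one loop of `ops_in_body` instructions run `iterations` times. */
loop_budget check_loop(unsigned ops_outside, unsigned ops_in_body,
                       unsigned iterations)
{
    loop_budget b;
    b.static_count = ops_outside + ops_in_body;
    b.executed_count = (unsigned long long)ops_outside
                     + (unsigned long long)ops_in_body * iterations;
    b.fits = (b.static_count <= MAX_STATIC_INSTRUCTIONS)
          && (b.executed_count <= MAX_EXECUTED_INSTRUCTIONS);
    return b;
}
```

Under these assumed numbers, the earlier example of 300 iterations of a 128-op body (38400 executed instructions) would fit, because the loop body counts once against the static limit and only the executed total grows with the iteration count.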

PH
09-06-2002, 12:08 PM
Evan, would that be as an extension to the existing ARB_vertex_program in the form of a new OPTION ? If so, why not call it ARB_flow_control ?

Humus
09-06-2002, 12:49 PM
Evan, does the 9700 support blending ala glBlendFunc() on floating point buffers?

ehart
09-06-2002, 01:03 PM
No, it does not support blend. See my comment about no color ops after color sum. That is really the easiest way to describe it. Just look at the OpenGL machine diagram from the blue book, and all the color ops that are after color sum are the ones unsupported on float targets.

-Evan

Nutty
09-06-2002, 01:42 PM
I heard Matt mentioning something about this too.

Let me get this straight.

Does this mean on NV30 and 9700, you can't multipass with floating point color buffers?

I thought the whole point of floating point framebuffers, was to get rid of the precision lost due to extensive multipassing?

What's the use of floating point buffers then? And what _can_ you do when you use floating point buffers?

Thanks,
Nutty

PH
09-06-2002, 01:49 PM
That's right, no blending. You'll need to do the blending in the fragment program. I think multipass lighting needs to use 2 floating point textures ( alternating between read and write ). Performance wise, I think it will be similar to blending...read,modify,write, swap textures, repeat ?
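The read, modify, write, swap idea can be sketched on the CPU in plain C. This is only an illustration of the alternating-buffer pattern, not GL code: the two float arrays stand in for the two floating point textures, `accumulate_pass` stands in for a fragment program that reads the previous result and adds one light's contribution, and all names are hypothetical.

```c
#include <stddef.h>

/* One "pass": read the previous result from src, add this pass's
 * contribution, and write the sum to dst. src and dst must not alias,
 * which mirrors not reading and writing the same texture in one pass. */
void accumulate_pass(const float *src, const float *light, float *dst, size_t n)
{
    size_t i;
    for (i = 0; i < n; ++i)
        dst[i] = src[i] + light[i];
}

/* Runs `passes` accumulation passes over n values, alternating between the
 * two buffers. buf_a holds the initial contents. Returns the buffer that
 * holds the final result (buf_a for an even pass count, buf_b for odd). */
float *ping_pong_accumulate(float *buf_a, float *buf_b,
                            const float *const *lights,
                            size_t passes, size_t n)
{
    float *src = buf_a;
    float *dst = buf_b;
    float *tmp;
    size_t p;

    for (p = 0; p < passes; ++p) {
        accumulate_pass(src, lights[p], dst, n);
        tmp = src;      /* swap roles: last output becomes next input */
        src = dst;
        dst = tmp;
    }
    return src;
}
```

Because the sums are kept in floats, values above 1.0 survive between passes instead of being clamped, which is the precision benefit being discussed; the cost is one extra buffer and a swap per pass.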

Korval
09-06-2002, 02:11 PM
As for loops, the only portion that has flow control is the vertex shader. It is all static flow control, with the total number of specified instructions being 256 and the max executed being in the tens of thousands. (I believe the exact answer is 64k, but I might be off) The loops will unfortunately probably not make the initial SDK release. There is an ARB working group going on this under the name ARB_vertex_program2 or some such thing.

Let me make sure I understand this correctly.

The 9700 is 100% a functional superset of the 8500/9000. You could, if you wanted, expose a few 9700 features in the (very extensible) 8500 extensions like EXT_vertex_shader (define new instructions) and ATI_fragment_shader (define new instructions/relax the restrictions on opcodes, textures, and possibly passes).

However, instead of doing that, and allowing us to use at least some of that power beforehand (or, heaven forbid, define new extensions that expose that functionality), those of us getting a 9700 will have to wait until the ARB gets off their collective butts and releases various ARB extensions that support 9700 functionality? Vendor-specific extensions exist for a reason: so we don't have to wait for the ARB to approve extensions before we can start playing with functionality.

Just how long is it going to take to get these ARB extensions? 4 months? 6? This time next year?

BTW, what kind of control is ARB_fragment_program going to give us? The "texture anywhere" control of NV_fragment_program, or something more limited (like the passes of the ATI_fragment_shader)?

IT
09-06-2002, 02:20 PM
Everyone's probably waiting for resolution of Microsoft IP claims. See June ARB Notes.

davepermen
09-06-2002, 11:07 PM
Originally posted by Korval:
BTW, what kind of control is ARB_fragment_program going to give us? The "texture anywhere" control of NV_fragment_program, or something more limited (like the passes of the ATI_fragment_shader)?

Always check DX9 as a reference. Basically the 9700 is a DX9 part, so what? DX9 exposes texture anywhere, so it would be stupid not to expose that in GL.

IT
09-07-2002, 04:47 AM
Which brings me to my original point: If I had DX9, I would know the new OpenGL extensions. If I had the new OpenGL extensions, I would know DX9. :-)