ARB_super_buffer extension



Corrail
01-24-2004, 04:30 AM
Hi all!

According to the ARB September and December meeting notes, the ARB_super_buffer extension should be available around June 2004.

I've read about the following features in forums:

render to vertex buffer
treat textures as vertex buffer and use them in vertex shader/programs

But does the super buffer extension also cover additional pixel buffers like those defined in the early GL2 spec (buffers you can read from and write to in fragment shaders)?

What else does the ARB_super_buffer extension include? What other new features does it offer?

Thanks a lot
Corrail


Klaus
01-24-2004, 07:33 AM
Have a look at the Siggraph presentation by James Percy:
http://www.ati.com/developer/SIGGRAPH03/Percy_OpenGL_Extensions_SIG03.pdf

- Klaus

Corrail
01-24-2004, 08:13 AM
Cool! Thanks a lot!

tang_m
09-15-2004, 06:41 AM
I see a lot of discussion on super buffers. Could anyone give me some hints about how to use the extension? I have read the SIGGRAPH 03 ppt file, but I am still quite confused. Are there any samples for this extension?

Korval
09-15-2004, 09:07 AM
ARB_super_buffer is, for the time being, dead. Any render-to-* functionality will be coming from ARB_render_target and other extensions based on it.

idr
09-15-2004, 11:03 AM
I just want to clear things up a bit. There never was a spec proposed called "super_buffers" or anything like that. The working group is called "super buffers", but that's it. The original extension, brought to the ARB by ATI a long, long time ago was called uberbuffers (with an umlaut over the first u), and it is, for all intents and purposes, dead.

In the interim, there were numerous specs proposed to fix or work around various perceived issues with the original uberbuffers proposal. The most famous of these is Nvidia's EXT_render_target spec (http://www.opengl.org/resources/features/GL_EXT_render_target.txt) (see the discussion (http://www.opengl.org/discussion_boards/cgi_directory/ultimatebb.cgi?ubb=get_topic;f=3;t=011747) too).

The spec that is currently in front of the WG is not uberbuffers and it is not EXT_render_target. I think it's fair to say that it is born of those two specs. It's not quite done cooking, but a yummy aroma is filling the kitchen. ;) Don't worry. Nobody wants to see this completed more than the people that have been working on it for the past 2+ years!!!!

Korval
09-15-2004, 11:49 AM
The spec that is currently in front of the WG is not uberbuffers and it is not EXT_render_target. I think it's fair to say that it is born of those two specs. It's not quite done cooking, but a yummy aroma is filling the kitchen.

Considering that there was precious little that was good about superbuffers (and that there was lots good about EXT_RT), I fail to see what good is going to come of taking an excellent spec and adding hints of a lesser one to it. As far as I'm concerned, once the minor points of the spec were finished (interactions with other things, etc.), EXT_RT was finished and ready for implementation and use.

More importantly, why not drop a preliminary spec on us, à la EXT_render_target? Even ATi gave us a few peeks at what superbuffers would look like API-wise a while back, which prompted some discussion.

idr
09-15-2004, 12:56 PM
Well, I'm not going to delve too much into the details, but a lot of people, myself included, that actually have to implement this stuff in drivers had some serious concerns about EXT_rt. I'll enumerate the two biggest ones:

1) Any texture can be renderable. On a lot of hardware this would require either software fallbacks or some evil CopyTexImage-like hacks.
2) There's no such thing as a stencil or accumulation texture, yet both are useful for rendering.
That said, I think the resulting spec is, in a lot of ways, more like EXT_rt than uberbuffers. Like I said in a previous post, the collaborative design process that started after we spent a long, long time trying to "fix" uberbuffers to make everyone happy has been really useful. The current spec isn't perfect by any means, but I think that in a year, after they've forgotten how long it took, people will be happy with it and will think we did the right thing.

Korval
09-15-2004, 03:15 PM
Any texture can be renderable.

Yes. That's the whole point of render-to-texture, and one of the principal faults of the superbuffers proposals.


On a lot of hardware this would require either software fallbacks or some evil CopyTexImage-like hacks.

OK, that doesn't quite make sense. I presume you're referring to swizzling of textures as to why this would be required (or, at least, one reason for it). Well, it wouldn't be terribly hard to simply change the internal format of the texture when it is bound as the render target (into a linear one). Once the rendering is complete, the texture can be re-swizzled. More importantly, if a particular texture is frequently used as a render target, then it can just be stored linearly by default, and the user simply won't get the benefits of the improved performance that swizzling offers.

Yes, I know that the 1.5 GL spec expressly disallows reallocation of textures by drivers, but the render-texture extension could easily modify this to allow binding a texture to change its format. Extensions are, by definition, allowed to modify the spec.

Given the above, I suspect that this new version has created a bunch of new texture formats that are required-use for render targets. And that this extension forbids the use of just any old texture as a render target, requiring the use of specific textures.

As such, if it should ever arise that an already existing texture needs to be a render target (without foreknowledge. Say, I'm writing a library that someone else uses), then I, the user, must create a new renderable-texture, draw the old texture onto it, and delete the old texture. These are things that a driver has both the right and the responsibility to do for us.


There's no such thing as a stencil or accumulation texture, yet both are useful for rendering.

First, I disagree that they are useful as render targets. The accumulation buffer, maybe. The stencil? Not very. Maybe a little, but not terribly often. Certainly not worth damaging a perfectly clean API over.

Second, the EXT_RT extension provided for a depth-stencil texture format. If a separate depth and stencil were needed (not that hardware supported it), then those formats could be exposed by this extension or a future one.

tang_m
09-15-2004, 04:38 PM
Originally posted by Korval:
ARB_super_buffer is, for the time being, dead. Any render-to-* functionality will be coming from ARB_render_target and other extensions based on it.

What I am really concerned about is the ability to render to a vertex buffer/vertex array. I read the specification at http://www.opengl.org/resources/features/GL_EXT_render_target.txt, and the feature seems to be missing. Does anyone have any idea about another method to achieve this? Reading back from a texture is too slow, and the NV_pixel_data_range extension gives me some hints, but it is still a readback process.

V-man
09-15-2004, 06:04 PM
tang_m, EXT_render_target's purpose was to take over the job of p-buffer based textures. The spec is serving as a model for something else it seems according to idr.

After rendering to it, you can always sample it in a vertex program and do what you want from there.

Using a texture as a vertex buffer... I'm not sure how that will work as of yet, but it sounds like a totally new extension is needed and EXT_render_target need not be concerned by it.

tang_m
09-15-2004, 06:45 PM
Originally posted by V-man:
tang_m, EXT_render_target's purpose was to take over the job of p-buffer based textures. The spec is serving as a model for something else it seems according to idr.

After rendering to it, you can always sample it in a vertex program and do what you want from there.

Using a texture as a vertex buffer... I'm not sure how that will work as of yet, but it sounds like a totally new extension is needed and EXT_render_target need not be concerned by it.

Thanks, vertex texturing is good, but I am using an ATI 9800, which doesn't support SM3.0 features.

M/\dm/\n
09-15-2004, 08:26 PM
It IS an extension, like NV_/EXT_ NPOT, where it doesn't matter that you have to sample the texture in [0, dimension] coordinates; that got fixed in the ARB version, good, but we were using the old approach for quite some time. I would love to see an RT extension in the extension string so I could use it right now, and as I already mentioned, since it's not in the core it may have limitations and fixes to be applied when it goes to the ARB. Wasn't that the point of extensions?

ffish
09-15-2004, 08:59 PM
tang_m, if you want to render to a VBO, have you looked at EXT_pixel_buffer_object? I was messing around with this a couple of weeks ago and got the results I wanted using a floating point texture, rendering to a floating point p-buffer and reading that into a VBO. Have a look at the specification for EXT_pbo - it gives an example of render-to-vertex-array that I based my work on.
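
Roughly, the readback path looks like this (a minimal sketch, assuming a context with ARB_vertex_buffer_object and EXT_pixel_buffer_object, the extension entry points already loaded, and the float p-buffer current; vbo, width and height are placeholders):

// Read the float color buffer into a buffer object, then reuse it as vertex data.
glBindBufferARB(GL_PIXEL_PACK_BUFFER_EXT, vbo);
glBufferDataARB(GL_PIXEL_PACK_BUFFER_EXT, width * height * 4 * sizeof(GLfloat),
                NULL, GL_STREAM_COPY_ARB);
glReadPixels(0, 0, width, height, GL_RGBA, GL_FLOAT, (char*)NULL); // writes into the VBO at offset 0
glBindBufferARB(GL_PIXEL_PACK_BUFFER_EXT, 0);

// Now source the same buffer object as a vertex array.
glBindBufferARB(GL_ARRAY_BUFFER_ARB, vbo);
glVertexPointer(4, GL_FLOAT, 0, (char*)NULL);
glEnableClientState(GL_VERTEX_ARRAY);
glDrawArrays(GL_POINTS, 0, width * height);
glDisableClientState(GL_VERTEX_ARRAY);
glBindBufferARB(GL_ARRAY_BUFFER_ARB, 0);

On a good driver the glReadPixels stays on the card, which is what makes this faster than reading back to system memory.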

Zak McKrakem
09-15-2004, 10:20 PM
Originally posted by idr:
The spec that is currently in front of the WG is not uberbuffers and it is not EXT_render_target. I think it's fair to say that it is born of those two specs. It's not quite done cooking, but a yummy aroma is filling the kitchen. ;) Don't worry. Nobody wants to see this completed more than the people that have been working on it for the past 2+ years!!!!

Does anybody know about the state of the spec that is currently in front of this WG?
The latest ARB meeting was supposed to take place this week (or last week).
Has anything been cooked (the last version posted of the EXT_rt spec is from April this year), or are there too many cooks in the kitchen putting their hands in the same bowl?

tang_m
09-16-2004, 12:34 AM
Originally posted by ffish:
tang_m, if you want to render to a VBO, have you looked at EXT_pixel_buffer_object? I was messing around with this a couple of weeks ago and got the results I wanted using a floating point texture, rendering to a floating point p-buffer and reading that into a VBO. Have a look at the specification for EXT_pbo - it gives an example of render-to-vertex-array that I based my work on.

Thanks a lot, that's what I am really looking for. I will try it.

Corrail
09-16-2004, 01:33 AM
Originally posted by V-man:
Using a texture as a vertex buffer... I'm not sure how that will work as of yet, but it sounds like a totally new extension is needed and EXT_render_target need not be concerned by it.

I wrote up an extension idea about that at
http://www.opengl.org/discussion_boards/cgi_directory/ultimatebb.cgi?ubb=get_topic;f=3;t=012395
This would solve the render-to-vertex-array problem quite simply, but there's a problem because (as Korval said) textures mostly aren't stored linearly.

idr
09-16-2004, 08:38 AM
Korval,

I think you've missed some of the subtle implications in EXT_rt and my post. You don't see how a stencil buffer is useful as a render target? With EXT_rt (or uberbuffers or...) you need to provide all the buffers (color, depth, stencil, etc.) that are part of the pixel format in classic, on-screen rendering. Saying the stencil buffer isn't useful as a render target is the same as saying the stencil buffer isn't useful for on-screen rendering. I'm sure Carmack would disagree. ;)

The depth-stencil format was an ugly hack. It was a one-off fix that didn't solve the underlying problem. What does it mean to use a "depth stencil" texture for texture mapping? It's nonsense, but the API would have to allow it. When you bring accumulation buffers and multisample buffers into the equation, it falls apart even more.

Since there's so much confusion about it, I'll give a concrete example. With EXT_rt you could say, "I have an RGB332 texture, a 16-bit depth buffer, and an 8-bit stencil buffer that I want to render to." I know of no hardware that can draw to that combination of buffers. However, the way the API is designed, the driver has to do it. The only options are to fall back to software rendering or internally (i.e., behind the application's back) draw to a different buffer and copy the results. Both of which, IMHO, suck. On the one hand you have unacceptable performance, and on the other hand you have invariance issues. For a driver writer, it's a lose/lose situation.

To make it even worse, I seem to recall that you could specify a compressed texture as a render target. I don't even want to go there...

Uberbuffers, on the other hand, had a really complicated, heavyweight mechanism for the application to describe what kinds of buffers it wanted, then ask the driver what it could do. It was something like a radiation-mutated ChoosePixelFormat. It was evil and universally despised.

I hope people can now see why I get so irritated when the WG is accused of arguing over petty crap. :(



Given the above, I suspect that this new version has created a bunch of new texture formats that are required-use for render targets. And that this extension forbids the use of just any old texture as a render target, requiring the use of specific textures.

As such, if it should ever arise that an already existing texture needs to be a render target (without foreknowledge. Say, I'm writing a library that someone else uses), then I, the user, must create a new renderable-texture, draw the old texture onto it, and delete the old texture. These are things that a driver has both the right and the responsibility to do for us.
To answer your questions, yes and no. Handling the situation you describe was one of our specific design goals. In fact, that was one of the big gripes people had with uberbuffers, not surprisingly. The only times you are required to use new formats is if you're rendering to something that you can't texture from (i.e., stencil) or if you don't intend to texture from it (i.e., render using a depth buffer that you won't ever use as a depth texture). In the latter case it just provides some optimization opportunities for the driver.

Like I've said a whole bunch of times...the resulting API is quite clean and gives quite a lot of functionality. I really think people will like it.

tang_m, V-man: Our scope has been limited to replacing pbuffers. Any functionality beyond that will be a follow-on effort. I think that's one of the problems we had at the start. We bit off way too much at once.

Zak: The ARB meeting is next week.

Korval
09-16-2004, 12:19 PM
With EXT_rt (or uberbuffers or...) you need to provide all the buffers (color, depth, stencil, etc.) that are part of the pixel format in classic, on-screen rendering. Saying the stencil buffer isn't useful as a render target is the same as saying the stencil buffer isn't useful for on-screen rendering. I'm sure Carmack would disagree.

For regular on-screen rendering, you don't have to create anything with EXT_RT. The driver creates it for you. So you can still have your regular screen stencil buffer.

And what Carmack would agree or disagree with is of little interest to me. I've always preferred depth shadows to stencil ones, so I have little use for the stencil buffer.


What does it mean to use a "depth stencil" texture for texture mapping?

It (should) mean the same thing as using a depth texture. The stencil part is ignored.


The only options are to fall back to software rendering or internally (i.e., behind the application's back) draw to a different buffer and copy the results. Both of which, IMHO, suck.

Why do they suck? If you ask to render to an RGB332 texture, you deserve what you get in terms of performance. It's just like using the accumulation buffer on hardware that clearly doesn't support it (anything less than R300 and NV30). You get poor performance.


To make it even worse, I seem to recall that you could specify a compressed texture as a render target. I don't even want to go there...

The only difference is that the copy back needs to compress the texture.


I hope people can now see why I get so irritated when the WG is accused of arguing over petty crap.

Well, by your own admission, the uberbuffers extension was going in very much the wrong direction. Either those proposing it should have seen that, or those working with them should have done something. Both sides failed, even (or especially) if one side failed due to lack of presence on the issue.

I, for one, would prefer the "arguing over petty crap" to what happened with uberbuffers. At least with an actual argument, everyone is aware of the situation. Effectively, you said that a number of people weren't even involved with uberbuffers for a long time, which means that these people were negligent. I prefer petty over negligent.


Like I've said a whole bunch of times... the resulting API is quite clean and gives quite a lot of functionality. I really think people will like it.

The same thing was said about uberbuffers 2 years ago. I'm sorry, but just saying it doesn't cut it anymore; the ARB has burned its credibility, and now needs to prove itself.

Jan
09-16-2004, 02:03 PM
Hey Korval, may I ask if you are a driver writer? Or do you have experience in it? Because it sounds as if you are not, but you still tell people how to do the job.

I don't want to offend you, but if you don't have experience in writing a graphics card driver, you shouldn't simply assume that it is so easy to implement a good RTT mechanism.

Of course, I feel the same way. I am very impatiently waiting for a render-to-texture mechanism that is better than p-buffers, but on the other hand I DO want it to be good this time. Therefore I prefer a spec which is powerful, yet "easy" to implement (well, as easy as such a thing can be). I don't want my app to fall back to software rendering on certain cards or on certain drivers. I want to be able to tell the driver to use a format which is suitable for THAT hardware, so I can be sure to get hw acceleration on any card - as long as I don't force it to use one format for some reason.

I really think it is a shame that we don't have proper RTT support in OpenGL, but I think the real shame is that this issue was taken up so late. Now that there is work in progress, I think they should take their time to make it as good as possible. No one will be happy with another crappy extension just because it gets released fast.

idr: Nice to get a bit of information from an "insider". I think people would sometimes be more patient if they got some sort of status update a bit more regularly. At the moment we don't really get informed WHETHER there is work in progress and HOW FAR along it is, which makes many people think "what the heck makes them take so long?".

Jan.

Korval
09-16-2004, 03:00 PM
I don't want to offend you, but if you don't have experience in writing a graphics card driver, you shouldn't simply assume that it is so easy to implement a good RTT mechanism.

Ease of implementation is not that big a factor in defining extension specs. If it were, we wouldn't have a full-fledged C optimizing compiler and linker in our drivers. If you can justify glslang-level complexity in drivers, then you can justify making the driver developers do a little trickery with render-to-texture with unusual texture formats.

Jan
09-17-2004, 12:45 AM
Well, from what idr says, it sounds as if they DO care that it is easy to implement.

And I think we would benefit from this too, because a complex spec will take a long time until its implementation appears in the driver, and it might still be very buggy.

Just look at glSlang. The first implementation came out in November or December (if I remember correctly). But there is still no driver which implements everything correctly and at full speed. Also, new stuff usually gets added to ARB_fp or NV_fp, and not directly to glSlang, because it is easier to do it that way. Is this really an advantage? Sure, I can PLAY with glSlang, but every time something gets added or I exceed some limits, I have to switch to ARB_fp, because it is more powerful. So I still have to WORK with ARB_fp.

Although we have been waiting for quite a long time now, I think that waiting will pay off in the end. At least I hope so.

Jan.

davepermen
09-17-2004, 01:50 AM
Originally posted by Jan:
I don't want to offend you, but if you don't have experience in writing a graphics card driver, you shouldn't simply assume that it is so easy to implement a good RTT mechanism.

Hm... the driver devs handled it quite fine on the DX side. How can they have such issues on the OpenGL side?

idr
09-17-2004, 08:53 AM
Jan,

It isn't so much ease of implementation as it is providing an API that is consistent with the rest of OpenGL and meets the principle of least surprise. Being able to easily do things (as an application programmer) that can hit software paths or funky paths with invariance problems violates that in a big way.

My RGB332 example was pretty bad, so let me try a better one. Think about it like this. Right now every card can render to an RGB565 framebuffer, but that may not always be the case. In the future, hardware may only do RGBA8888 and floating point framebuffers. How would you like it if your application that runs in hardware today ran in software on the GeForce 1,000,000? This may seem silly, but not too long ago cards had a 15-bit depth / 1-bit stencil mode, and now most don't.

Adruab
09-17-2004, 10:49 AM
Shouldn't there be some way to query what kind of support the device has in those types of cases (renderable pixel formats, etc.)? Automatic stuff like ChoosePixelFormat or whatnot could help (pick a similar target that's not software), but in my opinion it would also be good to have a more informative approach. For example, what happens if you want to render to an RGB332 texture and the card doesn't support it at all? It seems like it would be better to be able to know exactly what the card supports than to have it guess (a more relevant example is nvidia cards supporting float16 blending but not float32... how would you get this info?). What kind of approach does the new spec take to solve this problem?

I know DX is axing its whole caps system, but at least you can find supported format info in there (albeit inside a giant structure... horrible... *shudder*).

Korval
09-17-2004, 10:58 AM
Also, new stuff usually gets added to ARB_fp or NV_fp, and not directly to glSlang, because it is easier to do it that way.

What "stuff" are you referring to? Are you referring to the platform-specific extensions (like NV_fragment_program_2), or are you referring to something else? The former is purely nVidia's discretion, as they are the only ones who added significant hardware features recently. ATi allows their cards to support the minor feature upgrades through glslang (longer instruction counts and more temporaries/uniforms).


Right now every card can render to an RGB565 framebuffer, but that may not always be the case. In the future, hardware may only do RGBA8888 and floating point framebuffers. How would you like it if your application that runs in hardware today ran in software on the GeForce 1,000,000? This may seem silly, but not too long ago cards had a 15-bit depth / 1-bit stencil mode, and now most don't.

What happens if I have already created that 16-bit texture, and I suddenly have the need to use it as a render target? Treating render targets differently from regular textures means that this is impossible; I now have to do a copy operation that the driver is perfectly capable of doing for me (and I know this by the fact that I am able to do the copy by creating a valid render target and rendering the old texture to it). The driver could do this for me, and I would not have to set various values that should be 'const' (like the texture object name. Rendering to a texture clearly should not change the texture object name, so why shouldn't it be C++ const?).

Is it doable? Clearly. Is it really annoying? Yes.

idr
09-17-2004, 04:29 PM
Originally posted by Korval:
I now have to do a copy operation that the driver is perfectly capable of doing for me

The driver can't do the copy for you, sorry. Doing so would (in most cases) require two copies and would break invariance. What about that do you not understand? By "invariance" I mean that if you had a texture and the driver did a copy "for you", then you modified 1 pixel, other pixels might change when the driver converted it back to the texture format, due to the bit-depth changes. That is absolutely unacceptable.

In any case, it's not going to be as cumbersome or painful in the case you're worrying about as you think it will. The developer relations groups at several of the involved companies have kept us honest. I'm really hoping that the next time I post to this thread it will be a URL to the spec...

Korval
09-18-2004, 12:04 AM
I mean that if you had a texture and the driver did a copy "for you", then you modified 1 pixel, other pixels might change when the driver converted it back to the texture format, due to the bit-depth changes. That is absolutely unacceptable.

As long as you render to higher bit depths, there is no invariance issue. Also, there should not be a copy back to the original buffer; that buffer should be destroyed and the render target should now become the texture in question. And, as I pointed out, that violates some part of the GL spec, but extensions are all about modifying the GL spec. As such, this spec would be perfectly capable of allowing drivers to actively change the format of the texture at will.


In any case, it's not going to be as cumbersome or painful in the case you're worrying about as you think it will. The developer relations groups at several of the involved companies have kept us honest.

Is there going to be a "glTexReFormat" function that takes a texture and reformats it into something that is useful for a render target? Because that is effectively the functionality that we're talking about.


I'm really hoping that the next time I post to this thread it will be a URL to the spec...

I'm there with you on that one.

Jan
09-18-2004, 01:28 AM
Originally posted by Korval:
What "stuff" are you referring to?I was actually referring to the draw_buffers extension.

Jan.

zeckensack
09-18-2004, 03:47 PM
Originally posted by Korval:
Also, there should not be a copy back to the original buffer; that buffer should be destroyed and the render target should now become the texture in question.

I think that's a very good idea.

And, as I pointed out, that violates some part of the GL spec, but extensions are all about modifying the GL spec.

Yes. Let's see:
"A GL implementation may vary its allocation of internal component resolution
or compressed internal format based on any TexImage3D, TexImage2D (see below),
or TexImage1D (see below) parameter (except target), but the allocation and
chosen compressed image format must not be a function of any other state and cannot
be changed once they are established."

The proxy mechanism would suffer. Anything else?

I wouldn't mind. An R2T extension should have the right to relax this restriction IMO. Suggestion:
"... and can only change in reaction to the texture object being bound as a render target. Such a change of texture internal format may only happen if it promotes the texture to a higher precision internal format, and may happen at most once for any given texture object."

This would mean:
1) absolutely no change in behaviour for applications that don't use the new functionality.
2) no precision loss.

And for implementations:
1) a single extra "has_been_a_render_target" bit per texture object that is initially zero, but can only be set, never cleared. Format promotion can only happen when this bit is going to be set but is not yet, which, by basic logic, can happen only once during the lifetime of the object (see the sketch after this list).
2) they gain the right to pick any format that can be rendered to, as long as it has the same components and at least the same component resolution(s).
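
In driver-side pseudo-C it boils down to something like this (just a sketch; the struct and helper names are invented):

/* Sketch only: per-texture-object state for the proposed relaxation. */
struct texture_object {
    int internal_format;           /* currently chosen internal format                  */
    int has_been_a_render_target;  /* starts at 0, can only ever be set, never cleared  */
};

/* Invented helper: returns a renderable format with the same components
   and at least the same component resolution(s). */
static int pick_renderable_format(int fmt);

void bind_texture_as_render_target(struct texture_object *tex)
{
    if (!tex->has_been_a_render_target) {
        /* The one allowed format promotion, at most once per object. */
        tex->internal_format = pick_renderable_format(tex->internal_format);
        tex->has_been_a_render_target = 1;
    }
    /* ...point the hardware at the texture's storage and render... */
}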

Zak McKrakem
09-23-2004, 09:32 PM
Originally posted by idr:

Zak: The ARB meeting is next week.

One week has passed.
So, Is there any new info/status about this extension from this meeting?

Thank you.

Korval
09-23-2004, 10:34 PM
One week has passed.

He probably meant next weekend.

Zak McKrakem
09-26-2004, 12:40 AM
Originally posted by Korval:

One week has passed.

He probably meant next weekend.

Weekend? The ARB meetings have never been on weekends.
At this moment, I suspect that if we have not heard anything about it, it's probably because there is nothing new.

harsman
09-26-2004, 03:50 AM
As the number of ARB meetings held increases, the time until the minutes are posted at OpenGL.org approaches infinity.

Korval
09-26-2004, 11:24 AM
As the number of ARB meetings held increases, the time until the minutes are posted at OpenGL.org approaches infinity.

True, but that doesn't stop one of the participants from coming in here and talking about it.

KRONOS
10-07-2004, 10:00 AM
It is the 7th of October of 2004. Where is the spec?

I'm bitching... Someone care to join?!
At least someone from the ARB might say something... :rolleyes:

bobvodka
10-07-2004, 12:14 PM
Bob's Theory Number 24 : No News is Bad News ;)

3k0j
10-07-2004, 03:26 PM
Originally posted by KRONOS:
I'm bitching... Someone care to join?!

:cool: Anytime, my friend.

I'm afraid we might soon need 2 new forums:

OpenGL.org Discussion & Help Forums (http://www.opengl.org/discussion_boards/cgi_directory/ultimatebb.cgi) >> DEVELOPERS (http://www.opengl.org/discussion_boards/cgi_directory/ultimatebb.cgi?category=2) >> Switching to DirectX: beginners
OpenGL.org Discussion & Help Forums (http://www.opengl.org/discussion_boards/cgi_directory/ultimatebb.cgi) >> DEVELOPERS (http://www.opengl.org/discussion_boards/cgi_directory/ultimatebb.cgi?category=2) >> Switching to DirectX: advanced

Korval
10-07-2004, 10:06 PM
Well, I can say this: the fact that there has not been a spec released (unlike VBO, which was dropped on us from out of nowhere) shows that the ARB didn't come to a decision. That means another 3 month wait.

For Christ's sake, nVidia, Apple, and 3DLabs should have just released the EXT_RT spec and implemented it back in May (or whenever it was that they could have finished it). Who cares whether ATi or anyone else bothered to implement it? At least we would have something!

Better to be slaves to Microsoft than to the ARB. If the ARB can't treat their own users any better than this, why should we stick with them? Microsoft may not give you the best answer or the right answer, but they, at least, give you a functioning one.

barthold
10-08-2004, 11:59 AM
All,

Back in March I became the workgroup chair of the super-buffers ARB workgroup. I'll give you a bit of history, and where we are today.

The uber-buffers spec that the workgroup had been working on became too complex to be confident we could produce a solid specification. It tried to solve too many things at once (render-to-texture, render-to-vertex array, buffer swaps, generic memory objects). The workgroup could not reach consensus on how to proceed. After some debate we decided to take a step back and solve the most urgent problem first: adding a solid, future-proof replacement for pbuffers and render-to-texture functionality to OpenGL without involving the windowing system.

EXT_render_target, which you are all familiar with, was one attempt at reaching this goal. A few other proposals were also made and submitted to the workgroup. The workgroup took the best parts of those specifications and folded them into a new specification called EXT_framebuffer_object. This spec is now close to completion. Seven companies have worked, and are working, hard on this specification. Those are (in no particular order) SGI, ATI, NVIDIA, IBM, Intel, Apple and 3Dlabs. It might be of interest to know that ATI and NVIDIA are successfully co-authoring the specification. Please have a bit more patience; there will be a specification soon.

EXT_framebuffer_object offers pbuffer functionality in core OpenGL, where the most important application will likely be render-to-texture. This, for example, means only a single context will be needed to render to many textures. Once EXT_framebuffer_object is finished we'll tackle other problems like render-to-vertex array.

I would encourage any of you who are interested in participating in the numerous activities the ARB undertakes to become an ARB participant. As an ARB participant you can attend ARB meetings, join workgroups and participate in the technical discussions. It is your chance to provide feedback and interact with the ARB. Please contact Jon Leech if you're interested.

Regards,

Barthold
super-buffers ARB workgroup chair
3Dlabs, Inc.

Corrail
10-08-2004, 12:33 PM
Thanks a lot for this information, barthold! It's good to hear that this extension is still alive.

Korval
10-08-2004, 10:25 PM
Please have a bit more patience; there will be a specification soon.

I don't like words like "soon" being spoken by ARB members. "Soon" to them can be (and, in this case, has been) up to and including 2 years. I want firm dates. October 31, for example. Or even semi-firm dates, like the end of November. Something that gives us an indication of any progress that you might or might not have made.

At least it has a name: EXT_framebuffer_object. However:


EXT_framebuffer_object offers pbuffer functionality in core OpenGL

I really don't like that description, or what it suggests about EXT_FBO. The pbuffer extension, even if you take out the context switch, is a disaster. It looks nothing like the subtle elegance of EXT_RT. You have to enumerate pixel formats before you can even allocate a framebuffer. Then, once you've allocated it, you have the nonsense that ARB_RT has, where the data for render targets doesn't actually stay with the texture after you unbind it from the buffer. I'm hoping that this was just a poor/misleading choice of words, rather than a true picture of what the extension contains.

Honestly, can it be that hard to come up with decent RTT? Microsoft did it. Is there something in OpenGL that makes this exceptionally difficult, or have the voting ARB members simply lost their minds?

V-man
10-09-2004, 02:30 PM
That's what was presented at SIGGRAPH 2004:

http://www.gamasutra.com/features/20040830/mcguire_02.shtml

The only thing it forgets to mention is that it is not public yet, but the rest of the article is dead on.

It also mentions dual vertex shaders, one for the pre-tessellation surface and the other for the post-tessellation surface.
It also mentions geometry shaders. Killing polygons will be possible there.

The ARB has a lot of work to tackle.

PS: there is more!

SirKnight
10-09-2004, 04:34 PM
It's nice that something like this is coming along. I would like to see a sort of pseudo example of a render-to-texture snippet using this new extension. If it's anything like how RTT is done in D3D, it's hugely welcome, because the RTT method in D3D is pretty darn good.

And you know, advances like this in OpenGL really need to speed the heck up. It's very aggravating that D3D gets good interfaces sooner. :(

-SirKnight

l_belev
10-10-2004, 06:41 PM
Korval, let's not rush the ARB like that. Of course we ALL want that extension as soon as possible, and so do the ARB members themselves. But it's better to have a well-thought-out and far-sighted one after a few more months instead of having it right now but with various nasty problems popping up afterwards.

For any urgent needs we may have, we can use pbuffers, which are a piece of crap but still work for most things. If we have lived with them for 2 years, it's not a big deal to put up with them for a few more months.

I'm not familiar with Microsoft's RTT API, but if it's really good, that would be unprecedented. Usually they say "We can't wait because people might start switching to OpenGL" and they rush things. And as a consequence their APIs are usually near the border between useful and useless, with a lot of problems, some of which can be easily avoided, some requiring ugly workarounds possibly with performance costs, and some with no possible workaround, rendering that part of the API completely useless.

So if one likes Microsoft's model, why not just use DirectX?

Please people, be reasonable! Don't say things like: "I want it and I want it now, I don't care about any rational considerations for postponing it any more!!". Of course we may badly want many things, but we also have minds, and we should see the reasons behind things, shouldn't we?

Korval
10-10-2004, 09:52 PM
Korval, let's not rush the ARB like that.

Rush the ARB? It's been 2 years since the initial discussions of RTT functionality. I'm not interested in their excuses (like the nonsense about superbuffers spending 1.5 years going in the wrong direction, or other crap mentioned in this thread). I'm interested in results. If they aren't going to get the job done, maybe we need to find someone who will.


I'm not familiar with Microsoft's RTT API, but if it's really good, that would be unprecedented.

You haven't looked at D3D recently, have you? The only two rational reasons not to use D3D (as opposed to simple inertia and personal preference) these days are potentially lower performance (due to the D3D runtime doing command marshalling rather than the driver) and if you care about non-Windows platforms. Everything else is just personal preference or sticking with something you're used to.

Nowadays, D3D is a very clean API for doing 3D rendering. Indeed, OpenGL, in its overall API, is less clean and easy to use than D3D these days.


I want it and I want it now, I don't care about any rational considerations for postponing it any more!!

That's the problem: there are no rational reasons why we don't have this yet; only excuses. And, right now, we don't even have that, since nobody from the ARB is willing to consider engaging us.

Name 1 rational reason why we don't already have this functionality that isn't just an excuse (like them wasting 1.5 years doing the wrong thing, or arguments among ARB members).

tfpsly
10-10-2004, 11:24 PM
Even a temp extension where we could just:

* create a texture as a render target, automatically using the same format as the framebuffer (RGBA, 32 or 16 bits); no user-defined texture format

* set the active render target to either the framebuffer or to texture x

would be enough.
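
Something along these lines, I mean (the TMP entry points below are completely made up, just to show how small the interface could be):

// Hypothetical calls -- no such extension exists, this is just the idea.
GLuint tex = glCreateRenderTextureTMP(width, height);  // same format as the framebuffer

glSetRenderTargetTMP(tex);          // drawing now goes into the texture
DrawReflectedScene();               // placeholder for whatever you render offscreen
glSetRenderTargetTMP(0);            // back to the normal framebuffer

glBindTexture(GL_TEXTURE_2D, tex);  // use the result like any other texture
DrawScene();                        // placeholder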

AdrianD
10-11-2004, 03:23 AM
I agree with Korval. 2 years for a crappy RTT extension sounds like a bad joke to me...
What kind of HW restrictions prevent a simple, clean solution?
If DirectX can handle it, why shouldn't OpenGL?
It's the same hardware.

l_belev
10-11-2004, 03:51 AM
As I said, here's the reason: we don't urgently need to rush, since in the meantime we can still use pbuffers, and so we CAN afford a little more postponement if it means a better API in the end. Now doesn't this sound like a reason to you? Also, you speak of the ARB as if they are somehow cheating us or something. I think they have no interest in doing so. They just made a big mistake with the super buffers. Beyond this, I don't know of any other reason why they are taking so long. Possibly there is some inflexibility in the way they're doing their work, like bureaucracy?

You're right, it's been a while since I used d3d. The last I know is d3d8. But I doubt d3d9 shines, because AFAIK it's only new features on top of d3d8, so I can't believe the base API architecture is less crappy than its predecessor's. In fact I know a thing or two about its new features and how they are designed. They designed it for the ATI hardware of that time because they were on bad terms with nvidia then, and so there's no support for nvidia-specific features such as shadow mapping or depth-to-color copying (in fact in d3d there's still no way to read the depth buffer at all). It looks like their main considerations when designing the d3d9 API weren't technical, but their nasty intrigues to show the hw vendors who has the power.

l_belev
10-11-2004, 05:03 AM
In fact the ARB made not one but two big consecutive mistakes: first RTT via pbuffers and then the super buffers. That's why we are now in this mess. They basically made them for the very same reason that I am now telling you to avoid: they rushed them.
So please let them finish their job well this time and not make a mistake for a third time.

Korval
10-11-2004, 06:08 AM
if it means a better API in the end.

It would take any decent API designer no more than 15 minutes to sketch out what any reasonable render-to-texture API would look like. You allocate a texture, either via glTexImage or some new entrypoint. You bind it to a framebuffer, which may or may not be an object. And you render to it. End of story.

After that, it might take a few months to iron out the extension specification (conflicts with other things, modifications to the API to make it more future-proof), but we're well past that 2-3 month time period.


Now doesn't this sound like a reason to you?

No. That is an excuse as to why we don't have RTT yet. Saying "well, we have this other alternative we can use in the meantime" does not excuse the fact that this alternative is crappy, nor does it provide a valid reason why we don't have a decent alternative yet.

A semi-valid reason would be something along the lines of, "ATi and nVidia hardware support of texture rendering is so diametrically opposed that it is very difficult to find a common subset that they both support." The semi-valid part is that I refuse to believe that there can be such substantial differences between two implementations of something that is relatively simple, hardware wise.


That's why we are now in this mess. They basically made them for the very same reason that I am now telling you to avoid: they rushed them.

Bull.

There was no significant pressure on the ARB when they came out with pbuffers and pbuffer-based render-to-texture (which, btw, is a pretty neutered extension, since the data rendered into the texture doesn't really stay there after you unbind it, as I recall). So, if there was rushing, it was their own fault.

As for superbuffers, they took 1.5 years working on it, and there is nothing to show for it. 1.5 years for an extension can't be considered rushing, especially considering that they ultimately abandoned it (ie, they never finished it).

So, in either case, the ARB simply screwed up.

What really needs to happen is this. For new extensions, people should just drop them in a virtually finished state on the ARB on the first day of any meeting. Then, platform-specific issues can be discussed and worked out. After that, the ARB votes on it. If the spec is rejected, then that's it; no more discussion takes place on that spec. It is clear that the ARB can't actually build a spec from the ground up, so they shouldn't continue to try. Single parties should simply present nearly finished specs for approval or rejection, with a multi-hour meeting in the middle to iron out details that the single party may have missed.

l_belev
10-11-2004, 07:09 AM
Oops, sorry for my previous comment being posted twice; it was my browser's fault.


Originally posted by Korval:

if it means a better API in the end.

It would take any decent API designer no more than 15 minutes to sketch out what any reasonable render-to-texture API would look like.

Is this a joke? This is not possible for the ARB members, nor for you or me or any other human being. This is because of the way human thinking works - it just needs time for the concepts to settle down. The hypothetical API designer you speak of might think that he's got a perfect picture of things, but if he has any prior experience, he'd know well that the first (and also the second and the third) idea of how things should be always turns out to be wrong.

You're greatly deluded if you think that if they just approve the first thing they are able to think of, you'll be very happy with it afterwards. More likely, it would be a disaster for OpenGL.

Now we've at least managed to find where the misconception comes from - it's from you having no idea what it's like to design a future-proof API :-)
It's not obvious, but believe me, it's far from being just a 15-minute exercise.

V-man
10-11-2004, 08:42 AM
How to create an RTT in D3D9:

D3D has a function called CreateTexture (similar to glTexImage2D)

HRESULT CreateTexture(UINT Width,
                      UINT Height,
                      UINT Levels,
                      DWORD Usage,
                      D3DFORMAT Format,
                      D3DPOOL Pool,
                      IDirect3DTexture9** ppTexture,
                      HANDLE* pSharedHandle);

For Usage, set it to D3DUSAGE_RENDERTARGET, set Pool to D3DPOOL_DEFAULT, and of course pick the Format.

EXT_render_target did not have parameters for glTexImage2D, and I think this was an issue because it implies any texture can be an RTT.

So anyway, RTT is not rocket science, unless you choose it to be. The API will have to reflect the hw generation of its day, of course, so some limitations must be present.
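
Putting it together, a minimal sketch of the round trip (error checking omitted; assume d3d9.h is included and pDevice is a valid IDirect3DDevice9*):

// Create a 256x256 render-target texture with a single mip level.
IDirect3DTexture9* pRenderTex = NULL;
pDevice->CreateTexture(256, 256, 1, D3DUSAGE_RENDERTARGET,
                       D3DFMT_A8R8G8B8, D3DPOOL_DEFAULT,
                       &pRenderTex, NULL);

// Redirect rendering into the texture's top-level surface.
IDirect3DSurface9* pSurf = NULL;
pRenderTex->GetSurfaceLevel(0, &pSurf);
pDevice->SetRenderTarget(0, pSurf);
// ...draw whatever should end up in the texture...

// Restore the back buffer and sample the texture in normal rendering.
IDirect3DSurface9* pBackBuffer = NULL;
pDevice->GetBackBuffer(0, 0, D3DBACKBUFFER_TYPE_MONO, &pBackBuffer);
pDevice->SetRenderTarget(0, pBackBuffer);
pDevice->SetTexture(0, pRenderTex);
// ...draw geometry that uses pRenderTex...

// Release the COM references acquired above.
pSurf->Release();
pBackBuffer->Release();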

l_belev
10-11-2004, 09:06 AM
Honestly, I also can't think of why exactly they are still delaying the extension. Surely it's not a job for 15 minutes, but they really have taken too long. Probably there are other aggravating circumstances, like quarrels between the competing hw vendors, which, of course, we should not have to tolerate.

jwatte
10-11-2004, 10:05 AM
D3D performance is fine, which leaves the only reason for using OpenGL to be the cross-platform and cross-vendor support. Which is important to most of us! The ability to introduce functionality with extensions is also useful, although the bigger the features get, the less likely a vendor is to strike out on their own and risk having to re-work it later.

Actually, the surrounding goop around D3D (DXUT and whatnot) is rather crufty, but hey, it's a supported SDK rather than an abstract standard.

But in this discussion you're forgetting the major reason to use OpenGL instead of D3D: D3D does not have a QUAD primitive! :-)

Korval
10-11-2004, 07:03 PM
This is not possible for the ARB members, nor for you or me or any other human being.

And yet, I just did. Or did you not read this: "You allocate a texture, either via glTexImage or some new entrypoint. You bind it to a framebuffer, which may or may not be an object. And you render to it."

To be more specific, you may or may not make a glTexImage-like entrypoint. You make an entrypoint for binding a texture to a framebuffer. Maybe you allow framebuffers to be objects, like texture objects. That's the beginning and the end of the API right there.

There are some details as to the workings of this API that are missing: the specific parameters and so forth. But it wouldn't take more than 10-20 minutes of thought to decide what reasonable parameters you might want (binding to different parts of a framebuffer, like depth/stencil or AUX buffers, etc) for these 2-4 functions.
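
To illustrate (every entry point here is invented on the spot, not a real extension; the GL_COLOR/GL_DEPTH tokens are placeholders, and colorTex and depthTex are assumed to be ordinary texture objects):

// Purely hypothetical sketch of the 2-4 functions in question.
GLuint fb;
glGenFramebuffersSKETCH(1, &fb);                  // framebuffers as objects
glBindFramebufferSKETCH(fb);                      // subsequent rendering goes here
glFramebufferTextureSKETCH(GL_COLOR, colorTex);   // attach a color texture
glFramebufferTextureSKETCH(GL_DEPTH, depthTex);   // attach a depth texture

// ...render...

glBindFramebufferSKETCH(0);                       // back to the window
glBindTexture(GL_TEXTURE_2D, colorTex);           // and use the result as a texture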


The hypothetical API designer you speak of might think that he's got a perfect picture of things, but if he has any prior experience, he'd know well that the first (and also the second and the third) idea of how things should be always turns out to be wrong.

However, that hypothetical API designer also knows that RTT functionality is already designed. A good designer weeds out problems in the design phase. We know what we need from RTT: the ability to allocate a texture that can be used both as a source texture and as some part of a framebuffer for rendering to. The design is done; the next step is API building, which is very simple when you have a design to work with.

V-man
10-12-2004, 07:19 AM
Originally posted by jwatte:
D3D performance is fine, which leaves the only reason for using OpenGL to be the cross-platform and cross-vendor support. Which is important to most of us! The ability to introduce functionality with extensions is also useful, although the bigger the features get, the less likely a vendor is to strike out on their own and risk having to re-work it later.

Actually, the surrounding goop around D3D (DXUT and whatnot) is rather crufty, but hey, it's a supported SDK rather than an abstract standard.

But in this discussion you're forgetting the major reason to use OpenGL instead of D3D: D3D does not have a QUAD primitive! :-)

I think you need to do some more coding.
Haven't you noticed any unpleasantness about D3D that GL does not have?

My favorite is window resizing.

l_belev
10-13-2004, 04:55 AM
Originally posted by Korval:
And yet, I just did.

That's OK, if it makes you happy, but fortunately the people at the ARB take that problem much more seriously. Do you really think no one else thought of this before - "Just create a texture, bind it to some kind of frame buffer and render to it"?
Of course that's exactly the first idea that anyone comes up with. Unfortunately, things in real life aren't that simple. There are too many details which must be taken care of.


Originally posted by V-man:
My favorite is window resizing.

Yes, OpenGL cooperates with Microsoft's window system much more smoothly than their own 3D API does. Also, the QUAD primitive is quite a useful thing (for example for particle systems) - if you do the quads with 2 triangles instead, you have 6 verts per quad, which means more memory consumption and bandwidth (of course I'm talking about the GL_QUADS mode, not GL_QUAD_STRIP, which isn't any better than GL_TRIANGLE_STRIP). The quads get decomposed into 2 triangles eventually, but that's done by the hardware, and the data per primitive you send to it is less, so it's definitely a win. The new point sprites thing doesn't look very useful since it's too inflexible. For example, you can't rotate, translate or scale the applied texture(s) before you get to fragment processing, and there the cost of doing so is much bigger.

Gorg
10-13-2004, 05:19 AM
Originally posted by V-man:
I think you need to do some more coding.
Haven't you noticed any unpleasantness about D3D that GL does not have?

In D3D, if you don't use D3DX for loading textures, you need to select the exactly matching format. The driver does no work for you.

For anybody asking, I did not use D3DX because I already had the code to load all the image types I wanted.

For example, on my Radeon 9800pro, I cannot specify an RGB format, I need to select RGBX!

After that, I started having nightmares about video cards that support RGB but not RGBX. Ugh. I have no clue if there are cards like that, but I prefer the OGL way, where you pass RGB and the driver does the padding if it wants/needs to.

Korval
10-13-2004, 11:19 AM
Of course that's exactly the first idea that anyone comes up with. Unfortunately, things in real life aren't that simple. There are too many details which must be taken care of.

Those are spec details, not API details. The API itself is correct. How you define render targets in OpenGL-ese (i.e., how you change the spec to allow for render targets) is important, but ultimately irrelevant. We all know what the real meaning of it is: to allow drawing commands to produce images in a texture rather than in the framebuffer. The extension specification details (like how it interacts with the viewport, how reading/writing works, how the whole get_color_bits thing works out, etc.) can be worked out later.

I seriously doubt they're debating how to word what RTT means to the viewport.

jwatte
10-13-2004, 07:45 PM
Haven't you noticed any unpleasantness about D3D that GL does not have?

Overall, not really. D3D9 is very similar to OpenGL in many things, in fact. Apart from the lack of QUADS, that is :-) Also, the DXUT and D3DX stuff, while cruftier, is also much richer and more relevant than the GLU and GLUT stuff. Of course, our app doesn't resize during run-time.

Also, hardware compatibility is somewhat better with DirectX than OpenGL, once you step outside the three major vendors (Intel, NVIDIA and ATI by market share).

However, to do high-level shaders for a two-API code base, I'm now having to look into automatic translation (probably from GLSL to HLSL). It's sad that there can't be one standard. And with GLSL in the drivers, well, you have to code to the lowest common denominator of NVIDIA and ATI driver idiosyncrasies :-(

idr
10-15-2004, 12:44 PM
Korval,

There's obviously a reason you don't work for any IHV writing graphics drivers, designing APIs, or doing dev. rel. Whether you want to admit it or not, adding first class render-to-texture support to OpenGL is at least the same order of magnitude as adding first class vertex or fragment programs. Guess what? That took a long-ass time too.

The difference is that a lot of the up-front work on vertex and fragment programs was done behind closed door at various companies. People that have been working really hard on this for a long time are taking verbal abuse, not because we're taking longer than other major changes, but because we've been at least somewhat open through the whole process.

I'm an open-source developer. I'm all in favor of sharing information early and often, but I've got to say that the abuse that's been given out on these forums is really starting to make me think twice about it. Is that really what you want to accomplish?

When someone like Barthold does post something about what we're doing, it gets ripped apart based on really bad assumptions. The fact that EXT_fbo incorporates pbuffer functionality into core GL implies nothing about the interface. Everyone at all the companies in the ARB knows that pbuffers are a steaming pile. We know what the worst parts of them are. We'd have to be pretty daft to repeat things like pixel format enumeration!

We could have had a stop-gap solution months ago. I really don't think anyone (though I obviously speak only for myself) wants that. Anyone with a development cycle longer than 6 months or even a year won't even bother using something like that. Should we have really produced another (almost) worthless spec like ARB_vertex_blend? Sure, it's a fine, complete spec, but it had the life span of a house fly.

Yeah, mistakes have been made. Surprise! We're a bunch of humans. There have been a lot of discussions in the ARB about how to better run the working group process to avoid some of those mistakes in the future, and I think the next iteration will be much better. If not, I'll volunteer to be the first to be publicly verbally abused. ;)

Korval
10-15-2004, 03:02 PM
Whether you want to admit it or not, adding first class render-to-texture support to OpenGL is at least the same order of magnitude as adding first class vertex or fragment programs.

That's just untrue.

Having RTT functionality in a driver is pretty simple: you change the pointer that the hardware uses as its base pointer for knowing where the framebuffer is. You probably have to flip a switch telling it that it's rendering to a swizzled texture or something as well. But that's about it.

Even if it is more complex than that, it is nothing compared to writing a compiler, even for an assembly-esque language. And ATi's fragment shader compiler still, almost 2 years after it came out, will sometimes get the dependency graph wrong.

The differences between RTT and vertex/fragment shader compiling are legion, in terms of complexity.


Guess what? That took a long-ass time too.

I seem to recall nVidia having NV_vertex_program out the door from day 1 of the GeForce 3. Indeed, I seem to recall test versions being available on the GeForce 2 (in software, of course). The same goes for NV_fragment_program and so forth. nVidia has been spot-on with exposing shader functionality in terms of having it available when the hardware is out.


I'm an open-source developer. I'm all in favor of sharing information early and often, but I've got to say that the abuse that's been given out on these forums is really starting to make me think twice about it. Is that really what you want to accomplish?

No. I'm trying to put what little pressure I can on the ARB to get off their collective butts and get this thing moving. They aren't going to do something unless there is a demand for it.

Also, I'm trying to bring to light a problem with the ARB in terms of their ability to give users what they want/need in a timely fashion. The ARB certainly isn't going to change if we promote the status-quo.


We could have had a stop-gap solution months ago. I really don't think anyone (though I obviously speak only for myself) wants that. Anyone with a development cycle longer than 6 months or even a year won't even bother using something like that. Should we have really produced another (almost) worthless spec like ARB_vertex_blend? Sure, it's a fine, complete spec, but it had the life span of a house fly.

First of all, ARB_vertex_blend was an extension that ATi pushed because their pre-vertex-shader hardware could handle it. Because nVidia launched vertex shader hardware around the same time as vertex_blend came out, people just bought GeForce 3s and used NV_vertex_program. Nobody expected anyone to get ATi cards, and those who did got software vertex blending. The reason ARB_vertex_blend failed was that there was better hardware, not a better spec.

Secondly, we could have had a stop-gap measure 1.5 years ago. Scratch that; we should have had one 1.5 years ago. That means that every GL game released today could be using that solution.

Third, what would be so bad about whatever the stop-gap solution looked like? As long as it did the job without the huge overhead of ARB_pbuffer/RT, what's the problem? The stop-gap solution could easily become the real solution.

Fourth, you said that "Anyone with a development cycle longer than 6 months or even a year won't even bother using something like that." That's only true if ARB_fbo shows up in the next 6 months or so. You didn't see Carmack retrofit Doom 3 with glslang support, even though it was reasonably widely supported when Doom 3 released.

Fifth, ARB_fbo is as likely to be released next year as tomorrow. We have Barthold's assurances of "soon", but we've heard "soon" before about these kinds of things. As RTT functionality becomes more important and viable, for anything from reflections to caustics to shadow mapping, it is going to become more and more imperative that OpenGL have some solution that could be considered "fast".

3k0j
10-16-2004, 04:26 AM
Originally posted by Korval:
That means that every GL game released today You mean, both of them?

tfpsly
10-16-2004, 05:14 AM
Originally posted by 3k0j:

Originally posted by Korval:
That means that every GL game released today You mean, both of them?
Nope, he's speaking about RTT, not vertex shaders & blending.

Jan
10-16-2004, 06:49 AM
If someone who has no experience in what I am doing were to criticise my work, or even offend me personally for being lame or whatever, I would not be so kind as to explain my reasons to him or try to excuse it.
I would either kick his butt or not talk to him at all.

That is why I understand that we don't get so much information about all this: every time someone tells us something about the progress, there are a few people who appreciate it, and a lot of people who get quite impolite because their dreams are not yet fulfilled.

We all want this functionality, but offending those who are responsible for it is counter-productive. This is a community, and that means people have to respect each other, even if there are unsatisfying issues.

Jan.

V-man
10-16-2004, 10:06 AM
Originally posted by idr:
Whether you want to admit it or not, adding first class render-to-texture support to OpenGL is at least the same order of magnitude as adding first class vertex or fragment programs. Guess what? That took a long-ass time too.
That's difficult to believe.
Shading languages add the complexity of defining the language and offering the necessary instructions that will typically be used. It involves a lot of research.

Writing bug free compilers is difficult as well. It requires a lot of debug/testing long after version 1.

Let's look at it this way :

How many bugs have been identified in company X's ARB_vp/ARB_fp implementation versus their p-buffer implementation?

They don't need to tell us; just search these forums!

Korval
10-16-2004, 12:21 PM
Nope, he's speaking about RTT, not vertex shaders & blending.
I think 3k0j was pointing out that not many games these days are bothering to use OpenGL. However, as ports to Linux and so forth become more important, GL use will rise.


If someone who has no experience in what I am doing were to criticise my work, or even offend me personally for being lame or whatever, I would not be so kind as to explain my reasons to him or try to excuse it.
That's the thing. I don't have "no experience". I've worked on 3D hardware in a low-level capacity before. I've been involved in a 3D console game project, so I am aware of a few examples of low-level 3D programming. I can't say I've written an OpenGL driver, of course, but low-level principles are the same.

As such, I'm not speaking from ignorance when I speak of the relative triviality of RTT. As long as the hardware can actually render directly to a texture of some sort, then actually implementing RTT in the driver is maybe a 2-week task, at most. And much of that time is probably spent implementing things that OpenGL requires of the RTT (like possibly allocating textures in a new way, having additional state variables, etc), not doing the actual hardware stuff.

No, the problem with RTT has never been the complexity of implementing it. It has been the fact that the ARB has failed to produce a specification that will allow a driver developer to produce it.


We all want this functionality, but offending those who are responsible for it is counter-productive.
Constantly telling them, "No, it's OK that it's taken you 2 years to produce this relatively simple extension," is far more counter-productive. Positive reinforcement for bad behavior produces more bad behavior. Negative reinforcement for bad behavior has at least the chance of producing good behavior.

What about the "next big thing" in terms of functionality? Can we expect another 2 year wait for that mesh instancing stuff that D3D9 has?

Humus
10-16-2004, 07:38 PM
Originally posted by V-man:
That's difficult to believe.
Shading languages add the complexity of defining the language and offering the necessary instructions that will typically be used. It involves a lot of research.

Writing bug free compilers is difficult as well. It requires a lot of debug/testing long after version 1.

Let's look at it this way :

How many bugs have been identified in company X's ARB_vp/ARB_fp implementation versus their p-buffer implementation?

They don't need to tell us; just search these forums!
I don't think he meant all the development time from zero to today's shader compilers, but rather up until you got a working implementation. RTT is a simple concept, but not necessarily as simple from the hardware and driver's point of view. I don't know why they went ahead with pbuffers in the first place, but I believe one of the major reasons would be that much driver code could be reused if the render target was just assigned its own rendering context. Relatively quick and easy to implement in the drivers.

selwakad
10-19-2004, 08:21 AM
I have read most of your posts, and I understand the frustration. Frankly, we can't assume that we know why it takes the ARB so long to come up with specs or updates to OpenGL, but I am sure they are not happy about it either.
MS is now working on WGF, and I suspect that OpenGL will never be able to catch up with anything MS puts out there. Already OpenGL is far behind DX9. Features like RTT, Instancing, ... are all missing.
My take on it is that the ARB may not have enough resources to get the job done. But then again, this is just my guess. It seems to me that MS has a dedicated, paid team that works on its API, while OpenGL has only the goodwill of its contributors and HW vendors. And that is not much of a resource, since most vendors would rather cater to DX first; it is a simple case of economics ... most games/graphics apps are written for Windows using DX, so DX gets the higher priority and OpenGL takes a back seat.
And for that reason, OpenGL will always trail anything MS puts out there.
If this is to change, then solid commitment to OpenGL is required, not only from the HW vendors who develop the drivers, but also from the ISVs who actually build the software that uses it. That means people at EA Games, ....
But so far DX is getting the job done, so there is no incentive for the likes of EA Games to switch to OpenGL.
So, if you really need those features, you would probably be better off using DX. Not that I have anything against OpenGL, but sadly this seems to be the current state of the APIs.
Personally I use OpenGL, because I feel that DX is cumbersome to use ... as is the case with any MS API. But then again, I am not using it for commercial purposes ... yet. If I were asked to develop a commercial app, I would choose DX and put up with its cumbersome API, simply because it has the features ... not because I like it.
So, after all of this, I see no reason to steam at the ARB ... they don't have the same resources as MS's DX team, so it would be unfair to compare OpenGL to DX.
To ease your mind about this, think of it like this: OpenGL is a hobby API and DX is a professional API. Once you have that established in your head (it took me some time to realize that), you won't feel so frustrated at the lack of features in OpenGL. :-)

bobvodka
10-19-2004, 10:08 AM
Originally posted by selwakad:
Already OpenGL is far behind D3D9. Features like RTT, Instancing, ... are all missing.
I find it interesting that people claim that OpenGL is far behind D3D9, yet they only seem to mention one or two things, and in this case RTT is supported, just not in a form we would like (thus the need for the extension being talked about here)...

As for OpenGL being a 'hobby' API.. meh, balls tbh.

l_belev
10-20-2004, 03:23 PM
selwakad,
that instancing thing is one of the few cases in history of OpenGL lagging behind D3D on something. Usually it's the other way around: the hw vendors expose the features of their new products through OpenGL extensions first. Moreover, instancing is a feature found only in the newest hardware, so, as always, it will take a while (like one generation of games) before people start using it. By then it will long since be present in OpenGL (if it really turns out to be useful at all). On the other hand, how do you do shadow mapping in D3D (which is a far more mature thing, and has been around for years)?
RTT isn't exactly missing from OpenGL - it's just a little bit crappy and should be replaced soon.

How do you define a hobby API? Is it an API being developed by hobbyists? None of the ARB members are hobbyists. They are pretty commercial companies, not doing anything unless they expect some return. If you mean that most game developers use D3D, that's a different matter. Basically the reasons are these:
1) many of the contemporary game developers don't really know OpenGL, they only know D3D; they see no reason to waste time learning another API which, from their point of view, won't bring them any better chance of profit.
2) all they hear is the loud propaganda from Microsoft about how great their next version of D3D is, with a whole new generation of advanced features, and all they see is that indeed all games use D3D. So you guess what? Right, they stick with D3D and do exactly what Microsoft wants them to. They are the ants building the Microsoft empire. And Microsoft is very happy and proud of them.

Most of them never notice or ask questions about things like why a 3D API should be bound to something like Microsoft's COM, or why the point (0,0) isn't the corner of the first pixel but its center, and as a consequence it is a tricky business to just draw a texture on the screen with one-to-one correspondence between the texels and the pixels (which is why most hw vendors provide some means of correction for that madness in their driver settings).

Korval
10-20-2004, 04:33 PM
OpenGL is a hobby API and DX is a professional API.
That is patently false. But not for any of the reasons you may have read in this thread.

OpenGL is a cross-platform standard. If you're even considering a Linux or MacOS port, you're going to need OpenGL. Also, most CAD applications, as well as modelling packages, only support OpenGL. GL is widely used; indeed D3D is rarely used outside of game applications.


Moreover the instancing is a feature found only in the newest hardware, so, as always, it will take a while (like one generation of games) before people start using it. By then it will be long present in opengl (if it really turn out to be any usefull).First, the very fact that you doubt the usefulness of this technique (for even more than just mesh instances) indicates an unawareness of the problems of modern high-performance programming. The feature is clearly, and undoubtedly, self-justifying.

Second, how can OpenGL programmers develop their games on the assumption that some piece of functionality will show up modestly mature in drivers by the time they ship? Until the API gets this functionality, they can't rely on it. And if they're already deeply in development, they can't just suddenly hack it into their game.

Functionality should be provided as soon as it is humanly possible, even if the API isn't quite nice. Microsoft understands this, which is one of the reasons that D3D is more widely used in game development. Game developers care far more about having the ability to use a hardware feature than how ugly the API might be.


On the other hand how do you do shadow mapping in d3d (which is far more mature thing, and has been around for years)?Um, ATi hardware doesn't support shadow mapping, despite the fact that nVidia hardware has done so since the GeForce 4. ATi hardware emulates it via fragment programs. So I would hardly call it "mature".

Considering that it is perfectly possible to emulate shadow mapping in fragment programs (indeed, it can be very useful to do so if your shadow depth computation is atypical, or if you want greater than 24-bit precision (ie, not using depth textures, but instead a luminance floating-point texture)), there are few real advantages to having it in hardware. The only real advantage is fast PCF.


Basically the reasons are these:
You've conveniently forgotten the D3D 8 era. Direct3D had standardized, cross-vendor vertex and fragment programs long before GL did. Indeed, there still isn't a cross-vendor way to access D3D 8-level functionality in terms of fragment programs. Outside of glslang, there isn't a way to access looping in vertex shaders, and that's been around for years now. How long after ATI_vertex_array_object was it before we even got VBO? A standard mechanism for using driver-side memory for transferring vertices took far too long, while D3D has had it for years.

OpenGL has been behind in critical areas for quite some time. The ARB isn't terribly fast at catching up.


why should an 3d api be bound to something like the microsoft's COMThe obvious reason for binding it to COM is that it now allows you to write DirectX code in Visual Basic and other COM-aware languages. You can write C# and Managed-C++ apps that use DX, partially due to DX being built on COM.

You can just as quickly ask why OpenGL's API is state-based rather than object-based (meaning that you have to do glBindTexture/glTexImage to modify a texture, rather than passing the texture object as a parameter to glTexImage). It is just as valid a question as D3D being bound to COM.

These are relatively minor API issues. The fact that DirectX uses COM does not make it terribly difficult to use anymore than GL's state-based nature makes it difficult to use.


why the point (0,0) isn't the corner of the first pixel, but it's center, and as a consequence it is a tricky business to just draw a texture on the screen with one-to-one correspondance between the texels and the pixelsUm, a number of people have had the exact same problem with OpenGL programs. Getting GL to draw a texture 1-to-1 on the screen requires some fiddling around with numbers, just like with D3D.

Zak McKrakem
10-21-2004, 01:58 AM
I'm not sure if people know it, but in the 2004 top 10 of best-selling games, if you look at the 3D titles, there are more OpenGL than Direct3D games. This includes Call of Duty and Doom 3 (the first one has been in the top ten since its release a year ago).
It is the same in previous years, with games like Neverwinter Nights, Jedi Knight (I & II), Star Wars: Knights of the Old Republic, Homeworld II, Return to Castle Wolfenstein, Medal of Honor, Battlefield 1942 (I'm not sure about the last two), ...

Jan
10-21-2004, 03:57 AM
That's right, but you have to look at the engines:

Call of Duty: Quake 3 (id)
Doom 3: Doom 3 (id)
Neverwinter Nights: NWN (Bioware)
Jedi Knight (I & II): Quake 3 (id)
Starwars: Knights of the Old Republic: NWN (Bioware)
Homeworld II: their own
Return to Castle Wolfenstein: Quake 3 (id)
Medal of Honor: Quake 3 (id)
Battlefield 1942: their own

You see, most games use the Quake 3 engine. Not because it is an OpenGL-engine, but because it is one of the best (together with the Unreal-Engines).

And the guy behind that engine has now said that OpenGL is (partially) so awful he nearly switched to D3D.

The other engines may be good, but they would really have a problem if the Quake engines didn't use OpenGL, because then the drivers would be much worse, I'm sure.

I think OpenGL is just as good as D3D. Neither is better than the other, except in certain areas.

But of course, it would be nice if one of them were simply perfect :-) and of course I would want OpenGL to be perfect. We all know it isn't, but well, we can dream, can't we?

Jan.

l_belev
10-21-2004, 05:27 AM
First, the very fact that you doubt the usefulness of this technique (for even more than just mesh instances) indicates an unawareness of the problems of modern high-performance programming. The feature is clearly, and undoubtedly, self-justifying.
I'm only unaware of that concrete feature (I only roughly know what it is about) because I'm not a D3D user. How do you conclude from this that I'm not aware of "the problems of modern high-performance programming"? Now, do you really think before you post something, or is your aim just pointless ruction?
Besides, it's not a priori obvious that some technique is useful only because Microsoft makes a big fuss about it. Do you remember the displacement mapping of the Matrox Parhelia? Everyone was very excited about it (thanks to Microsoft's propaganda machine). But where is it now? It didn't stand the test of time.


Functionality should be provided as soon as it is humanly possible, even if the API isn't quite nice.
Agreed. That's what the OpenGL extensions do. And since D3D lacks an extension mechanism, Microsoft tries to include as many features in each new version as they can, so that until the API gets to the next version it hopefully won't be lagging too much. As for the cases where some feature is added to D3D first, that is caused by Microsoft having plenty of money and thus being able to make deals with the hw vendors.


Um, ATi hardware doesn't support shadow mapping, despite the fact that nVidia hardware has done so since the GeForce 4. ATi hardware emulates it via fragment programs. So I would hardly call it "mature".
I call "mature" a thing that has been around for years, where everyone is already aware of what it is for, what it's applicable to, and what its pros and cons are. Something that doesn't have many unanswered questions left about it.


Considering that it is perfectly possible to emulate shadow mapping in fragment programs
Of course it's possible. But with many big problems, including performance (the post-filtering is very expensive to simulate) and the complex, costly generation of the "shadow" texture - in D3D you can't use the depth buffer (you can't read it into a texture or use it as such afterwards), but instead you have to use the color channels of the render target and then combine them with some fragment shader to increase the precision, which adds complexity and again lowers performance. And the fact that people are willing to do that emulation despite the disadvantages speaks to the usefulness of the technique and reinforces the question of why Microsoft didn't include it in D3D.


Direct3D had standardized, cross-vendor vertex and fragment programs long before GL did.
Yes, it has had that since around the year 2000 or so. And it took some years before people really started to use it. By that time OpenGL had long had that functionality as well.


The obvious reason for binding it to COM is that it now allows you to write DirectX code in Visual Basic and other COM-aware languages.
Is that really the reason? So you can't use OpenGL in whatever language you wish? Currently you can use it in C, Java, Delphi. There's no problem using it in whatever language you might wish, as long as someone makes the needed interface (headers or libs or whatever is necessary),
because ALL languages out there have a notion of a function, and that's all OpenGL needs. In contrast, for a language to be able to support D3D there are much bigger prerequisites. In fact the COM-based D3D has much more limited potential language support, since there's the requirement that the language must be COM-aware. It's possible to work around this, but it's pretty complex.


The fact that DirectX uses COM does not make it terribly difficult to use anymore than GL's state-based nature makes it difficult to use.GL's state-based nature makes it difficult to use? Well you obviously speak without thinking. No need for comment here.


Um, a number of people have had the exact same problem with OpenGL programs.
Well, you obviously don't know what I'm talking about here. Please make sure you really understand what the question is about before making any comment, because otherwise you put yourself in a silly situation. But of course that's not terribly important to you, since you just need to spar, no matter how or about what.

Korval
10-21-2004, 10:36 AM
I'm only unaware of that concrete feature (I only roughly know what it is about) because I'm not a D3D user.
I'm not a D3D user either. But that doesn't stop me from doing some basic research into my field of expertise/interest (ie: performance 3D graphics). If Microsoft and nVidia are going on about some feature, it deserves at least a little investigation to see whether or not it is truly worthwhile. In this case, it is.


Besides, it's not a priori obvious that some technique is useful only because Microsoft makes a big fuss about it.
I'm not a Microsoft fanboy. I'm not touting the feature because Microsoft or nVidia makes a lot out of it. I'm touting it because I have done the research and discovered that it is a good feature, a useful feature, a feature that I would like to use. Rather than making my decision on a feature based on who implemented it first (ie, deciding it wasn't that great just because it came out on D3D first), I did some legwork and discovered a useful piece of functionality.


Everyone was very excited about itEveryone 'who'? I had no faith in anything Matrox was doing. And I recall no real hype about it; merely that D3D supported it.


That's what the OpenGL extensions do.
No. That's what they used to do. Nowadays, vendors are so frightened of exposing functionality that they wait for ARB versions. This takes forever, so we never get the functionality that matters to us.

If the extension mechanism is supposed to provide us with immediate exposure of hardware features, how can we still not have a decent RTT extension?


As for the cases where some feature is added to D3D first, that is caused by Microsoft having plenty of money and thus being able to make deals with the hw vendors.
Just because you don't like Microsoft or Direct3D doesn't mean that they aren't doing some things right, and it doesn't mean that they aren't doing some things better than OpenGL. The mere fact that they have RTT and we don't should make that obvious.

And what "deals" is Microsoft making with hardware vendors? NV40 is the only hardware that provides for instancing. Are you suggesting that Microsoft asked nVidia not to create an instancing extension to OpenGL? nVidia has been one of the biggest OpenGL supporters. Indeed, one could easily say that, without nVidia, OpenGL would be virtually nothing in the consumer arena. nVidia is pushing Linux OpenGL like no one else. So what do they have to gain by weakening OpenGL?


I call "mature" a thing that has been around for years, where everyone is already aware of what it is for, what it's applicable to, and what its pros and cons are.
You mean, like render to texture ;)

And some of us don't care about the "maturity" of a feature; we just want the functionality.


Something that doesn't have many unanswered questions left about it.
Then hardware PCF isn't exactly mature either. The reason I don't use hardware PCF (besides the fact that ATi doesn't support it natively, and I like having my fragment program opcodes where I can see them) is that I can't use an arbitrary texture with it. I have to use a "depth" texture. I would much rather use a 32-bit floating-point luminance texture. I get more precision that way, as well as greater control over the depth values. But shadow lookups only work on depth textures, for some nonsensical reason. So, yes, there are still questions that need to be resolved in terms of hardware PCF.


including performance (the post-filtering is very expensive to simulate)
Allow me to reiterate: ATi doesn't have hardware PCF. They do it in the fragment program. They expand the shadow access into a sequence of operations that performs PCF.


but instead you have to use the color channels of the render target and then combine them with some fragment shader
Or, of course, use a luminance floating-point texture, which they can bind as a render target because their API can do that. One door closed, another opens up.
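
For reference, here's a minimal sketch of what that manual compare can look like when the shadow map is a floating-point luminance texture rather than a depth texture, with a 2x2 PCF unrolled by hand. This isn't taken from any particular implementation; the shader is embedded as a C string, and names like shadowMap, texelSize and shadowCoord are invented for the example.

// Sketch only: a GLSL fragment shader doing the shadow compare "by hand"
// against a float luminance shadow map (uniform/varying names made up).
const char* manualShadowFragSrc =
    "uniform sampler2D shadowMap;                                   \n"
    "uniform float     texelSize;   // 1.0 / shadow map resolution  \n"
    "varying vec4      shadowCoord; // projective light-space coords\n"
    "void main(void)                                                \n"
    "{                                                              \n"
    "    vec3  sc   = shadowCoord.xyz / shadowCoord.w;              \n"
    "    float bias = 0.001;                                        \n"
    "    // hand-unrolled 2x2 PCF: each tap contributes 1.0 if lit  \n"
    "    float lit =                                                \n"
    "        step(sc.z - bias, texture2D(shadowMap, sc.xy).r) +     \n"
    "        step(sc.z - bias, texture2D(shadowMap, sc.xy + vec2(texelSize, 0.0)).r) + \n"
    "        step(sc.z - bias, texture2D(shadowMap, sc.xy + vec2(0.0, texelSize)).r) + \n"
    "        step(sc.z - bias, texture2D(shadowMap, sc.xy + vec2(texelSize, texelSize)).r); \n"
    "    gl_FragColor = vec4(vec3(lit * 0.25), 1.0);                \n"
    "}\n";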


Yes, it has had that since around the year 2000 or so. And it took some years before people really started to use it. By that time OpenGL had long had that functionality as well.
And it took just as long for OpenGL developers to start using VBO, too. The difference is that D3D 8 games were released that used this functionality long before we had ARB_VBO.


So you can't use OpenGL in whatever language you wish?
OpenGL is a specification that is bound to the conventions of C. In order to expose this API to some other language, you must write some kind of C interface code. By contrast, the very fact that DirectX is a COM object means that you can get access to it from any COM-aware language.

So, while you can expose OpenGL to other languages, it's much easier with DirectX. As long as that language is COM-aware. And, if it is not, then you do the same thing you would under OpenGL: you write a library that exposes the API to it. Worst-case, it is no different from OpenGL. Best-case, it is automatic and invisible.


GL's state-based nature makes it difficult to use? Well you obviously speak without thinking.I've had this argument before, so I'm not really going to follow up on it. But consider this. The proposed GL 2.0 threw out most of the state-based nature of GL in favor of an object-based model. The initial glslang extension even used the object-based model, and only the recent core promotion changed it back to a state-based model (which driver developers haven't bothered to implement yet). The only reason for this change is that there would be no full GL 2.0, so the glslang API was very different from that of the rest of GL. Note that the reason was not because state-based was superior to object-based; the reason was consistency and inertia. So, clearly, some people on the ARB prefer object-based to state-based.

bobvodka
10-21-2004, 10:54 AM
Originally posted by Korval:
NV40 is the only hardware that provides for instancing.
Apparently the R300 and up chips can also do instancing, it just wasn't exposed in the drivers until recently.

and yes, Instancing support would be lovely, I can think of a use for it in a game I want to write already...

Korval
10-21-2004, 11:14 AM
Apparently the R300 and up chips can also do instancing, it just wasn't exposed in the drivers until recently.
Really? OK, there's no excuse anymore. The ARB (or just ATi or nVidia, I don't really care) needs to get on this.

My prediction, however, is that we'll get this in like 1.5 years or so.

bobvodka
10-21-2004, 11:36 AM
This Driverheaven article/forum post (http://www.driverheaven.net/showthread.php?s=&threadid=51500) mentions it, and it's from July of this year. It mentions the X800 by name and also says 'other DX9 ATI cards', which apparently pans out to all R300 hardware.

So yes, I'm with you on calling for this to be worked on as well.

V-man
10-21-2004, 11:47 AM
The obvious reason for binding it to COM is that it now allows you to write DirectX code in Visual Basic and other COM-aware languages. You can use non-COM based APIs in Visual Basic.
One reason for COM in DX was that COM was fresh from MS and they wanted developers to use it.

Now let's fast-forward to 2004. COM is dead. It's all about .NET. Visual Basic .NET flushed COM down the toilet. MS decided that "managed" is cool. They want developers to use "managed" DX.

Managed DX exposes DX as "C" functions. MS documentation says this removes the overhead of COM.
Well "duh".

This thread is going off track, isn't it?

V-man
10-21-2004, 11:54 AM
Originally posted by bobvodka:
This Driverheaven article/forum post (http://www.driverheaven.net/showthread.php?s=&threadid=51500) mentions it, and it's from July of this year. It mentions the X800 by name and also says 'other DX9 ATI cards', which apparently pans out to all R300 hardware.

So yes, I'm with you on calling for this to be worked on as well.
Hmm. Isn't instancing a driver-side issue? What hw support does it need?

Korval
10-21-2004, 12:44 PM
Isn`t instancing a driver side issue. What hw support does it need?Well, the hardware certainly appears to be there. It will require an API for changing how indexing works. In OpenGL terms, it would need to change how indexing works on a particular attribute array. Instead of reading from the index buffer, it will now determine the index for a specific attribute in a different way.

It's a little hard to explain exactly how it would work in OpenGL, since the D3D functionality is defined in terms of their notion of streams and buffers. There is no real equivalent to D3D streams in OpenGL. Presumably, a driver would decide what constitutes a "stream" by itself. The "requirement" (which, if you don't follow it, costs you performance) might be that you have to separate any VBO that has data accessed by the instance index from VBOs that do not, much as it is strongly suggested that we separate index VBOs from vertex attribute VBOs.

bobvodka
10-21-2004, 12:47 PM
Nope, I'm sure it requires hardware support of some type, as evidenced by the screenshots two posts down: 4.3 fps vs. 49.5 fps, and a hell of a lot fewer draw calls.

Korval
10-21-2004, 01:34 PM
Nope, I'm sure it requires hardware support of some type, as evidenced by the screenshots two posts down: 4.3 fps vs. 49.5 fps, and a hell of a lot fewer draw calls.
Well of course it requires hardware support. I figured that was obvious. But, for us to get it, it's going to require an... interesting wording of the spec, depending on how the hardware is set up.

zeckensack
10-21-2004, 07:09 PM
Re instancing, it's a significant benefit on the DirectX Graphics side of things, in large part because of how expensive draw calls are there. Use instancing => fewer explicit draw calls => faster.

OpenGL drivers (well written ones, at least) simply don't have that issue. You can certainly get some benefit out of instancing on OpenGL, too, if you have an all-out hardware implementation, but my point is that it's just not as urgent and necessary as it is with DirectX Graphics.

I believe I have read somewhere that ATI's instancing support isn't hardware based at all. They just expand into multiple "draw calls" in the driver, and they get a performance boost simply because there's less twiddling of thumbs related to the thunk layer. This isn't as good as full hardware support, but you get most of the immediate bang for very little effort.

If that's true, we'd simply have to look at how "geometry instancing" code runs on ATI hardware, because that's what OpenGL does by default.

Btw, Korval, if you want a 1:1 mapping of texel to pixel, this is the way:
glViewport(0,0,texture_width,texture_height);
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
glBegin(GL_QUADS);
glTexCoord2f(0.0f,1.0f); glVertex2f(-1.0f, 1.0f);
glTexCoord2f(0.0f,0.0f); glVertex2f(-1.0f,-1.0f);
glTexCoord2f(1.0f,0.0f); glVertex2f( 1.0f,-1.0f);
glTexCoord2f(1.0f,1.0f); glVertex2f( 1.0f, 1.0f);
glEnd();
No fiddling with numbers as far as I can see.

And I can vouch for the fact that Direct3D must have gotten this issue terribly wrong at some point. I played Final Fantasy VII when DirectX5 was the current version.

sqrt[-1]
10-21-2004, 08:57 PM
FYI: The D3D texel to pixel is described here:
http://msdn.microsoft.com/library/defaul...elstopixels.asp (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/directx9_c/directx/graphics/programmingguide/gettingstarted/direct3dtextures/coordinates/mappingtexelstopixels.asp)
Caused me a few headaches...
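
For what it's worth, the short version of that article, if memory serves, is that a pre-transformed quad drawn from (0,0) to (w,h) with texture coordinates 0..1 in D3D9 ends up sampling halfway between texels, and the fix is to shift the screen-space positions by -0.5. Roughly like this (just a sketch; the quad layout and variable names are made up):

// Half-pixel correction for a 1:1 full-screen quad with pre-transformed
// vertices in D3D9 (illustrative only).
const float w = (float)backbufferWidth;
const float h = (float)backbufferHeight;
float quad[4][4] = {
    //  x             y             u     v
    { 0.0f - 0.5f,  0.0f - 0.5f,  0.0f, 0.0f },
    {    w - 0.5f,  0.0f - 0.5f,  1.0f, 0.0f },
    { 0.0f - 0.5f,     h - 0.5f,  0.0f, 1.0f },
    {    w - 0.5f,     h - 0.5f,  1.0f, 1.0f },
};
// Without the -0.5f shift each pixel samples between four texels and the
// result comes out slightly blurred.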

Korval
10-22-2004, 09:13 AM
OpenGL drivers (well written ones, at least) simply don't have that issue. You can certainly get some benefit out of instancing on OpenGL, too, if you have an all-out hardware implementation, but my point is that it's just not as urgent and necessary as it is with DirectX Graphics.
Well, certainly not to the degree of D3D (which does a switch to Ring 0 for each Draw* call), but there is more overhead in glDrawElements than merely copying something into a buffer. You can simulate instancing by using really huge vertex arrays; try rendering 100,000 objects in 100,000 calls vs. 1 call. You will find a non-trivial performance difference.

zed
10-22-2004, 12:00 PM
Just a thought.
This is the first time I've heard about instancing, but from what I can gather it draws the same object (many times) at various places (usually at a greater framerate).
An initial impression is, it'll be cool to do this, but then again, would it? 10,000x the same object onscreen - how boring. It would have been great 5 years ago, when it was acceptable to have a forest where all the trees/grass blades/rocks are the same, but we've moved on.
It's sort of reminiscent of 'point sprites'. Wow, they were great as well: now I can just supply a single vertex for each particle in my particle system. Hmmm, but wait, fire/smoke/fog etc. look so much better if each particle is rotated; thus, in effect, point sprites are only useful for a limited number of cases.
I think what I'm trying to say is: do we really want another extension of dubious benefit, which does seem to be a hack (not the right word - cop-out, maybe)? Same thing with RTT: get it right the first time, even if it takes a bit longer.

*Also of interest: the framerate of that instancing demo on a Radeon X800 - 25 million tris/sec at 1024x768 for a simplistic scene - not exactly jaw-dropping, is it?

Christian Schüler
10-22-2004, 03:19 PM
Instancing is all about having certain components of your vertex data at a lower frequency than the main components. In addition, the hardware must be able to loop over a fixed index range for the main components.

So basically you would need to set a modulo and a divider for each gl*Pointer().

This way you can place transform matrices, or any other per-instance data like colors, material constants, into a stream with a divider, where the object vertices are put into a stream with modulo.
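
Concretely, a divided stream might be set up something like the sketch below. Every call with HYPOTHETICAL in its name is invented for illustration (nothing like it exists in any driver today), and the buffer handles and counts are assumed to already exist; the real entry points (glBindBuffer, glVertexPointer, glVertexAttribPointer) are only there to show where the divider would slot in.

// 64 bytes per instance: three matrix rows plus a color.
struct InstanceData { float row0[4], row1[4], row2[4], color[4]; };

glBindBuffer(GL_ARRAY_BUFFER, meshVBO);
glVertexPointer(3, GL_FLOAT, 0, (void*)0);           // "modulo" stream: wraps every instance

glBindBuffer(GL_ARRAY_BUFFER, instanceVBO);          // one InstanceData per instance
glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE, sizeof(InstanceData), (void*)0);
glVertexAttribPointer(2, 4, GL_FLOAT, GL_FALSE, sizeof(InstanceData), (void*)16);
glVertexAttribPointer(3, 4, GL_FLOAT, GL_FALSE, sizeof(InstanceData), (void*)32);
glVertexAttribPointer(4, 4, GL_FLOAT, GL_FALSE, sizeof(InstanceData), (void*)48);
glAttribDividerHYPOTHETICAL(1, 1);                   // advance once per instance, not per vertex
glAttribDividerHYPOTHETICAL(2, 1);
glAttribDividerHYPOTHETICAL(3, 1);
glAttribDividerHYPOTHETICAL(4, 1);

glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, meshIBO);
glDrawElementsInstancedHYPOTHETICAL(GL_TRIANGLES, indexCount,
                                    GL_UNSIGNED_SHORT, (void*)0, instanceCount);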

zeckensack
10-22-2004, 03:46 PM
Originally posted by Christian Schüler:
Instancing is all about having certain components of your vertex data at a lower frequency than the main components. In addition, the hardware must be able to loop over a fixed index range for the main components.

So basically you would need to set a modulo and a divider for each gl*Pointer().

This way you can place transform matrices, or any other per-instance data like colors, material constants, into a stream with a divider, where the object vertices are put into a stream with modulo.Yes. You effectively get new vertex shader "constants" every n vertices. But they'd behave like attributes because that's what they really are, they're just no longer strictly per-vertex.

For a "one million cubes" type of effect the requirements are much simpler. Hardware instancing would -- assuming all-out local vertex buffer support and an efficient OpenGL driver -- be mostly a win in bandwidth distribution between AGP and local memory. True instancing hardware can source new transform matrices from local memory, while non-instancing hardware would need to push them across the AGP (or PCI Express).

Plus, per object (i.e. a batch that uses the same matrix), you'll get some protocol overhead to kick off the actual rendering command, which I believe to be a minor issue in comparison to your typical 64-byte MVP matrix.

And this "one million cubes" thing is the showcase example for geometry instancing, where it offers the biggest performance gain. If you scale it up to more useful requirements, like, say, ATI's crowd demo, this actually reduces the ratio between "constant updates" and vertices. The win from instancing becomes smaller if the individual instances get larger in terms of vertex count.

It's a nice thing. But it's not a silver bullet. Going forward, I forecast, if I may, that its relevance will diminish, as individual objects grow more complex and hit limitations elsewhere.

Korval
10-22-2004, 08:10 PM
this is the first time ive heard about instancing but from what i can gather it draws the same object (many times) at various places (usually at a greater framerate).That is one use, yes.

The idea is that you draw multiple copies of the object, but changing per-instance parameters for each object. You can vary position, orientation, color, texture (depending on how many bound textures you want to use), and any other factor that you can come up with to use in a vertex shader. As long as it can fit into the per-vertex attribute limits of a vertex shader, you can use it.

The "instance data", depending on your vertex shader, could change even the geometry of the object, allowing for some variability between objects.

There is some significant power with this technique.


an initial impression is, itll be cool to do this but then again would it? 10,000x the same object onscreen how boring, would of been great 5 years ago when it was acceptable to have a forest where all trees/grass blades/rocks are the same but we've moved on.We're only recently getting to the point where we have enough trees to deserve being called a forest.

Plus, you can build a distinct forest out of 4 types of trees. Rotation and appropriate staggering of the trees will allow you to create a convincing forest scene.

Being able to have 100,000 of something can get you a lot. Imagine particulate-based fog, made from 100,000 particles that are interacting with some geometry. Or water foam. Things of that nature.


its sort of reminiscant of 'point sprites', wow they were great as well, now i can just supply a single vertex for each particle in my particle system, hmmm but wait fire,smoke,fog etc look so much better if each particle is rotated. thus in effect for particle systems 'point sprites' is only useful for a limited number of cases.It all depends on what your sprite represents. If your sprites are animated sprites, then you can get quite a bit out of them.

Also, point sprites always had problems. The fact that they have a size limitation makes it difficult to rely on them.


also of interest the framerate of that instancing demo on a radeon X800 25 million tris sec 1024x768 for a simplistic scene, not exactlly jawdropping is it.The framerate was worse without instancing. That's a pretty good justification.


It's a nice thing. But it's not a silver bullet. Going forward, I forecast, if I may, that it's relevance will diminish, as individual objects grow more complex and will hit limitations elsewhere.Maybe for structural geometry, it might diminish (however, this is not a valid argument against it. Just because it may be outdated in 5-10 years doesn't mean we shouldn't have it now. Just like we had 16-bit framebuffers, and those are outdated nowadays). But, for rendering lots of little things, like tufts of grass and particles, it remains an excellent idea.

It's no "silver bullet". But it's a useful tool that should be made available to us. Better to let the developers decide whether we want to use something than to simply deny it to us out of hand.

Korval
10-24-2004, 02:20 PM
Sorry for the double-post, but I just remembered why instancing is important (performance-wise), even for OpenGL: state changes.

Instances of objects share a lot of state. But, if you change even one piece of state, you've incurred a big performance hit. Because instanced-rendering puts these "state changes" in per-vertex attributes, you get to avoid any state change overhead, which even under OpenGL is quite significant.
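
To spell out the contrast, this is the pattern instancing is meant to replace: a real state change plus a separate draw call for every object (plain illustrative sketch; the 'objects' array and its fields are invented):

for (int i = 0; i < numObjects; ++i)
{
    glColor4fv(objects[i].color);           // per-object state change
    glPushMatrix();
    glMultMatrixf(objects[i].modelMatrix);  // another per-object change
    glDrawElements(GL_TRIANGLES, indexCount, GL_UNSIGNED_SHORT, indices);
    glPopMatrix();
}
// With instancing, the color and the matrix move into per-instance attribute
// streams sourced from a buffer, and the whole loop collapses into one call.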

zed
10-24-2004, 07:37 PM
The "instance data", depending on your vertex shader, could change even the geometry of the object, allowing for some variability between objects
OK, so it's more flexible than I thought.


The framerate was worse without instancing. That's a pretty good justification.
True, though I wonder how many fps they would achieve if they stuck the thing in a display list. 25 million tris a second? Hell, I could achieve 15+ million onscreen textured tris a second on my GF2 MX (admittedly at a small resolution); how many hundreds of millions can the X800 do?

Korval
10-24-2004, 11:26 PM
true, i wonder though if the user stuck the thing in a display list, how much fps would they achieve.A display list is not going to help, for the specific reason I mentioned: state change overhead. You have to change state between display list renderings. This is going to cause a significant slowdown in rendering.

zeckensack
10-25-2004, 04:21 AM
Originally posted by Korval:
Sorry for the double-post, but I just remembered why instancing is important (performance-wise), even for OpenGL: state changes.

Instances of objects share a lot of state. But, if you change even one piece of state, you've incurred a big performance hit. Because instanced-rendering puts these "state changes" in per-vertex attributes, you get to avoid any state change overhead, which even under OpenGL is quite significant.
<...>
In a different post, Korval wrote:
Maybe for structural geometry, it might diminish (however, this is not a valid argument against it. Just because it may be outdated in 5-10 years doesn't mean we shouldn't have it now. Just like we had 16-bit framebuffers, and those are outdated nowadays).Agreed.


Originally posted by Korval
But, for rendering lots of little things, like tufts of grass and particles, it remains an excellent idea.
Particles are one thing I'd like to be able to scale up in complexity, too. Say, instead of rendering a screen-aligned quad or triangle per particle, I'd rather render an icosahedron with opacity being a function of constant * clamp((transformed vertex normal DOT3 eye vector) - 0.2, 0, 1), and the whole thing distorted into a more elliptical shape in accordance with its velocity vector. Or something like that.

This doesn't make it any more "right" that there's currently an interesting hardware capability that isn't accessible through OpenGL.

tweakoz
10-25-2004, 09:04 AM
Originally posted by V-man:

Originally posted by jwatte:
D3D performance is fine, which leaves the only reason for using OpenGL to be the cross-platform and cross-vendor support. Which is important to most of us! The ability to introduce functionality with extensions is also useful, although the bigger the features get, the less likely a vendor is to strike out on their own and risk having to re-work it later.

Actually, the surrounding goop around D3D (DXUT and whatnot) is rather crufty, but hey, it's a supported SDK rather than an abstract standard.

But in this discussion you're forgetting the major reason to use OpenGL instead of D3D: D3D does not have a QUAD primitive! :-)
I think you need to do some more coding.
Haven't you noticed any unpleasantness about D3D that GL does not have?

My favorite is window resizing.
Actually D3D does do window resizing, just not for free. I forget the exact details, but basically it involves regenerating swap chains.

mtm

zed
10-25-2004, 10:38 AM
Originally posted by Korval:
A display list is not going to help, for the specific reason I mentioned: state change overhead. You have to change state between display list renderings. This is going to cause a significant slowdown in rendering.
What state changes? In all the examples I've seen so far - rocks, trees, grass, fog - the objects would share the same state. Sorry, I forgot the rocks were moving, one of my frequent brain farts; DLs are out. Look, it's obvious I don't know much about instancing (any links? searching doesn't turn up much hard stuff).
As people who know me will say, I'm not really a man of theory but of results, so I threw together a simple demo last night of 16,000 rocks. On slower hardware than the X800 (obviously it's not the same code - I couldn't find the original code) I got about the same speed as the X800 with instancing on. Now, I know D3D is slower than OpenGL at issuing drawing commands, but surely it's not *that* much slower. And this is the worst-case scenario: any half-decent program would combine the rock meshes together so you need fewer draw calls.
I really am more interested in instancing than most people here, as the stuff I'm working on does consist of thousands of similar objects.

zom
10-26-2004, 08:02 AM
I'd just like to post an idea (maybe already taken into account with the new EXT_framebuffer_object ext).

Would it be possible to schedule RTT to be executed while the app is waiting for vsync
(or has nothing of higher priority to do)?

ie:


GLuint texobj, listobj;

// gentextures,lists,etc
// ...

while(renderloop)
{
glFinish(); // ensure RTT completeness

// render scene (make use of texobj)
// ...

glNewList(listobj,GL_COMPILE);
// RTT batch
// ...
glEndList();

// schedule batch to be drawn on texture
glBindTexture(GL_TEXTURE_2D,texobj);
int mipmaplevel=0;
glRTT(GL_TEXTURE_2D,mipmaplevel,listobj); // hypothetical entry point proposed here

SwapBuffers(wglGetCurrentDC());
}

zed
10-26-2004, 11:59 AM
OK, I found the PDF at nvidia that deals with instancing; I'm sure it's more directed at D3D than OpenGL.
because 10,000 draw calls will bring even the beefiest CPU to its knees.
It certainly doesn't bring my CPU to its knees; perhaps if I was using D3D it would. Also, from that PDF it seems that with nVidia hardware, as soon as your meshes get above 1000 tris, instancing is a loss. From reading the PDF I can see its benefits in some instances (pardon the pun), but it has a lot of limitations as well, like point sprites. I mean, in a real-world app would you be using it that much? Another thing I've seen in heaps of nVidia PDFs is that they often say "blah blah, the graphics card has become so powerful, chuck as much at it as possible", but I've found that with moderate fragment programs the card is easily the limiting factor, not the CPU. Obviously they say this to try and get developers to use the GPU for this and that, so the better the GPU, the better the benchmark, and the more Joe Blow needs to go and buy the newest hardware.
Back to the main topic: sure, it would be great if this (and 10,000 other things) could be added to OpenGL, but each addition adds to driver complexity, and hence is less likely to be optimized fully and more prone to bugs. Personally this is my major gripe with OpenGL 2: I wanted it to clean away all the dead wood (and not be backwards compatible) and just have a lean, mean API (though they've got to keep immediate mode - couldn't live without that).

OK, I played around with my benchmarking stuff last night. This code is very unoptimized on purpose (and might not be 100% correct), yet my Athlon 64 2GHz + GFFX 5900XT runs it at about 30fps (using meshes similar to the aforementioned demo):

float f=0.0;
for ( int i=0; i<16000; i++ )
{
num = i;
rocks[num].mat.rotateX( rand()%100*0.001 );
rocks[num].rX = VECTOR4( rocks[num].mat.a00, rocks[num].mat.a01, rocks[num].mat.a02, 0.0 );
rocks[num].rY = VECTOR4( rocks[num].mat.a10, rocks[num].mat.a11, rocks[num].mat.a12, 0.0 );
rocks[num].rZ = VECTOR4( rocks[num].mat.a20, rocks[num].mat.a21, rocks[num].mat.a22, 0.0 );
rocks[num].pos += rocks[num].rY * 0.1;

for ( int j=0; j<100; j++ )
f += sin( sqrt(float(i)) );

meshA->render_MeshTemplate_BENCHMARK();
}


void MeshTemplate::render_MeshTemplate_BENCHMARK( void )
{
glClientActiveTexture( GL_TEXTURE0 );
glEnableClientState( GL_TEXTURE_COORD_ARRAY );
glTexCoordPointer( 2, GL_FLOAT, 0, texcoords );

glEnableClientState( GL_VERTEX_ARRAY );
glEnableClientState( GL_NORMAL_ARRAY );

glVertexPointer( 3, GL_FLOAT, 0, verts );
glNormalPointer( GL_FLOAT, 0, norms );

glMultiTexCoord4fv( GL_TEXTURE1, rocks[num].rX );
glMultiTexCoord4fv( GL_TEXTURE2, rocks[num].rY );
glMultiTexCoord4fv( GL_TEXTURE3, rocks[num].rZ );
glColor4fv( rocks[num].pos );

glDrawElements( GL_TRIANGLES, polygon_groups[0].num_indices, GL_UNSIGNED_INT, polygon_groups[0].indices );
}

// ----- vertex shader --------

varying float diffuse;

void main(void)
{
mat4 os = mat4( gl_MultiTexCoord1, gl_MultiTexCoord2, gl_MultiTexCoord3, gl_Color );

gl_Position = gl_ModelViewProjectionMatrix * os * gl_Vertex;

vec3 N = normalize( gl_NormalMatrix * gl_Normal );

vec3 mv_lp = vec3( gl_ModelViewMatrix * os * vec4(0,1,0,0) );
vec3 mv_vert = vec3( gl_ModelViewMatrix * os * gl_Vertex );
vec3 L = normalize( mv_lp - mv_vert );

diffuse = dot( N, L );

gl_TexCoord[0] = gl_MultiTexCoord0;
}

// ----- fragment shader --------

uniform sampler2D tex0;

varying float diffuse;

void main( void )
{
gl_FragColor = vec4( texture2D( tex0, gl_TexCoord[0].xy ) * diffuse );
}

nrg
10-27-2004, 10:44 AM
Originally posted by Corrail:

According to the ARB Sept & Dec meeting notes ARB_super_buffer extension will be avaiable about June 04.

Because we don't have the recent meeting notes, and nothing has happened regarding this extension, I'm quite sure we'll have to wait for the December ARB meeting before we get any more information... and then six more months before the meeting notes are posted :(

Some ARB member was talking about "soon" some time ago... but I think "soon" was already being mentioned at Siggraph '03?

Well, on the other hand, a couple more months don't feel like much time anymore :)

I just hope ARB can finally agree on this.

ffish
10-27-2004, 05:15 PM
If I'm not mistaken, three new ARB extensions have just been posted to the extension registry, but still no framebuffer_object. :(

nrg
11-02-2004, 11:16 AM
Originally posted by ffish:
If I'm not mistaken, three new ARB extensions have just been posted to the extension registry, but still no framebuffer_object. :( Well that's something :)

Now let's just wait for a "xmas gift" from the ARB!

bobvodka
11-02-2004, 12:59 PM
Xmas which year? ;)