what is GL_NVX_conditional_render?

In the glext.h that comes with the NVidia SDK 7.0, there are entry points for the following extension:

#ifndef GL_NVX_conditional_render
#define GL_NVX_conditional_render 1
#ifdef GL_GLEXT_PROTOTYPES
void GLAPI glBeginConditionalRenderNVX (GLuint id);
void GLAPI glEndConditionalRenderNVX (void);
#endif
#endif

Any idea what exactly this would be supposed to do? Which hardware will support this (or a similar) extension?

NVX_conditional_render is not yet fully spec’d.

We’ll have more information on this extension (or more likely, the non-experimental version) soon.

Thanks -
Cass

Any idea what exactly this would be supposed to do? Which hardware will support this (or a similar) extension?
The answers to both questions are pretty obvious:

1: Clearly, it provides the ability to conditionally render batches of primitives. The ‘id’ parameter on glBeginConditionalRenderNVX is likely the ‘id’ of an ARB_occlusion_query query.

2: Well, since nVidia loves to expose new hardware features to OpenGL users, and only exposes them when hardware support is either imminent or available, I’d venture a guess that we’re looking at a previously unpublished feature of NV40.

I would like it to be something similar to this (written more than a year ago): http://www.opengl.org/discussion_boards/cgi_directory/ultimatebb.cgi?ubb=get_topic;f=7;t=000343

It would be a real performance win.

Originally posted by Korval:

1: Clearly, it provides the ability to conditionally render batches of primitives. The ‘id’ parameter on glBeginConditionalRenderNVX is likely the ‘id’ of an ARB_occlusion_query query.

Ok, that seems rather obvious. Then rendering in the enclosed block would occur only if the result of the occlusion query was not zero fragments.

However, it would also be nice to have more fine-grained control: e.g., conditionally render only if the result of the occlusion query is greater than 100 fragments. But maybe the hardware will not allow this … or the interface for the functionality is still not fixed.

So what’s the main idea behind this?

Send GL commands while the occlusion query result is not ready?

Originally posted by V-man:
[b]So what’s the main idea behind this?

Send GL commands while the occlusion query result is not ready?[/b]
I had hoped that we would have a public spec before we talked about it, but oh well. :slight_smile:

Yes, you can issue occlusion queries followed immediately by drawing that is conditional upon that query.

This is useful because the command buffer for rendering can be quite large. As an application developer you don’t want to wait for those results before issuing further rendering. You just want to say, “don’t bother if it’s occluded”. Fire and forget.

Hiding the latency of occlusion queries was always one of the principal hurdles to their effective use. This removes that hurdle for a significant class of occlusion query usage.

There will be more information on this subject later, and I probably won’t comment again until there is. I hope this is good enough for now. :wink:

Cass

Sounds cool. I know of someone who would like that…

Hope that it will not be an NV-only thing…

Hiding the latency of occlusion queries was always one of the principal hurdles to their effective use. This removes that hurdle for a significant class of occlusion query usage.
What happens in this case

glDepthFunc(GL_LESS);

//The result is not ready so this does not get executed now
glBeginConditionalRenderNVX(SomeQueryID);
//draw stuff
glEndConditionalRenderNVX();

glDepthFunc(GL_GREATER);
//draw other stuff

//The result is ready, conditional gets executed, but GL state machine changed at this point

Ok so if the evaluated condition is false, the objects specified between the beginConditionalRender and endConditionalRender are not rendered.

How is that condition implemented?

Specifically, will the test be handled at the driver level, or will the geometry still be sent to the video card and wait there (so that the bandwidth cost will still be paid)?

I’m assuming the specified geometry will lie in a temporary buffer somewhere until the result of the condition is known. What size would this buffer be? What happens if it’s full and new data arrives?

Y.

Originally posted by Ysaneya:
[b]
How is that condition implemented?

Specifically, will the test be handled at the driver level, or will the geometry still be sent to the video card and wait there (so that the bandwidth cost will still be paid)?

I’m assuming the specified geometry will lie in a temporary buffer somewhere until the result of the condition is known. What size would this buffer be? What happens if it’s full and new data arrives?

Y.[/b]
I suppose this will be implementation-specific. If you are bus-bound, not transferring the geometry will no doubt be a win.
If you are vertex-transform limited (complex vertex programs/shaders, or highly detailed geometry in a static VBO that a wise implementation can keep in video memory), it will also be a win.
If you are fillrate limited (I think this is the most common problem, and it will get worse with complex fragment shaders and massive stencil shadows), then even if you do transfer the geometry you can get a huge benefit from an extension like this one: you can decide which silhouette edges to extrude (you don’t have to extrude edges of objects that are occluded from the light’s point of view), or which objects with complex materials to render, and the same again once there are geometry shaders… (those are just some examples).

There are some things this extension can make difficult, for example keeping your own copy of the OpenGL state to reduce state-change calls…

Just thinking out loud.

One typical use of the occlusion query extension that will not benefit from an extension like this one is rendering flares.
I draw one small quad at the centre of the flare (without updating the color buffer) to find out, via the occlusion query extension, how much of the flare should be visible. Depending on the resulting value (which I usually have one or two frames after the query was posted), I decide the intensity of the flare.
It is a pity that you cannot access the value the query returns and use it directly. A GL function for that might be difficult, but it could be easier to expose it (as part of the GL state) inside a vertex or fragment shader. This extension would be more helpful with something like that.

Originally posted by cass:
There will be more information on this subject later, and I probably won’t comment again until there is. I hope this is good enough for now. :wink:
Cass
No problem. I’m just happy to get anything that makes occlusion queries more useful. :slight_smile:

I’d like to throw a related suggestion out there, not in any way diminishing the value of this work, but hopefully adding to it. This is from something I posted in the OGL2.0 forum a while back. And if someone is solving the problem of conditionals in the pipeline, then this may soon be feasible, though people can debate the desirability.

Assuming geometry is still even partially batched by state (don’t want to open that debate again, just assume for a moment), the spatial coherency of a VBO may be low and sharing a single occlusion conditional across a batched drawElements call may not be ideal.

On the other hand, if there existed an API that allowed us to specify multiple occlusion conditional IDs per call, then we can have many fine-grained queries going at once without too much overhead. This is an issue both in terms of issuing the occlusion query begin/end commands and subsequently in evaluating the conditionals tied to them.

Practically speaking, it doesn’t make any sense to have a parallel array of IDs per vertex or index, but I can imagine something like the multiDraw API that might take an array of begin/count index pairs along with an occlusion ID per group.

[I hope that terse description makes sense. If not, let me know and I’ll be more verbose. :]

Anyway, just a suggestion. Thanks.

Avi

Hi

I just want to share my excitement about this :smiley:
I really think this extension can cause huge speedups, if one uses it correctly.

BTW: I am quite certain that “nested” conditional renders are possible, no? This would make it even more useful.

Jan.

Nice.

Originally posted by V-man:
[b] [quote]Hiding the latency of occlusion queries was always one of the principal hurdles to their effective use. This removes that hurdle for a significant class of occlusion query usage.
What happens in this case

glDepthFunc(GL_LESS);

//The result is not ready so this does not get executed now
glBeginConditionalRenderNVX(SomeQueryID);
//draw stuff
glEndConditionalRenderNVX();

glDepthFunc(GL_GREATER);
//draw other stuff

//The result is ready, conditional gets executed, but GL state machine changed at this point[/b][/QUOTE]
You cannot execute the conditional rendering out of order. It either gets rendered or not, depending on the result of the occlusion query, but if that result is not ready you cannot go on and execute other commands in the command buffer before returning to the conditional rendering.
But as long as the occlusion query is not ready, the GPU is still working on it, so you waste no cycles while the conditional rendering is waiting for execution.

It has been a long time since the last post, and the extension is still an NVX extension in the latest drivers.
But I have found an example using this extension in NVSDK 8.0. The example is not in the SDK browser, but the executable is at ‘nvsdk80’\DEMOS\OpenGL\bin\Release\conditional_render.exe and the source code is in ‘nvsdk80’\DEMOS\OpenGL\src\conditional_render.

I prefer to have this extension over the instancing one. :stuck_out_tongue:
But I prefer to have a proper render2texture over this one :smiley:

Hope this helps.

Cool :slight_smile:

I’d say I’m waiting equally for the ARB assembler language updates and the framebuffer object extension… One of the things I regret most compared with Direct3D is the lack of standard assembler-style languages for all generations of hardware… We have to do microprogramming with vendor-specific extensions to get the equivalent of pixel shaders 1.x; this is really pathetic…

Let’s hope this extension will be an ARB standard soon… It looks promising.

regards,

I’d say I’m waiting equally for the ARB assembler language updates and the framebuffer object extension…
You’re going to have to wait a long time (read: forever) for that. The ARB is hard-pressed to keep up with modern functionality; they have no time to add functionality for previous hardware. Also, the ARB is pushing glslang as the single API for hardware programmability. That’s why you don’t see ARB_vertex/fragment_program going into the core for GL 2.0. You’re not going to see an update to ARB_vertex/fragment_program (though nVidia will, of course, keep updating their NV_*_program extensions for their own chips).

Let’s hope this extension will be an ARB standard soon… It looks promising.
As nice as it would be, I don’t think it’s going to happen. Much like nVidia’s excellent primitive-restart extension, this requires direct hardware support to get any real benefit out of it. ATi probably isn’t going to bother implementing it in hardware unless developers really start wanting it (and unless Direct3D starts requiring it). And that isn’t likely to happen, partially because of the lack of hardware support. That’s the annoying catch-22 of extensions like this: you can’t get other vendors to support them without developer use, and you can’t get developers to use them because of the lack of vendor support.

Did you run that demo? Does it run on all (nVidia) hardware, or only on the GeForce 6?

I don’t have a Geforce, can’t try it out myself.

Jan.