BeginIf, EndIf

What I am proposing here is the creation of two generic extensions that would be useful for current and future OpenGL: ARB_Async_Value and ARB_If_Command.
(I don’t like either of the names but I am going to use them in this proposal).

ARB_Async_Value
This extension is useful for values that need to be referenced by other extensions (for example, values set in one extension and used in another), even when the extension that sets the value cannot predict how future extensions will use it.
With this extension we can request a number of values that can be accessed by the different extensions that change the value of a variable (for example, Fence, Occlusion_Query, …).

The function for querying those values can be:
glGenAsyncVal(GLsizei nvalueswanted, GLvoid *pHandlearray, GLenum arraytype)
with arraytype = GL_FLOAT, GL_BOOL, … (they can always be GL_FLOAT without loss of generality)

For example:
GLhandle handleval;
glGenAsyncVal(1, &handleval, GL_BOOL);

This value can be useful for extensions like FenceARBX, which can change its state, and it can then be used as a query input by another extension (vertex_program, the proposed ARB_If_Command, etc.). For example, using it with a Fence extension and Vertex_Program you can build an async performance bar at the end of the frame to view the amount of time the GPU is idle waiting for the CPU…
Another example,
glBeginOcclusionQueryARBX(GL_PERCENTAGE_DRAWN, handleval);
… draw commands
glEndOcclusionQueryARBX();
(Note that FenceARBX and OcclusionQueryARBX don’t currently exist; that is why I marked them as ARBX.)
This way, the OcclusionQuery extension sets the value referenced by handleval to the result: pixels drawn / total pixels.

There can also be the typical glGet functions to obtain the value referenced by a handle, for example glGetAsyncValf(handleval, &floatval), but as with all glGet functions you will lose parallelism.

I know that this extension is very general, but it allows future connectivity between extensions based on values. That can be very useful if it is integrated with future vertex_program and fragment_program extensions: it allows extensions like Occlusion_Query, Fences and future ideas to be connected with vertex or fragment programs, or with other extensions that may be produced in the future.

ARB_If_Command
With this extension, OpenGL can determine whether it has to execute the commands that lie inside the BeginIf, EndIf pair.
The syntax can be something like:
glBeginIf(ConditionArray, nConditions)
Each condition should be two values (or handles to values) with a comparison (GL_GREATER, GL_LEQUAL, …). The number of conditions supported can be hardware dependent.
Here, we can use handles to values generated with the previously proposed extension (ARB_Async_Value).

For example:
At the beginning of the program
GLhandle occlrefvalhandle;
glGenAsyncVal(1, &occlrefvalhandle, GL_FLOAT);

In the render loop:
glBeginOcclusionQueryARBX(GL_PERCENTAGE_DRAWN, occlrefvalhandle);
…DrawBoundingVolume
glEndOcclusionQueryARBX();

GLconditionArray cond;
cond.val0=occlrefvalhandle;
cond.comp=GL_LESS;
cond.val1=0.25f;
glBeginIf(&cond, 1);
	...SetStates
	...BindTextures
	...SetArrays
	...DrawElements…
glEndIf();

The commands between glBeginIf and glEndIf are executed only if the conditions pass.
It could be made more elaborate with things like glElseIf, but I think the idea is clear.
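
To make the condition record a bit more concrete, here is a minimal sketch in plain C of how a condition with "values or handles to values" could be represented and evaluated. Everything in it is hypothetical (the `operand`, `condition` and `compare_op` names, the value table, and the choice that all conditions must pass); it is not part of any real OpenGL header, just one way the idea could be modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical types for illustration; none of this is real OpenGL. */
enum compare_op { CMP_LESS, CMP_LEQUAL, CMP_GREATER, CMP_GEQUAL };

/* One side of a comparison: either a literal or a handle to a stored value. */
struct operand {
    bool is_handle;
    int handle;      /* index into the value table, used when is_handle */
    float literal;   /* used when !is_handle */
};

struct condition {
    struct operand lhs;
    enum compare_op op;
    struct operand rhs;
};

/* Resolve an operand against the table of values the handles refer to. */
float resolve(const struct operand *o, const float *values, int nvalues)
{
    if (!o->is_handle)
        return o->literal;
    assert(o->handle >= 0 && o->handle < nvalues);
    return values[o->handle];
}

bool condition_passes(const struct condition *c, const float *values, int nvalues)
{
    float a = resolve(&c->lhs, values, nvalues);
    float b = resolve(&c->rhs, values, nvalues);
    switch (c->op) {
    case CMP_LESS:    return a < b;
    case CMP_LEQUAL:  return a <= b;
    case CMP_GREATER: return a > b;
    case CMP_GEQUAL:  return a >= b;
    }
    return false;
}

/* Assumption: the block runs only when every condition in the array passes. */
bool all_conditions_pass(const struct condition *conds, int n,
                         const float *values, int nvalues)
{
    for (int i = 0; i < n; ++i)
        if (!condition_passes(&conds[i], values, nvalues))
            return false;
    return true;
}
```

Tagging each operand as literal or handle avoids the ambiguity of storing a handle and a float in the same kind of field. The values are looked up only when the condition is evaluated, so the code that builds the condition never has to read them.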

What do you guys think about these ideas?

Well, for the first proposed extension, I fail to see how this really helps with anything. If I understand you correctly, you want to create an extension that allows you to declare variables within OpenGL. For every possible use that you listed, the desired result can be achieved by just having the user use the first extension, get the result to a variable that they have created, then pass this to the second extension.

I know that this breaks the asynchronous feature that the proposed extension has, but the problem is that as soon as you use one of your asynchronous handles to transfer information from one extension to another, you have to cause a flush to be sure of what’s in the variable, and it isn’t asynchronous anymore.

For the second extension, I basically see it the same way. I can easily do this sort of thing in my own code, so what benefits will the extension give me?

j

If I understand you and the OpenGL 2.0 specs correctly, then the first thing you proposed will be available in OpenGL 2.0 via an advanced fence mechanism.

Jan.

Originally posted by j:
Well, for the first proposed extension, I fail to see how this really helps with anything. If I understand you correctly, you want to create an extension that allows you to declare variables within OpenGL. For every possible use that you listed, the desired result can be achieved by just having the user use the first extension, get the result to a variable that they have created, then pass this to the second extension.

Yes. A generic solution.

I know that this breaks the asynchronous feature that the proposed extension has, but the problem is that as soon as you use one of your asynchronous handles to transfer information from one extension to another, you have to cause a flush to be sure of what’s in the variable, and it isn’t asynchronous anymore.

No. This is why it is a handle. When glQueryOcclusion(…, handle) is executed, it puts a command in the OpenGL command queue that says ‘put the result of this operation in the value referenced by this handle when you reach glEndQuery’ (everything on the server side). And when you call ‘glBeginIf(condition)’ with a condition referring to the handle, a command is added that says ‘if the condition on the value referenced by this handle is true, execute the next commands until you find the next glEndIf; otherwise discard them’. OpenGL doesn’t need to know the value at that moment because it is just adding commands to its queue.

And you can continue issuing OpenGL commands, and they can be executed without CPU intervention, because when you issue the glBeginIf command it doesn’t need to know the value of the variable referenced by the handle. It doesn’t have to wait until the value is known: it just puts the command in the queue, and you continue with the rest of the program or call other OpenGL functions.
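
The deferred behaviour described above can be sketched as a toy command queue in plain C. This is only a model under stated assumptions: the `server`, `cmd`, `enqueue`, `drain` and `run_frame` names are invented for illustration, `CMD_WRITE_VALUE` stands in for the end of an occlusion query writing its result, only a "less than" comparison is modelled, and nesting of BeginIf blocks is not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical server-side model; none of these names are real OpenGL. */
enum cmd_kind { CMD_WRITE_VALUE, CMD_BEGIN_IF_LESS, CMD_END_IF, CMD_DRAW };

struct cmd {
    enum cmd_kind kind;
    int handle;      /* index into the server-side value table */
    float operand;   /* value to write, or threshold to compare against */
};

#define MAX_CMDS 64
#define MAX_VALUES 8

struct server {
    struct cmd queue[MAX_CMDS];
    size_t count;
    float values[MAX_VALUES];  /* async values, owned by the server */
    int draws_executed;
    int cmds_skipped;
};

/* Client side: only records the command; it never reads values[]. */
void enqueue(struct server *s, enum cmd_kind kind, int handle, float operand)
{
    assert(s->count < MAX_CMDS);
    assert(handle >= 0 && handle < MAX_VALUES);
    s->queue[s->count++] = (struct cmd){ kind, handle, operand };
}

/* Server side: runs later and resolves each handle only when its command is reached. */
void drain(struct server *s)
{
    int skipping = 0;
    for (size_t i = 0; i < s->count; ++i) {
        const struct cmd *c = &s->queue[i];
        if (c->kind == CMD_END_IF) { skipping = 0; continue; }
        if (skipping) { s->cmds_skipped++; continue; }
        switch (c->kind) {
        case CMD_WRITE_VALUE:   s->values[c->handle] = c->operand; break;
        case CMD_BEGIN_IF_LESS: skipping = !(s->values[c->handle] < c->operand); break;
        case CMD_DRAW:          s->draws_executed++; break;
        case CMD_END_IF:        break; /* handled above */
        }
    }
    s->count = 0;
}

/* Example frame: returns how many draws ran when the query reports `fraction`. */
int run_frame(float fraction)
{
    struct server s = {0};
    enqueue(&s, CMD_WRITE_VALUE, 0, fraction);  /* stands in for the query result */
    enqueue(&s, CMD_BEGIN_IF_LESS, 0, 0.25f);
    enqueue(&s, CMD_DRAW, 0, 0.0f);             /* conditional draw */
    enqueue(&s, CMD_END_IF, 0, 0.0f);
    enqueue(&s, CMD_DRAW, 0, 0.0f);             /* unconditional draw */
    drain(&s);
    return s.draws_executed;
}
```

In this model all five commands are enqueued before any value is known, and the comparison happens only inside `drain`. Skipped commands are still stored in the queue; they are just not executed.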


For the second extension, I basically see it the same way. I can easily do this sort of thing in my own code, so what benefits will the extension give me?

Do you understand how NV_OcclusionQuery works?
You call the API to begin a query, then draw objects, and then finish the query. Those commands are added to the OpenGL command queue. OpenGL may execute them a couple of frames after you called the API functions (because when you call them it is still rendering the commands from two frames before). So when you check the query results in your program, you have two options. You can say ‘wait until the end query command is reached’, which gives you the results of your query but breaks the parallelism between GPU and CPU (by synchronizing them). Or you can ask whether the end of the query has been reached; this doesn’t break parallelism, but you may get the result two frames after you asked for it, when it can be useless (still useful for flares, and maybe for predicting what can be occluded).
What I’m proposing (I suppose more people have thought about it, as it is a natural solution) happens entirely on the server side (the GPU in this case) and doesn’t need any intervention from the client (CPU), because the client doesn’t need the values referenced by the handles.

I don’t know if my explanation is clear. It is difficult for me to write in English.

Thanks.

Originally posted by Jan2000:
If I understand you and the OpenGL 2.0 specs correctly, then the first thing you proposed will be available in OpenGL 2.0 via an advanced fence mechanism.

Jan.

It is good to hear that.

I remember talking with a developer relations person at a hardware company (and I think I saw something about that, or something similar, at a DirectX conference) who told me that they were thinking of a DrawIf series of commands (something like glDrawElementsIf).
But I think that is a limited point of view: it means an explosion of API functions (normal and If versions) and a big overhead if you want to issue a long succession of bindtextureif, drawif, … commands. I think it is something impossible to do right, or very limited if you only include DrawIf commands.

After a quick view (really quick) of Direct3D9’s help, I found nothing about that. Just asynchronous notifications (something you can do with nVidia’s occlusion query, nv fence, …)

I think that what I’m proposing should be simple to implement in hardware where nv_occlusion_query or similar extensions can be implemented. Don’t you?

An asynchronous occlusion query / BeginIf can be a real win in most common situations. If you are using complex lighting/materials/effects (where each pixel is expensive), you can gain a lot by rejecting characters/objects, even more if you have a previous pass for updating the z-buffer (for stencil shadows, for example).

Thanks.

And you can continue issuing OpenGL commands, and they can be executed without CPU intervention, because when you issue the glBeginIf command it doesn’t need to know the value of the variable referenced by the handle. It doesn’t have to wait until the value is known: it just puts the command in the queue, and you continue with the rest of the program or call other OpenGL functions.

Yeah, you’re right.

I don’t know what I was thinking when I said that the asynchronous handles would fail when they are being used by more than one extension. Probably I was thinking in terms of when the commands are sent to OpenGL instead of when they are executed, like you said.

So, now that I understand the benefits that extensions like this would bring, I like the idea. The glBeginIf/glEndIf instructions seem to me to be a lot like the predicated instructions in the Itanium instruction set.

About the only downside that I can think of now would be that all the commands still need to be buffered, even if they won’t be executed. So some memory and possibly bus traffic may be wasted in exchange for better parallelism between the CPU and GPU. I think it still definitely would be a win overall, though.

j

Originally posted by j:
I think it still definitely would be a win overall, though.

Yes, I think so. I am posting this idea here because I think it is more or less simple, and maybe it can open the mind of a good engineer at an IHV. (This way maybe we won’t have to wait until Direct3D has something similar before we get an extension for it.)

IHVs usually say: “Vertices are not the bottleneck, we can process many more vertices than people are sending in current games” and I think: “yes, and we are not going to send more, because you have to process all of them. My models will always be low/medium detail (relative to hardware power) because I prefer to draw more models rather than a few with more vertices.” Many of the models are occluded or partially occluded, and it will get worse every day with more complex scenes, with trees, hills, complex buildings, complex indoors, … A good occlusion query will help you avoid drawing occluded objects, or select a simpler material with fewer passes, texture shaders, a simpler vertex program for lighting calculations, etc.

Thanks.

The only real problem with the “BeginIf/EndIf” paradigm is that the ability of implementations to store GL commands is, at best, limited. This is why display lists aren’t the best win in terms of performance: too many GL commands require a CPU-GPU sync operation.

A bind texture call, for example, is not a GPU-only call. If that texture isn’t resident, then you’re going to have to upload it, and it is likely that only the CPU-side driver knows this (hence, the GPU will have to interrupt the CPU or otherwise halt CPU operations for a time). Add to that the fact that, anytime you upload a texture, you may need to invalidate the on-chip GPU texture cache (lest the cache contain old data that was in the memory when the new texture came in). Admittedly, a cache-invalidate could be handled, depending on the hardware, without a stall, but I wouldn’t bet on it.

Not only that, but the CPU must get involved so that it can manage texture memory. It must decide where the new texture gets uploaded, and which textures get forced out.

Not to say that it is a bad idea, mind you. I’m just pointing out things you should be aware of. It would certainly be nice to see GL commands executed based on internal GL state, but that has to be difficult to implement in hardware.

Of course, with a programmable command processor…

Oh, and there’s no way you’re getting a 2-frame overlap using OpenGL as the spec currently exists. There’s simply no way for any known implementation to draw that much without a synchronization event (which happens every time you touch certain state [hw-dependent], like vertex programs, pixel programs, etc). If you aren’t using NV_VAR, you definitely aren’t getting a 2-frame overlap.

Not only that, even if the implementation could buffer every GL command you send, there’s no way the driver would be willing to set aside the several MB of space that the commands for 2 frames of rendering might require. Hence, you will have to sync and stall at some point.

I’d say you’d be very lucky to be able to send half of your GL commands to the GPU without a CPU/GPU sync operation.

about BeginIf, EndIf http://www.3dlabs.com/support/forum/thread.jsp?forum=32&thread=2398&tstart=15&trange=15

Korval:
I’m sorry I didn’t check this thread again, so this answer is more than a year late. This is not true. I’m (was) using the current occlusion query extension. I’m drawing a real scene from a game with several different materials/textures/passes per light, etc. and, as I remember, I don’t get the results of the occlusion queries until two frames after the frame in which the query was posted.

Anonymous~:
The provided link does not work (I’m sorry I’m checking it more than a year later).