
View Full Version : Setting a vertex base index in a glDrawElements call



knackered
07-14-2004, 10:32 AM
I'm a little puzzled about why there isn't the ability to specify an origin index in the glDrawElements/glDrawRangeElements functions.
It is possible in d3d9 (DrawIndexedPrimitive has it as a parameter), so why not in GL?
It would seem a better way of squeezing more out of a VBO without switching binds.

Won
07-14-2004, 12:23 PM
Doesn't BUFFER_OFFSET do just that? Am I misunderstanding you?

-Won

V-man
07-14-2004, 12:31 PM
You can do this in both VA and VBO.

VA:
glDrawRangeElements(....., &index[start]);

VBO :

glDrawRangeElements(....., START_ADDRESS_IN_BYTES);

Yet again, GL is more efficient by having fewer parameters per function :)

knackered
07-14-2004, 01:46 PM
No, that specifies a different starting position in the index array. What I'm talking about is specifying a different ZERO index in a VA: if you specify, say, 4 in the index array with a buffer origin of 10, the index will be interpreted as 14 rather than 4.
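Something like this, in other words (glElementBase is a made-up name, purely to illustrate the semantics I'm after):

GLushort indices[] = { 0, 1, 2, 3, 4 };

glElementBase(10);  /* hypothetical: every index fetched from the element array gets +10 */
glDrawElements(GL_TRIANGLE_STRIP, 5, GL_UNSIGNED_SHORT, indices);
/* vertices 10..14 are sourced, even though the index array contains 0..4 */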

Humus
07-14-2004, 02:27 PM
For that you need to use an offset in the vertex buffer rather than the index buffer. Just offset 10 vertices into the vertex buffer and you're set. As simple as that. :)

knackered
07-14-2004, 02:41 PM
No, you don't understand - I know there are ways around it. What I'm saying is that to change the 'offset' into the VBO you have to call glVertexPointer, which, if you read the various documents scattered around, is the most expensive operation you can do in the VBO extension... it is emphasised that you should only do that once per buffer object, so doing it several times while rendering a single model is obviously not going to win awards for efficiency.
HOWEVER, Direct3D 9 has a parameter for its version of glDrawRangeElements (called DrawIndexedPrimitive) which enables you to effectively do what glVertexPointer does but without the expense (it's *not* an optional parameter, therefore I have to assume it's done more efficiently than the GL workaround).
So, it's in the drivers/hardware - why isn't it in our favourite API?

V-man
07-15-2004, 06:47 AM
Which parameter does that?
Is it the second one --> BaseVertexIndex

minIndex and numVertices are like the range in glDrawRangeElements

StartIndex is like an offset into the index buffer.

...so I was guessing it's BaseVertexIndex.
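For reference, the declaration looks roughly like this (paraphrasing the D3D9 docs from memory; the comments are my reading of the parameters):

HRESULT DrawIndexedPrimitive(
    D3DPRIMITIVETYPE Type,
    INT              BaseVertexIndex,   /* presumably the origin index in question    */
    UINT             MinIndex,          /* these two give the referenced vertex range */
    UINT             NumVertices,
    UINT             StartIndex,        /* offset into the index buffer               */
    UINT             PrimitiveCount);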

I'm not an authority on GL, but my guess is the reason why this isn't available is that glDrawRangeElements predates DX9 (maybe even DX8, I don't remember) and nobody bothered to add the feature.

Are you doing a kind of animation or something that requires this?
Do you have multiple geometries in a single VBO with a single index buffer for them all?

knackered
07-15-2004, 07:09 AM
Originally posted by V-man:
Which parameter does that?
Is it the second one --> BaseVertexIndex
Correct.


Originally posted by V-man:
I'm not an authority on GL, but my guess is the reason why this isn't available is that glDrawRangeElements predates DX9 (maybe even DX8, I don't remember) and nobody bothered to add the feature.
I realise that. My question was meant to provoke some wondering as to why an extension to glDrawRangeElements wasn't introduced at the same time as VBO, to add this important parameter.


Originally posted by V-man:
Do you have multiple geometries in a single VBO with a single index buffer for them all?
Not a single index buffer, but a single vertex buffer for a particular sector isn't an insane proposition, is it?

idr
07-15-2004, 09:31 AM
To be honest, it wasn't added in VBO because, AFAIK, nobody on the WG had ever heard anyone ask for that functionality in OpenGL. :) This is certainly the first time I've ever heard anyone ask for it.

Would adding a pair of functions like ElementIndexBase( uint base ) and an associated 'get' do the trick? I think I'd prefer that to adding yet another DrawElements entry point. The advantage being that setting the base index would automatically apply to all the various drawing functions (ArrayElement, DrawElements, DrawRangeElements, MultiModeDrawElements, etc.), while adding just one new entry point.
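For illustration, usage might look something like this (all the names here are just the proposal above plus placeholders - none of it exists today; BUFFER_OFFSET is the usual macro from the VBO spec):

glBindBuffer(GL_ARRAY_BUFFER, level_vbo);           /* one big VBO of vertices (assumed) */
glVertexPointer(3, GL_FLOAT, 0, BUFFER_OFFSET(0));  /* bound once */

glElementIndexBase(0);
glDrawRangeElements(GL_TRIANGLES, 0, 65535, count0, GL_UNSIGNED_SHORT, indices0);

glElementIndexBase(65536);                          /* next 64k chunk of the same VBO */
glDrawRangeElements(GL_TRIANGLES, 0, 65535, count1, GL_UNSIGNED_SHORT, indices1);

GLint base;
glGetIntegerv(ELEMENT_INDEX_BASE, &base);           /* a hypothetical enum for the 'get' */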

The disadvantage being that it would apply to all calls. That might make things complicated for display lists. Imagine calling ElementIndexBase in a display list and calling it before calling the display list, for example.

Thoughts?

knackered
07-15-2004, 10:57 AM
It depends on whether there is the same cost as changing the VBO offsets - in other words, if calling the proposed ElementIndexBase just disguised a sequence of gl*Pointer calls, then there's no point, although it would be implementation dependent I suppose.
I guess what I'm trying to ascertain is whether d3d9 actually IS paying some price for this handy extra parameter, or whether it's a driver shortcut that isn't exposed in OpenGL simply because of semantics.

idr
07-15-2004, 08:30 PM
There are a couple of differences between making a bunch of gl*Pointer calls, at least from the app's perspective. If an app is going to make a bunch of gl*Pointer calls, it has to know which ones to make. Depending on the GL state, that may be non-trivial. It would certainly be easy to miss one and wonder why things looked screwy. :) Additionally, the app would have to make multiple GL calls in the gl*Pointer method versus one call in the glElementIndexBase method. In the vast majority of cases that wouldn't cause any performance impact, but it could. Finally, we could spec it so that glElementIndexBase could be compiled into display lists, whereas gl*Pointer calls cannot.

Given the way that VBOs are supposed to be cached in video memory, there may be other performance wins with glElementIndexBase. I'll have to look at the spec again and look through the Mesa code. Since I'm going to OLS next week, I might not get to it for awhile...

idr
07-15-2004, 08:35 PM
Given the way that VBOs are supposed to be cached in video memory, there may be other performance wins with glElementIndexBase.
I should elaborate on that a bit. For a VBO, gl*Pointer is the moral equivalent of glBindTexture. It's the point when the VBO really gets bound into the state vector. As such, it can be a fairly expensive operation. Like I said, I'll have to think about it and look through the spec & code.

SeskaPeel
07-16-2004, 02:08 AM
It would seem a better way of squeezing more out of a VBO without switching binds.
Could you explain what could be done this way, please?
The only thing I can think of would be to store 2 nearly identical meshes in a single VBO (consecutively), sharing the same index buffer for both models.

SeskaPeel.

knackered
07-16-2004, 07:17 AM
Basic example: the vertices for an entire game level are stored in a single vertex buffer (say you trust the driver to do a good job of memory managing, like some crazy fool), while each sector of your world is drawn using a local set of index arrays connecting up vertices found in the global vertex buffer. If ushorts are used for the indices then you're obviously going to be limited to 0xFFFF vertices for the WHOLE level, unless you do one of the following: 1) split the big vertex buffer into smaller ones (err..); 2) re-specify your buffer offsets (gl*Pointer) for every bloody attribute before your glDrawRangeElements (very inefficient); 3) hassle the vendors or whoever to get with the programme and add the ability to specify an origin index.

V-man
07-16-2004, 08:31 AM
Originally posted by SeskaPeel:
Could you explain what could be done this way, please?
The only thing I can think of would be to store 2 nearly identical meshes in a single VBO (consecutively), sharing the same index buffer for both models.

SeskaPeel.
Yes, if both models can share an identical index buffer. You could use it to do keyframe animation and all the keyframes could be in the same VBO.
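For example (a sketch only - glElementIndexBase is the hypothetical call proposed earlier in the thread, and the frame layout is an assumption):

/* all keyframes packed back to back in the bound VBO, sharing one index list */
GLuint frame_base = frame * verts_per_frame;       /* verts_per_frame: vertices per keyframe */

glElementIndexBase(frame_base);                    /* select the keyframe */
glDrawElements(GL_TRIANGLES, num_indices, GL_UNSIGNED_SHORT, frame_indices);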

This should be cheap. Why would applying an offset be expensive for the card?

The big question is : Why did MS decide to put the Base parameter in DrawIndexedPrimitive in D3D9.
D3D8 has a SetIndices function.

SeskaPeel
07-19-2004, 04:30 AM
Both examples are correct. But ...

1/ Entire level :
The little experience I have with such batching is that you'll have to switch a lot of parameters between each call anyway, which *should* hide the glVertexPointer() latency. This may be level specific, and I suppose you are thinking of a test application, with low rendering options and heavy geometry.

2/ Keyframe animation :
Then you won't have any interpolation between keyframes.

I still think that the correct answer to such problems would be to expose an even more programmable pipeline. Like a memory shader ...

SeskaPeel.

l_belev
07-19-2004, 05:40 PM
Probably this difference between OpenGL's draw-indexed-primitive function and its DirectX counterpart has something to do with their performance difference - Microsoft's variant is famous for its slowness and high cost compared to OpenGL's. I don't know the details of the different hardware implementations, but I can imagine that specifying a new vertex origin forces the driver to re-set-up the hardware's vertex pullers to a new start address (and probably all other state like the data formats/types/etc., if these operations cannot be separated). After all, that's exactly what glVertexPointer/etc. do, and they are heavyweight operations. So probably the lack of such a parameter to glDrawElements (and the fact that no extension adds one) has a good reason.

zeckensack
07-19-2004, 06:18 PM
Originally posted by idr:
Finally, we could spec it so that glElementIndexBase could be compiled into display lists, whereas gl*Pointer calls cannot.
Would this be desirable? I think it wouldn't make much sense ...

The element base state would be closely tied to the "stride" argument(s) to gl*Pointer and, obviously, to the "pointer" argument(s). It could be a new source of confusion if one would compile into DLs, and not the other.

If you have your vertex data in an array, you shouldn't need to copy it into a display list. VBOs make geometry in DLs rather redundant anyway. I think the element base should be purely client state, and not compile into a DL.

idr
07-26-2004, 03:23 PM
I have a proposed extension spec (http://dri.sourceforge.net/cgi-bin/moin.cgi/MESAX_array_element_base) available for review. If this looks about right, I'll whip up an implementation for Mesa, and I'll see if one of the other DRI developers can get it supported in some of the hardware drivers. Thoughts?

V-man
07-26-2004, 06:23 PM
You forgot to define what happens when base + index overflows.

Keep in mind the user can use ubyte(8), ushort(16), uint(32) for the indices.

OPTION 1 : let it wrap back to 0
OPTION 2 : flag GL_OVERFLOW
OPTION 3 : promote to higher precision

evanGLizr
07-26-2004, 11:29 PM
Originally posted by V-man:
You forgot to define what happens when base + index overflows.

Keep in mind the user can use ubyte(8), ushort(16), uint(32) for the indices.

OPTION 1 : let it wrap back to 0
OPTION 2 : flag GL_OVERFLOW
OPTION 3 : promote to higher precision
Rather than what happens when base+index overflows, what is missing is what happens when the rebased index is outside the range, in which case I assume it just does what the current spec does for glDrawRangeElements and indices out of range, which is nothing (i.e. implementation dependent):



It is an error for indices to lie outside the range [start, end], but implementations may not check for this situation. Such indices cause implementation-dependent behavior.
( glDrawRangeElements man page (http://www.die.net/doc/linux/man/man3/gldrawrangeelements.3.html) )

Doing anything else would be bad performance-wise (returning GL_OVERFLOW would force the CPU to verify the array of indices) or compatibility-wise (it may not be compatible with what current graphics cards support, so it might again force the CPU to verify the indices).

Re base+index overflow, the size of the indices does not affect how the base works: a straightforward implementation of this array base is to rebase the memory pointer of the vertex array, which will be a memory address. Using ushort or uint as indices is independent of the memory address the array is mapped to and shouldn't cause any kind of overflow.
There's no "array addr" + "base" + "index" sum going on; what happens is that "array addr" is rebased by "base" at vertex array mapping time, and then, when indices are specified, they are added to the new array address as usual (which just happens to be rebased).

Maybe it would be nice to specify how setting the base index affects the range of DrawRangeElements (is the range an absolute value or is it relative to the current array base?).

knackered
07-27-2004, 03:45 AM
Does anyone have any info on how this is handled in a d3d driver? Surely all these issues were addressed there....

idr
07-27-2004, 10:01 AM
Those are both good issues. I'll add them to the list. :) Here are my thoughts.

The overflow case of (base+i) can't really happen due to the way the extension is defined. It is defined by modifying the behavior of ArrayElement, which takes an int as a parameter. The value of (base+i) would have to overflow a signed int. In that particular case, some words should be added to paragraph 5 on page 25 saying that the behavior in that case is undefined. Perhaps some words should be added to the DrawRangeElements paragraph on page 27 stating that start and end are also offset by base. This seems like the most logical behavior.
I'll update the extension spec with these two issues. I'll leave them marked as unresolved until we decide.

V-man
07-27-2004, 03:45 PM
It is an error for indices to lie outside the range [start, end], but implementations may not check for this situation. Such indices cause implementation-dependent behavior.
That paragraph does not address the issue of overflow. It seems to reference the min/max range.

Since with this extension, one number is added to another, the issue of overflow is raised, and overflowing CAN happen with ubyte and ushort indices.

Would it not be a problem with Geforces that prefer ushort?

idr
07-27-2004, 04:03 PM
It depends on the hardware. For example, the R200 has a register to hold a base index value. If you set the base index and call DrawElements, it wouldn't have to modify any of the element data. It would just modify the value of that one register. In that case, there is no overflow since the chip had better be able to internally add the index and the base. :)

In any case, all of the DrawElements type functions are defined in terms of calling ArrayElement. This extension changes the behavior of those functions implicitly by changing the behavior of ArrayElement. Therefore, the driver just has to make it work.

For DrawRangeElements, it should be easy enough to tell if you hit this case. If the hardware doesn't work like that, it may hit a slightly suboptimal path, but it shouldn't be too bad. If the hardware has to modify the index values on upload, it can look at end+base and decide if it needs to use a larger datatype. It would be more painful for DrawElements type calls, however.

idr
07-27-2004, 04:12 PM
I just thought of this after hitting 'Add Reply'. If the hardware does support a specific base register, it may be faster for the caller to use this extension and make multiple draw calls. For example, say a mesh has 150,000 points, but the model can be broken into groups where all the point indices lie within 64k of each other. The app could divide up the mesh so that ushort indices could be used, and make multiple DrawRangeElements calls separated by ArrayElementBase calls.
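Schematically, something like this (the per-group bookkeeping is an assumed app-side structure; glArrayElementBaseMESAX is the entry point from the draft spec):

/* Each group's indices are stored relative to the group's first vertex,
 * so they fit in 16 bits even though the whole mesh has 150,000 points. */
for (int g = 0; g < num_groups; g++) {
    glArrayElementBaseMESAX(groups[g].first_vertex);    /* e.g. 0, 65536, 131072, ... */
    glDrawRangeElements(GL_TRIANGLES,
                        0, groups[g].num_vertices - 1,  /* index range within the group */
                        groups[g].num_indices,
                        GL_UNSIGNED_SHORT,
                        groups[g].indices);
}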

l_belev
07-27-2004, 05:43 PM
I don't see why a new extension is necessary for this matter. Such functionality is already available in OpenGL (i.e. call glVertexPointer/etc. with a new pointer parameter).
The implementation could check if the other parameters of these functions are unchanged (i.e. optimize the case when the caller wants just to set a new base index).

If you are concerned about the high cost of these funcs, it is just the cost of the actual operation performed (i.e. there's no cost associated with the function names themselves, of course), so if the operation is optimized, there's no need to worry about the cost.

If this operation can be faster with a new parameter to the glDraw* functions (or whatever other way to do this thing), I see no reason for the operation not to be as fast with the gl*Pointer functions.

If this is not optimized in some current implementations (probably nvidia or ati), this is their problem, but not a problem of the API.

On the other hand, if you provide a new means of setting the base index, that would be duplicate functionality - one time we have a base set via the gl*Pointer funcs and then we have a second base set some other way, and neither of them has independent meaning by itself - only their sum is meaningful. This does not seem quite elegant to me.

Korval
07-27-2004, 07:26 PM
l_belev has a point. There is no specific reason why driver developers should make gl*Pointer calls heavyweight under offset-like circumstances (unless they're only uploading part of the data, in which case offsetting will also be heavyweight). The best way to get driver developers to optimize for these circumstances is to use them. They optimize for the path that is used, so if we use this path a lot, they will optimize for it.


This does not seem quite elegant to me.
Actually, I consider the other method more inelegant. After all, what I want to do is offset each pointer by some amount. Rather than having to loop over each bound pointer and changing it (taking the chance that I miss one or bind another later), I just have a one-stop section to do the offsetting from.

To me, the big question is why gl*Pointer operations (in general, not even in an offset-like case) are so heavyweight to begin with. Is there any specific reason why a change of VBO has to provoke a performance hit? After all, if the VBO(s) are all in their respective memory sections, why would it cause a performance problem? It's no different from changing textures from one video-resident texture to another.

evanGLizr
07-27-2004, 10:57 PM
Originally posted by V-man:

It is an error for indices to lie outside the range [start, end], but implementations may not check for this situation. Such indices cause implementation-dependent behavior.
That paragraph does not address the issue of overflow. It seems to reference the min/max range.

Since with this extension, one number is added to another, the issue of overflow is raised, and overflowing CAN happen with ubyte and ushort indices.

Would it not be a problem with Geforces that prefer ushort?
My point - and what I unsuccessfully tried to explain before - is that the rebasing is not done by adding base+index every time you send an index; it's done by setting the vertex array address to "old address"+base. And because "old address" is a memory address, the size of the indices being used is irrelevant. There's no need for a base register in the hardware.

idr
07-27-2004, 11:07 PM
l_belev,

The problem is that to optimize for all the various special-case usages of the gl*Pointer functions, the driver has to have checks and tests to determine when to activate the special-case fast paths. Those tests have cost. We can avoid the need for those tests by having a way to directly tell the driver "I want you to do this" instead of making it try and figure it out for itself.

Korval,

The gl*Pointer calls are expensive because, particularly with VBOs or CVAs, they have to do a fair amount of data validation. The validation costs don't always show up in the gl*Pointer call, but instead show up in the next draw call. The cost isn't just the data upload. Also, as you pointed out, making one API call is better than making many API calls.

l_belev
07-28-2004, 05:10 AM
Originally posted by idr:
l_belev,

The problem is that to optimize for all the various special-case usages of the gl*Pointer functions, the driver has to have checks and tests to determine when to activate the special-case fast paths. Those tests have cost. We can avoid the need for those tests by having a way to directly tell the driver "I want you to do this" instead of making it try and figure it out for itself.
What various cases are you talking about? All that has to be done is a comparison of the other function parameters and the bound vertex buffer object with their current values from the OpenGL context. This is just a few compare/test instructions for the CPU, which is exactly <nothing> when compared to the actual work that has to be done.


Korval,

The gl*Pointer calls are expensive because, particularly with VBOs or CVAs, they have to do a fair amount of data validation. The validation costs don't always show up in the gl*Pointer call, but instead show up in the next draw call. The cost isn't just the data upload. Also, as you pointed out, making one API call is better than making many API calls.
Can you be more specific? What data validations are you talking about that cannot be avoided when the implementation determines that the application just wants to set a new base index? Also, what are these data uploads? The data is supposedly already in the right place (say, the video memory), so when the driver sees that the application just means to set a new base index, no more uploads/whatever should take place. Here I speak only about the VBO case. If one uses classic (userspace) vertex arrays, then the performance is bad anyway, so there's no point in putting much effort into optimizing that case. As for compiled vertex arrays - who cares about them anymore, now that we have VBOs.

knackered
07-28-2004, 05:36 AM
You don't think there's a lot of driver mangling with this little lot? Not to mention tempting fate.

glBindBuffer(...)
glColorPointer(...)
glEnableClientState(...)
glClientActiveTexture(GL_TEXTURE0)
glTexCoordPointer(...)
glEnableClientState(...)
glClientActiveTexture(GL_TEXTURE1)
glTexCoordPointer(...)
glEnableClientState(...)
glClientActiveTexture(GL_TEXTURE2)
glTexCoordPointer(...)
glEnableClientState(...)
glClientActiveTexture(GL_TEXTURE3)
glTexCoordPointer(...)
glEnableClientState(...)
glAttribPointer(0,...)
glAttribPointer(1,...)
glVertexPointer(...)
glEnableClientState(...)

l_belev
07-28-2004, 06:18 AM
Originally posted by knackered:
You don't think there's a lot of driver mangling with this little lot? Not to mention tempting fate.

glBindBuffer(...)
glColorPointer(...)
glEnableClientState(...)
glClientActiveTexture(GL_TEXTURE0)
glTexCoordPointer(...)
glEnableClientState(...)
glClientActiveTexture(GL_TEXTURE1)
glTexCoordPointer(...)
glEnableClientState(...)
glClientActiveTexture(GL_TEXTURE2)
glTexCoordPointer(...)
glEnableClientState(...)
glClientActiveTexture(GL_TEXTURE3)
glTexCoordPointer(...)
glEnableClientState(...)
glAttribPointer(0,...)
glAttribPointer(1,...)
glVertexPointer(...)
glEnableClientState(...)
OK, here you are right. This could be enough to justify such a new extension, if people really do often need to set a new index base and if this operation is relatively cheap.

skynet
07-28-2004, 07:04 AM
I also want to strongly vote for a separate baseindex.
As knackered pointed out, using the various glXXXPointer methods is cumbersome and may be slow. It also relieves the code that sets the base index of having to know
a) what arrays are currently enabled or will get enabled before the next draw call
b) what specific data layout those arrays have (stride, offset, interleaved/non-interleaved, etc.)
c) what VBO the array (to be offset) is currently bound to

Also, it is very important to NOT interpret the proposed base as an offset in terms of bytes into the arrays, but to actually add the offset to the indices stored in the element array (again, think of different strides in different arrays).

I also want to vote for negative baseindices to be possible. Restricting them to >=0 is rather arbitrary.

Third, I want to vote for naming the function "glBaseElement()" or "glElementOffset()" (or something similar). "ArrayElementBase" sounds a bit misleading, since its primary use is not in conjunction with glArrayElement() (where the programmer can already easily do the offsetting by himself), but rather glDrawElements() or glDrawRangeElements().

All problems with hitting unspecified memory addresses or disallowed indices can just be put under the programmer's responsibility. The specs should just note that hitting vertices outside the specified range results in undefined results :)

Last, a question related to that topic.
What if I specify a negative buffer offset into a VBO in one of those glXXXPointer() calls, but ensure that the used indices actually hit valid addresses inside the VBO? Is that allowed?

idr
07-28-2004, 07:51 AM
I also want to vote for negative baseindices to be possible. Restricting them to >=0 is rather arbitrary.
It is somewhat arbitrary, but it does match hardware functionality. Again, think about hardware that has a base-index register. I don't know for sure that all hardware will handle that smartly. I will add this to the issues list. If this extension ever got promoted to EXT or ARB, this is something that could be discussed again.


What various cases are you talking about? All that has to be done is a comparison of the other function parameters and the bound vertex buffer object with their current values from the OpenGL context.
Again, think about the case of hardware that has a base-index register. The gl*Pointer functions would have to detect that all pointers were advanced in such a way, taking stride into consideration, as to adjust the index by some fixed value. That sounds like more than a couple of easy comparisons to me. ;)

While I'm updating the spec, I'm going to mark issue #3 as resolved. I'm voting in favor of not having a per-array base because that defeats part of the optimization potential in the driver. It also defeats the single API call niceness.

I'm also going to resolve issue #6. It turns out that the wording on page 27 of the spec is such that we don't need to make any changes. WRT DrawRangeElements it says:


...with the additional constraint that all values in the array indices must lie between start and end inclusive.
Since it's pretty clear that it's only the actual values in the array that are constrained, I don't think we need any special wording changes for ARRAY_ELEMENT_BASE_MESAX.

In any case, I posted version 0.3 of the spec.

l_belev
07-28-2004, 08:53 AM
idr,
I agree with you, this extension would probably be a good thing.
But when considering the details about it, please don't tune it for the currently available hardware possibilities. For example, if allowing a negative base index is considered a good thing in principle, let it be so. Look at GL_ARB_texture_non_power_of_two - currently it has no hardware support at all, but that's going to change in the future. I think that is the OpenGL style - to be far-sighted.
Note that here I'm not arguing that allowing a negative base index is a good idea - I'm not convinced that's the case. I just gave it as an example.

zeckensack
07-28-2004, 10:19 AM
I'd like to propose a somewhat different approach. Let's start with the name: glElementIndexBias. Technically I'd prefer glIndexBias, but that may be confusing with respect to color index functionality.

First of all, I don't think the new functionality is useful at all to users of glDrawArrays (or glArrayElement). The base index is easily "emulated" while calling these entry points, so there's no need to add all that nasty interaction with proper vertex array state.

I rather think that what we have here is a tool for indexed geometry. That's why I'd like to see the functionality restricted to Draw{|Range}Elements. The behaviour change to ArrayElement would need to be removed then, and instead there would only be an "inline" effect to array indices sourced from an array.

I.e. I propose this change (against the 1.5 spec):

DrawElements (mode, count, type, indices);

is the same as the effect of the command sequence
if (mode, count, or type is invalid) generate appropriate error
else {
    int i;
    Begin(mode);
    for (i = 0; i < count; i++)
        ArrayElement(indices[i] + bias); // <<
    End();
}

... and reverting the ArrayElement spec back to its original form.

With this restricted functionality, I'd prefer "bias" to "base" or "offset", because I think the latter two are potential sources of confusion. They are easily mistaken for offsets to the indices pointer. "bias", I think, is a term that's widely understood to be an in-line modifier.

l_belev
07-28-2004, 11:40 AM
First of all, I don't think the new functionality is useful at all to users of glDrawArrays (or glArrayElement). The base index is easily "emulated" while calling these entry points, so there's no need to add all that nasty interaction with proper vertex array state.
The fact that the index offset would not be too useful for glArrayElement is not enough reason to cut it off. In fact glArrayElement itself is rarely useful in practice, so this whole issue isn't of much importance, but what's more important here is to preserve the specification's clarity and consistency. I think the original variant is better - it does not introduce unnecessary discrimination between glArrayElement and glDrawElements. Otherwise one would have to remember one more unnecessary rule - when the index base is applied and when not. For glDrawArrays the base index isn't applied anyway, so it is out of the question.

Obli
07-28-2004, 11:53 AM
Originally posted by SeskaPeel
The little experience I have with such batching is that you'll have to switch a lot of parameters between each call anyway, which *should* hide the glVertexPointer() latency.
This is good news for me because I always wondered how to work around it. There are some cases in which I need to call VertexPointer to set some different parameters, such as the number of components in an attribute array. I'm somewhat off topic however.

Originally posted by l_belev
But when considering the details about it, please don't tune it for the currently available hardware possibilities. For example, if allowing a negative base index is considered a good thing in principle, let it be so. Look at GL_ARB_texture_non_power_of_two...
I'm not sure this could be really useful. I mean, ARB_npot has to do with artists. Indices are usually managed by the app... I'm not really sure I can think of a problem in which a negative index (with respect to $base) would help. Say I'll have to index vertex $base-1. Then I would review the way I choose $base and recompute some indices accordingly.
I can't figure out a scenario in which this really helps. Also, considering the kind of functionality, I'm not really sure it's good to drop old hardware. I would rather like a distinct extension which allows negative offsets.
Obviously, the whole thing holds when $base+@index[i] is actually pointing in the array, i.e. at offset > 0 from base address.
I agree that negative $bases are a bad thing.

By the way, this thing reminds me of ListBase.
Fine, this is probably much more performance critical and I'm not really used to it so I could be wrong, but I'd like to keep the established naming conventions. So, I vote for ElementBase. As for the added 'Array' just as it is in the spec, I'm not aware of what other kinds of 'elements' one could think of.
As for the behaviour at overflow, I also agree on having an implementation-specific result when accessing out-of-bounds memory.

l_belev
07-28-2004, 01:22 PM
I'm not sure this could be really useful. I mean, ARB_npot has to do with artists. Indices are usually managed by the app... I'm not really sure I can think of a problem in which a negative index (with respect to $base) would help. Say I'll have to index vertex $base-1. Then I would review the way I choose $base and recompute some indices accordingly.
Neither can I think of such a problem. As I said, I just gave this as an example and didn't mean to push for it. But you didn't read my entire post, did you? :)

zeckensack
07-28-2004, 01:44 PM
Originally posted by l_belev:
The fact that the index offset would not be too useful for glArrayElement is not enough reason to cut it off. In fact glArrayElement itself is rarely useful in practice, so this whole issue isn't of much importance, but what's more important here is to preserve the specification's clarity and consistency. I think the original variant is better - it does not introduce unnecessary discrimination between glArrayElement and glDrawElements.
Which was, in a way, my point. idr's preliminary spec creates a new dependency on vertex array state, and affects all entry points referencing vertex arrays.

Look at what was asked for in the thread topic. We don't need an extension/change to ArrayElement behaviour per se, because ArrayElement and DrawArrays behaviour already have this functionality built in. I find it undesirable to insert another hook there, because the whole purpose is extending DrawElements functionality. ArrayElement is not the right place to extend. DrawElements itself is the right place.

It might even be better to not have a state value for this at all, and just insert a new argument into an extended glDrawElements entry point (with whatever name seems appropriate). This would eliminate all potential confusion, at the cost of a few entry points.

Otherwise one would have to remember one more unnecessary rule - when the index base is applied and when not.
If you just think of it as extending DrawElements and friends, I'd think that's easy enough to remember. An "Overview" section might word it as affecting element indices sourced from an index array -- VBO ELEMENT_ARRAY_BUFFERs or pointers to client memory, as handed to DrawElements, MultiDrawElements and DrawRangeElements. Nothing else. This is an overall smaller change to the GL spec.

For glDrawArrays the base index isn't applied anyway, so it is out of the question.
As of idr's preliminary spec, it would affect DrawArrays. Again, I find this redundant.

l_belev
07-28-2004, 02:53 PM
Originally posted by zeckensack:

Otherwise one would have to remember one more unnecessary rule - when the index base is applied and when not.
If you just think of it as extending DrawElements and friends, I'd think that's easy enough to remember. An "Overview" section might word it as affecting element indices sourced from an index array -- VBO ELEMENT_ARRAY_BUFFERs or pointers to client memory, as handed to DrawElements, MultiDrawElements and DrawRangeElements. Nothing else. This is an overall smaller change to the GL spec.
I can understand your point. This is one of the cases when different people look at the same thing from slightly different angles. It is hard to say which is the <right> point of view, since for this kind of matter there is no universal truth. But I still think that using the index base for glArrayElement as well as for glDrawElements is somewhat more consistent.

Here is an example when it would be useful: Imagine that normally you use glDrawElements(or friend) for drawing something, but then you write another path for the same thing, using glBegin/glEnd and glArrayElement for debugging purposes, because you want to do something specific for every vertex drawn (can't think what exactly at the moment). And you want to separate the setting of the index base and the actual drawing to be located in different places in your software (a question of organization). Then obviously it would be cleaner if the index base works for the glArrayElement too.

On the other hand, I can't think of a case where applying the index base to glArrayElement would be a bad thing or an obstacle to anything.



For glDrawArrays the base index isn't applied anyway, so it is out of the question.
As of idr's preliminary spec, it would affect DrawArrays. Again, I find this redundant.
Well, I didn't see it. Probably that would be unnecessary.

But don't worry. If this extension reaches the ARB for consideration, they will clear up any such little issues.

idr
07-29-2004, 10:44 AM
I have to admit, I specified things the way I did for somewhat selfish reasons. I found that just modifying the behavior of ArrayElement was very easy to specify. That makes the extension spec short and sweet.

Additionally, just modifying the behavior of ArrayElement makes it much easier for me to implement the extension in the drivers that I maintain. Not only am I concerned about the software paths in Mesa, but I also work on the hardware TNL paths for the open-source R100 and R200 drivers. There are certain cases in those drivers where the DrawElements / DrawRangeElements implementation can fall back to actually calling ArrayElement. It turns out to be much easier for me to either poke the hardware's base-index register or, in the fallback case, just let ArrayElement do the work. Modifying the various DrawElements type calls would require modifying code in more places. To me, that equates to more chances for bugs.

I am strongly opposed to adding another drawing entry-point. To keep the spec consistent, we'd have to add at least 3 new entry-points: one for each of DrawElements, DrawRangeElements, and MultiDrawElements. I'm sure some sick puppy would also ask for a modified version of MultiModeDrawElementsIBM. ;) If a function like MultiDrawRangeElements is added at some point, a special version of that would also be needed. Having watched and taken part in a few ARB votes, I can say that most people are opposed to APIs that may lead to an "explosion" of entry-points.

We'd also have to specify the interactions with ATI_element_array (http://oss.sgi.com/projects/ogl-sample/registry/ATI/element_array.txt) , APPLE_element_array (http://oss.sgi.com/projects/ogl-sample/registry/APPLE/element_array.txt) , IBM_multimode_draw_arrays (http://oss.sgi.com/projects/ogl-sample/registry/IBM/multimode_draw_arrays.txt) , and (probably) SUN_mesh_array (http://oss.sgi.com/projects/ogl-sample/registry/SUN/mesh_array.txt) . The current spec lets us just say that this extension implicitly affects any function that does vertex-array rendering in the "obvious" way.

Additionally, if you think about code that has to support the cases where the extension is and isn't supported, resetting the base index is a very natural way to do it:


if (MESAX_array_element_base_supported) {
    glArrayElementBaseMESAX( new_base );
}
else {
    /* Adjust the vertex-array pointers to account
     * for the new base.
     */
    glVertexPointer( ... );
    ...
}

glDrawRangeElements( ... );

I think I'm going to implement the spec pretty much as-is in Mesa and let people play with it. Implementation experience (both driver and application) will tell us where to go from there.

Obli
08-03-2004, 01:44 PM
Originally posted by idr:
We'd also have to specify the interactions with ATI_element_array (http://oss.sgi.com/projects/ogl-sample/registry/ATI/element_array.txt) , APPLE_element_array (http://oss.sgi.com/projects/ogl-sample/registry/APPLE/element_array.txt) , IBM_multimode_draw_arrays (http://oss.sgi.com/projects/ogl-sample/registry/IBM/multimode_draw_arrays.txt) , and (probably) SUN_mesh_array (http://oss.sgi.com/projects/ogl-sample/registry/SUN/mesh_array.txt) . The current spec lets us just say that this extension implicitly affects any function that does vertex-array rendering in the "obvious" way.
Maybe it's worth considering NV_primitive_restart (http://oss.sgi.com/projects/ogl-sample/registry/NV/primitive_restart.txt) as well? It works with indices after all, and it doesn't seem like it would take a lot of time to specify.

Originally posted by l_belev:
Neither can I think of such a problem. As I said, I just gave this as an example and didn't mean to push for it. But you didn't read my entire post, did you?
My apologies, but it's more likely I read it and forgot it while writing (whoops :rolleyes: my fault).

gmeed
08-29-2004, 06:52 PM
Here's another vote for the extension. I envision a slightly different use for it than the examples that were mentioned here. See this thread (http://www.opengl.org/discussion_boards/cgi_directory/ultimatebb.cgi?ubb=get_topic;f=2;t=017083#000004) .

knackered
08-30-2004, 09:50 AM
No, I don't believe the extension will help too much with that. Specifying the buffer offsets once a frame, or even once every patch, wouldn't impact performance too much - respecifying all the offsets once per object is what really hurts.
Your question could be answered by simply respecifying your VBO offsets for each patch, or whatever scheme you're using (I haven't read your thread too carefully).
But you're right, this extension would be extremely useful even for you, I suppose - in fact, the whole vertex/index array mechanism looks distinctly odd without it (see d3d9 again).

Korval
08-30-2004, 10:45 AM
The usefulness of this extension depends on this: will the driver simply do the respecifying for you? If so, then you don't gain any performance by it. Considering nVidia's more unusual implementation of VBO, it is entirely possible that the same performance penalty that one would incur from calling a gl*Pointer would appear from setting the offset.

V-man
03-10-2005, 06:51 PM
*Bump*

I'm still interested. Any reason not to have this?

tamlin
03-12-2005, 12:08 PM
If first limiting the scope to VBO, the last argument/parameter for e.g. glVertexPointer is basically the same thing as this proposal, is it not? As such, in the scope of VBO, is it really useful?

For non-VBO, I can see some, limited use for this (perhaps the negative index is a bit enthusiastic, but who am I to judge :-) ). The question for me is however, would this be of enough utility for enough users, both today and in the future (where "in the future" holds way more weight I think) to warrant a "real" proposal? In the light of VBO? After all, this proposal would be added at a time where VBO is already in place.

Having read the thread, and given it a bit of thought, I currently lean towards not.

Perhaps the currently largest argument against it is that, as has been noted in the thread, this seems to not have been a very frequent request. That does not invalidate the request; it might even just indicate that no one ever considered it previously, and there might be uses for it. It would however create ripples all through the spec for all array types.

My personal feeling is however, currently, that this is something I as the application programmer, or the scene graph software, or whatever role I'm playing can handle better. It is state I need to keep track of anyway (and way more, in addition to the driver needing to do it too), and it might be "expensive" to send a 64K short array to the server today - but it's getting cheaper (time-wise) by the day. Would we want to encumber the spec with this for all times, just to possibly get some speed-up today?

Just add a few wrapper functions and be done with it. :-)

Korval
03-12-2005, 02:43 PM
If first limiting the scope to VBO, the last argument/parameter for e.g. glVertexPointer is basically the same thing as this proposal, is it not? As such, in the scope of VBO, is it really useful?
The usefulness comes from implementation-specific issues. Namely that gl*Pointer calls are "expensive". Why they are expensive in the scope of the same buffer object (I understand why an actual change of a buffer would be somewhat expensive), I don't know. But the evidence seems to point to some cost associated with gl*Pointer calls. So, they want a lightweight way to tell the system to just shift the read pointer X entries over.


Would we want to encumber the spec with this for all times, just to possibly get some speed-up today?
Sure. Why not? Isn't programming today important?

skynet
03-13-2005, 12:41 AM
I'm still voting for such an extension. As I said earlier, changing the Pointer(s) (or offsets in the case of VBOs) is cumbersome, since you have to remember what arrays are activated and how they are laid out (the app has to "log" its own glXXXPointer calls).
Using this extension allows changing _all_ pointers in a single call, all of them with the correct stride. Also, it would make it easy to store more than 64k of vertices in a single VBO, but still use 16-bit indices for the glDrawElements() calls.
It is possible to do these things today already, but this new extension would give us a way to achieve the same with less effort, and possibly enable the driver to do it more efficiently. Even if it boils down to doing the glXXXPointer calls in the driver, it wouldn't be much of a problem (no loss in speed, more convenience for us).

V-man
03-14-2005, 04:18 AM
The other guys mentioned the essential stuff.

In my case, I would either have to make many gl*Pointer calls, or use 32-bit indices, both of which can drag down performance.

With offsetting, my programming becomes simpler and I would be making far fewer GL calls.

I think offsetting is a basic need. It need not be just about performance.

Korval
03-14-2005, 10:40 AM
It need not be just about performance.
While I agree that this is a useful feature, I don't see how it is about anything but performance. The number of gl*Pointer calls is relevant only in terms of performance, as is the use of 32-bit indices (except for the memory overhead, but that is only a problem in terms of performance). This is a feature, like VBOs, that exists only for the purpose of performance.

michagl
03-24-2005, 09:47 AM
As I read this proposal, I'm strongly for it - especially if the offset switch can be compiled into a display list and the offset is absolute.

The only fault is that it encourages or requires programmers to preallocate their memory in large contiguous blocks... so you have to guess your eventual memory needs rather than allocating on the go.

If you have a static environment like knackered's, as I read him, that is probably not a bother at all. But if the 'chunked' geometry is streamed dynamically, you risk having unused memory, or worse, having to reallocate memory. But if it offers a performance gain I would take it. Anything that can be accepted and gives the driver/bus a heads-up, I'm for it.

If someone plans to ever take this matter to task, and wants to collect signatures or emails for notification, I would pledge my name.

sincerely,

michael

Christian Schüler
03-26-2005, 09:26 PM
Some posters seem to undervalue the point of performance a bit, like in "it's only an optimisation".
Let's not forget, hardware acceleration is all about performance in the first place.

An equivalent to a "base vertex index" in OpenGL would be a very useful addition!

It would especially help people who have to design an API independent renderer (= me!) when one API has an index offset semantic and the other has not.

cheers

knackered
08-15-2006, 05:30 AM
I'm going to bump this topic again, seeing as OpenGL is in the process of being re-specified.
It would be a crime for this proposed mechanism not to be implemented in the new API.

Overmind
08-15-2006, 06:03 AM
As I understand it, in the new API there will be no current vertex array, you specify one (encapsulated in some kind of object) as parameter to the draw call.

This implies that using a different vertex array for each draw call is cheap.

So do we really need a base index any longer? Because the problem this feature is trying to solve (gl*Pointer is expensive) no longer exists in the new API.

knackered
08-15-2006, 06:40 AM
Mmm, fair point. But the driver must be comparing the previous vertex array to the current one set up on the card and switching if necessary. Just because this is exposed to the user as a transparent 'use this vertex buffer' doesn't necessarily mean it will be cheap to use a different one in every call.
Therefore the argument for an index buffer offset is still valid.

Michael Gold
08-15-2006, 08:59 AM
Nobody working on GL-next has stated that draw calls will take a vertex array object as a parameter. This is pure speculation, as no such decision has been made.

I'm not going to commit to a specific solution to vertex base issue raised here, but it will be taken into consideration.

There is an underutilized forum for new feature requests: the "Suggestions for the next release of OpenGL" section. Ideas buried in threads in the Advanced forum are less likely to be noticed.

knackered
08-15-2006, 10:01 AM
Ok, I've put an entry on the "Suggestions for the next release of OpenGL".
http://www.opengl.org/cgi-bin/ubb/ultimatebb.cgi?ubb=get_topic;f=7;t=000581

Overmind
08-15-2006, 03:13 PM
Ok, I mixed something up here. However, the SIGGRAPH slides still say there will be a "vertex array object" for fast state changes. How exactly this object is going to be selected, as parameter to a call or as current state, is a minor detail.

My point is not that I dislike this "base offset" in principle. For the current API I'm all for it.

But it seems like the whole thing is just a workaround for some shortcoming in the API. Now we're in the process of getting a new API anyway, so I would prefer if Khronos gets this right in the first place, and not make the same mistake plus provide an immediate workaround ;)

Korval
08-15-2006, 04:23 PM
But it seems like the whole thing is just a workaround for some shortcoming in the API.
Possibly. Maybe. It took me a while to remember what this discussion was all about.

It all depends on just what the implementation problem is that makes binding a buffer (even if its the same buffer) time consuming. State changes will still be state changes, no matter what API you get. It may be more efficient than changing individual state one at a time, but if the problem is systemic, having to do with changing the state at all, then the new API isn't going to solve the problem.

Overmind
08-15-2006, 06:05 PM
Changing the array is functionally equivalent to packing the data into a single array and changing the base offset (or changing the start pointer inside the same VBO). So if the hardware supports one of the two, the driver can provide the other.

Assume packing two objects into a single buffer really is faster, because avoiding the buffer change is a costly operation for the hardware while setting a base index is cheap. Then the driver could pack the contents of all VBOs into a single buffer for us transparently, and automatically set the base index.

If on the other hand the existence of a state change is the performance problem, then a base index is not going to solve the problem, because that's a state change, too.

One of the reasons why changing the vertex array base pointer is expensive is because of the validation overhead. This overhead goes away with the new API.

Korval
08-15-2006, 07:15 PM
I'm not going to commit to a specific solution to vertex base issue raised here, but it will be taken into consideration.Fair enough. As long as the ARB/Kronos/whoever is aware of the issue and is working on a performant solution of some kind.


Changing the array is functionally equivalent to packing the data into a single array and changing the base offset (or changing the start pointer inside the same VBO). So if the hardware supports one of the two, the driver can provide the other.
Is it? Are you an IHV who is equipped to make statements of that nature?

There could be, in theory, a number of hurdles that have to be overcome when binding even an already resident buffer object. What these are, I can't say; I'm not an IHV. However, actual live tests, as shown in this thread, do seem to bear out the idea that there is a difference between binding a buffer and just sliding around where the vertices come from. This could be due to driver "idiocy", or to a substantive issue with changing said state.

There's no guarantee that any specific vertex array object construct will solve these issues. It may, but there's no evidence one way or another, because we don't know enough about why it's happening.


Assume packing two objects into a single buffer really is faster, because avoiding the buffer change is a costly operation for the hardware while setting a base index is cheap. Then the driver could pack the contents of all VBOs into a single buffer for us transparently, and automatically set the base index.
First, even under the new API, I'm really sure hardware isn't allowed to do that. The whole "immutable" object thing and all.

Second, how would the driver know that you intend to draw from buffer B immediately after buffer A? And that you will always be doing this? Are you expecting the drivers to now be adaptive to the specific usage patterns of data to this degree?

Third, even assuming that this was enough to solve the problem, and that drivers did it, it's not something you can easily document to a user. How exactly are you going to tell your users how to write their code? Try to render buffers consistently that use the same vertex format in sequence, while also minimizing shader/texture state changes, while... etc?


If on the other hand the existence of a state change is the performance problem, then a base index is not going to solve the problem, because that's a state change, too.
I didn't say a state change; it would be the particulars of changing that actual state: the base address state. The offset state would be a separate register that gets used in the computation phase. Maybe changing that state induces a stall in the pipeline, because the driver has to wait for the current vertex data to be sent before it can validate your changes. Who knows? Certainly neither of us does.

The facts we have are these: testing has shown that there is a substantive performance difference between calling gl*Pointer and doing a simple offset (with indices). It therefore may be possible to have an offset that is implemented in hardware such that it does not incur the penalties associated with gl*Pointer calls. Or maybe there's some other solution available, one that alleviates buffer binding overhead.


One of the reasons why changing the vertex array base pointer is expensive is because of the validation overhead.
That makes no sense. Validation can't be the problem (unless it's just driver "idiocy"), because if you're changing the pointer to a location in the same buffer, there's no need to validate anything.

Overmind
08-16-2006, 04:06 AM
Is it? Are you an IHV who is equipped to make statements of that nature?
No, but I'm a programmer. And if you give me the base index extension, I can write you an API on top of it that implements fast switching between multiple vertex arrays. If I can do it, I don't see why the driver can't.

I don't need to guess usage patterns here, I just put everything in a single big buffer, and make my own memory manager (VAR anyone?). Then I never have to switch the current vertex array.
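Roughly the kind of wrapper I mean (a minimal sketch; the pool structure and glElementIndexBase are assumptions, not real API):

#include <GL/gl.h>
#include <assert.h>

typedef struct {
    GLuint vbo;          /* one big VBO holding every object's vertices */
    GLuint next_vertex;  /* trivial linear allocator, no freeing */
    GLuint capacity;     /* total vertices the VBO can hold */
} VertexPool;

/* Reserve 'count' vertices; the returned base acts as the object's "binding". */
static GLuint pool_alloc(VertexPool *pool, GLuint count)
{
    GLuint base = pool->next_vertex;
    assert(base + count <= pool->capacity);
    pool->next_vertex += count;
    return base;
}

/* Drawing an object is then one cheap call instead of a pile of gl*Pointer calls:
 *   glElementIndexBase(object_base);                                     (hypothetical)
 *   glDrawElements(GL_TRIANGLES, count, GL_UNSIGNED_SHORT, object_indices);
 */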

This memory management business is something the driver should do for me. That was the whole idea behind VBOs, I don't want to be forced into putting multiple objects into a single buffer when the driver can do that for me.


testing has shown that there is a substantive performance difference between calling gl*Pointer and doing a simple offset (with indices)
Of course you have a performance bonus when you statically change the indices :p

The question is:
If gl*Pointer really has no validation overhead, and it just changes a pointer, then what's the difference between changing the base pointer and changing the base index?

knackered
08-16-2006, 04:32 AM
Well in the current version of OpenGL, it's this:-

glVertexOffset(0);

versus this:-

glBindBuffer(...)
glColorPointer(...)
glEnableClientState(...)
glClientActiveTexture(GL_TEXTURE0)
glTexCoordPointer(...)
glEnableClientState(...)
glClientActiveTexture(GL_TEXTURE1)
glTexCoordPointer(...)
glEnableClientState(...)
glClientActiveTexture(GL_TEXTURE2)
glTexCoordPointer(...)
glEnableClientState(...)
glClientActiveTexture(GL_TEXTURE3)
glTexCoordPointer(...)
glEnableClientState(...)
glAttribPointer(0,...)
glAttribPointer(1,...)
glVertexPointer(...)
glEnableClientState(...)

You think there'd be no difference in execution speed when this is done 10,000 times a frame? Even if there's no performance difference when the commands reach the card, there's a fair amount of difference on the CPU having to call these non-inlined functions.
Consider what happens when each function is called - if you were writing a driver, and given all the current states and extensions in place, what code would you write behind each of these functions?
The CPU has better things it could be doing.

Overmind
08-16-2006, 01:08 PM
If gl*Pointer really has no validation overhead, and it just changes a pointer, then what's the difference between changing the base pointer and changing the base index?
That was a rhetorical question :p

Perhaps I should clarify what I want to achieve.

I'm not against having a base index. But I want an API where I'm not forced to code my own memory manager. If there is another use for a base index, then let's have it. If you want you can even put me on the list in the other thread :rolleyes:

But I refuse to believe that manually packing multiple objects into a single buffer is the only way to get reasonable performance. I'm sure this is a problem that can be solved with a better API.

Korval
08-16-2006, 02:06 PM
And if you give me the base index extension, I can write you an API on top of it that implements fast switching between multiple vertex arrays. If I can do it, I don't see why the driver can't. And exactly how would you do that?

If the hardware doesn't have a register for that value, you need to call the internal equivalent of gl*Pointer to change the offsets. And if this function has some performance impediment for reasons that we have no idea about, then you will have gained precious little.


If gl*Pointer really has no validation overhead and just changes a pointer, then what's the difference between changing the base pointer and changing the base index? You assume that "validation overhead" is the only possible reason for the performance impediment. It could be something else. Like I said, maybe changing the base indices requires a pipeline stall, while changing the offset may not. I don't know; I'm not on nVidia's driver team.

My point is that there's no way to know whether this offset API would be a performance improvement, or whether the proposed new API would correct the problem by itself.


Even if there's no performance difference by the time the commands reach the card, there's a fair amount of overhead on the CPU from calling all these non-inlined functions. True. But the point Overmind is making is that the new API will eliminate this overhead, turning it all into a single function call.

Therefore, the offset stuff is only viable in that context if there's more going on behind the scenes than making a lot of function calls.

MZ
08-16-2006, 04:07 PM
Originally posted by knackered:
Well in the current version of OpenGL, it's this:-

glVertexOffset(0);

versus this:-

glBindBuffer(...)
glColorPointer(...)
glEnableClientState(...)
glClientActiveTexture(GL_TEXTURE0)
glTexCoordPointer(...)
glEnableClientState(...)
glClientActiveTexture(GL_TEXTURE1)
glTexCoordPointer(...)
glEnableClientState(...)
glClientActiveTexture(GL_TEXTURE2)
glTexCoordPointer(...)
glEnableClientState(...)
glClientActiveTexture(GL_TEXTURE3)
glTexCoordPointer(...)
glEnableClientState(...)
glVertexAttribPointer(0,...)
glVertexAttribPointer(1,...)
glVertexPointer(...)
glEnableClientState(...)
What OGL really needs is Vertex Declarations. This would naturally include a solution to your "vertex base index" problem, and a couple of others too.

In the report posted on gamedev (http://www.gamedev.net/columns/events/gdc2006/article.asp?id=233) it was said that "Non-VBO vertex arrays" are considered to be layered. However, the SIGGRAPH papers don't seem to confirm it (does anyone have info about this?).

I'm mentioning this because the index offset you are proposing would be an acceptable solution only if the pointer-style arrays (which are outdated as hell, IMO) were to stay.

Overmind
08-16-2006, 05:07 PM
If the hardware doesn't have a register for that value... If the hardware doesn't have a register for it, then an extension for setting a base index won't give you any speedup either.

For the sake of this discussion, I'm assuming that the proposed extension exists and is hardware accelerated, so that I can just say glVertexOffset(1234). If the hardware can't do this, then we can stop discussing right away, because the whole thing simply can't be done faster.

But if this glVertexOffset call works, it's trivial to provide a mechanism on top of it that accelerates array changes.
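Roughly like this (a sketch only; glVertexOffset is the hypothetical entry point we're discussing, and it assumes all the objects sharing a layout live in the same VBO):

#include <GL/gl.h>

/* Sketch of the wrapper I have in mind. glVertexOffset() is the hypothetical
   extension entry point being discussed here; everything else is made up. */
extern void glVertexOffset(GLint baseVertex);   /* hypothetical */

struct VirtualArray {
    GLuint vbo;        /* all VirtualArrays sharing a layout live in one VBO */
    GLint  baseVertex; /* where this object's vertices start inside that VBO */
};

void bindVirtualArray(const VirtualArray *va)
{
    /* The gl*Pointer/enable state was set up once for the shared VBO, so
       switching between objects with the same layout is one cheap call. */
    glVertexOffset(va->baseVertex);
}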


Like I said, maybe changing the base indices requires a pipeline stall, while changing the offset may not. What's the difference between a base index change and an offset change?

If the base index change is slow and the offset change is fast, why should I ever change the base index? I just leave it constant and change the offset only. Or the other way round, whatever is faster.
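Put another way, for any one array the fetch the hardware has to do comes down to the same arithmetic either way (names and the flat layout here are purely illustrative):

#include <stddef.h>

/* Illustrative only: address of vertex 'index' in one array.
   Adding N to baseVertex or N*stride to pointerOffset gives the same address,
   which is why I don't see how one can be inherently slower than the other. */
size_t fetchAddress(size_t bufferStart, size_t pointerOffset,
                    size_t baseVertex, size_t index, size_t stride)
{
    return bufferStart + pointerOffset + (baseVertex + index) * stride;
}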

That's the reason why I doubt that there is a high overhead in changing the vertex array, aside from validation and tons of separate calls.

Except, of course, for some overhead that will always be there and won't go away no matter what you do to the API. I'm not saying there is no overhead; I'm just saying that glVertexOffset can't be better than a properly designed "vertex array object" API.

Korval
08-16-2006, 05:07 PM
In the report posted on gamedev it was said that "Non-VBO vertex arrays" are considered to be layered. However, the SIGGRAPH papers don't seem to confirm it (does anyone have info about this?). Well, it did say "Immediate mode". That doesn't specifically include gl*Pointer, but given the design of the new object model as seen so far, I seriously doubt that we'll be using gl*Pointer calls with it.

It did mention something described as:

* Vertex Array objects
- Encapsulates all array state for fast state changes
- Immutability removes validation overhead

But there were no further details than this.

MZ
08-16-2006, 05:45 PM
Originally posted by Korval:

In the report posted on gamedev it was said that "Non-VBO vertex arrays" are considered to be layered. However, the SIGGRAPH papers don't seem to confirm it (does anyone have info about this?). Well, it did say "Immediate mode". That doesn't specifically include gl*Pointer, but given the design of the new object model as seen so far, I seriously doubt that we'll be using gl*Pointer calls with it.

It did mention something described as:

* Vertex Array objects
- Encapsulates all array state for fast state changes
- Immutability removes validation overhead

But there were no further details than this. I meant this part:

Drawing Geometry

What features to consider layering:
* Immediate mode
* Current vertex state
* Non-VBO vertex arrays
* Vertex array enables (with shaders, this should be automatic)
* glArrayElement()
* glInterleavedArrays()
* glRect()
* Display lists
Removing these would reduce the number of ways to get data to OpenGL, which would simplify drivers and hardware. VBOs would be the preferred method for transferring data.

Korval
08-16-2006, 10:16 PM
But if this glVertexOffset call works, it's trivial to provide a mechanism on top of it that accelerates array changes. See below.


If the base index change is slow and the offset change is fast, why should I ever change the base index? I just leave it constant and change the offset only. Or the other way round, whatever is faster. Because the base index applies to all active components equally. It's not a per-component thing; it solves a very specific problem.

Maybe one mesh has only one texture coordinate set, while another has two. One mesh has skinning data coming in through an attribute, while another doesn't. Maybe one mesh has interleaved data, and another doesn't interleave its colors. Maybe you're rendering the same mesh but with a different color array.

These things require changing the base pointer. And they can't be implemented with a simple offset from one base address.
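As a made-up example, two interleaved layouts like the following have different strides and attribute sets, so no single shared offset or base index can map indices into both; you still end up respecifying the pointers:

/* Made-up layouts for illustration. Their strides and attribute sets differ,
   so one base index/offset shared by every array can't cover both; switching
   between them still needs new gl*Pointer (vertex declaration) state. */
struct VtxStatic {            /* stride = 20 bytes */
    float pos[3];
    float uv0[2];
};

struct VtxSkinned {           /* stride = 40 bytes */
    float pos[3];
    float uv0[2];
    float weights[4];
    unsigned char bones[4];
};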


I meant this part: I know what you were talking about, but I don't see your point.

Maybe they just didn't mention all the vertex array commands in the slides; there's only so much room on them. But I would not assume that an API whose primary purpose is to unify rendering onto one fast path is going to leave behind a half-dozen mechanisms for sending vertex data. Do you think they'd just forget that part of the API?

Overmind
08-18-2006, 02:24 AM
Ok, changing the vertex array layout is another story. This of course can't be accelerated easily.

But I still think the driver should optimize the case where the layout does not change. Then instead of packing multiple objects into a single "array object", I just have to sort my geometry by array layout to be fast.

Fitz
08-22-2006, 07:27 PM
I don't understand why simply calling glVertexPointer again with a different offset would be slow. If the same buffer is still bound, why wouldn't the driver just do what this planned extension will do anyway?

Overmind
08-23-2006, 02:03 AM
Because you have more than one array. When you call just glVertexPointer, you shift only the base of the vertex (position) array. The driver can't know whether you're going to do the same with the other arrays or not.

knackered
08-23-2006, 04:28 AM
Yes, you're effectively creating a new vertex declaration (to use D3D speak). I imagine it's quite a complicated procedure under the hood, because you're using the same mechanism to specify an offset as you do to instruct the card's memory controller about which VBO to take its data from.
This offset has been a mandatory part of every DrawIndexedPrimitive call for years now, so it should be exposed in OpenGL.
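For reference, the D3D9 call is roughly the following (parameter names quoted from memory, so treat them as approximate, and device, baseVertexIndex and the other variables are assumed to be set up elsewhere, with <d3d9.h> included); BaseVertexIndex is exactly the offset being asked for here:

/* Roughly IDirect3DDevice9::DrawIndexedPrimitive (names from memory).
   BaseVertexIndex is added to every index fetched from the bound index buffer. */
HRESULT hr = device->DrawIndexedPrimitive(
    D3DPT_TRIANGLELIST,   /* PrimitiveType                                   */
    baseVertexIndex,      /* BaseVertexIndex -- the offset this thread wants */
    0,                    /* MinVertexIndex (relative to BaseVertexIndex)    */
    numVertices,          /* NumVertices spanned by the indices used         */
    startIndex,           /* StartIndex into the index buffer                */
    primCount);           /* PrimitiveCount                                  */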

Mars_999
09-04-2006, 11:02 AM
I would like to add to this topic. I'm currently trying to find new ways to speed up my terrain renderer, and 16-bit IBOs are supposed to be faster than 32-bit IBOs. My terrain mesh is split into patches of, e.g., 33x33 vertices, and I want to use a single IBO and a single VBO to render the whole terrain. The problem is that I need, as knackered stated at the beginning, a way to reassign the origin (starting point) so I can draw each patch with the same index layout from the one IBO. I'm assuming this could be a decent speed boost: you shave off 16 bits per index and use only one IBO, whereas right now I have to use 32-bit indices and an IBO per patch, since the index values go over 65k with a 513x513 terrain mesh... Just thought I would point out a use for this extension....
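Something along these lines is what I'm after (a sketch only: glDrawElementsBaseVertex() stands in for the hypothetical base-vertex call being requested, the 16-bit patch indices are assumed to be bound as the element array, and patch vertices are assumed to be stored contiguously per patch in the big VBO):

#include <GL/gl.h>

/* Sketch of what I mean. glDrawElementsBaseVertex() is the hypothetical
   base-vertex call requested in this thread; everything else is made up. */
extern void glDrawElementsBaseVertex(GLenum mode, GLsizei count, GLenum type,
                                     const GLvoid *indices, GLint baseVertex); /* hypothetical */

const int PATCH_VERTS   = 33 * 33;      /* 1089 vertices per patch */
const int PATCH_INDICES = 32 * 32 * 6;  /* two triangles per quad  */

void drawTerrain(int numPatches)
{
    for (int p = 0; p < numPatches; ++p)
    {
        /* Same 16-bit indices (values 0..1088) every time; only the base
           vertex changes, so no per-patch gl*Pointer calls and no need to
           fall back to 32-bit indices. */
        glDrawElementsBaseVertex(GL_TRIANGLES, PATCH_INDICES, GL_UNSIGNED_SHORT,
                                 (const GLvoid *)0, p * PATCH_VERTS);
    }
}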

Jan
09-04-2006, 02:42 PM
http://www.opengl.org/discussion_boards/ubb/ultimatebb.php?ubb=get_topic;f=3;t=014659

You are absolutely right, I mentioned the same thing just a week ago.

Jan.

Mars_999
09-04-2006, 10:08 PM
Originally posted by Jan:
http://www.opengl.org/discussion_boards/ubb/ultimatebb.php?ubb=get_topic;f=3;t=014659

You are absolutely right, I mentioned the same thing just a week ago.

Jan. Oops, I didn't even look. Good to see others are having the same thoughts and issues as I am. I just wish I knew whether this would lead to a speed increase or just a smaller memory requirement... Either way would be great, but lower memory requirements are a nice bonus too...