VAO and Bindless

Aleksandar
10-08-2011, 08:44 AM
What is the proper way to set up a VAO in order to use Bindless?

I didn't expect that the introduction of mandatory VAO in NV R280+ drivers would create such a headache. A naive approach, "create, bind and forget", simply doesn't work. The application crashes at the first glDrawElements() call with a NULL pointer assignment. There are no GL errors or anything else that debug_output can catch. The same code with standard VBO access works perfectly with a bound VAO. Without binding a VAO, Bindless works fine (furthermore, it is code that has been working correctly for years).

Alfonse Reinheart
10-08-2011, 02:47 PM
A naive approach, "create, bind and forget", simply doesn't work.

That sounds like a driver bug to me.

kRogue
10-08-2011, 02:53 PM
(Shudders). If memory serves correctly, bindless performance gets hurt a touch with VAO (I have a memory of reading that somewhere in these forums). But on the other hand, since you are using bindless anyway, that means you must be using NVIDIA hardware, and NVIDIA have gone on record stating that core profile is slower than compatibility profile. Why not just make a compatibility profile GL context and call it a day? [On a side note, for my own code, apparently I accidentally avoided the issue by only creating and binding a VAO at startup if bindless was not present. I wonder if under core profile a future (or current!) driver revision will make that break?]

Aleksandar
10-08-2011, 03:52 PM
That sounds like a driver bug to me.
Yes, it sounds like a bug, but I cannot claim anything.
The problem is that the same behavior also shows up on older (pre-R280) drivers. Even if it is a driver bug, adding support for R280+ means breaking support for older releases. Till now I was quite happy since there was no need to use VAO on NV, and I hadn't tried to combine VAO and Bindless.

I hope that there is something that I have overlooked, and VAO and Bindless can coexist after all. That's why I asked if anyone has succeeded in using them together.


If memory serves correctly, bindless performance gets hurt a touch with VAO (I have a memory of reading that somewhere in these forums).
Yes, the memory serves you well. Bindless offers much better access to resident buffers than any other method. Using VAO can hurt performance. But the problem is that now I have no choice, at least for the core profile.


But on the other hand, since you are using bindless anyway, that means you must be using NVIDIA hardware, and NVIDIA have gone on record stating that core profile is slower than compatibility profile.
That is what Mark said, and there is no reason not to believe him. :)
It would be interesting to hear from the community what the real-life experiences are. I've just tried the same scene with both core and compatibility profile on R266 drivers and Vista 32-bit, and depending on the scene setup, results vary from 0.07% to 3.7% in favor of the core profile.


Why not just make a compatibility profile GL context and call it a day?
I'll try it for R285. Well, I have believed the core profile was the right path to follow all these years. ;)

Alfonse Reinheart
10-08-2011, 04:33 PM
Yes, it sounds like a bug, but I cannot claim anything.

OK, allow me to rephrase: it is a driver bug.

The set of state stored by a VAO is defined by a reference to certain state tables. Bindless is also defined by expanding this state table. Therefore, the new state that Bindless defines is automatically encapsulated by VAOs.

You should be able to just bind a VAO and continue working as normal. If you can't, then that is a driver bug.

Aleksandar
10-08-2011, 04:53 PM
OK, you have convinced me. :)
I'll try to make a repro-case sample and send it to NV.

kRogue
10-09-2011, 02:47 AM
I've just tried the same scene with both core and compatibility profile on R266 drivers and Vista 32-bit, and depending on the scene setup, results vary from 0.07% to 3.7% in favor of the core profile.

Admittedly the numbers are not that big, but those real-world results go against what has been stated! Are the code paths identical? [I would assume so, but you never know!]

Aleksandar
10-09-2011, 03:29 AM
Admittedly the numbers are not that big, but those real-world results go against what has been stated!
My naive tests should not be the reference for stating anything. What Mark said is quite reasonable, but several years have passed since that presentation and who knows what exactly is happening inside the drivers. Probably there are additional checks for core profile, but maybe core profile also excludes something else. That's why it would be interesting to gather as many real-life experiences from other people as possible, to create a snapshot of the current state of driver development and performance.

Ludde
10-09-2011, 03:44 AM
After reading that statement from Mark (or from this forum) I also tested with my scene last month and got something around 0.5%-1% in favor of core (v4.1), if I remember correctly.

Aleksandar
10-09-2011, 04:29 AM
Thanks Ludde!

Please post manufacturer and driver version, because there are differences between AMD and NV, as well as among different versions of drivers.

Ludde
10-09-2011, 04:37 AM
Sorry...
NV GTX580 with driver 280.26.

kRogue
10-09-2011, 10:28 AM
On an NV 465, driver version 280.26, Windows Vista 32-bit, under core profile, creating a VAO at application startup and then forgetting about it is enough for bindless to work for me (and not creating the VAO now makes bindless fail). Compatibility profile does not care (as expected).

What driver versions have you found this issue on?

Aleksandar
10-09-2011, 01:33 PM
The problem exists on the whole range of drivers.

Using VAO and Bindless breaks the application on:
R275.33/Win7 64-bit Pro SP1/GTX470
R266.58/Vista 32-bit SP2/8600GT

On R280.47 (WinXP 32-bit SP3/9600GT) the situation is more complicated:

1. In compatibility profile, without VAO, everything works perfectly.

2. In core profile, without VAO, it breaks on the first glVertexAttribFormatNV() call with GL_INVALID_OPERATION exception.

3. In core profile, with VAO, it breaks on the first glDrawElements() with NULL pointer assignment (just like in all previous driver releases).

So, the problem is not in a particular driver. Either I'm doing something wrong (although it works with core profile in earlier releases and in compatibility profile now), or NV has some hidden problem that has emerged with R280+ (and mandatory VAO usage).

kRogue
10-09-2011, 02:00 PM
Hmm... some questions to try to narrow down what I do differently (though I am already using a different driver than you):

1. I call glEnableClientState(GL_VERTEX_ATTRIB_ARRAY_UNIFIED_NV) only once, just after I create the one VAO that I "create, bind and forget about".

2. I call glEnableClientState(GL_ELEMENT_ARRAY_UNIFIED_NV) also only once, just before my first glDraw call. This is done after glEnableClientState(GL_VERTEX_ATTRIB_ARRAY_UNIFIED_NV).

As a side note, are you 100% sure you are calling glMakeNamedBufferResidentNV on the buffer objects? I do this immediately after buffer object creation and only once. As another side note, I do not have multiple GL contexts... I have a memory that glMakeNamedBufferResidentNV needs to be called on a per-context basis [I think].

Lastly, though I strongly suspect you are not doing this: are you using bindless for attributes, but not for index data or vice versa?

Aleksandar
10-09-2011, 03:04 PM
Enabling/disabling bindless through glEnableClientState()/glDisableClientState() for both GL_VERTEX_ATTRIB_ARRAY_UNIFIED_NV and GL_ELEMENT_ARRAY_UNIFIED_NV can be done multiple times if bindless and non-bindless drawing interleave. In the presented case there is no interleaving.


As a side note, are you 100% sure you are calling glMakeNamedBufferResidentNV on the buffer objects? I do this immediately after buffer object creation and only once. As another side note, I do not have multiple GL contexts... I have a memory that glMakeNamedBufferResidentNV needs to be called on a per-context basis [I think].
:) Buffers must be made resident in order to use bindless. Although I'm not using the "Named" version of the function, it serves the purpose.


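glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeInBytes, indVect, GL_STATIC_DRAW);
#ifdef _BINDLESS_GRAPHICS
/* Query the GPU address of the currently bound index buffer... */
glGetBufferParameterui64vNV(GL_ELEMENT_ARRAY_BUFFER, GL_BUFFER_GPU_ADDRESS_NV, &(indexBufferAddr[i]));
/* ...and make that buffer resident so the address can be used for bindless access. */
glMakeBufferResidentNV(GL_ELEMENT_ARRAY_BUFFER, GL_READ_ONLY);
#endif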


Yes, buffers have to be made resident in all contexts in the sharing group. The memory serves you well, once again. If you check your mail you can find what Jeff told me about this. And I really did get artifacts when the buffers were not made resident in all contexts in the group. In the presented case there are no sharing contexts using resident buffers.

Lastly, though I strongly suspect you are not doing this: are you using bindless for attributes, but not for index data or vice versa?
I'm using bindless for both. Even with the flavor of pointers in shaders (#extension GL_NV_shader_buffer_load : enable). Don't worry; I'm not a beginner in the bindless stuff. ;)
But I am a beginner in VAO usage (although I learnt to use it long before bindless, I stopped using it since there was no purpose).

Thank you for trying to help me!

Aleksandar
10-12-2011, 02:25 PM
My apologies!
The problem with mixing VAO and Bindless is not in the drivers but in my code.

Part of the code was calling glVertexAttribFormatNV() before the VAO was bound. I'm using that to preset the formats before any of the attributes are even known, in order to decrease the number of changes. Well, obviously, I had forgotten that.

Sorry for the false alarm! :(
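
For readers who hit the same thing, here is a toy model in plain C of why the call order matters. It makes no real GL calls. It assumes, following Alfonse's explanation above, that the attribute format set by glVertexAttribFormatNV() lives in whichever VAO is bound at the time. Every name starting with toy_ is hypothetical and exists only for this illustration; it is one plausible reading of the mistake, not a description of what the driver does internally.

```c
#include <stdbool.h>

#define TOY_MAX_VAOS 4

/* Toy context: one "format was set" flag per VAO, plus the current binding.
   VAO 0 stands for "no application-created VAO bound". */
typedef struct {
    bool format_set[TOY_MAX_VAOS];
    unsigned bound;
} ToyContext;

/* Stand-in for glBindVertexArray(). Out-of-range names are ignored. */
static void toy_bind_vertex_array(ToyContext *ctx, unsigned vao) {
    if (vao < TOY_MAX_VAOS) {
        ctx->bound = vao;
    }
}

/* Stand-in for glVertexAttribFormatNV(): writes into the bound VAO only. */
static void toy_attrib_format(ToyContext *ctx) {
    ctx->format_set[ctx->bound] = true;
}

/* Stand-in for glDrawElements(): usable only if the bound VAO has a format. */
static bool toy_draw_ok(const ToyContext *ctx) {
    return ctx->format_set[ctx->bound];
}

/* The mistaken order: preset the format first, bind the VAO afterwards. */
static bool toy_draw_after_format_then_bind(void) {
    ToyContext ctx = {0};
    toy_attrib_format(&ctx);        /* lands in VAO 0 */
    toy_bind_vertex_array(&ctx, 1); /* VAO 1 has no format yet */
    return toy_draw_ok(&ctx);
}

/* The corrected order: bind the VAO first, then set the format. */
static bool toy_draw_after_bind_then_format(void) {
    ToyContext ctx = {0};
    toy_bind_vertex_array(&ctx, 1);
    toy_attrib_format(&ctx);        /* lands in VAO 1 */
    return toy_draw_ok(&ctx);
}
```

In this model toy_draw_after_format_then_bind() returns false and toy_draw_after_bind_then_format() returns true, which matches the "works once the VAO is bound first" outcome reported above.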

kRogue
10-18-2011, 04:34 AM
In core profile, without VAO, it breaks on the first glVertexAttribFormatNV() call with GL_INVALID_OPERATION exception.


was the only clue that the VAO was not bound and GL_INVALID_OPERATION is not exactly a very helpful message!
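
Until GL itself says more, a small lookup can at least turn the numeric code into a name and attach the call that produced it. A minimal sketch: the numeric values are the standard OpenGL error codes, copied by value so the snippet builds without GL headers, while gl_error_name() and describe_gl_error() are hypothetical helper names, not part of any GL API.

```c
#include <stdio.h>

/* Standard OpenGL error code values, copied here so no GL header is needed. */
#define SKETCH_GL_NO_ERROR                      0x0000u
#define SKETCH_GL_INVALID_ENUM                  0x0500u
#define SKETCH_GL_INVALID_VALUE                 0x0501u
#define SKETCH_GL_INVALID_OPERATION             0x0502u
#define SKETCH_GL_STACK_OVERFLOW                0x0503u
#define SKETCH_GL_STACK_UNDERFLOW               0x0504u
#define SKETCH_GL_OUT_OF_MEMORY                 0x0505u
#define SKETCH_GL_INVALID_FRAMEBUFFER_OPERATION 0x0506u

/* Hypothetical helper: maps a value returned by glGetError() to its name. */
static const char *gl_error_name(unsigned code) {
    switch (code) {
    case SKETCH_GL_NO_ERROR:                      return "GL_NO_ERROR";
    case SKETCH_GL_INVALID_ENUM:                  return "GL_INVALID_ENUM";
    case SKETCH_GL_INVALID_VALUE:                 return "GL_INVALID_VALUE";
    case SKETCH_GL_INVALID_OPERATION:             return "GL_INVALID_OPERATION";
    case SKETCH_GL_STACK_OVERFLOW:                return "GL_STACK_OVERFLOW";
    case SKETCH_GL_STACK_UNDERFLOW:               return "GL_STACK_UNDERFLOW";
    case SKETCH_GL_OUT_OF_MEMORY:                 return "GL_OUT_OF_MEMORY";
    case SKETCH_GL_INVALID_FRAMEBUFFER_OPERATION: return "GL_INVALID_FRAMEBUFFER_OPERATION";
    default:                                      return "unknown GL error";
    }
}

/* Hypothetical helper: writes "<call>: <error name>" into buf.
   Returns what snprintf() returns. */
static int describe_gl_error(char *buf, size_t size, const char *call, unsigned code) {
    return snprintf(buf, size, "%s: %s", call, gl_error_name(code));
}
```

In a real program the code argument would come from glGetError() called right after the suspect function. This still does not say why the operation was invalid; it only makes the log line readable.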

Dan Bartlett
10-19-2011, 03:00 AM
Well if the next version of the spec is just a rewording of the existing spec (see OpenGL BOF (http://www.khronos.org/developers/library/2011-siggraph-opengl-bof): OpenGL Ecosystem) with no functional changes - perhaps making it look more like the OpenCL spec, then at least there should be all the error messages that can be caused by a function listed with that function.

kRogue
10-19-2011, 05:04 AM
I would like to go beyond glGetError(), perhaps to having glGetError() and glGetErrorString(), where the latter returns a string for the current error on the error stack, or maybe better yet:


const char* glGetErrorAndErrorString(GLenum *errorCode);
void glDeleteErrorString(const char*);


where glGetErrorAndErrorString returns an implementation dependent string message of the error on top of the error stack and sets errorCode to the value. NULL is an acceptable return value, but NULL is not an acceptable value for errorCode.

and glDeleteErrorString() informs GL that the client is no longer referencing the given string, so GL can safely delete it.

Naturally we could be sick and expand it so that glGetErrorObject() just returns a name for a GL object holding the GL error, and that object can be queried. Returning zero indicates there is no error.
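
To make the proposed contract concrete, here is a toy mock of it in plain C. Nothing like this exists in OpenGL: every toy_ name is hypothetical, and the "error stack" is modelled as a small fixed-size array only because the proposal speaks of one. The sketch assumes messages handed to toy_push_error() are string literals or otherwise outlive the stack.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_NO_ERROR          0x0000u
#define TOY_INVALID_OPERATION 0x0502u
#define TOY_STACK_DEPTH       8

typedef struct {
    unsigned code;
    const char *message; /* may be NULL: the proposal allows no message */
} ToyError;

typedef struct {
    ToyError entries[TOY_STACK_DEPTH];
    int count;
} ToyErrorStack;

/* What the "implementation" would do internally when a call fails.
   Errors beyond the stack depth are silently dropped in this toy. */
static void toy_push_error(ToyErrorStack *s, unsigned code, const char *message) {
    if (s->count < TOY_STACK_DEPTH) {
        s->entries[s->count].code = code;
        s->entries[s->count].message = message;
        s->count++;
    }
}

/* Mock of the proposed glGetErrorAndErrorString(): pops the top error,
   stores its code in *errorCode and returns a heap copy of the message.
   NULL is a valid return value; NULL is not valid for errorCode. */
static const char *toy_get_error_and_error_string(ToyErrorStack *s, unsigned *errorCode) {
    assert(errorCode != NULL);
    if (s->count == 0) {
        *errorCode = TOY_NO_ERROR;
        return NULL;
    }
    ToyError top = s->entries[--s->count];
    *errorCode = top.code;
    if (top.message == NULL) {
        return NULL;
    }
    char *copy = malloc(strlen(top.message) + 1);
    if (copy == NULL) {
        return NULL; /* out of memory: report the code without a message */
    }
    strcpy(copy, top.message);
    return copy;
}

/* Mock of the proposed glDeleteErrorString(): the caller is done with it. */
static void toy_delete_error_string(const char *message) {
    free((void *)message);
}
```

A caller would pop an error, log the code together with the string when one is returned, and then hand the string back through toy_delete_error_string(). Passing NULL to the delete function is harmless here because free(NULL) is a no-op.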

Maybe the last suggestion is best.. time to put it at the suggestion jazz.