NV_vertex_array_range not working ..

Hello OpenGL guys,

I am trying to use the NV_vertex_array_range and NV_fence extensions to make my program faster. But after switching to them, my program became much slower than with normal vertex arrays or display lists.

Below are the profiling results. The first is with VAR, the second is with display lists, and the last is with normal vertex arrays.

As you can see, glDrawElements is about 5 times slower with VAR than with normal vertex arrays.

(Contrary to my first guess, the memcpy to video memory was fast enough.)

I used it with glInterleavedArrays, and the vertex format was GL_T2F_C4F_N3F_V3F.

I also read the docs on the NVIDIA developer site, but I couldn’t find the reason. I also looked at the sample NV_vertex_array_range code by Mr. Cass Everitt, and I saw no big difference from my own code. Please help me.

Kevin

----- PROFILE RESULT BEGIN -----
PartAnimation_DrawCurFrame_VertexBufferRange [callCount: 184, avgTime: 7832984]
VertexBuffer_CopyBuffer_memcpy [callCount: 92, avgTime: 70198]
Mesh_DrawToVertexBufferRange_glDrawElements [callCount: 92, avgTime: 15349309]
Terrain_Draw_VertexBufferRange [callCount: 920, avgTime: 2242205]
VertexBuffer_CopyBuffer_memcpy [callCount: 920, avgTime: 15296]
Mesh_DrawToVertexBufferRange_glDrawElements [callCount: 920, avgTime: 1951688]
----- PROFILE RESULT END -------

----- PROFILE RESULT BEGIN -----
PartAnimation_DrawCurFrame_DisplayList [callCount: 1106, avgTime: 1543583]
Terrain_Draw_DisplayList [callCount: 5530, avgTime: 300557]
----- PROFILE RESULT END -------

----- PROFILE RESULT BEGIN -----
Mesh_DrawIndexRange_glDrawElements [callCount: 88, avgTime: 147602]
PartAnimation_DrawCurFrame_VertexBuffer [callCount: 632, avgTime: 1920438]
Mesh_DrawIndexRange_glDrawElements [callCount: 316, avgTime: 3810593]
Terrain_Draw_VertexBuffer [callCount: 3160, avgTime: 344315]
Mesh_DrawIndexRange_glDrawElements [callCount: 3160, avgTime: 325972]
----- PROFILE RESULT END -------
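
For reference, the slowdown implied by the glDrawElements lines above can be worked out directly. This is only a small sketch: the helper names are made up for illustration, the unit of avgTime is not stated in the post (it cancels in the ratio), and pairing the VAR and vertex-array lines by their position in the output is an assumption.

```c
/* Slowdown factor of the VAR path relative to the plain vertex-array path.
   The unit of avgTime is unknown, but it cancels out in the ratio. */
double slowdown(double var_avg_time, double plain_avg_time) {
    return var_avg_time / plain_avg_time;
}

/* avgTime values copied from the glDrawElements lines of the first and
   third profiles. Matching them up by position is an assumption. */
double terrain_slowdown(void) {
    return slowdown(1951688.0, 325972.0);
}

double part_animation_slowdown(void) {
    return slowdown(15349309.0, 3810593.0);
}
```

With these numbers the terrain draw comes out roughly 6 times slower and the part-animation draw roughly 4 times slower, which brackets the "5 times" quoted in the post.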

Have you checked whether your VAR is valid?

If it’s 5x slower, I’m betting you’re getting an invalid VAR and forcing us to do CPU reads from uncached memory.

- Matt

How can I check whether the VAR is valid?

Of course, I called glVertexArrayRangeNV before using the VAR.

There’s a Get for it. See the spec.
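
A minimal sketch of that query, assuming the GL_VERTEX_ARRAY_RANGE_VALID_NV token from the NV_vertex_array_range spec. To keep it compilable without GL headers or a context, the getter is passed in as a function pointer with the same shape as glGetBooleanv. The names var_is_valid, get_booleanv_fn, fake_get_booleanv and fake_valid_state are hypothetical helpers, not part of any API.

```c
/* Token value from the NV_vertex_array_range specification. */
#define GL_VERTEX_ARRAY_RANGE_VALID_NV 0x851F

/* Same shape as glGetBooleanv(GLenum, GLboolean *), spelled with plain C
   types so this sketch compiles without GL headers (an assumption; on some
   platforms the real function also carries a calling-convention macro). */
typedef void (*get_booleanv_fn)(unsigned int pname, unsigned char *params);

/* Returns 1 if the driver reports the current vertex array range as valid,
   0 otherwise. */
int var_is_valid(get_booleanv_fn get_booleanv) {
    unsigned char valid = 0; /* GL_FALSE */
    get_booleanv(GL_VERTEX_ARRAY_RANGE_VALID_NV, &valid);
    return valid != 0;
}

/* Hypothetical stand-in for the driver, for trying the helper without a
   GL context. It answers the validity query from a global flag. */
unsigned char fake_valid_state = 0;

void fake_get_booleanv(unsigned int pname, unsigned char *params) {
    if (pname == GL_VERTEX_ARRAY_RANGE_VALID_NV) {
        *params = fake_valid_state;
    }
}
```

In a real program one would pass glGetBooleanv itself. It is probably best to run the check after the array pointers and client states are set up, shortly before the draw call, since validity depends on the arrays that are actually enabled.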

- Matt

Also, make sure you don’t put your actual indices in the VAR – they need to be in regular system RAM.
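
One way to guard against that in a debug build is to check that the index pointer does not fall inside the VAR block before drawing. The sketch below is only an illustration; overlaps_var is a hypothetical helper, not part of the extension.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the byte range [ptr, ptr + size) overlaps the VAR block
   [var_base, var_base + var_length), 0 otherwise. Addresses are compared
   as integers because the two pointers may belong to unrelated objects. */
int overlaps_var(const void *ptr, size_t size,
                 const void *var_base, size_t var_length) {
    uintptr_t p = (uintptr_t)ptr;
    uintptr_t v = (uintptr_t)var_base;
    return p < v + var_length && v < p + size;
}
```

Here var_base and var_length would be the pointer and size handed to glVertexArrayRangeNV, and ptr the index array passed to glDrawElements.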

The kind of RAM you allocate for the VAR matters, too. If you allocate with a priority in the 0.75-1.0 range, you get on-card RAM, which is slower to write than AGP RAM (even if you’re doing proper cache-line-sized write-combined streaming). If you write to it every frame, you probably want AGP RAM (priority 0.5 or so).
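
As a rough sketch of that rule of thumb, the priority can be picked from how often the buffer is rewritten. The enum and function names are hypothetical, and the values simply follow the numbers given above; which memory a given priority actually yields is up to the driver.

```c
/* How the application intends to use the VAR memory (hypothetical enum). */
typedef enum {
    VAR_USAGE_STATIC,    /* written once, drawn many times */
    VAR_USAGE_STREAMING  /* rewritten every frame, e.g. animated meshes */
} var_usage;

/* Priority values taken from the advice above: 0.75-1.0 is said to give
   on-card memory, about 0.5 is said to give AGP memory. */
float var_priority_for(var_usage usage) {
    return usage == VAR_USAGE_STREAMING ? 0.5f : 1.0f;
}
```

On Windows the result would go in as the last argument of wglAllocateMemoryNV(size, readFrequency, writeFrequency, priority). Since the original program memcpy's its vertices every frame, the streaming case looks like the relevant one.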

Now, if you could have TWO VAR ranges, one for on-card RAM for seldom-changing geometry, and one for AGP RAM for streaming geometry… Mmm… [/me wipes drool from chin]