NVidia Forceware drivers and VBO

I made a small test app to measure the speed difference between VBO and standard vertex arrays.
The application generates a grid made of 2500 vertices and 15000 indices.

GL State is:
ZWrites OFF
ZTest OFF
Culling ON with FRONT_AND_BACK
No SwapBuffers called
No Buffer Clear

So no drawing actually occurs.

Two buffers are created at startup (one for vertex data with STATIC_DRAW, one for index data with STATIC_DRAW).
They are bound and filled by calling BufferData with the data held in system memory. The arrays are enabled (Vertex and Color) and never touched again.

Vertex type is
float x,y,z
byte r,g,b,a

buffers are aligned on 64 bit boundaries
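
For reference, the vertex format above could be declared roughly as follows. This is only a sketch: the struct and constant names are hypothetical and not taken from example.cpp, and it assumes a 4-byte float, which holds on common desktop compilers.

```cpp
#include <cstddef>   // offsetof, std::size_t
#include <cstdint>   // std::uint8_t

// Interleaved vertex as described in the post: position, then color.
// "GridVertex" is a hypothetical name used only for this sketch.
struct GridVertex {
    float x, y, z;             // position, 3 * 4 = 12 bytes
    std::uint8_t r, g, b, a;   // color, 4 * 1 = 4 bytes
};

// Stride between consecutive vertices and byte offset of the color,
// i.e. the values one would pass to the vertex and color pointer setup.
constexpr std::size_t kStride      = sizeof(GridVertex);
constexpr std::size_t kColorOffset = offsetof(GridVertex, r);

static_assert(kStride == 16, "expected a tightly packed 16-byte vertex");
static_assert(kColorOffset == 12, "color should follow the three floats");

// Size in bytes of a vertex buffer holding vertexCount vertices.
constexpr std::size_t vertexBufferBytes(std::size_t vertexCount) {
    return vertexCount * kStride;
}
```

With 2500 vertices this gives a 40000-byte vertex buffer. Since the stride is a multiple of 8, every vertex starts on a 64-bit boundary whenever the buffer itself does.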

When drawing, just a plain glDrawRangeElements call is made.

No vertex array pointer update or enable/disable is made.

When using driver 43.51, the results are
Standard Arrays: 400 FPS
VBO : 3100 FPS

When using driver 53.03, the results are
Standard Arrays: 400 FPS
VBO : 400 FPS

What do you think about the 53.03 speed?

My system is:
MSI Neo2 Motherboard
Pentium 4 2.8c
1 GB RAM
Windows XP Professional
GeForce2 GTS

Test apps:
Standard: http://www.3tegames.com/gl_standart.exe
VBO: http://www.3tegames.com/gl_vbo.exe
Source: http://www.3tegames.com/vbo_src.rar

[This message has been edited by Cem UZUNLAR (edited 11-25-2003).]

In NVidia ForceWare 52.16, it seems that to enable VBO you have to call
glMapBufferARB(GL_ARRAY_BUFFER_ARB, GL_WRITE_ONLY_ARB); with GL_WRITE_ONLY_ARB as the second argument. If you use GL_READ_ONLY_ARB or GL_READ_WRITE_ARB, it does not work: no crash, but no acceleration.
With older drivers (45.23 and earlier), GL_READ_ONLY_ARB or GL_READ_WRITE_ARB works as well.

I am not using glMapBuffer. I am providing the data with glBufferData.
I will now try mapping the buffer with the WRITE_ONLY flag and copying the data into it.

I tried that but nothing changed…

Could you please post the source?

Here is the source
http://www.3tegames.com/vbo_src.rar

All the important stuff is in example.cpp

I think your problem is your measurement
technique. I messed around with your program
for a little bit and got the following:

GeForceFX 5900
53.03
VBO: 100 Mtri/sec
no VBO: 17 Mtri/sec

My changes basically consisted of making
the mesh larger and calling
glDrawRangeElements several times
for each call to Draw().
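
To put the FPS numbers from the first post on the same scale as the Mtri/sec figures, a small conversion can help. This is only a sketch: it assumes an indexed triangle list (three indices per triangle) and one draw call per frame in the original test, and the function names are made up for illustration.

```cpp
// Triangles in an indexed triangle list: three indices per triangle.
constexpr long trianglesFromIndices(long indexCount) {
    return indexCount / 3;
}

// Throughput in millions of triangles per second, given how many
// triangles one draw call submits, how many draw calls are issued
// per frame, and the measured frame rate.
constexpr double mtrisPerSecond(long trianglesPerCall,
                                int callsPerFrame,
                                double fps) {
    return static_cast<double>(trianglesPerCall) * callsPerFrame * fps / 1.0e6;
}
```

Under those assumptions, 15000 indices are 5000 triangles per call, so 400 FPS works out to about 2 Mtri/sec and 3100 FPS to about 15.5 Mtri/sec. Raising the mesh size or callsPerFrame increases the work per frame, so fixed per-frame overhead weighs less in the result.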

What’s your email address? I’ll send you
the modified code.

My e-mail is
cemuzunlar at 3tegames.com

I also tried making the mesh bigger and making several DrawRangeElements calls, but never achieved more than 20.1 Mtri/sec.

[This message has been edited by Cem UZUNLAR (edited 11-25-2003).]

I think that 20 Mtris/sec is very close to the maximum vertex throughput of the GeForce 2, so that is probably the reason why you can’t achieve better vertex throughput.

That 20.1 is Mtris per second, and I don’t use triangle strips.

I get 10 MVert/sec.

GeForce2 GTS has a 25 MVert/sec maximum vertex throughput. So there is still much room for improvement.

20Mtris can be up to 60MVertices if you don’t use the vertex cache very well.
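
How far a mesh is from that worst case can be estimated by replaying its index list through a simulated vertex cache. The sketch below assumes a simple FIFO cache and a row-major grid with two triangles per cell. Neither the cache policy nor the cache size of any real card is taken from this thread, and the function names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <vector>

// Index list for a grid of cols x rows vertices, two triangles per cell,
// emitted row by row as a plain triangle list (no strips).
std::vector<unsigned> makeGridIndices(unsigned cols, unsigned rows) {
    std::vector<unsigned> idx;
    for (unsigned y = 0; y + 1 < rows; ++y) {
        for (unsigned x = 0; x + 1 < cols; ++x) {
            unsigned i0 = y * cols + x;   // top-left corner of the cell
            unsigned i1 = i0 + 1;         // top-right
            unsigned i2 = i0 + cols;      // bottom-left
            unsigned i3 = i2 + 1;         // bottom-right
            idx.insert(idx.end(), {i0, i2, i1, i1, i2, i3});
        }
    }
    return idx;
}

// Number of indices that miss a FIFO cache holding cacheSize entries.
// Each miss stands for one vertex the hardware would have to process again.
std::size_t countCacheMisses(const std::vector<unsigned>& indices,
                             std::size_t cacheSize) {
    std::deque<unsigned> fifo;
    std::size_t misses = 0;
    for (unsigned i : indices) {
        if (std::find(fifo.begin(), fifo.end(), i) != fifo.end())
            continue;                     // cache hit, nothing to process
        ++misses;
        fifo.push_back(i);
        if (fifo.size() > cacheSize)
            fifo.pop_front();             // evict the oldest entry
    }
    return misses;
}
```

With cacheSize 0 every index is a miss, which is the 3 vertices per triangle case (20 Mtris becoming 60 MVertices). With a cache large enough to hold the whole mesh, each vertex is processed exactly once. Dividing the miss count by the triangle count for a realistic cache size gives the vertices processed per triangle for a particular index order.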

20Mtris is a good result on a GF2GTS.

See my first post on the thread.
When I try the same program with the 43.51 drivers, there is a major speed difference.

And the key point is not the MTris that are transferred; it is that there must be a difference between the VBO and non-VBO cases.

I think when they calculate triangles per second for a card’s statistics they switch off back face culling and count each triangle drawn (every 3 indices in this case) as two triangles.

FYI I have a Radeon 9800 (not Pro) running Cat 3.9 and my results were:
       FPS   Vert  Tri
VBO:   58    56    110
NVBO:  4.5   4.5   9

Matt

[This message has been edited by MattS (edited 11-26-2003).]