VBO Performance Test

Hi !

I have made an app that uses VBO, and on my HW I get very low FPS. I have a GeForce4 Ti 4600 with drivers 44.03.

I would be happy if you could test it on Radeon HW with VBO support, and on newer NVIDIA drivers with HW >= GeForce4…

Here is the URL…
http://www.tooltech-software.com/downloads/gizmo3d/binaries/win32/VBO%20Test.zip

Thanx ahead !!!

BTW, you can see some of my IBR stuff in it…

Tested this on a Radeon 9700, Cat 3.5. The VBO version runs very slowly (0 fps when I press the ‘f’ key), the non-VBO one is quite fast (35 fps). The output seems messed up in both versions. Parts of the teapot are missing.

The teapot is just rendered from one image + depth map. Therefore it is missing a lot of “non-visible” patches…

However, you get the same result as I do. The VBO version is SO SLOW !! Strange…

I haven’t tried that program yet, but I’m using VBOs in my own programs on a Radeon card (and it’s been tested on GF4 and GFFX as well), and there we get a pretty nice performance boost, so I wouldn’t blame the drivers just yet.

“This application has failed to start because MSVCP60D.dll was not found.”
No, I don’t use Visual Studio 6, I use Visual Studio .NET.

You can find the missing files here …
http://www.tooltech-software.com/downloads/gizmo3d/binaries/win32/win32_runtime.zip

I get 3 FPS using VBO and 30 FPS using the non VBO version.

It isn’t a driver issue, as ToolTech’s test app runs slow on both NV and ATI hardware in VBO mode.

One thing that generally gives me bad performance is when I mess up and get GL errors every frame, but you probably already checked that.
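For what it’s worth, that per-frame check can be wrapped up so it costs almost nothing when the queue is clean. A minimal sketch — the error getter is injected here so it compiles and runs without a GL context (in a real app you’d pass glGetError), and the function name is made up:

```cpp
#include <cstdio>
#include <functional>

// Drain the GL error queue once per frame and report anything found.
// The getter is injected so this sketch needs no GL context; in a real
// renderer you would pass glGetError. Returns the number of errors seen.
inline int drainGLErrors(const std::function<unsigned()>& getError)
{
    int count = 0;
    unsigned err;
    while ((err = getError()) != 0) // 0 == GL_NO_ERROR
    {
        std::fprintf(stderr, "GL error 0x%04X this frame\n", err);
        ++count;
    }
    return count;
}
```

Calling this once per frame in a debug build catches the “silent GL error every frame” case the post describes.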

Originally posted by ToolTech:
[b]Hi !

I have made an app that uses VBO, and on my HW I get very low FPS. I have a GeForce4 Ti 4600 with drivers 44.03.
[/b]

Hi Anders,

I implemented a VBO path in the OSG a couple of weeks back and found up to a 50% performance boost on coarse-grained, high-polygon models.

However, on models composed of tens of thousands of small pieces of geometry, the performance of VBO is slower than using display lists. I think this is largely down to OpenGL calling overhead swamping the gains from VBO. The use of extensions, and having to query for them at runtime, makes doing lots of extension calls expensive :expressionless:
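One way around that per-piece call overhead is to merge the small pieces into a single vertex/index pair up front, so one bind and one draw replace thousands. A hedged sketch of the idea — MergedMesh and appendPiece are hypothetical names, not OSG or GL API:

```cpp
#include <vector>

// Merge many small meshes (each a local vertex list plus a local index
// list) into one shared vertex/index pair, rebasing each piece's indices
// by the running vertex count. One merged buffer means one bind and one
// draw call instead of one per piece.
struct MergedMesh
{
    std::vector<float>    vertices; // xyz triples, interleaved
    std::vector<unsigned> indices;  // rebased into the merged vertices
};

inline void appendPiece(MergedMesh& out,
                        const std::vector<float>& verts,    // xyz triples
                        const std::vector<unsigned>& localIdx)
{
    const unsigned base = static_cast<unsigned>(out.vertices.size() / 3);
    out.vertices.insert(out.vertices.end(), verts.begin(), verts.end());
    for (unsigned i : localIdx)
        out.indices.push_back(base + i);
}
```

The merged arrays can then go into a single GL_STATIC_DRAW buffer, which is where VBO seems to pay off in practice.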

The drivers that I am using are NVIDIA’s 43.63 release under Linux. Results will obviously vary on different drivers/OSes/graphics hardware, but in general my findings have been positive, save for crashes reported on GeForce2 Go laptops.

Robert.

Hi Robert.

I get the scary feeling that my usage of shorts for vertex coordinates, and mixing VertexAttrib with normal VertexPointer, slows it down. In my other apps I do get a gain, but in this case it runs really badly. 10x slower !! How could I detect that using VBO is 10x slower on a given HW ?? I mean… VBO should be faster in ANY case, right ?

Here is the code used to render:

gzVoid gzIBRGeometry::preTraverseAction( gzTraverseAction *actionclass , gzContext *context)
{
if(actionclass->isExactType(gzRenderAction::getClassType())) // Exactly a graphics action
{
if(!gzGraphicsEngine::has_vertex_program())
return;

  //gzDepthFunc(GZ_LESS);

  gzPushMatrix();

  gzMultMatrixr(&m_transform.v11);

  if(gzGraphicsEngine::has_vertex_buffer_object())
  {
  	gzULong offsetToDepth=m_width*m_height*sizeof(gzShort)*2;

  	if(m_rebindIndex)
  	{
  		m_rebindDepth=TRUE;

  		if(m_bufIndexID)
  		{
  			gzDeleteBuffers(1,&m_bufIndexID);
  			m_bufIndexID=0;
  		}

  		gzGenBuffers(1,&m_bufIndexID);

  		gzBindBuffer(GZ_ELEMENT_ARRAY_BUFFER,m_bufIndexID);

  		gzBufferData(GZ_ELEMENT_ARRAY_BUFFER,2*m_width*sizeof(gzULong),m_indexSet->getIndexAddress(),GZ_STATIC_DRAW);

  		if(m_bufID)
  		{
  			gzDeleteBuffers(1,&m_bufID);
  			m_bufID=0;
  		}

  		gzGenBuffers(1,&m_bufID);

  		gzBindBuffer(GZ_ARRAY_BUFFER,m_bufID);

  		gzBufferData(GZ_ARRAY_BUFFER,m_width*m_height*(sizeof(gzShort)*2+sizeof(gzFloat)),0,GZ_STATIC_DRAW);

  		gzBufferSubData(GZ_ARRAY_BUFFER,0,offsetToDepth,m_indexSet->getXYAddress());

  		m_rebindIndex=FALSE;
  	}
  	else
  	{
  		gzBindBuffer(GZ_ELEMENT_ARRAY_BUFFER,m_bufIndexID);

  		gzBindBuffer(GZ_ARRAY_BUFFER,m_bufID);
  	}

  	if(m_rebindDepth)
  	{
  		gzBufferSubData(GZ_ARRAY_BUFFER,offsetToDepth,m_width*m_height*sizeof(gzFloat),m_depthMap->getArray().getAddress());
  		m_rebindDepth=FALSE;
  	}

  	gzEnableClientState(GZ_VERTEX_ARRAY);

  	gzEnableVertexAttribArray(1);

  	for(gzULong i=0;i<(m_height-1);i++)
  	{
  		gzVertexAttribPointer(1,1,GZ_FLOAT,FALSE,0,(const gzVoid *)(i*m_width*sizeof(gzFloat)+offsetToDepth));

  		gzVertexPointer(2,GZ_SHORT,0,(const gzVoid *)(i*m_width*sizeof(gzShort)*2));

  		gzDrawRangeElements(GZ_TRIANGLE_STRIP,0,2*m_width-1,2*m_width,GZ_UNSIGNED_INT,0);		
  	}

  	gzDisableVertexAttribArray(1);

  	gzBindBuffer(GZ_ELEMENT_ARRAY_BUFFER,0);
  	gzBindBuffer(GZ_ARRAY_BUFFER,0);

  }
  else
  {

  	gzEnableClientState(GZ_VERTEX_ARRAY);
  	
  	gzEnableVertexAttribArray(1);

  	for(gzULong i=0;i<(m_height-1);i++)
  	{
  		gzVertexAttribPointer(1,1,GZ_FLOAT,FALSE,0,((gzFloat *)m_depthMap->getArray().getAddress())+i*m_width);

  		gzVertexPointer(2,GZ_SHORT,0,m_indexSet->getXYAddress()+i*m_width);

  		gzDrawRangeElements(GZ_TRIANGLE_STRIP,0,2*m_width-1,2*m_width,GZ_UNSIGNED_INT,m_indexSet->getIndexAddress());		
  	}

  	gzDisableVertexAttribArray(1);
  }

  gzPopMatrix();

  //gzDepthFunc(context->depthFunc);

}

}
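The byte offsets in the code above (the xy shorts packed first, then the depth floats starting at offsetToDepth) are easy to get wrong, so here is a small sanity-check sketch of the same layout arithmetic. The helper names are made up; the formulas mirror the code:

```cpp
#include <cstddef>
#include <cstdint>

// m_width * m_height xy pairs of 16-bit shorts come first in the buffer,
// followed by m_width * m_height depth floats.
inline std::size_t offsetToDepth(std::size_t w, std::size_t h)
{
    return w * h * sizeof(std::int16_t) * 2; // xy shorts first
}

// Total size passed to gzBufferData for the GZ_ARRAY_BUFFER.
inline std::size_t totalBufferSize(std::size_t w, std::size_t h)
{
    return w * h * (sizeof(std::int16_t) * 2 + sizeof(float));
}

// Byte offset of row i of the depth data, matching the
// gzVertexAttribPointer offset inside the strip loop.
inline std::size_t depthRowOffset(std::size_t w, std::size_t h, std::size_t i)
{
    return offsetToDepth(w, h) + i * w * sizeof(float);
}
```

If totalBufferSize ever disagrees with offsetToDepth plus the depth block, a gzBufferSubData call is writing past the end of the buffer.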

Originally posted by ToolTech:
[b]Hi Robert.

I get the scary feeling that my usage of shorts for vertex coordinates, and mixing VertexAttrib with normal VertexPointer, slows it down. In my other apps I do get a gain, but in this case it runs really badly. 10x slower !! How could I detect that using VBO is 10x slower on a given HW ?? I mean… VBO should be faster in ANY case, right ?[/b]

Eventually I’d hope VBO to be efficient for all vertex formats supported by the hardware, but I think it’s still early days for the driver support. The spec mentions float vertex storage being optimized… cut and pasted from vertex_buffer_object.txt:

2.8A.1 Vertex Arrays in Buffer Objects
--------------------------------------

Blocks of vertex array data may be stored in buffer objects with the
same format and layout options supported for client-side vertex
arrays.  However, it is expected that GL implementations will (at
minimum) be optimized for data with all components represented as
floats, as well as for color data with components represented as
either floats or unsigned bytes.
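Given that wording, one workaround worth trying is widening the 16-bit coordinates to floats once on the CPU before uploading, trading 2x vertex memory for the format the spec promises is optimized. A minimal sketch — shortsToFloats is a hypothetical helper, not part of any API here:

```cpp
#include <vector>
#include <cstdint>

// Widen packed 16-bit xy coordinates to floats before uploading them,
// since vertex_buffer_object only guarantees that all-float vertex data
// is on the optimized path. Done once at load time, not per frame.
inline std::vector<float> shortsToFloats(const std::vector<std::int16_t>& xy)
{
    std::vector<float> out;
    out.reserve(xy.size());
    for (std::int16_t v : xy)
        out.push_back(static_cast<float>(v));
    return out;
}
```

The conversion is exact for the whole 16-bit range, so no precision is lost.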

Robert…

In my case, I have just tested with floats instead of shorts and I get the same results :frowning:

Hmm. Anyone using VBO with VertexAttrib mixed with VertexPointer ?

How often do you rebind the depth?

the ‘if(m_rebindDepth)’

Since you’re defining static buffers, you shouldn’t re-specify them at all, or only very seldom.

Just once. The first time, the depth is uploaded, and then again for each depth update, but in the sample app that only occurs once…
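That matches the usual dirty-flag pattern for GZ_STATIC_DRAW data: leave the buffer alone and only re-upload when the source actually changed, exactly like the m_rebindDepth flag in the code above. A sketch of the pattern — the upload call is injected so it runs without a GL context (in real code it would be the gzBufferSubData call):

```cpp
#include <functional>

// Dirty-flag upload guard: a static buffer is re-specified only when the
// CPU-side data has actually changed. The upload action is injected so
// the pattern can be exercised without a GL context.
class DepthUploader
{
public:
    void markDirty() { m_dirty = true; }

    // Returns true only if an upload actually happened this frame.
    bool uploadIfDirty(const std::function<void()>& upload)
    {
        if (!m_dirty)
            return false;
        upload();
        m_dirty = false;
        return true;
    }

private:
    bool m_dirty = true; // first frame always uploads
};
```

With this in place, a static buffer sees exactly one upload unless someone calls markDirty().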

I have used Quantify on the app, and it shows that glClear and glVertexPointer do all the stalling (97%). They might be doing some flushing etc…

Anyway, I don’t get any GL errors in the code.

OK. Just found something VERY interesting. The stall occurs when I mix VBO rendering with normal vertex arrays. The moving lamp in the demo is rendered with normal vertex arrays. When I remove the lamp geometry, the FPS goes up to 65 ??

Is it forbidden to mix vertex arrays and vertex buffer objects ???

Now, that’s strange. No, you’re allowed to mix and match as you please. I’ve done that myself with no problems (vertices and texcoords with VBO, TS vectors using normal arrays).

Is it ok to do the

gzBindBuffer(GZ_ELEMENT_ARRAY_BUFFER,0);
gzBindBuffer(GZ_ARRAY_BUFFER,0);

to enable the usage of “normal” vertex arrays ?

Do not test performance with debug builds. Make a release build and then compare the results.

Originally posted by ToolTech:
[b]Is it ok to do the

gzBindBuffer(GZ_ELEMENT_ARRAY_BUFFER,0);
gzBindBuffer(GZ_ARRAY_BUFFER,0);

to enable the usage of “normal” vertex arrays ?[/b]
AFAIK yes, but you must resupply all pointers. The GL doesn’t maintain extra vertex, texcoord, etc. pointers to kick in when you turn VBOs off.

(that’s how I understood it anyway)
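That matches my reading too: the pointer argument is interpreted at the moment gzVertexPointer is called, relative to whatever buffer is bound right then, so a pointer set while a VBO was bound is meaningless once you switch back to client arrays. A toy model of that rule — purely illustrative, not the GL API:

```cpp
#include <cstddef>

// Toy model of a single GL vertex-pointer slot: the meaning of the stored
// value (byte offset into a VBO vs. client memory address) is fixed by the
// buffer that was bound when the pointer was set, not at draw time. This
// is why pointers must be resupplied after every bind/unbind switch.
struct VertexPointerSlot
{
    unsigned    bufferAtSetTime = 0; // 0 == client-side arrays
    std::size_t pointerOrOffset = 0;

    void set(unsigned currentlyBoundBuffer, std::size_t p)
    {
        bufferAtSetTime = currentlyBoundBuffer;
        pointerOrOffset = p;
    }

    bool readsFromVBO() const { return bufferAtSetTime != 0; }
};
```

So after gzBindBuffer(GZ_ARRAY_BUFFER, 0), every enabled array needs a fresh gzVertexPointer / gzVertexAttribPointer call with a real client address.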