ATI VBO Performance problems

Hi,

last weekend I played around with VBOs and everything works fine. On my first system, with an nVidia GeForce Ti4800, I get more than 300 frames per second rendering an object with more than 72,000 polygons…

When I test the same code with the same model on my second system, with an ATI Radeon 8500, I only get 140 frames per second. A second test with the next generation of this ATI card reduces the frame rate by another 10 FPS or more…

Does anyone know anything about bottlenecks when using ARB_vertex_buffer_object on ATI-based systems? I can't believe that an older nVidia card should be more than 40% faster than ATI cards of the same or a newer generation…

Thx,
Christian

What vertex format are you using?

Make sure your drivers are up to date. I got a 4x performance boost with VBOs after I updated my ATI 9200 drivers.

The 8500 is an old card; it is not the ATi equivalent of the Ti4800 in performance. Technically, it's more around a Ti4200 (or even a GeForce3), so your numbers are hardly surprising.

When you say “the next generation of this ATI card,” what card are you specifically referring to?

@ Humus
I'm using an array of vector elements, based on a custom class with public float coordinates. All data is stored as triangle triplets and rendered with glDrawArrays as triangles (glDrawElements does not work at the moment). Later, this array should hold the data for a triangle-strip render pass to increase performance.
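Roughly, the layout is something like this (class and member names are simplified here, not the real code):

// simplified sketch: three tightly packed floats per element,
// three consecutive elements forming one triangle
class Vector3
{
public:
	float x, y, z;
};

// m_pObject->getVertices() then hands glBufferDataARB a pointer into an
// array of these, i.e. getVertexCount() * 3 packed floats in total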

@ Rocko and Korval
I use the newest ATI and nVidia drivers. With the next generation I mean the ATI 9000 card series.

When I test the performance of the cards with the Aquamark benchmark, the ATI always gets better results, more than double the values. Only in my rendering process does the ATI lag…

How much data (megabytes) is being put into the buffer? Is it even a megabyte? If it isn’t, I’m stumped. Otherwise, I think I read right on these forums that there is a limit to how many vertices one should try to draw from a buffer at once. Look here:

http://www.opengl.org/discussion_boards/cgi_directory/ultimatebb.cgi?ubb=get_topic;f=3;t=012324

I had suggested to the original poster in that thread to check their drivers as well, hehe.

I use the newest ATI and nVidia drivers. With the next generation I mean the ATI 9000 card series.
I meant specific cards. A 9200 runs on the same R200 core as the 8500. A 9500 and above runs on the far more advanced R300 core.

@Korval
I've tested both a 9200 and a 9600, with the same result: I get 50% less performance compared to nVidia cards of the corresponding generation…

@Rocko
it's more than one megabyte (roughly 2.5 MB — about 72,000 triangles × 3 vertices × 3 floats × 4 bytes), and yes, both are the newest drivers (Catalyst and Forceware). I don't think I've hit a vertex limit (the GPU Gems book says something about needing 7,000+ vertices, but nothing about an upper limit), and why should the ATI be limited and the nVidia not?

I've compared the performance against normal rendering and VBOs more than double it, but all the nVidia cards get an additional increase of 30-50%…

ATi cards are known to be sensitive to the alignment of the incoming data. How is your vertex data aligned?
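For example, interleaving the attributes and padding each vertex to 32 bytes (the struct below is just an illustration, not your actual format) is usually a safe layout:

// illustration only: one interleaved vertex, padded to 32 bytes,
// so every attribute sits at a fixed, well-aligned offset
struct Vertex
{
	float x, y, z;     // position,  offset  0
	float u, v;        // texcoords, offset 12
	float nx, ny, nz;  // normal,    offset 20  -> 32 bytes total
};

glVertexPointer(3, GL_FLOAT, sizeof(Vertex), (char *) NULL);
glTexCoordPointer(2, GL_FLOAT, sizeof(Vertex), (char *) NULL + 12);
glNormalPointer(GL_FLOAT, sizeof(Vertex), (char *) NULL + 20);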

Since no one else is having any problems similar to this, it must be your rendering code. Could you post a representative sample for us to look at?

I can't believe the problem is in the code, but as you wish, here is the render code:

// enable the client states we need
glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);

// bind the vertex VBO
glBindBufferARB(GL_ARRAY_BUFFER_ARB, *m_pObject->getVBONameID());

// set the vertex pointer to the start of the bound buffer
glVertexPointer(3, GL_FLOAT, 0, (char *) NULL);

// set up the texture (separate helper method)
mapTexture();

// draw the buffer contents as independent triangles
glDrawArrays(GL_TRIANGLES, 0, m_pObject->getVertexCount());

// disable the client states again
glDisableClientState(GL_VERTEX_ARRAY);
glDisableClientState(GL_TEXTURE_COORD_ARRAY);

And here is the part that sets up the IDs:

// create and bind the buffer object for the vertex data
glGenBuffersARB(1, m_pObject->getVBONameID());
glBindBufferARB(GL_ARRAY_BUFFER_ARB, *m_pObject->getVBONameID());

// upload the vertex data
glBufferDataARB(GL_ARRAY_BUFFER_ARB, m_pObject->getVertexCount() * 3 *
		sizeof(float), m_pObject->getVertices(), GL_DYNAMIC_DRAW_ARB);

// do the same for the texture coordinates
glGenBuffersARB(1, m_pObject->getVBOTextureNameID());
glBindBufferARB(GL_ARRAY_BUFFER_ARB, *m_pObject->getVBOTextureNameID());

glBufferDataARB(GL_ARRAY_BUFFER_ARB, m_pObject->getVertexCount() * 2 *
		sizeof(float), m_pObject->getTextureCoords(), GL_STATIC_DRAW_ARB);

What does mapTexture() do?
Why is the vertex buffer GL_DYNAMIC_DRAW_ARB?

mapTexture() just maps the texture; I use a separate method for it for easier handling, nothing more… It has no influence: when I disable texture mapping, I get the same problems…

The buffer is dynamic only as a test, to see how much performance that costs. I know it's better to use static, but the nVidia whitepaper on VBOs says I should use dynamic when the geometry changes often… at the moment, I'm testing this.

Try setting it to static and see if it changes performance.
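That is, the same glBufferDataARB call as in your setup code, just with the static hint:

// same upload, only the usage hint changed from dynamic to static
glBufferDataARB(GL_ARRAY_BUFFER_ARB, m_pObject->getVertexCount() * 3 *
		sizeof(float), m_pObject->getVertices(), GL_STATIC_DRAW_ARB);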

It works!!

I've set the buffer to static; that wasn't the problem, I only got 10-30 frames more…

But I think there was a driver problem, as you told me. I reinstalled my system for XP Service Pack 2 and installed only the new drivers, and now the ATI works like it should ;o)

Thanks a lot for all the hints!

Now I have a second problem, perhaps some of you can help me again. After everything was working, I tried to use glDrawElements.

This works fine with the geometry data and I get a performance increase of 20-30 frames again. Fine, but I have a lot of trouble with the textures…

With glDrawArrays everything works fine; now, with glDrawElements and the same texturing calls, I get ugly textures where everything looks like fractals…

I've checked many sources on the net, but found nothing that says I have to change the way I do texturing when I use glDrawElements instead of glDrawArrays…

Here is how I bind the multi-page textures:

// select texture unit 0 and bind the texture
glActiveTextureARB(GL_TEXTURE0_ARB);
glBindTexture(GL_TEXTURE_2D, *m_pObject->getTextureIDReference());

// bind the texcoord VBO and set the texcoord pointer
glBindBufferARB(GL_ARRAY_BUFFER_ARB, *m_pObject->getVBOTextureNameID());
glTexCoordPointer(2, GL_FLOAT, 0, (char *) NULL);

And this is how I set it up for the GPU:

glGenBuffersARB(1, m_pObject->getVBOTextureNameID());
glBindBufferARB(GL_ARRAY_BUFFER_ARB, *m_pObject->getVBOTextureNameID());

glBufferDataARB(GL_ARRAY_BUFFER_ARB, m_pObject->getVertexCount() * 2 *
		sizeof(float), m_pObject->getTextureCoords(), GL_STATIC_DRAW_ARB);

I've checked it; the number of vertices and texture coordinates is equal…

Make sure all glTexCoordPointer calls go to the correct glClientActiveTexture setting, and don't forget to glEnableClientState(GL_TEXTURE_COORD_ARRAY) for them. But if it worked with glDrawArrays…?
If this happens with only one texcoord array, that should default to glClientActiveTextureARB(GL_TEXTURE0_ARB), and you might have found another driver bug.
Post some more rendering code.
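For a single texcoord array on unit 0, I mean something like this (using your buffer names):

// client-side texcoord state is per client texture unit:
// select the unit first, then enable and point its array
glClientActiveTextureARB(GL_TEXTURE0_ARB);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);

glBindBufferARB(GL_ARRAY_BUFFER_ARB, *m_pObject->getVBOTextureNameID());
glTexCoordPointer(2, GL_FLOAT, 0, (char *) NULL);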

GL_TEXTURE_COORD_ARRAY is enabled (done in the render call, see above), and normally I use a switch statement over the texture constants depending on the number of textures used. Just for clarity, I've set GL_TEXTURE0_ARB here. But I actually use only one texture…

Thanks, I had forgotten the glClientActiveTexture call, but that wasn't the problem. On both cards, ATI and nVidia, I get the same texture-mapping error…

The same mapping error occurs in all render methods I use (ATI objects, immediate mode, etc.) where I call glDrawElements instead of glDrawArrays. Here is the glDrawElements call:

glDrawElements(GL_TRIANGLES, m_pObject->getVertexIndicesCount(), GL_UNSIGNED_INT, m_pObject->getVertexIndices());