
View Full Version : VBO problem



red_gfx
04-01-2009, 11:28 PM
Hello everyone

I am working on a terrain that I generated from a heightfield. At first I used immediate-mode rendering and then vertex arrays, both of which worked fine. Now I have switched to VBOs, but nothing gets rendered. I use a vector (vertexData) of VertexData structures, each holding position, color, normal and texture coordinates.

Here is my code: - initialize VBOs



void Terrain::buildTerrain()
{ glGenBuffers(6,bufferObjects);

// vertex
glBindBuffer(GL_ARRAY_BUFFER,bufferObjects[VERTEX_DATA]);
glBufferData(GL_ARRAY_BUFFER, nrVertices * 3 * sizeof(GL_FLOAT),vertexData, GL_STATIC_DRAW);


// color
glBindBuffer(GL_ARRAY_BUFFER, bufferObjects[COLOR_DATA]);
glBufferData(GL_ARRAY_BUFFER, nrVertices * 4 * sizeof(GL_FLOAT), (float*)vertexData + 3, GL_STATIC_DRAW);

// normals
glBindBuffer(GL_ARRAY_BUFFER, bufferObjects[NORMAL_DATA]);
glBufferData(GL_ARRAY_BUFFER, nrVertices * 3 * sizeof(GL_FLOAT), (float*)vertexData + 7, GL_STATIC_DRAW);


// 1'st texture coordinates
glBindBuffer(GL_ARRAY_BUFFER, bufferObjects[TEXTURE_DATA_1]);
glBufferData(GL_ARRAY_BUFFER, nrVertices * 2 * sizeof(GL_FLOAT), (float*)vertexData + 10, GL_STATIC_DRAW);


// 2'nd texture coordinates
glBindBuffer(GL_ARRAY_BUFFER, bufferObjects[TEXTURE_DATA_2]);
glBufferData(GL_ARRAY_BUFFER, nrVertices * 1 * sizeof(GL_FLOAT), (float*)vertexData + 12, GL_STATIC_DRAW);

//generate the index array
ptrIndexArray = new unsigned short[length * width * 2];
unsigned short *ptrTemp = ptrIndexArray;

for (int y = 0; y < length - 1; y++)
{
for (int x=0; x < width; x++)
{
*ptrTemp++ = y * width + x;
*ptrTemp++ = (y + 1) * width + x;
}
}

// index array
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, bufferObjects[INDEX_DATA]);
glBufferData(GL_ELEMENT_ARRAY_BUFFER, length * width * 2 * sizeof(GLushort), ptrIndexArray, GL_STATIC_DRAW);

}


- here I draw


void Terrain::renderTerrain()
{
glEnableClientState(GL_VERTEX_ARRAY);
glBindBuffer(GL_ARRAY_BUFFER, bufferObjects[VERTEX_DATA]);
glVertexPointer(3, GL_FLOAT, sizeof(VertexData), vertexData);

glEnableClientState(GL_COLOR_ARRAY);
glBindBuffer(GL_ARRAY_BUFFER, bufferObjects[COLOR_DATA]);
glColorPointer(4, GL_FLOAT, sizeof(VertexData), (float*)vertexData + 3);

glEnableClientState(GL_NORMAL_ARRAY);
glBindBuffer(GL_ARRAY_BUFFER, bufferObjects[NORMAL_DATA]);
glNormalPointer(GL_FLOAT, sizeof(VertexData), (float*)vertexData + 7);

glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glBindBuffer(GL_ARRAY_BUFFER, bufferObjects[TEXTURE_DATA_1]);
glClientActiveTextureARB(GL_TEXTURE0_ARB);
glTexCoordPointer(2, GL_FLOAT, sizeof(VertexData), (float*)vertexData + 10);

glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glBindBuffer(GL_ARRAY_BUFFER, bufferObjects[TEXTURE_DATA_2]);
glClientActiveTextureARB(GL_TEXTURE1_ARB);
glTexCoordPointer(1, GL_FLOAT, sizeof(VertexData), (float*)vertexData + 12);

glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, bufferObjects[INDEX_DATA]);


for(int z = 0; z < length - 1; z++)
{
glDrawElements(GL_TRIANGLE_STRIP, width * 2, GL_UNSIGNED_SHORT, ptrIndexArray + z * width * 2);
}

}


Does anyone know what I'm doing wrong?

Many thanks

Jan
04-02-2009, 01:55 AM
You do use an index buffer, but in glDrawElements you still seem to pass a pointer as last argument. When you use an index-buffer with VBOs the last argument to any drawcall is a "buffer-offset", that is an offset in bytes, that says where to start reading from the index-buffer. It is NOT a pointer anymore. For example, if you only wanted to skip the first 3 elements and your indices are unsigned shorts, the offset would be 6.

IIRC the VBO spec has an example at the end explaining it in more detail.

Jan.
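The byte-offset rule Jan describes can be sketched as below. This is only an illustration, not code from the thread: the helper names indexBufferOffset and stripRowOffset are hypothetical, and the sketch assumes an index buffer is bound to GL_ELEMENT_ARRAY_BUFFER when the result is used as the last argument of glDrawElements.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: byte offset of the element at position firstIndex
// inside an index buffer whose elements have type IndexT.
template <typename IndexT>
constexpr std::size_t indexBufferOffset(std::size_t firstIndex) {
    return firstIndex * sizeof(IndexT);
}

// Jan's example: skipping the first 3 unsigned-short indices is 6 bytes.
static_assert(indexBufferOffset<std::uint16_t>(3) == 6, "3 indices * 2 bytes");

// Hypothetical helper: byte offset of the strip for terrain row `row`,
// assuming every strip uses width * 2 indices as in the first post.
template <typename IndexT>
constexpr std::size_t stripRowOffset(std::size_t row, std::size_t width) {
    return indexBufferOffset<IndexT>(row * width * 2);
}
```

With a buffer bound, the draw call in the first post would then pass something like reinterpret_cast<const void*>(stripRowOffset<std::uint16_t>(z, width)) instead of ptrIndexArray + z * width * 2.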

Ilian Dinev
04-02-2009, 02:06 AM
It's all wrong >_<. Here's what you do:
You initially have an interleaved array (your "vertexData" var). But then you create 6 VBOs. You're passing pointers correctly, but hey - you copy interleaved data onto the buffers, not contiguous triples or quads of floats. Then, on drawing, you still specify the "vertexData" pointer, but you shouldn't. Once a buffer is bound to GL_ARRAY_BUFFER, the pointer you pass should be just the byte offset from the start of the buffer, not a pointer in memory. And oh.. you use triangle-strips. ( Where do I find the tutorial that leads so many newcomers to disastrous paths? ). Thus, you call glDrawElements a THOUSAND times T_T, practically killing any performance boost VBOs would give.

So, here's the corrected code:


unsigned int theVBO; // a member of "class Terrain"
unsigned int theIBO;
static const int OneVertexSize = (3+4+3+2+1)*4;


void Terrain::buildTerrain()
{

glGenBuffers(1,&theVBO);
glBindBuffer(GL_ARRAY_BUFFER,theVBO);
glBufferData(GL_ARRAY_BUFFER, nrVertices * OneVertexSize,vertexData, GL_STATIC_DRAW);


ptrIndexArray = new unsigned int[...]; // unless you'll be drawing terrains only up to 256x256 ...

.. bla bla
.. hey, here actually create indices for no [censored] triangle-strips!


// index array
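glGenBuffers(1,&theIBO);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, theIBO);
glBufferData(GL_ELEMENT_ARRAY_BUFFER, numTriangles * 3 * sizeof(GLuint), ptrIndexArray, GL_STATIC_DRAW); // size must match the unsigned int indices drawn below

delete[] vertexData; // assuming it was allocated with new[]; the VBO now holds its own copy
delete[] ptrIndexArray;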
}



void Terrain::renderTerrain()
{

glBindBuffer(GL_ARRAY_BUFFER, theVBO);

glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(3, GL_FLOAT, OneVertexSize, (void*)0);

glEnableClientState(GL_COLOR_ARRAY);
glColorPointer(4, GL_FLOAT, OneVertexSize, (void*)12);

glEnableClientState(GL_NORMAL_ARRAY);
glNormalPointer(GL_FLOAT, OneVertexSize, (void*)28);


glClientActiveTextureARB(GL_TEXTURE0_ARB);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glTexCoordPointer(2, GL_FLOAT, OneVertexSize, (void*)40);

glClientActiveTextureARB(GL_TEXTURE1_ARB);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glTexCoordPointer(1, GL_FLOAT, OneVertexSize, (void*)48);

glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, theIBO);
glDrawElements(GL_TRIANGLES, numTriangles*3, GL_UNSIGNED_INT,NULL);

glDisableClientState(GL_VERTEX_ARRAY);
glDisableClientState(GL_COLOR_ARRAY);
glDisableClientState(GL_NORMAL_ARRAY);
glClientActiveTextureARB(GL_TEXTURE1_ARB);
glDisableClientState(GL_TEXTURE_COORD_ARRAY);
glClientActiveTextureARB(GL_TEXTURE0_ARB);
glDisableClientState(GL_TEXTURE_COORD_ARRAY);

glBindBuffer(GL_ARRAY_BUFFER, 0); // buffer names are integers, so unbind with 0
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0);
}
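The hard-coded offsets 0, 12, 28, 40 and 48 in the code above follow from the interleaved layout. A minimal sketch of how they could be derived, assuming a hypothetical VertexData struct of 13 tightly packed floats (the thread never shows the real definition, so the member names here are made up):

```cpp
#include <cstddef>

// Assumed layout matching OneVertexSize = (3+4+3+2+1)*4 bytes.
struct VertexData {
    float position[3];   // 3 floats
    float color[4];      // 4 floats
    float normal[3];     // 3 floats
    float texCoord0[2];  // 2 floats, first texture unit
    float texCoord1;     // 1 float, second texture unit
};

// These checks fail at compile time if the compiler's layout ever
// disagrees with the offsets passed to the gl*Pointer calls.
static_assert(sizeof(float) == 4, "sketch assumes 4-byte floats");
static_assert(sizeof(VertexData) == 52, "stride, i.e. OneVertexSize");
static_assert(offsetof(VertexData, position) == 0, "glVertexPointer offset");
static_assert(offsetof(VertexData, color) == 12, "glColorPointer offset");
static_assert(offsetof(VertexData, normal) == 28, "glNormalPointer offset");
static_assert(offsetof(VertexData, texCoord0) == 40, "unit 0 offset");
static_assert(offsetof(VertexData, texCoord1) == 48, "unit 1 offset");
```

Passing sizeof(VertexData) as the stride and reinterpret_cast<const void*>(offsetof(VertexData, color)) as the pointer argument would keep the draw code in step with the struct if a member is ever added or reordered.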

red_gfx
04-02-2009, 04:45 AM
Ilian I have read again about VBOs and with your explanations I have understood VBOs.
Now it works fine with some minor modification. Instead of using:

glDrawElements(GL_TRIANGLES, numTriangles * 3, GL_UNSIGNED_INT,NULL);

I used:

for(int z = 0; z < length - 1; z++)
{
glDrawElements(GL_TRIANGLE_STRIP, width * 2, GL_UNSIGNED_INT, (void*)(z * width * 2 * 4));
}


And here are just a few links that claim triangle strips are more efficient than plain triangles:
http://www.lighthouse3d.com/opengl/terrain/index.php3?heightmap
http://www.codeproject.com/KB/openGL/OPENGLTG.aspx
http://www.videotutorialsrock.com/opengl_tutorial/terrain/text.php

What do you suggest I use? I'm asking because, besides the terrain, I have also made animated water with triangle strips, and I really want to take advantage of VBOs.

Thanks for your help

Ilian Dinev
04-02-2009, 07:28 AM
If your vertices have more attributes than a simple position, tristrips take-up almost 2 times more memory, and use 1000 (height) times more draw-calls.
Consider this: a frame from Crysis consists of 3000 draw-calls :).

videotutorialsrock.com hmm. The site that makes you install the first ever software implementation of OpenGL1.0 .. from 1992 :P ?
Or the other two sites, that use glBegin/etc :)

Jan
04-02-2009, 09:50 AM
Erm what?

I agree that you usually need a lot more drawcalls to render the same thing with tri-strips, but how should it take up that memory? You store the same vertices, only different indices (and those should be LESS). And 1000 times more drawcalls? Where did you pull that number from?? It depends on your usage and your data, but if you make it a 1000 times more drawcalls you are doing something heavily wrong.

And 3000 drawcalls per frame in Crysis? Any slide or something which explains that? Would be very interested to know more about it. 3000 drawcalls in OpenGL is slow, not sure whether D3D(9) would run Crysis with "decent" (haha) framerates, at all.

Jan.
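Jan's point about index counts can be checked with a little arithmetic. The sketch below assumes a regular width x length vertex grid with both dimensions at least 2, one strip per row as in the first post, and no vertex sharing tricks; the function names are hypothetical.

```cpp
#include <cstddef>

// Indices needed when each of the (length - 1) rows is one triangle strip
// of width * 2 indices.
constexpr std::size_t stripIndexCount(std::size_t width, std::size_t length) {
    return (length - 1) * width * 2;
}

// Indices needed for a plain triangle list: two triangles (6 indices)
// per grid cell, and (length - 1) * (width - 1) cells.
constexpr std::size_t listIndexCount(std::size_t width, std::size_t length) {
    return (length - 1) * (width - 1) * 6;
}

// Draw calls needed when every row strip is drawn separately.
constexpr std::size_t stripDrawCalls(std::size_t length) {
    return length - 1;
}

// For a 512 x 512 terrain the vertex data is identical in both cases;
// only the index data differs.
static_assert(stripIndexCount(512, 512) == 523264, "strip indices");
static_assert(listIndexCount(512, 512) == 1566726, "list indices");
static_assert(stripDrawCalls(512) == 511, "one draw call per row");
```

So for this grid the strips need roughly a third of the indices of the triangle list, at the cost of 511 draw calls instead of one.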

Ilian Dinev
04-02-2009, 09:03 PM
You're right, Jan. I was thinking about tristrips naively.
P.S.: mostly because I hate tristrips and heightmap terrains...

scratt
04-02-2009, 09:32 PM
Well that's a relief. I was wracking my brains trying to understand where you were coming from too! ;)

Ilian Dinev
04-03-2009, 12:39 AM
And about those 3000 drawcalls... there's Pix and the result is 1640 draw calls from a frame I keep; but my first test was 3k draw calls.

Let's do some tests with GL3.0, several different meshes but same materials on my PC. I will show how many draw-calls max we can do to keep 60fps.
Before each draw-call we upload 1kB uniforms.
viewport is made to cull the triangles

1) Naively binding same vbo and attribs before each draw-call:
- 350 draws of ~21,000 tris
- 3500 draws of ~2,100 tris
- 35000 draws of 215 tris
- 45000 draws of 43 tris
- 45000 draws of 10 tris
- 45000 draws of 1 tri

2) Naively switching between 128 vbos with similar attributes:
- 350 draws of ~21k tris
- 3500 draws of ~2.1k tris
- 22000 draws of ~200 tris
- 22000 draws of ~20 tris

3) Naively switching between vbos with different attributes:
- 300 draws of ~21k tris
- 3000 draws of ~2.1k
- 18000 draws of ~200 tris
- 19000 draws of ~20 tris

4) one display list: (reverted to gl2.1 for this)
- 350 draws of 21k tri
- 3500 draws of 2.1k tri
- 35000 draws of 200 tri

5) several display lists:
- 350
- 3500
- 35000

6) binding vbo and its attribs, calling glDrawElements many times:
- 350
- 3600
- 37000 draws of 215 tri
- 92000 draws of 43 tri
- 93000 draws of 21 tri
- 93000 draws of 10 tri
- 93000 draws of 1 tri

7) binding vbo/attribs, using instancing and using only one call to glDrawElementsInstanced:
- 350
- 3500
- 38000 instances of 215 tri
- 140000 instances of 43 tri
- 140000 instances of 10 tri
- 140000 instances of 1 tri

I guess the limit of my gpu is 7.5 million triangles/frame.
c2d E8500 @3.8GHz (6MB L2), DDR3 @1.6GHz timing 7-7-7-20, GF8600GT @stock clocks. WinXP SP2. Drivers: 182.47
(programmers' dream machine, I might say :) )


Vtx shaders are as simple as:


varying vec4 varColor: TEX0;

attribute vec4 inColor : ATTR1;
attribute vec4 inColor2 : ATTR2;
uniform mat4 mvpx : C0;
uniform mat4 mvpx2 : C4;

void main(){
gl_Position = mvpx * gl_Vertex;
varColor = inColor+inColor2;
}

Making the shaders simpler or a bit more complex did not change the benchmark results by more than 3%.

Ilian Dinev
04-03-2009, 01:17 AM
P.S. it's understandable why my gpu's limit is ~7.5 million tris/frame: afaik geforces can only setup one triangle per cycle. And at a 540MHz clock, the limit is 9 million tris/frame.
GeForce GTX 285 runs at 648MHz core freq.
That 3k DX9 frame consisted of 750k tris iirc.
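The numbers in the two posts above can be reproduced with simple arithmetic. This sketch takes Ilian's premises at face value (one triangle set up per core clock, a 60 fps target) and uses hypothetical function names; it does not measure anything.

```cpp
#include <cstdint>

// Upper bound on triangles per frame if the GPU sets up at most one
// triangle per core clock cycle (Ilian's stated assumption).
constexpr std::uint64_t trianglesPerFrameLimit(std::uint64_t coreClockHz,
                                               std::uint64_t framesPerSecond) {
    return coreClockHz / framesPerSecond;
}

// Triangles submitted per frame in one benchmark row.
constexpr std::uint64_t trianglesSubmitted(std::uint64_t drawCalls,
                                           std::uint64_t trisPerDraw) {
    return drawCalls * trisPerDraw;
}

// 540 MHz at 60 fps gives the quoted 9 million triangles per frame.
static_assert(trianglesPerFrameLimit(540000000ULL, 60) == 9000000ULL, "limit");

// Benchmark rows from case 1: both land near the guessed 7.5 million.
static_assert(trianglesSubmitted(350, 21000) == 7350000ULL, "350 x ~21,000");
static_assert(trianglesSubmitted(35000, 215) == 7525000ULL, "35000 x 215");
```

The rows with large batches sit at about 7.35 to 7.5 million triangles per frame, a little over 80% of the 9 million bound; the rows with tiny batches fall far below it, which is where the per-draw-call cost dominates.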

Jan
04-03-2009, 04:13 AM
You are saying at 200 triangles per DC you can make ~20000 drawcalls? Wow, not bad. I never benchmarked it, but I have an application where, above 2000 drawcalls per frame, the framerate went down considerably with every additional drawcall. Of course it was much more complicated: a deferred renderer with heavy shading operations and a big mesh (about 2.2 million unique triangles), usually running at high resolutions (about 1200*something) at ~30 FPS. That's why I doubted that Crysis could run well with 3000 drawcalls per frame, because in my experience, in a real application with all the other computational overhead, drawcalls begin to become a limiting factor at some point. And as we all know, D3D9 (which Crysis also runs on) has a much higher drawcall overhead.

Jan.

red_gfx
04-03-2009, 04:18 AM
I made a change and used GL_TRIANGLES, so I call glDrawElements only once. The result (for a terrain of 512*512) was 92 FPS, while GL_TRIANGLE_STRIP with many calls gave me 108 FPS (even though I would have expected triangle strips to take up more memory).
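For reference, a single-call GL_TRIANGLES index list for such a grid could be built along these lines. This is a sketch under assumptions, not red_gfx's actual code: the function name is hypothetical, vertices are assumed to be stored row by row (index = y * width + x) as in the first post, and 32-bit indices are used because a 512*512 grid has more than 65536 vertices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds two triangles per grid cell, using the same winding that the
// per-row triangle strips from the first post would produce.
std::vector<std::uint32_t> buildGridTriangleIndices(std::uint32_t width,
                                                    std::uint32_t length) {
    assert(width >= 2 && length >= 2);
    std::vector<std::uint32_t> indices;
    indices.reserve(static_cast<std::size_t>(length - 1) * (width - 1) * 6);

    for (std::uint32_t y = 0; y + 1 < length; ++y) {
        for (std::uint32_t x = 0; x + 1 < width; ++x) {
            const std::uint32_t topLeft = y * width + x;
            const std::uint32_t topRight = topLeft + 1;
            const std::uint32_t bottomLeft = (y + 1) * width + x;
            const std::uint32_t bottomRight = bottomLeft + 1;

            // First triangle of the cell.
            indices.push_back(topLeft);
            indices.push_back(bottomLeft);
            indices.push_back(topRight);
            // Second triangle of the cell.
            indices.push_back(topRight);
            indices.push_back(bottomLeft);
            indices.push_back(bottomRight);
        }
    }
    return indices;
}
```

The resulting vector would be uploaded once with glBufferData (indices.data(), indices.size() * sizeof(std::uint32_t)) and drawn with one glDrawElements(GL_TRIANGLES, indices.size(), GL_UNSIGNED_INT, 0) call.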

Ilian Dinev
04-03-2009, 04:54 AM
red_gfx, ignore my comment about tristrips; it was erroneous, as already pointed-out. Tristrips: faster and less memory.

Jan: just do mind that my PC is of the second-best type for single-threaded stuff, highly tuned-up ;)

I noticed on a dozen recent and older NV drivers I've tested that, no matter whether I bind the same object/attribute-array/etc, the speed is always the same. glDrawElements: e.g. 1800 cycles, glEnableClientState: 250 cycles, etc etc. So, you ought to cache the stuff yourself.

red_gfx
04-03-2009, 07:38 AM
Hey guys,

I have applied VBOs to my water as well (the water is a grid animated with a sine function and Perlin noise), and my frame rate was only 29 FPS, while with vertex arrays it was 36 FPS. So I am sticking with vertex arrays (but my whole scene drops to 36 FPS). Does anyone have any ideas on how I can improve the performance?

Here is a screenshot: http://image74.webshots.com/74/7/14/70/2780714700105011880pvoGwm_ph.jpg

ZbuffeR
04-04-2009, 01:24 AM
I get "403 forbidden" when clicking your link.

Do you still use GL_STATIC_DRAW? Your vertex data changes each frame, so GL_STREAM_DRAW (or GL_DYNAMIC_DRAW if you render the same geometry multiple times per frame) would be better.
Some more performance tips here :
http://www.ozone3d.net/tutorials/opengl_vbo_p2.php?lang=2

To better understand your performance bottleneck, you can try rendering the same view but much smaller, i.e. in a 102*76 viewport. If the framerate goes way up, you are limited by per-pixel work rather than by vertex submission, which means you have to optimize something else.

red_gfx
04-04-2009, 04:08 AM
Thanks for the answer, ZbuffeR.

I use GL_STREAM_DRAW and with this one I get only 29FPS. And I changed the size of my viewport from 640*640 to 100*100 but no difference. I checked the link you told me and I implemented Vertex Mapping but I got the same frame rate. I'm sure that with shaders and vbo it will be better but shaders are a long way to go...

I've uploaded the file again: http://www.kamino-prod.com/water.jpg