Buffer vs immediate



Beiufin
06-26-2012, 09:50 AM
I have a stream of points coming in. I need to know which method is the best way to render them.

I know a VBO is much, much more efficient than immediate mode, but as I said I have a constant stream of points coming in. As far as I know, if I used a VBO I would either need to create an array large enough for the maximum number of points that could be on screen at one time (and then search through the whole thing to remove or add a point) or recreate the vertex array every time it is updated. Both seem very inefficient, and I'm pretty sure there is a better way of doing this that I don't know of...

Currently I just have an ever-changing deque of points that I iterate through, defining a vertex for each one in immediate mode.

What's the best method for rendering an ever-changing (new additions and removals) list of vertices (rendered as points)?

mhagain
06-26-2012, 10:51 AM
A VBO would still be preferable. You don't have to create one that's large enough for your hypothetical max number of points; just create one that's a nice size for a comfortable number of points - say, 65536 or thereabouts (you can experiment with different values and tune it to what works best for you). Make it GL_STREAM_DRAW and pass NULL as the data in your initial BufferData call.

At the start of the frame, MapBufferRange it with invalidate buffer, unsynchronized and write-only. Do nothing else yet - definitely don't unmap. Also, keep a counter and init it to 0 here.
Each time you receive a point, write it to the mapped VBO pointer, increment the pointer by one and the counter by one.
If the counter == the max you can hold in the buffer, unmap, DrawArrays, MapBufferRange again (same params - invalidate/unsynch/write-only) and reset the counter to 0.
At the end of the frame, unmap, and if the counter is non-0, DrawArrays.

That will do it and give you very good performance.

If this doesn't suit your usage requirements, you can modify the pattern slightly. For example, you may not be receiving the full stream of points each frame and may need to keep previously received points for more than one frame. In that case, retain your existing list as-is, add/remove points from it as required, then when the time comes to draw the list just iterate through it using the same pattern as above: MapBufferRange, add point to buffer, check for full, draw and re-init if full, draw anything left over.

If this seems confusing I'll happily provide sample code.
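
For readers who want to see the counter logic on its own before the OpenGL calls get involved, here is a minimal GL-free sketch. PointBatcher and its flush callback are hypothetical names, not part of any library; in the real thing the buffer would be the mapped VBO pointer and the callback would unmap, call glDrawArrays and map again.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical point type, matching the layout used later in the thread.
struct drawpoint
{
    float position[3];
    unsigned char color[4];
};

// GL-free stand-in for the mapped-buffer pattern: fill a fixed-size buffer,
// hand it to a callback whenever it is full, then start again from slot 0.
class PointBatcher
{
public:
    typedef std::function<void (const drawpoint *, std::size_t)> FlushFn;

    PointBatcher (std::size_t capacity, FlushFn flush)
        : buffer_ (capacity), count_ (0), flush_ (flush)
    {
        assert (capacity > 0);   // a zero-sized buffer could never accept a point
    }

    // "Write the point to the mapped pointer, increment the counter."
    void add (const drawpoint &p)
    {
        buffer_[count_++] = p;
        if (count_ == buffer_.size ()) flushNow ();   // buffer full: draw and reset
    }

    // End of frame: draw anything left over.
    void endFrame ()
    {
        if (count_ > 0) flushNow ();
    }

private:
    void flushNow ()
    {
        flush_ (buffer_.data (), count_);   // real code: unmap, glDrawArrays, map again
        count_ = 0;
    }

    std::vector<drawpoint> buffer_;
    std::size_t count_;
    FlushFn flush_;
};
```

With a capacity of 4 and 10 incoming points, the callback would fire three times, receiving 4, 4 and 2 points.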

Beiufin
06-26-2012, 11:10 AM
I have never used a buffer object setup before, so I have a few questions about where to place certain functions.

From what I've read, I know I'll need to call:
glGenBuffers
glDeleteBuffers
glBindBuffer
glBufferData
glVertexPointer
glEnableClientState
glDrawArrays
glDisableClientState

But I have no clue when or how often I need to call these. Which need to be called in my Draw() method, and which should be called once at the start of the program?

mhagain
06-26-2012, 11:27 AM
Easiest way is to use something like GLEW to get the function pointers. I assume from your mention of glEnableClientState that you're not using (or proposing to use) generic attrib arrays, so try this for size:

#include <cstring>   // memcpy
#include <vector>
#include <GL/glew.h> // assumes GLEW supplies the buffer object entry points

// your point struct might look something like this
struct drawpoint
{
    float position[3];
    unsigned char color[4];
};

// to do - experiment with different sizes and tune for best perf
#define MAX_POINTS 65536

// let's not have code stretching 4 miles across the screen
#define BUFFER_MAP_BITS (GL_MAP_WRITE_BIT | GL_MAP_UNSYNCHRONIZED_BIT | GL_MAP_INVALIDATE_BUFFER_BIT)

// using a vector here; just change it to whatever container type you prefer
std::vector<drawpoint> myWonderfulPoints;
GLuint pointsvbo = 0;

// call this once only at startup
void CreateMeABuffer (void)
{
    glGenBuffers (1, &pointsvbo);
    glBindBuffer (GL_ARRAY_BUFFER, pointsvbo);
    glBufferData (GL_ARRAY_BUFFER, MAX_POINTS * sizeof (drawpoint), NULL, GL_STREAM_DRAW);
    glBindBuffer (GL_ARRAY_BUFFER, 0);
}

// call this every frame to draw stuff!
void DrawMeSomePoints (void)
{
    drawpoint *bufpoints = NULL;
    int pointcount = 0;

    glBindBuffer (GL_ARRAY_BUFFER, pointsvbo);

    glEnableClientState (GL_VERTEX_ARRAY);
    glEnableClientState (GL_COLOR_ARRAY);

    // because bufpoints points to NULL here, this will work (yes, I did test it)
    glVertexPointer (3, GL_FLOAT, sizeof (drawpoint), bufpoints->position);
    glColorPointer (4, GL_UNSIGNED_BYTE, sizeof (drawpoint), bufpoints->color);

    bufpoints = (drawpoint *) glMapBufferRange (GL_ARRAY_BUFFER, 0, MAX_POINTS * sizeof (drawpoint), BUFFER_MAP_BITS);

    // you may prefer to use a proper iterator here
    // (the loop is skipped, or stops early, if mapping fails)
    for (size_t i = 0; bufpoints != NULL && i < myWonderfulPoints.size (); i++)
    {
        // adds current point to buffer
        memcpy (&bufpoints[pointcount], &myWonderfulPoints[i], sizeof (drawpoint));

        // go to next slot in buffer
        pointcount++;

        // check for potential overflow
        if (pointcount == MAX_POINTS)
        {
            // the buffer must be unmapped before we can draw from it
            glUnmapBuffer (GL_ARRAY_BUFFER);

            // draw what we got so far
            glDrawArrays (GL_POINTS, 0, pointcount);

            // re-init buffer and counter - if this fails we've got bigger problems
            // on our hands than not being able to get a usable pointer!
            bufpoints = (drawpoint *) glMapBufferRange (GL_ARRAY_BUFFER, 0, MAX_POINTS * sizeof (drawpoint), BUFFER_MAP_BITS);
            pointcount = 0;
        }
    }

    if (bufpoints != NULL)
    {
        // always unmap, then draw anything left over
        glUnmapBuffer (GL_ARRAY_BUFFER);
        if (pointcount) glDrawArrays (GL_POINTS, 0, pointcount);
    }
    else
    {
        // error handling here - suggest falling back to immediate mode?
    }

    glDisableClientState (GL_VERTEX_ARRAY);
    glDisableClientState (GL_COLOR_ARRAY);

    glBindBuffer (GL_ARRAY_BUFFER, 0);
}

Beiufin
06-26-2012, 12:12 PM
Ah, I cannot thank you enough. That code cleared so much up about Buffer Objects and how to implement them!

You mentioned the built in vertex attributes... I opted not to use them because I really only need position and color so I assumed defining my own would be more efficient in both memory and speed. Is this correct or does OpenGL perform better with vertices defined using a generic vertex attribute array?

mhagain
06-26-2012, 12:19 PM
It doesn't really matter much to be honest. Generic attribs are more flexible and don't tie you to horrible things like abusing extra TexCoordPointers for packing in additional data, and everything that can be done with conventional attribs can also be done with generic attribs too (they also enable you to make the horrible glClientActiveTexture API call go away from your code). Plus they're friendlier for modern OpenGL so if you're thinking of heading in that direction it's not a bad idea to start getting used to them.

Beiufin
06-26-2012, 12:42 PM
Since I don't understand the benefit in this situation I probably won't use it in this program, but what would I change?

Would it just be a matter of using glVertexAttribPointer in place of glVertexPointer + glColorPointer?
Wouldn't it require me to use glEnableVertexAttribArray + glDisableVertexAttribArray in place of glEnableClientState + glDisableClientState?

mhagain
06-26-2012, 01:02 PM
There are some shader changes needed too - in fact, it absolutely requires shaders so that may well be a step too far for the purposes of your current program.

I should probably add that the VBO usage pattern I gave above is best suited to streaming truly dynamic data - like in your points example. If you've got data that never changes from frame-to-frame then there are other, better ways to handle it. Also, if you don't have glMapBufferRange available then you may well find that it's best to abandon VBOs and use regular old-fashioned client-side vertex arrays instead - the code I gave above can be very easily modified to handle that too. But the exact best way will vary from hardware to hardware and from use case to use case.

Beiufin
06-26-2012, 01:26 PM
What do you mean, if I don't have glMapBufferRange available?

Kopelrativ
06-26-2012, 02:08 PM
glMapBufferRange is a function from OpenGL 3.0. For a graphics driver to claim that it fulfills the OpenGL 3 specification, it has to fulfill all of that specification's requirements. However, many OpenGL 2.1 drivers also expose a selected subset of OpenGL 3 functionality through extensions (glMapBufferRange comes from GL_ARB_map_buffer_range). That means that if you base your code on OpenGL 2.1, you cannot know for sure that glMapBufferRange is available. You should test for this at initialization, and produce a proper error message if needed.
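
As a rough sketch of that initialization test, assuming a legacy context where glGetString (GL_EXTENSIONS) returns one space-separated string: the helper below (HasExtension is a made-up name) looks for a whole-token match, since a plain substring search can be fooled by longer extension names that share a prefix.

```cpp
#include <cassert>
#include <cstring>

// Returns true if `name` appears as a whole token in a space-separated
// extension list. Hypothetical helper, not part of OpenGL or GLEW.
bool HasExtension (const char *extensionList, const char *name)
{
    assert (name != NULL && *name != '\0');
    if (extensionList == NULL) return false;

    const std::size_t len = std::strlen (name);
    const char *p = extensionList;

    while ((p = std::strstr (p, name)) != NULL)
    {
        // a real match starts at the beginning or after a space,
        // and ends at a space or at the end of the list
        const bool startsToken = (p == extensionList) || (p[-1] == ' ');
        const bool endsToken = (p[len] == ' ') || (p[len] == '\0');

        if (startsToken && endsToken) return true;
        p += len;
    }

    return false;
}
```

It would be called once after context creation, for example as HasExtension ((const char *) glGetString (GL_EXTENSIONS), "GL_ARB_map_buffer_range"). If GLEW is in use, checking GLEW_VERSION_3_0 or GLEW_ARB_map_buffer_range after glewInit should serve the same purpose.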



glVertexPointer (3, GL_FLOAT, sizeof (drawpoint), bufpoints->position);
glColorPointer (4, GL_UNSIGNED_BYTE, sizeof (drawpoint), bufpoints->color);


It should be &bufpoints->position and &bufpoints->color, shouldn't it? That will give you the offset, instead of referencing a value near 0.

mhagain
06-26-2012, 03:34 PM
No, because of array/pointer "equivalence": position and color are arrays, so they decay to pointers to their first element. Good call on it though, because if you have a single unsigned int (representing an RGBA colour) you will need to use & on it.

In another piece of code, I have:
surfacevert *svert = NULL;

glBindBuffer (GL_ARRAY_BUFFER, blahwhatever);

glVertexAttribPointer (0, 3, GL_FLOAT, GL_FALSE, sizeof (surfacevert), svert->position);
glVertexAttribPointer (1, 2, GL_FLOAT, GL_FALSE, sizeof (surfacevert), svert->texcoord0);
glVertexAttribPointer (2, 3, GL_FLOAT, GL_FALSE, sizeof (surfacevert), svert->texcoord1);

Viewing in the debugger, we can see that svert->position is 0x00, svert->texcoord0 is 0x0c and svert->texcoord1 is 0x14, which correctly correspond to buffer offsets of 0, 12 and 20 for this vertex type.

Not 100% certain of the portability of this trick though - it works fine on Windows with MSVC and is great for decoupling your struct layout from your gl*Pointer calls, not to mention a darn sight nicer to read than the more usually seen BUFFER_OFFSET macro, but what about other platforms?

aqnuep
06-26-2012, 05:11 PM
Off-topic, but you could use the offsetof (http://en.wikipedia.org/wiki/Offsetof) macro if you don't want to abuse NULL pointers this way.

mhagain
06-26-2012, 05:40 PM
That one's new to me, but it's interesting to know. I do note that the "traditional" implementation given on the Wiki page you linked involves similar NULL pointer abuse though, but you could write your own version that casts a chunk of scratch memory to your struct type then evaluates the offset.
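
For comparison, here is a small sketch of the offsetof route. The surfacevert layout is an assumption inferred from the offsets quoted earlier in the thread (three position floats, two floats for texcoord0, three for texcoord1, with 4-byte floats), and BufferOffset is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>   // offsetof, std::size_t

// Assumed layout, inferred from the offsets 0, 12 and 20 quoted in the thread.
struct surfacevert
{
    float position[3];
    float texcoord0[2];
    float texcoord1[3];
};

// offsetof yields a size_t, but the gl*Pointer calls take a pointer-typed
// argument (interpreted as a byte offset while a buffer object is bound),
// so the value has to be cast.
inline const void *BufferOffset (std::size_t bytes)
{
    return reinterpret_cast<const void *> (bytes);
}

// Sanity check that the struct layout matches what the draw setup expects.
inline void CheckSurfacevertLayout ()
{
    assert (offsetof (surfacevert, position) == 0);
    assert (offsetof (surfacevert, texcoord0) == 12);
    assert (offsetof (surfacevert, texcoord1) == 20);
}
```

A call would then read along the lines of glVertexAttribPointer (2, 3, GL_FLOAT, GL_FALSE, sizeof (surfacevert), BufferOffset (offsetof (surfacevert, texcoord1))), which keeps the draw setup decoupled from the struct layout without going through a NULL struct pointer.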

Kopelrativ
06-27-2012, 12:11 AM
I see, you are right, of course; I didn't realize they were arrays. I use it on Linux with gcc, so it should be portable. It can be a little tricky, though, if the type of svert->position is a "hidden" data type (which it wasn't in your example). For example, if you use glm::vec3, is that an array or not? In that case, I usually do "&svert->position[0]". Will offsetof help in this situation?

I also do not like the BUFFER_OFFSET macro.
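
On the "hidden" type question, a sketch may help. offsetof only cares about where the member starts, not whether it is an array, a struct or a scalar, provided the enclosing struct is standard-layout. Vec3 and Vec2 below are hypothetical stand-ins for a math library's vector types (this has not been checked against glm itself), and meshvert is a made-up vertex type.

```cpp
#include <cassert>
#include <cstddef>   // offsetof, std::size_t

// Hypothetical stand-ins for math-library vector types.
struct Vec3 { float x, y, z; };
struct Vec2 { float u, v; };

// Made-up vertex type mixing struct members with a scalar member.
struct meshvert
{
    Vec3 position;
    Vec2 texcoord;
    unsigned int rgba;   // single packed colour, not an array
};

// The same expression works for every member, whatever its type.
const std::size_t kPositionOffset = offsetof (meshvert, position);
const std::size_t kTexcoordOffset = offsetof (meshvert, texcoord);
const std::size_t kColorOffset = offsetof (meshvert, rgba);

// Sanity check, assuming 4-byte float and unsigned int with no padding.
inline void CheckMeshvertLayout ()
{
    assert (kPositionOffset == 0);
    assert (kTexcoordOffset == 12);
    assert (kColorOffset == 20);
}
```

Each offset would still need a cast to a pointer type before being handed to a gl*Pointer call.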

mhagain
06-27-2012, 04:35 AM
What's interesting is to look at the generated assembly for this. It demonstrates that the offset is worked out entirely at compile time and that nothing is ever read through the pointer, like so:
004950F7 mov eax,dword ptr [svert]
004950FA add eax,14h
004950FD push eax
004950FE push 20h
00495100 push 0
00495102 push 1406h
00495107 push 3
00495109 push 2
0049510B call dword ptr [___glewVertexAttribPointer (56035Ch)]

I do have to agree that it can be a little frightening to look at, though.

Regarding offsetof, because it evaluates to a size_t it means that you need to do a (void *) cast in order to use it. &svert->position[0] works perfectly fine though and also doesn't cause a NULL pointer dereference (it generates the very same asm, in fact).

Either way, there is definitely no perfect solution here. The traditional BUFFER_OFFSET, however, is quite nasty because it requires your draw setup to have knowledge of the struct layout, packing, alignment, etc.

Beiufin
06-27-2012, 08:25 AM
Turns out I only have access to OpenGL 1.4 with ARB extensions. Should I skip VBOs and just use standard vertex arrays?

(glBindBuffer, GL_ARRAY_BUFFER, and a few others are undefined in this version as I do not have access to glew.h)

mhagain
06-27-2012, 09:30 AM
Probably best to stick with standard vertex arrays, yes. There's a good chance that you don't have hardware T&L either, so you won't get much benefit from VBOs even if you could find a driver that supported them.

Beiufin
06-27-2012, 10:11 AM
I have an overlaid grid that only changes when a zoom in or out occurs. Drawing it requires only about 1000 vertices. What's the most efficient way of drawing these rarely updated vertices (the grid contains a series of lines and points)? Or should I just add them to the vertex array I'm using to draw the constantly changing deque of points?

kowal
06-27-2012, 10:41 AM
I bet that display lists are best for static geometry in legacy OpenGL.
Pack your grid into a display list and rebuild the list when you change zoom settings.

Beiufin
06-27-2012, 11:27 AM
Ah display list is simple enough and looks like it will suit my needs perfectly!

I have a couple more questions about efficiency though (I hate to keep bothering you guys with these "newbie" questions; if there is a good place for me to learn the relative cost of various C/C++/OpenGL operations, please let me know!).

Is it more efficient to check and exclude off screen points from the vertex array or just add them all?

Also, as this program has a zoom, when zoomed out a lot of the points may overlap due to lack of pixels. Is it more efficient to check and only include points that are far enough apart to be unique on screen or should I just add every single point regardless of overlaps?
(I have this currently implemented. Fully zoomed out, it can shrink the vertex array by two to three orders of magnitude (800000 points would be reduced to around 2000), but it does require a distance calculation each time a point is added, and a distance calculation on every single point if the user zooms in or out.)

Beiufin
06-27-2012, 02:10 PM
Probably best to stick with standard vertex arrays, yes. There's a good chance that you don't have hardware T&L either, so you won't get much benefit from VBOs even if you could find a driver that supported them.

So what are my alternatives to glMapBuffer? The code throws an exception because (obviously) without glMapBuffer, memcpy attempts to copy to a NULL destination.

Could (should) I just pass (from your code) myWonderfulPoints to glVertexPointer and glColorPointer? Or would this be dangerous due to myWonderfulPoints having the possibility of being changed mid frame?

Or should I create a new array of MAX_POINTS every frame and run memcpy on that, or is that too costly?

mhagain
06-27-2012, 02:50 PM
This code should work as a non-VBO alternative:
// your point struct might look something like this
struct drawpoint
{
    float position[3];
    unsigned char color[4];
};

// using a vector here; just change it to whatever container type you prefer
std::vector<drawpoint> myWonderfulPoints;

// call this every frame to draw stuff!
void DrawMeSomePoints (void)
{
    // nothing to draw, and myWonderfulPoints[0] is not valid on an empty vector
    if (myWonderfulPoints.empty ()) return;

    glEnableClientState (GL_VERTEX_ARRAY);
    glEnableClientState (GL_COLOR_ARRAY);

    glVertexPointer (3, GL_FLOAT, sizeof (drawpoint), myWonderfulPoints[0].position);
    glColorPointer (4, GL_UNSIGNED_BYTE, sizeof (drawpoint), myWonderfulPoints[0].color);

    glDrawArrays (GL_POINTS, 0, (GLsizei) myWonderfulPoints.size ());

    glDisableClientState (GL_VERTEX_ARRAY);
    glDisableClientState (GL_COLOR_ARRAY);
}

If you have the kind of gfx hardware that I think you have, display lists are unlikely to give good performance. Plus there is the overhead of having to destroy and recreate the display list every time your points list needs to change, which, even if I'm wrong, will murder your performance.


Is it more efficient to check and exclude off screen points from the VertexArray or just add them all?

Also, as this program has a zoom, when zoomed out a lot of the points may overlap due to lack of pixels. Is it more efficient to check and only include points that are far enough apart to be unique on screen or should I just add every single point regardless of overlaps?

It's likely going to be cheaper to just submit everything and let the GPU and/or driver deal with offscreen/overlap cases ("hang 'em all and let God sort 'em out", if you will) than to do potentially expensive CPU-side checks and potentially expensive point list rebuilding. The exception would be if you were doing some really expensive fragment shader ops on your points, which - given that you have OpenGL 1.4 - I doubt. ;) You would also do these tests yourself if you had geometry that was more complex than points.

Also, and if my first guess is correct, you most likely have software T&L which means that you'd be doing this bunch of work on the CPU only for the driver to have to come along and do the exact same bunch of work (also on the CPU) itself later on. Again, there would be an exception for much more complex geometry, but for points I wouldn't bother.

(One other exception - if you have a bunch of points that you know ahead of time are going to be in close proximity then there is value in doing an offscreen test on the bounds of that bunch. If you don't have this info - "hang 'em all".)
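
To illustrate that last exception, here is a minimal sketch of a bounds test for a known cluster of points. It assumes a 2D orthographic view described by a rectangle in the same coordinates as the points; Rect, ComputeBounds and Overlaps are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct drawpoint
{
    float position[3];
    unsigned char color[4];
};

// Axis-aligned 2D rectangle, used for a cluster's bounds and for the view.
struct Rect
{
    float minX, minY, maxX, maxY;
};

// Bounds (x/y only) of a non-empty cluster. Worth caching if the cluster
// does not change every frame.
Rect ComputeBounds (const std::vector<drawpoint> &cluster)
{
    assert (!cluster.empty ());

    Rect r = { cluster[0].position[0], cluster[0].position[1],
               cluster[0].position[0], cluster[0].position[1] };

    for (std::size_t i = 1; i < cluster.size (); i++)
    {
        r.minX = std::min (r.minX, cluster[i].position[0]);
        r.minY = std::min (r.minY, cluster[i].position[1]);
        r.maxX = std::max (r.maxX, cluster[i].position[0]);
        r.maxY = std::max (r.maxY, cluster[i].position[1]);
    }

    return r;
}

// True if the rectangles touch or overlap. A cluster whose bounds fail this
// test against the view rectangle can be skipped with a single check.
bool Overlaps (const Rect &a, const Rect &b)
{
    return a.minX <= b.maxX && a.maxX >= b.minX &&
           a.minY <= b.maxY && a.maxY >= b.minY;
}
```

The point of the design is that one rectangle test replaces a per-point test for the whole bunch; without such grouping information the per-point version is the "potentially expensive" case described above.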

Beiufin
06-28-2012, 06:11 AM
I have divided my points into multiple deques (for organization purposes). Originally I was just going to use memcpy to add them all to the same array each frame and draw that... Is it better to copy the points' pointers from multiple deques into one array, or just to call
glVertexPointer (3, GL_FLOAT, sizeof (drawpoint), myWonderfulPoints[0].position);
glColorPointer (4, GL_UNSIGNED_BYTE, sizeof (drawpoint), myWonderfulPoints[0].color);

glDrawArrays (GL_POINTS, 0, myWonderfulPoints.size ());
multiple times?
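
One thing to keep in mind for either option: unlike std::vector, std::deque does not guarantee contiguous storage, so the address of a deque's first element cannot safely serve as the base pointer for the whole container in glVertexPointer. A possible sketch of the "copy into one array" route, using a hypothetical GatherPoints helper and a scratch vector that is reused between frames:

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

struct drawpoint
{
    float position[3];
    unsigned char color[4];
};

// Copies every point from every deque into one contiguous scratch vector.
// Hypothetical helper; the deques themselves are left untouched.
void GatherPoints (const std::vector<std::deque<drawpoint> > &groups,
                   std::vector<drawpoint> &scratch)
{
    scratch.clear ();   // keeps the capacity, so steady-state frames need not reallocate

    std::size_t expected = 0;

    for (std::size_t g = 0; g < groups.size (); g++)
    {
        scratch.insert (scratch.end (), groups[g].begin (), groups[g].end ());
        expected += groups[g].size ();
    }

    assert (scratch.size () == expected);
}
```

After the call, and provided scratch is not empty, scratch[0].position and scratch[0].color could be passed to glVertexPointer and glColorPointer and everything drawn with a single glDrawArrays. Whether that beats several smaller draw calls is something to measure on the target hardware.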


If you have the kind of gfx hardware that I think you have, display lists are unlikely to give good performance. Plus there is the overhead of having to destroy and recreate the display list every time your points list needs to change, which, even if I'm wrong, will murder your performance.


What's the best alternative? (The grid consists of lines and points. The points in the grid are smaller than the points I'm tracking, so I can't just draw them with the other points.)
- Should I just use immediate mode?
- Should I create and draw three more vertex arrays? (One will remain static (lines); the other two will update on zoom (lines and points).)
- Or is there another draw method that is good for static/rarely updating vertices?
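
As one way to picture the second option, here is a sketch of a grid whose line vertices are kept in an ordinary vector and rebuilt only when the zoom-dependent spacing changes. GridCache, gridvert and the square [0, extent] layout are all assumptions made for illustration; this is not a measured recommendation.

```cpp
#include <cassert>
#include <vector>

// Hypothetical 2D vertex for the overlay grid.
struct gridvert
{
    float position[2];
};

// Keeps the grid's line vertices around between frames and rebuilds them
// only when the spacing or extent differs from the cached values.
class GridCache
{
public:
    GridCache () : spacing_ (0.0f), extent_ (0.0f), rebuilds_ (0) {}

    // Returns GL_LINES-style vertex pairs covering [0, extent] x [0, extent].
    const std::vector<gridvert> &Lines (float spacing, float extent)
    {
        assert (spacing > 0.0f && extent > 0.0f);
        if (spacing != spacing_ || extent != extent_) Rebuild (spacing, extent);
        return lines_;
    }

    int RebuildCount () const { return rebuilds_; }

private:
    void Push (float x, float y)
    {
        gridvert v = { { x, y } };
        lines_.push_back (v);
    }

    void Rebuild (float spacing, float extent)
    {
        lines_.clear ();
        const int steps = static_cast<int> (extent / spacing);

        for (int i = 0; i <= steps; i++)
        {
            const float t = i * spacing;
            Push (t, 0.0f); Push (t, extent);   // vertical line at x = t
            Push (0.0f, t); Push (extent, t);   // horizontal line at y = t
        }

        spacing_ = spacing;
        extent_ = extent;
        rebuilds_++;
    }

    std::vector<gridvert> lines_;
    float spacing_, extent_;
    int rebuilds_;
};
```

Each frame the returned vector could be handed to glVertexPointer (2, GL_FLOAT, sizeof (gridvert), ...) and drawn with glDrawArrays (GL_LINES, ...). The per-frame cost is then just the draw call, and the rebuild cost is paid only on zoom.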