[QUOTE=Alfonse Reinheart;1253153]OK, let’s just cut to the chase. Go read this and implement one of those streaming strategies.
I didn’t say it was impossible. I said that recent functionality allows you to make it impossible. And since that functionality exists to make using them faster, that’s a strong hint that you shouldn’t be doing it in the first place.
The ability to resize the storage for a buffer object has nothing to do with how you use it.
Uniform blocks must be of a specific size. Therefore, whatever buffer object you use for them must be at least that size. It could be bigger, but it can’t be smaller.
Where? I said that ARB_buffer_storage/GL 4.4 allows you to allocate buffers that cannot be reallocated. And that means that it was a mistake for OpenGL to let you reallocate them to begin with. So you should never do it.
No it doesn’t. It copies the specific data to the GPU eventually.
Consider this. If you map the buffer, generate your light data every frame into that pointer, and unmap it, the worst-case scenario is that the driver will have to DMA-copy the data from the mapped pointer into the buffer object. It will do that at a time of its choosing, but sometime before you do anything that reads from that data. The best-case scenario is that you’re writing directly to the buffer object’s storage. This is much more likely if you use GL_MAP_INVALIDATE_BUFFER_BIT to invalidate the buffer (since you’re overwriting all of its contents).
If you use BufferSubData, you must generate your data into an array of your own, and you give that to BufferSubData. Worst-case, BufferSubData must then copy that array into temporary memory, and later DMA-copy that into the buffer. The reason why is quite simple. If the buffer is currently in use (is going to be read by GL commands that you have already issued that haven’t executed yet), then it can’t simply overwrite that data. The OpenGL memory model doesn’t allow later commands to affect earlier ones. So the implementation must delay the actual DMA-copy into the buffer storage until that storage is no longer in use. And since BufferSubData cannot assume that the pointer it was given will still be around after BufferSubData returns, it must copy that data into temporary memory and DMA from that into the buffer later.
So worst-case with BufferSubData is that there are two temporary buffers. You had to generate your lighting data into one temporary buffer, and OpenGL had to copy it into another temporary buffer.
Best case with BufferSubData is that it is able to do the DMA immediately. But that almost never happens. Why? Because DMAs aren’t instantaneous; they’re asynchronous operations. Also, DMAs typically can’t happen directly from client memory. So most implementations of BufferSubData are still going to have to copy your data into some temporary, DMA-able memory, and then DMA it up to the GPU.
With mapped pointers, odds are very good that, if the pointer you get isn’t actually the buffer, it’s at least memory that’s DMA-ready. So the worst-case scenario for mapping is equal to the best case scenario for BufferSubData.
So yes, if performance is a concern (and at this point, it shouldn’t be; stop prematurely optimizing stuff), mapping will at worst be only as bad as BufferSubData, and can be a good deal faster.[/QUOTE]
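If I’m reading your worst-case/best-case comparison right, the two upload paths look roughly like this. This is just a sketch of my understanding, assuming a GL 3.x context; Light, MAX_LIGHTS, lightUBO, and generateLight are placeholder names of mine, not anything from your post:

[CODE]
/* Sketch of the two upload paths as I understand them (GL 3.x headers
 * assumed; Light, MAX_LIGHTS, lightUBO, and generateLight are placeholders). */
enum { MAX_LIGHTS = 100 };
typedef struct { float position[4]; float color[4]; } Light;
extern Light generateLight(int i);   /* hypothetical per-light generator */

/* Path A: map the buffer and generate the data straight into the pointer. */
static void uploadMapped(GLuint lightUBO, int numLights)
{
    glBindBuffer(GL_UNIFORM_BUFFER, lightUBO);
    Light *dst = (Light *)glMapBufferRange(GL_UNIFORM_BUFFER, 0,
                     sizeof(Light) * MAX_LIGHTS,
                     GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT);
    for (int i = 0; i < numLights; ++i)
        dst[i] = generateLight(i);
    glUnmapBuffer(GL_UNIFORM_BUFFER);
}

/* Path B: generate into my own array first (first temporary copy), then
 * hand it to BufferSubData, which may copy it again into DMA-able memory. */
static void uploadSubData(GLuint lightUBO, int numLights)
{
    Light scratch[MAX_LIGHTS];
    for (int i = 0; i < numLights; ++i)
        scratch[i] = generateLight(i);
    glBindBuffer(GL_UNIFORM_BUFFER, lightUBO);
    glBufferSubData(GL_UNIFORM_BUFFER, 0, sizeof(Light) * numLights, scratch);
}
[/CODE]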
From what you’re telling me, mapped pointers are better if I’m rewriting the storage. So if I only allocate once, I should allocate room for 100 lights during the initialization phase using glBufferData with a null pointer. Then, every frame, I should use a mapped pointer to overwrite the data for lights 0 to current_number_of_lights.
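In code, I picture that allocate-once plan like this (again a sketch with my placeholder names):

[CODE]
/* Initialization: allocate storage for 100 lights once; NULL means the
 * contents start unspecified. */
glBindBuffer(GL_UNIFORM_BUFFER, lightUBO);
glBufferData(GL_UNIFORM_BUFFER, sizeof(Light) * 100, NULL, GL_STREAM_DRAW);

/* Every frame: overwrite only lights 0 .. current_number_of_lights - 1. */
glBindBuffer(GL_UNIFORM_BUFFER, lightUBO);
Light *ptr = (Light *)glMapBufferRange(GL_UNIFORM_BUFFER, 0,
                 sizeof(Light) * current_number_of_lights, GL_MAP_WRITE_BIT);
/* ... generate light data into ptr[0 .. current_number_of_lights - 1] ... */
glUnmapBuffer(GL_UNIFORM_BUFFER);
[/CODE]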
What about using glBufferData with a null pointer every frame, just before mapping, as described in the streaming-techniques link you posted? Will that reallocate? (I’m under the impression that glBufferData always reallocates.) Or will it be more efficient, since it tells the driver that you don’t really care about the previous contents? Am I confusing buffer allocation with uniform block allocation? After reading that link, it seems that calling glBufferData with the same size as the initial allocation and a null pointer will basically be faster, since I will either be filling a new buffer or the old one (if it’s not in use).
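That is, something like this every frame, using the same size as the initial allocation (sketch only):

[CODE]
/* "Orphan" the buffer: same size, NULL data. If the old storage is still
 * in use by pending GL commands, the driver can hand back a fresh block
 * instead of stalling. */
glBindBuffer(GL_UNIFORM_BUFFER, lightUBO);
glBufferData(GL_UNIFORM_BUFFER, sizeof(Light) * 100, NULL, GL_STREAM_DRAW);

/* Then map and write as before; the re-specified storage can't be in use yet. */
Light *ptr = (Light *)glMapBufferRange(GL_UNIFORM_BUFFER, 0,
                 sizeof(Light) * current_number_of_lights, GL_MAP_WRITE_BIT);
/* ... write light data ... */
glUnmapBuffer(GL_UNIFORM_BUFFER);
[/CODE]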
Also, should I use glMapBufferRange with GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT | GL_MAP_INVALIDATE_RANGE_BIT | GL_MAP_UNSYNCHRONIZED_BIT? From that link, using GL_MAP_INVALIDATE_RANGE_BIT would be an optimization since I’m only writing and never reading. And GL_MAP_UNSYNCHRONIZED_BIT should work since I only generate data into the buffer before I actually render. Am I right?
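Concretely, I mean this combination (sketch):

[CODE]
/* The flag combination I'm asking about: write-only access, discard the
 * old contents, and skip the driver's synchronization on map. */
GLbitfield access = GL_MAP_WRITE_BIT
                  | GL_MAP_INVALIDATE_BUFFER_BIT
                  | GL_MAP_INVALIDATE_RANGE_BIT
                  | GL_MAP_UNSYNCHRONIZED_BIT;
Light *ptr = (Light *)glMapBufferRange(GL_UNIFORM_BUFFER, 0,
                 sizeof(Light) * current_number_of_lights, access);
[/CODE]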
And why should I stop prematurely optimizing? I must admit I’m a perfectionist, but isn’t optimization a good thing?
Sorry, I know it’s a lot of questions, but this isn’t just about optimization; optimizing is my own way of understanding things thoroughly. I don’t want to be someone who just comes here and asks people to fix stuff; I want to understand so I can teach others as well. In any case, you’ve already helped A LOT with my understanding of this, and I thank you for that.