Quote Originally Posted by tksuoran View Post
Here are some hints:

- Initially call BufferData with a NULL pointer to specify the data storage size; do this only once.
- In your case you probably should use STREAM_DRAW usage.
- Use fixed-size array(s) sized for the maximum number of lights.
- Pass the number of lights actually in use to the shaders via a uniform (the array length only tells you the maximum size).
- When you update the buffer, you can map it all with explicit flushing, and manually flush only the first N lights which are in use.
- Use the invalidate bit. Invalidating the whole buffer is probably best.
- There is no need for BufferData(NULL) - that is just an older way to say invalidate.
- Using the unsynchronized bit may be unsafe: when you update the data with the CPU, the GPU may still be reading the older data for the previous frame. However, it is still worth experimenting with; I found that it gave more performance, and rendering errors were not an issue in my case.
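Put together, the map-with-explicit-flush hints above might look like the following sketch (it assumes a current GL 3.0+ context; `lightBuffer`, `Light`, and `MAX_LIGHTS` are illustrative names, not from the post):

```c
/* Sketch only - assumes a current GL 3.0+ context and an existing
 * buffer object name in 'lightBuffer'. */
#include <string.h>

#define MAX_LIGHTS 100
typedef struct { float position[4]; float color[4]; } Light;

/* Once, at initialization: allocate storage with a NULL pointer. */
static void init_light_buffer(void)
{
    glBindBuffer(GL_UNIFORM_BUFFER, lightBuffer);
    glBufferData(GL_UNIFORM_BUFFER, MAX_LIGHTS * sizeof(Light),
                 NULL, GL_STREAM_DRAW);
}

/* Per frame: map the whole buffer, write only the first 'count'
 * lights, then flush just that range. */
static void upload_lights(const Light *lights, GLsizei count)
{
    glBindBuffer(GL_UNIFORM_BUFFER, lightBuffer);
    Light *dst = (Light *)glMapBufferRange(
        GL_UNIFORM_BUFFER, 0, MAX_LIGHTS * sizeof(Light),
        GL_MAP_WRITE_BIT
        | GL_MAP_INVALIDATE_BUFFER_BIT   /* old contents may be discarded */
        | GL_MAP_FLUSH_EXPLICIT_BIT);    /* we flush modified ranges ourselves */
    memcpy(dst, lights, count * sizeof(Light));
    glFlushMappedBufferRange(GL_UNIFORM_BUFFER, 0, count * sizeof(Light));
    glUnmapBuffer(GL_UNIFORM_BUFFER);
}
```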

In general, I would only use BufferData with NULL data, always use MapBufferRange to specify buffer contents, and never use BufferSubData. However, older OpenGL and unextended OpenGL ES versions before 3.0 do not have MapBufferRange. To support those, you could create an abstraction for a buffer with MapBufferRange and flush operations; these can be implemented using Buffer(Sub)Data calls if they are not available in GL.
Ok, so I should call BufferData(100*sizeof(light), NULL) once at the initialization stage and then only use glMapBufferRange with those flags? Regarding that link with the streaming techniques, I might have misread: I thought you had to use BufferData to orphan the old buffer when using glMapBufferRange, but does glMapBufferRange already do this when given the invalidate bit?

So in the initialization stage: glBufferData(GL_UNIFORM_BUFFER, 100*sizeof(light), NULL, GL_STREAM_DRAW)
Per-frame: glMapBufferRange(GL_UNIFORM_BUFFER, 0, number_of_lights*sizeof(light), GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT), where number_of_lights is at most 100. (The first parameter is the buffer target, not a program object, and GL_MAP_INVALIDATE_BUFFER_BIT already invalidates the whole store, so GL_MAP_INVALIDATE_RANGE_BIT would be redundant.)
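Spelled out with full signatures, the proposed pattern might read as follows (`GL_UNIFORM_BUFFER` and `lightUBO` are assumptions on my part; note that glMapBufferRange takes a buffer target, not a program object, and that the buffer-wide invalidate bit alone is enough):

```c
/* Sketch - assumes a current GL context and a buffer object 'lightUBO'. */
static void init(void)
{
    glBindBuffer(GL_UNIFORM_BUFFER, lightUBO);
    glBufferData(GL_UNIFORM_BUFFER, 100 * sizeof(light),
                 NULL, GL_STREAM_DRAW);
}

static void per_frame(int number_of_lights)  /* number_of_lights <= 100 */
{
    glBindBuffer(GL_UNIFORM_BUFFER, lightUBO);
    light *ptr = (light *)glMapBufferRange(
        GL_UNIFORM_BUFFER, 0, number_of_lights * sizeof(light),
        GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT);
    /* ... fill ptr[0 .. number_of_lights-1] ... */
    glUnmapBuffer(GL_UNIFORM_BUFFER);
}
```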

Then what's the difference between a UBO and an SSBO here? It seems that an SSBO would allow me to have more lights, but a UBO would give faster reads in the shader. Am I correct?
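For comparison, the shader-side declarations might look like this in GLSL (block, member, and struct names are made up). The trade-off is roughly as you describe: a uniform block is read-only and limited to a fairly small fixed size (the spec guarantees only 16 KB for GL_MAX_UNIFORM_BLOCK_SIZE) but tends to be fast to read, while a GL 4.3 shader storage block can be far larger (at least 16 MB guaranteed) and may end in an unsized array, at a potential cost in read speed:

```glsl
struct Light { vec4 position; vec4 color; };

// UBO: fixed-size array, std140 layout, small guaranteed size limit
layout(std140, binding = 0) uniform LightBlock {
    Light lights[100];
    int   lightCount;
};

// SSBO (GL 4.3): may be much larger; the last member can be unsized
layout(std430, binding = 0) buffer LightBuffer {
    int   lightCount;
    Light lights[];
};
```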

Also, can you elaborate on the difference between STREAM_DRAW and DYNAMIC_DRAW? There's not much info out there from what I can tell, and nobody really seems to know the actual difference. I know it's just a hint, but it would still be nice to understand what it can do.

Quote Originally Posted by thokra View Post
As I said, there are rules and idioms one should obey, like avoiding unnecessary copies of large sets of data and so on (and in this instance, deciding that a function takes its arguments by reference is actually a design choice that will probably lead to faster code), but all in all that is not optimization: not doing so is actually premature pessimization - unless you have a good reason not to follow the general rule. The same goes for the above-mentioned choice of an obviously poor design.
This is what I was talking about. There are certain things you learn that prevent you from committing premature pessimization, as you call it. You could also see that as optimizing prematurely, could you not? You could just get it done by passing values by copy instead of by reference, but once you have understood why passing by reference is generally faster, from that point of understanding onward you always pass parameters by reference. What I'm trying to do now is understand buffer object streaming as well as I can, so that in the future I know what I'm looking for and can implement it in a generally optimized way. You probably already use one of the streaming techniques from that website, and I'm sure you either went through a phase of trying to understand it, or you are simply the type of person to just use it because whoever wrote that page did the research already. I'm the type of person that likes to understand things the first time they use them. After that, I will probably just use those streaming techniques, but I can't do so without first understanding them.

Quote Originally Posted by thokra View Post
I don't know your background, but I assume you're either a student or rather fresh post-grad, and I don't know if everyone here will agree with me, but please don't let senseless perfectionism take over. You're not gonna get anywhere if you try to tweak every single function and every expression in your code. Ivory tower thinking isn't well applicable in the real world.
I am a PhD student, and what I'm currently researching requires me to look at every possible way to implement things and understand them fully, because only by doing so can I tell whether there is a different way people haven't noticed. Yes, it may strike you as a bold statement, but doubting the generally used techniques is how science sometimes progresses. It may well be that I won't find any better way of doing things, but that too is a research result that lets people know what not to try. Having said this, I must add that I am a bit inexperienced in OpenGL, having only two years of experience, the first half of which was purely outdated OpenGL 2.1, where I learned the fixed-function pipeline. So during this second year I have basically been teaching myself all the OpenGL 4.3 material through reading, covering buffer objects and GLSL.

Quote Originally Posted by thokra View Post
That is so not the point. The point is: You were already given sufficient help to tackle your problem at hand, at least on a basic level. If you have specific questions, no one on this forum will deny you their help until you understand what to do. In regards to performance, however, specific means providing actual data and pieces of code responsible for that data. If it's crappy, we'll tell you. If it's OK and you just can't do better on your current hardware, we'll tell you. If you don't seem to get what you're doing at all, we'll tell you. Personally, I think we got a very nice and helpful community here - you could do much, much worse.

Also, how can anyone actually say that they fully understand everything they do? How, pray tell? Do you know exactly how your hardware works? Do you know exactly what code your GLSL compiler generates for your current GPU? I could go on ... but I suspect the answer is "no!". Being able to fully understand everything you do when developing software is an illusion. Period.

As a general rule: First make it correct (which implies that it works in general) - then make it fast. This is exactly what Alfonse already told you above:

First make it correct, then make it fast.
As I said, I was trying to understand this topic fully, not really optimizing my code. Optimizing my code is one of the outcomes of my understanding, but not the purpose. Do you understand?

Yes, I could just implement it in one of those ways described in the link and then after having the code I could profile it and find the 10% code where the application spends 90% of its time. But I like to understand things as I explained.

Also, I never said anything bad about the community; I was thanking Alfonse for his help and letting him know that I understood if he didn't want to help any more. I know I might sound sarcastic or cynical at times, but I am not - that is me being honest, and I think people are more used to sarcasm than to honestly nice people. He seemed to have reached a level of frustration that led him to discuss premature optimization rather than answer my questions, so I explained that I was just trying to understand everything, and said that I understood if he didn't want to help any more because he felt I should just use one of the streaming techniques and be done with it.

I hope you understood everything I'm saying; if not, just let me know and I'll explain more thoroughly. (Sorry, but I have an obsession with people fully understanding what I'm trying to say.)