First, I should say that I have read all the threads here about instancing, and I have also read nVidia’s GLSL pseudo-instancing article.
I tried to implement pseudo-instancing, but my app doesn’t use the modelview matrix (it is always the identity matrix), so in the shaders I just multiply by the projection matrix (there is no difference in performance compared with multiplying by the MVP matrix).
I also implemented VBOs, which gave no performance boost. In nVidia’s pseudo-instancing SDK example, VBOs are slower than vertex arrays on my PC! (A64 3000+, 512 MB RAM, GF 6600 GT 128 MB PCI-E, 78.01, XP SP2)
My problem is that in some situations I call glDrawElements up to 200k times per frame. I am rendering line strips (2-16 vertices each, rarely fewer than 4 or more than 16). The vertices are always the same and the indices are always the same; the only things that change are one uniform (it is now a vertex attribute, which gives a very small performance boost) and a vertex attribute pointer. The code looks like this:
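for (unsigned int i = 0; i < mNumOfRecords; ++i) {
    // each record is (mDimension + 1) floats: one per-record value, then the per-vertex values
    index = i * (mDimension + 1);
    glVertexAttrib1fARB(5, mData[index]);  // per-record constant
    glVertexAttribPointerARB(1, 1, GL_FLOAT, GL_FALSE, 0, BUFFER_OFFSET(sizeof(float) * (index + 1)));  // per-vertex values
    glDrawElements(GL_LINE_STRIP, count, GL_UNSIGNED_INT, 0);  // one draw call per record
}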
The glVertexAttrib1fARB(…) and glVertexAttribPointerARB(…) calls could be merged into a single glVertexAttribPointerARB(…) call, but that would roughly double the memory use, which is not good, and I don’t think it would improve performance.
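To make the memory cost concrete, here is a rough sketch of the packing I mean, done on the CPU before uploading the buffer. It is only an illustration, not code from my app: the helper name packRecords is made up, and it assumes each record is laid out as [per-record value, v0, ..., v(dimension-1)] as in the loop above, with one per-vertex value per dimension. Each vertex then carries both values, so a single two-component attribute pointer could replace the two calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original app).
// Input, per record:  [recordValue, v0, v1, ..., v(dimension-1)]
// Output, per record: [recordValue, v0, recordValue, v1, ...]
// i.e. the per-record constant is duplicated next to every per-vertex value.
std::vector<float> packRecords(const std::vector<float>& data, std::size_t dimension)
{
    const std::size_t stride = dimension + 1;  // floats per input record
    assert(dimension > 0 && data.size() % stride == 0);
    const std::size_t numRecords = data.size() / stride;

    std::vector<float> packed;
    packed.reserve(numRecords * dimension * 2);  // 2 floats per vertex
    for (std::size_t r = 0; r < numRecords; ++r) {
        const float recordValue = data[r * stride];
        for (std::size_t j = 0; j < dimension; ++j) {
            packed.push_back(recordValue);               // duplicated constant
            packed.push_back(data[r * stride + 1 + j]);  // per-vertex value
        }
    }
    return packed;
}
```

With 16 values per record this comes to 32 floats per record instead of 17, so a little less than double, and the number of draw calls stays the same.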
I think this could be rendered with instancing in one function call, and the app would be “flying”. So, will there ever be true instancing in OpenGL? It would help me (and not only me) a lot. Or do you have any suggestions for how to make this faster?
Thank you.
PS: One last thing. There is no difference in performance in my app between immediate mode, vertex arrays, and VBOs. With VAs and VBOs I use shaders; in immediate mode everything is done on the CPU. There is also no difference between ASM and GLSL shaders.