VBO and immediate mode with low tris/s
I've been trying to find out how many triangles I can draw on the screen at a time, just to have a rough idea.
When I started coding I used immediate mode, as it's explained in some old books and tutorials. Then, when reading newer editions of the OpenGL books, I decided to switch over to VBOs to increase performance...
I wrote a very simple program that generates random coordinates for quads and draws them on the screen (nothing fancy, no colors or anything), but I seem to hit a limit of about 300k triangles per second whether I use immediate mode or a buffer object. That seems like an incredibly low value, and shouldn't the limit be different for each method?
My system is as follows: 64-bit Arch Linux, Athlon II X2 245, 3 GB of RAM and an M2N68-AM motherboard with an onboard GeForce 7025. I think that's the most important stuff regarding my "problem". I know it's a very low-end graphics system, but I think it should be able to handle more than 300k. I've been using Bugle for debugging and profiling.
There's surprisingly little information online about that graphics card (or maybe I'm just terrible at searching) and I've been at this for a while now.
I'd appreciate it if anyone could point me in the right direction... even if it's just to tell me that this is in fact the actual limit of my graphics card.
If there's any information I should add please let me know and I'll do so.
The GeForce 7025 was a low-end integrated chipset from 2007. It had one vertex shader and a whopping 2 pixel shaders and 2 ROPs.
Is there some reason to believe that it would perform much better than that?
Thanks for the fast response!
Actually, I downloaded and ran this application: svPerfGL* (with the smaller data set) and, according to Bugle, it can achieve around 9M triangles per second.
Now, I don't know if there's something I'm overlooking; maybe Bugle measures things in a way I don't quite understand. But going from 300k to 9M is a very big difference.
On a side note, shouldn't I see some kind of variation in that upper limit when switching from immediate mode to VBOs? The only thing that seems to change is where the time goes: with immediate mode my application spends most of its time in glVertex calls, while with VBOs it spends most of its time in the SwapBuffers call.
* I don't seem to be able to post links, probably because I'm new here. If you type svPerfGL into google it's the first result.
Even if VBOs are technically faster than immediate mode, you may not see an improvement if the bottleneck of your application is somewhere else. The GPU works like a pipeline and the slowest stage determines the overall speed, so improving the other stages will not make a difference.
Maybe your application is limited by the number of pixels you draw. Try rendering smaller triangles and/or using a lower resolution.
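To make the pipeline argument concrete, here is a rough back-of-the-envelope sketch in C. The struct, the function name and the idea of reducing the GPU to just two rates are all assumptions made for illustration; they are not measured GeForce 7025 figures, and a real GPU has more stages than this.

```c
#include <assert.h>

/* Hypothetical two-stage model of a GPU. The fields are illustrative
   placeholders, not specifications of any real card. */
typedef struct {
    double vertex_rate; /* vertices transformed per second */
    double fill_rate;   /* pixels written per second */
} gpu_limits;

/* Estimated triangles per second for triangles that each cover
   pixels_per_tri pixels, assuming 3 unshared vertices per triangle
   and that the slower of the two stages sets the overall pace. */
double est_tris_per_sec(gpu_limits g, double pixels_per_tri)
{
    assert(pixels_per_tri > 0.0);
    double vertex_bound = g.vertex_rate / 3.0;          /* vertex stage limit */
    double fill_bound = g.fill_rate / pixels_per_tri;   /* pixel stage limit */
    return vertex_bound < fill_bound ? vertex_bound : fill_bound;
}
```

With made-up limits of 30 million vertices/s and 600 million pixels/s (chosen only to make the arithmetic easy), triangles of 2000 pixels each come out fill-bound at 300,000 per second, while 20-pixel triangles come out vertex-bound at 10 million per second. In the fill-bound case, how the vertices are submitted makes no difference to the result.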
Definitely second this. In order to properly profile VBO versus immediate mode you need to take every other factor that could be a bottleneck out of the equation.
Originally Posted by mbentrup
It also depends on how you're using the VBO. Sometimes people transitioning from immediate mode will try to use a VBO in a similar style, i.e. create a single small-ish VBO, and fill it dynamically for each polygon they draw. That's not going to give you any performance improvement over immediate mode at all; it may even be slower.
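As a sketch of the alternative, the geometry can be packed into one contiguous array up front, so that it only needs to be uploaded once. The function name, the x/y/width/height input layout and the choice to expand each quad into two triangles are assumptions for illustration, not the original poster's code. The GL calls appear only in a comment because they need a live context.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: each quad becomes two triangles,
   i.e. 6 vertices with 2 floats (x, y) each. */
#define FLOATS_PER_QUAD 12

/* rects holds quad_count entries of (x, y, width, height).
   out must have room for quad_count * FLOATS_PER_QUAD floats.
   Returns the number of vertices written. */
size_t pack_quads(const float *rects, size_t quad_count, float *out)
{
    assert(rects != NULL && out != NULL);
    for (size_t i = 0; i < quad_count; ++i) {
        float x0 = rects[4 * i], y0 = rects[4 * i + 1];
        float x1 = x0 + rects[4 * i + 2], y1 = y0 + rects[4 * i + 3];
        const float v[FLOATS_PER_QUAD] = {
            x0, y0,  x1, y0,  x1, y1,   /* first triangle  */
            x0, y0,  x1, y1,  x0, y1    /* second triangle */
        };
        for (size_t k = 0; k < FLOATS_PER_QUAD; ++k)
            out[i * FLOATS_PER_QUAD + k] = v[k];
    }
    return quad_count * 6;
}

/* With a current GL context, the packed array would be uploaded once:
 *   glBindBuffer(GL_ARRAY_BUFFER, vbo);
 *   glBufferData(GL_ARRAY_BUFFER,
 *                quad_count * FLOATS_PER_QUAD * sizeof(float),
 *                out, GL_STATIC_DRAW);
 * and then drawn every frame with a single call:
 *   glEnableClientState(GL_VERTEX_ARRAY);
 *   glVertexPointer(2, GL_FLOAT, 0, 0);
 *   glDrawArrays(GL_TRIANGLES, 0, vertex_count);
 */
```

The point of the design is that the per-frame cost is one draw call over data that already lives in the buffer, rather than one buffer update per polygon.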
Ah, you guys are absolutely right. I tried smaller triangles and got much better performance (I didn't think it would make that big a difference, so I hadn't tried it).
I also found a size "threshold" below which immediate mode stopped improving while the VBO kept getting faster. Just for the record, I was using one single vertex buffer object with all the data in it.
This clearly means I need to do a lot more reading before continuing.
Thanks a lot for your help!