Texture performance issues

Hi.

My friend and I are trying to write a 2D platform game using OpenGL. We agreed that, to get the best performance (at the cost of some memory), the best approach would be to build a “static texture” that contains everything on the game map that never moves. Once that is done, each frame we only need to clip from this texture and then draw the dynamic (movable) objects on top.
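
In code terms it looks roughly like this (a simplified sketch; the helper names and setup details are placeholders for our actual code):

```cpp
#include <GL/glew.h>   // or whichever GL loader/header is in use

void drawAllStaticObjects();   // placeholder: the loop that draws every static tile/object

GLuint mapTexture = 0, mapFBO = 0;

// Done once, at load time: render everything static into one big texture.
void buildStaticMapTexture(int mapW, int mapH)
{
    glGenTextures(1, &mapTexture);
    glBindTexture(GL_TEXTURE_2D, mapTexture);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, mapW, mapH, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, NULL);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

    glGenFramebuffers(1, &mapFBO);
    glBindFramebuffer(GL_FRAMEBUFFER, mapFBO);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                           GL_TEXTURE_2D, mapTexture, 0);

    // Assumes viewport/ortho projection have been set to cover the whole map.
    drawAllStaticObjects();

    glBindFramebuffer(GL_FRAMEBUFFER, 0);
}

// Done every frame: one quad showing the part of the map under the camera
// ("clipping" from the static texture); dynamic objects are drawn on top afterwards.
void drawVisibleMapRegion(float camX, float camY,
                          float viewW, float viewH,
                          float mapW, float mapH)
{
    float u0 = camX / mapW,            v0 = camY / mapH;
    float u1 = (camX + viewW) / mapW,  v1 = (camY + viewH) / mapH;

    glEnable(GL_TEXTURE_2D);
    glBindTexture(GL_TEXTURE_2D, mapTexture);
    glBegin(GL_QUADS);
        glTexCoord2f(u0, v0); glVertex2f(0.0f,  0.0f);
        glTexCoord2f(u1, v0); glVertex2f(viewW, 0.0f);
        glTexCoord2f(u1, v1); glVertex2f(viewW, viewH);
        glTexCoord2f(u0, v1); glVertex2f(0.0f,  viewH);
    glEnd();
}
```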

The problem is, it’s alot faster if we draw, not only the dynamic object for each frame, but also the static objects for each frame iteration. And the only time we actually draw to static texture is in the beginning. Why is the size of the texture important when it’s already in video memory? We thought that the performance would increase when saving alot of draws.

What number is a lot of draw calls?
What do you actually do for each draw call, and what do you mean by “clip”?

We are saving draws for everything that is static, which can be up to 30-40 draws on a typical map, depending on where the camera is. What we don’t understand is how 30-40 extra draws can be faster than a single draw (clip) from the static texture we built for the map. I thought you would gain performance by building composite textures during the load phase instead of drawing everything every frame. Yes, the static map texture may be big, but why does its size matter when we end up drawing the same screen area either way, just with far more draws in one case?
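
For comparison, the “draw everything each iteration” path is just a loop over the visible static objects, roughly like this (again simplified; the types and drawTexturedQuad are stand-ins for our own code):

```cpp
#include <GL/glew.h>
#include <vector>

struct StaticObject { GLuint texture; float x, y, w, h; };
struct Camera       { float x, y, w, h; };

void drawTexturedQuad(float x, float y, float w, float h);  // placeholder helper

// Per-frame alternative: re-draw every visible static object as its own quad
// (this is where the 30-40 draws per frame come from).
void drawStaticObjectsEachFrame(const std::vector<StaticObject>& objects,
                                const Camera& cam)
{
    for (const StaticObject& obj : objects)
    {
        // Skip objects completely outside the view.
        if (obj.x + obj.w < cam.x || obj.x > cam.x + cam.w ||
            obj.y + obj.h < cam.y || obj.y > cam.y + cam.h)
            continue;

        glBindTexture(GL_TEXTURE_2D, obj.texture);
        drawTexturedQuad(obj.x - cam.x, obj.y - cam.y, obj.w, obj.h);
    }
}
```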

With the “clip” method we get about 600 FPS, and with the “draw everything each iteration” method we get about 1000-1100 FPS.

Just so you understand the orders of magnitude being discussed, 600 FPS is roughly 1.67 milliseconds per frame, while 1100 is roughly 0.91 milliseconds. It is not “a lot faster”; it is only 0.76 milliseconds faster.
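
(For anyone who wants to check the arithmetic: frame time in milliseconds is simply 1000 divided by the FPS figure.)

1000 / 600 ≈ 1.67 ms per frame
1000 / 1100 ≈ 0.91 ms per frame
1.67 - 0.91 ≈ 0.76 ms per frame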

You haven’t told us nearly enough to be able to diagnose a performance change that insignificant. It could be anything from something that varies from driver to driver, to the fact that your “draw(clip)” uses dynamic vertices rather than static ones. This level of micro-optimization is really not something you need to be thinking of unless you have taken care of all of the other low-hanging fruit first.
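
To be clear about what I mean by static vertices: something like uploading the clip quad once with GL_STATIC_DRAW and reusing it, scrolling via a uniform or texture matrix rather than re-specifying vertex data every frame. A rough sketch (I’m guessing at your setup; the names here are made up):

```cpp
#include <GL/glew.h>   // or whichever loader/header you already use

GLuint quadVBO = 0;

// Upload the view-sized quad once; after this, nothing about the vertex
// data changes per frame, only uniforms/matrices.
void createStaticClipQuad(float viewW, float viewH)
{
    const float verts[] = {
        // x      y      u     v
        0.0f,   0.0f,  0.0f, 0.0f,
        viewW,  0.0f,  1.0f, 0.0f,
        viewW,  viewH, 1.0f, 1.0f,
        0.0f,   viewH, 0.0f, 1.0f,
    };
    glGenBuffers(1, &quadVBO);
    glBindBuffer(GL_ARRAY_BUFFER, quadVBO);
    glBufferData(GL_ARRAY_BUFFER, sizeof(verts), verts, GL_STATIC_DRAW);
}
```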

BTW, I saw your cross-post on Stack Overflow. Why didn’t you bother to answer the entirely reasonable question asked of you in the comments? It’s basically what BionicBytes asked.

“Premature optimization is the root of all evil” – Donald Knuth

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.”

Looks different in context, doesn’t it? :wink:

Isn’t 600 FPS fast enough?
Most monitors are set to between 60 and 100 Hz, so most of those frames won’t ever appear on screen.

Also, read
http://www.opengl.org/wiki/Performance#FPS_vs._Frame_Time
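
In code, the page’s point is simply to measure and report milliseconds per frame rather than frames per second; for example something like this (not taken from the wiki, just an illustration):

```cpp
#include <chrono>
#include <cstdio>

bool running();    // placeholders for the application's own loop and rendering
void drawFrame();

// Report frame time in milliseconds instead of FPS.
void renderLoop()
{
    using clock = std::chrono::steady_clock;
    auto last = clock::now();

    while (running())
    {
        drawFrame();

        auto now = clock::now();
        double ms = std::chrono::duration<double, std::milli>(now - last).count();
        last = now;

        std::printf("frame time: %.3f ms\n", ms);
    }
}
```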

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.”

Looks different in context, doesn’t it? :wink:
Indeed, but how does that apply to the current situation?

0.76 milliseconds. Somehow, I don’t think this is “that critical 3%.”

At the same time, something is obviously wrong. It may be only 0.76 milliseconds on the OP’s machine, but what happens if the OP decides to ship this game, if someone runs it on downlevel hardware, and the performance difference comes out at nearer 10 milliseconds? Something as simple as what has been described should be able to run well on almost anything, and the situation itself is interesting enough to merit further discussion.

Plus it’s annoying to see the “premature optimization” quote given out of context, which happens all too often. :wink:

It may be only 0.76 milliseconds on the OP’s machine, but what happens if the OP decides to ship this game, if someone runs it on downlevel hardware, and the performance difference comes out at nearer 10 milliseconds?

Without knowing exactly why this is happening, it might turn into anything on lower-end hardware. That’s why you always profile on the hardware you intend to ship on. If performance matters to your application, then it is incumbent upon you to actually check the hardware you intend it to run on.

The fact that the bottleneck is clearly not rendering-based (even my integrated HD-3300 can chew through a couple of hundred vertices in 10 ms) suggests that it is highly unlikely to scale up like that on lower-end hardware.

So your fear is both premature (without profiling data) and highly unlikely to come to pass.

Plus it’s annoying to see the “premature optimization” quote given out of context, which happens all too often.

Except that it was a perfectly valid use even in context. Indeed, I’m not sure there is a way to use it out of context at all (that is, to use it where it would seem correct without the context but wrong with it). The context points out that you need to know where that 3% is. The only way to know where that 3% is would be to profile the application (or to have a lot of experience in the problem domain). And if you do know where the 3% is, then any optimization you might consider is clearly not premature.

Note that the “out of context” quote isn’t “Optimization is the root of all evil”. The “Premature” part is what makes the statement work.