Severe performance problems in OpenGL

Hello, I'm using OpenGL to display about 100 textured quads close to the camera (near the front of the frustum). I'm getting 100 fps here, but on a friend's PC it runs at 20 fps; his PC is good overall, but the graphics card isn't that good.

What performance tips can you give me? I'm not doing anything stupid like loading a texture from the HDD every frame; I'm not even using OpenGL lighting or calculating anything in the loop, just fading the quads in and out. They are textured, the textures aren't that big (256x256, some are smaller), and I'm building mipmaps for them with GLU's mipmap function (gluBuild2DMipmaps).

About 90% of the textures have alpha channels, yes. I need eye candy but without the performance hit. Am I asking too much, or can't OpenGL handle it?

Thanks, I need help urgently.

My guess is that the read-back from the framebuffer for blending is your biggest problem. Two possible optimizations come to mind:

  1. Use GL_ALPHA_TEST to discard completely (or almost completely) transparent fragments. Depending on the image quality you need, you might be able to increase the reference value for the lower (background) layers. This only helps if large parts of your quads are actually invisible due to low alpha values; see the sketch right after this list.

  2. Use multitexturing to merge layers before they hit the blend stage. This should give you far better performance, but it is a bit harder to get right; details below.
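For option 1, a minimal sketch of the alpha-test state (the 0.1 reference value is my assumption; tune it per layer):

  // Discard fragments whose alpha is at or below the reference value,
  // so they never reach the blend stage. 0.1f is just a starting point.
  glEnable(GL_ALPHA_TEST);
  glAlphaFunc(GL_GREATER, 0.1f);

  drawQuads();   // hypothetical: however you currently draw the quads

  glDisable(GL_ALPHA_TEST);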

Assuming you do the usual SRC_ALPHA, ONE_MINUS_SRC_ALPHA blending, the formula for the resulting color in the framebuffer (only two layers, for simplicity) looks like this:

// two layers, two blend passes
FB' = (FB*(1-a1) + c1*a1)*(1-a2) + c2*a2

If you want to blend more than one texture in a single pass, you have to expand this and regroup the terms to match the blend equation:

// two layers, one blend pass (this is just a transformation of the other equation)
FB' = FB*(1-a1)*(1-a2) + c1*a1*(1-a2) + c2*a2 * 1
         ^^^^^^^^^^^^^   ^^^^^^^^^^^^^^^^^^^^   ^
         destination     source value           source factor (for clarity)
         factor

Using the ARB_texture_env_combine extension you can get this result with the following setup:

  // Unit 0:  texture 1 with color c1 and alpha channel a1
  SOURCE0_RGB: texture
  OPERAND0_RGB: src_color
  SOURCE1_RGB: texture
  OPERAND1_RGB: src_alpha
  COMBINE_RGB: modulate

  SOURCE0_ALPHA: texture
  OPERAND0_ALPHA: one_minus_src_alpha
  COMBINE_ALPHA: replace

  // color is now c1*a1, alpha is (1-a1)

  // Unit 1:  texture 2 with color c2 and alpha channel a2
  SOURCE0_RGB: previous
  OPERAND0_RGB: src_color
  SOURCE1_RGB: texture
  OPERAND1_RGB: src_color
  SOURCE2_RGB: texture
  OPERAND2_RGB: one_minus_src_alpha
  COMBINE_RGB: interpolate

  SOURCE0_ALPHA: texture
  OPERAND0_ALPHA: one_minus_src_alpha
  COMBINE_ALPHA: modulate

  // color is now c1*a1*(1-a2) + c2*a2, alpha is (1-a1)(1-a2)

Now you can use the blend function (ONE, SRC_ALPHA) to get the same result that you’d get using the two-pass approach.

You can easily add more texture units by using the same setup as for unit 1.

Caveat: This approach doesn’t incorporate the quads’ primary color.
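For reference, here is a sketch of that setup as actual GL calls. I'm using the core OpenGL 1.3 token names; with only the ARB_texture_env_combine extension, the same tokens carry an _ARB suffix. tex1 and tex2 are hypothetical texture object ids, and the explicit SOURCE1_ALPHA on unit 1 just spells out the GL_PREVIOUS default that the listing above relies on:

  // Unit 0: color = c1*a1, alpha = (1-a1)
  glActiveTexture(GL_TEXTURE0);
  glBindTexture(GL_TEXTURE_2D, tex1);
  glEnable(GL_TEXTURE_2D);
  glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_COMBINE);
  glTexEnvi(GL_TEXTURE_ENV, GL_COMBINE_RGB,    GL_MODULATE);
  glTexEnvi(GL_TEXTURE_ENV, GL_SOURCE0_RGB,    GL_TEXTURE);
  glTexEnvi(GL_TEXTURE_ENV, GL_OPERAND0_RGB,   GL_SRC_COLOR);
  glTexEnvi(GL_TEXTURE_ENV, GL_SOURCE1_RGB,    GL_TEXTURE);
  glTexEnvi(GL_TEXTURE_ENV, GL_OPERAND1_RGB,   GL_SRC_ALPHA);
  glTexEnvi(GL_TEXTURE_ENV, GL_COMBINE_ALPHA,  GL_REPLACE);
  glTexEnvi(GL_TEXTURE_ENV, GL_SOURCE0_ALPHA,  GL_TEXTURE);
  glTexEnvi(GL_TEXTURE_ENV, GL_OPERAND0_ALPHA, GL_ONE_MINUS_SRC_ALPHA);

  // Unit 1: color = c1*a1*(1-a2) + c2*a2, alpha = (1-a1)*(1-a2)
  // INTERPOLATE computes Arg0*Arg2 + Arg1*(1-Arg2).
  glActiveTexture(GL_TEXTURE1);
  glBindTexture(GL_TEXTURE_2D, tex2);
  glEnable(GL_TEXTURE_2D);
  glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_COMBINE);
  glTexEnvi(GL_TEXTURE_ENV, GL_COMBINE_RGB,    GL_INTERPOLATE);
  glTexEnvi(GL_TEXTURE_ENV, GL_SOURCE0_RGB,    GL_PREVIOUS);   // c1*a1
  glTexEnvi(GL_TEXTURE_ENV, GL_OPERAND0_RGB,   GL_SRC_COLOR);
  glTexEnvi(GL_TEXTURE_ENV, GL_SOURCE1_RGB,    GL_TEXTURE);    // c2
  glTexEnvi(GL_TEXTURE_ENV, GL_OPERAND1_RGB,   GL_SRC_COLOR);
  glTexEnvi(GL_TEXTURE_ENV, GL_SOURCE2_RGB,    GL_TEXTURE);    // (1-a2)
  glTexEnvi(GL_TEXTURE_ENV, GL_OPERAND2_RGB,   GL_ONE_MINUS_SRC_ALPHA);
  glTexEnvi(GL_TEXTURE_ENV, GL_COMBINE_ALPHA,  GL_MODULATE);
  glTexEnvi(GL_TEXTURE_ENV, GL_SOURCE0_ALPHA,  GL_TEXTURE);
  glTexEnvi(GL_TEXTURE_ENV, GL_OPERAND0_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
  glTexEnvi(GL_TEXTURE_ENV, GL_SOURCE1_ALPHA,  GL_PREVIOUS);   // default, made explicit
  glTexEnvi(GL_TEXTURE_ENV, GL_OPERAND1_ALPHA, GL_SRC_ALPHA);

  // One blend pass for both layers:
  glEnable(GL_BLEND);
  glBlendFunc(GL_ONE, GL_SRC_ALPHA);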

How are you drawing the quads? Immediate mode, vertex arrays, VBOs, display lists? I assume you’re using texture objects and not calling glTexImage2D every frame. 20 FPS for 100 textured quads seems extremely low, even for an old card.

You also say you have 100 textured quads close to the frustum. Is this everything? Or are there more quads far away from the camera? Submitting too many polygons to the graphics card can kill performance, even if only a small fraction is visible. If that’s the case, you should look into a space partitioning scheme like quadtrees or octrees to cull away large portions of the invisible polygons; see the sketch below.
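For illustration only, here is a rough sketch of such a recursive cull (QuadNode, aabbVisible and drawObjects are hypothetical names, not from your code):

  // Hedged sketch of quadtree culling: skip whole subtrees whose
  // bounding box falls outside the view frustum.
  typedef struct QuadNode {
      float minX, minZ, maxX, maxZ;   // 2D bounds of this cell
      struct QuadNode *child[4];      // NULL for leaf nodes
      int numObjects;
      struct Object **objects;        // objects stored in this cell
  } QuadNode;

  // Test the node's bounding box against the frustum planes.
  extern int aabbVisible(const QuadNode *n);
  extern void drawObjects(struct Object **objs, int count);

  void drawVisible(const QuadNode *n)
  {
      if (!n || !aabbVisible(n))
          return;                     // whole subtree is off-screen
      drawObjects(n->objects, n->numObjects);
      for (int i = 0; i < 4; ++i)
          drawVisible(n->child[i]);
  }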

No, I’m using display lists.
I had a 1500-triangle mesh in the background; I took it out and the framerate is the same on the other PC.

What’s the deal with OpenGL anyway? I was doing this on a DirectX-based engine and the framerate never dropped that badly…

That mesh has alpha applied too.

Are you using ARB extensions? They’ve been implemented for less time than DirectX’s vertex/pixel shaders, so if your friend has a weak card, it’s going to be hard.

> What performance tips can you give me? I'm not doing anything stupid like loading a texture from the HDD every frame; I'm not even using OpenGL lighting or calculating anything in the loop, just fading the quads in and out. They are textured, the textures aren't that big (256x256, some are smaller), and I'm building mipmaps for them with GLU's mipmap function (gluBuild2DMipmaps).
That’s your problem: you are using blending. Old cards have slower memory, and blending can be expensive. Even today’s cards can easily be slowed down if you use blending all the time to render many things.

Use memfr0b’s advice in this case.

If you don’t need depth, call glDepthMask(GL_FALSE) and glDisable(GL_DEPTH_TEST).
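A sketch of what that looks like, assuming the blended quads are all you draw (they then need to be drawn back to front for the blending to come out right):

  // With depth testing off, the depth buffer is neither read nor
  // written, which saves memory bandwidth on slow cards.
  glDisable(GL_DEPTH_TEST);
  glDepthMask(GL_FALSE);      // also skip depth writes
  drawQuadsBackToFront();     // hypothetical: your quads, sorted back to front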

What’s the pixel format? What’s the video card?


> What’s the deal with OpenGL anyway? I was doing this on a DirectX-based engine and the framerate never dropped that badly…

If the same thing runs much faster with D3D on that guy’s machine, then it’s something else.