Tim Stirling

05-05-2001, 02:12 AM

After the surprising results from my first speed test program, which I explained earlier under the topic "speed with different geometry types", I decided to run some more tests, because it seems to me that a lot of what people say to do or not to do is rubbish, on my computer with my programs anyway. My system is a P3 600, 128MB, TNT2 M64 32MB running Windoze 98 with the latest drivers from NV.

After the strange and disappointing results with display lists, triangle strips and glArrayElement, I decided to try glDrawElements. Using GL_QUADS (which I had found to be the fastest primitive), it was about 15% faster than the non-display-list version but still about 15% slower than the display list version, at about 70 FPS. Then I tried glDrawElements with tri-strips and this was a disaster: it ran at about 20 FPS, roughly 1/5 of the speed of the non-array version. So arrays weren't much use.
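For anyone wanting to reproduce the GL_QUADS case, here is a minimal sketch of building the index array that glDrawElements consumes. The grid layout, the function name build_quad_indices and the 16-bit index type are assumptions for illustration; the mesh used in the test above is not shown.

```c
#include <stddef.h>

/* Hypothetical helper: fill `indices` with 4 indices per quad for a grid of
   cols x rows vertices stored row by row. Returns the number of indices
   written, 4 * (cols - 1) * (rows - 1). The caller must provide room for
   that many entries, and cols * rows must fit in an unsigned short. */
size_t build_quad_indices(unsigned cols, unsigned rows, unsigned short *indices)
{
    size_t count = 0;
    if (cols < 2 || rows < 2)
        return 0; /* no complete quad in a single row or column */

    for (unsigned y = 0; y + 1 < rows; y++) {
        for (unsigned x = 0; x + 1 < cols; x++) {
            unsigned short i0 = (unsigned short)(y * cols + x);
            indices[count++] = i0;                              /* (x,   y)   */
            indices[count++] = (unsigned short)(i0 + 1);        /* (x+1, y)   */
            indices[count++] = (unsigned short)(i0 + cols + 1); /* (x+1, y+1) */
            indices[count++] = (unsigned short)(i0 + cols);     /* (x,   y+1) */
        }
    }
    return count;
}
```

With a vertex array enabled via glEnableClientState(GL_VERTEX_ARRAY) and set with glVertexPointer, the result would be drawn with glDrawElements(GL_QUADS, (GLsizei)count, GL_UNSIGNED_SHORT, indices).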

So I looked at other ways of speeding up the rendering process. I decided to see how slow binding to a new texture was:

1 bind = 217 FPS

100 binds to the same texture = 217 FPS

10,000 binds to the same texture = 217 FPS

100 binds to different textures = 217 FPS

10,000 binds to 10,000 different textures = 212 FPS

The last one took about 5 minutes to load, and the hard drive was chugging away, which made it look like it was using virtual memory. The actual texture used was the same each time, though, loaded into every element of an array:

for (n = 0; n < 10000; n++) LoadTexture(&texture[n], "thetexture.TGA");

The texture was 256*156*24 bit!
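A quick back-of-the-envelope check of how much memory 10,000 copies of that texture would need. This is only a sketch: it takes the 256*156*24 bit figure at face value and assumes uncompressed storage with no mipmaps or driver padding; texture_bytes is a made-up helper name.

```c
#include <stdint.h>

/* Hypothetical helper: bytes needed for `count` uncompressed textures of
   width x height texels at `bits_per_texel` bits each, ignoring mipmaps
   and any padding the driver may add. */
uint64_t texture_bytes(uint64_t count, uint64_t width, uint64_t height,
                       uint64_t bits_per_texel)
{
    return count * width * height * (bits_per_texel / 8);
}
```

texture_bytes(10000, 256, 156, 24) gives 1,198,080,000 bytes, roughly 1.2 GB. That is far more than the 128MB of system RAM or the 32MB on the card, which would be consistent with the hard drive activity during loading.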

This shows that binds don't make it much slower, as long as the textures reside in the gfx card memory anyway. The last result is debatable, but I ran it twice and those are the results.
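To put the 217 FPS vs 212 FPS difference into per-bind terms, the frame rates can be converted to frame times. A rough sketch, assuming all the binds are issued every frame and that nothing else changed between the two runs; the function names are illustrative only.

```c
/* Convert a frame rate to the time taken per frame, in milliseconds. */
double frame_ms(double fps)
{
    return 1000.0 / fps;
}

/* Extra time per bind, in microseconds, implied by a drop from base_fps
   to loaded_fps when `binds` additional binds are issued each frame. */
double extra_us_per_bind(double base_fps, double loaded_fps, double binds)
{
    return (frame_ms(loaded_fps) - frame_ms(base_fps)) * 1000.0 / binds;
}
```

extra_us_per_bind(217.0, 212.0, 10000.0) comes out at about 0.011 microseconds per bind, i.e. roughly 0.11 ms per frame spread over the 10,000 binds.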

Next I built another speed tester to test different geometry. When filling/texturing, all primitives ran at the same speed. Without filling the shapes, quads and polygons both ran at 78 FPS, while triangles, tri-strips and tri-fans ran at approximately 29.5 FPS each.

Then I tried timing the clearing of the buffers (glClear()). The results depend heavily on the amount of drawing done between clears, ranging from 3.75 to 3375 ms during my test when clearing both the color and depth buffers. If I left out clearing the color buffer, it was only a little faster (5%).

Then I decided to do some maths tests and compare floats and doubles, mults and divs, etc.:

float: 10,000 sqrt(100) = 1.34 ms

double: 10,000 sqrt(100) = 1.34 ms

double: 1 million sqrt(100) = 13.2 ms

float: 1 million sin(100) = 230 ms

float: 1 million cos(100) = 240 ms

(I later found out that the speed of sin/cos depends on the angle provided.)

float: 1 million tan(100) = 319.2 ms

float: 1 million 100 / 10 = 10.0 ms

float: 1 million 100 * 10 = 10.0 ms

double: 1 million 100 / 10 = 10.0 ms

double: 1 million 100 * 10 = 10.0 ms

float: 1,000,000,000 100 * 10.01 = 10111.2 ms

double: 1,000,000,000 100 * 10.01 = 10069.1 ms

This shows that doubles are slightly slower; this last test was repeated several times.
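The code behind these maths timings is not shown, so the following is only a sketch of how such a loop might be timed with the standard clock() function. One thing to watch for: if the operands are literal constants, an optimising compiler can work out 100 * 10.01 at compile time and the loop ends up measuring little more than loop overhead. The volatile variables here are one way to force the multiply to happen on every iteration; the function names are made up for the example.

```c
#include <time.h>

/* Time `iterations` float multiplies and return the elapsed CPU time in
   milliseconds. The volatile qualifiers stop the compiler from folding
   the multiply into a constant or removing the loop. */
double time_float_mul_ms(long iterations)
{
    volatile float a = 100.0f, b = 10.01f;
    volatile float sink = 0.0f;
    clock_t start = clock();
    for (long i = 0; i < iterations; i++)
        sink = a * b;
    clock_t end = clock();
    (void)sink;
    return 1000.0 * (double)(end - start) / CLOCKS_PER_SEC;
}

/* Same measurement using doubles. */
double time_double_mul_ms(long iterations)
{
    volatile double a = 100.0, b = 10.01;
    volatile double sink = 0.0;
    clock_t start = clock();
    for (long i = 0; i < iterations; i++)
        sink = a * b;
    clock_t end = clock();
    (void)sink;
    return 1000.0 * (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling each function with the same iteration count several times and comparing the spread of the results gives some idea of whether a difference of well under 1% is real or just timer noise.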

Then I wanted to compare the speed of drawing one GL_POLYGON of 5 vertices with drawing the same shape as a triangle fan.

Untextured and unfilled:

polygon = 3.83 FPS

tri-fan = 1.248 FPS

Textured and filled:

polygon = 0.255 FPS

tri-fan = 0.246 FPS

This shows that the polygon is faster, maybe because of the greater number of vertices needed for the tri-fan (2 more). Maybe a tri-fan would be faster with a 20-vertex poly. It also shows that the main bottleneck is the fill rate.
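One reading of the "(2 more)" remark is a fan built around a centre vertex: the centre, the n outline vertices, and the first outline vertex repeated to close the fan, giving n + 2 vertices. A sketch under that assumption (polygon_to_fan is a hypothetical helper, and the vertex average is used as the centre, which lies inside any convex polygon):

```c
#include <stddef.h>

/* Hypothetical helper: convert a convex polygon of n 2D vertices (x,y pairs
   in `poly`) into triangle-fan order in `fan`: centre first, then the n
   outline vertices, then the first outline vertex again to close the fan.
   `fan` needs room for 2 * (n + 2) floats. Returns the vertex count, n + 2,
   or 0 if n < 3. */
size_t polygon_to_fan(const float *poly, size_t n, float *fan)
{
    if (n < 3)
        return 0;

    float cx = 0.0f, cy = 0.0f;
    for (size_t i = 0; i < n; i++) {
        cx += poly[2 * i];
        cy += poly[2 * i + 1];
    }
    fan[0] = cx / (float)n; /* centre = average of the outline vertices */
    fan[1] = cy / (float)n;

    for (size_t i = 0; i < n; i++) {
        fan[2 * (i + 1)]     = poly[2 * i];
        fan[2 * (i + 1) + 1] = poly[2 * i + 1];
    }
    fan[2 * (n + 1)]     = poly[0]; /* repeat first outline vertex */
    fan[2 * (n + 1) + 1] = poly[1];
    return n + 2;
}
```

For the 5-vertex case this produces 7 vertices. A convex polygon can also be drawn as a fan starting from one of its own corners, using the same n vertices as GL_POLYGON, which would remove the vertex-count difference.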

Now I am completely stuck for ways of optimising my rendering.

[This message has been edited by Tim Stirling (edited 05-05-2001).]
