optimization instead of float math?

Hello ladies and gentlemen,

I’ve picked up the topic “OpenGL for games”,
and now I have a general suggestion:

Why are float32s used for the most frequent functions in OpenGL,

and not int32s?

The points I thought about are:

  • Games often do NOT use the full range that float32s offer; most of the time the player moves inside a cube with a side length of less than about 500000.0000 units, which fits into one integer including the four-digit fraction part.

  • The floats must be converted into screen coordinates anyway, which are obviously integers.
    I think the conversion overhead could be avoided by using integers.

  • Taking this a bit further and including SIMD instructions, it is even more wasteful to use float32s in matrix operations, because normalized vectors only have a range of +/- 1.0000 (excluding the translation and scale parts).

  • Including special functions, like quaternion rotations (based on vector-by-vector computation), etc., would also speed up many things.

  • The only problem I see right now (with my current state of knowledge) is the D3D layer on which OpenGL resides,
    but that would only need a tip to Microsoft and their D3D quiz (ahem) API.

  • I don’t think the performance hit caused by float math is negligible (not even to Microsoft) when I compare the instruction latencies.

So in my opinion games definitely need a special version of OpenGL which uses integral math and has some specialized functions, because the people playing the products made with OpenGL are the ones who keep OpenGL alive, and they need faster games.

I am often confronted with people who rely on the logic that the only thing that must change in order to make an algorithm faster is the hardware it runs on. This logic is like bending a road just because they want to steer.
Often these people do not have the mind for, and/or just don’t want to gain, a basic understanding of the assembly level. But I think the developers of OpenGL and D3D have strong knowledge of assembly, and that makes me even more curious to learn why they didn’t adapt a version of OpenGL/D3D to the needs that games have.

First of all, OpenGL does not work on top of D3D; it is entirely separate.

Second, most of the functionality of OpenGL runs on the GPU, so the capabilities of CPUs do not have much impact on performance. Whether integers are faster than floats depends heavily on the hardware. For the CPU it may be true, although I’m not so sure about modern chips. But for the vertex processor on the GPU it’s just plain wrong.

Also, the point of floats is not high accuracy. Actually, with floats you have less accuracy than with an int, because a float has only 23 mantissa bits. The point is the high dynamic range: the smaller the number, the higher the accuracy…

Also, even though NVIDIA’s and ATI’s GPUs have strong floating point matrix multiplication units, the data flow of a game often depends on computations by the CPU, even when the best graphics hardware is available.

Many computations must be made by the CPU (collision detection, AI, etc.) in order to keep the matrix flow moving steadily towards the GPU.

The bottleneck when using integer math is converting (back) to float values to send the data to the GPU (even for single-precision SIMD), and (of course) having no choice about it.

Neither the graphics hardware vendors nor the library builders support any change.
I think OpenGL has had a great impact on the way NVIDIA/ATI constructed their hardware, but generously spoken, the optimization tends in the wrong direction on both sides,
and the poor algorithm developer is left unsupported, which I think should be a major concern of libraries like OpenGL.

Regarding the needs that games have (reiterating): why insist on float math when it is always slower than integral computation, less accurate, and not even necessary within the fence of required values? Just to satisfy
the desires of some minor scientific usage?

The point that OpenGL does not build (anymore) upon D3D would eliminate the problem with D3D.

Games often do NOT use the full range that float32s offer; most of the time the player moves inside a cube with a side length of less than about 500000.0000 units, which fits into one integer including the four-digit fraction part.
So? The absolute first thing the hardware will do with that number, before your vertex shader even sees it, is convert it into a float. So you’ve gained nothing.

And it gets worse. There are only 2 possible methods for converting integers to floats (that GL supports): normalized and unnormalized. Normalized means that the signed integer range is converted to a -1 to 1 floating point range. Which means your vertex shader now needs to expand this value onto the real range with a multiply. While the conversion may have been free, the multiply takes time in your vertex shader. It might fall out for free, depending on circumstances, but it just as easily might not.

The unnormalized conversion converts it onto the absolute signed range. So an integer value of 500 becomes 500.0. So, if you were trying to store some decimals in the integer, you’re still going to need to convert it.

Either way, you lose performance.

Lastly, many developers do use short-word or byte integer formats with VBOs. These formats need to be shifted just like above, but that is a deliberate tradeoff: developers give up some shader performance in exchange for faster vertex transfer and lower memory usage.

The floats must be converted into screen coordinates anyway, which are obviously integers.
I think the conversion overhead could be avoided by using integers.
All vertex shader definitions define the output coordinates as floats, not ints. It is therefore not possible to pass in integers. So even if integers were faster in some respect, you simply can’t do it.

Taking this a bit further and including SIMD instructions, it is even more wasteful to use float32s in matrix operations, because normalized vectors only have a range of +/- 1.0000 (excluding the translation and scale parts).
You seem to be operating under a false assumption: namely, that integer math is fundamentally faster than float math. It is certainly less complex than float math. But faster? No, not guaranteed.

You can build single-cycle float arithmetic, just like single-cycle integer arithmetic. It simply takes more transistors to do it, which is why, for a long time, it was not done. Even on modern CPUs, floats tend to be about as fast as integers, if not equal in speed.

I don’t think the performance hit caused by float math is negligible (not even to Microsoft) when I compare the instruction latencies.
Well, then you’re wrong. Plus, since it’s built into the hardware (little shader hardware provides integer math; integers specified in high-level languages are just floats), we don’t even have a choice.

Neither the graphics hardware vendors nor the library builders support any change.
Developers don’t want a change either. Just you.

@previous post
Correct me if I am wrong, but isn’t this forum heading about things that should be changed?

If you want to offend me in a personal way, this forum is not the right place.
So I’ll skip the foreplay… please refer to Intel’s and AMD’s instruction manuals when talking about floating point performance in comparison
with integer SSE (hint: search for pmaddwd).

I don’t have to explain to you how integer scaling is done, nor do you have to explain it to me…

You say yourself that floating point math needs extra transistors and integers don’t; and once you accept extra transistors in your logic, integers can always be made even faster. This is like attaching a carrot to a rabbit’s head and betting whether it can reach it by running forward.

You talk about things that were done in the past to provide better floating point performance but which are completely redundant operations. At the end you always have a screen coordinate, and that is a WORD, not even a full integer. This fact can’t be changed given today’s standard display devices. So why feed the render device with floats? Worthless work here.

When I think about algorithms used throughout many games and their need for floating point values, I come to the conclusion that floating point calculation is not even required, except for two operations, fsincos and fsqrt (and their single-precision SIMD counterparts), and even these can be performed by other means.

AI algorithms (CPU)
collision detection (CPU)
vertex calculations (GPU/CPU)
screen projection (GPU)
texture manipulation (GPU)
etc., etc. …

As things stand today, these algorithms are forced to use the FPU in order to avoid the
conversion into IEEE floating point values, despite the fact that wherever possible and affordable they use simple integral operations.

Why force the hardware vendors to adapt to software-related issues like floating point math? In scientific and design applications this might be an issue;
for games, they wouldn’t even have to.
And games are the important part of the market.

I think you’re completely missing the point here. Everyone in the game industry wants to use floats now. A year ago everyone used integers at the pixel level, but floats simply provide a higher dynamic range. The suggestion to use integers everywhere seems a huge step backwards. The days when we could not build hardware capable of good float performance are long over.

Your assumption that no one needs floats is just wrong. At the moment graphics hardware is in the process of eliminating the last remaining integer parts of the pipeline because floats are better from a functionality point of view.

I agree that you may not need floats for AI (though they still have some use there), but for vertex calculation you definitely need them (collision and projection included). Of course you could do everything with fixed point math, but the result will be inferior, especially when making heavy use of zooming/scaling. And at the pixel level, well, as long as you’re satisfied with how games looked half a year ago, you only need integer math. Who needs all this HDR stuff anyway?

Regarding the needs that games have (reiterating): why insist on float math when it is always slower than integral computation, less accurate, and not even necessary within the fence of required values? Just to satisfy
the desires of some minor scientific usage?

The point that OpenGL does not build (anymore) upon D3D would eliminate the problem with D3D.
Don’t get me wrong, but this doesn’t really sound like you know what you’re talking about.

Floats are neither always slower than integral computation, nor are they always less accurate (depending on what you do with them), and for some computations they are absolutely necessary.

And OpenGL never built on top of D3D.

Floating point arithmetic is quite important. For example, what is 1/20 in integer?
The range from 0.0 to 1.0 in float is quite important.
The range capability of floats can be useful for some people.

That being said, rest assured that GPUs are extremely good at parallel processing and have multiple FPU cores inside, because FPUs are cheap now.

I’m perfectly comfortable using floats all over the place, except for most textures and FBOs.

Alright, D3D is not worth arguing about; its execution buffers went somewhere else then…
I am not a great fan of D3D, so excuse this driver-related misinterpretation.

But the point is that the technology is already there; it just has to be used the right way,
instead of insisting on IEEE, which arose from scientific beliefs.

Let’s take a few examples of primitive algebra: dot product, angle computation, cross product, quaternion rotation, and so on.
All of these have always been done faster in plain integral math than with floats; the problem for floats
is still the multiply and the add/sub instructions (besides the terrible thing Intel has done to imul).

I did not say that all applications can do these dirty tricks, because some need the full spectrum IEEE focuses on.
But when I think of algorithms like BSP trees, octrees, etc., the pure definition of IEEE stands in the way, because it produces extra instructions just to care for specific situations.
E.g. an equality comparison to a value can never be done with a single compare; it is unavoidable to build a difference ratio
and compare it to some predefined value, so the application gets stuck with some fixed accuracy anyway,
which eliminates all the effort spent on (6-digit correct) floats.

With floats there is always a rounding error involved whenever you need more than 6 digits, or
numbers that are accurately represented within a given accuracy and range.

Furthermore, floats cannot represent every number in the range
of |-1.000; +1.000| exactly, because of the way they are defined.

There are no fraction digits in binary float or int. All values are always integral, and any computation
that includes floating math is bound to the overhead of interpreting a segmented integer as a
real number. So IEEE floating point math is nothing less than complex integer math, in any way.
And an optimized approach for several cases could solve some major bottlenecks in many ways.

With plain integer math these values can easily be stored with a “just in time” shift agreement in mind.
So as long as there is a limit on the required range (as in games), float math means adding
avoidable and misleading IEEE overhead to the computation.

By the way things are currently going, I think the smallest common denominator would be to include
very, very fast integer conversion routines in the GPU, so that both the CPU-dependent algorithms and the (currently) available GPU-dependent routines can be satisfied.

So IEEE floating point math is nothing less than complex integer math, in any way.
That’s where you’re wrong. With integer math you have a fixed exponent (thus the name fixed point instead of floating point), so the range of representable numbers is much smaller.

Floats cannot represent every number in the range of |-1.000; +1.000| exactly
Neither can you represent every number with an integer. It’s hard to represent an infinite number of possibilities :wink:

Of course you can just say you have one sign bit and a 0.31 fixed point number. Then you have 31 bits after the point vs. 23 bits after the point with a float. This is the maximum precision you can achieve with 32 bits in this range, but it comes at the cost that you are unable to represent any number outside this range.

With floats you don’t have that problem. You always have a guaranteed precision of +/- 2^-23 relative to the absolute value of the number, while being able to represent numbers from a huge interval instead of just (-1, 1) or any equivalent fixed interval. With integers, the larger the interval, the smaller the precision, even for small numbers. With floats you start with slightly lower precision, but you get a huge interval.

Emulating this behaviour would be much more overhead than just using floats, especially since we can build float hardware.

But the point is that the technology is already there; it just has to be used the right way, instead of insisting on IEEE, which arose from scientific beliefs.
I’m not sure what you mean by “scientific beliefs”. If you mean “needed for scientific calculations only”, you’re wrong. Scientific calculations are more likely to use fixed point math, because they need the extra accuracy and can afford the additional overhead of building structures like big integers to overcome the limited range that integers provide.

On the other hand, games and computer graphics don’t need exact calculations, but they still need the high dynamic range, so for them, using floating point numbers is just fine.

That’s where you’re wrong. With integer math you have a fixed exponent (thus the name fixed point instead of floating point), so the range of representable numbers is much smaller.

reply:
Integer math does not need any exponent, and games don’t either in most situations;
only the operands have to be adjusted with the appropriate shifts. The range is good enough.

Neither can you represent every number with an integer. It’s hard to represent an infinite number of possibilities

With floats you don’t have that problem. You always have a guaranteed precision of +/- 2^-23 relative to the absolute value of the number, while being able to represent numbers from a huge interval instead of just (-1, 1) or any equivalent fixed interval. With integers, the larger the interval, the smaller the precision, even for small numbers. With floats you start with slightly lower precision, but you get a huge interval.

Of course you can just say you have one sign bit and a 0.31 fixed point number. Then you have 31 bits after the point vs. 23 bits after the point with a float. This is the maximum precision you can achieve with 32 bits in this range, but it comes at the cost that you are unable to represent any number outside this range.

reply:
In the important interval, for most normal operations, ints can do that pretty well,
but floats fail to guarantee that a given number is what you intended.
Try to state 0.1 or 0.3 in IEEE; this is not really a huge range, but in IEEE these are
total approximations.
Many renormalizations caused by rounding errors could be avoided.
But you do not even need all 32 bits; a word is sufficient for a range of +/-1.

Games are limited in the range they require, aren’t they? Have you seen a level larger than 500000 units?
Or to speak of the view radius: when you exceed the 6-digit difference between two floats,
the result is about 0.0; integers can do the same at a lower cost in cycles.
Not to mention that you can’t tell the difference between 0.00001 and 0.0.
But in the case of a matrix multiply there can be quite a difference when dividing 0.3 by 0.1 in sequence;
with integer math this can be done using a few shifts, adds, or leas.
Not that I like Star Wars that much…

On the other hand, games and computer graphics don’t need exact calculations, but they still need the high dynamic range, so for them, using floating point numbers is just fine.

reply:
For what reason use a dynamic range? When a level is finished it is no longer expandable;
most vectors at runtime never reach a range beyond the level limit.

I’m not sure what you mean by “scientific beliefs”. If you mean “needed for scientific calculations only”, you’re wrong. Scientific calculations are more likely to use fixed point math, because they need the extra accuracy and can afford the additional overhead of building structures like big integers to overcome the limited range that integers provide.

reply:
“scientific beliefs” ==> approximations can replace exact calculations when done right, and only then; but, as I stated, any game will have poor performance under this condition if it has to apply a correction after nearly every sequence of operations.

Anyway, I think I should end this discussion on my side, because it tends to get off-topic and it is more of a GPU-related issue.

You’re right, this discussion leads nowhere.

Just a single point you should perhaps think about: most graphics hardware is built to satisfy the needs of the gaming industry. Practically all new developments come from this direction and are adapted to other needs, not the other way round.

And there were (are) many experts involved in the creation of this hardware, and they build float hardware. Don’t you think they have reasons for this other than just “support legacy float APIs”?

Try to code two software engines, one using ints and one using floats. Don’t sacrifice any features or quality when using ints. Then compare which one was easier to write and which one is faster.

Please don’t take the last statement seriously. :wink:

But try to think about it. For starters, think about what the projection matrix will do to your coordinates in your 50000-unit level when looking at a wide open space, and how to represent that with fixed point math. And when you have solved that problem, try to zoom in on some detail that’s 0.5 units large (without code changes).

Oh… And google for “HDR rendering” for another example where the high dynamic range is really needed.