silly compiler question

say you have this code fragment:

static const float k = 100.0f;

(some complex expression) / k;

Are compilers these days smart enough to multiply by the inverse of k rather than actually dividing by k? Or do you have to do:

static const float k = 100.0f;
static const float inv_k = 1 / k;

inv_k * (some complex expression);

That’s up to the compiler writer. Completely implementation-dependent.

But that’s a fairly simple optimization for compiler writers to implement, and most probably do it, provided division takes longer than multiplication so that there is a reason to. One caveat for floating point, though: x / k and x * (1/k) are only guaranteed to give bit-identical results when 1/k is exactly representable (e.g. when k is a power of two). For k = 100.0f it is not, so a strictly IEEE-conforming compiler will usually leave the division alone unless you allow relaxed math (e.g. gcc’s -ffast-math).

Yeah, I know it’s up to the compiler writer. But what are your experiences? Do gcc, msvc, icc, … actually do it?

You can use the -S option with gcc to see the generated code as (somewhat) readable assembly text. The result will depend on the optimization level, target architecture, etc.

Anyway, on current CPUs computation is cheap compared to memory accesses (especially out-of-cache ones), so it may not be a good idea to store both a normal and an inverted value.

If you care, try both ways, benchmark the results.
Even better, post your test code and results here to start a benchmark flame war; it has been a long time since we had posts in the “you are measuring it wrong!” and/or “it is a waste of time!” vein :slight_smile:

I’d love to start a flame war. But I think these days it would be pointless :slight_smile:

> Anyway, on current CPUs computation is cheap compared to memory accesses (especially out-of-cache ones), so it may not be a good idea to store both a normal and an inverted value.

However, one does not know for sure whether the normal value will be found in the cache. And sometimes one does not need the normal value at all, just the inverted one.

Above all, I wanted to know if someone had already seen a compiler perform such an optimization in practice, any compiler at all.

Visual Studio (no optimization), SSE:
static float value = 5000;
float a = 40 / value;
movss xmm0,dword ptr [__real@42200000 (13FECB170h)]
divss xmm0,dword ptr [value (13FECE010h)]
movss dword ptr [a],xmm0
float b = xLimit / value;
movss xmm0,dword ptr [xLimit]
divss xmm0,dword ptr [value (13FECE010h)]
movss dword ptr [b],xmm0
It uses divss in both cases.

With full optimization it really does a lot of optimization and moves the code around, but I can still see the divss instruction (and these are the only divisions in the whole program).

Don’t want to reboot to test GCC… maybe tomorrow.
And before starting a flame war you should read what Donald Knuth (Turing Award, 1974) thinks about this kind of optimization:
“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil”
The problem is that you don’t know where the bottleneck is until you find it.

fdivs is visible too on my mingw-gcc 3.4.5 with a basic -O2.

Please humor me on this, but don’t all calculations in the ALU (Arithmetic Logic Unit) of the CPU get converted to binary (32-bit or 64-bit, depending on the architecture)?

@Savalia: the OP talked about floating point, so technically it is not the ALU that does the job but the FPU (or the SSE unit).
And everything gets converted to a binary representation on a CPU, so maybe I misunderstood your point.

BTW, what does “humor me” mean? Is that like “make fun of me”?

Peps, I hope you didn’t forget to make the static variable const. This is important: a plain static variable can still change at run time, but a const static variable cannot.

As for Knuth, I didn’t start optimizing prematurely; I merely wonder whether I can improve a little something I’ve already done and that works.

I’ve tried it with static const in gcc with -O2 -march=native, and it optimized the div away. With static const variables, gcc appears to be smart enough.