reference or stack?



ugluk
04-10-2011, 04:28 AM
I'm writing some number-crunching functions for the CPU, and I was wondering how the two functions below would compare:



inline void fa(float& t)
{
    t = t * t * t;
}

inline float fb(float t)
{
    return t * t * t;
}


Both functions compute the same result, but I wonder which one usually produces faster code (say, with g++). I've read that it is probably fb, as it takes t from the stack, while fa has to dereference a reference. On the other hand, the reference in fa might be "compiled away" by the compiler, but the same might be true for fb. What do you think?

EDIT: I should have tested it myself and I did:



#include <cstdio>

inline void fa(float& t)
{
    t = t * t * t;
}

inline float fb(float t)
{
    return t * t * t;
}

int main(int argc, char* argv[])
{
    float a(2);
    fa(a);

    float b(fb(2));

    std::printf("%f %f", a, b);
}




g++ tmp.cpp -g -O3 -march=native -o tmp


Produced this code:



Dump of assembler code for function main(int, char**):
0x0000000000400650 <+0>: sub $0x8,%rsp
0x0000000000400654 <+4>: mov $0x40076c,%esi
0x0000000000400659 <+9>: movsd 0x117(%rip),%xmm0 # 0x400778
0x0000000000400661 <+17>: mov $0x1,%edi
0x0000000000400666 <+22>: movapd %xmm0,%xmm1
0x000000000040066a <+26>: mov $0x2,%eax
0x000000000040066f <+31>: callq 0x400528 <__printf_chk@plt>
0x0000000000400674 <+36>: xor %eax,%eax
0x0000000000400676 <+38>: add $0x8,%rsp
0x000000000040067a <+42>: retq


It seems the compiler calculated t^3 at compile time: main only loads a precomputed constant into %xmm0, copies it to %xmm1, and calls printf, so both functions result in the same code. I find the syntax of fb clearer, though, so it may be the better choice.

I wonder what would happen if we were dealing with a custom floating-point type (say, an arbitrary-precision float like those provided by GMP).
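One way to explore that question without GMP is a rough sketch with a stand-in type. BigFloat below is hypothetical (it is not GMP's API, and its operator* is only an element-wise placeholder); it owns heap storage and counts copy constructions, so the difference between the two signatures becomes visible:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an arbitrary-precision float (NOT GMP's API).
// It owns heap storage, so every copy construction costs an allocation.
struct BigFloat
{
    std::vector<double> limbs;
    static int copies;  // number of copy constructions so far

    explicit BigFloat(double v) : limbs(64, v) {}
    BigFloat(const BigFloat& o) : limbs(o.limbs) { ++copies; }
    BigFloat(BigFloat&&) = default;
    BigFloat& operator=(const BigFloat&) = default;
    BigFloat& operator=(BigFloat&&) = default;
};
int BigFloat::copies = 0;

// Placeholder "multiplication": element-wise product, for illustration only.
BigFloat operator*(const BigFloat& a, const BigFloat& b)
{
    BigFloat r(0.0);
    for (std::size_t i = 0; i < r.limbs.size(); ++i)
        r.limbs[i] = a.limbs[i] * b.limbs[i];
    return r;  // elided or moved, never copy-constructed
}

// Reference version: works on the caller's object, no copy of the argument.
inline void fa(BigFloat& t)
{
    t = t * t * t;
}

// Value version: an lvalue argument is copy-constructed into t.
inline BigFloat fb(BigFloat t)
{
    return t * t * t;
}
```

Under these assumptions, calling fa(x) leaves BigFloat::copies unchanged, while fb(x) with a named x increases it by one, because the compiler may not drop a copy constructor with visible side effects. For a plain float that copy is free; for a type with heap storage it may not be, and a const reference parameter would avoid it while keeping fb's return-value syntax.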

tksuoran
04-11-2011, 02:38 AM
You call your functions with a constant value, so the compiler can determine the result at compile time. You should probably go to some effort to make sure the compiler cannot determine the values at compile time.
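A minimal sketch of one way to do that: pass each input through a volatile object, which forces a real load that the optimizer cannot fold into a constant. The helper names opaque, cube_by_reference and cube_by_value are made up for this illustration; fa and fb are the functions from the first post.

```cpp
inline void fa(float& t)
{
    t = t * t * t;
}

inline float fb(float t)
{
    return t * t * t;
}

// Hypothetical helper: reading a volatile object is an observable side
// effect, so the compiler must load the value at run time and cannot
// treat it as a known constant.
inline float opaque(float v)
{
    volatile float sink = v;
    return sink;
}

// Cube through the reference version, starting from an opaque value.
float cube_by_reference(float x)
{
    float a = opaque(x);
    fa(a);
    return a;
}

// Cube through the by-value version, starting from an opaque value.
float cube_by_value(float x)
{
    return fb(opaque(x));
}
```

Compiling this with the same g++ flags and comparing the assembly of cube_by_reference and cube_by_value should show actual multiply instructions in both, which makes the comparison meaningful. Reading the input at run time instead, for example from argv, is another option with the same effect.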