My impressions, as the owner of a GFFX 5200 for a couple of days:
Strangely, so far I haven’t noticed the slightest difference in performance between the X/H/R precision modes (except when switching between fp16 and fp32 pbuffer RTT, but I suppose that is a memory bandwidth cost).
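A rough back-of-the-envelope sketch of why the pbuffer format could matter even if the shader precision does not. It assumes a 4-channel (RGBA) float pbuffer, and the 512x512 size is a hypothetical value picked only for illustration. The helper name `pbuffer_bytes` is made up for this sketch.

```python
def pbuffer_bytes(width, height, channels, bits_per_channel):
    """Bytes needed to write every pixel of the render target once."""
    return width * height * channels * bits_per_channel // 8

# Hypothetical 512x512 RGBA render target, written once per frame.
fp16_bytes = pbuffer_bytes(512, 512, 4, 16)
fp32_bytes = pbuffer_bytes(512, 512, 4, 32)

print("fp16 pbuffer:", fp16_bytes, "bytes per full write")
print("fp32 pbuffer:", fp32_bytes, "bytes per full write")
print("ratio:", fp32_bytes / fp16_bytes)
```

Under these assumptions the fp32 target moves twice as many bytes per write (and again per read when it is sampled as a texture), which would fit the bandwidth explanation.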
Some examples:
Fixed precision (X):
“TEX H0, f[TEX0], TEX0, CUBE;”
“DP3X H0, H0, {0.2, 0, 0.9797, 0};”
“POW o[COLH], H0.x, 120;”
Half precision (H):
“TEX H0, f[TEX0], TEX0, CUBE;”
“DP3H H0, H0, {0.2, 0, 0.9797, 0};”
“POW o[COLH], H0.x, 120;”
Full precision (R):
“TEX R0, f[TEX0], TEX0, CUBE;”
“DP3R R0, R0, {0.2, 0, 0.9797, 0};”
“POW o[COLR], R0.x, 120;”
All three programs above run at the same speed; however, the result quality differs vastly between them (TEX0 is a HILO16 cubemap).
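The quality gap is about what one would expect from raising a quantized dot product to the 120th power. Below is a minimal CPU-side sketch, not a model of the actual hardware. It assumes the fixed format is a signed 12-bit value with 10 fractional bits covering [-2, 2), that only the dot product result is quantized, and that the POW itself is exact (Python doubles). All function names are made up for this sketch.

```python
import struct

def quantize_fx12(x):
    """Round to the assumed fixed format: steps of 1/1024, clamped to [-2, 2047/1024]."""
    clamped = max(-2.0, min(x, 2047.0 / 1024.0))
    return round(clamped * 1024.0) / 1024.0

def quantize_fp16(x):
    """Round to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def quantize_fp32(x):
    """Round to the nearest IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def specular(dot, quantize, exponent=120.0):
    """Quantize the dot product, then apply the power (computed in double precision)."""
    return quantize(dot) ** exponent

# Spacing of representable values just below 1.0 in each format.
STEP_BELOW_ONE = {
    "fixed (X)": (2.0 ** -10, quantize_fx12),
    "half (H)": (2.0 ** -11, quantize_fp16),
    "full (R)": (2.0 ** -24, quantize_fp32),
}

for name, (step, quantize) in STEP_BELOW_ONE.items():
    # Output for the largest representable dot product below 1.0,
    # i.e. the first intensity level under the highlight peak.
    level = specular(1.0 - step, quantize)
    print(f"{name}: first level below peak = {level:.6f}")
```

With these assumptions the first level below the peak comes out at roughly 0.89 for fixed, 0.94 for half, and practically 1.0 for full precision, so the fixed version jumps by about 11% of full intensity in a single step near the highlight, which would show up as visible banding.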
Another one:
“TEX H0, f[TEX0], TEX0, CUBE;”
“TEX H1, f[TEX1], TEX1, 2D;”
“DP3X_SAT o[COLH], H0, H1;”
This program uses fixed-precision data only, but runs at half the speed of the equivalent NV_RC program.
I’ve tried longer programs too, but I’ve never been able to achieve better performance by using fixed precision. Only the instruction count and the complexity of the instructions (POW, RFL, etc., or accessing interpolants) mattered.
Has anyone ever noticed a speed benefit from using “fixed” on a 5200? Or on a 5600?
Maybe the “fixed” thing is really a 5800-specific feature?