double precision float pbuffers

I am looking into using modern GPUs for massively parallel simulations that will need to run millions of iterations. Iterative simulations are notoriously susceptible to precision limitations in their intermediate calculation states, because rounding error from each step carries into the next. As such, it would be of tremendous benefit to me to have double precision floats available in pbuffers/textures.
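To give a rough idea of the drift I am worried about, here is a minimal sketch in plain Python on the CPU (not GPU code). It emulates 32-bit storage by rounding every intermediate result through `struct`; the helper names `to_f32` and `accumulate`, the step of 0.1, and the count of one million iterations are only illustrative assumptions, not part of any real simulation.

```python
import struct

def to_f32(x):
    """Round a 64-bit Python float to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def accumulate(step, iterations, single_precision):
    """Add `step` to a running total `iterations` times.

    With single_precision=True, the step and every intermediate total are
    rounded to 32 bits, mimicking state kept in a 32-bit float buffer.
    """
    if single_precision:
        step = to_f32(step)
    total = 0.0
    for _ in range(iterations):
        total += step
        if single_precision:
            total = to_f32(total)
    return total

ITERATIONS = 1_000_000      # hypothetical iteration count
STEP = 0.1                  # hypothetical per-iteration increment
exact = ITERATIONS / 10     # mathematically exact result

sum32 = accumulate(STEP, ITERATIONS, single_precision=True)
sum64 = accumulate(STEP, ITERATIONS, single_precision=False)

err32 = abs(sum32 - exact)
err64 = abs(sum64 - exact)

print("32-bit total:", sum32, "abs error:", err32)
print("64-bit total:", sum64, "abs error:", err64)
```

On an IEEE 754 machine I would expect the 32-bit error to come out many orders of magnitude larger than the 64-bit one. A real simulation would accumulate error differently, but the effect of rounding every intermediate state to 32 bits is the same in kind.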

I know that NV_float_buffer provides for unconstrained floating point values of up to 32 bits per component. I also know that this is very much a hardware-constrained issue. I would like to find out:

  1. How much additional hardware would need to be added to, say, a GeForceFX or Radeon9800 for it to be able to work with doubles? I do not mind a 2-4X slowdown in fragment programs to achieve this capability. If the existing calculation logic could be reconfigured to operate on 64 bits, but slower, that would be fine. (Of course, if someone wanted to double or quadruple the amount of silicon on the GPU and support doubles at full speed, that would make me ecstatic, except that I probably couldn’t fit the heatsink in my case…)
  2. Does anyone besides me have any real interest in such a capability?
  3. If this were judged a good idea, how long would it take to get through the design/testing/manufacturing pipelines and into real hardware?

Thanks,
Mac

Re #2: Not many, for sure. The cards you mentioned (GFFX and Radeon 9800) and their future iterations are gaming cards.
It’s unlikely that these companies will be using doubles anytime soon.

In fact, in computer graphics, I don’t see a need at the moment to go beyond 32-bit floats.