I’m experimenting with floating-point mipmapping on an NVIDIA 6800 GT.
Enabling automatic mipmap generation for unsigned-byte textures usually adds a 50 to 100% performance overhead.
With GL_RGBA_FLOAT16_ATI, however, enabling automatic mipmap generation makes uploads roughly 10 times slower.
Is this performance drop to be expected for floating-point textures (because they are rather new and not widely used), or is there something I can do to avoid it?
The slowdown occurs when I load data into level 0 of the floating-point texture. I wrote a small test app that allocates a floating-point texture and then repeatedly loads texture data into it. With automatic mipmap generation enabled, the uploads are roughly 10 times slower for floating-point textures than with mipmap generation disabled.
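A minimal sketch of that test, assuming the SGIS_generate_mipmap and ATI_texture_float extensions are available (the `width`, `height`, `pixels`, and `iterations` parameters are placeholders, not names from the original post):

```c
#include <GL/gl.h>

/* Enum values from the extension specs, in case the headers lack them. */
#ifndef GL_GENERATE_MIPMAP_SGIS
#define GL_GENERATE_MIPMAP_SGIS 0x8191
#endif
#ifndef GL_RGBA_FLOAT16_ATI
#define GL_RGBA_FLOAT16_ATI 0x881A
#endif

void upload_loop(GLuint tex, int width, int height,
                 const float *pixels, int iterations)
{
    glBindTexture(GL_TEXTURE_2D, tex);

    /* Ask the driver to rebuild the mip chain whenever level 0 changes. */
    glTexParameteri(GL_TEXTURE_2D, GL_GENERATE_MIPMAP_SGIS, GL_TRUE);

    /* Allocate level 0 with an FP16 internal format. */
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA_FLOAT16_ATI,
                 width, height, 0, GL_RGBA, GL_FLOAT, NULL);

    /* Each subload dirties level 0 and triggers mipmap regeneration;
     * this loop is where the roughly 10x slowdown shows up for FP16. */
    for (int i = 0; i < iterations; ++i)
        glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
                        GL_RGBA, GL_FLOAT, pixels);
}
```

Timing the loop once with GL_GENERATE_MIPMAP_SGIS set to GL_TRUE and once with GL_FALSE (and once with GL_RGBA8 as the internal format) reproduces the comparison described above.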
If I change the internal format to RGBA8, it is only about 2 times slower than with automatic mipmap generation disabled.
I’m using the GPU for general-purpose computation, implementing an iterative algorithm that feeds the output of each rendering pass back in as the input to the next, so the number of texture loads/subloads grows with the number of iterations. I also need floating-point textures for numerical stability.
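One way to sketch that feedback pattern with the fixed-function copy path is glCopyTexSubImage2D, which copies the framebuffer result back into the input texture each iteration (`render_pass` is a hypothetical placeholder for the actual draw call, not something from the original post):

```c
#include <GL/gl.h>

/* Placeholder: draws one iteration using the currently bound texture
 * as input, writing the result to the framebuffer. */
extern void render_pass(void);

void iterate(GLuint input_tex, int width, int height, int iterations)
{
    for (int i = 0; i < iterations; ++i) {
        glBindTexture(GL_TEXTURE_2D, input_tex);
        render_pass();

        /* Copy the fresh result into level 0 of the input texture.
         * With automatic mipmap generation enabled, this copy also
         * rebuilds the whole mip chain, once per iteration. */
        glCopyTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 0, 0, width, height);
    }
}
```

Since the mip chain is rebuilt on every level-0 update, the per-upload mipmap cost multiplies with the iteration count, which is why the FP16 slowdown dominates this workload.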
That’s strange. Generating FP16 mipmaps should take only about twice as long, because the FP16 bilinear filtering rate is exactly half the RGBA8 filtering rate on NV40.
If the mipmaps are being generated on the CPU, however, such a performance drop is quite plausible, given the CPU’s lack of native FP16 support.