Performance of changing texture clamping setting

I recently noticed that repeatedly changing the texture clamping setting of a texture can bring down performance quite drastically.

In one scenario I measured a 20% performance difference between using duplicates of each texture with different clamping settings and reusing the same texture while changing the clamping as needed.

Can anyone explain why this is such an expensive operation? I would have thought that this might have the same impact as changing one or two uniforms per primitive but it is considerably more than that. Is it really necessary that such textures have to be doubled to get optimal performance?

This is the first time I have heard of a use case needing a lot of texture clamp switches, so it may just be an area where hardware optimization has not been done because it is seldom needed.

From and to which states?

20% difference: can you please be more specific? How many switches per frame? Number of textures, and texture size? Milliseconds per frame? Number of triangles, viewport size?

On which hardware? OS, CPU?

It was a rough test because I had a few textures that don’t wrap well. So I just tried to enable texture clamping when I knew that they wouldn’t wrap to get rid of the filtering artifacts at the border. But I had to do it for everything because the program doesn’t know which texture is wrappable and which is not.
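As a rough illustration of the "do it for everything" problem, here is a minimal sketch of a small cache that remembers the last wrap mode requested per texture and skips requests that would not change anything. `WrapMode` and `WrapModeCache` are hypothetical names, not part of OpenGL, and the sketch only counts changes; the place where a real renderer would issue its `glTexParameteri` calls is marked in a comment. It assumes every texture starts out in repeat mode, which is the OpenGL default.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-ins for GL_REPEAT and GL_CLAMP_TO_EDGE.
enum class WrapMode { Repeat, ClampToEdge };

// Tracks the last wrap mode requested for each texture id so that
// redundant requests can be filtered out before they reach the driver.
class WrapModeCache {
public:
    // Returns true if the request actually changed the texture's wrap mode.
    bool set(std::uint32_t texture, WrapMode mode) {
        // Textures not seen before are assumed to be in repeat mode.
        auto it = current_.emplace(texture, WrapMode::Repeat).first;
        if (it->second == mode) {
            return false; // already in the requested mode, nothing to do
        }
        it->second = mode;
        ++changes_;
        // A real renderer would bind the texture and call glTexParameteri
        // for GL_TEXTURE_WRAP_S and GL_TEXTURE_WRAP_T here.
        return true;
    }

    // Number of wrap mode changes that were not filtered out.
    int changes() const { return changes_; }

private:
    std::unordered_map<std::uint32_t, WrapMode> current_;
    int changes_ = 0;
};
```

This only removes requests that are redundant. If the same texture really is drawn alternately clamped and repeated, every one of those switches still goes through.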

20% difference: can you please be more specific? How many switches per frame? Number of textures, and texture size? Milliseconds per frame? Number of triangles, viewport size?

The scenarios ranged from 5000-20000 polys with 30-50 different textures and varying numbers of switches. The performance impact was proportional to the number of clamp changes: the more changes, the slower it got. 20% was the most extreme case, with probably 20% of all rendered polys doing a clamp change. On average it cost 10% performance, regardless of FPS and screen size.
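Since the cost scales with the number of clamp changes, one thing that could be tried is ordering the draw calls so that all uses of a texture with the same wrap mode are adjacent. This is a different technique from the texture duplication tested above, and it only applies where draw order does not matter (for example opaque geometry). The sketch below does not touch OpenGL at all: `DrawCall`, `countWrapChanges` and `sortedByTextureAndWrap` are hypothetical names, and the code just counts how many wrap mode changes a given draw order would need, assuming each texture starts in repeat mode.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical description of one draw: which texture it uses and
// whether it needs clamping (true) or repeating (false).
struct DrawCall {
    std::uint32_t texture;
    bool clamp;
};

// Counts how many times a texture's wrap mode has to be changed when the
// calls are issued in the given order. Textures start out as "repeat".
int countWrapChanges(const std::vector<DrawCall>& calls) {
    std::map<std::uint32_t, bool> clampState; // missing entry means repeat
    int changes = 0;
    for (const DrawCall& call : calls) {
        bool& current = clampState[call.texture]; // value-initialized to false
        if (current != call.clamp) {
            current = call.clamp;
            ++changes;
        }
    }
    return changes;
}

// Returns a copy of the calls grouped by texture, then by wrap mode.
std::vector<DrawCall> sortedByTextureAndWrap(std::vector<DrawCall> calls) {
    std::stable_sort(calls.begin(), calls.end(),
                     [](const DrawCall& a, const DrawCall& b) {
                         return std::tie(a.texture, a.clamp) <
                                std::tie(b.texture, b.clamp);
                     });
    return calls;
}
```

Comparing `countWrapChanges(calls)` with `countWrapChanges(sortedByTextureAndWrap(calls))` gives a quick estimate of how many switches per frame sorting would save, before touching the actual renderer.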

This was on a GF8600, Windows Vista, Intel Core Quad 2.4 GHz.