G16R16 textures on Radeon?

Hi,

Is there any method or trick to handle high-precision normal maps on Radeon 9500+ in a way similar to the Nvidia HILO format? I’m interested because HILO can carry normal components precisely at a reasonable cost (32 bits).

I found a demo on the ATI devrel site (http://mirror.ati.com/developer/samples/dx9/HighPrecisionNormalMaps.html) that does this with DX9 (via the D3DFMT_G16R16 texel format).

I suppose one could write a fragment program that computes Nz from the first two channels, with a two-channel half-float texture as input, but that would surely be more expensive than the native HILO handling.

Couldn’t you use an intensity/alpha format with 16 bits per component? I think R300-based cards can handle 16-bit-per-component textures, in addition to 16-bit and 32-bit floating-point textures.

Thanks Korval,

I think R300-based cards can handle 16-bit-per-component textures, in addition to 16-bit and 32-bit floating-point textures.

Yes, and 16-bit floating-point textures should be precise enough for such a normal vector, but HILO textures automatically generate the third normal component… Is there a similar mechanism exposed in an ATI extension?

Nope, you have to do it yourself in the fragment shader.

ok, thank you…

I guess that the one or two extra fragment program instructions will not be too expensive…

Since float textures apparently are not filtered, it would be better to use a 16-bit integer format…

But I haven’t seen any ATI extension that exposes such functionality.

Any ideas?

That’s core functionality. Use GL_LUMINANCE16_ALPHA16 as the internal format.
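For illustration, here is a minimal CPU-side sketch of how the texel data for such a texture might be prepared. The helper names `encode_unorm16` and `pack_normals_la16` are hypothetical, and the biased [0, 1] encoding of signed components is an assumption, not something stated in this thread. The upload call is only shown in a comment because it needs a current GL context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map a signed component in [-1, 1] to an unsigned 16-bit value in
 * [0, 65535]. Out-of-range input is clamped; the result is rounded
 * to the nearest integer. */
static uint16_t encode_unorm16(float v)
{
    if (v < -1.0f) v = -1.0f;
    if (v >  1.0f) v =  1.0f;
    return (uint16_t)((v * 0.5f + 0.5f) * 65535.0f + 0.5f);
}

/* Hypothetical helper: take `count` normals stored as x,y,z float
 * triples and write only x and y as interleaved luminance/alpha pairs. */
static void pack_normals_la16(const float *normals_xyz, size_t count,
                              uint16_t *out_la)
{
    assert(normals_xyz != NULL && out_la != NULL);
    for (size_t i = 0; i < count; ++i) {
        out_la[2 * i + 0] = encode_unorm16(normals_xyz[3 * i + 0]); /* luminance = x */
        out_la[2 * i + 1] = encode_unorm16(normals_xyz[3 * i + 1]); /* alpha     = y */
    }
}

/* With a current GL context, the buffer could then be uploaded with:
 *   glTexImage2D(GL_TEXTURE_2D, 0, GL_LUMINANCE16_ALPHA16, width, height, 0,
 *                GL_LUMINANCE_ALPHA, GL_UNSIGNED_SHORT, out_la);
 */
```

Note that a luminance/alpha texture is sampled as (L, L, L, A), so with this layout a fragment program would find x in `.x` and y in `.w`, and it would have to expand both from [0, 1] back to [-1, 1] before computing z. The internal format is also only a request, so the driver may pick a different precision.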

Originally posted by kard:
[b]ok, thank you…

I guess that the one or two extra fragment program instructions will not be too expensive…

Since float textures apparently are not filtered, it would be better to use a 16-bit integer format…

But I haven’t seen any ATI extension that exposes such functionality.

Any ideas?[/b]

8 or 9 instructions actually, fewer if you don’t renormalize.

I guess they just didn’t want to spend the extra transistors for it.

8 or 9 instructions actually, fewer if you don’t renormalize.

I guess that includes the texture instructions, which are required in any case. But do you think this extra work is worth it compared with the additional memory needed for three 16-bit channels?
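To put rough numbers on the memory side of that trade-off, here is a small sketch. The function name `texture_bytes` is hypothetical, and it counts only the base level of an uncompressed texture, with no mipmaps and no padding.

```c
#include <stddef.h>

/* Size in bytes of one uncompressed mip level.
 * Assumes bits_per_channel is a multiple of 8 and no row padding. */
static size_t texture_bytes(size_t width, size_t height,
                            size_t channels, size_t bits_per_channel)
{
    return width * height * channels * (bits_per_channel / 8);
}
```

For a 1024x1024 map this gives 4 MiB with two 16-bit channels and 6 MiB with three. If the hardware pads a three-channel format to four channels (an assumption, not something confirmed here), the figure becomes 8 MiB.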

I guess they just didn’t want to spend the extra transistors for it.

On the DX9 demo page in question ( http://mirror.ati.com/developer/samples/dx9/HighPrecisionNormalMaps.html ), it seems that the D3DFMT_G16R16 format does the HILO job on Radeon. Do you think a hidden program performs the task?

That demo does it in a PS 2.0 shader:

texld r0, t0, s0 // Sample from normal map

dp2add r1, r0, -r0, c1.w // 1 - xx - yy
rsq r1.w, r1.w // 1/sqrt(1 - xx - yy)
rcp r0.z, r1.w // sqrt(1 - xx - yy)

Of course it’s a little different in OpenGL (mostly because there is no dp2add instruction, but it can be emulated).
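As a rough illustration of what "emulated" could mean, the C sketch below models dp2add on the CPU next to two possible replacements built from operations that ARB fragment programs do have (DP3, ADD, MAD). The `vec4` type and all function names are hypothetical stand-ins for shader registers and instructions, not real API.

```c
/* Four-component register, mirroring a shader temporary. */
typedef struct { float x, y, z, w; } vec4;

/* Reference semantics of PS 2.0 dp2add: a.x*b.x + a.y*b.y + c. */
static float dp2add(vec4 a, vec4 b, float c)
{
    return a.x * b.x + a.y * b.y + c;
}

/* Emulation 1: a three-component dot product followed by an add.
 * Only matches dp2add when a.z * b.z == 0, for example when the
 * texture's third channel samples as 0. */
static float dp3_then_add(vec4 a, vec4 b, float c)
{
    float d = a.x * b.x + a.y * b.y + a.z * b.z;
    return d + c;
}

/* Emulation 2: two multiply-adds, which never touches z. */
static float mad_then_mad(vec4 a, vec4 b, float c)
{
    float t = a.x * b.x + c;   /* first MAD  */
    return a.y * b.y + t;      /* second MAD */
}

/* 1 - x*x - y*y for a sampled normal n, as in the PS 2.0 listing. */
static float one_minus_xx_yy(vec4 n)
{
    vec4 neg = { -n.x, -n.y, -n.z, -n.w };
    return dp2add(n, neg, 1.0f);
}
```

The DP3 variant costs the same two instructions as the MAD pair here, but it silently gives a wrong result if the third channel is not zero, so the MAD pair is the safer choice when the texture layout is not certain.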

[This message has been edited by NitroGL (edited 12-16-2003).]

Thank you NitroGL! I hadn’t seen these shader files.

Assuming the input texture has two channels, only the R and G components are provided, so B is 0, and in OpenGL this shader should look something like this:

tex     r0, fragment.texcoord[0], texture[0],2D     // Sample from normal map
dp3 	r1,r0,-r0        // - x*x - y*y
add	r1.w,r1.w,c1.w   // 1 - x*x - y*y
rsq 	r1.w,r1.w, r1.w  // 1/sqrt(1 - x*x - y*y)
rcp 	r0.z,r0.z, r1.w  // sqrt(1 - x*x - y*y)

[This message has been edited by kard (edited 12-17-2003).]


tex     r0, fragment.texcoord[0], texture[0],2D     // Sample from normal map
dp3 	r1,r0,-r0        // - x*x - y*y
add	r1.w,r1.w,c1.w   // 1 - x*x - y*y
rsq 	r1.w,r1.w, r1.w  // 1/sqrt(1 - x*x - y*y)
rcp 	r0.z,r0.z, r1.w  // sqrt(1 - x*x - y*y)

RCP is unary, this can't be right. I also think MUL may be more efficient than RCP, even if you don't see it on (your) current hardware. My changes in [b]b[/b]old:
tex     r0, fragment.texcoord[0], texture[0],2D     // Sample from normal map
dp3 	r1,r0,-r0        // - x*x - y*y
add	r1.w,r1.w,c1.w   // 1 - x*x - y*y
rsq 	[b]r0.z[/b],r1.w  // 1/sqrt(1 - x*x - y*y)
[b]mul 	r0.z,r0.z, r1.w  // 1/sqrt(1-x²-y²)*(1-x²-y²)=sqrt(1-x²-y²)[/b]

Edited because I’m an idiot. RSQ is unary, too …

[This message has been edited by zeckensack (edited 12-17-2003).]

you’re right zeckensack
… thank you!