G16R16 textures on Radeon?



kard
12-15-2003, 10:33 PM
Hi,

Is there any method or trick to handle high-precision normal maps on Radeon 9500+ in a way similar to the Nvidia HILO format? I'm interested in the fact that HILO can carry normal components precisely at a reasonable cost (32 bits).

I found a demo on the ATI devrel site (http://mirror.ati.com/developer/samples/dx9/HighPrecisionNormalMaps.html) that does this with DX9 (via the D3DFMT_G16R16 texel format).

I guess it would be possible to write a fragment program that computes Nz from the first two channels, taking a two-channel half-float texture as input, but this would surely be more expensive than the native HILO handling.

Korval
12-15-2003, 11:37 PM
Couldn't you use an intensity/alpha format with 16-bits per component? I think R300-based cards can handle 16-bit-per-component textures, in addition to 16-bit and 32-bit floating-point textures.

kard
12-16-2003, 01:05 AM
Thanks Korval,


Originally posted by Korval:
I think R300-based cards can handle 16-bit-per-component textures, in addition to 16-bit and 32-bit floating-point textures.

Yes, and 16-bit floating-point textures should be sufficient to hold such a normal vector, but HILO textures generate the third normal coordinate automatically... Is there a similar mechanism exposed through an ATI extension?

NitroGL
12-16-2003, 07:41 AM
Nope, you have to do it yourself in the fragment shader.

kard
12-16-2003, 08:19 AM
OK, thank you...

I guess that one or two extra fragment program instructions will not be too expensive...

Since float textures apparently are not filtered, it would be better to use a 16-bit integer format...

But I didn't see any ATI extension that exposes such functionality.

Any ideas?

zeckensack
12-16-2003, 08:37 AM
That's core functionality. Use GL_LUMINANCE16_ALPHA16 as the internal format.
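
A rough sketch in C of preparing data for such a texture. The helper names are hypothetical, and it assumes the normals are stored as unsigned 16-bit values, so the shader has to expand them back to [-1, 1] (2*v - 1) before reconstructing z. Also keep in mind that a luminance-alpha texel is sampled as (L, L, L, A), so the two components arrive in .x and .w, not in .x and .y.

```c
#include <stddef.h>
#include <stdint.h>

/* Map a signed component in [-1, 1] to an unsigned 16-bit value. */
static uint16_t pack_unorm16(float v)
{
    if (v < -1.0f) v = -1.0f;
    if (v >  1.0f) v =  1.0f;
    return (uint16_t)((v * 0.5f + 0.5f) * 65535.0f + 0.5f);
}

/* Hypothetical helper: xy holds n interleaved (x, y) pairs;
   x goes to luminance and y to alpha of each output texel. */
static void pack_normals_la16(const float *xy, size_t n, uint16_t *out)
{
    for (size_t i = 0; i < 2 * n; ++i)
        out[i] = pack_unorm16(xy[i]);
}

/* The packed buffer could then be uploaded with a call such as:
   glTexImage2D(GL_TEXTURE_2D, 0, GL_LUMINANCE16_ALPHA16, width, height, 0,
                GL_LUMINANCE_ALPHA, GL_UNSIGNED_SHORT, out); */
```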

NitroGL
12-16-2003, 09:03 AM
Originally posted by kard:
OK, thank you...

I guess that one or two extra fragment program instructions will not be too expensive...

Since float textures apparently are not filtered, it would be better to use a 16-bit integer format...

But I didn't see any ATI extension that exposes such functionality.

Any ideas?

8 or 9 instructions actually, less if you don't renormalize.

I guess they just didn't want to spend the extra transistors for it.

kard
12-16-2003, 09:57 AM
Originally posted by NitroGL:
8 or 9 instructions actually, less if you don't renormalize.

I guess you are including the texture instructions, which are required in any case. But do you think this extra work is worth it compared with the additional memory needed for three 16-bit channels?
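
To put numbers on the memory side, here is a small back-of-the-envelope sketch in C. The function name is made up, and it assumes tightly packed 16-bit channels and a single mip level. Whether a driver really stores three channels as three or pads them to four is something I don't know, so both cases are worth computing.

```c
#include <stddef.h>

/* Bytes for one mip level, assuming tightly packed 16-bit channels. */
static size_t texture_bytes(size_t width, size_t height, size_t channels)
{
    return width * height * channels * 2;  /* 2 bytes per 16-bit channel */
}
```

For a 1024x1024 map that is 4 MiB with two channels, 6 MiB with three, and 8 MiB if three channels end up padded to four.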


Originally posted by NitroGL:
I guess they just didn't want to spend the extra transistors for it.

On the DX9 demo page in question ( http://mirror.ati.com/developer/samples/dx9/HighPrecisionNormalMaps.html ), it seems that the D3DFMT_G16R16 format does the HILO job on Radeon. Do you think a hidden program performs the task?

NitroGL
12-16-2003, 06:38 PM
That demo does it in a PS 2.0 shader:



texld r0, t0, s0 // Sample from normal map

dp2add r1, r0, -r0, c1.w // 1 - x*x - y*y
rsq r1.w, r1.w // 1/sqrt(1 - x*x - y*y)
rcp r0.z, r1.w // sqrt(1 - x*x - y*y)


Of course it's a little different in OpenGL (mostly because there is no dp2add instruction, but it can be emulated).

[This message has been edited by NitroGL (edited 12-16-2003).]

kard
12-16-2003, 10:46 PM
Thank you NitroGL! I didn't see these shader files.

Assuming the input texture has two channels, only the R and G components are provided, so B is 0, and in OpenGL the shader should look something like this:



tex r0, fragment.texcoord[0], texture[0],2D // Sample from normal map
dp3 r1,r0,-r0 // - x*x - y*y
add r1.w,r1.w,c1.w // 1 - x*x - y*y
rsq r1.w,r1.w, r1.w // 1/sqrt(1 - x*x - y*y)
rcp r0.z,r0.z, r1.w // sqrt(1 - x*x - y*y)


[This message has been edited by kard (edited 12-17-2003).]

zeckensack
12-17-2003, 07:18 AM
Originally posted by kard:
tex r0, fragment.texcoord[0], texture[0],2D // Sample from normal map
dp3 r1,r0,-r0 // - x*x - y*y
add r1.w,r1.w,c1.w // 1 - x*x - y*y
rsq r1.w,r1.w, r1.w // 1/sqrt(1 - x*x - y*y)
rcp r0.z,r0.z, r1.w // sqrt(1 - x*x - y*y)

RCP is unary, this can't be right. I also think MUL may be more efficient than RCP, even if you don't see it on (your) current hardware. My changes are in the last two lines:


tex r0, fragment.texcoord[0], texture[0],2D // Sample from normal map
dp3 r1,r0,-r0 // - x*x - y*y
add r1.w,r1.w,c1.w // 1 - x*x - y*y
rsq r0.z,r1.w // 1/sqrt(1 - x*x - y*y)
mul r0.z,r0.z, r1.w // 1/sqrt(1-x^2-y^2)*(1-x^2-y^2)=sqrt(1-x^2-y^2)


Edited because I'm an idiot. RSQ is unary, too ...


[This message has been edited by zeckensack (edited 12-17-2003).]

kard
12-17-2003, 10:00 AM
you're right zeckensack
... thank you!