Compressing normal maps

I haven't really studied what the newer hardware methods are, so I'll pick the brains of those who have.
What are the pros and cons of the various normal map compression options available to me?

  • DXT compression gives a 1/4 compression ratio, but has horrible artifacts, with normals looking blocky.
  • Only encoding the height gives (typically) a 1/4 compression ratio; the downside is that it's very expensive to rebuild the normal in the shader (the upside is that it gives by far the best image quality; see the sketch after this list).
  • I've heard something about ATI normal map compression (is something similar available on NV3x?). To those who've used it before: is it any good?
  • DUDV textures (I haven't looked into them), but the compression ratio doesn't seem good at 1/2.
  • Anything else?
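
To illustrate why the height-only route costs so much per pixel, here is a minimal GLSL sketch that rebuilds the normal from a height texture with central differences (the texture name and the texel_size/bump_scale uniforms are made up for illustration):

// fragment shader: rebuild a tangent-space normal from a height-only texture
uniform sampler2D tex_height;
uniform vec2 texel_size;   // 1.0 / texture dimensions
uniform float bump_scale;  // same role as the height-scaling value used when baking

void main()
{
    vec2 uv = gl_TexCoord[0].st;
    float hl = texture2D(tex_height, uv - vec2(texel_size.x, 0.0)).x;
    float hr = texture2D(tex_height, uv + vec2(texel_size.x, 0.0)).x;
    float hd = texture2D(tex_height, uv - vec2(0.0, texel_size.y)).x;
    float hu = texture2D(tex_height, uv + vec2(0.0, texel_size.y)).x;

    // four extra texture taps plus the vector math is the per-pixel cost referred to above
    vec3 n = normalize(vec3((hl - hr) * bump_scale, (hd - hu) * bump_scale, 1.0));
    gl_FragColor = vec4(n * 0.5 + 0.5, 1.0);   // packed back to [0,1] just for display
}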

ta zed

I’ve gotten acceptable results by using DXT5 compression. Put the x component in the alpha channel, and recalculate z in a pixel shader with sqrt(1 - dot(xy, xy)). Works everywhere with good compression.
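
For reference, a minimal sketch of that decode in GLSL (the texture and light names are illustrative, and the max() is only there to guard against xy pairs that fall outside the unit disc):

uniform sampler2D tex_normal;   // DXT5 map with X stored in alpha, Y in green
uniform vec3 light_dir;         // normalized light direction in the normal map's space

void main()
{
    vec2 xy = texture2D(tex_normal, gl_TexCoord[0].st).wy * 2.0 - 1.0;  // unsigned -> signed
    vec3 n  = vec3(xy, sqrt(max(0.0, 1.0 - dot(xy, xy))));              // rebuild z
    gl_FragColor = vec4(vec3(max(0.0, dot(n, light_dir))), 1.0);        // simple N.L term
}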

Create a normals lookup table with a high-precision internal texture format for fast renormalization of the unsigned values
(say 512x512, with internal format SIGNED_HILO16_NV, or RGB16F_ARB on non-NVIDIA hardware).
Each texel of this texture is the signed, normalized normal:
texel at (s,t) = RGB( 2s-1, 2t-1, sqrt( 1 - (2s-1)^2 - (2t-1)^2 ) ) or HILO( 2s-1, 2t-1 ), with s,t in [0,1].

Store the normal's xy components in the DXT5 RGBA compressed format, but put x in the alpha component:
compressed_normal = RGBA( undef, Y, undef, X ).

At runtime you need only one extra shader instruction to decompress the normal
(use the wy components of the compressed texture as st texture coordinates to look the normal up in the normals table):

vec3 normal = texture2D( tex_normals_table, texture2D( tex_compressed_normal, tex_coords).wy ).xyz;
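
For what it's worth, the table itself could be filled once at load time with something like the following fragment shader, rendered over a fullscreen quad into the float/HILO target (a sketch only; the clamp is my addition for texels outside the unit disc):

void main()
{
    vec2 xy = gl_TexCoord[0].st * 2.0 - 1.0;        // expand [0,1] to signed [-1,1]
    float z = sqrt(max(0.0, 1.0 - dot(xy, xy)));    // third component of the renormalized normal
    gl_FragColor = vec4(xy, z, 1.0);                // stored at the render target's full precision
}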

How fast is that method though? While it could be compact in memory, I would expect it to be slower than uncompressed textures, unless you'd otherwise overflow video memory. You're doing a dependent texture read that scatters access all over the lookup table. Also, the DXT5 texture interpolates at 8 bits, meaning the highest useful dimension of the lookup table would be 256x256. Using lookups for math that can be expressed in a few instructions is generally a bad idea.

Anyway, the regular DXT5 method is a good one. It can also easily be upgraded to 3Dc, without changing the shader, for a bit better quality on cards that support it.

Humus is absolutely right.
Of course, there is some memory overhead.
But why is it slower?
A dependent read versus [MAD (expand xy to signed)] + [DOT + SUB + RCP + RSQ].

Thanks guys. I checked out DXT5 again and apparently it encodes differently than I thought: I assumed the alpha channel encoded worse than RGB, but it turns out it's actually higher quality.
Coupled with the fact that the green channel encodes with better quality than red or blue, I can stick the X and Y values in those two channels.
I did a test and visually the results are pretty good, and since I'm leaving the blue channel untouched (i.e. I'm not calculating the z values in the shader, since the extra visual quality isn't worth the speed hit) there's no performance change.
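
In shader terms that variant is just a swizzle with no reconstruction, along these lines (a sketch of the approach described above; names are made up):

uniform sampler2D tex_normal;   // DXT5 map with X in alpha, Y in green, Z left in blue
uniform vec3 light_dir;

void main()
{
    // read all three components straight from the swizzled channels, no sqrt needed
    vec3 n = texture2D(tex_normal, gl_TexCoord[0].st).wyz * 2.0 - 1.0;
    gl_FragColor = vec4(vec3(max(0.0, dot(n, light_dir))), 1.0);
}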

But why is it slower?
Because it’s random-accessing a texture. The texture cache functions most optimally when doing normal sequential reads of a texture. The kind where you’re just interpolating texture coordinates across a texture’s surface. But once you start pulling from arbitrary locations, then all bets are off in terms of the efficiency of the texture cache. And a texture cache miss isn’t terribly cheap.

Originally posted by fenris:
I’ve gotten acceptable results by using DXT5 compression. Put the x component in the alpha channel, and recalculate z in a pixel shader with sqrt(1 - dot(xy, xy)). Works everywhere with good compression.
I’ve forgotten how DXT5 works. Why do you put x into the alpha?

Because color and alpha are encoded separately. The problem with DXT color is that you only store two values and compute two intermediate values by interpolating them. This works fine with colors, but not with vectors: it limits the vectors in a block to only four directions, on a line! By encoding X in alpha and Y in green, you get two independent components, for a total of 4x8 = 32 stored directions, which cover an area rather than a line. With 3Dc it's the same quality for both X and Y, so it's 8x8 directions, with 8-bit endpoints instead of 6-bit for the component that was in green.

Another thing: does it matter, since I'm also storing the z direction as well (for speed reasons)?
Does it make a difference if I stick it in the red or the blue channel? I assume it does not.

Another thing: does it matter, since I'm also storing the z direction as well (for speed reasons)?
This is one of those "memory/performance" tradeoffs. If you want to save texture memory, you have to give up performance; that is, you're going to need to compute the Z value in the shader. If you want the performance, you have to give up memory and store the Z value, in which case you need at least a 32-bit texture, as DXT5 will destroy your vectors.

Whilst that's true, that's not really what I meant.
I'll explain better:
I want to store the normal's x, y, z in a DXT5 texture.
OK, so I store the x in alpha and the y in green.
But for the best quality, where should I store the z: in the red or in the blue channel?

Further info: I just did a test and it does make a difference whether you store the z in red or blue; the question still remains whether one of the two is better for normals.
Also, does anyone have any info on how the DXT texture formats are constructed? All I could find was some limited info on MSDN.
cheers zed

Originally posted by zed:
[b]Whilst that's true, that's not really what I meant.
I'll explain better:
I want to store the normal's x, y, z in a DXT5 texture.
OK, so I store the x in alpha and the y in green.
But for the best quality, where should I store the z: in the red or in the blue channel?

Further info: I just did a test and it does make a difference whether you store the z in red or blue; the question still remains whether one of the two is better for normals.
Also, does anyone have any info on how the DXT texture formats are constructed? All I could find was some limited info on MSDN.
Cheers, zed[/b]
Storing z as well in DXT5 is as bad as storing xy in red and green, rather than green and alpha. You’ll make y and z directly dependent on each other. There shouldn’t be a difference between using red and blue, technically anyway, but I guess the encoder may weight red more than blue, which could explain the difference.

The full format spec is in the texture_compression_s3tc spec:
http://oss.sgi.com/projects/ogl-sample/registry/EXT/texture_compression_s3tc.txt

Storing z as well in DXT5 is as bad as storing xy in red and green, rather than green and alpha. You’ll make y and z directly dependent on each other.
I made a few more tests:
normal map uncompressed, RGB = xyz (result with dot against (0,1,0)):
http://uk.geocities.com/sloppyturds/uncompressed.png
normal map DXT5, RGB = xyz (result with dot against (0,1,0)):
http://uk.geocities.com/sloppyturds/DXT5_RGB.png
normal map DXT5, RGA = zyx (result with dot against (0,1,0)):
http://uk.geocities.com/sloppyturds/DXT5_RGA_XXX.png
normal map DXT5, RGA = zyx (result with dot against (0,1.4,0)):
http://uk.geocities.com/sloppyturds/DXT5_RGA.png

It looks like it's possible to get very close to uncompressed quality (with a bit of mucking around) without incurring the speed loss of calculating the z in the shader.

It looks like it's possible to get very close to uncompressed quality
In terms of absolute values, there’s a pretty significant difference between the results and the input. The fact that, in order to make the results look even somewhat similar, you had to use an arbitrary value (dotting with 1.4), shows that this is not a good technique for good quality graphics.

In terms of absolute values, there’s a pretty significant difference between the results and the input. The fact that, in order to make the results look even somewhat similar, you had to use an arbitrary value (dotting with 1.4), shows that this is not a good technique for good quality graphics.
The best way is just to alter the normal maps when you create them, like you do anyway (e.g. when you create a normal map there's usually a height-scaling input value; just make that higher).

You have to admit this:
http://uk.geocities.com/sloppyturds/DXT5_RGB.png
looks far better than this:
http://uk.geocities.com/sloppyturds/DXT5_RGA.png
yet both are just as expensive in shader terms and use the same amount of memory.

You have to admit this:
http://uk.geocities.com/sloppyturds/DXT5_RGB.png
looks far better than this:
http://uk.geocities.com/sloppyturds/DXT5_RGA.png
yet both are just as expensive in shader terms and use the same amount of memory.
Whether one "looks better" than the other is irrelevant; they're both wrong. Neither one resembles the original normal map enough to be considered a viable compression technique.

So, I guess, there are three choices. You can either use 32-bit textures and trade memory for performance, use the two-component DXT5 compression and trade performance for memory, or use the three-component DXT5 compression and get memory and performance at the (significant) cost of accuracy. I consider accuracy to be of sufficient importance to justify it as a priority in this scenario.

And remember, your visual test is only sampling one of the vector's components (via the dot product with (0,1,0)). What you should really do is display the normal map as an RGB color field (x = Red, y = Green, z = Blue) and compare what the various compression techniques do to it.
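
A minimal GLSL sketch of that kind of comparison (the texture names are made up, and the error is scaled by an arbitrary factor so it is visible at all):

uniform sampler2D tex_reference;    // uncompressed normal map, RGB = xyz
uniform sampler2D tex_compressed;   // whichever compressed variant is being tested
                                    // (decode swizzled variants back to xyz before this comparison)

void main()
{
    vec3 a = texture2D(tex_reference,  gl_TexCoord[0].st).xyz * 2.0 - 1.0;
    vec3 b = texture2D(tex_compressed, gl_TexCoord[0].st).xyz * 2.0 - 1.0;
    // show either the raw color field (a * 0.5 + 0.5) or, as here, the amplified error
    gl_FragColor = vec4(abs(a - b) * 8.0, 1.0);
}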

Does anyone use 3Dc compression for normal maps? It looks pretty good, but I don't know if my NVIDIA 6800 supports it or not.
From looking at the different compression methods, it seems there is no clean way to compress a normal map that also contains a height value for parallax mapping. I think I could use two 3Dc textures, one containing, say, XY and the second containing Z and the height.
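
A rough GLSL sketch of that two-texture idea (the channel layout and the parallax scale/bias are my assumptions, not anything confirmed above; the two-channel textures are assumed to return their first channel in the rgb components and their second in alpha):

uniform sampler2D tex_xy;      // two-channel 3Dc texture: Y in the first channel, X in the second
uniform sampler2D tex_zh;      // second two-channel texture: Z in the first channel, height in the second
uniform vec3 light_dir;
varying vec3 eye_dir_ts;       // tangent-space view vector from the vertex shader

void main()
{
    vec2 uv = gl_TexCoord[0].st;

    // cheap parallax offset using the stored height (scale and bias picked arbitrarily)
    float height = texture2D(tex_zh, uv).w;
    uv += normalize(eye_dir_ts).xy * (height * 0.04 - 0.02);

    vec3 n;
    n.xy = texture2D(tex_xy, uv).wy * 2.0 - 1.0;   // same .wy swizzle as the DXT5 trick above
    n.z  = texture2D(tex_zh, uv).x  * 2.0 - 1.0;   // stored z; could instead be rebuilt from xy

    gl_FragColor = vec4(vec3(max(0.0, dot(normalize(n), light_dir))), 1.0);
}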