Name

    KHR_texture_compression_astc_ldr

Name Strings

    GL_KHR_texture_compression_astc_ldr

Contact

    Sean Ellis (sean.ellis 'at' arm.com)

Contributors

    Sean Ellis, ARM
    Jorn Nystad, ARM
    Tom Olson, ARM
    Cass Everitt, NVIDIA
    Walter Donovan, NVIDIA
    Robert Simpson, Qualcomm
    Maurice Ribble, Qualcomm
    Larry Seiler, Intel
    Daniel Koch, Transgaming

IP Status

    No known issues.

Status

    Complete.
    Approved by the ARB on 2012/06/18.
    Approved by the OpenGL ES WG on 2012/06/15.
    Ratified by the Khronos Board of Promoters on 2012/07/27.

Version

    Last Modified Date: July 27, 2012

Number

    ARB Extension #118
    OpenGL ES Extension #117

Dependencies

    Written based on the wording of the OpenGL ES 3.0 specification.

Overview

    Adaptive Scalable Texture Compression (ASTC) is a new texture
    compression technology that offers unprecedented flexibility, while
    producing results better than or comparable to existing texture
    compression schemes at all bit rates. It includes support for 2D and
    3D textures, with low and high dynamic range, at bitrates from below
    1 bit/pixel up to 8 bits/pixel in fine steps.

    The goal of this extension is to support the 2D, LDR-only profile of
    the ASTC texture compression specification.

    ASTC-compressed textures are handled in OpenGL ES and OpenGL by
    adding new supported formats to the existing mechanisms for handling
    compressed textures.

Issues

    The HDR and 3D parts of the full ASTC specification are still
    undergoing detailed evaluation. The encoding space required to
    implement these parts of the full ASTC specification is marked as
    reserved.

Interactions with Other Extensions

    Will interact with EXT_texture_storage.

Interactions with OpenGL 4.2

    OpenGL 4.2 supports the feature that compressed textures can be
    compressed online, by passing the compressed texture format enum as
    the internal format when uploading a texture using TexImage1D,
    TexImage2D or TexImage3D (see Section 3.9.3, Texture Image
    Specification, subsection Encoding of Special Internal Formats).
    Due to the complexity of the ASTC compression algorithm, it is not
    usually suitable for online use, and therefore ASTC support will be
    limited to pre-compressed textures only. Where on-device compression
    is required, a domain-specific limited compressor will typically be
    used, and this is therefore not suitable for implementation in the
    driver.

    In particular, the ASTC format specifiers will not be added to
    Table 3.14, and thus will not be accepted by the TexImage*D
    functions, and will not be returned by the (already deprecated)
    COMPRESSED_TEXTURE_FORMATS query.

New Procedures and Functions

    None

New Tokens

    Accepted by the internalformat parameter of CompressedTexImage2D,
    CompressedTexSubImage2D, TexStorage2D, TextureStorage2D,
    TexStorage3D, and TextureStorage3D:

    COMPRESSED_RGBA_ASTC_4x4_KHR            0x93B0
    COMPRESSED_RGBA_ASTC_5x4_KHR            0x93B1
    COMPRESSED_RGBA_ASTC_5x5_KHR            0x93B2
    COMPRESSED_RGBA_ASTC_6x5_KHR            0x93B3
    COMPRESSED_RGBA_ASTC_6x6_KHR            0x93B4
    COMPRESSED_RGBA_ASTC_8x5_KHR            0x93B5
    COMPRESSED_RGBA_ASTC_8x6_KHR            0x93B6
    COMPRESSED_RGBA_ASTC_8x8_KHR            0x93B7
    COMPRESSED_RGBA_ASTC_10x5_KHR           0x93B8
    COMPRESSED_RGBA_ASTC_10x6_KHR           0x93B9
    COMPRESSED_RGBA_ASTC_10x8_KHR           0x93BA
    COMPRESSED_RGBA_ASTC_10x10_KHR          0x93BB
    COMPRESSED_RGBA_ASTC_12x10_KHR          0x93BC
    COMPRESSED_RGBA_ASTC_12x12_KHR          0x93BD

    COMPRESSED_SRGB8_ALPHA8_ASTC_4x4_KHR    0x93D0
    COMPRESSED_SRGB8_ALPHA8_ASTC_5x4_KHR    0x93D1
    COMPRESSED_SRGB8_ALPHA8_ASTC_5x5_KHR    0x93D2
    COMPRESSED_SRGB8_ALPHA8_ASTC_6x5_KHR    0x93D3
    COMPRESSED_SRGB8_ALPHA8_ASTC_6x6_KHR    0x93D4
    COMPRESSED_SRGB8_ALPHA8_ASTC_8x5_KHR    0x93D5
    COMPRESSED_SRGB8_ALPHA8_ASTC_8x6_KHR    0x93D6
    COMPRESSED_SRGB8_ALPHA8_ASTC_8x8_KHR    0x93D7
    COMPRESSED_SRGB8_ALPHA8_ASTC_10x5_KHR   0x93D8
    COMPRESSED_SRGB8_ALPHA8_ASTC_10x6_KHR   0x93D9
    COMPRESSED_SRGB8_ALPHA8_ASTC_10x8_KHR   0x93DA
    COMPRESSED_SRGB8_ALPHA8_ASTC_10x10_KHR  0x93DB
    COMPRESSED_SRGB8_ALPHA8_ASTC_12x10_KHR  0x93DC
    COMPRESSED_SRGB8_ALPHA8_ASTC_12x12_KHR  0x93DD

    If extension "EXT_texture_storage" is supported, these tokens are
    also accepted by TexStorage2DEXT,
TextureStorage2DEXT, TexStorage3DEXT and TextureStorage3DEXT. Additions to Chapter 2 of the OpenGL ES 3.0 Specification (OpenGL ES Operation) None Additions to Chapter 3 of the OpenGL ES 3.0 Specification (Rasterization) Added to Section 3.8.6, Compressed Texture Images Add the tokens specified above to Table 3.16, Compressed Internal Formats. In all cases, the base internal format will be RGBA. The encoding allows images to be encoded with fewer channels, but this is always presented as RGBA to the sampler. After the paragraph discussing ETC2/EAC formats, add: "If internalformat is one of the ASTC formats described in table 3.16, the compressed image data is stored using one of the ASTC compressed texture image encodings (see appendix C). The ASTC LDR texture compression algorithm supports only two-dimensional images. If internalformat is a 2D ASTC format, CompressedTexImage3D will generate an INVALID_OPERATION error if target is not TEXTURE_2D_ARRAY." At the end of the section, add: "If internalformat is one of the ASTC formats described in table 3.16, the texture is stored using one of the ASTC compressed texture image encodings (see appendix C). If internalformat is a 2D ASTC format, CompressedTexSubImage3D will generate an INVALID_OPERATION error if target is not TEXTURE_2D_ARRAY. Since ASTC images are easily edited along block footprint boundaries, the limitations on subimage location and size are as follows for CompressedTexSubImage2D: These commands will result in an INVALID_OPERATION error if one of the following conditions occurs: * width is not a multiple of the block width, and width + xoffset is not equal to the width of the texture level. * height is not a multiple of block height, and height+yoffset is not equal to the height of the texture level. * xoffset or yoffset is not a multiple of the corresponding block dimension. 
    The contents of any block of texels of an ASTC compressed texture
    image that does not intersect the area being modified are preserved
    during valid CompressedTexSubImage* calls.

    The block width and height for each ASTC format are determined
    according to Table 3.17:

    ------------------------------------------------------
                                                 Block
    Compressed Internal Format               Width  Height
    ------------------------------------------------------
    COMPRESSED_RGBA_ASTC_4x4_KHR               4      4
    COMPRESSED_RGBA_ASTC_5x4_KHR               5      4
    COMPRESSED_RGBA_ASTC_5x5_KHR               5      5
    COMPRESSED_RGBA_ASTC_6x5_KHR               6      5
    COMPRESSED_RGBA_ASTC_6x6_KHR               6      6
    COMPRESSED_RGBA_ASTC_8x5_KHR               8      5
    COMPRESSED_RGBA_ASTC_8x6_KHR               8      6
    COMPRESSED_RGBA_ASTC_8x8_KHR               8      8
    COMPRESSED_RGBA_ASTC_10x5_KHR              10     5
    COMPRESSED_RGBA_ASTC_10x6_KHR              10     6
    COMPRESSED_RGBA_ASTC_10x8_KHR              10     8
    COMPRESSED_RGBA_ASTC_10x10_KHR             10     10
    COMPRESSED_RGBA_ASTC_12x10_KHR             12     10
    COMPRESSED_RGBA_ASTC_12x12_KHR             12     12
    COMPRESSED_SRGB8_ALPHA8_ASTC_4x4_KHR       4      4
    COMPRESSED_SRGB8_ALPHA8_ASTC_5x4_KHR       5      4
    COMPRESSED_SRGB8_ALPHA8_ASTC_5x5_KHR       5      5
    COMPRESSED_SRGB8_ALPHA8_ASTC_6x5_KHR       6      5
    COMPRESSED_SRGB8_ALPHA8_ASTC_6x6_KHR       6      6
    COMPRESSED_SRGB8_ALPHA8_ASTC_8x5_KHR       8      5
    COMPRESSED_SRGB8_ALPHA8_ASTC_8x6_KHR       8      6
    COMPRESSED_SRGB8_ALPHA8_ASTC_8x8_KHR       8      8
    COMPRESSED_SRGB8_ALPHA8_ASTC_10x5_KHR      10     5
    COMPRESSED_SRGB8_ALPHA8_ASTC_10x6_KHR      10     6
    COMPRESSED_SRGB8_ALPHA8_ASTC_10x8_KHR      10     8
    COMPRESSED_SRGB8_ALPHA8_ASTC_10x10_KHR     10     10
    COMPRESSED_SRGB8_ALPHA8_ASTC_12x10_KHR     12     10
    COMPRESSED_SRGB8_ALPHA8_ASTC_12x12_KHR     12     12
    ------------------------------------------------------
    Table 3.17: Compressed ASTC Format Block Sizes"

    Added to Section 3.8.15:

    The list of converted internal formats at the start of this section
    must be expanded to include all of the
    COMPRESSED_SRGB8_ALPHA8_ASTC_*_KHR formats.
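As an illustration of the CompressedTexSubImage2D restrictions added to Section 3.8.6 above, the three INVALID_OPERATION conditions can be sketched as a single predicate. This is a minimal sketch, not driver code: the function name astc_subimage_ok and its parameter names are hypothetical, and the block dimensions are assumed to have been looked up from Table 3.17 by the caller.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper (not part of the GL API). Returns true when a
 * CompressedTexSubImage2D update of an ASTC level would NOT trigger any of
 * the INVALID_OPERATION conditions listed for Section 3.8.6. */
static bool astc_subimage_ok(int xoffset, int yoffset, int width, int height,
                             int level_w, int level_h,
                             int block_w, int block_h)
{
    assert(block_w > 0 && block_h > 0);

    /* Offsets must sit on a block boundary. */
    if (xoffset % block_w != 0 || yoffset % block_h != 0)
        return false;

    /* A partial block is only allowed where it touches the level's edge. */
    if (width % block_w != 0 && xoffset + width != level_w)
        return false;
    if (height % block_h != 0 && yoffset + height != level_h)
        return false;

    return true;
}
```

For a 20x13 level in an 8x6 format, an update at offset (16, 12) of size 4x1 passes because both partial blocks end exactly at the level edge, whereas an update at offset (4, 0) fails the alignment condition.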
Additions to Chapter 4 of the OpenGL ES 3.0 Specification (Per-Fragment
Operations and the Framebuffer)

    None

Additions to Chapter 5 of the OpenGL ES 3.0 Specification (Special
Functions)

    None

Additions to Chapter 6 of the OpenGL ES 3.0 Specification (State and
State Requests)

    None

Additions to Appendix A of the OpenGL ES 3.0 Specification (Invariance)

    None

Additions to Appendix B of the OpenGL ES 3.0 Specification (Corollaries)

    None

Additions to Appendix C of the OpenGL ES 3.0 Specification (Compressed
Texture Image Formats)

    Add a new sub-section on ASTC image formats, as follows:

    C.2 ASTC Compressed Texture Image Formats
    =========================================

    C.2.1 What is ASTC?
    ---------------------

    ASTC stands for Adaptive Scalable Texture Compression. The ASTC
    formats form a family of related compressed texture image formats.
    They are all derived from a common set of definitions.

    ASTC textures may be either 2D or 3D.

    ASTC textures may be encoded using either high or low dynamic range.
    Low dynamic range images may optionally be specified using the sRGB
    color space.

    A sub-profile ("LDR Profile") is defined, which supports only 2D
    images at low dynamic range. This is the profile supported by this
    extension. Support for this profile is indicated by the presence of
    the extension string "GL_KHR_texture_compression_astc_ldr". If, in
    future, the full profile is supported,
    "GL_KHR_texture_compression_astc_ldr" must still be published, in
    order to ensure backward compatibility.

    ASTC textures may be encoded as 1, 2, 3 or 4 components, but they
    are all decoded into RGBA.

    ASTC has a variable block size, and this is specified as part of the
    name of the token passed to CompressedTexImage2D and its related
    functions.

    C.2.2 Design Goals
    --------------------

    The design goals for the format are as follows:

    * Random access. This is a must for any texture compression format.
    * Bit exact decode. This is a must for conformance testing and
      reproducibility.
    * Suitable for mobile use.
The format should be suitable for both desktop and mobile GPU environments. It should be low bandwidth and low in area. * Flexible choice of bit rate. Current formats only offer a few bit rates, leaving content developers with only coarse control over the size/quality tradeoff. * Scalable and long-lived. The format should support existing R, RG, RGB and RGBA image types, and also have high "headroom", allowing continuing use for several years and the ability to innovate in encoders. Part of this is the choice to include HDR and 3D in the full profile. * Feature orthogonality. The choices for the various features of the format are all orthogonal to each other. This has three effects: first, it allows a large, flexible configuration space; second, it makes that space easier to understand; and third, it makes verification easier. * Best in class at given bit rate. It should beat or match the current best in class for peak signal-to-noise ratio (PSNR) at all bit rates. * Fast decode. Texel throughput for a cached texture should be one texel decode per clock cycle per decoder. Parallel decoding of several texels from the same block should be possible at incremental cost. * Low bandwidth. The encoding scheme should ensure that memory access is kept to a minimum, cache reuse is high and memory bandwidth for the format is low. * Low area. It must occupy comparable die size to competing formats. C.2.3 Basic Concepts ---------------------- ASTC is a block-based lossy compression format. The compressed image is divided into a number of blocks of uniform size, which makes it possible to quickly determine which block a given texel resides in. Each block has a fixed memory footprint of 128 bits, but these bits can represent varying numbers of texels (the block "footprint"). Block footprint sizes are not confined to powers-of-two, and are also not confined to be square. The 2D formats have block dimensions ranging from 4 to 12 texels. 
Decoding one texel requires only the data from a single block. This simplifies cache design, reduces bandwidth and improves encoder throughput. C.2.4 Block Encoding ---------------------- To understand how the blocks are stored and decoded, it is useful to start with a simple example, and then introduce additional features. The simplest block encoding starts by defining two color "endpoints". The endpoints define two colors, and a number of additional colors are generated by interpolating between them. We can define these colors using 1, 2, 3, or 4 components (usually corresponding to R, RG, RGB and RGBA textures), and using low or high dynamic range. We then store a color interpolant weight for each texel in the image, which specifies how to calculate the color to use. From this, a weighted average of the two endpoint colors is used to generate the intermediate color, which is the returned color for this texel. There are several different ways of specifying the endpoint colors, and the weights, but once they have been defined, calculation of the texel colors proceeds identically for all of them. Each block is free to choose whichever encoding scheme best represents its color endpoints, within the constraint that all the data fits within the 128 bit block. For blocks which have a large number of texels (e.g. a 12x12 block), there is not enough space to explicitly store a weight for every texel. In this case, a sparser grid with fewer weights is stored, and interpolation is used to determine the effective weight to be used for each texel position. This allows very low bit rates to be used with acceptable quality. This can also be used to more efficiently encode blocks with low detail, or with strong vertical or horizontal features. For blocks which have a mixture of disparate colors, a single line in the color space is not a good fit to the colors of the pixels in the original image. 
It is therefore possible to partition the texels into multiple sets, the pixels within each set having similar colors. For each of these "partitions", we specify separate endpoint pairs, and choose which pair of endpoints to use for a particular texel by looking up the partition index from a partitioning pattern table. In ASTC, this partition table is actually implemented as a function. The endpoint encoding for each partition is independent. For blocks which have uncorrelated channels - for example an image with a transparency mask, or an image used as a normal map - it may be necessary to specify two weights for each texel. Interpolation between the components of the endpoint colors can then proceed independently for each "plane" of the image. The assignment of channels to planes is selectable. Since each of the above options is independent, it is possible to specify any combination of channels, endpoint color encoding, weight encoding, interpolation, multiple partitions and single or dual planes. Since these values are specified per block, it is important that they are represented with the minimum possible number of bits. As a result, these values are packed together in ways which can be difficult to read, but which are nevertheless highly amenable to hardware decode. All of the values used as weights and color endpoint values can be specified with a variable number of bits. The encoding scheme used allows a fine- grained tradeoff between weight bits and color endpoint bits using "integer sequence encoding". This can pack adjacent values together, allowing us to use fractional numbers of bits per value. Finally, a block may be just a single color. This is a so-called "void extent block" and has a special coding which also allows it to identify nearby regions of single color. This may be used to short-circuit fetching of what would be identical blocks, and further reduce memory bandwidth. 
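The weighted average between the two endpoints described in this section can be sketched as follows. This is a simplified illustration only: it assumes 8-bit endpoint channels and a weight that has already been unquantized to a 0..64 scale, and it does not reproduce the exact precision or rounding rules of the normative decode. The type rgba8 and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint8_t c[4]; } rgba8;   /* illustrative type: R, G, B, A */

/* Weighted average of one channel: w = 0 selects e0, w = 64 selects e1.
 * The 0..64 weight scale is an assumption made for this sketch. */
static uint8_t lerp_channel(uint8_t e0, uint8_t e1, unsigned w)
{
    assert(w <= 64);
    return (uint8_t)((e0 * (64u - w) + e1 * w + 32u) >> 6);
}

/* Interpolate a texel from one endpoint pair.
 * plane2_channel selects the channel (0..3) driven by the second weight w1
 * in dual-plane mode; pass -1 for a single-plane block, where w1 is ignored. */
static rgba8 interpolate_texel(rgba8 e0, rgba8 e1,
                               unsigned w0, unsigned w1, int plane2_channel)
{
    rgba8 out;
    for (int ch = 0; ch < 4; ch++) {
        unsigned w = (ch == plane2_channel) ? w1 : w0;
        out.c[ch] = lerp_channel(e0.c[ch], e1.c[ch], w);
    }
    return out;
}
```

With a partitioned block, the caller would first pick the endpoint pair for the texel's partition and then call interpolate_texel with that pair. The dual-plane case differs only in which weight each channel uses.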
    C.2.5 Returned Values
    -----------------------

    The decoding process for LDR content produces output with the
    following properties.

    ------------------------------------------------
    Operation                 LDR Mode
    ------------------------------------------------
    Returned value            Vector of FP16 values,
                              or Vector of UNORM8 values.
    sRGB compatible           Yes
    LDR endpoint              16 bits, or
      decoding precision      8 bits for sRGB
    HDR endpoint mode         Error color
      results
    Error results             Error color
    ------------------------------------------------
    Table C.2.1 - LDR Profile Properties

    The error color for sRGB decode is opaque fully-saturated magenta
    (R,G,B,A = 0xFF, 0x00, 0xFF, 0xFF). This has been chosen as it is
    much more noticeable than black or white, and occurs far less often
    in valid images.

    For linear RGB decode, the error color may be either opaque
    fully-saturated magenta (R,G,B,A = 1.0, 0.0, 1.0, 1.0) or a vector
    of four NaNs (R,G,B,A = NaN, NaN, NaN, NaN). In the latter case, the
    recommended NaN value returned is 0xFFFF.

    The error color is returned as an informative response to invalid
    conditions, including invalid block encodings or use of reserved
    endpoint modes.

    Future, forward-compatible extensions to
    KHR_texture_compression_astc_ldr may define valid interpretations of
    these conditions, which will decode to some other color. Therefore,
    encoders and applications must not rely on invalid encodings as a
    way of generating the error color.
    C.2.6 Configuration Summary
    -----------------------------

    The global configuration data for the format is as follows:

    * Block dimension (always 2D for LDR profile)
    * Block footprint size
    * Dynamic range (HDR or LDR)
    * sRGB output enabled or not

    The data specified per block is as follows:

    * Texel weight grid size
    * Texel weight range
    * Texel weight values
    * Number of partitions
    * Partition pattern index
    * Color endpoint modes
    * Color endpoint data
    * Number of planes
    * Plane-to-channel assignment

    C.2.7 Decode Procedure
    ------------------------

    To decode one texel:

    (Optimization: If within known void-extent, immediately return
     single color)

    Find block containing texel
    Read block mode
    If void-extent block, store void extent and immediately return
      single color

    For each plane in image
      If block mode requires infill
        Find and decode stored weights adjacent to texel, unquantize and
            interpolate
      Else
        Find and decode weight for texel, and unquantize

    Read number of partitions
    If number of partitions > 1
      Read partition table pattern index
      Look up partition number from pattern

    Read color endpoint mode and endpoint data for selected partition
    Unquantize color endpoints
    Interpolate color endpoints using weight (or weights in dual-plane
      mode)
    Return interpolated color

    C.2.8 Block Determination and Bit Rates
    -----------------------------------------

    The block footprint is a global setting for any given texture, and
    is therefore not encoded in the individual blocks.

    For 2D textures, the block footprint's width and height are
    selectable from a number of predefined sizes, namely 4, 5, 6, 8, 10
    and 12 pixels.
    For square and nearly-square blocks, this gives the following bit
    rates:

    -------------------------------------
       Footprint
    Width   Height   Bit Rate   Increment
    -------------------------------------
    4       4        8.00       125%
    5       4        6.40       125%
    5       5        5.12       120%
    6       5        4.27       120%
    6       6        3.56       114%
    8       5        3.20       120%
    8       6        2.67       105%
    10      5        2.56       120%
    10      6        2.13       107%
    8       8        2.00       125%
    10      8        1.60       125%
    10      10       1.28       120%
    12      10       1.07       120%
    12      12       0.89
    -------------------------------------
    Table C.2.2 - 2D Footprint and Bit Rates

    The block footprint is shown as width x height in the format
    enumerator, so for example the enumerator
    COMPRESSED_RGBA_ASTC_8x6_KHR specifies an image with a block width
    of 8 texels, and a block height of 6 texels.

    The "Increment" column indicates the ratio of bit rate against the
    next lower available rate. A consistent value in this column
    indicates an even spread of bit rates.

    For images which are not an integer multiple of the block size,
    additional texels are added to the edges with maximum X and Y. These
    texels may be any color, as they will not be accessed.

    Although these are not all powers of two, it is possible to
    calculate block addresses and pixel addresses within the block, for
    legal image sizes, without undue complexity.

    Given an image which is W x H pixels in size, with block size w x h,
    the size of the image in blocks is:

        Bw = ceiling(W/w)
        Bh = ceiling(H/h)

    C.2.9 Block Layout
    --------------------

    Each block in the image is stored as a single 128-bit block in
    memory. These blocks are laid out in raster order, starting with the
    block at (0,0,0), then ordered sequentially by X, Y and finally Z
    (if present). They are aligned to 128-bit boundaries in memory.

    The bits in the block are labeled in little-endian order - the byte
    at the lowest address contains bits 0..7. Bit 0 is the least
    significant bit in the byte.
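The block-count formulas and the little-endian bit labeling described above can be sketched in C as follows. This is an illustrative sketch for the 2D case only; all function names are hypothetical and are not part of any API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ceiling(a / b) for positive integers. */
static size_t ceil_div(size_t a, size_t b)
{
    assert(b > 0);
    return (a + b - 1) / b;
}

/* Bytes occupied by one W x H image level with a w x h block footprint:
 * Bw * Bh blocks of 128 bits (16 bytes) each. */
static size_t astc_level_bytes(size_t W, size_t H, size_t w, size_t h)
{
    return ceil_div(W, w) * ceil_div(H, h) * 16;
}

/* Byte offset of the block containing texel (x, y), with blocks stored in
 * raster order (X first, then Y). */
static size_t astc_block_offset(size_t x, size_t y, size_t W,
                                size_t w, size_t h)
{
    return ((y / h) * ceil_div(W, w) + (x / w)) * 16;
}

/* Read `count` bits (1..32) starting at block bit `first`. Block bit 0 is
 * the least significant bit of the byte at the lowest address. */
static uint32_t astc_get_bits(const uint8_t block[16],
                              unsigned first, unsigned count)
{
    assert(count >= 1 && count <= 32 && first + count <= 128);
    uint32_t v = 0;
    for (unsigned i = 0; i < count; i++) {
        unsigned bit = first + i;
        v |= (uint32_t)((block[bit >> 3] >> (bit & 7)) & 1u) << i;
    }
    return v;
}
```

For example, a 20x13 image in an 8x6 format needs 3 x 3 blocks, and texel (17, 12) lies in the last of them. Reading one bit at a time keeps the sketch easy to check against the bit numbering; a real decoder would normally read whole bytes or words.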
    Each block has the same basic layout:

    127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112
    ---------------------------------------------------------------
    | Texel Weight Data (variable width)          Fill direction ->
    ---------------------------------------------------------------

    111 110 109 108 107 106 105 104 103 102 101 100 99  98  97  96
    ---------------------------------------------------------------
      Texel Weight Data
    ---------------------------------------------------------------

    95  94  93  92  91  90  89  88  87  86  85  84  83  82  81  80
    ---------------------------------------------------------------
      Texel Weight Data
    ---------------------------------------------------------------

    79  78  77  76  75  74  73  72  71  70  69  68  67  66  65  64
    ---------------------------------------------------------------
      Texel Weight Data
    ---------------------------------------------------------------

    63  62  61  60  59  58  57  56  55  54  53  52  51  50  49  48
    ---------------------------------------------------------------
                    :            More config data                 :
    ---------------------------------------------------------------

    47  46  45  44  43  42  41  40  39  38  37  36  35  34  33  32
    ---------------------------------------------------------------
      <-Fill direction                       Color Endpoint Data
    ---------------------------------------------------------------

    31  30  29  28  27  26  25  24  23  22  21  20  19  18  17  16
    ---------------------------------------------------------------
                            :     Extra configuration data
    ---------------------------------------------------------------

    15  14  13  12  11  10  9   8   7   6   5   4   3   2   1   0
    ---------------------------------------------------------------
    Extra      | Part  | Block mode                               |
    ---------------------------------------------------------------
    Table C.2.3 - Block Layout Overview

    Dotted partition lines indicate that the split position is not
    fixed.

    The "Block mode" field specifies how the Texel Weight Data is
    encoded.

    The "Part" field specifies the number of partitions, minus one. If
    dual plane mode is enabled, the number of partitions must be 3 or
    fewer.
    If 4 partitions are specified, the error value is returned for all
    texels in the block.

    The size and layout of the extra configuration data depends on the
    number of partitions, and the number of planes in the image, as
    follows (only the bottom 32 bits are shown):

    31  30  29  28  27  26  25  24  23  22  21  20  19  18  17  16
    ---------------------------------------------------------------
    <- Color endpoint data                                     |CEM
    ---------------------------------------------------------------

    15  14  13  12  11  10  9   8   7   6   5   4   3   2   1   0
    ---------------------------------------------------------------
        CEM    | 0   0 | Block Mode                               |
    ---------------------------------------------------------------
    Table C.2.4 - Single-partition Block Layout

    CEM is the color endpoint mode field, which determines how the Color
    Endpoint Data is encoded.

    If dual-plane mode is active, the color component selector bits
    appear directly below the weight bits.

    31  30  29  28  27  26  25  24  23  22  21  20  19  18  17  16
    ---------------------------------------------------------------
               |          CEM          | Partition Index
    ---------------------------------------------------------------

    15  14  13  12  11  10  9   8   7   6   5   4   3   2   1   0
    ---------------------------------------------------------------
    Partition Index    | Block Mode                               |
    ---------------------------------------------------------------
    Table C.2.5 - Multi-partition Block Layout

    The Partition Index field specifies which partition layout to use.
    CEM is the first 6 bits of color endpoint mode information for the
    various partitions. For modes which require more than 6 bits of CEM
    data, the additional bits appear at a variable position directly
    beneath the texel weight data.

    If dual-plane mode is active, the color component selector bits then
    appear directly below the additional CEM bits.

    The final special case is that if bits [8:0] of the block are
    "111111100", then the block is a void-extent block, which has a
    separate encoding described in section C.2.22.
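The fixed-position fields at the bottom of the block can be pulled apart with a few shifts and masks. The sketch below assumes the low 32 bits of the block have already been assembled into a little-endian value (block bit 0 in bit 0), takes "Block mode" as bits [10:0] (the width shown in Table C.2.7) and "Part" as bits [12:11], and reads the single-partition CEM from bits [16:13] as laid out in Table C.2.4. All function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bits [8:0] == 111111100 marks a void-extent block. */
static bool astc_is_void_extent(uint32_t low32)
{
    return (low32 & 0x1FFu) == 0x1FCu;
}

/* "Block mode" field, bits [10:0]. */
static unsigned astc_block_mode(uint32_t low32)
{
    return low32 & 0x7FFu;
}

/* "Part" field, bits [12:11], stores the partition count minus one. */
static unsigned astc_partition_count(uint32_t low32)
{
    return ((low32 >> 11) & 3u) + 1u;
}

/* 4-bit color endpoint mode of a single-partition block, bits [16:13]. */
static unsigned astc_single_partition_cem(uint32_t low32)
{
    assert(astc_partition_count(low32) == 1);
    return (low32 >> 13) & 0xFu;
}

/* Two-bit CEM selector of a multi-partition block, bits [24:23]. */
static unsigned astc_multi_partition_cem_selector(uint32_t low32)
{
    assert(astc_partition_count(low32) > 1);
    return (low32 >> 23) & 3u;
}
```

A decoder would test astc_is_void_extent first, since the remaining fields only have these meanings for ordinary blocks.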
    C.2.10 Block Mode
    ------------------

    The Block Mode field specifies the width, height and depth of the
    grid of weights, what range of values they use, and whether dual
    weight planes are present. Since some of these are not represented
    using powers of two (there are 12 possible weight widths, for
    example), and not all combinations are allowed, this is not a simple
    bit packing. However, it can be unpacked quickly in hardware.

    The weight ranges are encoded using a 3 bit value R, which is
    interpreted together with a precision bit H, as follows:

          Low Precision Range (H=0)          High Precision Range (H=1)
    R     Weight Range  Trits  Quints  Bits  Weight Range  Trits  Quints  Bits
    -----------------------------------------------------------------------------
    000   Invalid                            Invalid
    001   Invalid                            Invalid
    010   0..1                         1     0..9                 1       1
    011   0..2          1                    0..11         1              2
    100   0..3                         2     0..15                        4
    101   0..4                 1             0..19                1       2
    110   0..5          1              1     0..23         1              3
    111   0..7                         3     0..31                        5
    -----------------------------------------------------------------------------
    Table C.2.6 - Weight Range Encodings

    Each weight value is encoded using the specified number of Trits,
    Quints and Bits. The details of this encoding can be found in
    Section C.2.12 - Integer Sequence Encoding.
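Table C.2.6 maps directly onto a small lookup table. The sketch below assumes the column assignment implied by the ranges themselves: a range of 3 * 2^n values uses one trit plus n bits, a range of 5 * 2^n values uses one quint plus n bits, and a range of 2^n values uses n bits only. The type astc_range and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint8_t trits, quints, bits; } astc_range;

/* Indexed by (H << 3) | R. Entries for R = 000 and R = 001 are invalid
 * encodings and are never read by the helpers below. */
static const astc_range astc_weight_ranges[16] = {
    /* H = 0 */
    {0, 0, 0}, {0, 0, 0},           /* 000, 001: invalid */
    {0, 0, 1}, {1, 0, 0},           /* 0..1,  0..2       */
    {0, 0, 2}, {0, 1, 0},           /* 0..3,  0..4       */
    {1, 0, 1}, {0, 0, 3},           /* 0..5,  0..7       */
    /* H = 1 */
    {0, 0, 0}, {0, 0, 0},           /* 000, 001: invalid */
    {0, 1, 1}, {1, 0, 2},           /* 0..9,  0..11      */
    {0, 0, 4}, {0, 1, 2},           /* 0..15, 0..19      */
    {1, 0, 3}, {0, 0, 5},           /* 0..23, 0..31      */
};

/* Largest weight value for the given H and R, or -1 if R is invalid. */
static int astc_weight_max(unsigned h, unsigned r)
{
    assert(h <= 1 && r <= 7);
    if (r < 2)
        return -1;
    astc_range e = astc_weight_ranges[(h << 3) | r];
    int base = e.trits ? 3 : (e.quints ? 5 : 1);
    return (base << e.bits) - 1;
}
```

Returning -1 for the invalid rows lets the caller route such blocks to the error color path.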
    For 2D blocks, the Block Mode field is laid out as follows:

    -------------------------------------------------------------------------
    10  9   8   7   6   5   4   3   2   1   0   Width Height Notes
    -------------------------------------------------------------------------
    D   H     B       A     R0  0   0   R2  R1  B+4   A+2
    D   H     B       A     R0  0   1   R2  R1  B+8   A+2
    D   H     B       A     R0  1   0   R2  R1  A+2   B+8
    D   H   0   B     A     R0  1   1   R2  R1  A+2   B+6
    D   H   1   B     A     R0  1   1   R2  R1  B+2   A+2
    D   H   0   0     A     R0  R2  R1  0   0   12    A+2
    D   H   0   1     A     R0  R2  R1  0   0   A+2   12
    D   H   1   1   0   0   R0  R2  R1  0   0   6     10
    D   H   1   1   0   1   R0  R2  R1  0   0   10    6
      B     1   0     A     R0  R2  R1  0   0   A+6   B+6    D=0, H=0
    x   x   1   1   1   1   1   1   1   0   0   -     -      Void-extent
    x   x   1   1   1   x   x   x   x   0   0   -     -      Reserved*
    x   x   x   x   x   x   x   0   0   0   0   -     -      Reserved
    -------------------------------------------------------------------------
    Table C.2.7 - 2D Block Mode Layout

    Note that, due to the encoding of the R field described in
    Table C.2.6, bits R2 and R1 cannot both be zero, which disambiguates
    the first five rows from the rest of the table.

    The penultimate row of the table is reserved only if bits [5:2] are
    not all 1. If they are all 1, the block is a void-extent block (as
    shown in the previous row).

    The D bit is set to indicate dual-plane mode. In this mode, the
    maximum allowed number of partitions is 3.

    The size of the grid in each dimension must be less than or equal to
    the corresponding dimension of the block footprint. If the grid size
    is greater than the footprint dimension in any axis, then this is an
    illegal block encoding and all texels will decode to the error
    color.

    C.2.11 Color Endpoint Mode
    ---------------------------

    In single-partition mode, the Color Endpoint Mode (CEM) field stores
    one of 16 possible values. Each of these specifies how many raw data
    values are encoded, and how to convert these raw values into two
    RGBA color endpoints.
    They can be summarized as follows:

    ---------------------------------------------
    CEM   Description                       Class
    ---------------------------------------------
    0     LDR Luminance, direct             0
    1     LDR Luminance, base+offset        0
    2     -- Reserved for HDR --            0
    3     -- Reserved for HDR --            0
    4     LDR Luminance+Alpha, direct       1
    5     LDR Luminance+Alpha, base+offset  1
    6     LDR RGB, base+scale               1
    7     -- Reserved for HDR --            1
    8     LDR RGB, direct                   2
    9     LDR RGB, base+offset              2
    10    LDR RGB, base+scale plus two A    2
    11    -- Reserved for HDR --            2
    12    LDR RGBA, direct                  3
    13    LDR RGBA, base+offset             3
    14    -- Reserved for HDR --            3
    15    -- Reserved for HDR --            3
    ---------------------------------------------
    Table C.2.8 - Color Endpoint Modes

    In multi-partition mode, the CEM field is of variable width, from 6
    to 14 bits. The lowest 2 bits of the CEM field specify how the
    endpoint mode for each partition is calculated:

    ----------------------------------------------------
    Value   Meaning
    ----------------------------------------------------
    00      All color endpoint pairs are of the same
            type. A full 4-bit CEM is stored in block
            bits [28:25] and is used for all partitions.
    01      All endpoint pairs are of class 0 or 1.
    10      All endpoint pairs are of class 1 or 2.
    11      All endpoint pairs are of class 2 or 3.
    ----------------------------------------------------
    Table C.2.9 - Multi-Partition Color Endpoint Modes

    If the CEM selector value in bits [24:23] is not 00, then data
    layout is as follows:

    Part              n    m    l    k    j    i    h    g
    ----------------------------------------------------------
    2    ... Weight :   M1    :                            ...
    ----------------------------------------------------------
    3    ... Weight :   M2    :   M1    : M0 :             ...
    ----------------------------------------------------------
    4    ... Weight :   M3    :   M2    :   M1    :   M0    : ...
    ----------------------------------------------------------

    Part   28   27   26   25   24   23
    ------------------------------------
    2    |   M0    | C1 | C0 |   CEM   |
    ------------------------------------
    3    | M0 | C2 | C1 | C0 |   CEM   |
    ------------------------------------
    4    | C3 | C2 | C1 | C0 |   CEM   |
    ------------------------------------
    Table C.2.10 - Multi-Partition Color Endpoint Modes

    In this view, each partition i has two fields. Ci is the class
    selector bit, choosing between the two possible CEM classes (0
    indicates the lower of the two classes), and Mi is a two-bit field
    specifying the low bits of the color endpoint mode within that
    class. The additional bits appear at a variable bit position,
    immediately below the texel weight data.

    The ranges used for the data values are not explicitly specified.
    Instead, they are derived from the number of available bits
    remaining after the configuration data and weight data have been
    specified.

    Details of the decoding procedure for Color Endpoints can be found
    in section C.2.13.

    C.2.12 Integer Sequence Encoding
    ---------------------------------

    Both the weight data and the endpoint color data are variable width,
    and are specified using a sequence of integer values. The range of
    each value in a sequence (e.g. a color weight) is constrained.

    Since it is often the case that the most efficient range for these
    values is not a power of two, each value sequence is encoded using a
    technique known as "integer sequence encoding". This allows
    efficient, hardware-friendly packing and unpacking of values with
    non-power-of-two ranges.

    In a sequence, each value has an identical range. The range is
    specified in one of the following forms:

    Value range        MSB encoding             LSB encoding
    ---------------------------------------------------------------------
    0 .. 2^n-1         -                        n bit value m (n <= 8)
    0 .. (3 * 2^n)-1   Base-3 "trit" value t    n bit value m (n <= 6)
    0 .. (5 * 2^n)-1   Base-5 "quint" value q   n bit value m (n <= 5)

    Value range        Value          Block   Packed block size
    ---------------------------------------------------------------------
    0 .. 2^n-1         m              1       n
    0 .. (3 * 2^n)-1   t * 2^n + m    5       8 + 5*n
    0 .. (5 * 2^n)-1   q * 2^n + m    3       7 + 3*n

    Table C.2.11 - Encoding for Different Ranges

    Since 3^5 is 243, it is possible to pack five trits into 8 bits
    (which has 256 possible values), so a trit can effectively be
    encoded as 1.6 bits. Similarly, since 5^3 is 125, it is possible to
    pack three quints into 7 bits (which has 128 possible values), so a
    quint can be encoded as 2.33 bits.

    The encoding scheme packs the trits or quints, and then interleaves
    the n additional bits in positions that satisfy the requirements of
    an arbitrary length stream. This makes it possible to correctly
    specify lists of values whose length is not an integer multiple of 3
    or 5 values. It also makes it possible to easily select a value at
    random within the stream.

    If there are insufficient bits in the stream to fill the final
    block, then unused (higher order) bits are assumed to be 0 when
    decoding.

    To decode the bits for value number i in a sequence of bits b, both
    indexed from 0, perform the following:

    If the range is encoded as n bits per value, then the value is bits
    b[i*n+n-1:i*n] - a simple multiplexing operation.

    If the range is encoded using a trit, then each block contains 5
    values (v0 to v4), each of which contains a trit (t0 to t4) and a
    corresponding LSB value (m0 to m4). The first bit of the packed
    block is bit floor(i/5)*(8+5*n).
The bits in the block are packed as follows (in this example, n is 4):

     27  26  25  24  23  22  21  20  19  18  17  16
     -----------------------------------------------
    |T7 |     m4        |T6  T5 |     m3        |T4 |
     -----------------------------------------------

     15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
     ---------------------------------------------------------------
    |    m2         |T3  T2 |      m1       |T1  T0 |      m0       |
     ---------------------------------------------------------------

    Table C.2.12 - Trit-based Packing

The five trits t0 to t4 are obtained by bit manipulations of the 8 bits
T[7:0] as follows:

    if T[4:2] = 111
        C = { T[7:5], T[1:0] }; t4 = t3 = 2
    else
        C = T[4:0]
        if T[6:5] = 11
            t4 = 2; t3 = T[7]
        else
            t4 = T[7]; t3 = T[6:5]

    if C[1:0] = 11
        t2 = 2; t1 = C[4]; t0 = { C[3], C[2]&~C[3] }
    else if C[3:2] = 11
        t2 = 2; t1 = 2; t0 = C[1:0]
    else
        t2 = C[4]; t1 = C[3:2]; t0 = { C[1], C[0]&~C[1] }

If the range is encoded using a quint, then each block contains 3 values
(v0 to v2), each of which contains a quint (q0 to q2) and a corresponding
LSB value (m0 to m2). The first bit of the packed block is bit
floor(i/3)*(7+3*n).

The bits in the block are packed as follows (in this example, n is 4):

     18  17  16
     -----------
    |Q6  Q5 | m2
     -----------

     15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
     ---------------------------------------------------------------
      m2    |Q4  Q3 |      m1       |Q2  Q1  Q0 |      m0       |
     ---------------------------------------------------------------

    Table C.2.13 - Quint-based Packing

The three quints q0 to q2 are obtained by bit manipulations of the 7 bits
Q[6:0] as follows:

    if Q[2:1] = 11 and Q[6:5] = 00
        q2 = { Q[0], Q[4]&~Q[0], Q[3]&~Q[0] }; q1 = q0 = 4
    else
        if Q[2:1] = 11
            q2 = 4; C = { Q[4:3], ~Q[6:5], Q[0] }
        else
            q2 = Q[6:5]; C = Q[4:0]

        if C[2:0] = 101
            q1 = 4; q0 = C[4:3]
        else
            q1 = C[4:3]; q0 = C[2:0]

Both these procedures ensure a valid decoding for all 128 possible values
(even though a few are duplicates). They can also be implemented
efficiently in software using small tables.
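
As an illustration, the trit and quint unpacking rules above can be
transcribed into C roughly as follows. This is a minimal sketch rather
than normative text: the function names decode_trits and decode_quints
and the array-based outputs are assumptions made for the example, and the
final branch of the quint rule is taken to read q2 = Q[6:5].

```c
/* Sketch only: names and calling convention are hypothetical. */

/* Unpack the 8-bit field T[7:0] into five trits, t[0]..t[4]. */
static void decode_trits(unsigned T, int t[5])
{
    unsigned C;
    if (((T >> 2) & 7) == 7) {                     /* T[4:2] = 111 */
        C = (((T >> 5) & 7) << 2) | (T & 3);       /* { T[7:5], T[1:0] } */
        t[4] = 2; t[3] = 2;
    } else {
        C = T & 0x1F;                              /* T[4:0] */
        if (((T >> 5) & 3) == 3) { t[4] = 2; t[3] = (int)((T >> 7) & 1); }
        else { t[4] = (int)((T >> 7) & 1); t[3] = (int)((T >> 5) & 3); }
    }
    if ((C & 3) == 3) {                            /* C[1:0] = 11 */
        unsigned c3 = (C >> 3) & 1, c2 = (C >> 2) & 1;
        t[2] = 2; t[1] = (int)((C >> 4) & 1);
        t[0] = (int)((c3 << 1) | (c2 & ~c3 & 1));
    } else if (((C >> 2) & 3) == 3) {              /* C[3:2] = 11 */
        t[2] = 2; t[1] = 2; t[0] = (int)(C & 3);
    } else {
        unsigned c1 = (C >> 1) & 1, c0 = C & 1;
        t[2] = (int)((C >> 4) & 1); t[1] = (int)((C >> 2) & 3);
        t[0] = (int)((c1 << 1) | (c0 & ~c1 & 1));
    }
}

/* Unpack the 7-bit field Q[6:0] into three quints, q[0]..q[2]. */
static void decode_quints(unsigned Q, int q[3])
{
    unsigned q21 = (Q >> 1) & 3, q65 = (Q >> 5) & 3, q0b = Q & 1;
    unsigned C;
    if (q21 == 3 && q65 == 0) {
        unsigned q4 = (Q >> 4) & 1, q3 = (Q >> 3) & 1;
        q[2] = (int)((q0b << 2) | ((q4 & ~q0b & 1) << 1) | (q3 & ~q0b & 1));
        q[1] = 4; q[0] = 4;
        return;
    }
    if (q21 == 3) {
        q[2] = 4;
        C = (((Q >> 3) & 3) << 3) | ((~q65 & 3) << 1) | q0b;
    } else {
        q[2] = (int)q65;
        C = Q & 0x1F;
    }
    if ((C & 7) == 5) { q[1] = 4; q[0] = (int)((C >> 3) & 3); }
    else              { q[1] = (int)((C >> 3) & 3); q[0] = (int)(C & 7); }
}
```

Every 8-bit T yields trits in 0..2 and every 7-bit Q yields quints in
0..4, so a decoder built this way never has to reject a packed block at
this stage.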
Encoding methods are not specified here, although table-based mechanisms
work well.

C.2.13 Endpoint Unquantization
-------------------------------

Each color endpoint is specified as a sequence of integers in a given
range. These values are packed using integer sequence encoding, as a
stream of bits stored from just above the configuration data, and growing
upwards.

Once unpacked, the values must be unquantized from their storage range,
returning them to a standard range of 0..255.

For bit-only representations, this is simple bit replication from the
most significant bit of the value.

For trit or quint-based representations, this involves a set of bit
manipulations and adjustments to avoid the expense of full-width
multipliers. This procedure ensures correct scaling, but scrambles the
order of the decoded values relative to the encoded values. This must be
compensated for using a table in the encoder.

The initial inputs to the procedure are denoted A (9 bits), B (9 bits),
C (9 bits) and D (3 bits) and are decoded using the range as follows:

    ---------------------------------------------------------------
    Range    T  Q  B  Bits    A          B          C    D
    ---------------------------------------------------------------
    0..5     1     1  a       aaaaaaaaa  000000000  204  Trit value
    0..9        1  1  a       aaaaaaaaa  000000000  113  Quint value
    0..11    1     2  ba      aaaaaaaaa  b000b0bb0  93   Trit value
    0..19       1  2  ba      aaaaaaaaa  b0000bb00  54   Quint value
    0..23    1     3  cba     aaaaaaaaa  cb000cbcb  44   Trit value
    0..39       1  3  cba     aaaaaaaaa  cb0000cbc  26   Quint value
    0..47    1     4  dcba    aaaaaaaaa  dcb000dcb  22   Trit value
    0..79       1  4  dcba    aaaaaaaaa  dcb0000dc  13   Quint value
    0..95    1     5  edcba   aaaaaaaaa  edcb000ed  11   Trit value
    0..159      1  5  edcba   aaaaaaaaa  edcb0000e  6    Quint value
    0..191   1     6  fedcba  aaaaaaaaa  fedcb000f  5    Trit value
    ---------------------------------------------------------------

    Table C.2.14 - Color Unquantization Parameters

These are then processed as follows:

    T = D * C + B;
    T = T ^ A;
    T = (A & 0x80) | (T >> 2);

Note that the multiply in the first line is nearly trivial as it only
needs to multiply by 0, 1, 2, 3 or 4.

C.2.14 LDR Endpoint Decoding
-----------------------------

The decoding method used depends on the Color Endpoint Mode (CEM) field,
which specifies how many values are used to represent the endpoint.

The CEM field also specifies how to take the n unquantized color endpoint
values v0 to v[n-1] and convert them into two RGBA color endpoints e0
and e1.

The HDR Modes are reserved. How they are dealt with is documented in the
following section.

The methods can be summarized as follows.

    -------------------------------------------------
    CEM  Range  Description                      n
    -------------------------------------------------
    0    LDR    Luminance, direct                2
    1    LDR    Luminance, base+offset           2
    2    -- Reserved for HDR --                  2
    3    -- Reserved for HDR --                  2
    4    LDR    Luminance+Alpha, direct          4
    5    LDR    Luminance+Alpha, base+offset     4
    6    LDR    RGB, base+scale                  4
    7    -- Reserved for HDR --                  4
    8    LDR    RGB, direct                      6
    9    LDR    RGB, base+offset                 6
    10   LDR    RGB, base+scale plus two A       6
    11   -- Reserved for HDR --                  6
    12   LDR    RGBA, direct                     8
    13   LDR    RGBA, base+offset                8
    14   -- Reserved for HDR --                  8
    15   -- Reserved for HDR --                  8
    -------------------------------------------------

    Table C.2.15 - Color Endpoint Modes

Decode the different LDR endpoint modes as follows:

    Mode 0  LDR Luminance, direct

        e0=(v0,v0,v0,0xFF); e1=(v1,v1,v1,0xFF);

    Mode 1  LDR Luminance, base+offset

        L0 = (v0>>2)|(v1&0xC0); L1=L0+(v1&0x3F);
        if (L1>0xFF) { L1=0xFF; }
        e0=(L0,L0,L0,0xFF); e1=(L1,L1,L1,0xFF);

    Mode 4  LDR Luminance+Alpha, direct

        e0=(v0,v0,v0,v2);
        e1=(v1,v1,v1,v3);

    Mode 5  LDR Luminance+Alpha, base+offset

        bit_transfer_signed(v1,v0); bit_transfer_signed(v3,v2);
        e0=(v0,v0,v0,v2); e1=(v0+v1,v0+v1,v0+v1,v2+v3);
        clamp_unorm8(e0); clamp_unorm8(e1);

    Mode 6  LDR RGB, base+scale

        e0=(v0*v3>>8,v1*v3>>8,v2*v3>>8, 0xFF);
        e1=(v0,v1,v2,0xFF);

    Mode 8  LDR RGB, direct

        s0= v0+v2+v4; s1= v1+v3+v5;
        if (s1>=s0){e0=(v0,v2,v4,0xFF);
                    e1=(v1,v3,v5,0xFF); }
        else { e0=blue_contract(v1,v3,v5,0xFF);
               e1=blue_contract(v0,v2,v4,0xFF); }

    Mode 9  LDR RGB, base+offset

        bit_transfer_signed(v1,v0);
        bit_transfer_signed(v3,v2);
        bit_transfer_signed(v5,v4);
        if(v1+v3+v5 >= 0)
            { e0=(v0,v2,v4,0xFF);
              e1=(v0+v1,v2+v3,v4+v5,0xFF); }
        else
            { e0=blue_contract(v0+v1,v2+v3,v4+v5,0xFF);
              e1=blue_contract(v0,v2,v4,0xFF); }
        clamp_unorm8(e0); clamp_unorm8(e1);

    Mode 10 LDR RGB, base+scale plus two A

        e0=(v0*v3>>8,v1*v3>>8,v2*v3>>8, v4);
        e1=(v0,v1,v2, v5);

    Mode 12 LDR RGBA, direct

        s0= v0+v2+v4; s1= v1+v3+v5;
        if (s1>=s0){e0=(v0,v2,v4,v6);
                    e1=(v1,v3,v5,v7); }
        else { e0=blue_contract(v1,v3,v5,v7);
               e1=blue_contract(v0,v2,v4,v6); }

    Mode 13 LDR RGBA, base+offset

        bit_transfer_signed(v1,v0);
        bit_transfer_signed(v3,v2);
        bit_transfer_signed(v5,v4);
        bit_transfer_signed(v7,v6);
        if(v1+v3+v5>=0)
            { e0=(v0,v2,v4,v6);
              e1=(v0+v1,v2+v3,v4+v5,v6+v7); }
        else
            { e0=blue_contract(v0+v1,v2+v3,v4+v5,v6+v7);
              e1=blue_contract(v0,v2,v4,v6); }
        clamp_unorm8(e0); clamp_unorm8(e1);

The bit_transfer_signed procedure transfers a bit from one value (a) to
another (b). Initially, both a and b are in the range 0..255. After
calling this procedure, a's range becomes -32..31, and b remains in the
range 0..255. Note that, as is often the case, this is easier to express
in hardware than in C:

    void bit_transfer_signed(int& a, int& b)
    {
        b >>= 1;
        b |= a & 0x80;
        a >>= 1;
        a &= 0x3F;
        if( (a&0x20)!=0 ) a-=0x40;
    }

The blue_contract procedure is used to give additional precision to RGB
colors near grey:

    color blue_contract( int r, int g, int b, int a )
    {
        color c;
        c.r = (r+b) >> 1;
        c.g = (g+b) >> 1;
        c.b = b;
        c.a = a;
        return c;
    }

The clamp_unorm8 procedure is used to clamp a color into the UNORM8
range. The color is passed by reference so that the clamped components
are visible to the caller:

    void clamp_unorm8(color& c)
    {
        if(c.r < 0) {c.r=0;} else if(c.r > 255) {c.r=255;}
        if(c.g < 0) {c.g=0;} else if(c.g > 255) {c.g=255;}
        if(c.b < 0) {c.b=0;} else if(c.b > 255) {c.b=255;}
        if(c.a < 0) {c.a=0;} else if(c.a > 255) {c.a=255;}
    }

C.2.15 HDR Endpoint Decoding
-----------------------------

In the LDR profile of ASTC, all HDR endpoint modes are reserved.
Color endpoints specified with any of the HDR modes will return the error
color. Since different partitions may specify different endpoint modes,
this may only affect a subset of texels within a block. It is not
classed as a block decode error.

C.2.16 Weight Decoding
-----------------------

The weight information is stored as a stream of bits, growing downwards
from the most significant bit in the block. Bit n in the stream is thus
bit 127-n in the block.

For each location in the weight grid, a value (in the specified range) is
packed into the stream. These are ordered in a raster pattern starting
from location (0,0,0), with the X dimension increasing fastest, and the Z
dimension increasing slowest. If dual-plane mode is selected, both
weights are emitted together for each location, plane 0 first, then
plane 1.

C.2.17 Weight Unquantization
-----------------------------

Each weight plane is specified as a sequence of integers in a given
range. These values are packed using integer sequence encoding.

Once unpacked, the values must be unquantized from their storage range,
returning them to a standard range of 0..64. The procedure for doing so
is similar to the color endpoint unquantization.

First, we unquantize the actual stored weight values to the range 0..63.

For bit-only representations, this is simple bit replication from the
most significant bit of the value.

For trit or quint-based representations, this involves a set of bit
manipulations and adjustments to avoid the expense of full-width
multipliers.

For representations with no additional bits, the results are as follows:

    Range    0   1   2   3   4
    --------------------------
    0..2     0  32  63   -   -
    0..4     0  16  32  47  63
    --------------------------

    Table C.2.16 - Weight Unquantization Values

For other values, we calculate the initial inputs to a bit manipulation
procedure.
These are denoted A (7 bits), B (7 bits), C (7 bits), and D (3 bits) and
are decoded using the range as follows:

    Range    T  Q  B  Bits   A        B        C   D
    -------------------------------------------------------
    0..5     1     1  a      aaaaaaa  0000000  50  Trit value
    0..9        1  1  a      aaaaaaa  0000000  28  Quint value
    0..11    1     2  ba     aaaaaaa  b000b0b  23  Trit value
    0..19       1  2  ba     aaaaaaa  b0000b0  13  Quint value
    0..23    1     3  cba    aaaaaaa  cb000cb  11  Trit value
    -------------------------------------------------------

    Table C.2.17 - Weight Unquantization Parameters

These are then processed as follows:

    T = D * C + B;
    T = T ^ A;
    T = (A & 0x20) | (T >> 2);

Note that the multiply in the first line is nearly trivial as it only
needs to multiply by 0, 1, 2, 3 or 4.

As a final step, for all types of value, the range is expanded from 0..63
up to 0..64 as follows:

    if (T > 32) { T += 1; }

This allows the implementation to use 64 as a divisor during
interpolation, which is much easier than using 63.

C.2.18 Weight Infill
---------------------

After unquantization, the weights are subject to weight selection and
infill. The infill method is used to calculate the weight for a texel
position, based on the weights in the stored weight grid array (which may
be a different size). The procedure below must be followed exactly, to
ensure bit exact results.

The block size is specified as two dimensions along the s and t axes
(Bs, Bt). Texel coordinates within the block (s,t) can have values from 0
to one less than the block dimension in that axis.

For each block dimension, we compute scale factors (Ds, Dt):

    Ds = floor( (1024 + floor(Bs/2)) / (Bs-1) );
    Dt = floor( (1024 + floor(Bt/2)) / (Bt-1) );

Since the block dimensions are constrained, these are easily looked up in
a table. These scale factors are then used to scale the (s,t) coordinates
to a homogeneous coordinate (cs, ct):

    cs = Ds * s;
    ct = Dt * t;

This homogeneous coordinate (cs, ct) is then scaled again to give a
coordinate (gs, gt) in the weight-grid space.
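
The scale factor and homogeneous coordinate computations above can be
sketched in C as follows. This is an illustrative helper only; the names
infill_scale_factor and infill_homogeneous_coord are assumptions made for
the example, and B is assumed to be one of the 2D block dimensions listed
in this extension (4, 5, 6, 8, 10 or 12).

```c
/* Sketch only: helper names are hypothetical. */

/* D = floor((1024 + floor(B/2)) / (B-1)) for a block dimension B.
   All operands are positive, so C integer division gives the floor. */
static int infill_scale_factor(int B)
{
    return (1024 + B / 2) / (B - 1);
}

/* Homogeneous coordinate for texel coordinate s (0 <= s < B). */
static int infill_homogeneous_coord(int B, int s)
{
    return infill_scale_factor(B) * s;
}
```

For the six permitted dimensions this gives scale factors of 342, 256,
205, 146, 114 and 93, so the last texel in each axis maps to a
homogeneous coordinate close to 1024 (for example 3 * 342 = 1026 for a
dimension of 4, and 11 * 93 = 1023 for a dimension of 12).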
The weight-grid is of size (N, M), as specified in the block mode field:

    gs = (cs*(N-1)+32) >> 6;
    gt = (ct*(M-1)+32) >> 6;

The resulting coordinates may be in the range 0..176. These are
interpreted as 4:4 unsigned fixed point numbers in the range 0.0 .. 11.0.

If we label the integral parts of these (js, jt) and the fractional parts
(fs, ft), then:

    js = gs >> 4; fs = gs & 0x0F;
    jt = gt >> 4; ft = gt & 0x0F;

These values are then used to bilinearly interpolate between the stored
weights.

    v0 = js + jt*N;
    p00 = decode_weight(v0);
    p01 = decode_weight(v0 + 1);
    p10 = decode_weight(v0 + N);
    p11 = decode_weight(v0 + N + 1);

The function decode_weight(n) decodes the nth weight in the stored weight
stream. The values p00 to p11 are the weights at the corner of the square
in which the texel position resides. These are then weighted using the
fractional position to produce the effective weight i as follows:

    w11 = (fs*ft+8) >> 4;
    w10 = ft - w11;
    w01 = fs - w11;
    w00 = 16 - fs - ft + w11;
    i = (p00*w00 + p01*w01 + p10*w10 + p11*w11 + 8) >> 4;

C.2.19 Weight Application
--------------------------

Once the effective weight i for the texel has been calculated, the color
endpoints are interpolated and expanded.

For LDR endpoint modes, each color component C is calculated from the
corresponding 8-bit endpoint components C0 and C1 as follows:

If sRGB conversion is not enabled, C0 and C1 are first expanded to 16
bits by bit replication:

    C0 = (C0 << 8) | C0; C1 = (C1 << 8) | C1;

If sRGB conversion is enabled, C0 and C1 are expanded to 16 bits
differently, as follows:

    C0 = (C0 << 8) | 0x80; C1 = (C1 << 8) | 0x80;

C0 and C1 are then interpolated to produce a UNORM16 result C:

    C = floor( (C0*(64-i) + C1*i + 32)/64 )

If sRGB conversion is enabled, the top 8 bits of the interpolation result
are passed to the external sRGB conversion block.
Otherwise, if C = 65535, then the final result is 1.0 (0x3C00);
otherwise, C is divided by 65536 and the infinite-precision result of the
division is converted to FP16 with round-to-zero semantics.

For HDR endpoint modes, the error color is returned.

C.2.20 Dual-Plane Decoding
---------------------------

If dual-plane mode is disabled, all of the endpoint components are
interpolated using the same weight value.

If dual-plane mode is enabled, two weights are stored with each texel.
One component is then selected to use the second weight for
interpolation, instead of the first weight. The first weight is then used
for all other components.

The component to treat specially is indicated using the 2-bit Color
Component Selector (CCS) field as follows:

    Value   Weight 0   Weight 1
    ---------------------------
    0       GBA        R
    1       RBA        G
    2       RGA        B
    3       RGB        A
    ---------------------------

    Table C.2.18 - Dual Plane Color Component Selector Values

The CCS bits are stored at a variable position directly below the weight
bits and any additional CEM bits.

C.2.21 Partition Pattern Generation
------------------------------------

When multiple partitions are active, each texel position is assigned a
partition index. This partition index is calculated using a seed (the
partition pattern index), the texel's x,y,z position within the block,
and the number of partitions. An additional argument, small_block, is set
to 1 if the number of texels in the block is less than 31, otherwise it
is set to 0.

This function is specified in terms of x, y and z in order to support the
full profile of ASTC. For 2D textures, z will always be 0.
The full partition selection algorithm is as follows:

    int select_partition(int seed, int x, int y, int z,
                         int partitioncount, int small_block)
    {
        if( small_block ){ x <<= 1; y <<= 1; z <<= 1; }
        seed += (partitioncount-1) * 1024;
        uint32_t rnum = hash52(seed);
        uint8_t seed1  =  rnum        & 0xF;
        uint8_t seed2  = (rnum >>  4) & 0xF;
        uint8_t seed3  = (rnum >>  8) & 0xF;
        uint8_t seed4  = (rnum >> 12) & 0xF;
        uint8_t seed5  = (rnum >> 16) & 0xF;
        uint8_t seed6  = (rnum >> 20) & 0xF;
        uint8_t seed7  = (rnum >> 24) & 0xF;
        uint8_t seed8  = (rnum >> 28) & 0xF;
        uint8_t seed9  = (rnum >> 18) & 0xF;
        uint8_t seed10 = (rnum >> 22) & 0xF;
        uint8_t seed11 = (rnum >> 26) & 0xF;
        uint8_t seed12 = ((rnum >> 30) | (rnum << 2)) & 0xF;

        seed1 *= seed1;     seed2 *= seed2;
        seed3 *= seed3;     seed4 *= seed4;
        seed5 *= seed5;     seed6 *= seed6;
        seed7 *= seed7;     seed8 *= seed8;
        seed9 *= seed9;     seed10 *= seed10;
        seed11 *= seed11;   seed12 *= seed12;

        int sh1, sh2, sh3;
        if( seed & 1 )
            { sh1 = (seed&2 ? 4:5); sh2 = (partitioncount==3 ? 6:5); }
        else
            { sh1 = (partitioncount==3 ? 6:5); sh2 = (seed&2 ? 4:5); }
        sh3 = (seed & 0x10) ? sh1 : sh2;

        seed1 >>= sh1; seed2  >>= sh2; seed3  >>= sh1; seed4  >>= sh2;
        seed5 >>= sh1; seed6  >>= sh2; seed7  >>= sh1; seed8  >>= sh2;
        seed9 >>= sh3; seed10 >>= sh3; seed11 >>= sh3; seed12 >>= sh3;

        int a = seed1*x + seed2*y + seed11*z + (rnum >> 14);
        int b = seed3*x + seed4*y + seed12*z + (rnum >> 10);
        int c = seed5*x + seed6*y + seed9 *z + (rnum >>  6);
        int d = seed7*x + seed8*y + seed10*z + (rnum >>  2);

        a &= 0x3F; b &= 0x3F; c &= 0x3F; d &= 0x3F;

        if( partitioncount < 4 ) d = 0;
        if( partitioncount < 3 ) c = 0;

        if( a >= b && a >= c && a >= d ) return 0;
        else if( b >= c && b >= d ) return 1;
        else if( c >= d ) return 2;
        else return 3;
    }

As has been observed before, the bit selections are much easier to
express in hardware than in C.
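
As an illustration of how the selection function might be used for the
2D case, the following sketch fills a per-texel partition map for one
block. It is not part of the specification: select_partition_2d and
partition_map are hypothetical names, the z terms are dropped because z
is always 0 for 2D textures, unsigned arithmetic is used for the sums
(only their low 6 bits are kept), and hash52 is reproduced from its
definition in this section so that the example is self-contained.

```c
#include <stdint.h>

/* hash52 as defined in this section; all arithmetic is modulo 2^32. */
static uint32_t hash52(uint32_t p)
{
    p ^= p >> 15;  p -= p << 17;  p += p << 7;  p += p << 4;
    p ^= p >>  5;  p += p << 16;  p ^= p >> 7;  p ^= p >> 3;
    p ^= p <<  6;  p ^= p >> 17;
    return p;
}

/* 2D-only variant (assumption: z == 0, so seed9..seed12 and sh3 drop out). */
static int select_partition_2d(int seed, int x, int y,
                               int partitioncount, int small_block)
{
    if (small_block) { x <<= 1; y <<= 1; }
    seed += (partitioncount - 1) * 1024;
    uint32_t rnum = hash52((uint32_t)seed);

    int sh1, sh2;
    if (seed & 1) { sh1 = (seed & 2) ? 4 : 5; sh2 = (partitioncount == 3) ? 6 : 5; }
    else          { sh1 = (partitioncount == 3) ? 6 : 5; sh2 = (seed & 2) ? 4 : 5; }

    uint32_t s[8];                                /* s[k] holds seed(k+1) */
    for (int k = 0; k < 8; k++) {
        uint32_t v = (rnum >> (4 * k)) & 0xF;     /* 4-bit field */
        s[k] = (v * v) >> ((k & 1) ? sh2 : sh1);  /* square, then shift */
    }

    uint32_t ux = (uint32_t)x, uy = (uint32_t)y;
    uint32_t a = (s[0] * ux + s[1] * uy + (rnum >> 14)) & 0x3F;
    uint32_t b = (s[2] * ux + s[3] * uy + (rnum >> 10)) & 0x3F;
    uint32_t c = (s[4] * ux + s[5] * uy + (rnum >>  6)) & 0x3F;
    uint32_t d = (s[6] * ux + s[7] * uy + (rnum >>  2)) & 0x3F;

    if (partitioncount < 4) d = 0;
    if (partitioncount < 3) c = 0;

    if (a >= b && a >= c && a >= d) return 0;
    else if (b >= c && b >= d) return 1;
    else if (c >= d) return 2;
    else return 3;
}

/* Fill out[y*bw + x] with the partition index of each texel in a
   bw x bh block; small_block follows the "fewer than 31 texels" rule. */
static void partition_map(int seed, int bw, int bh,
                          int partitioncount, uint8_t *out)
{
    int small_block = (bw * bh) < 31;
    for (int y = 0; y < bh; y++)
        for (int x = 0; x < bw; x++)
            out[y * bw + x] = (uint8_t)select_partition_2d(
                seed, x, y, partitioncount, small_block);
}
```

Because c and d are forced to zero when fewer partitions are in use, the
returned index is always below partitioncount, which a decoder can rely
on when indexing its endpoint array.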
The seed is expanded using a hash function hash52, which is defined as
follows:

    uint32_t hash52( uint32_t p )
    {
        p ^= p >> 15;  p -= p << 17;  p += p << 7;  p += p << 4;
        p ^= p >>  5;  p += p << 16;  p ^= p >> 7;  p ^= p >> 3;
        p ^= p <<  6;  p ^= p >> 17;
        return p;
    }

This assumes that all operations act on 32-bit values.

C.2.22 Data Size Determination
-------------------------------

The size of the data used to represent color endpoints is not explicitly
specified. Instead, it is determined from the block mode and number of
partitions as follows:

    config_bits = 17;
    if(num_partitions>1)
        if(single_CEM)
            config_bits = 29;
        else
            config_bits = 25 + 3*num_partitions;

    num_weights = M * N * Q; // size of weight grid

    if(dual_plane) {
        config_bits += 2;
        num_weights *= 2;
    }

    weight_bits = ceil(num_weights*8*trits_in_weight_range/5) +
                  ceil(num_weights*7*quints_in_weight_range/3) +
                  num_weights*bits_in_weight_range;

    remaining_bits = 128 - config_bits - weight_bits;

    num_CEM_pairs = base_CEM_class+1 + count_bits(extra_CEM_bits);

The CEM value range is then looked up from a table indexed by remaining
bits and num_CEM_pairs. This table is initialized such that the range is
as large as possible, consistent with the constraint that the number of
bits required to encode num_CEM_pairs pairs of values is not more than
the number of remaining bits.

An equivalent iterative algorithm would be:

    num_CEM_values = num_CEM_pairs*2;

    for(range = each possible CEM range in descending order of size)
    {
        CEM_bits = ceil(num_CEM_values*8*trits_in_CEM_range/5) +
                   ceil(num_CEM_values*7*quints_in_CEM_range/3) +
                   num_CEM_values*bits_in_CEM_range;

        if(CEM_bits <= remaining_bits)
            break;
    }
    return range;

In cases where this procedure results in unallocated bits, these bits are
not read by the decoding process and can have any value.

C.2.23 Void-Extent Blocks
--------------------------

A void-extent block is a block encoded with a single color.
It also specifies some additional information about the extent of the
single-color area beyond this block, which can optionally be used by a
decoder to reduce or prevent redundant block fetches.

The layout of a 2D Void-Extent block is as follows:

     127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112
     ---------------------------------------------------------------
    | Block color A component                                       |
     ---------------------------------------------------------------

     111 110 109 108 107 106 105 104 103 102 101 100  99  98  97  96
     ---------------------------------------------------------------
    | Block color B component                                       |
     ---------------------------------------------------------------

      95  94  93  92  91  90  89  88  87  86  85  84  83  82  81  80
     ---------------------------------------------------------------
    | Block color G component                                       |
     ---------------------------------------------------------------

      79  78  77  76  75  74  73  72  71  70  69  68  67  66  65  64
     ---------------------------------------------------------------
    | Block color R component                                       |
     ---------------------------------------------------------------

      63  62  61  60  59  58  57  56  55  54  53  52  51  50  49  48
     ---------------------------------------------------------------
    | Void-extent maximum T coordinate                  | Min T
     ---------------------------------------------------------------

      47  46  45  44  43  42  41  40  39  38  37  36  35  34  33  32
     ---------------------------------------------------------------
      Void-extent minimum T coordinate      | Void-extent max S
     ---------------------------------------------------------------

      31  30  29  28  27  26  25  24  23  22  21  20  19  18  17  16
     ---------------------------------------------------------------
      Void-extent max S coord   | Void-extent minimum S coordinate
     ---------------------------------------------------------------

      15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
     ---------------------------------------------------------------
      Min S coord   | 1 | 1 | D | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 |
     ---------------------------------------------------------------

    Table C.2.19 - 2D Void-Extent Block Layout Overview

Bit 9 is the Dynamic Range flag, which indicates the format in which
colors are stored. A 0 value indicates LDR, in which case the color
components are stored as UNORM16 values. A 1 indicates HDR, which is
reserved. If a void-extent block with HDR values is decoded in LDR mode,
then the result will be the error color for all texels within the block.

The reason for the storage of UNORM16 values in the LDR case is due to
the possibility that the value will need to be passed on to sRGB
conversion. By storing the color value in the format which comes out of
the interpolator, before the conversion to FP16, we avoid having to have
separate versions for sRGB and linear modes.

The minimum and maximum coordinate values are treated as unsigned
integers and then normalized into the range 0..1 (by dividing by 2^13-1
or 2^9-1, for 2D and 3D respectively). The maximum values for each
dimension must be greater than the corresponding minimum values, unless
they are all all-1s.

If all the coordinates are all-1s, then the void extent is ignored, and
the block is simply a constant-color block.

The existence of single-color blocks with void extents must not produce
results different from those obtained if these single-color blocks are
defined without void-extents. Any situation in which the results would
differ is invalid. Results from invalid void extents are undefined.

If a void-extent appears in a MIPmap level other than the most detailed
one, then the extent will apply to all of the more detailed levels too.
This allows decoders to avoid sampling more detailed MIPmaps.

If the more detailed MIPmap level is not a constant color in this region,
then the block may be marked as constant color, but without a void
extent, as detailed above.
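
The field layout in Table C.2.19 can be unpacked along the following
lines. This is a minimal sketch under stated assumptions, not a reference
decoder: the 128-bit block is assumed to be held as two 64-bit words (lo
holding bits 63..0 and hi holding bits 127..64), the type void_extent_2d,
the function decode_void_extent_2d and the VE_* return codes are
hypothetical names, and bits 11..10 are not examined.

```c
#include <stdint.h>

/* Sketch only: type, function and enum names are hypothetical. */
typedef struct {
    int      has_extent;                  /* 0 if all coordinates are all-1s */
    uint16_t min_s, max_s, min_t, max_t;  /* 13-bit unsigned coordinates */
    uint16_t r, g, b, a;                  /* UNORM16 color components */
} void_extent_2d;

enum { VE_NOT_VOID_EXTENT = -1, VE_ERROR_COLOR = 0, VE_OK = 1 };

static int decode_void_extent_2d(uint64_t lo, uint64_t hi, void_extent_2d *out)
{
    /* Bits 8..0 must read 111111100 for a void-extent block. */
    if ((lo & 0x1FF) != 0x1FC) return VE_NOT_VOID_EXTENT;

    /* Bit 9 is the Dynamic Range flag; HDR is reserved in the LDR profile. */
    if ((lo >> 9) & 1) return VE_ERROR_COLOR;

    out->min_s = (uint16_t)((lo >> 12) & 0x1FFF);   /* bits 24..12 */
    out->max_s = (uint16_t)((lo >> 25) & 0x1FFF);   /* bits 37..25 */
    out->min_t = (uint16_t)((lo >> 38) & 0x1FFF);   /* bits 50..38 */
    out->max_t = (uint16_t)((lo >> 51) & 0x1FFF);   /* bits 63..51 */

    out->r = (uint16_t)( hi        & 0xFFFF);       /* bits 79..64   */
    out->g = (uint16_t)((hi >> 16) & 0xFFFF);       /* bits 95..80   */
    out->b = (uint16_t)((hi >> 32) & 0xFFFF);       /* bits 111..96  */
    out->a = (uint16_t)((hi >> 48) & 0xFFFF);       /* bits 127..112 */

    /* All-1s coordinates mean "constant color, no extent information". */
    int all_ones = out->min_s == 0x1FFF && out->max_s == 0x1FFF &&
                   out->min_t == 0x1FFF && out->max_t == 0x1FFF;
    out->has_extent = !all_ones;

    /* Otherwise each maximum must be strictly greater than its minimum. */
    if (!all_ones && (out->min_s >= out->max_s || out->min_t >= out->max_t))
        return VE_ERROR_COLOR;

    return VE_OK;
}
```

A caller would typically test for VE_NOT_VOID_EXTENT first and fall back
to the normal block-mode decode path, and substitute the error color for
every texel of the block on VE_ERROR_COLOR.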
If a void-extent extends to the edge of a texture, then filtered texture
colors may not be the same color as that specified in the block, due to
texture border colors, wrapping, or cube face wrapping.

Care must be taken when updating or extracting partial image data that
void-extents in the image do not become invalid.

C.2.24 Illegal Encodings
-------------------------

In ASTC, there is a variety of ways to encode an illegal block. Decoders
are required to recognize all illegal blocks and emit the standard error
color value upon encountering an illegal block.

Here is a comprehensive list of situations that represent illegal block
encodings in the LDR operation mode:

    * The block mode specified is one of the modes explicitly listed
      as Reserved.
    * A block mode has been specified that would require more than
      64 weights total.
    * A block mode has been specified that would require more than
      96 bits for integer sequence encoding of the weight grid.
    * A block mode has been specified that would require fewer than
      24 bits for integer sequence encoding of the weight grid.
    * The size of the weight grid exceeds the size of the block
      footprint in any dimension.
    * Color endpoint modes have been specified such that the color
      integer sequence encoding would require more than 18 integers.
    * The number of bits available for color endpoint encoding after
      all the other fields have been counted is less than ceil(13C/5)
      where C is the number of color endpoint integers (this would
      restrict color integers to a range smaller than 0..5, which is
      not supported).
    * Dual weight mode is enabled for a block with 4 partitions.
    * Void-Extent blocks where the low coordinate for some texture
      axis is greater than or equal to the high coordinate.
    * Void-Extent blocks where the constant color is defined as HDR.

Note also that, in LDR profile, a block which has both HDR and LDR
endpoint modes assigned to different partitions is not an error block.
Only those texels which belong to the HDR partition will result in the
error color. Texels belonging to a LDR partition will be decoded as
normal.

C.2.25 LDR PROFILE SUPPORT
---------------------------

In order to ease verification and accelerate adoption, the relationship
of this LDR-only subset to the full ASTC specification implies some
properties which must be respected.

Implementations of this LDR Profile must satisfy the following
requirements:

    * All textures with valid encodings for LDR Profile must decode
      identically using either a LDR Profile or Full Profile decoder.
    * All features included only in the Full Profile must be treated
      as reserved in the LDR Profile, and return the error color on
      decoding.
    * Any sequence of API calls valid for the LDR Profile must also
      be valid for the Full Profile and return identical results when
      given a texture encoded for the LDR Profile.

Additions to Appendix D of the OpenGL ES 3.0 Specification (Shared
Objects and Multiple Contexts)

    None

Additions to Appendix E of the OpenGL ES 3.0 Specification (Version 3.0
and before)

    None