If you are willing to use the imaging subset and do one CPU lookup transformation per component, you can easily do 32-bit YCrCb to RGB conversion in hardware without using pixel shaders, or at least without having to write shader code.
First construct luminance and chrominance lookup tables which take care of subtracting the bias of 16 from the input components, pre-clamping to [0,219] and [0,224], and then scaling the result to [0,255].
I didn’t see any previous mention of clamping, but in the case of CCIR-601 YCrCb video data, the valid luminance range is 16-235 and the valid chrominance range is 16-240. On Nuon DVD players, the video data is often algorithmically converted from RGB to the native YCrCb frame buffer format. In this case, you must pre-clamp or else you almost certainly will experience underflow and/or overflow for some pixels.
The lookup table entries also expand the range from [16,235] and [16,240] to [0,1]. OpenGL is going to want to operate on [0,1] component values, and the expansion can be done at the same time as clamping, so it's easier just to do it there than try to hack OpenGL to work around it.
After transforming the components, the color vector is simply multiplied by the color matrix containing the standard CCIR-601 conversion coefficients (a 3x3 conversion embedded in OpenGL's 4x4 color matrix). After multiplication, the color components are post-biased to take care of the constant portion of the calculation.
The CCIR-601 equations subtract the bias of 16 from the components and then normalize them. Unlike luminance, which is normalized to [0,1], chrominance is normalized to [-1/2,1/2]. I simply factored this into the post-bias values.
uint8 LuminanceTable[256];
uint8 ChromianceTable[256];

void CalculateTableEntries(uint8 *table, uint8 min, uint8 max)
{
    uint8 clampedVal;

    for(uint32 i = 0; i < 256; i++)
    {
        clampedVal = (uint8)i;
        if(clampedVal < min)
        {
            clampedVal = min;
        }
        else if(clampedVal > max)
        {
            clampedVal = max;
        }

        /* Multiply before dividing so clampedVal == max maps exactly to 255. */
        table[i] = (uint8)((double)((clampedVal - min) * 255) / (max - min));
    }
}
GLfloat ycrcb2rgbColorMatrix[16] = {
    1.000,  1.000,  1.000, 0.000,
    1.402, -0.700,  0.000, 0.000,
    0.000, -0.340,  1.772, 0.000,
    0.000,  0.000,  0.000, 1.000};
CalculateTableEntries((uint8 *)LuminanceTable,16,235);
CalculateTableEntries((uint8 *)ChromianceTable,16,240);
glMatrixMode(GL_COLOR);
glLoadMatrixf(ycrcb2rgbColorMatrix);
glPixelTransferf(GL_POST_COLOR_MATRIX_RED_BIAS,-1.402/2.0);
glPixelTransferf(GL_POST_COLOR_MATRIX_GREEN_BIAS,(0.70 + 0.34)/2.0);
glPixelTransferf(GL_POST_COLOR_MATRIX_BLUE_BIAS,-1.772/2.0);
for each 32-bit pixel in YCrCb format,
    pixel[0] = LuminanceTable[pixel[0]];
    pixel[1] = ChromianceTable[pixel[1]];
    pixel[2] = ChromianceTable[pixel[2]];
end for
render ycrcb texture
This is sufficient for rendering to the screen. I can only assume that the color matrix method would also work when rendering to a texture/pbuffer.
The color matrix is accelerated on almost all NVIDIA cards according to their extensions table, and I know it's supported on at least the recent ATI cards (9500 and above).
[This message has been edited by Riff (edited 09-10-2003).]