YUV -> RGB in hardware

I’m trying to implement video playback within my OpenGL app. I have some code that does most of what I want, but not using OpenGL: it creates each frame as a YUV overlay, which is then displayed. I need to find a way to display the YUV overlay as a texture within OpenGL. To the best of my knowledge I must convert the frame to an RGB texture. This can be done in software, but since modern video cards (I have a GeForce4 Ti 4400) can do this much faster, I’d really like to be able to harness the power of the GPU.

I saw an SGI newsgroup post about drawing into a pbuffer with YUV source packing and an RGBA destination. Is this the way to go? Is there a better way? Can someone point me to some sample code? Any tips?

thanks
Gib

You might be able to pull off the conversion in a register combiner:

R = 1.164(Y - 16) + 1.596(V - 128)
G = 1.164(Y - 16) - 0.813(V - 128) - 0.391(U - 128)
B = 1.164(Y - 16) + 2.018(U - 128)

With YUV mapped to the RGB channels (Y=R, U=G, and V=B), you might be able to get that to work.
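For reference, here’s the same math as plain C (a quick sketch, handy as a CPU baseline to check any GPU path against; the function and clamp helper are just names I made up):

    /* Reference implementation of the equations above (video-range BT.601).
     * Y is nominally 16..235, U and V are 16..240; results are clamped. */
    static unsigned char clamp255(int v)
    {
        return (unsigned char)(v < 0 ? 0 : v > 255 ? 255 : v);
    }

    static void yuv_to_rgb(int y, int u, int v,
                           unsigned char *r, unsigned char *g, unsigned char *b)
    {
        float yf = 1.164f * (y - 16);
        *r = clamp255((int)(yf + 1.596f * (v - 128)));
        *g = clamp255((int)(yf - 0.813f * (v - 128) - 0.391f * (u - 128)));
        *b = clamp255((int)(yf + 2.018f * (u - 128)));
    }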

There is a 3x3 matrix that takes YUV colors to RGB colors (the equations above are exactly that matrix, applied to the biased values Y-16, U-128, V-128). So you need to apply this 3x3 matrix to each pixel’s final color.

This can be done in the register combiners on NVIDIA hardware. Input the matrix rows as per-stage register combiner constants using glCombinerStageParameterfvNV() (from NV_register_combiners2). You’ll be doing three dot products and three adds on your (otherwise) final pixel color, so if I’m thinking about it right you’ll need two or three general combiner stages in addition to any you may already have.
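Untested, but the setup for one stage would look something like this (a sketch only; the row values are placeholders, since combiner constants clamp to [0,1] you’d store (row+1)/2 and let EXPAND_NORMAL undo it, and coefficients above 1 would additionally need the output scale, e.g. GL_SCALE_BY_TWO_NV):

    /* Sketch: one general combiner stage computing one row of the 3x3
     * matrix as a dot product into spare0. Repeat with CONSTANT_COLOR1
     * and further stages for the other two rows. */
    GLfloat row0[4] = { 0.5f, 0.5f, 0.5f, 0.5f }; /* placeholder values */

    glEnable(GL_REGISTER_COMBINERS_NV);
    glEnable(GL_PER_STAGE_CONSTANTS_NV); /* NV_register_combiners2 */
    glCombinerStageParameterfvNV(GL_COMBINER0_NV, GL_CONSTANT_COLOR0_NV, row0);

    /* spare0.rgb = texture0 dot constant_color0 */
    glCombinerInputNV(GL_COMBINER0_NV, GL_RGB, GL_VARIABLE_A_NV,
                      GL_TEXTURE0_ARB, GL_SIGNED_IDENTITY_NV, GL_RGB);
    glCombinerInputNV(GL_COMBINER0_NV, GL_RGB, GL_VARIABLE_B_NV,
                      GL_CONSTANT_COLOR0_NV, GL_EXPAND_NORMAL_NV, GL_RGB);
    glCombinerOutputNV(GL_COMBINER0_NV, GL_RGB,
                       GL_SPARE0_NV, GL_DISCARD_NV, GL_DISCARD_NV,
                       GL_NONE, GL_NONE, GL_TRUE, GL_FALSE, GL_FALSE);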

– Zeno

Thanks, NitroGL and Zeno. It has also been suggested to me that I should be able to do the color space transformation using the color matrix (in GL_ARB_imaging). It is not clear whether one approach or the other would be faster.
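As I understand it, that route would look roughly like this (my sketch, untested; note the color matrix applies during pixel transfer, e.g. glTexImage2D/glDrawPixels, not during texturing):

    /* Sketch: imaging-subset color matrix for the same conversion.
     * Y is in R, U in G, V in B. The -16/-128 offsets go in as
     * pixel-transfer biases, which are applied before the color matrix. */
    GLfloat m[16] = {
        1.164f,  1.164f, 1.164f, 0.0f,   /* column 0: multiplies Y */
        0.0f,   -0.391f, 2.018f, 0.0f,   /* column 1: multiplies U */
        1.596f, -0.813f, 0.0f,   0.0f,   /* column 2: multiplies V */
        0.0f,    0.0f,   0.0f,   1.0f
    };

    glPixelTransferf(GL_RED_BIAS,   -16.0f  / 255.0f);
    glPixelTransferf(GL_GREEN_BIAS, -128.0f / 255.0f);
    glPixelTransferf(GL_BLUE_BIAS,  -128.0f / 255.0f);

    glMatrixMode(GL_COLOR); /* requires GL_ARB_imaging */
    glLoadMatrixf(m);
    glMatrixMode(GL_MODELVIEW);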

Gib

Well, if there’s an extension for this very thing, it would probably be the fastest option, and it wouldn’t tie up register combiners that you could use for something else.

However, as we all know, there’s really only one way to find out.

– Zeno

It could be a while before I can find time to put the pieces together to test this, but when I do I’ll report back.

Gib

The imaging subset is not accelerated on any consumer-level graphics cards I know of.

Do you mean it will not be done on the GPU?

Yes. Not hardware accelerated = not done on the GPU; it’s done on the CPU.

That’s on consumer hardware; high-end workstation boards might support the imaging subset in hardware.

Here’s some ARB_fragment_program code that I made to convert YUV to RGB (not a complete fragment program, of course). In is the input texel (YUV mapped to the RGB channels), Out is the output color (the converted RGB), and temp is just a TEMP variable.

# Full-range (JFIF-style) YUV to RGB, so the coefficients differ from
# the video-range 1.164-scaled equations earlier in the thread:
#   R = Y + 1.402 (V - 0.5)
#   G = Y - 0.34414 (U - 0.5) - 0.71414 (V - 0.5)
#   B = Y + 1.772 (U - 0.5)
# In.x = Y, In.y = U, In.z = V
PARAM Half = 0.5;
PARAM YUV = { 1.402, 1.772, 0.34414, 0.71414 };

SUB temp.w, In.z, Half;             # temp.w = V - 0.5
MAD_SAT Out.x, YUV.x, temp.w, In.x; # R = 1.402*(V-0.5) + Y
SUB temp.z, In.y, Half;             # temp.z = U - 0.5
MUL temp.x, YUV.z, temp.z;          # temp.x = 0.34414*(U-0.5)
SUB temp.y, In.x, temp.x;           # temp.y = Y - 0.34414*(U-0.5)
MUL temp.x, YUV.w, temp.w;          # temp.x = 0.71414*(V-0.5)
SUB_SAT Out.y, temp.y, temp.x;      # G = Y - 0.34414*(U-0.5) - 0.71414*(V-0.5)
MAD_SAT Out.z, YUV.y, temp.z, In.x; # B = 1.772*(U-0.5) + Y
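On the C side, feeding a program like that to the driver is just (a sketch, no error checking; src would hold the complete program text):

    /* Sketch: uploading an ARB_fragment_program. Needs <string.h> for strlen. */
    const char *src = "!!ARBfp1.0\n"
                      /* ... full program text here ... */
                      "END\n";
    GLuint prog;

    glGenProgramsARB(1, &prog);
    glBindProgramARB(GL_FRAGMENT_PROGRAM_ARB, prog);
    glProgramStringARB(GL_FRAGMENT_PROGRAM_ARB, GL_PROGRAM_FORMAT_ASCII_ARB,
                       (GLsizei)strlen(src), src);
    glEnable(GL_FRAGMENT_PROGRAM_ARB);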

Here’s a pic of it in action (on my 9700): http://evilman.netfirms.com/yuv.jpg

The right side is with the fragment program enabled, the left side is the raw YUV image.

Now obviously this isn’t going to work on ANY GeForce hardware (unless the drivers support ARB_fragment_program with NV30 emulation enabled).


I don’t believe my GF4 driver supports ARB_fragment_program, so this precise method won’t be available to me. But a version of it using register combiners may be feasible.

Thanks for all the suggestions.

Gib