problem with nVIDIA's register combiner language

rgreene · June 4, 2002, 2:59pm

Hey guys, I’m having some unexpected results with the nVIDIA register combiner language, and wonder if somebody could tell me what I’m missing. Here’s a snippet of code:

	nvparse(
		"!!RC1.0\n\n"
		"const0 = (1, 1, 1, 1);\n"
		"{\n"
		"	rgb\n"
		"	{\n"
		"		spare0 = const0 . const0;\n"
		"		spare1 = const0 * tex0;\n"
		"	}\n"
		"}\n"
		"out.rgb = spare1;\n"
	);

This is obviously just a stupid way of saying out.rgb = tex0. However, when I change it to:

	nvparse(
		"!!RC1.0\n\n"
		"const0 = (1, 1, 1, 1);\n"
		"{\n"
		"	rgb\n"
		"	{\n"
		"		spare0 = const0 . const0;\n"
		"		spare1 = spare0 * tex0;\n"
		"	}\n"
		"}\n"
		"out.rgb = spare1;\n"
	);

My polys turn up black. I can’t figure out why, but my guess is that it’s an alpha problem when I do:

		"		spare1 = spare0 * tex0;\n"

I’ve tried isolating it to just use the rgb values, but this doesn’t seem to help. What am I doing wrong, guys?

Thanks!

[This message has been edited by rgreene (edited 06-04-2002).]

SirKnight · June 4, 2002, 4:24pm

	"		spare1 = spare0 * tex0;\n"

You can’t use spare0 in your computation like that. If this line were in the 2nd, 3rd, 4th, etc. general combiner, then you could. Within a single combiner, spare0 is not a variable but an output line: it is only written once the whole stage has finished, so the second line doesn’t see the result of the first. Here is an example where you can use spare0, or even spare1, in an equation.

nvparse(
	"!!RC1.0\n"
	"{\n"
	"	rgb\n"
	"	{\n"
	"		spare0 = tex0 * col0;\n"
	"	}\n"
	"}\n"
	"{\n"
	"	rgb\n"
	"	{\n"
	"		spare1 = spare0 * col1;\n"
	"	}\n"
	"}\n"
	"out.rgb = spare1;\n"
	"out.a = one;\n"
);
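
The difference between the one-stage and two-stage versions can be modelled on the CPU. This is only a rough sketch, not how nvparse or the driver works internally: Regs, initial_regs, same_stage and two_stages are made-up names, everything is a single float channel, and spare0 is assumed to start at 0 (which would be consistent with the black polys in the question). The point it illustrates is that a stage reads all of its inputs before it writes any of its outputs.

```c
/* Hypothetical single-channel register file for one fragment. */
typedef struct {
    float const0, tex0, spare0, spare1;
} Regs;

/* Assumption for this example: const0 = 1 and both spare registers start at 0. */
Regs initial_regs(float tex0)
{
    Regs r = { 1.0f, tex0, 0.0f, 0.0f };
    return r;
}

/* Both assignments in ONE general combiner: every input is taken from
   the register values as they were when the stage began. */
Regs same_stage(Regs in)
{
    Regs out = in;
    out.spare0 = in.const0 * in.const0;  /* stands in for const0 . const0 */
    out.spare1 = in.spare0 * in.tex0;    /* reads the OLD spare0 */
    return out;
}

/* The same two assignments split over TWO general combiners. */
Regs two_stages(Regs in)
{
    Regs mid = in;
    mid.spare0 = in.const0 * in.const0;  /* stage 1 writes spare0 */

    Regs out = mid;
    out.spare1 = mid.spare0 * mid.tex0;  /* stage 2 sees the new spare0 */
    return out;
}
```

With tex0 = 0.5, same_stage(initial_regs(0.5f)).spare1 comes out as 0 under these assumptions, while two_stages(initial_regs(0.5f)).spare1 comes out as 0.5.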

-SirKnight

rgreene · June 4, 2002, 5:49pm

Okay, looks like I had the wrong idea about how to use register combiners; they are certainly not meant to act as programs.

Is there any functional equivalent (even if it is a multipart implementation) to Direct3D pixel shaders in OpenGL?

Effectively, I want to be able to separate the r, g, b, and a values and use them as scalars for a texel value, and then add them all up. This way I can have 4 textures applied to a poly based on a texture weight specified in one of the color components.

If you guys are familiar with Direct3D, here’s what I’m trying to reproduce in OpenGL:

// assume c0 - c3 are formed as (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), and (0, 0, 0, 1)
// assume that v0.r + v0.g + v0.b + v0.a == 1
ps.1.1
tex t0
tex t1
tex t2
tex t3
dp3 r1, c0, v0
mul r0, r1, t0
dp3 r1, c1, v0
mad r0, r0, r1, t1
dp3 r1, c2, v0
mad r0, r0, r1, t2
mov r1, v0.a
mad r0, r0, r1, t3

Can anybody help me out with this?

Thanks again!

Korval · June 4, 2002, 7:04pm

It’s simple. I don’t know nvparse that well, but I know that register combiners can do this. Here is basically how it will look (you’ll have to figure out how to get it into nvparse form):

Stage 1:
spare0 = const0 DOT color0
spare1 = const1 DOT color0

Stage 2:
spare0 = spare0 * tex0

Stage3:
spare0 = spare1 * spare0 + tex1

Stage4:
spare1 = const2 DOT color0

Stage5:
spare0 = spare1 * spare0 + tex2

Stage6:
spare1 = color0.a

FinalCombiner:
“Multiply spare1 and tex3, and add spare0 to it” The FinalCombiner can do this.

You see, register combiners come in stages. You could consider each stage a D3D op-code, but the RC’s can do more per op-code than D3D lets you get at. Taking this as an example, you can only get 8 D3D op-codes and 8 RC stages, but this program only took 6 RC’s (not counting the final combiner, which is always free), while the pixel-shader took all 8.
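
To make "more per op-code" a bit more concrete, here is a small CPU model of the RGB portion of one general combiner when it is used for two multiplies plus their sum. It is a sketch under assumptions, not driver code: Vec3, StageOut, clamp_signed and general_combiner_rgb are invented names, and input mappings, dot products, scale/bias and the mux option are all left out.

```c
/* Hypothetical RGB triple and stage result; not part of any real API. */
typedef struct { float r, g, b; } Vec3;
typedef struct { Vec3 ab, cd, sum; } StageOut;

/* General combiner outputs are kept in the range [-1, 1]. */
float clamp_signed(float x)
{
    if (x < -1.0f) return -1.0f;
    if (x > 1.0f) return 1.0f;
    return x;
}

/* One stage takes four inputs A, B, C, D and can deliver
   A*B, C*D and A*B + C*D (component-wise) all at once. */
StageOut general_combiner_rgb(Vec3 a, Vec3 b, Vec3 c, Vec3 d)
{
    StageOut o;
    o.ab.r = clamp_signed(a.r * b.r);
    o.ab.g = clamp_signed(a.g * b.g);
    o.ab.b = clamp_signed(a.b * b.b);
    o.cd.r = clamp_signed(c.r * d.r);
    o.cd.g = clamp_signed(c.g * d.g);
    o.cd.b = clamp_signed(c.b * d.b);
    o.sum.r = clamp_signed(a.r * b.r + c.r * d.r);
    o.sum.g = clamp_signed(a.g * b.g + c.g * d.g);
    o.sum.b = clamp_signed(a.b * b.b + c.b * d.b);
    return o;
}
```

In this model a multiply-add like "x * y + z" fits in one stage by feeding a constant 1 as the fourth input, which is roughly why a mad maps onto a single RC stage.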

BTW, you said:

“Effectively, I want to be able to separate the r, g, b, and a values and use them as scalars for a texel value, and then add them all up”

Your D3D shader code doesn’t do that. Each MAD operation does: arg1 = arg2 * arg3 + arg4. In order to get what you say you’re looking for, you need to do this:

mad r0, r1, t1, r0

rather than in the other order. For the RC program, here’s a version that does that:

Stage1:
spare0 = const0 DOT color0
spare1 = const1 DOT color0

Stage2:
spare0 = spare0 * tex0 + spare1 * tex1
//Yes, a single RC stage can do this.

Stage3:
spare1 = const2 DOT color0

Stage4:
spare0 = spare1 * tex2 + spare0

Stage5:
spare1 = color0.a

FinalCombiner:
“spare1 * tex3 + spare0”, which the FC can do in one step.

That takes one fewer RC than the original one.
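
For anyone who wants to check the arithmetic, here is a single-channel CPU sketch comparing the posted ps.1.1 operand order with the weighted sum that was asked for, plus the staged RC version above. The function names are invented for this example, w[] stands for the four vertex colour components, t[] for one channel of the four texels, and no clamping is modelled.

```c
/* Intended result: each vertex colour component weights one texture. */
float blend_intended(const float w[4], const float t[4])
{
    return w[0] * t[0] + w[1] * t[1] + w[2] * t[2] + w[3] * t[3];
}

/* What the posted shader computes: "mad r0, r0, r1, tN" multiplies the
   running result by the weight and then adds the unweighted texel. */
float blend_as_posted(const float w[4], const float t[4])
{
    float r0 = w[0] * t[0];   /* mul r0, r1, t0     */
    r0 = r0 * w[1] + t[1];    /* mad r0, r0, r1, t1 */
    r0 = r0 * w[2] + t[2];    /* mad r0, r0, r1, t2 */
    r0 = r0 * w[3] + t[3];    /* mad r0, r0, r1, t3 */
    return r0;
}

/* The staged register combiner version: weight times texel, accumulated. */
float blend_rc_stages(const float w[4], const float t[4])
{
    float spare0 = w[0] * t[0] + w[1] * t[1];  /* Stage2 */
    spare0 = w[2] * t[2] + spare0;             /* Stage4 */
    return w[3] * t[3] + spare0;               /* FinalCombiner */
}
```

With all the weight on the first texture, w = (1, 0, 0, 0), the intended and staged versions return t[0], while the posted order ends up returning t[3], because every weighted term except the last gets multiplied away.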

Edit: I’m not certain, but it is entirely possible that the final combiner can get color0.a and use it directly without storing it in spare1 first.

[This message has been edited by Korval (edited 06-04-2002).]

rgreene · June 4, 2002, 10:18pm

Yup, that is exactly what I needed. I didn’t realize that you could do more than 1 operation in a combiner.