nvparse documentation

I am trying to learn how to program the register combiners on my GeForce4 Ti4200 card using nvparse. So far, I’ve found lots of sample code and PPT presentations (in the nvOpenGLSDK), but almost no official documentation of the RC language understood by nvparse.

I was wondering if someone on this forum could help me clarify a few sticky points.

  1. What does expand(…) do?

In the code below:

// Specular Lighting (N’•H)
// tex0: normal map (N’)
// tex1: normalization cube-map (H)
!!RC1.0
{
rgb {
spare0 = expand(tex0) . expand(tex1); // NdotH
}
}
out.rgb = spare0; // auto clamped to [0,1]

I assume that expand(tex0) means that it will treat the 32-bit entity tex0 as a 4-element entity (a “vector”) for the purposes of computing the dot product. Is this accurate? If so, how is the dot product computed: is it just the sum of the pairwise products of R, G, B and A components of the two “vectors”? Is the resulting value “clamped” in any way?

  2. When is the output “auto-clamped”?

The comment on the last line in the code above indicates that the output is “auto-clamped to [0,1]”. How is this auto-clamping actually computed? Does this happen only for dot-product computations? What does it mean to assign a value in the range [0,1] to out.rgb (which expects a 32-bit entity, composed of R, G, B, A)?

  3. When should unsigned(…) be used?

In the code below:

// Specular Lighting (N’•H)^4
!!RC1.0
{
rgb {
spare0 = expand(tex0) . expand(tex1); // NdotH
}
}
{
rgb {
spare0 = unsigned(spare0) * unsigned(spare0);
}
}
final_product = spare0 * spare0;
out.rgb = final_product;

In the 2nd combiner stage, why is spare0 squared as unsigned(spare0) * unsigned(spare0) instead of spare0 * spare0? unsigned(…) seems to clamp the value to the range [0,1]; if unsigned(…) is not called, what is the range for spare0?

  4. How should discard be used?

In the code below:

{ // normalize V (step 1.)
rgb {
spare0 = expand(col0) . expand(col0); // VdotV
}
}
{ // normalize V (step 2.)
rgb {
discard = expand(col0); // V in [-1…1]
discard = half_bias(col0) * unsigned_invert(spare0);
col0 = sum();
}
}

Why is discard assigned twice, in the 2nd combiner? From reading another PPT (on texture blending using combiners), it seems that discard is a write-only register; how is discard actually used?

If there is some detailed nvparse() documentation available somewhere, please point it out.

Thanks in advance!

Razvan.

  1. What does expand(…) do?

expand maps the value from the [0,1] range to the [-1,1] range. So expand(tex0) would basically be this: 2.0 * (tex0 - 0.5)
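
In C terms, per component, it is just this (a sketch of the math only; the function name is mine, not part of nvparse):

/* Sketch of what expand() does to one component: it maps an
   unsigned value in [0,1] onto the signed range [-1,1]. */
float expand_component(float c)      /* c assumed to be in [0,1] */
{
    return 2.0f * (c - 0.5f);        /* 0.0 -> -1.0, 0.5 -> 0.0, 1.0 -> +1.0 */
}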

I assume that expand(tex0) means that it will treat the 32-bit entity tex0 as a 4-element entity (a “vector”) for the purposes of computing the dot product. Is this accurate?

Yes.

If so, how is the dot product computed: is it just the sum of the pairwise products of R, G, B and A components of the two “vectors”? Is the resulting value “clamped” in any way?

the dot product would be computed like this:

vec1.x * vec2.x + vec1.y * vec2.y + vec1.z * vec2.z

everything in the register combiners is kept in the [-1, 1] range, so the output of the dot product is clamped to fit in that range.
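
Spelled out in C it looks like this (a sketch only; the names are mine, and the clamp reflects the combiners’ internal [-1,1] range as I understand it):

/* Sketch of the RGB-combiner dot product: only the R, G and B
   components take part, and the result is clamped to the
   combiners' internal [-1,1] range. */
float clamp_signed(float x)
{
    return x < -1.0f ? -1.0f : (x > 1.0f ? 1.0f : x);
}

float rc_dot(const float a[3], const float b[3])
{
    float d = a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    return clamp_signed(d);   /* e.g. (1,1,1).(1,1,1) = 3 clamps to 1 */
}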

  2. When is the output “auto-clamped”?

When it is output from the register combiners.

What does it mean to assign a value in the range [0,1] to out.rgb (which expects a 32-bit entity, composed of R,G,B,A)?

each component of the output vector is clamped to the [0,1] range.
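
So roughly this, in C (again just a sketch, my names):

/* Sketch of the final output clamp: each RGB component written to
   out.rgb is clamped to [0,1] on its way out of the combiners. */
float clamp01(float x)
{
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

void write_out_rgb(const float spare0[3], float out_rgb[3])
{
    int i;
    for (i = 0; i < 3; ++i)
        out_rgb[i] = clamp01(spare0[i]);   /* e.g. a dot product of -0.3 becomes 0.0 */
}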

  3. When should unsigned(…) be used?

When you want to clamp the value to the [0,1] range.

In the 2nd combiner stage, why is spare0 squared as unsigned(spare0) * unsigned(spare0) instead of spare0 * spare0?

unsigned is used to ensure the dot product isn’t negative before it gets squared; otherwise a negative N•H would still square to a positive value and give a false highlight.

if unsigned(…) is not called, what is the range for spare0?

It will be [-1, 1]
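
Which is exactly why the clamp matters before squaring. In C (a sketch, my names):

/* Why unsigned() is applied before squaring: without it, a negative
   N.H (light on the wrong side of the surface) would still square to
   a positive value and produce a bogus highlight. */
float unsigned_clamp(float x)             /* clamp a [-1,1] value into [0,1] */
{
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

float specular_squared(float ndoth)
{
    float u = unsigned_clamp(ndoth);
    return u * u;    /* ndoth = -0.5 gives 0.0 here, but 0.25 without the clamp */
}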

  4. How should discard be used?

When you don’t want to store an intermediate result; instead, you want to combine the results of two calculations.

discard = expand(col0); // V in [-1…1]
discard = half_bias(col0) * unsigned_invert(spare0);
col0 = sum();

This discards the result of ‘expand(col0)’ and ‘half_bias(col0) * unsigned_invert(spare0)’ instead of storing them in temp registers such as spare0 or spare1. These two values are then summed at the end of the combiner stage, and the sum is stored in col0.
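
Numerically, the stage works out to the usual one-step normalization correction. Per component, in C (a sketch of the math only; the names are mine):

/* What the second stage computes per RGB component.  col0_c holds one
   component of V range-compressed into [0,1]; vdotv is the V.V result
   left in spare0 by the first stage. */
float normalize_step2(float col0_c, float vdotv)
{
    float part1 = 2.0f * (col0_c - 0.5f);            /* expand(col0) = V                 */
    float part2 = (col0_c - 0.5f) * (1.0f - vdotv);  /* half_bias(col0) * unsigned_invert(spare0)
                                                        = (V/2) * (1 - V.V)              */
    return part1 + part2;  /* sum() = V * (3 - V.V) / 2, an approximate normalize */
}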

If there is some detailed nvparse() documentation available somewhere, please point it out.

Check out this doc: http://developer.nvidia.com/view.asp?IO=gdc2001_programmable_texture

Also, I’d recommend using Cg and the fp20 profile for doing your pixel shader programming. It’s much, much easier than using nvparse.

expand maps the value from the [0,1] range to the [-1,1] range. So expand(tex0) would basically be this: 2.0 * (tex0 - 0.5)

tex0 contains RGBA values (range [0, 255]). I assume these are mapped to [0,1] or [-1,1] in the “obvious” way (consider the value as signed and divide by 128). Is that accurate?
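
To make the question concrete, these are the two mappings I have in mind (plain C, purely to illustrate what I am asking; I don’t know which, if either, the hardware actually uses):

/* Two candidate mappings for an 8-bit texel component t in 0..255.
   Which one does the hardware apply before expand() ever sees it? */
float map_unsigned(unsigned char t)
{
    return t / 255.0f;                   /* -> [0,1] */
}

float map_signed(unsigned char t)
{
    return (signed char)t / 128.0f;      /* treat as signed, -> roughly [-1,1] */
}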

This discards the result of ‘expand(col0)’ and ‘half_bias(col0) * unsigned_invert(spare0)’ instead of storing them in temp registers such as spare0 or spare1. These two values are then summed at the end of the combiner stage, and the sum is stored in col0.

The thing I found extremely confusing is that different values are assigned to “discard”, in consecutive steps, and then sum() or mux() somehow is able to use these different values (even though they seem to have been assigned to the “same place”).

Thanks very much for the follow-up,

Razvan.

The thing I found extremely confusing is that different values are assigned to “discard”, in consecutive steps, and then sum() or mux() somehow is able to use these different values (even though they seem to have been assigned to the “same place”).

I can see why you’re confused about this. Discard is not like a typical variable or register. The way the register combiners were designed, you can’t just say “spare0 = expand(tex0) + half_bias(col0) * unsigned_invert(spare0);”. That is too complex to do all at one time in the combiners. So what you have to do is break the equation up into two smaller parts and tell the combiners to discard these two results (i.e. don’t store them in a register like the spareX registers, because we don’t need to keep these intermediate values), then give the sum or mux command to take these intermediate computations and sum or mux them together; the result is then placed into one of the combiner’s outputs (spare0 or spare1).

Also, I’d recommend using Cg and the fp20 profile for doing your pixel shader programming. It’s much, much easier than using nvparse.

So you mean the fp20 profile finally works now!?

-SirKnight

[This message has been edited by SirKnight (edited 01-08-2003).]

You need to understand how register combiners work before you can begin to understand what nvparse is doing. It’s really not a shading language, more of a very limited configuration script. You’re basically selecting each stage’s outputs to be one result of a limited set of fixed calculations.
Read the register combiners specification here:- http://oss.sgi.com/projects/ogl-sample/registry/NV/register_combiners.txt

And then use nvparse if you must, but regcombiners using nvparse only serves to lull you into thinking you’ve got more freedom than you actually have.
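
To give a feel for that “limited set of fixed calculations”, here is a rough C sketch of what one RGB general-combiner stage can produce, going by the spec linked above (the names are mine; input mappings, scale/bias and output clamping are left out, and the mux select condition is my reading of the spec):

/* Rough sketch of one RGB general combiner stage.  A, B, C, D have
   already been selected from the register set and run through their
   input mappings (expand, half_bias, unsigned, ...).  The stage can
   then emit up to three fixed results: A*B, C*D, and either their sum
   or a mux between them. */
typedef struct { float r, g, b; } vec3;

void rgb_stage(vec3 A, vec3 B, vec3 C, vec3 D, float spare0_alpha,
               int use_mux, vec3 *ab, vec3 *cd, vec3 *third)
{
    /* First product: component-wise A*B (a stage may compute the
       dot product A.B here instead). */
    ab->r = A.r * B.r;  ab->g = A.g * B.g;  ab->b = A.b * B.b;

    /* Second product: component-wise C*D (or the dot product C.D). */
    cd->r = C.r * D.r;  cd->g = C.g * D.g;  cd->b = C.b * D.b;

    /* Third result: AB + CD, or mux(AB, CD) keyed off spare0 alpha. */
    if (use_mux) {
        *third = (spare0_alpha < 0.5f) ? *ab : *cd;
    } else {
        third->r = ab->r + cd->r;
        third->g = ab->g + cd->g;
        third->b = ab->b + cd->b;
    }
}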

Oh my goodness, I just looked at the Cg page and yep, Cg does now support a fragment profile for the NV2X hardware. YAY! I sure am glad NVIDIA was finally able to get it going. Hopefully I can find some examples using it now.

-SirKnight

regcombiners using nvparse only serves to lull you into thinking you’ve got more freedom than you actually have

I agree.

I would not want to write code that manipulates the combiners directly (it’s terribly unreadable and probably hard to debug), but it is important to understand how the combiners work before using nvparse.

I will spend some more time reading the combiners spec.

Thanks,

Razvan.