Swizzling with low level shaders

Is it possible to write the following instruction in a low-level shader?
DP4 oPos, iPos.xy, temp;
If it is, how is the swizzle in this instruction interpreted?
-Ehsan-

Such a swizzle is not possible. A swizzle on a source operand must have either one or four components.

You mean I can use the following code?
DP4 oCol, iCol.xxyy, temp;

What about this one:
ADD oCol, iCol.x, temp;

-Ehsan-

I also took a quick look at ARB_vertex_program. Here’s a sample program from the reference:

!!ARBvp1.0
         #
         # Program environment parameters:
         # c[0].xyz = normalized light direction in object-space
         #
         # outputs diffuse illumination for color and perturbed position
         #
         ATTRIB iPos         = vertex.position;
         ATTRIB iNormal      = vertex.normal;
         PARAM  mvp[4]       = { state.matrix.mvp };
         PARAM  lightDir     = program.env[0];
         PARAM  diffuseCol   = { 1, 1, 0, 1 };
         TEMP   temp;
         OUTPUT oPos         = result.position;
         OUTPUT oColor       = result.color;
 
         DP3   temp, lightDir, iNormal;
         MUL   oColor.xyz, temp, diffuseCol;
         MAX   temp, temp, 0;            # clamp dot product to zero
         MUL   temp, temp, iNormal;      # align in direction of normal
         MUL   temp, temp, 0.125;        # scale displacement by 1/8
         SUB   temp, temp, iPos;         # perturb
         DP4   oPos.x, mvp[0], temp;     # xform using perturbed position
         DP4   oPos.y, mvp[1], temp;
         DP4   oPos.z, mvp[2], temp;
         DP4   oPos.w, mvp[3], temp;
         END

On this line:
MUL oColor.xyz, temp, diffuseCol;
it uses a swizzle with three components. I guess that means any number of components can be used for swizzling. But how should they be interpreted? Is there a general rule?
-Ehsan-

That’s on the destination register, so it’s a write mask, not a swizzle. All it means is that the .w component of oColor is not written, so conceptually (a three-component swizzle is not valid syntax on a source operand) it behaves like:
MUL oColor.xyz, temp.xyz, diffuseCol.xyz;

The vector operations of low-level shaders generally take four-component vectors as input and produce a four-component vector as output.

The swizzle masks define how the four-component input vectors for the operation are constructed. This is why they need four components; in this context “foo.y” is shorthand for “foo.yyyy”. After the operation completes, the components of the result that are named in the write mask are copied into the corresponding components of the output register. A write mask looks like a swizzle, but it appears on the output register and cannot reorder or repeat components. E.g. if the write mask is “temp.z”, only the z component of the result is written, and it goes to the z component of the temp register.

In practice the driver can optimize operations that write fewer than four components, but the result must be equivalent to the calculation described above.
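
To make the rule concrete, here is a rough Python model of the read/compute/write steps described above. It is only a sketch: the helper names (expand_swizzle, read, write, mul) are made up for illustration and are not part of any OpenGL API, and registers are modelled as plain dicts. It replays the MUL oColor.xyz, temp, diffuseCol; line from the sample program with made-up input values.

```python
# Toy model of swizzles and write masks in an ARB-style low-level shader.
# All names here are hypothetical helpers, not a real API.

def expand_swizzle(swz="xyzw"):
    """Return the four-component form of a source swizzle."""
    if len(swz) == 1:
        swz = swz * 4  # "y" is shorthand for "yyyy"
    if len(swz) != 4 or any(c not in "xyzw" for c in swz):
        raise ValueError("swizzle needs one or four components: " + swz)
    return swz

def read(reg, swz="xyzw"):
    """Build the four-component operand selected by the swizzle."""
    return [reg[c] for c in expand_swizzle(swz)]

def write(reg, result, mask="xyzw"):
    """Copy only the masked components of result into reg (in place)."""
    # A write mask may drop components but not reorder or repeat them.
    if not mask or "".join(c for c in "xyzw" if c in mask) != mask:
        raise ValueError("invalid write mask: " + mask)
    for i, c in enumerate("xyzw"):
        if c in mask:
            reg[c] = result[i]

def mul(a, b):
    """Component-wise multiply of two four-component operands."""
    return [p * q for p, q in zip(a, b)]

# MUL oColor.xyz, temp, diffuseCol;
temp = {"x": 0.5, "y": 0.5, "z": 0.5, "w": 0.5}
diffuse_col = {"x": 1.0, "y": 1.0, "z": 0.0, "w": 1.0}
o_color = {"x": 9.0, "y": 9.0, "z": 9.0, "w": 9.0}  # 9.0 marks old contents

write(o_color, mul(read(temp), read(diffuse_col)), "xyz")
# o_color is now x=0.5, y=0.5, z=0.0, and w is still 9.0
```

The multiply is still carried out on all four components; the mask only decides which of them reach the destination. In this model a two-component source swizzle such as iPos.xy is rejected by expand_swizzle, which matches the first answer in this thread.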

This behaviour is different from what you might expect if you have worked with GLSL. Instead of:

vec4 result ;
result.zw = foo.xy + bar.yz ;

you need to write something like:

TEMP result ;
ADD result.zw, foo.xxxy, bar.xxyz ;

Basically, in low-level shaders you use the swizzles to move the desired data into the components you wish to update in the target register, and then you use the write mask to prevent the operation from modifying the other components of that register.
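
As a quick sanity check of the example above, the following Python sketch evaluates both forms on made-up values and compares them. The helpers swizzle and masked_write are hypothetical names used only here, and vectors are plain four-element lists in x, y, z, w order.

```python
# Compare the GLSL statement with its low-level equivalent on sample data.
# swizzle() and masked_write() are illustrative helpers, not a real API.

IDX = {"x": 0, "y": 1, "z": 2, "w": 3}

def swizzle(v, s):
    """Build a four-component operand from v using a four-letter swizzle."""
    return [v[IDX[c]] for c in s]

def masked_write(dst, src, mask):
    """Return dst with only the masked components replaced by src."""
    return [src[i] if c in mask else dst[i] for i, c in enumerate("xyzw")]

foo = [1.0, 2.0, 3.0, 4.0]
bar = [10.0, 20.0, 30.0, 40.0]
result = [0.0, 0.0, 0.0, 0.0]

# ADD result.zw, foo.xxxy, bar.xxyz;
total = [a + b for a, b in zip(swizzle(foo, "xxxy"), swizzle(bar, "xxyz"))]
arb_result = masked_write(result, total, "zw")

# GLSL: result.zw = foo.xy + bar.yz;
glsl_result = result[:2] + [foo[0] + bar[1], foo[1] + bar[2]]

print(arb_result == glsl_result)  # prints True
```

The first two letters of each swizzle (xx here) are don't-cares: the sum is computed for them too, but the .zw mask discards it, so any valid letters would give the same result.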
