swizzling with low level shaders

Ehsan Kamrani
01-16-2007, 07:15 AM
Is it possible to write the following command when using low level shaders:
DP4 oPos, iPos.xy, temp;
If it's possible, how should the swizzle in this command be interpreted?

01-16-2007, 09:15 AM
Such a swizzle is not possible. Source swizzles may only have one or four components.
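To make the rule concrete, here is a minimal Python sketch of how a source swizzle suffix could be checked and expanded. It is only a model of the rule, not driver code; the helper name expand_swizzle is made up for this example, and it assumes the xyzw component names used in ARB_vertex_program.

```python
# Hypothetical helper, not part of any OpenGL API: models how a source-operand
# swizzle suffix is read in ARB_vertex_program assembly.
COMPONENTS = "xyzw"

def expand_swizzle(suffix: str) -> str:
    """Return the four-component swizzle that a source suffix stands for."""
    if suffix == "":
        return "xyzw"              # no suffix: components are read in order
    if any(c not in COMPONENTS for c in suffix):
        raise ValueError(f"unknown component in swizzle '.{suffix}'")
    if len(suffix) == 1:
        return suffix * 4          # scalar form: ".x" is shorthand for ".xxxx"
    if len(suffix) == 4:
        return suffix              # full form: any order, repeats allowed
    raise ValueError(
        f"'.{suffix}' has {len(suffix)} components; only 1 or 4 are allowed"
    )

print(expand_swizzle("x"))     # xxxx
print(expand_swizzle("xxyy"))  # xxyy
```

Under this model "iPos.xy" is rejected, while "iCol.xxyy" and "iCol.x" are both accepted.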

Ehsan Kamrani
01-16-2007, 11:47 AM
You mean I can use the following code?
DP4 oCol, iCol.xxyy, temp;

What about this one:
ADD oCol, iCol.x, temp;


Ehsan Kamrani
01-16-2007, 12:03 PM
I also took a quick look at ARB_vertex_program. Here's a sample program from the reference:

# Program environment parameters:
# c[0].xyz = normalized light direction in object-space
# outputs diffuse illumination for color and perturbed position
ATTRIB iPos = vertex.position;
ATTRIB iNormal = vertex.normal;
PARAM mvp[4] = { state.matrix.mvp };
PARAM lightDir = program.env[0];
PARAM diffuseCol = { 1, 1, 0, 1 };
TEMP temp;
OUTPUT oPos = result.position;
OUTPUT oColor = result.color;

DP3 temp, lightDir, iNormal;
MUL oColor.xyz, temp, diffuseCol;
MAX temp, temp, 0; # clamp dot product to zero
MUL temp, temp, iNormal; # align in direction of normal
MUL temp, temp, 0.125; # scale displacement by 1/8
SUB temp, temp, iPos; # perturb
DP4 oPos.x, mvp[0], temp; # xform using perturbed position
DP4 oPos.y, mvp[1], temp;
DP4 oPos.z, mvp[2], temp;
DP4 oPos.w, mvp[3], temp;
END

At this line:
MUL oColor.xyz, temp, diffuseCol;
It uses a swizzle with 3 components. I guess that means we can use any number of components for swizzling. But how should we interpret them? Is there a general rule?

01-16-2007, 12:30 PM
That's on the destination register, so it's a write mask, not a swizzle. All it means is that the .w value is not written to oColor, so in effect it behaves like:
MUL oColor.xyz, temp.xyz, diffuseCol.xyz;
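As a rough illustration of the write mask, the following Python sketch models registers as plain four-element lists. The function name write_masked and the input values are made up for this example; the point is only that the instruction computes all four components and the mask decides which ones reach the destination.

```python
# Illustrative model only: registers are plain 4-element lists [x, y, z, w].
def write_masked(dest, result, mask):
    """Copy only the components named in `mask` from `result` into a copy of `dest`."""
    out = list(dest)
    for i, name in enumerate("xyzw"):
        if name in mask:
            out[i] = result[i]
    return out

temp = [0.5, 0.5, 0.5, 0.5]        # e.g. a DP3 result, replicated to all components
diffuse_col = [1.0, 1.0, 0.0, 1.0]
o_color = [9.0, 9.0, 9.0, 9.0]     # placeholder contents, to show what survives

product = [t * d for t, d in zip(temp, diffuse_col)]   # MUL works on all four components
o_color = write_masked(o_color, product, "xyz")        # MUL oColor.xyz, temp, diffuseCol;
print(o_color)   # [0.5, 0.5, 0.0, 9.0]
```

The w component keeps its old value because it is not named in the mask.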

01-16-2007, 02:30 PM
The vector operations of low level shaders generally take four-component vectors as input and generate a four-component vector as output.

The swizzle masks define how the four-component input vectors for the operation are constructed. This is why they need four components; in this context "foo.y" is shorthand for "foo.yyyy". After the operation completes, the components of the result that are named in the write mask are copied into the corresponding components of the output register. The write mask looks like a swizzle, but it appears on the output register and cannot reorder the components. For example, if the destination is "temp.z", only the z component of the result is written, and it goes to the z component of the temp register.

In reality the driver can optimize operations that write fewer than four components; however, the result must be equivalent to the calculation described above.

This behaviour differs from what you might expect if you have worked with GLSL. Instead of:

vec4 result ;
result.zw = foo.xy + bar.yz ;

you need to write something like:

TEMP result ;
ADD result.zw, foo.xxxy, bar.xxyz ;

Basically, in low level shaders you use the swizzles to move the desired data into the components you wish to update in the target register, and then you use the write mask to prevent the operation from modifying the other components of that register.
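To check that the ADD line really matches the GLSL statement, here is a small Python model of the instruction. The function names and the register values are invented for this sketch, and it only models the semantics described in this thread (swizzle the inputs, compute all four components, then apply the write mask).

```python
# Illustrative model only; the function names are made up for this sketch.
IDX = {"x": 0, "y": 1, "z": 2, "w": 3}

def swizzle(reg, swz):
    """Build the four-component operand selected by a four-letter swizzle."""
    return [reg[IDX[c]] for c in swz]

def add(dest, mask, a, a_swz, b, b_swz):
    """Model of: ADD dest.mask, a.a_swz, b.b_swz;"""
    lhs, rhs = swizzle(a, a_swz), swizzle(b, b_swz)
    full = [p + q for p, q in zip(lhs, rhs)]        # all four components are computed
    # Only masked components are written; the rest keep their old values.
    return [full[i] if c in mask else dest[i] for i, c in enumerate("xyzw")]

foo = [1.0, 2.0, 3.0, 4.0]
bar = [10.0, 20.0, 30.0, 40.0]
result = [0.0, 0.0, 0.0, 0.0]

result = add(result, "zw", foo, "xxxy", bar, "xxyz")
# GLSL equivalent: result.zw = foo.xy + bar.yz
print(result)  # [0.0, 0.0, 21.0, 32.0]
```

Here result.z receives foo.x + bar.y and result.w receives foo.y + bar.z, while x and y are left alone. The first two letters of each source swizzle do not matter in this case, since the write mask discards those components.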