Pipeline stall question

What’s the performance impact of changing
the modelview matrix? Suppose I fire a
number of vertexes, then call glRotatef()
and then fire more vertexes, will the changing
of the matrix implicitly sync (“fence”) the
previous vertexes, causing transform pipeline
bubbles?

Obviously, this is hardware dependent, and
I’m mostly interested in the answer for
GeForce2.

[This message has been edited by bgl (edited 12-02-2000).]

If I understand you correctly (“Will changing the matrix affect vertices that have already been passed?”), then the answer is: NO, definitely not. This is not allowed to happen under any circumstances. Think about what could happen IF it did: models would look totally screwed up (if major changes were made to the matrix), or objects might pop up in the wrong place. If it does happen, something is seriously wrong.

Bob, you’re right, but I think bgl was asking whether matrix state changes would stall the pipeline until all vertices issued under the previous matrix state had been completely processed.

Obviously this is implementation-dependent, but for NV hardware the answer is no. I asked a while back about the performance cost of matrix changes (and whether transform sorting made sense compared to texture/blend sorting), and Cass stated that the cost of matrix ops was down in the noise. That seems to rule out a pipeline stall.

I don’t know about other architectures, but I’d be VERY surprised if any driver forced a stall in this case.

Originally posted by MikeC:
[b]Bob, you’re right, but I think bgl was asking whether matrix state changes would stall the pipeline until all vertices issued under the previous matrix state had been completely processed.

Obviously this is implementation-dependent, but for NV hardware the answer is no[/b]

Thank you; this was the answer I was looking
for.

Now, if there were only a glMatrixPointer()
extension and a glMatrixWeightsPointer() way
of applying them to vertexes. Yeah, I am
(slowly) working with skeletal animation :)

NV_vertex_program. You can fit a lot of matrices into 96 constant registers, and you can do paletted skinning with relative addressing.

Rumor has it that there’s a leaked driver out there that seems to support this extension.

  • Matt

96/12 == 8 (just in case you forgot your
calculator at home).

It seems that using a vertex_program will
replace “traditional” lighting, so I will
have to give up at least one matrix to leave
room for the program to do lighting.

Would one also be blown on the modelview
transform? And how do I get the N matrix
weights into the program on a per-vertex
basis? Seems like N would be 6 in this case,
which is a respectable number.

Does this mean we will see 4- or 6-matrix
support in DX (and OGL?) in future Detonator
drivers for existing hardware? :)

No, actually, 96/4 = 24. All the registers are defined as xyzw vectors.

You would need to implement your own lighting. Generally, multi-matrix vertex blending occurs at the stage of the modelview transform, so you would blend between N matrices to compute eye coordinates and eye normals, then multiply the resulting eye coordinate by a projection matrix to get clip coordinates. The eye coordinates and eye normals are used in lighting.

  • Matt
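As a rough CPU-side sketch of the sequence Matt describes (blend N modelview matrices to get eye coordinates, then apply the projection to get clip coordinates), something like the following could serve as a reference. The type and function names (`Mat4`, `Vec4`, `blend_to_eye`, `eye_to_clip`) are made up for illustration and are not part of OpenGL or NV_vertex_program; normals and lighting are left out to keep it short.

```c
#include <stddef.h>

/* Column-major 4x4 matrix (the layout OpenGL uses) and an xyzw vector. */
typedef struct { float m[16]; } Mat4;
typedef struct { float x, y, z, w; } Vec4;

/* Returns M * v. */
Vec4 mat4_mul_vec4(const Mat4 *M, Vec4 v)
{
    Vec4 r;
    r.x = M->m[0] * v.x + M->m[4] * v.y + M->m[8]  * v.z + M->m[12] * v.w;
    r.y = M->m[1] * v.x + M->m[5] * v.y + M->m[9]  * v.z + M->m[13] * v.w;
    r.z = M->m[2] * v.x + M->m[6] * v.y + M->m[10] * v.z + M->m[14] * v.w;
    r.w = M->m[3] * v.x + M->m[7] * v.y + M->m[11] * v.z + M->m[15] * v.w;
    return r;
}

/* Hypothetical helper: transforms the object-space position by each of
   `count` modelview matrices picked from `palette` via `index`, and sums
   the results scaled by `weight` (the weights are assumed to sum to 1). */
Vec4 blend_to_eye(const Mat4 *palette, const int *index,
                  const float *weight, size_t count, Vec4 object_pos)
{
    Vec4 eye = { 0.0f, 0.0f, 0.0f, 0.0f };
    for (size_t i = 0; i < count; ++i) {
        Vec4 t = mat4_mul_vec4(&palette[index[i]], object_pos);
        eye.x += weight[i] * t.x;
        eye.y += weight[i] * t.y;
        eye.z += weight[i] * t.z;
        eye.w += weight[i] * t.w;
    }
    return eye;
}

/* Eye coordinates to clip coordinates: one more matrix multiply. */
Vec4 eye_to_clip(const Mat4 *projection, Vec4 eye)
{
    return mat4_mul_vec4(projection, eye);
}
```

The blended eye-space position is what a lighting computation would consume before the projection step; eye normals would need the same weighting with the matching normal matrices.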

That still doesn’t explain how the set of
weighting values for each matrix for each
vertex gets handed to the vertex program.
Or can the vertex program issue DMA requests
for floating-point data? (Doesn’t look like
it, but I haven’t dived deep into it).

I’m sorry to ask here, but what’s the trick with vertex blending? I haven’t got the theory yet. What does it do, and how?
I would understand if nobody answered my idiotic questions…

Actually, I want to blend transform matrices
for skeletal animation, not vertexes. Vertex
blending (a la Quake 2) only needs two
matrices and a fixed weight.

Anyway, skeletal animation is where you have
some “bones” (invisible) in your model,
which have swivel “joints” (one end of the
bone, typically) and some degree of freedom
of movement. Then you “tag” each vertex with
which “bone” it belongs to. At seams (joints)
a vertex belongs a little to at least two
bones; sometimes more.

The bones, in turn, are hierarchical, so that
bending your upper-arm will move the joint to
the lower-arm; then you bend the lower arm in
relation to the upper arm.
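To make the hierarchy idea concrete, here is a minimal sketch, assuming each bone stores a transform relative to its parent and the bones are listed parent-first. The names (`Mat4`, `mat4_mul`, `compose_hierarchy`) and the parent-index array are illustrative assumptions, not anything from a particular engine.

```c
#include <stddef.h>

/* Column-major 4x4 matrix, as OpenGL lays it out. */
typedef struct { float m[16]; } Mat4;

/* Returns a * b. */
Mat4 mat4_mul(const Mat4 *a, const Mat4 *b)
{
    Mat4 r;
    for (int col = 0; col < 4; ++col) {
        for (int row = 0; row < 4; ++row) {
            float s = 0.0f;
            for (int k = 0; k < 4; ++k)
                s += a->m[k * 4 + row] * b->m[col * 4 + k];
            r.m[col * 4 + row] = s;
        }
    }
    return r;
}

/* parent[i] is the index of bone i's parent, or -1 for a root bone.
   Bones must be ordered so that every parent comes before its children.
   world[i] ends up as the bone's transform in model space. */
void compose_hierarchy(const Mat4 *local, const int *parent,
                       size_t count, Mat4 *world)
{
    for (size_t i = 0; i < count; ++i) {
        if (parent[i] < 0)
            world[i] = local[i];
        else
            world[i] = mat4_mul(&world[parent[i]], &local[i]);
    }
}
```

With an upper arm as bone 0 and a lower arm as bone 1 (parent 0), moving the upper arm changes `world[1]` as well, while the lower arm's own bend stays expressed relative to the upper arm.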

While you can pre-calculate the various
matrices for the bones, the problem is that
to “fire” an entire mesh in one go, you need
blending support for one matrix for each
bone, which at a minimum (for a biped) is
back, upper&lower left&right legs, ditto
arms, and neck, for a total of 10 matrices.
Also, even if any individual vertex only
belongs to one or two bones (i.e. has non-0
weights for those matrices) the problem is
that to draw a triangle, you have to involve
three vertexes, and thus you could (worst
case) need 6 matrices active at the same time
to draw that one triangle.
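The worst-case count can be checked with a small helper. This is only a sketch under the assumption of two bone influences per vertex, stored as six flat entries per triangle; `matrices_needed` is a hypothetical name.

```c
#define TRIANGLE_INFLUENCES 6   /* 3 vertexes x 2 bone influences each */

/* Counts the distinct bones that have a non-zero weight on any of the
   triangle's three vertexes, i.e. how many matrices must be active at
   the same time to draw it. */
int matrices_needed(const int *bone, const float *weight)
{
    int seen[TRIANGLE_INFLUENCES];
    int count = 0;

    for (int i = 0; i < TRIANGLE_INFLUENCES; ++i) {
        int already_seen = 0;

        if (weight[i] == 0.0f)
            continue;               /* zero weight: matrix not needed */

        for (int j = 0; j < count; ++j)
            if (seen[j] == bone[i])
                already_seen = 1;

        if (!already_seen)
            seen[count++] = bone[i];
    }
    return count;
}
```

A triangle whose vertexes all sit on the same pair of bones needs only two matrices; six is reached only when every influence refers to a different bone.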

The current consensus is to do all of this
in model space on the host, and then write
the transformed vertexes to the card for
placement in the world, lighting, etc.

For more details, look up “skinning” on your
favourite web search tool.

Originally posted by mcraighead:
Rumor has it that there’s a leaked driver out there that seems to support this extension.

Another rumor tells me there are only Win2K and Win9x versions of this driver around… I am starting to wonder if WinNT is still supported… (just kidding here! I asked nVidia and they said an NT version will be available for the OFFICIAL release!).

Regards.

Eric

Oh yeah, thanks bgl, I get what you mean. So you don’t apply only the transformation matrix for one bone; you apply the others as well. But how do you weight between two matrices, mathematically speaking? Do you average the resulting vertices according to the weights of the matrices?

The way the spec says it’s implemented, it
calculates the transform for each of the
matrices, and then averages the result based
on the weight (i.e. w*V + (1.0-w)*V’). This
is for the nVidia 2-matrix extension, which
for various reasons is sub-optimal for skin
animation (but is optimal for full mesh
blending).
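Written out, with M_0 and M_1 standing for the two modelview matrices and v for the incoming vertex (these symbols are notation chosen here, not taken from the spec), the two-matrix blend is:

```latex
\begin{aligned}
V  &= M_0\,v, \qquad V' = M_1\,v \\
v_{\mathrm{eye}} &= w\,V + (1 - w)\,V'
                  = \bigl(w\,M_0 + (1 - w)\,M_1\bigr)\,v
\end{aligned}
```

Because the transform is linear, blending the transformed results is the same as blending the matrices first. The N-matrix case discussed earlier in the thread would follow the same pattern, with one weight per matrix and the weights summing to 1:

```latex
v_{\mathrm{eye}} = \sum_{i=0}^{N-1} w_i\,M_i\,v, \qquad \sum_{i=0}^{N-1} w_i = 1
```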

Oh, and in my sample, you really need another
four bones (feet, hands) to get good
animations, plus possible articulation for
fingers etc. So say 16 matrices for a good
enough game-level animation system.

I now have an answer on how the weighting
data per-vertex gets into the program,
however: glVertexAttrib4fNV(). Dunno how I
could possibly have missed that before even
though I only glanced at the docs.

Looks like, with swizzling, you can store the
weights in only 4 attribute registers for the
16 matrices. Man, I wish I was getting paid
to do this, because it looks like fun, but it
will take quite a while to get it right :(
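A minimal sketch of that packing, assuming 16 bones and one weight per bone per vertex: weight b goes into attribute b/4, component b%4. The names `Attrib4`, `pack_weights` and `unpack_weight` are hypothetical, and which attribute indices to use for the weights is left open; each packed value would then be handed to the card with one glVertexAttrib4fNV(index, x, y, z, w) call per attribute.

```c
#define BONE_COUNT     16
#define WEIGHT_ATTRIBS (BONE_COUNT / 4)   /* four xyzw attributes */

/* One xyzw vertex attribute value. */
typedef struct { float v[4]; } Attrib4;

/* Packs 16 per-vertex bone weights into four xyzw attributes. */
void pack_weights(const float *weights, Attrib4 *attribs)
{
    for (int b = 0; b < BONE_COUNT; ++b)
        attribs[b / 4].v[b % 4] = weights[b];
}

/* Reads back the weight for one bone, mirroring what the vertex program
   would do by picking attribute bone/4 and swizzling component bone%4. */
float unpack_weight(const Attrib4 *attribs, int bone)
{
    return attribs[bone / 4].v[bone % 4];
}
```

Most of the 16 weights will be zero for any given vertex, so this layout trades attribute bandwidth for simplicity in the program.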