Assimp: skeletal animation: How to

hi everyone,

i’d like to know how to set up a skeletal animation, and how to structure the necessary data in practice. first, assimp gives me the data like this:


struct Bone {
mat4 offset;
vector<pair<int, float>> vertexweights;
};

struct Mesh { 
vector<vec3> vertices;
vector<vec3> normals;
...
vector<Bone> Bones;
};

struct Scene {
vector<Mesh> meshes;
vector<Animation> animations;
...
};

i know how to extract the geometry so that i can draw the model without bones; for each mesh i create a “drawarrays” call. but there are several things to keep in mind:

  1. a “joint” (or “bone”) has an offset matrix; it transforms from mesh space to “bind pose”
  2. the order in which the joint matrices are multiplied is crucial

how i plan to do it: (vertex shader)


#version 450 core


/* uniform */
/****************************************************/
layout (std140, binding = 1) uniform StreamBuffer {
	mat4 View;
	mat4 Projection;
};

layout (std140, binding = 2) uniform BoneBuffer {
	mat4 Bones[256];
};

layout (location = 0) uniform mat4 Model = mat4(1);
/****************************************************/


/* input */
/****************************************************/
layout (location = 0) in vec3 in_position;
layout (location = 1) in vec2 in_texcoord;
layout (location = 2) in vec3 in_normal;
layout (location = 3) in vec3 in_tangent;
layout (location = 4) in vec3 in_bitangent;
layout (location = 5) in uvec4 in_boneindices;
layout (location = 6) in vec4 in_boneweights;
/****************************************************/


/* output */
/****************************************************/
out VS_FS {
	smooth vec3 position;
	smooth vec2 texcoord;
	smooth vec3 normal;
} vs_out;
/****************************************************/


mat4 Animation()
{
	mat4 animation = mat4(0);
	// divide by the component sum so the weights sum to 1
	// (normalize() would only make the vector unit-length)
	vec4 weights = in_boneweights / dot(in_boneweights, vec4(1));
	for (uint i = 0; i < 4; i++)
		animation += Bones[in_boneindices[i]] * weights[i];
	return animation;
}


void main()
{
	mat4 Transformation = Model;// * Animation();
	mat4 MVP = Projection * View * Transformation;

	gl_Position = MVP * vec4(in_position, 1);

	vs_out.position = (Transformation * vec4(in_position, 1)).xyz;
	vs_out.texcoord = in_texcoord;
	vs_out.normal = (Transformation * vec4(in_normal, 0)).xyz;
}

so i need to send up to 4 references to joint matrices to the vertex shader.
if a vertex needs fewer than 4 joint matrices, then i’ll fill the rest with a reference to the very first joint matrix, which is by default mat4(1) –> to eliminate undesired effects (as a fallback). if a vertex needs more than 4 joint matrix references, then … what ?

how do i build these attribute data ?
vector<vector<unsigned int>> indices_lists(vertexcount);
vector<vector<float>> weights_lists(vertexcount);
for (each bone of this mesh)
–> fill both arrays:
each array element belongs to a certain vertex, and is an array of needed joint references (index into the uniform joint matrix array + weight)
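for what it’s worth, that inversion step (from “per bone: which vertices” to “per vertex: which bones”) could be sketched like this in plain C++; the `Bone` struct is a hypothetical stand-in mirroring the simplified one at the top of the post:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for assimp's per-bone data: each bone stores
// (vertex index, weight) pairs, as in the simplified Bone struct above.
struct Bone {
    std::vector<std::pair<unsigned int, float>> vertexweights;
};

// Invert the mapping: from "per bone: which vertices" to
// "per vertex: which bones (+ weights)".
void BuildVertexBoneLists(const std::vector<Bone>& bones,
                          std::size_t vertexcount,
                          std::vector<std::vector<unsigned int>>& indices_lists,
                          std::vector<std::vector<float>>& weights_lists)
{
    indices_lists.assign(vertexcount, {});
    weights_lists.assign(vertexcount, {});
    for (unsigned int b = 0; b < bones.size(); ++b)
        for (const auto& vw : bones[b].vertexweights) {
            indices_lists[vw.first].push_back(b);         // bone index
            weights_lists[vw.first].push_back(vw.second); // bone weight
        }
}
```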

is this the “usual way” to do this or am i thinking a little too simple/complicated/weird ? :doh:

assuming that’s right, the next thing is to KEEP THE JOINT MATRIX ORDER, which means i have to rebuild these 2 arrays (of arrays of references to joints): how ?
i guess that i just have to sort the indices (into the uniform joint matrices) ascending, and BEFORE that i recursively go through the scene node tree and add needed “bones” BEFORE i add their needed children (if any) … correct, or wrong ?


when done, and assuming that’s correct (?), what data do i need on the cpu-side?
– joint offset matrix array
– array of the same size of type mat4, containing computed joint matrices (i have to send as uniform block)
– animations

when computing animations, how do i compute the mat4s ?
for each animation
– for each “channel” (affecting exactly 1 joint)
---- uniformJoints[…certain.index…] = jointOffsetMatrix[…certain.index…] * interpolatedKeys

… where “interpolatedKeys” are pairs of vec3 location / quat rotation / vec3 scale

is this how it’s done ?
(i’ve searched for simple examples, but couldn’t find anything [including a model file] that is understandable … this one is simply not transparent enough for me :()

i appreciate every advice !!

More than 4 bones per vertex is unlikely. The simplest solution is simply to take the four with the highest weights and discard the rest (reapportioning the weights to the 4 which are kept).

If you want more than 4, you’ll need to change the bone{weights,indices} vectors to arrays of vectors (using arrays of float/uint may use a vec4/uvec4 for each entry).

Use vectors of arrays rather than of vectors, e.g. vector<array<float,4>>. The maximum number of bones per vertex has to be fixed at compile time. But more importantly, the layout has to be array-like (i.e. M vertices with N weights each must be stored as M*N contiguous floats; vectors of vectors won’t give you that).
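A sketch of that reduction, given one vertex’s influences as (bone index, weight) pairs; `ReduceToFourInfluences` is a hypothetical helper name, and index 0 is assumed to refer to an identity fallback matrix as described earlier:

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <utility>
#include <vector>

// Reduce one vertex's bone list to at most 4 influences: keep the 4
// largest weights, drop the rest, and rescale so the kept weights sum
// to 1 again.
void ReduceToFourInfluences(std::vector<std::pair<unsigned int, float>> influences,
                            std::array<unsigned int, 4>& out_indices,
                            std::array<float, 4>& out_weights)
{
    // Sort descending by weight, then truncate to the 4 heaviest.
    std::sort(influences.begin(), influences.end(),
              [](const auto& a, const auto& b) { return a.second > b.second; });
    if (influences.size() > 4)
        influences.resize(4);

    float sum = 0.0f;
    for (const auto& iw : influences) sum += iw.second;

    out_indices = {0, 0, 0, 0};              // fallback: identity matrix slot
    out_weights = {0.0f, 0.0f, 0.0f, 0.0f};  // unused slots get zero weight
    for (std::size_t k = 0; k < influences.size(); ++k) {
        out_indices[k] = influences[k].first;
        out_weights[k] = (sum > 0.0f) ? influences[k].second / sum : 0.0f;
    }
}
```

Each vertex then contributes one `array<unsigned int,4>` and one `array<float,4>` to the contiguous attribute buffers.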

It appears that assimp refers to bones by name. So you need to construct a mapping between names and indices, then topologically sort that so that the index of a bone is always less than that of its children (that way, you can just process the bones in numerical order, and the parent’s transformation will always have been constructed by the time that you need it to construct the child’s transformation).
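A minimal sketch of the name-to-index mapping (the bone names here are made-up examples):

```cpp
#include <map>
#include <string>
#include <vector>

// Build a name -> index mapping for bones; assimp identifies bones by name,
// so this lets you translate names into indices into your bone arrays.
std::map<std::string, int> BuildBoneIndexMap(const std::vector<std::string>& bonenames)
{
    std::map<std::string, int> indexbyname;
    for (int i = 0; i < static_cast<int>(bonenames.size()); ++i)
        indexbyname[bonenames[i]] = i;
    return indexbyname;
}
```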

For each vertex: for each bone to which it is attached, compute an absolute (object-space) position, then blend all of those using the bone weights.

For vertex i, its bone-space position for bone j is obtained by transforming the vertex’ bind-pose position V[i] into the bone’s space using the bone’s offset matrix Mo[j]. This gives the vertex’ bone-space position Vb[i,j], which remains fixed as the mesh is animated.

Vb[i,j]=Mo[j]*V[i].

The transformation obtained by interpolating the per-bone per-keyframe transformations is the bone’s instantaneous transformation relative to its parent, Mr[j]. Multiplying all such transformations together down the skeletal hierarchy gives the instantaneous absolute (object-space) transformation Ma[j] for each bone.

Ma[j] = Ma[parent[j]] * Mr[j].

Note that Ma[parent[j]] needs to be calculated before Ma[j], which is why you need to topologically sort the bones.

For vertex i, the absolute position Va[i] is obtained by transforming its bone-space position by the bone transformation for each bone, then blending the results according to the bone weights:

Va[i] = sum over j in bone_indices[i] of w[i,j] * (Ma[j] * Vb[i,j])
      = sum over j in bone_indices[i] of w[i,j] * (Ma[j] * Mo[j] * V[i])

where w[i,j] is the weight of bone j for vertex i.

So the matrices which are passed to the vertex shader are Ma[j]*Mo[j].
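Putting Ma[j] = Ma[parent[j]] * Mr[j] and the final Ma[j]*Mo[j] together, the per-frame CPU work might be sketched like this (a bare-bones column-major `Mat4` stands in for glm::mat4, and the bones are assumed to be topologically sorted so that parent[j] < j):

```cpp
#include <array>
#include <cstddef>
#include <vector>

using Mat4 = std::array<float, 16>; // column-major 4x4, stand-in for glm::mat4

Mat4 Identity() { Mat4 m{}; m[0] = m[5] = m[10] = m[15] = 1.0f; return m; }

// Column-major matrix product: r[row,c] = sum_k a[row,k] * b[k,c].
Mat4 Mul(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int c = 0; c < 4; ++c)
        for (int row = 0; row < 4; ++row)
            for (int k = 0; k < 4; ++k)
                r[c * 4 + row] += a[k * 4 + row] * b[c * 4 + k];
    return r;
}

// parent[j] = index of bone j's parent (-1 for a root); Mr = interpolated
// parent-relative transforms for the current frame; Mo = bind-pose offset
// matrices. Returns the matrices to upload to the uniform block.
std::vector<Mat4> ComputeSkinningMatrices(const std::vector<int>& parent,
                                          const std::vector<Mat4>& Mr,
                                          const std::vector<Mat4>& Mo)
{
    std::vector<Mat4> Ma(Mr.size()), skin(Mr.size());
    for (std::size_t j = 0; j < Mr.size(); ++j) {
        Ma[j] = (parent[j] < 0) ? Mr[j] : Mul(Ma[parent[j]], Mr[j]);
        skin[j] = Mul(Ma[j], Mo[j]); // Ma[j] * Mo[j], as derived above
    }
    return skin;
}
```

The topological sort guarantees Ma[parent[j]] is already computed when bone j is reached, so one linear pass suffices.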

This is for linear blend skinning. For dual-quaternion skinning, the process is similar, but the transformations Ma[j]*Mo[j] are passed as dual quaternions rather than matrices, and instead of transforming each vertex by a transformation and blending the results, you blend the transformations and transform the vertex by the blended transformation.

Here’s a good tutorial on doing skeletal animation with ASSIMP and OpenGL:

I think this’ll answer your questions. If not, post-back here.

[QUOTE=Dark Photon;1287351]Here’s a good tutorial on doing skeletal animation with ASSIMP and OpenGL:

That’s the link he posted …

thanks very much for your replies!

i’m a little confused, but … ok
here’s an assimp overview:

WHAT THE VERTEXSHADER DOES (on the gpu-side):
– it has up to 4 matrix references available (i mean index + weight)
– it climbs ITSELF up the BONE tree

WHAT I NEED TO DO (on the cpu-side):
– “linearize” the bone tree (that means just create a SORTED array, parent ALWAYS BEFORE child)
– then create the meshes (with matrix references … with ascendingly SORTED indices)
– while the app runs, compute “current pose” matrices

now, assimp applies animations to NODES, not to BONES
a node can be anything, scene (root), a camera or light source, just anything, not just bones or meshes.

a bone:
–> offset matrix = transforms from MESH to BIND pose

a node:
–> transform = transforms from PARENT to CHILD

question:
do i HAVE TO calculate (+ send to GL as uniforms) the complete chain ?
that is:
(MESH to BIND) x
– (BIND to PARENT0) x
– (PARENT0 to CURRENT)

(MESH to BIND) x
– (BIND to PARENT0) x
– (PARENT0 to PARENT1) x
– (PARENT1 to CURRENT)

(MESH to BIND) x
– (BIND to PARENT0) x
– (PARENT0 to PARENT1) x
– (PARENT1 to PARENT2) x
– (PARENT2 to CURRENT)

and…
(MESH to BIND) x
– (BIND to PARENT0) x
– (PARENT0 to PARENT1) x
– (PARENT1 to PARENT2) x
– (PARENT2 to PARENT3) x
– (PARENT3 to CURRENT)

where each (PARENT_X to PARENT_X+1) is the node’s (bone’s) parent-to-child transform

OR just (MESH to BIND) x (BIND to CURRENT) ? (which means the vertex shader climbs the chain itself, and the “node’s transform” doesn’t matter at all)

where (MESH to BIND) = bone’s offset matrix
and (BIND to CURRENT) = provided by myself through key interpolation of certain nodes (bones?)

assumption:
i HAVE TO (/can) discard for this purpose animations of NODES that are NOT BONES, right ?

It really shouldn’t, because that’s going to be performing many redundant matrix multiplications (and each invocation will take as long as the worst-case invocation in the work group).

Topological sort. Have a “todo” set, initially containing all bones, and a “done” list (vector, array), initially empty. Remove any bones with no parent from todo and append them to done. Then move those bones whose parent is in done from todo to done. Repeat until you can’t move any more bones (if the todo set isn’t empty at that point you have a cyclic graph, which shouldn’t happen for a skeleton; it should be a tree).
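That algorithm, sketched with bones identified by a parent-index array (a hypothetical representation; -1 marks a bone with no parent):

```cpp
#include <vector>

// Topologically sort bones: repeatedly move bones whose parent is already
// "done" from the todo set to the done list, exactly as described above.
std::vector<int> TopoSortBones(const std::vector<int>& parent)
{
    const int n = static_cast<int>(parent.size());
    std::vector<bool> placed(n, false); // "todo" = bones not yet placed
    std::vector<int> done;
    done.reserve(n);
    bool moved = true;
    while (moved) {
        moved = false;
        for (int b = 0; b < n; ++b) {
            if (placed[b]) continue;
            if (parent[b] < 0 || placed[parent[b]]) { // root, or parent done
                done.push_back(b);
                placed[b] = true;
                moved = true;
            }
        }
    }
    // If done.size() < n here, the graph is cyclic, which shouldn't
    // happen for a skeleton; it should be a tree.
    return done;
}
```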

The offset matrix transforms a vertex from bind-pose object (mesh) space to bone space. It’s the inverse of the transformation from bone-space to object space when the skeleton is in the bind pose.

Some formats (mainly those without blending) store vertex positions in bone space, as this makes the animation code simple. Other formats store it in object space, and also store a bind pose which is the set of joint transformations corresponding to the vertex data; this is typically how the data is represented by the editor, and it’s the sensible choice if a vertex can be attached to multiple bones (storing vertex positions in bone space would require one position for each bone).

No. You just need one matrix per bone: the overall transformation from the bind pose to the instantaneous pose for that bone.

Vnow = (Mobject * (Mroot * Mchild1 * Mchild2 * … * MchildN) * Moffset) * Vbind.

The transformation created from an aiNodeAnim replaces that from the bind pose; i.e. each transformation is relative to the parent, but isn’t relative to the bind-pose transformation (the data stored in the model file often is relative to the bind pose; the conversion will have been performed by assimp in this case).

[QUOTE=john_connor;1287377]
assumption:
i HAVE TO (/can) discard for this purpose animations of NODES that are NOT BONES, right ?[/QUOTE]
If you’re importing a model with skeletal animation, there probably won’t be any non-bone animations, except perhaps for a single parent node. Movement animations (walk, run, etc) sometimes include an animated transformation corresponding to the model’s movement. Sometimes this is done by animating the skeleton’s root node, sometimes by transforming the object itself; sometimes it’s omitted, with the motion being handled by other means.

sorry for resurrecting this thread … :dejection:

just to make sure that i’m on the right track, i want to ask:
how do i have to build “transformation chains” ?

i have a “scene graph”, it’s an array of CNode types:


struct CDrawable
{
	unsigned int MaterialReference = 0;
	unsigned int MeshReference = 0;
};

struct CTransformation
{
	glm::vec3 Position = glm::vec3(0, 0, 0);
	glm::quat Rotation = glm::quat(1, 0, 0, 0);
	glm::vec3 Size = glm::vec3(1, 1, 1);

	/* translation * rotation * scale; used by the recursion below */
	glm::mat4 Matrix() const
	{
		return glm::translate(glm::mat4(1), Position)
		     * glm::mat4_cast(Rotation)
		     * glm::scale(glm::mat4(1), Size);
	}
};

struct CNode
{
	/* identifier */
	std::string Name = "";

	/* parent-child relation */
	int ParentIndex = -1;
	std::vector<unsigned int> ChildrenIndices;
	CTransformation Transformation_RelativeToParent;

	/* drawables */
	std::vector<CDrawable> Drawables;
};

to be able to build transformations for all the meshes i have to draw, i’ve got to start from “root” (array index 0, always exists in a CScene), and then recursively traverse the tree, keeping the transformation of the current node as the “base_transform” for all the children.


void RecursivelyDrawNode(const std::vector<CScene::CNode>& nodearray, int nodeindex, const glm::mat4& base_transformation, std::vector<CDrawableNode>& drawablenodes)
{
	/* skip out-of-range indices */
	if (nodeindex < 0 || nodeindex >= static_cast<int>(nodearray.size()))
		return;

	/* draw node */
	glm::mat4 transformation = base_transformation * nodearray[nodeindex].Transformation_RelativeToParent.Matrix();
	for (auto& drawable : nodearray[nodeindex].Drawables)
	{
		CDrawableNode drawablenode;
		drawablenode.Transformation = transformation;
		drawablenode.NodeReference = nodeindex;
		drawablenode.MaterialReference = drawable.MaterialReference;
		drawablenode.MeshReference = drawable.MeshReference;
		drawablenodes.push_back(drawablenode);
	}

	/* draw children */
	for (auto& childindex : nodearray[nodeindex].ChildrenIndices)
		RecursivelyDrawNode(nodearray, childindex, transformation, drawablenodes);
}

that’s the recursion. but how do i add keyframe animations ?
is it like this:

glm::mat4 transformation_for_this_node = base_transformation * relative_to_parent * animated_transformation;

/* ... using "transformation_for_this_node" as "base_transformation" for children ??? */

regarding key interpolation, i use lerp for the vec3 position and scale, and slerp for the quat rotation
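For reference, that interpolation can be sketched with plain stand-in types (in practice glm::mix and glm::slerp do the same and are what you’d normally use):

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };     // hypothetical stand-ins for glm types
struct Quat { float w, x, y, z; };  // assumed unit-length

// Linear interpolation for positions and scales.
Vec3 Lerp(const Vec3& a, const Vec3& b, float t) {
    return { a.x + (b.x - a.x) * t,
             a.y + (b.y - a.y) * t,
             a.z + (b.z - a.z) * t };
}

// Spherical linear interpolation for rotations.
Quat Slerp(Quat a, Quat b, float t) {
    float cosom = a.w * b.w + a.x * b.x + a.y * b.y + a.z * b.z;
    if (cosom < 0.0f) {               // take the shortest arc
        cosom = -cosom;
        b = { -b.w, -b.x, -b.y, -b.z };
    }
    float ka = 1.0f - t, kb = t;      // fall back to lerp when nearly parallel
    if (cosom < 0.9995f) {
        const float omega = std::acos(cosom);
        ka = std::sin(ka * omega) / std::sin(omega);
        kb = std::sin(kb * omega) / std::sin(omega);
    }
    return { ka * a.w + kb * b.w, ka * a.x + kb * b.x,
             ka * a.y + kb * b.y, ka * a.z + kb * b.z };
}
```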

[QUOTE=john_connor;1288221]sorry for resurrecting this thread … :dejection:

just to make sure that i’m on the right track, i want to ask:
how do i have to build “transformation chains” ?

…

glm::mat4 transformation_for_this_node = base_transformation * relative_to_parent * animated_transformation;

/* ... using "transformation_for_this_node" as "base_transformation" for children ??? */

regarding key interpolation i use lerp for vec3 position, scale and slerp for quat rotation[/QUOTE]

It’s actually very easy: first, you need to interpolate to find the keyframed bone transform for the current animation time, for every bone affected by the current animation (I won’t talk about blending animations here).

Having done so, for each bone you multiply that animated transform with the bone’s local transform. The animated transform modifies the local transform prior to calculating the world transform for a node in the graph.

To be clear, we normally say that for node N, the world transform is (world transform of N’s parent) * (local transform of N). Make it (parent world) * (local * animated) instead, so the animation is taken into account.

Now you have a world transform for each node; the last step is to multiply by the bone offset transform loaded with the skeleton (world * offset), which is just the inverse of the bone’s world transform in the bind pose. Happy to clarify this further; it took me years to understand and make it work (and more years to work out how to instance them!), happy to share!

the skeletal stuff works fine; i replaced the array nonsense with a hierarchical scene graph (a node, containing pointers to dynamically allocated child nodes). what still doesn’t work is “skinning” meshes, in other words: using the bone offset matrix somehow … this is what i have now:
https://sites.google.com/site/john87connor/home/6-1-loading-models-with-assimp

next would be mesh skinning to animate characters and so on …

question:
consider a node in a model, it has a parent node and several child nodes
the FINAL transformation of parent = P
the FINAL transformation of this node = T = ???
base transformation of this node (relative to parent) = B
blended keyframe transformation of this node = A
bone / node offset transformation = X

bind pose
T = P * B

animated pose
T = P * B * A

skinned animated pose ?
T = ?

correct me if i’m wrong …