I have been moving from immediate mode to full OpenGL 3+. It has gone well, but I still have a couple of issues to solve. One of the biggest is the move to a 4x4 transformation matrix.

I understand where the translation, rotation and scaling bits go, but I am a little unsure how best to handle the multiplication. I have a single 4x4 matrix with the scaling, rotation and translation all applied. Let's say I have a bone skeleton and I want to work out the initial positions (so I need to multiply in a recursive way). When I multiply this out to work out the bones' final positions I "should" (in my mind) just be able to multiply the 4x4 transformations, but I don't think this is right.
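To make the recursive multiplication concrete, here is a minimal Python sketch of the idea (not my actual code). It assumes column vectors, so a bone's world matrix is parentWorld * local, and the `Bone` class, `translation` and `world_matrices` are hypothetical names used only for illustration.

```python
def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as row-major nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(x, y, z):
    """4x4 translation matrix (column-vector convention)."""
    return [[1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]


class Bone:
    """Hypothetical bone: a local 4x4 matrix plus child bones."""

    def __init__(self, name, local, children=()):
        self.name = name
        self.local = local
        self.children = list(children)


def world_matrices(bone, parent_world=None, out=None):
    """Walk the skeleton and store world = parent_world * local per bone."""
    if out is None:
        out = {}
    if parent_world is None:
        world = bone.local  # the root has no parent
    else:
        world = mat_mul(parent_world, bone.local)
    out[bone.name] = world
    for child in bone.children:
        world_matrices(child, world, out)
    return out


# Example skeleton: root -> arm -> hand, translations only.
hand = Bone("hand", translation(0.0, 0.0, 3.0))
arm = Bone("arm", translation(0.0, 2.0, 0.0), [hand])
root = Bone("root", translation(1.0, 0.0, 0.0), [arm])

worlds = world_matrices(root)
hand_position = [worlds["hand"][i][3] for i in range(3)]  # last column
```

With translations only, the hand's world position is simply the sum of the offsets along the chain.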

The scale components affect the rotation components, so I am wondering: should I keep the scale, rotation and translation components separate, multiply these like I did before, and only combine them into a 4x4 at the very end when I have the final transformation? Or am I talking rubbish, the 4x4 matrix multiplication should just work, and there is a bug elsewhere in my code? (I can't get my objects to draw correctly as they did before, so there is definitely a problem somewhere.)

To give you an idea of what I did before, this is the pseudocode for the transformation code I had (scale, rotation and translation kept as separate items):

newXXX = the result of the multiplication
currentXXX = the current transformation values
toMultiplyXXX = the transformation to be multiplied with
(newXXX = currentXXX * toMultiplyXXX)

newPosition = (currentRotation * toMultiplyPosition) + currentPosition
newRotation = currentRotation * toMultiplyRotation
newScale = currentScale * toMultiplyScale
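
For comparison, here is a minimal Python sketch (again, not my actual code) that runs the pseudocode above next to a plain 4x4 multiplication. It assumes column vectors, matrices built as T * R * S, rotations stored as 3x3 matrices, and per-axis scale multiplied component-wise. All function names are hypothetical.

```python
import math


def rot_z(angle):
    """3x3 rotation about the Z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def mat_mul(a, b):
    """Multiply two square matrices stored as row-major nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat3_vec(m, v):
    """Multiply a 3x3 matrix by a 3-component vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def compose_separate(current, other):
    """The pseudocode above: each transform is (position, rotation, scale)."""
    cur_pos, cur_rot, cur_scale = current
    oth_pos, oth_rot, oth_scale = other
    rotated = mat3_vec(cur_rot, oth_pos)
    new_pos = [rotated[i] + cur_pos[i] for i in range(3)]
    new_rot = mat_mul(cur_rot, oth_rot)
    new_scale = [cur_scale[i] * oth_scale[i] for i in range(3)]
    return new_pos, new_rot, new_scale


def to_matrix(transform):
    """Build the 4x4 matrix T * R * S from (position, rotation, scale)."""
    pos, rot, scale = transform
    m = [[rot[i][j] * scale[j] for j in range(3)] + [pos[i]] for i in range(3)]
    m.append([0.0, 0.0, 0.0, 1.0])
    return m


def position_of(m):
    """Translation part (last column) of a 4x4 matrix."""
    return [m[i][3] for i in range(3)]


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Parent: rotated 90 degrees about Z and scaled by 2. Child: offset along X.
parent = ([0.0, 0.0, 0.0], rot_z(math.pi / 2), [2.0, 2.0, 2.0])
child = ([1.0, 0.0, 0.0], identity, [1.0, 1.0, 1.0])

separate_pos = compose_separate(parent, child)[0]
matrix_pos = position_of(mat_mul(to_matrix(parent), to_matrix(child)))
```

With these example values the two positions differ. The pseudocode rotates the child position but never applies the parent scale to it, whereas the 4x4 product computes parentRotation * parentScale * childPosition + parentPosition. So the two approaches are only expected to agree when the parent scale is 1.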