I am trying to create a simple augmented reality application in OpenGL.
What I have is a set of images, the camera's global position and rotation (the rotation as quaternions), and the intrinsic camera parameters.

I am following http://ksimek.github.io/2013/06/03/calibrated_cameras_in_opengl/ to implement the camera.

I am using
Code :
 glOrtho(0, 640, 480, 0, 0.1, 100);

so that overall intrinsicMat = glOrthoMatrix * ModifiedCameraMatrix (the intrinsic matrix padded to 4x4 as described in the article).


Code :
extrinsicMat = {invR, -invR*t; 0, 1} //4x4 matrix

where invR is the rotation matrix I get by converting the quaternion and then inverting (transposing) it, and t is the position of the camera in the global frame.

To display a cube (overlaid on the image), I take cube vertices in the global frame and use a vertex shader that does:

Code :
    gl_Position = intrinsicMat*(ExtrinsicMat*vec4(ModelRotation*(cubeVertex)+cubeCenter,1.0f));

But as the camera moves in 3D, the object does not stay stable; even when it comes into view, it does not appear anchored in space.

My question: is the above approach correct, or is there an algorithmic mistake (am I missing a transpose or an inverse somewhere, or is my camera calculation plain wrong)?

PS: I am confident that my camera calibration and my quaternions are correct, as I am using a verified dataset.