How to transform screen space back to clip space?

Hi,

How can I transform a point in screen space back to clip space in the shader?

It should be a matrix multiplication.

thanks

It should be a matrix multiplication.

You can’t. You need to do a division, so a pure-matrix multiplication isn’t really possible.

The math for reverse-transforming is pretty simple. I cover it here.
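For reference, here is a minimal CPU-side sketch of that reverse transform (window space back to camera space), assuming GLM and the default glDepthRange of [0, 1]; the function name is illustrative. The point of the sketch is the last step: after multiplying by the inverse projection you still have to divide by w, which no single matrix multiplication can express.

[b]#include <glm/glm.hpp>

// Window-space point (x, y in pixels, z in [0, 1]) back to camera space.
glm::vec3 WindowToCamera(glm::vec3 win, glm::vec2 windowSize,
                         const glm::mat4 &cameraToClipMatrix)
{
    // Window space -> normalized device coordinates, all in [-1, 1].
    glm::vec4 ndc;
    ndc.x = (win.x / windowSize.x) * 2.0f - 1.0f;
    ndc.y = (win.y / windowSize.y) * 2.0f - 1.0f;
    ndc.z = win.z * 2.0f - 1.0f;          // assumes the default glDepthRange(0, 1)
    ndc.w = 1.0f;

    // NDC -> camera space: multiply by the inverse projection...
    glm::vec4 camera = glm::inverse(cameraToClipMatrix) * ndc;

    // ...then undo the perspective divide. This division is why the reverse
    // transform cannot be a pure matrix multiplication.
    return glm::vec3(camera) / camera.w;
}[/b]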

Jason??
OMG, you are my hero, thank you a thousand times for your fantastic tutorial!
Following your tutorial, I finished a hand model with modern OpenGL here: http://vimeo.com/30794383

As you may notice here http://vimeo.com/29135246, I can get the hand's global position in screen space (or maybe just viewport space). To match the 3D model to the exact hand position on screen, I think I need to get the clip-space coordinates of the hand.

I think I don't need to calibrate the Kinect for OpenGL; I just need a correct view frustum.

I've pasted some of your tutorial's code relevant to my question here:

[b]vec3 CalcCameraSpacePosition()
{
    // Window space -> normalized device coordinates.
    // gl_DepthRange holds the near/far values set by glDepthRange (0 and 1 by default).
    vec4 ndcPos;
    ndcPos.xy = ((gl_FragCoord.xy / windowSize.xy) * 2.0) - 1.0;
    ndcPos.z = (2.0 * gl_FragCoord.z - gl_DepthRange.near - gl_DepthRange.far) /
        (gl_DepthRange.far - gl_DepthRange.near);
    ndcPos.w = 1.0;

    // NDC -> clip space: undo the perspective divide (gl_FragCoord.w is 1/clip.w).
    vec4 clipPos = ndcPos / gl_FragCoord.w;

    // Clip space -> camera space.
    return vec3(clipToCameraMatrix * clipPos);
}[/b]

So clipPos is what I need, and I should pass this value to the vertex shader to get the correct position of my matched point.

If I can get gl_FragCoord from another source, then it won't be hard to get clipPos.

Should I do this in a shader? I just need one point's clip position per frame.
What is gl_DepthRange?

thank you

So clipPos is what I need, and I should pass this value to the vertex shader to get the correct position of my matched point.

I don’t know. You’re asking a different question now. Your original question assumed that the point was in screen space, but you’re not sure that this is what you’re getting from Kinect. My suggestion is to first be certain of what space your input data is in, then figure out how to get it into the space you need.

What is gl_DepthRange?

I didn’t write all ~400KB of text for those tutorials just so that you could ignore it, copy-and-paste code into your project, and then ask me questions about it that the text already answers. I explained what it is exactly 3 paragraphs below the code you copied.

I just reviewed the output from the Kinect.

A hand point's position in the Kinect's "perspective coordinates" (ptProjective) looks like this:

ptProjective.X and ptProjective.Y are in pixel coordinates: (100.0, 100.0) means the point is 100 pixels from the left and 100 pixels from the top, with the origin at the top-left corner of the window.

ptProjective.Z is in millimeters: 1000.0 means the point is 1 meter from the camera; 1500 means 1.5 meters.

Do you have an idea how to transform it back to clip space?
I think I should normalize Z to [0, 1] and then try to convert from window (viewport) space back to clip space.

Sorry for the confusion; I thought this was in screen space.

And I have another question: how do I save the rendered image to a specific image format?

I heard that one way is to render the framebuffer to a texture and read the data back from the texture.

Is there a way that is fast and easy to use?
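If reading back the default framebuffer once per frame is acceptable, glReadPixels is the fast and easy route; rendering to a texture via an FBO mostly matters when the image should stay on the GPU. A minimal sketch, assuming a current GL context and a known window size (the function name and the PPM output format are just illustrative choices):

[b]#include <cstdio>
#include <vector>
#include <GL/gl.h>

// Read the current framebuffer and dump it as a binary PPM file.
void SaveFramebufferToPPM(const char *filename, int width, int height)
{
    std::vector<unsigned char> pixels(width * height * 3);

    glPixelStorei(GL_PACK_ALIGNMENT, 1);   // rows tightly packed
    glReadPixels(0, 0, width, height, GL_RGB, GL_UNSIGNED_BYTE, pixels.data());

    FILE *fp = std::fopen(filename, "wb");
    if (!fp)
        return;

    std::fprintf(fp, "P6\n%d %d\n255\n", width, height);

    // OpenGL's origin is the bottom-left corner, PPM's is the top-left,
    // so write the rows in reverse order.
    for (int y = height - 1; y >= 0; --y)
        std::fwrite(&pixels[y * width * 3], 1, width * 3, fp);

    std::fclose(fp);
}[/b]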

Do you have an idea how to transform it back to clip space?

Why would you want to transform it to clip space? It's not in a homogeneous coordinate system as-is. It isn't projected (is it?); it's just in a different world space from the one you're using.

What you want it in is probably world or camera space, to use as a transform offset for your object.
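One way to do that is to treat the Kinect point as a pinhole-camera measurement and convert it straight into camera space. A minimal sketch, assuming GLM; the function name and the field-of-view constants are placeholders, so substitute your depth camera's real intrinsics (or the Kinect SDK's own projective-to-world conversion if you have it):

[b]#include <cmath>
#include <glm/glm.hpp>

// Kinect projective point (pixel x/y, depth in millimeters) -> camera-space meters.
glm::vec3 KinectProjectiveToCameraSpace(float pixelX, float pixelY, float depthMM,
                                        float imageWidth, float imageHeight)
{
    const float horizFOV = 58.0f * 3.14159265f / 180.0f;   // assumed horizontal FOV
    const float vertFOV  = 45.0f * 3.14159265f / 180.0f;   // assumed vertical FOV

    float depthM = depthMM / 1000.0f;                       // millimeters -> meters

    // Pixel -> [-1, 1], flipping Y so +Y points up as in OpenGL.
    float nx = (pixelX / imageWidth) * 2.0f - 1.0f;
    float ny = 1.0f - (pixelY / imageHeight) * 2.0f;

    // Scale by the half-angle tangents and the depth; the camera looks down -Z.
    float x = nx * std::tan(horizFOV * 0.5f) * depthM;
    float y = ny * std::tan(vertFOV * 0.5f) * depthM;
    return glm::vec3(x, y, -depthM);
}[/b]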

Why would you want to transform it to clip space?

Because I want my 3D model to render at the position extracted from this special "perspective coordinate", and this point's position keeps changing.

it’s just in a different world space from the one you’re using.

Yes, this position is in another world coordinate system, so my task here is to simulate the same setup with OpenGL and render my 3D model as if it were shot from the real camera.

What you want it in is probably world or camera space, to use as a transform offset for your object.

I wanted clip space because my model's global position is in clip space; I send this position to the shaders to render it.

All I'm doing here is a sort of "object tracking" against my real-world camera. In particular, from the Kinect I can additionally get a depth value to use as the Z value in 3D space.

If it is not homogeneous, can I still use the formula above?

I wanted clip space because my model's global position is in clip space

I find it highly unlikely that the vertex positions for the mesh you’re trying to render are in clip-space. Are you sure you’re not confusing clip-space with camera space or world space? Or even just model space?

If it is not homogeneous, can I still use the formula above?

No. That formula is for transforming OpenGL window-space coordinates into OpenGL’s clip-space (and later user-defined camera space). The coordinates you have are not in OpenGL window-space, and therefore any data you attempt to feed into that formula will produce garbage.

I tested this code; it works, but not very accurately.

I estimated gl_FragCoord.w myself based on my GL camera's look-at position.

[b]float ndcPos[4];
// Pixel X -> NDC X in [-1, 1].
ndcPos[0] = ((g_pDrawer->ptProjective.X / CurrentWidth) * 2.0f) - 1.0f;
// Pixel Y -> NDC Y, flipped because Y should point up in NDC.
ndcPos[1] = (((CurrentHeight - g_pDrawer->ptProjective.Y) / CurrentHeight) * 2.0f) - 1.0f;
// Kinect depth -> NDC Z, treating 1.0 and 1200.0 as the near/far range.
ndcPos[2] = (2.0f * g_pDrawer->ptProjective.Z - 1.0f - 1200.0f) / (1200.0f - 1.0f);
ndcPos[3] = 1.0f;
std::cout << ndcPos[0] << " " << ndcPos[1] << " " << ndcPos[2] << std::endl;

// Scale by the estimated w (10.0) to get the hand position passed to the shaders.
handPara[0] = ndcPos[0] * 10.0f;
handPara[1] = ndcPos[1] * 10.0f;
handPara[2] = ndcPos[2] * -10.0f;[/b]

ptProjective comes from the Kinect; the near and far values (1.0 and 1200.0) depend on the range of ptProjective.Z.

And to estimate w, I observed ndcPos[2] (Z) and found that 10.0 works well.
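If the hand point is first converted into camera space (for example with a pinhole model as sketched earlier), the exact projection matrix used for rendering can replace the hand-tuned 10.0 and the guessed depth range. A minimal sketch under that assumption (GLM assumed, function name illustrative):

[b]#include <glm/glm.hpp>

// Camera-space point -> normalized device coordinates, using the real projection.
glm::vec3 CameraToNDC(const glm::vec3 &cameraPos, const glm::mat4 &cameraToClipMatrix)
{
    glm::vec4 clipPos = cameraToClipMatrix * glm::vec4(cameraPos, 1.0f);
    return glm::vec3(clipPos) / clipPos.w;   // the perspective divide
}[/b]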

Now the 3D model follows my hand. I uploaded a video of the hand following the ptProjective value: http://vimeo.com/30836638

Now the problem is accuracy.