Just simple, clear questions about Basics of 3D

The trouble is that I have always worked with a 2D coordinate system, so everything was simple: just snap things around x,y.

-What is Eye Space
-What is Model Space
-What is Modelview
-What is Projection
-What is World
-What do the matrices for the above stand for
-Why can't I just work with x,y,z
-Why do I have to transform things, multiplying by some matrices
-Does every object have a certain x,y,z coordinate
-Coordinates are relative to what? I mean, in 2D everything is based on the 0,0 point. No world things, projections etc., you just put things in a certain 2D location.

I think the best place for you to start is to remember that in 3D graphics APIs (such as OpenGL or D3D) you are still drawing to a 2-dimensional screen. This is key. Each of these concepts:
[ul]
[li] What is Eye Space[/li][li] What is Model Space[/li][li] What is Modelview[/li][li] What is Projection[/li][li] What is World[/li][li] What do the matrices for the above stand for[/li][li] Why do I have to transform things, multiplying by some matrices[/li][/ul]

are ultimately about how to transform points in 3-space projected onto a 2-dimensional screen. In OpenGL there are some standard coordinates to be aware of:

[ul]
[li] Window coordinates. These essentially name a pixel on the screen. However, the GL convention is that (0,0) is the bottom left corner[/li][li] In GL, via glViewport you can restrict drawing to a portion of the window[/li][li] Normalized Device Coordinates: there are for x, y and z. The values that are visible are within [-1,1] and are mapped to the viewport specified via glViewport. For example (-1,-1, z) means the bottom left corner of the viewport and (0,0, z) means the middle of the viewport. The z-value determines what value to “test and write” for depth testing, again in the range [-1,1]. -1 indicating minimum and +1 indicating maximum. That too when looking at the depth buffer has an analgoue to glViewport via glDepthRange. Though, use of glDepthRange is typically not a starter place.[/li][li] Eye coordinates are the coordinates relative to the viewer of a point. So if the viewer “moves up 10 units” then the point’s y-coordinate reduces by 10. Also if the point “moves down 10 units” it’s coordinate does too. Very often the composition of “where the point is in the world being simulated” and “where the viewer is and looking at” are assembled together in one transformation represented by a Modelview matrix. There are other ways to represent transformations besides a 4x4 matrix, but this method is easy for developers and machines to handle.[/li][li] Projection refers to taking eye-coordinate and getting what are called “Clip Coordinates”. Roughly speaking, 3D-coordiataes in GL are referenced by 4 numbers. (x,y,z,w) where usually w is 1 for eye-coordinates. Projective coordinates produces another tuple of 4 numbers (xC, yC, zC, wC) which are then used to figure out the normalized device coordinates. The formula is simple: (xC/wC, yC/wC, zC/wC). The matrix that produces (xC, yC, zC, wC) from eye coordinates is usually called a projection matrix.[/li][/ul]

As a side note, matrix multiplication is composition of the transformations the matrices represent: applying B to a point and then A to the result is the same as applying the single matrix A * B.
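That side note can be checked numerically. A pure-Python sketch with made-up example matrices (a translation followed by a scale):

```python
# Composition check: applying T to a point, then S to the result,
# equals applying the single matrix (S * T). Plain 4x4 maths, no libraries.

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def mat_vec(m, v):
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

T = [[1, 0, 0, 5],   # translate +5 along x
     [0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]
S = [[2, 0, 0, 0],   # scale x by 2
     [0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]

p = [1, 2, 3, 1]
step_by_step = mat_vec(S, mat_vec(T, p))   # translate first, then scale
composed     = mat_vec(mat_mul(S, T), p)   # one combined matrix
assert step_by_step == composed == [12, 2, 3, 1]
```

Note the order: S * T means "T first, then S", since the vector is multiplied from the right.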

One more : Is what i do in GLSL done in eye space ?

What space you operate on is up to you. However, there are certain clearly defined spaces that are part of the transformation pipeline:

Object (or Model) Space => Eye (or View) Space => Clip Space => Normalized Device Coordinates => Screen Space

For instance, if your object is defined in object space you have, at least mathematically, no relation between the object's vertices and the vertices of other objects. Operating on vertices from different spaces usually gives false results and is basically nonsense. If you transform your object-space vertices to world space, you have a precisely defined relation between them and all other vertices in world space.

However, if you have a 3x3 model matrix M which is the identity and an object-space 3x1 column vector V_object you get

V_object = M * V_object = V_world -> V_object = V_world (where V_world is the world-space position)

Still, mathematically you do a linear transform from R^3 to R^3.

The difference becomes more obvious when doing calculations with mixed world-space and eye-space values. If you have a 4x4 model matrix M, a 4x4 view matrix V and two 4x1 object-space vectors V1_object and V2_object, consider the following equations:

V1_world = M * V1_object

V2_eye = V * M * V2_object (where V != I)

V3_wtf = V1_world - V2_eye

Now you have V3_wtf, which has been calculated from vectors that conceptually reside in different spaces (although mathematically both are still in R^4), so their relation may (and most likely will) be completely meaningless. This is a frequent source of errors when doing lighting calculations - at least for beginners.

The quintessence: You can use whatever space you want but (generally) NEVER do your math with stuff from different spaces.
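The mixed-space mistake above can be reproduced numerically. A pure-Python sketch; the model and view matrices below are made-up translations, chosen only so the error is visible:

```python
def mat_vec(m, v):
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def translation(tx, ty, tz):
    return [[1, 0, 0, tx],
            [0, 1, 0, ty],
            [0, 0, 1, tz],
            [0, 0, 0, 1]]

M = translation(0, 0, -10)   # model matrix: object sits 10 units into the scene
V = translation(0, -5, 0)    # view matrix: camera 5 units up -> world moves down

v1_obj = [1, 0, 0, 1]
v2_obj = [1, 0, 0, 1]        # same object-space point as v1_obj

v1_world = mat_vec(M, v1_obj)
v2_eye   = mat_vec(V, mat_vec(M, v2_obj))

wtf     = [a - b for a, b in zip(v1_world, v2_eye)]              # mixed spaces
correct = [a - b for a, b in zip(v1_world, mat_vec(M, v2_obj))]  # both world space

print(wtf)      # -> [0, 5, 0, 0]  a phantom offset that exists in no space
print(correct)  # -> [0, 0, 0, 0]  the two points coincide in world space
```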

Just to add some code to show some GLSL:


uniform mat4 ModelMatrix;
uniform mat4 ViewMatrix;
uniform mat4 ProjectionMatrix;

// our vertices - defined in object space
in vec4 Position;

void main()
{
    vec4 WorldSpaceVec = vec4(1.0, 1.0, 1.0, 1.0);
    vec4 WorldSpacePos = ModelMatrix * Position;

    // good - subtract vectors from the same space
    vec4 WorldSpaceDiff = WorldSpaceVec - WorldSpacePos;

    vec4 EyeSpacePos = ViewMatrix * WorldSpacePos;

    // bad - subtract a world-space vector from an eye-space vector
    vec4 WTFVec = EyeSpacePos - WorldSpaceDiff;
    
    vec4 ClipSpacePos = ProjectionMatrix * EyeSpacePos;
    vec4 NormalizedDeviceCoordinates = ClipSpacePos / ClipSpacePos.w;
}

What happens when I multiply gl_Vertex by gl_ModelViewMatrix in gl_Position? Which space will I get with gl_ModelViewMatrix, and which with gl_ModelViewProjectionMatrix?

You’d get the eye-space position. Which, when assigned to gl_Position, will most likely not yield the result you want. If you multiply by both the model-view and the model-view-projection matrix the universe will explode.

Are you familiar with matrix composition and what it implies?

So if I multiply real coordinates (I mean the ones we handle with glTranslatef()) with gl_ModelViewMatrix, I will get eye coordinates. Once everything is in the same space, I can freely compute them with each other? By the way, is gl_Vertex in GLSL real coordinates without any multiplication? And I have heard that -gl_Position = eye vector in GLSL; how true is that?

So if I multiply real coordinates (I mean the ones we handle with glTranslatef()) with gl_ModelViewMatrix[…]

Stop right there. glTranslatef() has nothing to do with vertices or coordinates. glTranslatef() constructs a 4x4 translation matrix T and automatically post-multiplies the current (or top) matrix C on the currently selected stack, e.g. GL_MODELVIEW, by this translation matrix: NewMatrix = C * T
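What glTranslatef() does internally can be sketched in pure Python. This starts from the identity matrix purely for illustration; the translation amounts are arbitrary:

```python
def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def identity():
    return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

def gl_translatef(current, x, y, z):
    # glTranslatef builds T and POST-multiplies the current top-of-stack
    # matrix: NewMatrix = Current * T. No vertex is touched here.
    t = identity()
    t[0][3], t[1][3], t[2][3] = x, y, z
    return mat_mul(current, t)

modelview = identity()
modelview = gl_translatef(modelview, 3.0, 0.0, 0.0)
modelview = gl_translatef(modelview, 0.0, 2.0, 0.0)
# The two translations compose; the last column holds the accumulated offset.
assert [row[3] for row in modelview] == [3.0, 2.0, 0.0, 1.0]
```

Vertices are only affected later, when they are multiplied by whatever matrix has accumulated on the stack.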

Now, gl_ModelViewMatrix will contain whatever matrix is currently at the top of the GL_MODELVIEW matrix stack. If you do the following



vec4 EyeSpacePos = gl_ModelViewMatrix * gl_Vertex;


you can see the obvious result of that operation.

Once everything is in the same space, I can freely compute them with each other?

You can compute whatever you want - GLSL won’t hinder you. Computing stuff with operands that are in the same space, however, can produce meaningful results in the current context. Calculating stuff with operands from unrelated spaces most likely won’t.

By the way, is gl_Vertex in GLSL real coordinates without any multiplication?

gl_Vertex is set to the value the current invocation of the vertex shader will process, i.e. if you send some vertices v1, v2, v3, … and so on to the GPU, no matter how, the vertex shader will be invoked multiple times to process all vertices. Invocation 1 may process v1, invocation 2 may process v2 and so on; gl_Vertex takes on the value of v_n in invocation n. In general, vertices are defined to be in object space - that's why we multiply by the model-view matrix in the first place: to take them to a space where vertices from other objects may already be.
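Conceptually, the GPU runs something like the following loop. This is a Python caricature of per-vertex invocation, not how hardware actually schedules work, and the modelview translation is a made-up example value:

```python
def mat_vec(m, v):
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

modelview = [[1, 0, 0, 0],
             [0, 1, 0, 0],
             [0, 0, 1, -5],   # made-up view translation: 5 units along -z
             [0, 0, 0, 1]]

# Object-space vertices, as submitted by the application.
vertices = [[0, 0, 0, 1], [1, 0, 0, 1], [0, 1, 0, 1]]

def vertex_shader(gl_vertex):
    # One invocation per vertex; gl_Vertex is the raw, untransformed input.
    return mat_vec(modelview, gl_vertex)

eye_space = [vertex_shader(v) for v in vertices]
print(eye_space)   # every vertex pushed 5 units along -z
```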

and i have heard that -gl_position = eye vector in glsl

This might be the case if you somehow manage to set gl_Position = -eye. I may be ignorant, but I’ve not seen that as of yet.

, how true is that ?

In general such a suggestion is complete nonsense.

One final thought: I think you need to work through an introductory text on general graphics programming and a similar text on OpenGL. You get many important things mixed up and reading this forum alone will not save you when problems come up. And come up they will.

What are “homogeneous coordinates” of an object? A fixed location in space, like everything in 2D has? I read about homogeneous coordinates of a vertex point in the definition of gl_Vertex.

The homogeneous coordinate, usually dubbed w, is a little trick used to extend from R^3 (e.g. 3D vectors and 3x3 matrices) to R^4, which allows us to pack a rotation, a translation and a projection into a single 4x4 matrix. w is usually 1, but can be any non-zero number. (I'm not quite sure at the moment if a negative w is possible or would make sense.)

Notably, after a perspective projection the w coordinate usually is no longer 1 or any other preset value - with an orthographic projection w doesn't change.

After perspective projection, a vertex has coordinates in the range [-w, w] and is of the form (x, y, z, w). You'll notice the difference: the components of the vertex are now within a defined range, which can easily be mapped to the so-called canonical volume - i.e. a cube spanning [-1, 1] along each axis. Vertices with coordinates in this space are also said to be in normalized device coordinates and can easily be transformed to any viewport.
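To make the role of w concrete, here is a pure-Python sketch. The projection matrix below is a minimal symmetric perspective matrix with assumed example planes near = 1 and far = 10 (these values are illustrative, not from the discussion above):

```python
def mat_vec(m, v):
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

n, f = 1.0, 10.0   # assumed near and far planes
P = [[1.0, 0.0,  0.0,               0.0],
     [0.0, 1.0,  0.0,               0.0],
     [0.0, 0.0, -(f + n) / (f - n), -2.0 * f * n / (f - n)],
     [0.0, 0.0, -1.0,               0.0]]

for z_eye in (-1.0, -5.5, -10.0):
    xc, yc, zc, wc = mat_vec(P, [0.0, 0.0, z_eye, 1.0])
    # After projection w is no longer 1: with this matrix, wc == -z_eye.
    ndc_z = zc / wc          # the perspective divide
    print(wc, round(ndc_z, 3))
# The near plane maps to ndc_z == -1, the far plane to ndc_z == +1.
```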

I repeat: You can learn all this from a text on computer graphics! If you’re serious about graphics programming you should read such a text!

Edit: Although introducing w has consequences when doing transformations, my explanation is lacking the mathematical foundation. See this for more on the topic: Homogeneous Coordinates