Measure lengths in a picture

Hi!

I’ve been asked to measure lengths in a picture. We have an OpenGL program that loads a panoramic image in cube-map format (I wasn’t involved in the project). For now, I’m working only with one face.

Up until now I’ve been reading about OpenGL and how to retrieve pixel information under the mouse cursor (using Qt):


    // Snippet from a Qt mouse event handler (pme is the QMouseEvent*).
    // Needs <QDebug>, <iostream>, <cstdio>, <stdexcept> and <GL/glu.h>.
    GLint viewport[4];
    GLdouble modelview[16];
    GLdouble projection[16];
    glGetDoublev( GL_MODELVIEW_MATRIX, modelview );
    glGetDoublev( GL_PROJECTION_MATRIX, projection );
    glGetIntegerv( GL_VIEWPORT, viewport );

    const int x = pme->x();
    const int y = viewport[3] - pme->y();   // flip y: Qt's origin is top-left, OpenGL's is bottom-left

    qDebug() << "HERE: " << x << y;

    GLfloat color[4];
    glReadPixels( x, y, 1, 1, GL_RGBA, GL_FLOAT, color);
    GLenum error = glGetError();
    std::cout << "RETRIEVED COLOR:" << color[0] << ", " << color[1] << ", " << color[2] << ", " << color[3] << std::endl;
    printf( "	ERROR: %s (Code: %u)
", gluErrorString(error), error );
    if(GL_NO_ERROR != error) throw;

    GLdouble depthScale;
    glGetDoublev( GL_DEPTH_SCALE, &depthScale );
    std::cout << "DEPTH SCALE: " << depthScale << std::endl;
    GLfloat z;
    glReadPixels( x, y, 1, 1, GL_DEPTH_COMPONENT, GL_FLOAT, &z );
    error = glGetError();
    std::cout << "X: " << x << ", Y: " << y  << ", RETRIEVED Z: " << z << std::endl;
    printf( "	ERROR: %s (Code: %u)
", gluErrorString(error), error );
    if(GL_NO_ERROR != error) throw;
    std::cout << std::endl << std::endl;

    GLdouble posX, posY, posZ;
    GLint result;
    result = gluUnProject( x, y, z, modelview, projection, viewport, &posX, &posY, &posZ);
    error = glGetError();
    std::cout << "3D point with POS: " << posX << " " << posY << " " << posZ << std::endl;
    printf( "	ERROR: %s (Code: %u)
", gluErrorString(error), error );
    std::cout << "	glUnProject: " << (( GL_FALSE == result ) ? "FALSE" : "TRUE") << std::endl;
    if(GL_NO_ERROR != error) throw;
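
Once I have two unprojected points (say, the two ends of the table edge), the obvious next step I can think of is measuring their separation; a sketch of a helper I have in mind (the name is mine), which of course only gives model units:

    #include <cmath>

    // My sketch only: Euclidean distance between two points returned by
    // gluUnProject. The result is in MODEL units, not meters.
    double modelDistance( double ax, double ay, double az,
                          double bx, double by, double bz )
    {
        const double dx = bx - ax;
        const double dy = by - ay;
        const double dz = bz - az;
        return std::sqrt( dx*dx + dy*dy + dz*dz );
    }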

Now, I would like to “transform” the coordinates I receive (e.g. 0.115003, -0.178121, 0.977627, which I understand are MODEL coordinates) into “real-world” coordinates.

For example, if I have a picture of a table (let’s say the table in the real world is 3 meters long), I’m trying to measure those 3 meters using the picture.
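
One idea I can think of (no clue whether it is valid for a panorama, where everything is projected onto the cube): if something else of known real length is visible in the picture, the real/model ratio would give me a scale factor for any other model-space distance:

    // My assumption only: model distances scale linearly to real ones,
    // so one known reference length fixes the scale factor.
    double scaleFromReference( double knownRealMeters, double measuredModelUnits )
    {
        return knownRealMeters / measuredModelUnits;
    }

    // e.g.: tableMeters = modelDistance( ... ) * scaleFromReference( 2.0, refModel );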

So:

1. Is this even possible?
2. If it is possible, what is the next step? I’m not asking for code, but for guidance.

I have the FOV values, the texture sizes, and the texture scale. I also have the angles involved in the rotation of the panoramic image, the eye position, and the camera’s origin, up, and direction vectors.

The problem is that I’m not able to use them to get the “real-world” coords.
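
The furthest I got is computing the angle between two picked directions, since for the cube map everything is seen from the eye position; a sketch of what I mean:

    #include <cmath>

    // My sketch: angular separation of two picked points a and b
    // (from gluUnProject) as seen from the eye position e.
    double angleBetween( double ex, double ey, double ez,
                         double ax, double ay, double az,
                         double bx, double by, double bz )
    {
        const double ux = ax - ex, uy = ay - ey, uz = az - ez;
        const double vx = bx - ex, vy = by - ey, vz = bz - ez;
        const double lu = std::sqrt( ux*ux + uy*uy + uz*uz );
        const double lv = std::sqrt( vx*vx + vy*vy + vz*vz );
        return std::acos( ( ux*vx + uy*vy + uz*vz ) / ( lu*lv ) );   // radians
    }

But an angle alone doesn’t give me meters, which is exactly where I’m stuck.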

Any help will be appreciated.

Thanks and cheers,
Andrés.

I believe this type of problem falls into the field of “computer vision”, so you may have better luck asking in a forum dedicated to that topic; for example, you could try the forums of OpenCV (a computer vision library).

I have only very limited experience in that field, but my understanding is that it is possible to do what you want. However, you may need more detailed information about the camera, specifically the parameters that describe the projection the camera lens performs. I don’t know enough to suggest a good next step, other than trying to understand the transformations that map a real-world point to a pixel in the camera (i.e. a position on the camera sensor). If you had those transformations, their inverse would take you from a pixel in the image to a real-world location, from which you can do your measurements.
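
For what it’s worth, the textbook pinhole model (which may or may not match your camera; the parameter names are mine) inverts a pixel into a viewing ray like this:

    // Hedged textbook sketch: fx, fy are the focal lengths in pixels and
    // (cx, cy) is the principal point; these intrinsics are the kind of
    // camera parameters I mentioned above.
    void pixelToRay( double u, double v,
                     double fx, double fy, double cx, double cy,
                     double& rx, double& ry, double& rz )
    {
        rx = ( u - cx ) / fx;
        ry = ( v - cy ) / fy;
        rz = 1.0;   // only a direction: the depth along the ray is unknown
    }

Note that this gives only a direction; without the distance along the ray you cannot recover a position, which is why some known reference in the scene (or depth information) is usually needed.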

Hi Carsten! Thanks for your reply!

I’ll check the camera information you mentioned.

About the transformations, do you mean the OpenGL transformations in the code to load the images and apply them as textures?

Cheers!

No, I meant the transformations that formed the image (the camera applies some projection to get light onto the sensor); those are the ones you’d invert to go from a pixel position to a real-world position. The OpenGL transforms, on the other hand, let you go from a mouse position clicked by the user (through some intermediate steps) to a pixel location in the image.

Note how these two steps meet at a location in image space, and that by combining them you can go all the way from mouse position to real-world position. At least in theory, and assuming you can find all the intermediate steps. Sorry this is all somewhat vague and ‘in theory’ only; I just don’t have experience with how this is done in practice.
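
One concrete (if hedged) consequence: even if combining the two steps gives you the angle between two rays, turning that into a length still needs one more piece of information, namely the distance to the object. Assuming the object is roughly perpendicular to the viewing direction, the relation would be:

    #include <cmath>

    // Hedged sketch: object at known distance D, roughly perpendicular
    // to the view direction, subtending an angle alpha (in radians).
    double lengthFromAngle( double distanceD, double alphaRadians )
    {
        return 2.0 * distanceD * std::tan( alphaRadians / 2.0 );
    }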

I see. Thanks!

I’ll check the camera and related docs and see if I can get those transformations.

Thanks again!