Jambolo is right. It is like in the real world:
There are several units (meters, feet, yards, miles, AU, Ångström, etc.). Depending on which unit you choose, a certain object (say, a table) will have different “sizes”. For instance, if you choose meters, your table might be 2.5 units wide, but if you choose feet, it will be about 8.2 units wide.
What it all comes down to is that it does not matter which unit you choose. To make things look realistic you only have to be consistent and make sure that the proportions between different objects, lights and cameras are realistic.
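To make the "units don't matter" point concrete, here is a small sketch in plain Python (not OpenGL code). The `project` helper and the table-corner coordinates are made up for illustration; it just does a simple perspective divide for a camera looking down the negative z axis.

```python
import math

METERS_TO_FEET = 1.0 / 0.3048  # one international foot is 0.3048 m

def project(point, fov_y_deg, aspect):
    """Project a camera-space point (camera looks down -z) to normalized x, y."""
    x, y, z = point
    f = 1.0 / math.tan(math.radians(fov_y_deg) / 2.0)
    return (f / aspect * x / -z, f * y / -z)

# Hypothetical table corner, 4 m in front of the camera.
corner_m = (1.25, 0.75, -4.0)
# The same corner with every coordinate expressed in feet instead.
corner_ft = tuple(c * METERS_TO_FEET for c in corner_m)

ndc_m = project(corner_m, 60.0, 4.0 / 3.0)
ndc_ft = project(corner_ft, 60.0, 4.0 / 3.0)
print(ndc_m, ndc_ft)  # the two results agree up to rounding
```

The scale factor cancels in the divide by depth, so the point lands at the same place on screen. That only holds if everything is scaled together, which in a real scene would also include things like the clipping planes.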
I usually use the following “standard”: one OpenGL unit is one meter, and the camera FOV is in the range 60-90 degrees.
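As a rough sketch of what that convention looks like in numbers, the function below builds the same matrix that gluPerspective documents, written in plain Python as row-major lists. The 75 degree FOV, the 16:9 aspect ratio and the near/far distances of 0.1 and 100 (read as meters) are example values I picked, not anything special.

```python
import math

def perspective(fov_y_deg, aspect, near, far):
    """Return a 4x4 perspective matrix (row-major), same layout as gluPerspective."""
    f = 1.0 / math.tan(math.radians(fov_y_deg) / 2.0)  # cotangent of half the FOV
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]

# Example: 1 unit = 1 meter, so near = 10 cm and far = 100 m.
proj = perspective(75.0, 16.0 / 9.0, 0.1, 100.0)
for row in proj:
    print(row)
```

Note that OpenGL itself stores matrices column-major, so this would need transposing before being handed to something like glLoadMatrixf.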
A few comments… Don’t bother about pixels! They don’t mean anything here! An OpenGL “unit” refers to the 3D coordinates you specify, before they are transformed and projected onto the screen.
Another thing is the FOV - the human eye has a FOV of over 180 degrees. However, if we wanted to use that on a monitor, the monitor would physically have to “surround” us so that it covers our FOV. Depending on how large your monitor is, and how close you sit to it, it covers a certain amount of your FOV (normally 30-40 degrees). If we used that for the rendering FOV, we would get very narrow/telescopic views (it’s roughly equivalent to looking through a 6 cm long pipe that is 4 cm in diameter - try it with a cut-off, used-up toilet roll).
So normally we use a compromise: a little bit more than the actually seen FOV (the FOV covered by the monitor), and a lot less than the entire human FOV (>180 degrees). This gives an impression of “being there”, while not distorting the image too much.
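If you want to check the numbers, the angle covered by something of a given width at a given distance is 2 * atan(width / (2 * distance)). A quick Python sketch follows; the function name `covered_fov_deg` and the monitor size and viewing distance are my own example values.

```python
import math

def covered_fov_deg(width, distance):
    """Angle (in degrees) subtended by an object of the given width at the given distance."""
    return math.degrees(2.0 * math.atan(width / (2.0 * distance)))

# The pipe from above: 4 cm in diameter, 6 cm long.
pipe_fov = covered_fov_deg(4.0, 6.0)
print(round(pipe_fov, 1))  # about 36.9 degrees

# Hypothetical monitor: 50 cm wide, viewed from 70 cm away.
monitor_fov = covered_fov_deg(50.0, 70.0)
print(round(monitor_fov, 1))  # roughly 39 degrees
```

Width and distance only need to be in the same unit, since only their ratio matters.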