I'd say this is definitely possible if you write your own shaders.
You can bind 2 textures, one with the color information, and one with the depth information.
In your vertex shader you take the X- and Y-coordinates from the client, and the Z-coordinate from the depth texture.
In your fragment shader you take RGBA from the color texture.
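As a rough sketch of the two shaders described above (not tested against your setup): the GLSL sources are held as C++ raw string literals so they can be passed to glShaderSource. All names here (aPosition, aTexCoord, vTexCoord, uColorTex, uDepthTex) are made up for illustration. I am also assuming the depth is stored in the red channel as a value in [0,1], that X and Y already arrive in normalized device coordinates, and that the default glDepthRange(0, 1) is in effect.

```cpp
#include <string>

// Vertex shader: X and Y come from the client, Z comes from the depth texture.
// All identifiers are hypothetical; adapt them to your own vertex layout.
const std::string kVertexShader = R"glsl(
#version 330 core
layout(location = 0) in vec2 aPosition;  // X and Y supplied by the client (assumed NDC)
layout(location = 1) in vec2 aTexCoord;
uniform sampler2D uDepthTex;             // assumption: depth in the red channel, range [0,1]
out vec2 vTexCoord;
void main() {
    // A vertex shader has no implicit derivatives, so pick the mip level explicitly.
    float depth = textureLod(uDepthTex, aTexCoord, 0.0).r;
    // Map the [0,1] depth to NDC z in [-1,1] (valid for the default glDepthRange).
    gl_Position = vec4(aPosition, depth * 2.0 - 1.0, 1.0);
    vTexCoord = aTexCoord;
}
)glsl";

// Fragment shader: RGBA comes straight from the color texture.
const std::string kFragmentShader = R"glsl(
#version 330 core
in vec2 vTexCoord;
uniform sampler2D uColorTex;
out vec4 fragColor;
void main() {
    fragColor = texture(uColorTex, vTexCoord);
}
)glsl";
```

Note that this gives you one depth value per vertex, which is then interpolated across the primitive.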
Yes, you can write your depth value directly to gl_FragDepth in the fragment shader.
But that will disable HyperZ/hierarchical Z-buffering, because the GPU can no longer know the final depth before the fragment shader has run. That should not be a performance problem if you are making a simple 2D game.
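A minimal sketch of the per-fragment variant, again as a C++ raw string holding the GLSL source. The names vTexCoord, uColorTex and uDepthTex are hypothetical, and I am assuming the depth texture stores a window-space depth in [0,1] in its red channel.

```cpp
#include <string>

// Fragment shader that takes color and depth from two textures.
// Identifiers are illustrative only.
const std::string kDepthWritingFragmentShader = R"glsl(
#version 330 core
in vec2 vTexCoord;
uniform sampler2D uColorTex;
uniform sampler2D uDepthTex;  // assumption: window-space depth in the red channel
out vec4 fragColor;
void main() {
    fragColor = texture(uColorTex, vTexCoord);
    // Replaces the interpolated depth of this fragment with the stored one.
    gl_FragDepth = texture(uDepthTex, vTexCoord).r;
}
)glsl";
```

If a shader writes gl_FragDepth at all, it should write it on every code path, otherwise the resulting depth is undefined for the paths that skip it.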
There is also ARB_conservative_depth. With it you can promise the driver that the shader will only increase (or only decrease) the depth relative to the fragment's interpolated depth, so it may still be able to use HyperZ/hierarchical Z-buffering.
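Roughly, the conservative-depth version could look like the sketch below. It redeclares gl_FragDepth with the depth_greater qualifier from the extension, and uses max() so the written value never falls below the interpolated depth in gl_FragCoord.z. The uniform and varying names are hypothetical, and whether the driver actually keeps its early depth optimizations is up to the implementation.

```cpp
#include <string>

// Fragment shader using ARB_conservative_depth (core in GLSL 4.20).
// Identifiers are illustrative only.
const std::string kConservativeDepthFragmentShader = R"glsl(
#version 330 core
#extension GL_ARB_conservative_depth : enable
// Promise: the value written is never less than the interpolated depth.
layout(depth_greater) out float gl_FragDepth;
in vec2 vTexCoord;
uniform sampler2D uColorTex;
uniform sampler2D uDepthTex;  // assumption: window-space depth in the red channel
out vec4 fragColor;
void main() {
    fragColor = texture(uColorTex, vTexCoord);
    // max() keeps the promise even if the texture holds a smaller depth.
    gl_FragDepth = max(gl_FragCoord.z, texture(uDepthTex, vTexCoord).r);
}
)glsl";
```

If the shader breaks the promise, the outcome of the depth test is undefined, so clamping like this is the safe choice. Use depth_less with min() instead if your depths only move toward the viewer.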
More about it in the wiki: http://www.opengl.org/wiki/Fragment_Shader#Outputs