FBO Readback problem

I am trying to use a GPGPU technique to transform a set of floats in a vertex shader and read the data back into regular RAM.

To do this, I pass the floats to be transformed as points, using glBegin(GL_POINTS) and a series of glVertex3f calls. Since the result will be stored in a 128x128 texture, I also pass a texture coordinate with each point, which is in fact the homogeneous (clip-space) position at which that point should land in the texture.

Like this:

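glBegin(GL_POINTS);
for (LONG p = 0; p < nbPoints; p++)
{
	// Target texel (u, v) for point p in the 128x128 texture
	int u = (int)p % 128;
	int v = (int)p / 128;
	// Map the texel index from [0, 128) to clip space [-1, 1)
	float uc = (-1 + u * 2.0f / 128);
	float vc = (-1 + v * 2.0f / 128);
	glTexCoord2f(uc, vc);
	glVertex3f(pnts[p].x, pnts[p].y, pnts[p].z);
}
glEnd();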
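
To spell out the mapping that loop is meant to produce, here is a minimal standalone sketch in C. The helper names are hypothetical, and it assumes the viewport is exactly 128x128 with its origin at (0, 0) while rendering, which is an assumption and not something shown in my code. The first helper reproduces what the loop computes (the lower edge of the texel); the second targets the texel center instead, which may be worth comparing, since a point that sits exactly on a texel boundary could be rounded into either neighbor.

```c
#include <assert.h>

#define TEX_SIZE 128  /* width and height of the target texture */

/* Hypothetical helper: clip-space coordinate of the lower edge of texel i.
   This is the same formula as in the glBegin/glEnd loop. */
static float texel_edge_ndc(int i)
{
    assert(i >= 0 && i < TEX_SIZE);
    return -1.0f + i * 2.0f / TEX_SIZE;
}

/* Hypothetical helper: clip-space coordinate of the center of texel i. */
static float texel_center_ndc(int i)
{
    assert(i >= 0 && i < TEX_SIZE);
    return -1.0f + (i + 0.5f) * 2.0f / TEX_SIZE;
}

/* Window coordinate that a clip-space value maps to, ASSUMING w == 1 and
   a viewport that starts at 0 and is TEX_SIZE pixels wide. */
static float ndc_to_window(float ndc)
{
    return (ndc + 1.0f) * 0.5f * TEX_SIZE;
}
```

Under those assumptions, texel_edge_ndc(0) maps to window coordinate 0.0 (the boundary of the first texel), while texel_center_ndc(0) maps to 0.5 (its center). If the viewport in effect while the FBO is bound has a different size, ndc_to_window no longer describes where the points land.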

Then I have a vertex shader that takes the position of the point, transforms it, and writes it to the output color, so that the transformed position ends up in the RGB triplet of the framebuffer.

static const char* vtxProgram =   // Cg source follows; string quotes omitted here
struct v2f
{
	float4 	hpos : HPOS;
	float4	color : COL0;
};
v2f main
(
	float4	pos : POSITION,
	float4	col : COLOR0,
	float2	tc0 : TEXCOORD0
)
{
	v2f OUT;

	OUT.hpos.x = tc0.x;
	OUT.hpos.y = tc0.y;
	OUT.hpos.z = 0;
	OUT.hpos.w = 1;

	OUT.color = pos * 2;	// transform the point and output into color
	return OUT;
}

I then call glReadPixels and get my data back.

Now… this works perfectly when rendering to the regular color buffer (i.e. without any pbuffer or FBO). I wanted to make sure it was possible before porting it to a faster, fancier method.
My problem started when I ported it to an FBO.
When I call glReadPixels on the FBO, I get values that aren’t quite right, as if the homogeneous coordinates I output from the vertex program were wrong. Only some fragments are shaded, and they seem to be offset somehow (a vertex program HPOS output of (-1, -1) does not end up in the first texel of the FBO). I can’t figure it out.

Here’s how I create the texture and read back:

glGenTextures(1, &_iTexture);
glBindTexture(GL_TEXTURE_2D, _iTexture);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, 128, 128, 0, GL_RGBA, GL_FLOAT, 0);
...
glReadBuffer ( GL_COLOR_ATTACHMENT0_EXT);
glReadPixels( 0,0,128,128,GL_RGBA,GL_FLOAT, (void*)floatBuffer );
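
For reference, this is how I would expect to locate a given point in floatBuffer after the readback, written as a small standalone C sketch. The helper names are hypothetical. It relies on glReadPixels returning rows starting from the lower-left corner with four floats per pixel for GL_RGBA / GL_FLOAT, and it assumes each point really did land in texel (p % 128, p / 128).

```c
#include <assert.h>
#include <stddef.h>

#define TEX_SIZE 128  /* width and height of the readback region */
#define CHANNELS 4    /* GL_RGBA: four floats per pixel */

/* Hypothetical helper: index of the first float (R) of texel (u, v)
   in a buffer filled by glReadPixels(0, 0, 128, 128, GL_RGBA, GL_FLOAT, ...).
   Row v = 0 is the bottom row of the framebuffer. */
static size_t readback_offset(int u, int v)
{
    assert(u >= 0 && u < TEX_SIZE);
    assert(v >= 0 && v < TEX_SIZE);
    return ((size_t)v * TEX_SIZE + (size_t)u) * CHANNELS;
}

/* Hypothetical helper: same index, starting from the point number p
   used in the glBegin/glEnd loop. */
static size_t point_offset(long p)
{
    return readback_offset((int)(p % TEX_SIZE), (int)(p / TEX_SIZE));
}
```

With this layout, the transformed x, y, z of point p would be at floatBuffer[point_offset(p)], floatBuffer[point_offset(p) + 1] and floatBuffer[point_offset(p) + 2].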

If anybody has any idea why the vertex program’s output ends up in a different place in the FBO than in a regular framebuffer, I would appreciate all input.

tx