Vertex Program Skinning

Hi, I am having a problem with skinning (just one bone for now) using ARB_vertex_program. On a GeForce FX 5200 it runs just fine (52.16 drivers, I think), but on my Radeon 9800 Pro (4.11 drivers) I get some pretty strange results. It runs correctly when compiled in release mode in VC6, but only when run from inside the IDE. Running the same program from outside the IDE causes part of the model to stretch out towards ±infinity along the y axis, and running the debug build causes the same problem but in every direction. When I disable the skinning section of my vertex program I have no problems, although the same exe then runs slower outside the IDE (about 170 fps inside vs. 130 outside).
I couldn’t find any of these problems on the fx5200 I ran it on. All the program does is load up a model, calculate the bone matrices for a looping animation, and display several copies of the same model.
Here is the vertex program code. If anyone has any suggestions on what I should do, or can see something I'm doing wrong, that would be great.
Thanks

!!ARBvp1.0

PARAM arr[24] = {program.local[0..23]};
ADDRESS addr;
TEMP r0;

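# Each bone uses three consecutive rows of arr, so scale the bone index by 3.
MUL r0.x, vertex.attrib[6].x, 3;
ARL addr.x, r0.x;
# Transform the position by the bone's 3x4 matrix, one row per DP4.
DP4 r0.x, vertex.position, arr[addr.x];
DP4 r0.y, vertex.position, arr[addr.x + 1];
DP4 r0.z, vertex.position, arr[addr.x + 2];
MOV r0.w, 1;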

DP4 result.position.x, r0, state.matrix.mvp.row[0];
DP4 result.position.y, r0, state.matrix.mvp.row[1];
DP4 result.position.z, r0, state.matrix.mvp.row[2];
DP4 result.position.w, r0, state.matrix.mvp.row[3];

MOV result.color, 1;
MOV result.texcoord[0], vertex.texcoord[0];

END 

I would triple-check that the values you are setting in the matrix array are valid. These debug/release differences are usually caused by uninitialized memory. (I think NVIDIA initializes parameters to 0,0,0,1 by default, although the spec says they can be any value.)
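
One way to act on that is to keep a CPU-side copy of the 24 rows, fill every slot with identity at startup, and scan it for NaN or infinite values before uploading. The following is only a sketch under assumptions: the helper names `palette_set_identity` and `palette_first_bad_row` are made up for illustration, and the layout (three rows per bone, 24 rows total) is taken from the `PARAM arr[24]` declaration and the `MUL ..., 3` in the program above.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

#define PALETTE_ROWS  24  /* matches PARAM arr[24] in the vertex program */
#define ROWS_PER_BONE 3   /* matches the "MUL ..., 3" index scaling      */

/* Hypothetical helper: fill every bone slot with a 3x4 identity matrix
 * so that no row is ever uploaded from uninitialized memory. */
void palette_set_identity(float palette[PALETTE_ROWS][4])
{
    assert(palette != NULL);
    for (size_t row = 0; row < PALETTE_ROWS; ++row) {
        for (size_t col = 0; col < 4; ++col) {
            /* Row r of a bone (r = 0, 1, 2) has its 1 in column r. */
            palette[row][col] = (col == row % ROWS_PER_BONE) ? 1.0f : 0.0f;
        }
    }
}

/* Hypothetical helper: return the index of the first row that contains
 * a NaN or infinite component, or -1 if every component is finite. */
int palette_first_bad_row(float palette[PALETTE_ROWS][4])
{
    assert(palette != NULL);
    for (size_t row = 0; row < PALETTE_ROWS; ++row) {
        for (size_t col = 0; col < 4; ++col) {
            if (!isfinite(palette[row][col])) {
                return (int)row;
            }
        }
    }
    return -1;
}
```

If the check passes, each row `i` would then be uploaded with `glProgramLocalParameter4fvARB(GL_VERTEX_PROGRAM_ARB, i, palette[i])`. Calling `palette_first_bad_row` every frame in the debug build, just before the upload, should show whether the animation code ever produces garbage rows.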