The old thread of the same name was closed (maybe because the last activity was almost two years ago?!) but I hope it’s ok if I “revive” it.
I am trying to get mfort’s CUDA workaround, described in post #47 of the thread “slow-transfer-speed-on-fermi-cards” (the bbs-system won’t let me post with URLs… sigh), to work, but I am banging my head against a wall at the moment.
When I try to execute
cErr = cudaMemcpyFromArray( cuda_mem, cArray, 0, 0, 32, cudaMemcpyDeviceToHost );
I get a SIGFPE (Arithmetic exception) before the call even returns an error code. (Note that the count of 32 is just for testing; I have successfully malloc’ed much more memory than that.) If I use DeviceToDevice etc., I get the expected cudaErrorInvalidMemcpyDirection, and if I try to copy 0 bytes, it does not fail. All commands up to this point (both CUDA and OpenGL) have completed successfully.
I have done what mfort outlined, but the instructions do not say how the renderBufferId object is set up. I tried this:
glGenRenderbuffers(1, &renderBufferId);
glBindRenderbuffer(GL_RENDERBUFFER, renderBufferId);
// glFramebufferRenderbuffer( GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
//                            GL_RENDERBUFFER, renderBufferId );
and also combinations with the commented-out function call enabled, none of which work. My thinking was that glFramebufferRenderbuffer might be required to attach the renderbuffer to the framebuffer object that OpenGL renders into, so that the CUDA copy has something to read from.
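For reference, here is a minimal sketch of what a complete setup might look like. It is not taken from mfort's post. It assumes GLEW for the GL entry points, an RGBA8 format, and a current OpenGL context; the helper name read_renderbuffer_via_cuda and its parameters are hypothetical. The main difference from my attempt above is the glRenderbufferStorage call: a freshly generated renderbuffer has no storage (0 x 0) until that is called. Whether that is related to the SIGFPE is only a guess. I also swapped in cudaMemcpy2DFromArray for cudaMemcpyFromArray, since the latter is deprecated in recent CUDA toolkits.

```c
#include <stdio.h>
#include <GL/glew.h>
#include <cuda_runtime_api.h>
#include <cuda_gl_interop.h>

/* Hypothetical helper (not from mfort's post). Needs a current OpenGL context
   and an initialised GLEW. hostDst must hold width * height * 4 bytes.
   Returns 0 on success, -1 on failure. */
static int read_renderbuffer_via_cuda(GLsizei width, GLsizei height, void *hostDst)
{
    GLuint fbo = 0, renderBufferId = 0;
    struct cudaGraphicsResource *res = NULL;
    cudaArray_t cArray = NULL;
    cudaError_t cErr;
    size_t rowBytes = (size_t)width * 4;   /* assumes RGBA8: 4 bytes per texel */

    /* Create the renderbuffer and give it storage. */
    glGenRenderbuffers(1, &renderBufferId);
    glBindRenderbuffer(GL_RENDERBUFFER, renderBufferId);
    glRenderbufferStorage(GL_RENDERBUFFER, GL_RGBA8, width, height);

    /* Attach it to a framebuffer object as the colour target. */
    glGenFramebuffers(1, &fbo);
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);
    glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                              GL_RENDERBUFFER, renderBufferId);
    if (glCheckFramebufferStatus(GL_FRAMEBUFFER) != GL_FRAMEBUFFER_COMPLETE)
        return -1;

    /* ... render into the FBO here ... */

    /* Register the renderbuffer with CUDA, map it, and fetch its array. */
    cErr = cudaGraphicsGLRegisterImage(&res, renderBufferId, GL_RENDERBUFFER,
                                       cudaGraphicsRegisterFlagsReadOnly);
    if (cErr != cudaSuccess) goto fail;
    cErr = cudaGraphicsMapResources(1, &res, 0);
    if (cErr != cudaSuccess) goto fail_unregister;
    cErr = cudaGraphicsSubResourceGetMappedArray(&cArray, res, 0, 0);
    if (cErr == cudaSuccess)
        /* Copy every row of the array to host memory (width given in bytes). */
        cErr = cudaMemcpy2DFromArray(hostDst, rowBytes, cArray, 0, 0,
                                     rowBytes, (size_t)height,
                                     cudaMemcpyDeviceToHost);
    cudaGraphicsUnmapResources(1, &res, 0);

fail_unregister:
    cudaGraphicsUnregisterResource(res);
fail:
    if (cErr != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(cErr));
        return -1;
    }
    return 0;
}
```

The copy width and pitch are both given in bytes, so a different internal format than RGBA8 would need a different rowBytes.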
Anybody got any ideas that I could try out?
J.