OpenGL rendering "bypass" with CUDA

Hi all!
I’m trying to modify the rendering of a 3D model, done with OpenGL.
The model is projected (rendered) onto one or more camera planes with “glDrawRangeElements” (from a vertex buffer object and a face index buffer object).
This process is slow.
I was thinking of projecting all the vertices in parallel with CUDA to obtain an image “made of points” (so “bypassing” the OpenGL rendering).
How could I proceed?
Any ideas would be greatly appreciated.
Thanks!
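
To make the "image made of points" idea concrete, here is a minimal CPU-side sketch in C++ of the per-vertex work. It is only an illustration under stated assumptions: `Vec3`, `Mat4`, `projectVertex` and `renderPoints` are hypothetical names, the matrix is assumed to be a row-major model-view-projection matrix, and image row 0 is assumed to be the bottom row as in OpenGL. In a CUDA version, each loop iteration would roughly correspond to one thread. There is no depth test, so points from hidden surfaces would show up too.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical row-major 4x4 model-view-projection matrix (illustrative name).
using Mat4 = std::array<float, 16>;

// Projects one vertex to integer pixel coordinates.
// Returns false when the vertex is behind the camera or outside the image.
bool projectVertex(const Mat4& m, const Vec3& v, int width, int height,
                   int& px, int& py) {
    // Clip-space x, y and w (z is not needed because there is no depth test).
    const float cx = m[0]  * v.x + m[1]  * v.y + m[2]  * v.z + m[3];
    const float cy = m[4]  * v.x + m[5]  * v.y + m[6]  * v.z + m[7];
    const float cw = m[12] * v.x + m[13] * v.y + m[14] * v.z + m[15];
    if (cw <= 0.0f) return false;  // behind the camera

    // Perspective divide to normalized device coordinates in [-1, 1].
    const float ndcX = cx / cw;
    const float ndcY = cy / cw;
    if (ndcX < -1.0f || ndcX > 1.0f || ndcY < -1.0f || ndcY > 1.0f) return false;

    // Viewport transform; clamp so that ndc == 1 maps to the last pixel.
    px = std::min(static_cast<int>((ndcX * 0.5f + 0.5f) * width), width - 1);
    py = std::min(static_cast<int>((ndcY * 0.5f + 0.5f) * height), height - 1);
    return true;
}

// Builds a width x height 8-bit image with one lit pixel per visible vertex.
// Each iteration is independent, which is what would map to one CUDA thread.
std::vector<std::uint8_t> renderPoints(const Mat4& mvp,
                                       const std::vector<Vec3>& vertices,
                                       int width, int height) {
    assert(width > 0 && height > 0);
    std::vector<std::uint8_t> image(static_cast<std::size_t>(width) * height, 0);
    for (const Vec3& v : vertices) {
        int px = 0, py = 0;
        if (projectVertex(mvp, v, width, height, px, py)) {
            image[static_cast<std::size_t>(py) * width + px] = 255;
        }
    }
    return image;
}
```

Calling `renderPoints` with an identity matrix treats the vertex coordinates as normalized device coordinates directly, which is a handy way to sanity-check the viewport mapping before plugging in a real camera matrix.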

Why do you think that this would be faster? And what’s so slow about rendering with OpenGL?

Hi!
Because it incurs a memory transfer between the CPU and the GPU every time the models are rendered to generate a 2D image.
Wouldn’t porting the camera projection onto the GPU remove the problem of the CPU-GPU memory transfer?
Thanks!

Wait, what sysram-VRAM transfers?

Your vertices and their indices are in VBOs, and thus usually in VRAM. The GPU fetches them from there and processes vertices and fragments in parallel on its 100-2000 cores, then writes the non-rejected fragments to the framebuffer (also in VRAM).
Where exactly is system RAM involved here, except for the 50-60 bytes per draw call for the command FIFO?
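
To put rough numbers on that, here is a small C++ sketch comparing the bytes that would cross the bus per frame if the mesh were re-uploaded every frame against the case where it stays resident in a VBO. The layout is an assumption (three 4-byte floats per vertex, 4-byte indices), the function names are hypothetical, and the 60 bytes per draw call is just the upper end of the estimate above.

```cpp
#include <cstddef>

// Assumed layout: position only, three 4-byte floats per vertex, 4-byte indices.
constexpr std::size_t kBytesPerVertex = 3 * 4;
constexpr std::size_t kBytesPerIndex = 4;

// Bytes sent per frame if vertices and indices were re-uploaded every frame.
constexpr std::size_t reuploadBytesPerFrame(std::size_t vertexCount,
                                            std::size_t indexCount) {
    return vertexCount * kBytesPerVertex + indexCount * kBytesPerIndex;
}

// Bytes sent per frame when the mesh already lives in a VBO:
// only the draw-call commands (assumed 60 bytes each by default).
constexpr std::size_t residentBytesPerFrame(std::size_t drawCalls,
                                            std::size_t bytesPerCall = 60) {
    return drawCalls * bytesPerCall;
}
```

For a hypothetical mesh of 1,000,000 vertices and 3,000,000 indices, `reuploadBytesPerFrame` gives 24,000,000 bytes per frame, while `residentBytesPerFrame(1)` gives 60 bytes, so the transfer cost only matters if the buffers are actually being re-sent each frame.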

Use a display list or some modern equivalent (VBOs?). That would solve the problem.