Re: Nvidia Dual Copy Engines
I have to point out that this two-step method is unusable on AMD cards, because the glCopyBufferSubData function is very slow in their current OpenGL implementation.
The best way to deal with read-backs, and buffers in general, on AMD is to use the AMD_pinned_memory extension.
I came across this post while searching for how to use NVIDIA's dual copy engines in OpenGL. I am writing a renderer for a video application in OpenGL, and while planning I went back and forth on whether we should use OpenCL/CUDA for buffer transfers or go with a pure OpenGL approach. Unfortunately, each approach has some drawbacks, so for the moment we have decided to stick with OGL transfers. Anyway, I have a couple of questions to add to this thread, and I was hoping that someone has already tried one of the following:
- Has anyone successfully used NVIDIA's "dual copy engine" with shared OGL contexts on Linux yet?
- Is there an equivalent OpenGL implementation feature on AMD? I believe their current GPU generation has a similar feature for performing GPU-asynchronous tasks (e.g. I think you can already use multiple OpenCL command queues for transfers and computations… but I would be interested in an OpenGL-only solution).
btw, @l_hrabcak, I have read your chapter "Asynchronous Buffer Transfers" and I liked it very much! Definitely one of the highlights of "OpenGL Insights"!