Does the silence mean no one knows?
One of the reasons I hate GL_ARB_buffer_storage is that you can't specify where the memory lives, except through the stupid GL_CLIENT_STORAGE_BIT hint.
If you could, it would be easy to use one system-memory buffer, read and write it directly, and then use glCopyBufferSubData (ARB_copy_buffer) to move the data over to a VRAM buffer. That would pretty much ensure that the DMA copy engine is used and no CPU/GPU cycles are wasted.
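For reference, here is a rough sketch of that staging scheme as far as the current API allows it. It is only an illustration under assumptions: GlApi, Uploader, stage_bytes, uploader_init and uploader_push are made-up names, the GL types and constants are redeclared by hand so the snippet compiles without a loader (and without the APIENTRY calling convention that real headers add on Windows), and GL_CLIENT_STORAGE_BIT remains only a hint, so nothing here guarantees where either buffer ends up or that a copy engine does the transfer.

```c
#include <stddef.h>
#include <string.h>

/* Minimal GL declarations so the sketch compiles on its own. In a real
   program they come from <GL/glcorearb.h> plus an extension loader. */
typedef unsigned int GLenum;
typedef unsigned int GLuint;
typedef unsigned int GLbitfield;
typedef int          GLsizei;
typedef ptrdiff_t    GLintptr;
typedef ptrdiff_t    GLsizeiptr;

#define GL_COPY_READ_BUFFER   0x8F36
#define GL_COPY_WRITE_BUFFER  0x8F37
#define GL_MAP_WRITE_BIT      0x0002
#define GL_MAP_PERSISTENT_BIT 0x0040
#define GL_MAP_COHERENT_BIT   0x0080
#define GL_CLIENT_STORAGE_BIT 0x0200

/* Hypothetical table of GL entry points, filled in by whatever loader you use. */
typedef struct {
    void  (*GenBuffers)(GLsizei n, GLuint *buffers);
    void  (*BindBuffer)(GLenum target, GLuint buffer);
    void  (*BufferStorage)(GLenum target, GLsizeiptr size,
                           const void *data, GLbitfield flags);
    void *(*MapBufferRange)(GLenum target, GLintptr offset,
                            GLsizeiptr length, GLbitfield access);
    void  (*CopyBufferSubData)(GLenum readTarget, GLenum writeTarget,
                               GLintptr readOffset, GLintptr writeOffset,
                               GLsizeiptr size);
} GlApi;

typedef struct {
    GLuint     staging; /* hinted (only hinted) to live in system memory */
    GLuint     device;  /* placement left to the driver, ideally VRAM */
    void      *mapped;  /* persistent CPU pointer into the staging buffer */
    GLsizeiptr size;
} Uploader;

/* Create both buffers and map the staging one persistently. Returns 1 on success. */
int uploader_init(const GlApi *gl, Uploader *u, GLsizeiptr size) {
    const GLbitfield map_flags =
        GL_MAP_WRITE_BIT | GL_MAP_PERSISTENT_BIT | GL_MAP_COHERENT_BIT;
    GLuint ids[2];
    gl->GenBuffers(2, ids);
    u->staging = ids[0];
    u->device  = ids[1];
    u->size    = size;

    gl->BindBuffer(GL_COPY_READ_BUFFER, u->staging);
    gl->BufferStorage(GL_COPY_READ_BUFFER, size, NULL,
                      map_flags | GL_CLIENT_STORAGE_BIT);
    u->mapped = gl->MapBufferRange(GL_COPY_READ_BUFFER, 0, size, map_flags);

    /* No flags: the device buffer can only be filled by server-side copies. */
    gl->BindBuffer(GL_COPY_WRITE_BUFFER, u->device);
    gl->BufferStorage(GL_COPY_WRITE_BUFFER, size, NULL, 0);
    return u->mapped != NULL;
}

/* CPU side: bounds-check and write into the mapped staging memory. */
int stage_bytes(Uploader *u, const void *src, GLintptr offset, GLsizeiptr len) {
    if (offset < 0 || len < 0 || offset > u->size || len > u->size - offset)
        return 0;
    memcpy((char *)u->mapped + offset, src, (size_t)len);
    return 1;
}

/* Stage the bytes, then ask GL to copy the same range into the device buffer. */
int uploader_push(const GlApi *gl, Uploader *u, const void *src,
                  GLintptr offset, GLsizeiptr len) {
    if (!stage_bytes(u, src, offset, len))
        return 0;
    gl->BindBuffer(GL_COPY_READ_BUFFER, u->staging);
    gl->BindBuffer(GL_COPY_WRITE_BUFFER, u->device);
    gl->CopyBufferSubData(GL_COPY_READ_BUFFER, GL_COPY_WRITE_BUFFER,
                          offset, offset, len);
    return 1;
}
```

Because the mapping is coherent, the CPU writes are visible to the copy issued after them. Before overwriting a staging range that an earlier copy may still be reading, you would still need to wait on a glFenceSync placed after that copy.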
Why don't you try it and share your findings with us? Capture a few seconds of what happens on your system while the transfer is active, and analyze the trace with GPUView.
Originally Posted by Prune
What I can confirm is that Fermi really has three hardware queues, and some actions are done in parallel (other queues are used for Desktop Window Manager in the cases I saw).