Thread: Nvidia Dual Copy Engines

  1. #1
    Member Regular Contributor
    Join Date
    Nov 2003
    Location
    Germany
    Posts
    293

    Nvidia Dual Copy Engines

    Very interesting read:
    http://www.nvidia.com/docs/IO/40049/...py_engines.pdf

    Good to finally get high performance asynchronous data streaming to and from device memory.

    But the thing that annoys me is the note on page 14: "Having two separate threads running on a Quadro graphics card with the consumer NVIDIA® Fermi architecture or running on older generations of graphics cards the data transfers will be serialized resulting in a drop in performance."
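
    To put the cost of that serialization in rough numbers, here is a back-of-the-envelope sketch. It assumes an idealized pipeline in which upload, render and download can each run on their own engine with enough buffers in flight. The `FrameCost` type, the function names and the millisecond values are all hypothetical and for illustration only; they are not measurements from the whitepaper.

```python
from dataclasses import dataclass


@dataclass
class FrameCost:
    """Hypothetical per-frame stage costs in milliseconds (not measured)."""
    upload_ms: float
    render_ms: float
    download_ms: float


def serialized_ms(cost: FrameCost, frames: int) -> float:
    """Transfers and rendering run strictly one after another."""
    return frames * (cost.upload_ms + cost.render_ms + cost.download_ms)


def overlapped_ms(cost: FrameCost, frames: int) -> float:
    """Ideal three-stage pipeline: one frame fills the pipeline,
    then each further frame costs only the slowest stage."""
    if frames <= 0:
        return 0.0
    fill = cost.upload_ms + cost.render_ms + cost.download_ms
    slowest = max(cost.upload_ms, cost.render_ms, cost.download_ms)
    return fill + (frames - 1) * slowest


if __name__ == "__main__":
    cost = FrameCost(upload_ms=4.0, render_ms=6.0, download_ms=4.0)
    print(serialized_ms(cost, 100), overlapped_ms(cost, 100))
```

    With these made-up numbers, 100 frames take 1400 ms serialized and 608 ms overlapped. Real drivers and buses will not reach the ideal case, so treat this only as an upper bound on the possible gain.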

    Why in hell not enable it for consumer products (if, and I may be wrong here, the hardware feature is present on all high-end Fermi chips)? Texture streaming is extremely important there too. I work in scientific visualization, and we do have access to Quadro boards, but we cannot afford many of these cards for every workstation where we develop and demonstrate large volume and image rendering software. The Fermi Quadro boards are currently extremely expensive, so access to them is almost impossible for us.

    The data transport to the GPU is almost always the main bottleneck for us, so the decision to cut this feature (next to quad buffer stereo) is very sad. And I can imagine that D3D, at least for some games, will make use of the extra copy engines... So D3D gets stereo rendering (OK, I know, no QBS, but still) and the other cool GPU features.

    Sorry, but I get mad at such decisions.
    -chris

  2. #2
    Member Regular Contributor
    Join Date
    Nov 2003
    Location
    Czech Republic
    Posts
    317

    Re: Nvidia Dual Copy Engines

    Thanks for sharing the link. I had been waiting for this paper for some time.

    Actually, I am quite disappointed by their approach. I'd like to be able to use one OpenGL context/thread and still use both copy engines. I would rather see a solution that lets us issue memory transfers and graphics commands from the same context and still have them run in parallel. That would help everyone without requiring any changes to the code.

    Creating an OpenGL context just for memory transfers is strange. It looks like a workaround to me, especially when PBOs are designed to be asynchronous. I understand the issue with one thread: it would break in-order pipeline execution. Maybe something like DirectX command buffers could improve this.
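
    The in-order issue can be illustrated with a toy model. This is only a sketch of the reasoning, not of how any driver actually schedules work: it assumes one upload and one draw per frame, and that with a separate transfer queue a draw only has to wait (via a fence) for its own upload. The function names and timings are hypothetical.

```python
def single_queue_finish(upload_ms: float, draw_ms: float, frames: int) -> float:
    """One in-order queue: every command starts when the previous one ends."""
    t = 0.0
    for _ in range(frames):
        t += upload_ms  # the copy must retire before the draw begins
        t += draw_ms    # the next copy waits behind this draw
    return t


def two_queue_finish(upload_ms: float, draw_ms: float, frames: int) -> float:
    """Separate transfer and graphics queues, each in-order on its own.
    A fence makes draw i wait for upload i, and nothing else."""
    upload_done = 0.0
    draw_done = 0.0
    for _ in range(frames):
        upload_done += upload_ms
        draw_done = max(draw_done, upload_done) + draw_ms
    return draw_done


if __name__ == "__main__":
    print(single_queue_finish(4.0, 6.0, 3), two_queue_finish(4.0, 6.0, 3))
```

    For three frames with a 4 ms upload and a 6 ms draw, the single queue finishes at 30 ms and the two-queue version at 22 ms, because later uploads hide behind earlier draws.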

  3. #3
    Intern Newbie
    Join Date
    Oct 2007
    Posts
    47

    Re: Nvidia Dual Copy Engines

    Also, since this is an Nvidia-specific optimization, wouldn't it be better to use CUDA/OpenGL interop with two streams doing async memcpys to/from pinned memory, so that we use both copy engines with only one OGL context?
    I have not thought it through thoroughly, but that should work.
    Even better, this would also work on Teslas, which by the way are more economical.
    This of course would not work on AMD cards, whereas the code from the whitepaper, although a complex optimization for a simple problem, works on AMD as well.
    But wait: OCL supports OGL interop, and I think the dual DMA copy is even usable from the OpenCL world, though I can't be sure as I don't own a Quadro/Tesla to test.
    So the best solution seems to be OCL/OGL interop, which should have these trade-offs:
    better:
    *one OGL context
    *works on the Tesla line also
    equal:
    *works on AMD also
    worse:
    *have to manage an OCL context

    Perhaps there could be some problems due to "hard" synchronization in OCL/OGL interop, but I think the more advanced OGL/OCL interop in the not-yet-implemented OCL 1.1 and OGL 4.1 should fix all possible issues.
    What do you think?
    Can someone at Nvidia comment on my reasoning?

  4. #4
    Senior Member OpenGL Guru Dark Photon
    Join Date
    Oct 2004
    Location
    Druidia
    Posts
    3,220

    Re: Nvidia Dual Copy Engines

    Quote Originally Posted by Chris Lux
    But the thing that annoys me is the note on page 14: "Having two separate threads running on a Quadro graphics card with the consumer NVIDIA® Fermi architecture or running on older generations of graphics cards the data transfers will be serialized resulting in a drop in performance."

    Why in hell not enable it for consumer products (if, and I may be wrong here, the hardware feature is present on all high-end Fermi chips)? Texture streaming is extremely important there too. I work in scientific visualization, and we do have access to Quadro boards, but we cannot afford many of these cards for every workstation where we develop and demonstrate large volume and image rendering software. The Fermi Quadro boards are currently extremely expensive, so access to them is almost impossible for us.
    Seconded, and for exactly the same reason.

    Beyond this, a high-end consumer Fermi (GTX480) being out-benched by 2.6X by a last-gen card (GTX285) on data transfers is embarrassing (see "Re: slow transfer speed on fermi cards"). At least make it as good as the last-gen boards.

  5. #5
    Intern Contributor
    Join Date
    May 2008
    Location
    USA
    Posts
    99

    Re: Nvidia Dual Copy Engines

    Quadro card with consumer Fermi? That's an odd modifier - so there's Quadro cards and *Quadro* cards?

    And we have to guess which it is?

    Bruce

  6. #6
    Senior Member OpenGL Guru Dark Photon
    Join Date
    Oct 2004
    Location
    Druidia
    Posts
    3,220

    Re: Nvidia Dual Copy Engines

    Quote Originally Posted by Bruce Wheaton
    Quadro card with consumer Fermi? That's an odd modifier - so there's Quadro cards and *Quadro* cards?

    And we have to guess which it is?

    Bruce
    By "high-end consumer Fermi" I meant a high-end consumer GPU (i.e. GeForce, as opposed to their professional GPU line, Quadro) based on the "Fermi" chip line.

    They do have professional-line (Quadro) Fermi-based GPUs, but I wasn't referring to them.

    As for guessing: the NVidia pages do tell you what is Fermi-based, but for more detail search the web (reviews, Wikipedia, etc.). GFxxx chip codenames are Fermi.

  7. #7
    Member Regular Contributor
    Join Date
    Nov 2003
    Location
    Germany
    Posts
    293

    Re: Nvidia Dual Copy Engines

    He was referring to the original Nvidia statement in my initial post, where they differentiate Fermi Quadros and consumer Fermi Quadros.

  8. #8
    Senior Member OpenGL Guru Dark Photon
    Join Date
    Oct 2004
    Location
    Druidia
    Posts
    3,220

    Re: Nvidia Dual Copy Engines

    Ah, yeah. That is confusing.
