
Thread: slow transfer speed on fermi cards

  1. #21
    Member Regular Contributor
    Join Date
    Nov 2003
    Location
    Czech Republic
    Posts
    317

    Re: slow transfer speed on fermi cards

    Quote Originally Posted by Dark Photon
    For the base glReadPixels it should be fine. glReadPixels must block until rendering completes
    Yes, that's right, glReadPixels must block. Therefore you are not measuring the performance of the DMA transfer alone, but of rendering plus the DMA transfer. That is the problem with the benchmark.

  2. #22
    Senior Member OpenGL Guru Dark Photon's Avatar
    Join Date
    Oct 2004
    Location
    Druidia
    Posts
    3,190

    Re: slow transfer speed on fermi cards

    Quote Originally Posted by mfort
    yes, that's right, glReadPixels must block. Therefore you are not measuring performance of DMA transfer but you measure performance of rendering + DMA transfer.
    True, though the render time for something so trivial should be negligible overhead.

    However, if there was a card nowadays where rendering a cube was really, really slow (1-5 ms), then I'd agree your point would result in a significant timing difference. But as-is the overhead should be pretty small.

    And just to confirm here, I made this change (putting a glFinish() before sampling the start time) and did not see any timing difference.

  3. #23
    Junior Member Regular Contributor
    Join Date
    Feb 2004
    Posts
    248

    Re: slow transfer speed on fermi cards

    Quote Originally Posted by Dark Photon
    The only thing I see wrong is the glFlush -- it is superfluous in both cases. Removing it does not affect the performance of either case here.
    I put that in on short notice just to avoid the usual "your timing is not correct" reply. Unfortunately for me, I confused it with glFinish().
    And yes, theoretically glFinish() should be called just before and just after measuring a GL call. Who knows, maybe someone wants to benchmark software rendering...
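For anyone following along, here is a rough sketch of the "finish before and after" timing pattern discussed in the last few posts. It is not the transferBench code. The callback types, `time_synced_ms`, and the `fake_*` stand-ins are hypothetical names made up for illustration; in a real benchmark `sync` would wrap glFinish() and `op` would wrap the GL call being measured.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback types: `sync` would wrap glFinish() and `op`
   would wrap the GL call under test (e.g. glReadPixels). */
typedef void (*sync_fn)(void);
typedef void (*op_fn)(void);

/* Wall-clock time in milliseconds, using C11 timespec_get. */
double now_ms(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec * 1000.0 + (double)ts.tv_nsec / 1.0e6;
}

/* Time one operation with a full sync on both sides, so that work queued
   earlier is not charged to `op`, and `op` has completed before the clock
   stops. */
double time_synced_ms(sync_fn sync, op_fn op)
{
    assert(sync != NULL && op != NULL);
    sync();                       /* drain pending work before starting */
    double start = now_ms();
    op();                         /* the call being measured */
    sync();                       /* make sure it really finished */
    return now_ms() - start;
}

/* Stand-ins so the sketch runs without a GL context. */
int g_sync_calls = 0;
int g_op_calls = 0;
void fake_finish(void)   { g_sync_calls++; }
void fake_readback(void) { g_op_calls++; }
```

Calling `time_synced_ms(fake_finish, fake_readback)` exercises the pattern without any GPU. For a call that already blocks, such as plain glReadPixels, the trailing sync should add little, which matches the "no timing difference" observation earlier in the thread.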

  4. #24
    Junior Member Newbie
    Join Date
    Sep 2010
    Posts
    2

    Re: slow transfer speed on fermi cards

    Your application crashes on my laptop with a Radeon 7500 Mobility.

  5. #25
    Senior Member OpenGL Pro
    Join Date
    Jan 2007
    Posts
    1,180

    Re: slow transfer speed on fermi cards

    Assuming this isn't a spambot, of course it doesn't work with a Radeon 7500, which is an OpenGL 1.3/Direct3D 8.1 card from about the year 1437.

    Out of curiosity I tried this on my laptop's 230M and got comparable results. However, switching format from GL_RGBA to GL_BGRA caused performance to almost double. Changing the type from GL_UNSIGNED_BYTE to GL_UNSIGNED_INT_8_8_8_8_REV gave a more subtle increase, but I'm assuming that the driver is recognising it's a 32-bit format and optimizing accordingly.
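A small CPU-side sketch of what the format and type change means for the bytes that land in client memory. The function names are made up for illustration. The idea that the hardware keeps the framebuffer in BGRA order, so that GL_RGBA readback needs a per-pixel swizzle, is only an assumption that would be consistent with the speedup; nothing in the thread confirms it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One pixel as GL_BGRA + GL_UNSIGNED_INT_8_8_8_8_REV packs it: the first
   component (B) occupies the lowest 8 bits and A the highest. On a
   little-endian CPU the bytes in memory are therefore B, G, R, A, the same
   order that GL_BGRA + GL_UNSIGNED_BYTE produces. */
uint32_t pack_bgra_8888_rev(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return ((uint32_t)a << 24) | ((uint32_t)r << 16) |
           ((uint32_t)g << 8)  |  (uint32_t)b;
}

/* In-place RGBA -> BGRA conversion: the kind of extra per-pixel work a
   readback path would need if source and destination channel orders
   differ (assumption, see above). */
void swizzle_rgba_to_bgra(uint8_t *pixels, size_t pixel_count)
{
    assert(pixels != NULL || pixel_count == 0);
    for (size_t i = 0; i < pixel_count; ++i) {
        uint8_t *p = pixels + 4 * i;
        uint8_t r = p[0];
        p[0] = p[2];   /* B moves to the front */
        p[2] = r;      /* R moves to the third byte */
    }
}
```

If the two layouts already match, a readback can in principle be a straight copy of 32-bit words, which would fit the guess that the driver treats the packed type as a plain 32-bit format.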

    It would be interesting to see benchmark results with a changed format and type for the troublesome hardware.

  6. #26
    Senior Member OpenGL Guru Dark Photon's Avatar
    Join Date
    Oct 2004
    Location
    Druidia
    Posts
    3,190

    Re: slow transfer speed on fermi cards

    Quote Originally Posted by mhagain
    switching format from GL_RGBA to GL_BGRA caused performance to almost double. Changing the type from GL_UNSIGNED_BYTE to GL_UNSIGNED_INT_8_8_8_8_REV gave a more subtle increase
    Attached updated transferBench with these mods (and Linux Makefile): transferBench_src4.zip

    Results:

    GTX285, 260.19.04b drivers, 2GHz Nehalem EP CPU:
    glReadPixels: 1.88 ms
    PBO glReadPixels: 0.92 ms (memcpy: 1.64 ms) total: 2.55 ms
    glTexSubImage2D: 3.55 ms
    PBO glTexSubImage2D: 0.07 ms (memcpy: 1.67 ms) total: 1.74 ms
    glCopyTexSubImage2D: 0.02 ms
    glGetTexImage: 8.60 ms

    memcpy speed: 2252 MBytes/sec

    Total frame: 23.69 ms (total transfer: 14.80 ms)

    GTX480, 260.19.04b drivers, 2GHz Nehalem EP CPU:
    glReadPixels: 4.82 ms
    PBO glReadPixels: 3.47 ms (memcpy: 1.12 ms) total: 4.59 ms
    glTexSubImage2D: 4.74 ms
    PBO glTexSubImage2D: 4.92 ms (memcpy: 1.11 ms) total: 6.03 ms
    glCopyTexSubImage2D: 0.08 ms
    glGetTexImage: 9.97 ms

    memcpy speed: 3303 MBytes/sec

    Total frame: 37.81 ms (total transfer: 25.49 ms)

    So faster than before, but still a 2.6X slowdown on GTX480 vs. GTX285 (before, was 3.8X slowdown).
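To make the per-call comparison easier to see, here is a short sketch that computes the GTX480 / GTX285 ratio for each row of the two result listings. The timings are copied from the post. The struct and function names are invented for this example, and using the "total" figure for the PBO rows is my own choice.

```c
#include <assert.h>
#include <stdio.h>

/* One benchmark row: timings in milliseconds for each card. */
struct timing {
    const char *name;
    double gtx285_ms;
    double gtx480_ms;
};

/* Values copied from the GTX285 and GTX480 result listings. */
const struct timing k_timings[] = {
    {"glReadPixels",                1.88, 4.82},
    {"PBO glReadPixels (total)",    2.55, 4.59},
    {"glTexSubImage2D",             3.55, 4.74},
    {"PBO glTexSubImage2D (total)", 1.74, 6.03},
    {"glCopyTexSubImage2D",         0.02, 0.08},
    {"glGetTexImage",               8.60, 9.97},
};
const int k_timing_count = sizeof k_timings / sizeof k_timings[0];

/* How many times slower the GTX480 is than the GTX285 for one row. */
double slowdown(const struct timing *t)
{
    assert(t->gtx285_ms > 0.0);
    return t->gtx480_ms / t->gtx285_ms;
}

/* Print one line per row with the computed ratio. */
void print_slowdowns(void)
{
    for (int i = 0; i < k_timing_count; ++i)
        printf("%-30s %.2fx\n", k_timings[i].name, slowdown(&k_timings[i]));
}
```

For plain glReadPixels this gives 4.82 / 1.88, roughly 2.56, which lines up with the 2.6X figure if that figure refers to the non-PBO readback.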

  7. #27
    Member Regular Contributor
    Join Date
    Nov 2003
    Location
    Czech Republic
    Posts
    317

    Re: slow transfer speed on fermi cards

    So faster than before, but still a 2.6X slowdown on GTX480 vs. GTX285 (before, was 3.8X slowdown).
    I guess this improvement is only due to faster memcpy (which I cannot explain).

    Here is proof: 2.6 / 3.8 ≈ 2252 / 3303 (both are about 0.68)
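The arithmetic behind that comparison can be checked with a few lines. This is only a sketch of the calculation; `ratio_gap` is a made-up helper name.

```c
#include <assert.h>

/* Relative gap between the ratios a/b and c/d, measured against c/d. */
double ratio_gap(double a, double b, double c, double d)
{
    assert(b != 0.0 && d != 0.0 && c != 0.0);
    double lhs = a / b;
    double rhs = c / d;
    double diff = lhs > rhs ? lhs - rhs : rhs - lhs;
    return diff / rhs;
}
```

`ratio_gap(2.6, 3.8, 2252.0, 3303.0)` comes out well under one percent, so the two ratios agree to within the rounding of the 2.6X and 3.8X figures.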

  8. #28
    Member Regular Contributor
    Join Date
    Nov 2003
    Location
    Czech Republic
    Posts
    317

    Re: slow transfer speed on fermi cards

    Could anyone download CUDA-Z (google it) and run it on both the 480 and 285 cards in the same PC? It reports memory performance statistics when using CUDA. They should match the OpenGL numbers (with and without PBO).

  9. #29
    Junior Member Newbie
    Join Date
    Jan 2009
    Posts
    3

    Re: slow transfer speed on fermi cards

    I thought I would jump into the fray. I too have been chasing this problem for about a week or so. I bought a 465 to test with our software on Linux (CentOS 4.7). Our 9800 and 280 GTX cards ran circles around the 465 until we stopped the glRead calls; then the render speeds made sense.

    I have been hanging out at NVIDIA's site trying to get answers... no luck. I wanted to try CUDA-Z, but my OS libraries are out of date, so I am building a new OS disk... Anybody have anything new to report? --Mike

    For what it is worth, I also tested a GTX 480 and got the same results.

  10. #30
    Junior Member Newbie
    Join Date
    Sep 2010
    Posts
    2

    Re: slow transfer speed on fermi cards

    My card is a GTX 460, which uses the Fermi GF104 GPU, a derivative of the Fermi GF100 GPU used in the GTX 465/470/480. This card unfortunately also suffers from slow glReadPixels speed, and it seems to be a lot worse than the GF100 cards. In transferBench, I initially get 22 ms for glReadPixels. The odd thing is that if I let it run for about half a minute, it improves to 18 ms, but at the same time PBO glReadPixels and glGetTexImage become slower by about 2 ms.

    As for the memcpy speed comparison between transferBench and CUDA-Z, CUDA-Z reports slower speeds (5900 MB/s pinned, 4700 MB/s pageable) than transferBench (6900 MB/s) for my GTX 460.
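When comparing memcpy figures from different tools, it helps to know exactly what is being timed. Below is a minimal host-side memcpy throughput sketch. It is not how transferBench or CUDA-Z measure it (I have not checked either). The function names, the choice of 1 MByte = 2^20 bytes, and the page-touching step are all assumptions of this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Convert bytes moved in `seconds` to MBytes/sec.
   Assumption: 1 MByte = 2^20 bytes; tools differ on this. */
double mbytes_per_sec(size_t bytes, double seconds)
{
    assert(seconds > 0.0);
    return ((double)bytes / (1024.0 * 1024.0)) / seconds;
}

/* Copy `bytes` bytes `reps` times between two heap buffers and return the
   throughput in MBytes/sec, or -1.0 if allocation fails or the elapsed
   time is too small to measure. */
double measure_memcpy_mbps(size_t bytes, int reps)
{
    unsigned char *src = malloc(bytes);
    unsigned char *dst = malloc(bytes);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1.0;
    }
    memset(src, 0xAB, bytes);   /* touch pages so first-use faults are not timed */
    memset(dst, 0, bytes);

    struct timespec t0, t1;
    timespec_get(&t0, TIME_UTC);
    for (int i = 0; i < reps; ++i) {
        src[0] = (unsigned char)i;      /* vary the source a little each pass */
        memcpy(dst, src, bytes);
    }
    timespec_get(&t1, TIME_UTC);

    volatile unsigned char sink = dst[bytes - 1];  /* keep the copies observable */
    (void)sink;

    double seconds = (double)(t1.tv_sec - t0.tv_sec) +
                     (double)(t1.tv_nsec - t0.tv_nsec) / 1.0e9;
    free(src);
    free(dst);
    if (seconds <= 0.0)
        return -1.0;
    return mbytes_per_sec(bytes * (size_t)reps, seconds);
}
```

Buffer size, pinned versus pageable memory, and whether the buffers fit in cache can each move the result a lot, so differences like 5900 vs. 6900 MB/s between two tools are not surprising on their own.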
