Part of the Khronos Group
OpenGL.org


Thread: slow transfer speed on fermi cards

  1. #11
    Senior Member OpenGL Guru Dark Photon
    Join Date
    Oct 2004
    Location
    Druidia
    Posts
    3,183

    Re: slow transfer speed on fermi cards

    Quote Originally Posted by trinitrotoluene
    If people want the modified version of the source code to run on Linux I will post it.
    Here's a port that should run on both Linux and MSWin. Nice little test program!

  2. #12
    Senior Member OpenGL Guru Dark Photon
    Join Date
    Oct 2004
    Location
    Druidia
    Posts
    3,183

    Re: slow transfer speed on fermi cards

    Results. Confirmed on Linux that there appears to be a major performance slowdown here. Throwing more CPU at it helps a bit, but falls far short of bringing a GTX480 up to GTX285 performance.

    Do you have an NVDeveloper account? If so, please kick in a bug report if you haven't already.

    GTX285, 256.35 drivers, 2GHz Nehalem EP CPU:
    glReadPixels: 3.86 ms
    PBO glReadPixels: 3.54 ms (memcpy: 1.24 ms) total: 4.78 ms
    glTexSubImage2D: 4.99 ms
    PBO glTexSubImage2D: 0.09 ms (memcpy: 1.43 ms) total: 1.52 ms
    glCopyTexSubImage2D: 0.02 ms
    glGetTexImage: 11.97 ms

    memcpy speed: 2984 MBytes/sec

    Total frame: 34.90 ms (total transfer: 22.15 ms)

    GTX480, 256.35 drivers, 2GHz Nehalem EP CPU:
    glReadPixels: 14.57 ms
    PBO glReadPixels: 3.25 ms (memcpy: 1.48 ms) total: 4.72 ms
    glTexSubImage2D: 2.68 ms
    PBO glTexSubImage2D: 0.05 ms (memcpy: 1.43 ms) total: 1.49 ms
    glCopyTexSubImage2D: 0.05 ms
    glGetTexImage: 9.94 ms

    memcpy speed: 2494 MBytes/sec

    Total frame: 38.26 ms (total transfer: 30.77 ms)

    GTX480, 256.35 drivers, 3.2GHz Nehalem CPU:
    glReadPixels: 8.32 ms
    PBO glReadPixels: 2.42 ms (memcpy: 0.71 ms) total: 3.13 ms
    glTexSubImage2D: 1.24 ms
    PBO glTexSubImage2D: 0.03 ms (memcpy: 0.61 ms) total: 0.64 ms
    glCopyTexSubImage2D: 0.03 ms
    glGetTexImage: 3.92 ms

    memcpy speed: 5214 MBytes/sec

    Total frame: 20.19 ms (total transfer: 16.05 ms)

  3. #13
    Member Regular Contributor
    Join Date
    Aug 2003
    Posts
    261

    Re: slow transfer speed on fermi cards

    Are any NVIDIA driver engineers reading this post? I upgraded from the GTX 275 to the GTX 465, and not only am I experiencing the same readback slowdowns discussed here, I am seeing slowdowns across the board. For instance, in a multi-view rendering application I'm writing, the GTX 275 could render 4032 views at 15 fps. The GTX 465 only gets 4 fps.

  4. #14
    Junior Member Regular Contributor
    Join Date
    Feb 2004
    Posts
    248

    Re: slow transfer speed on fermi cards

    @trinitrotoluene:
    Thanks for the HD5870 bench data.
    Very interesting to do this direct comparison.
    What kind of quality/AA settings did you use in the driver?

    Your results look like the optimal stats of my GTX280 and GTX480 combined.

  5. #15
    Junior Member Regular Contributor
    Join Date
    Feb 2004
    Posts
    248

    Re: slow transfer speed on fermi cards

    Quote Originally Posted by Dark Photon
    Do you have an NVDeveloper account? If so, please kick in a bug report if you haven't already.
    No I don't. I would be glad if somebody else with an account could do this. Thanks.

  6. #16
    Senior Member OpenGL Guru Dark Photon
    Join Date
    Oct 2004
    Location
    Druidia
    Posts
    3,183

    Re: slow transfer speed on fermi cards

    Quote Originally Posted by def
    Do you have an NVDeveloper account? If so, please kick in a bug report if you haven't already.
    No I don't. I would be glad if somebody else with an account could do this. Thanks.
    Done.

    And since I just happen to have the stats handy:

    GTX285, 256.38.02 drivers, 2.6GHz Nehalem CPU:
    glReadPixels: 2.37 ms
    PBO glReadPixels: 2.32 ms (memcpy: 0.58 ms) total: 2.90 ms
    glTexSubImage2D: 2.70 ms
    PBO glTexSubImage2D: 0.05 ms (memcpy: 0.60 ms) total: 0.65 ms
    glCopyTexSubImage2D: 0.01 ms
    glGetTexImage: 5.50 ms

    memcpy speed: 6332 MBytes/sec

    Total frame: 18.37 ms (total transfer: 11.44 ms)

  7. #17
    Member Regular Contributor trinitrotoluene
    Join Date
    Sep 2008
    Location
    Montérégie,Québec
    Posts
    362

    Re: slow transfer speed on fermi cards

    For AA settings, the quality is super-sample at level 8x and the filter is edge detect (24x). I noticed that my memcpy speed is slower than others'. I have a Phenom II X6 1090T on 64-bit Ubuntu. With the libc memcpy I get a transfer rate of ~2100 MB/s. With this replacement:
    Code :
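    #include <stddef.h>

    /* Byte-wise replacement for the libc memcpy. The parentheses around the
       name keep a function-like memcpy macro from expanding here. */
    void *(memcpy)(void *b, const void *a, size_t n) {
      size_t i;
      char *s1 = (char *)b;
      const char *s2 = (const char *)a;  /* keep the const qualifier */

      /* Has an effect only when built with -fopenmp. */
      #pragma omp parallel for shared(s1, s2, n) private(i) schedule(static)
      for (i = 0; i < n; i++) {
        s1[i] = s2[i];
      }

      return b;
    }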

    I get a transfer rate of ~2700 MB/s, all with the GCC compiler flags -O3 -msse4a -march=amdfam10. When I enable OpenMP (-fopenmp) I get ~3700 MB/s, but the rendering now has some glitches. Maybe that is because all 6 cores are busy 100% of the time, which leaves the driver little CPU time to push draw calls to the video card.
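
If the glitches do come from the copy occupying every core, one possible variation is to split the buffer into a fixed number of contiguous chunks and let libc memcpy handle each chunk, so the thread count can be capped below the core count. This is a sketch under that assumption, not something posted in the thread; `chunked_copy` and its `chunks` parameter are hypothetical names, and the source and destination must not overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the test program): copy n bytes from src
 * to dst in `chunks` contiguous pieces, each handled by libc memcpy.
 * Built with -fopenmp the pieces are copied in parallel by at most `chunks`
 * threads; without it the pragma is ignored and the loop runs serially with
 * the same result. */
void *chunked_copy(void *dst, const void *src, size_t n, int chunks)
{
    assert(chunks > 0);
    char *d = (char *)dst;
    const char *s = (const char *)src;
    size_t step = (n + (size_t)chunks - 1) / (size_t)chunks;  /* ceil(n / chunks) */

    #pragma omp parallel for schedule(static) num_threads(chunks)
    for (int c = 0; c < chunks; c++) {
        size_t begin = (size_t)c * step;
        if (begin < n) {
            /* The last piece may be shorter than the others. */
            size_t len = (n - begin < step) ? (n - begin) : step;
            memcpy(d + begin, s + begin, len);
        }
    }
    return dst;
}
```

For example, `chunked_copy(dst, src, n, 5)` on a six-core machine would leave one core free for the driver. Whether that removes the glitches is untested.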

    Like Dark Photon wrote before, this is a nice test program. I will profile it to see whether speeding up the memcpy call further is worthwhile.
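
To compare memcpy variants outside the benchmark, a standalone throughput measurement along these lines may help. It is a minimal sketch, not the test program's own code: `memcpy_mbytes_per_sec` and `now_sec` are hypothetical names, and 1 MByte is taken as 2^20 bytes because the thread does not say which unit the benchmark uses.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Monotonic wall-clock time in seconds (POSIX clock_gettime). */
static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Hypothetical helper: copy `bytes` bytes `reps` times with libc memcpy and
 * return the average throughput in MBytes/sec (1 MByte = 2^20 bytes, an
 * assumption). Returns a negative value if allocation fails or the timer
 * is too coarse to measure the run. */
double memcpy_mbytes_per_sec(size_t bytes, int reps)
{
    assert(bytes > 0 && reps > 0);
    unsigned char *src = malloc(bytes);
    unsigned char *dst = malloc(bytes);
    if (!src || !dst) { free(src); free(dst); return -1.0; }
    memset(src, 0xA5, bytes);   /* touch the pages so page faults are not timed */
    memset(dst, 0, bytes);

    double t0 = now_sec();
    for (int r = 0; r < reps; r++) {
        src[0] = (unsigned char)r;      /* vary the input on each pass */
        memcpy(dst, src, bytes);
    }
    double elapsed = now_sec() - t0;

    volatile unsigned char sink = dst[bytes - 1];  /* keep the copies observable */
    (void)sink;
    free(src);
    free(dst);
    if (elapsed <= 0.0) return -1.0;
    return ((double)bytes * reps / (1024.0 * 1024.0)) / elapsed;
}
```

A call such as `memcpy_mbytes_per_sec(16u << 20, 50)` copies 16 MBytes fifty times; swapping the `memcpy` call for another copy routine gives a like-for-like comparison.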

  8. #18
    Intern Newbie
    Join Date
    Oct 2007
    Posts
    47

    Re: slow transfer speed on fermi cards

    Built on Mac OS X 10.6.4
    Snow Leopard graphics update beta (1.6.18.16 19.5.9f02)
    GTX 275, Nehalem CPU

    g++ transferBench.cpp -framework OpenGL -framework GLUT -lGLEW
    Changes needed: include sys/time.h instead of time.h and use gettimeofday.
    Status: Using GLEW 1.5.4

    glReadPixels: 6.28 ms
    PBO glReadPixels: 6.72 ms (memcpy: 0.67 ms) total: 7.40 ms
    glTexSubImage2D: 0.92 ms
    PBO glTexSubImage2D: 0.11 ms (memcpy: 0.82 ms) total: 0.94 ms
    glCopyTexSubImage2D: 0.08 ms
    glGetTexImage: 5.31 ms

    memcpy speed: 5486 MBytes/sec

    Total frame: 25.16 ms (total transfer: 20.00 ms)
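
The gettimeofday change mentioned in the build notes above could look roughly like this. The original timer code is not shown in the thread, so `get_time_ms` is a hypothetical stand-in for the benchmark's timing function, and returning milliseconds is an assumption based on the units of the reported results.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical stand-in for the benchmark's timer: wall-clock time in
 * milliseconds since the Unix epoch, built on POSIX gettimeofday(). */
double get_time_ms(void)
{
    struct timeval tv;
    int rc = gettimeofday(&tv, NULL);
    assert(rc == 0);
    (void)rc;  /* silence the unused warning when NDEBUG is defined */
    return (double)tv.tv_sec * 1000.0 + (double)tv.tv_usec / 1000.0;
}
```

An interval is the difference between two readings. Note that gettimeofday follows the system clock, so an adjustment during a run can skew a measurement.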

  9. #19
    Member Regular Contributor
    Join Date
    Nov 2003
    Location
    Czech Republic
    Posts
    317

    Re: slow transfer speed on fermi cards

    The benchmark code for glReadPixels is not correct.

    You should call glFinish before starting the timer. (line 212)
    Internally the glReadPixels has to wait until all the geometry is rendered. Then it starts the transfer.

    There is also a small problem with glTexSubImage2D with PBO.
    The texture is actually not loaded when glTexSubImage2D returns. It just starts the DMA to GPU memory. This transfer happens in background.

    I'd recommend calling glFinish() before every getTime().
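
The recommendation above can be sketched as a small timing harness. To keep the example runnable without an OpenGL context, the synchronisation call and the measured operation are passed in as function pointers; in the benchmark, `sync` would be a wrapper around glFinish() and `op` a wrapper around the transfer being timed. `time_op_ms`, `fake_sync` and `fake_op` are hypothetical names, and the two fakes are stand-ins only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

typedef void (*void_fn)(void);

/* Monotonic wall-clock time in milliseconds (POSIX clock_gettime). */
static double now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1000.0 + (double)ts.tv_nsec * 1e-6;
}

/* Hypothetical harness: drain pending work, time `op`, drain again, and
 * return the elapsed milliseconds. */
double time_op_ms(void_fn sync, void_fn op)
{
    assert(sync && op);
    sync();                 /* earlier rendering must not be billed to op */
    double t0 = now_ms();
    op();
    sync();                 /* wait for any transfer that op only started */
    return now_ms() - t0;
}

/* Stand-ins so the harness can be exercised without a GL context. */
int sync_calls = 0;
int op_calls = 0;
void fake_sync(void) { sync_calls++; }   /* would call glFinish() */
void fake_op(void)   { op_calls++; }     /* would call e.g. glReadPixels(...) */
```

With real GL calls, the second synchronisation charges the full transfer to the measured interval, which matters most for the PBO upload path that returns before the DMA has finished.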

  10. #20
    Senior Member OpenGL Guru Dark Photon
    Join Date
    Oct 2004
    Location
    Druidia
    Posts
    3,183

    Re: slow transfer speed on fermi cards

    Quote Originally Posted by mfort
    The benchmark code for glReadPixels is not correct.
    For the base glReadPixels it should be fine. glReadPixels must block until rendering completes, all the data is back, and it has been copied into the destination array at the address indicated by the "pixels" parameter. Timing is sampled after glReadPixels.

    For the PBO glReadPixels, while the glReadPixels may pipeline, the glMapBuffer will block until rendering completes, all the data is in the buffer, and it has been transferred into a mappable CPU block. Timing is sampled after the glMapBuffer. So that should be fine as well.

    The only thing I see wrong is the glFlush -- it is superfluous in both cases. Removing it does not affect the performance of either case here.
