MRT performance problems

Has anyone used ATI_draw_buffers on Nvidia cards? I’m running into some odd performance problems, and wondered if anyone else had seen something similar…

As soon as I enable more than one buffer for drawing, my polygon rate drops by 5x-10x. It happens with 8-bit int or 16- or 32-bit float pbuffers, and with every combination of front/back/aux buffers I’ve tried. Rendering to the FRONT and BACK buffers of the framebuffer at once (with no AA) works at full speed, but that doesn’t seem very useful :) BACK + AUX0, or BACK + FRONT with AA on the main framebuffer, slows down to the same speeds as the pbuffers.

I’m not sure exactly where the bottleneck is. It doesn’t seem to affect backfacing polys or polys outside the view volume, so I assume it’s either triangle setup or fill rate. Neither makes much sense, though, given that I can make 2 single-buffer passes at 1200x900 and still get 2-5x the framerate I get at 400x300 with a single pass to 2 buffers.
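To put a number on why fill rate alone looks like a poor explanation, here is a rough sketch that counts color-buffer writes per frame for the two setups. It assumes every pixel of the target is covered exactly once per pass (no overdraw), which is a simplification and not something I measured; the function name is just for illustration.

```python
# Back-of-the-envelope count of color-buffer writes per frame.
# Assumption (not measured): every pixel of the target is covered
# exactly once per pass, i.e. no overdraw.

def buffer_writes(width, height, passes, buffers_per_pass):
    """Pixels written per frame across all passes and bound draw buffers."""
    return width * height * passes * buffers_per_pass

# Two separate single-buffer passes at 1200x900
two_pass = buffer_writes(1200, 900, passes=2, buffers_per_pass=1)

# One pass writing to two buffers at 400x300
mrt = buffer_writes(400, 300, passes=1, buffers_per_pass=2)

print(f"2 single-buffer passes at 1200x900: {two_pass:,} writes")
print(f"1 two-buffer pass at 400x300:       {mrt:,} writes")
print(f"ratio: {two_pass / mrt:.1f}x")
```

Under that assumption the two-pass setup writes about 9x as many pixels (and transforms the geometry twice), yet it still runs 2-5x faster than the single two-buffer pass.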

Example numbers with the 66.93 Windows drivers on a 6800GT, at 400x300 with an 8-bit/color pbuffer and a simple ~500k tri scene:

  1. no polys culled, covering most of the window:
    back buffer only = ~56Mpoly/sec
    front + back buffer = ~14Mpoly/sec (with more complicated FP = ~7M)

  2. same as 1, but from far away, covering a few % of the window:
    back buffer only = ~56M
    front + back = ~21M

  3. same as 1 and 2, but all polys beyond the far clip plane:
    back buffer only = ~105M
    front + back = 65M

  4. close up, most of the mesh outside the view, screen completely covered by ~5-10 polys:
    back only, or front + back = ~180M

  5. all polys culled or clipped (looking at the back of a more or less flat mesh, or looking away from the mesh):
    back only, or front + back = ~180M
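For comparing the cases, a small sketch that turns the rates above into a slowdown factor, an extra per-polygon cost, and an approximate frame rate. The rates are copied from the list; treating the "~500k tri" scene as exactly 500,000 triangles is my assumption, and the helper names are made up for this example.

```python
SCENE_TRIS = 500_000  # assumption: the "~500k tri" scene taken as exactly 500k

# (case, back-only Mpoly/sec, front+back Mpoly/sec), copied from the list above
measurements = [
    ("1. fills most of window", 56, 14),
    ("2. far away, few % of window", 56, 21),
    ("3. beyond far clip plane", 105, 65),
    ("4. close up, 5-10 polys on screen", 180, 180),
    ("5. all culled or clipped", 180, 180),
]

def slowdown(single, multi):
    """How many times slower the two-buffer case is."""
    return single / multi

def extra_ns_per_poly(single, multi):
    """Added time per polygon with two buffers, in nanoseconds.

    1 / (Mpoly/sec) is microseconds per polygon; x1000 converts to ns.
    """
    return (1.0 / multi - 1.0 / single) * 1000.0

def fps(mpoly_per_sec):
    """Approximate frames per second for the assumed scene size."""
    return mpoly_per_sec * 1e6 / SCENE_TRIS

for name, single, multi in measurements:
    print(f"{name:34s} slowdown {slowdown(single, multi):4.2f}x  "
          f"extra {extra_ns_per_poly(single, multi):5.1f} ns/poly  "
          f"fps {fps(single):5.0f} -> {fps(multi):5.0f}")
```

The extra cost goes from roughly 54 ns per polygon in case 1 to about 30 ns in case 2 and about 6 ns in case 3, and disappears in cases 4 and 5, so it seems to track how many polys actually get rasterized rather than how many are submitted.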

The vertex program is fairly simple: it basically transforms the vertex and copies some values to 2 texcoords.
The fragment program just sets the result.color outputs to solid colors.

I could only find one MRT example (here) that looked easy to modify to measure poly rate, and it seemed to have similar limitations…