Hi,
I am working on a 3D app that is CAD-like.
It is windowed and has a single main render viewport.
The OpenGL rendering consists of a mix of DisplayLists, Text, a few immediate commands, and mostly Index/Vertex/Normal/Color arrays.
I have been testing the app on a number of Windows systems, and every one that has an NVidia card renders the array objects approximately 5x slower than the ones with ATI cards. The display lists and other minor items contain so few entities that they render blazingly fast on both, so I cannot use them as a performance guideline.
For example, if I render an array object of 64k indexes plus vertex/normal/color data, it takes approximately 1ms on systems with ATI and 5ms on systems with NVidia. This is totally consistent on every system I have tested.
If I add a second array object, I get approximately 2x the render time: 10ms on NVidia and 2ms on ATI.
If I add a third array object, I get approximately 3x on both (15ms NV, 3ms ATI).
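To make the scaling concrete, the timings above can be reduced to a single per-object cost. This is only a rough sketch: `per_object_ms` is a made-up helper name (not something from the app), and it assumes the render time grows linearly from zero with the number of array objects, which is what the numbers suggest.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: least-squares slope through the origin.
// frame_ms[i] is the measured render time with (i + 1) array objects.
double per_object_ms(const std::vector<double>& frame_ms) {
    assert(!frame_ms.empty());
    double num = 0.0;
    double den = 0.0;
    for (std::size_t i = 0; i < frame_ms.size(); ++i) {
        const double n = static_cast<double>(i + 1);  // object count
        num += n * frame_ms[i];
        den += n * n;
    }
    return num / den;  // milliseconds per array object
}
```

Fed the figures quoted above (5, 10, 15 for NVidia and 1, 2, 3 for ATI), this gives 5ms and 1ms per object, so the 5x gap is a per-object cost rather than a fixed per-frame overhead.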
The index/vertex/normal/color arrays are very well laid out and optimized. Even if they were a mess, so that vertex-cache misses occurred, there shouldn’t be such a discrepancy between NVidia and ATI.
The arrays are managed and sent individually to the GPU because I need easy modification of their data CPU-side. FYI, I did try interleaving the arrays to send a single array to OpenGL, and it made no difference to the NVidia performance issue, although the overall performance on all systems improved by about 5%.
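For anyone who wants to reproduce the interleaving test, here is a minimal sketch of packing separate arrays into one. The layout is an assumption, since the real formats are not given above: 3 floats for position, 3 floats for normal, and 4 unsigned bytes for color. `InterleavedVertex` and `interleave` are illustrative names only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed layout (not taken from the app): position, normal, RGBA8 color.
struct InterleavedVertex {
    float position[3];
    float normal[3];
    std::uint8_t color[4];
};

// Packs three separate per-vertex arrays into a single interleaved array.
std::vector<InterleavedVertex> interleave(const std::vector<float>& positions,
                                          const std::vector<float>& normals,
                                          const std::vector<std::uint8_t>& colors) {
    const std::size_t count = positions.size() / 3;
    assert(normals.size() == count * 3);
    assert(colors.size() == count * 4);

    std::vector<InterleavedVertex> out(count);
    for (std::size_t i = 0; i < count; ++i) {
        for (std::size_t k = 0; k < 3; ++k) {
            out[i].position[k] = positions[3 * i + k];
            out[i].normal[k] = normals[3 * i + k];
        }
        for (std::size_t k = 0; k < 4; ++k) {
            out[i].color[k] = colors[4 * i + k];
        }
    }
    // With client-side arrays, each gl*Pointer call would then use
    // stride = sizeof(InterleavedVertex) and point at the matching member
    // of out[0], e.g. glVertexPointer(3, GL_FLOAT, stride, out[0].position).
    return out;
}
```

The trade-off is the one mentioned above: a single interleaved array is friendlier to the driver, but editing just one attribute CPU-side means touching the packed array instead of a small standalone one.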
This performance difference occurs on every system I have tested: numerous Core2Duo, Core2Quad, i3, and i5 machines, all around the 3GHz range; Windows XP Pro, Windows Vista x64, and Windows 7 x64; NVidia 8800GTS and GTX275; and ATI HD3870, HD4870, and HD6870 video cards.
It isn’t a v-sync or flush issue; I’ve already tested for both. On top of that, the render time goes up by a pretty much constant amount as each additional array object is added to the render loop, which can only mean that the performance issue is with NVidia’s handling of the arrays.
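For comparing numbers across machines, here is a minimal sketch of a wall-clock timer around one draw pass. `time_ms` is a hypothetical helper, not the app's actual timing code. The assumption is that the callable issues the array draw calls and then calls glFinish(), so the measurement covers GPU completion and not just command submission.

```cpp
#include <chrono>

// Hypothetical helper: runs `draw` once and returns elapsed wall time in ms.
// In the app, `draw` would issue the array draws followed by glFinish().
template <typename DrawFn>
double time_ms(DrawFn&& draw) {
    const auto start = std::chrono::steady_clock::now();
    draw();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Averaging this over many frames with v-sync disabled should keep buffer-swap waits out of the measurement.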
Any ideas?