Part of the Khronos Group
OpenGL.org


Thread: OpenGL performance - demystified

  1. #1
    Senior Member OpenGL Pro Aleksandar's Avatar
    Join Date
    Jul 2009
    Posts
    1,136

    OpenGL performance - demystified

    Hello to the whole GL community,

    This post is not a question but an attempt to help beginners get an overview of OpenGL performance and to demystify some common misconceptions. Everything written here relates to NVIDIA's implementation in the 19x.xx drivers for Windows. I'd be glad if you would take part and broaden this overview to other vendors (AMD/ATI in the first place).

    The first three misconceptions I'll comment on are:
    1. Using OpenGL 3.2 Core Profile significantly boosts the application speed
    2. Shaders are faster than fixed functionality
    3. Bindless Graphics boosts speed by an order of magnitude

    Some months ago I read in one of the posts that the GL 3.2 Core Profile enables much cheaper function calls. I was very excited by that, and it was my prime reason for switching to the "new technology". But after some time spent "porting" an application to GL 3.2, I realized that the boost is negligibly small, if it exists at all. A change of a few percent came from code reorganization, not from cheaper function calls. To be more direct: using the OpenGL 3.2 Core Profile on NVIDIA currently does not change the speed one bit! Of course, my intent is not to turn you away from the new programming model. Furthermore, I personally am using it. I just want to emphasize that switching to GL 3.2 Core does not by itself bring any speed boost. I measured the execution time of the glMultiDrawElements() function as well as the frame rate of the application, and noticed no change in either.
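
    The post does not say how the glMultiDrawElements() timing was taken, so what follows is only a minimal Python sketch of one way such a measurement could be set up. The names average_call_time, call and sync are hypothetical. Because OpenGL commands may be queued and executed asynchronously, the sketch assumes that a blocking synchronization callback (with PyOpenGL this could be glFinish) is invoked before starting and before stopping the clock.

```python
import time


def average_call_time(call, sync=None, repeats=100):
    """Return the mean wall-clock time in seconds of one invocation of `call`.

    `call` and `sync` are hypothetical placeholders: `call` would wrap the
    draw command under test, and `sync` would be a blocking function such
    as glFinish, so that queued GPU work is included in the measurement.
    """
    if repeats <= 0:
        raise ValueError("repeats must be positive")
    if sync is not None:
        sync()  # drain work submitted before the measurement starts
    start = time.perf_counter()
    for _ in range(repeats):
        call()
    if sync is not None:
        sync()  # wait until the measured commands have completed
    elapsed = time.perf_counter() - start
    return elapsed / repeats
```

    Without a sync callback the result mostly reflects the CPU-side cost of submitting the call, which may or may not be what one wants to compare between profiles.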

    Because fixed functionality is still supported (no matter how it is implemented), it is unlikely that those functions take a different path through the pipeline than the shaders do. It is also unlikely that you can implement any of that functionality better than the system developers can. Furthermore, shaders are usually used to extend the standard functionality, which means more computation in the shaders. To be more direct: shaders are at least as slow as fixed functionality, if not slower! On the other hand, we can use some tricks and skip some calculations to speed up shaders, but an implementation of the full fixed functionality in shaders will certainly be slower than, or at best equal to, the standard fixed functionality.

    Bindless Graphics is one of the best things that happened in the OpenGL world in the previous year. Porting an application to it was a very pleasant experience (although, until I resolved some bugs in my application, some very severe application crashes happened). Because I'm using (tens of) thousands of VBOs in the scene, I thought the Bindless extensions were something I had to try, and I haven't regretted it. With 65025 VBOs in the scene (and from 600K to 10M triangles), the speed gain was from 50% to 70%. The greatest speed gain achieved with bindless extensions (in all test cases) was 2 times. Although I didn't achieve 7.5x, I'm very satisfied with "just" 1.5-2x. (For fewer than 1K VBOs there is no speed gain at all.) Another great feature of the Bindless extensions is support for both fixed functionality and shaders.

    Table 1 shows the results of testing on a textured and lit terrain. The values in the gray columns are pseudo frame rates (the reciprocal of the rendering time); greater values are better. The values in the yellow columns are speed gain factors.

    http://sites.google.com/site/openglt...test/Tab-1.PNG
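
    As a rough illustration of how the two kinds of columns relate, here is a small Python sketch. It assumes that the speed gain factor is the ratio of the bindless pseudo frame rate to the baseline one, which the post does not state explicitly. The function names and the sample timings are hypothetical and are not values from Table 1.

```python
def pseudo_frame_rate(render_time_s):
    """Reciprocal of the rendering time in seconds (higher is better)."""
    if render_time_s <= 0:
        raise ValueError("render time must be positive")
    return 1.0 / render_time_s


def speed_gain(baseline_time_s, bindless_time_s):
    """Assumed definition: bindless pseudo frame rate over the baseline one."""
    return pseudo_frame_rate(bindless_time_s) / pseudo_frame_rate(baseline_time_s)


if __name__ == "__main__":
    # Hypothetical rendering times, chosen only for illustration.
    baseline = 0.020   # seconds per frame without bindless
    bindless = 0.0125  # seconds per frame with bindless
    print("baseline rate:", pseudo_frame_rate(baseline))
    print("bindless rate:", pseudo_frame_rate(bindless))
    print("gain factor:  ", speed_gain(baseline, bindless))
```

    Under this definition a gain factor of 1.0 means no change, which would match the "no speed gain" result reported for fewer than 1K VBOs.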

    Table 2 shows how the triangle and VBO counts map to the LODx and block size (64, 128 or 256) values.

    http://sites.google.com/site/openglt...test/Tab-2.PNG

  2. #2
    Senior Member OpenGL Guru
    Join Date
    May 2009
    Posts
    4,948

    Re: OpenGL performance - demystified

    1. Using OpenGL 3.2 Core Profile significantly boosts the application speed
    I don't think anyone has ever argued that.

    2. Shaders are faster than fixed functionality
    Or this.

    3. Bindless Graphics boosts speed by an order of magnitude
    Even NVIDIA only ever claimed a 7x performance improvement. And that was involving both halves of bindless (no uniforms).

    So I'm afraid your findings are not particularly earth-shattering. Disproving arguments nobody is making is not particularly insightful.

  3. #3
    Senior Member OpenGL Guru
    Join Date
    Dec 2000
    Location
    Reutlingen, Germany
    Posts
    2,042

    Re: OpenGL performance - demystified

    I agree with Alfonse.

    1) This has only ever been mentioned as a "possibility". Like "if they were to write an entirely new driver only for this profile, they COULD make it faster". It was always clear that this is not the case at the moment and won't be for many years.

    2) Maybe on the very first shader-only GPUs. And only for a few months. You can't beat the IHVs' hand-tuned fixed-function shaders.

    3) I am actually _positively_ surprised about your findings. I assumed this extension would be as pointless as VAOs are.

    Jan.
    GLIM - Immediate Mode Emulation for GL3

  4. #4

    Re: OpenGL performance - demystified

    I've realised that the vector versions of functions, particularly glVertex, are very slow, with PyOpenGL at least. I'm not sure about OpenGL with other languages.

    Changing glVertexfv(li) to glVertexf(*li) in Python gave me a massive speed increase.

    That's something I've noticed about performance.

  5. #5
    Senior Member OpenGL Pro Ilian Dinev's Avatar
    Join Date
    Jan 2008
    Location
    Watford, UK
    Posts
    1,290

    Re: OpenGL performance - demystified

    Matthew, it could be caused by Python itself, i.e. having to clone memory instead of pushing the values onto the virtual program stack. In a binary app, it would be a face-palm if this problem existed. Anyway, I think Aleksandar didn't mean to discuss the glBegin/glEnd interfaces.

    What Aleksandar wrote may not be news to many of us, but I think it's a nice post for the Google-search archive, so that newcomers can directly get clarification on performance aspects (for the current snapshot of drivers).
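
    The kind of per-call overhead described above can be illustrated with a small Python timing sketch. It does not reproduce PyOpenGL's internals. The functions scalar_call and vector_call are hypothetical stand-ins, and the copy into a ctypes float array is only an assumption about what a vector-style wrapper might have to do before handing data to native code.

```python
import ctypes
import timeit


def scalar_call(x, y, z):
    """Stand-in for a scalar entry point: receives three floats directly."""
    return x, y, z


def vector_call(seq):
    """Stand-in for a vector entry point: copies the sequence into a C array."""
    return (ctypes.c_float * len(seq))(*seq)


def compare(li, repeats=100_000):
    """Return (scalar_seconds, vector_seconds) for `repeats` calls of each."""
    t_scalar = timeit.timeit(lambda: scalar_call(*li), number=repeats)
    t_vector = timeit.timeit(lambda: vector_call(li), number=repeats)
    return t_scalar, t_vector


if __name__ == "__main__":
    t_scalar, t_vector = compare([1.0, 2.0, 3.0])
    print(f"unpacked arguments: {t_scalar:.4f} s")
    print(f"copied to C array:  {t_vector:.4f} s")
```

    The absolute numbers depend on the interpreter and machine; to check the real effect, the same timeit pattern would have to wrap the actual PyOpenGL calls inside a valid GL context.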

  6. #6
    Senior Member OpenGL Pro Aleksandar's Avatar
    Join Date
    Jul 2009
    Posts
    1,136

    Re: OpenGL performance - demystified

    Quote Originally Posted by Alfonse Reinheart
    So I'm afraid your findings are not particularly earth-shattering. Disproving arguments nobody is making is not particularly insightful.
    If I had made an earth-shattering discovery, I would have written a scientific paper, not a post on the beginners' forum. You have overlooked that. But my goal is achieved: you have agreed with my statements, and many beginners will see this before they form false expectations.

    And yes, there is a post that claims GL 3.2 calls are cheaper (not will be, but are), but I won't post the link, because it would be a "negative citation" for the poster.

    There are also questions on this forum about shader speed, so I didn't invent these statements myself.

    I included the Bindless extensions in this story because I was excited by speed improvements that are not advertised enough.

  7. #7
    Senior Member OpenGL Pro Ilian Dinev's Avatar
    Join Date
    Jan 2008
    Location
    Watford, UK
    Posts
    1,290

    Re: OpenGL performance - demystified

    That poster might have been me, with some beta GF drivers, multithreading enabled, and comparing CPU+GPU cycles between slightly differently tuned PCs, drivers and benchmark scenes (basically, I made a completely unfair comparison). I remember regretting that I didn't realize my mistakes soon enough, while the thread was recent and on topic. Though it was a post about 3.1, iirc.

    Aleksandar, actually writing a PDF and uploading it anywhere seems to get a higher Google rank when searching for tech topics.

  8. #8
    Senior Member OpenGL Pro Aleksandar's Avatar
    Join Date
    Jul 2009
    Posts
    1,136

    Re: OpenGL performance - demystified

    Don't worry, Ilian, and thank you for many very useful posts.

    Writing a PDF may be better for Google search, but it eliminates possible feedback. The official OpenGL forum is better for that purpose. I'm also curious about how ATI deals with OpenGL performance issues. Intel is "out of the game", but it will also be interesting to see how the i5 with its integrated graphics chip deals with OpenGL.

  9. #9
    Senior Member OpenGL Guru Dark Photon's Avatar
    Join Date
    Oct 2004
    Location
    Druidia
    Posts
    3,183

    Re: OpenGL performance - demystified

    Quote Originally Posted by Aleksandar
    Intel is "out of the game"...
    ...maybe not out for good, but definitely gonna be really, really late to the party.

  10. #10
    Senior Member OpenGL Guru Dark Photon's Avatar
    Join Date
    Oct 2004
    Location
    Druidia
    Posts
    3,183

    Re: OpenGL performance - demystified

    Quote Originally Posted by Aleksandar
    The greatest speed gain achieved with bindless extensions (in all test cases) was 2 times.
    So just to quantify it, 50% reduction in draw time. I've definitely seen ~15-20% here, and that's only for vertex attribute bindless.
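
    The conversion between a speed-up factor and a reduction in draw time used above can be written out as a short Python sketch. The function names are hypothetical, and reading the "~15-20%" figure as a reduction in draw time is an assumption based on the preceding sentence.

```python
def time_reduction_from_speedup(speedup):
    """Fraction of draw time saved when rendering becomes `speedup` times faster."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - 1.0 / speedup


def speedup_from_time_reduction(reduction):
    """Speed-up factor corresponding to saving `reduction` (0..1) of the draw time."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


if __name__ == "__main__":
    # A 2x speed-up halves the draw time.
    print(time_reduction_from_speedup(2.0))   # 0.5
    # A 1.5x speed-up saves roughly a third of the draw time.
    print(time_reduction_from_speedup(1.5))
    # 15% and 20% reductions correspond to roughly 1.18x and 1.25x.
    print(speedup_from_time_reduction(0.15))
    print(speedup_from_time_reduction(0.20))
```

    Seen this way, the 1.5-2x gains reported in the first post and a 15-20% reduction in draw time are on different scales and should not be compared directly as percentages.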
