
Thread: VAO framerate issues

  1. #11 Junior Member Newbie (Join Date: Jul 2014 | Posts: 15)

    Try this:

    glBindBuffer(GL_ARRAY_BUFFER, vertexbuffer);
    glBufferData(GL_ARRAY_BUFFER, sizeof(g_vertex_buffer_data), NULL, GL_STATIC_DRAW);

    and then:


    Code :
    //upload data
     
    glGenVertexArrays(1, &VertexArrayID);
    glBindVertexArray(VertexArrayID);
     
    glEnableVertexAttribArray(0);
    // GL_ARRAY_BUFFER is still bound to vertexbuffer from the glBindBuffer call above
    glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, (void*)0);
     
    int attempts = 0;
    do{
        // map the orphaned buffer store and copy the vertex data straight into it
        float * ptr = (float*) glMapBuffer(GL_ARRAY_BUFFER, GL_WRITE_ONLY);
        for(size_t i = 0; i < sizeof(g_vertex_buffer_data)/sizeof(g_vertex_buffer_data[0]); i++)
            ptr[i] = g_vertex_buffer_data[i];
     
        if(++attempts > 5){
            printf("given up after %d attempts", attempts);
            exit(EXIT_FAILURE);
        }
    }
    // glUnmapBuffer returns GL_FALSE if the data store was corrupted while mapped; retry in that case
    while( glUnmapBuffer(GL_ARRAY_BUFFER) == GL_FALSE );
     
     
    //...
     
    //render loop
    glBindVertexArray(VertexArrayID);
    glBindBuffer(GL_ARRAY_BUFFER, vertexbuffer);
    glDrawArrays(GL_TRIANGLES, 0, vaoSize/3);



    to upload the data to the GPU. This should force the driver to put the VBO in GPU memory (if there is any; remember that laptops may have an integrated graphics card with no dedicated memory, which is an issue you can't fix in any way other than changing the computer).
    Last edited by DarioCiao!; 07-05-2014 at 06:13 AM.

  2. #12 Junior Member Newbie (Join Date: Jun 2014 | Posts: 5)
    Back again after some days... thanks for your clever answers.
    Still stuck on the issue.

    What I've done so far:
    - rewrote the code to make it as simple as possible; it is attached at the end of this post for you to review if you want.
    - used a profiler (CodeXL, which works on Linux), but it didn't help me much.

    Quote:
    This is extremely unlikely to be your bottleneck because you're using a VBO, which should therefore be storing your vertex data on the GPU itself. In other words: with a VBO there should be no transfer between system memory and GPU memory happening at all.
    I'm starting to think the same as you. I'm investigating other possibilities but don't see what else it could be, and I'm not 100% sure the data is indeed stored on the GPU.

    Quote:
    If you're drawing terrain, there are lots of algorithms out there for not drawing the full 2001*2001 vertices, which would probably provide the best speed-up.
    I totally agree and the solutions you gave are meaningful, but I really want to know why I can only display such a low number of polygons on a gaming card (1.5 GB of dedicated memory).

    To sum up, here is the performance I get so far:
    - VAO only (non-indexed): 24 million vertices | 8 million triangles -> 20 FPS (terrain: 2001*2001 vertex grid | 1 quad made of 2 triangles using 6 vertices)
    - VBO (indexed): 96 million indices + 16 million vertices | 32 million triangles -> 15 FPS (terrain: 4001*4001 vertex grid | 1 quad made of 2 triangles sharing indexed vertices; see the index-generation sketch below)
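
    For reference, the indexed layout can be made concrete with a small sketch (my own illustration, not code from the attached project; buildGridIndices, W and H are hypothetical names): each quad of a W*H vertex grid contributes two triangles, i.e. six indices, all sharing the W*H vertices.

    Code :
    #include <vector>
    #include <cstdint>
     
    // Illustration only: build the index list for a W x H vertex grid,
    // two triangles (6 indices) per quad, all quads sharing the W*H vertices.
    std::vector<std::uint32_t> buildGridIndices(int W, int H)
    {
        std::vector<std::uint32_t> indices;
        indices.reserve(static_cast<std::uint64_t>(W - 1) * (H - 1) * 6); // 2 triangles * 3 indices per quad
        for(int y = 0; y < H - 1; ++y){
            for(int x = 0; x < W - 1; ++x){
                std::uint32_t i0 = static_cast<std::uint32_t>(y * W + x); // top-left corner of the quad
                std::uint32_t i1 = i0 + 1;                                // top-right
                std::uint32_t i2 = i0 + static_cast<std::uint32_t>(W);    // bottom-left (next row)
                std::uint32_t i3 = i2 + 1;                                // bottom-right
                indices.insert(indices.end(), { i0, i2, i1, i1, i2, i3 });
            }
        }
        return indices; // W = H = 4001 gives 4000*4000*6 = 96 million indices, matching the numbers above
    }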

    Quote:
    You don't describe your data much aside from saying it's terrain. What kind of terrain is it? Are you drawing all ~24 million vertices per frame? Are all ~24 million visible on-screen at the same time? Are you getting much overdraw? Have you a complex vertex shader? Have you a complex fragment shader? These are all far more likely to be causing poor performance.
    You can check my code below; this is the simplest scene you can make: no shaders, no textures (wireframe), and yes, all 24 million vertices are visible at the same time.
    And here is a screenshot to make it more concrete:

    [Screenshot attached: Game screenshot.jpg]

    Finally, here is my whole code (compiled under Linux). It should also compile under Windows (not tested), maybe with some minor changes, as I'm using only open-source frameworks.
    (Rename the files with the right extensions instead of .txt, or download the zip including the source + Linux binary.)
    If anyone is willing to review, compile or test the code, let me know what you think and whether it reproduces the same issue.
    Many thanks,
    Attached Files

  3. #13 Junior Member Newbie (Join Date: Jul 2014 | Posts: 15)
    Right now I don't know why, but I can't get my Linux VM to apt-get glfw; anyway, copy-pasting the relevant GL parts into SFML I get no strange performance penalty. Have you tried to upload the data with
    glBufferData(..., NULL, ...);
    and
    glMapBuffer();
    glUnmapBuffer();

    ?
    glBufferData with the NULL parameter avoids the creation of client-side arrays (which is what should be limiting the bandwidth). If that doesn't work, then just go core profile 3.3 and stop using deprecated functionality. A compact sketch of this upload pattern is shown below.
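
    For reference, here is a compact sketch of that upload pattern, in the same style as the code in post #11 (it assumes the vertexbuffer and g_vertex_buffer_data names from the earlier posts, and memcpy needs <cstring>; this is an illustration, not code from the thread):

    Code :
    // allocate the buffer store with NULL data, so no client-side copy is created
    glBindBuffer(GL_ARRAY_BUFFER, vertexbuffer);
    glBufferData(GL_ARRAY_BUFFER, sizeof(g_vertex_buffer_data), NULL, GL_STATIC_DRAW);
     
    // map the store and fill it directly
    void * ptr = glMapBuffer(GL_ARRAY_BUFFER, GL_WRITE_ONLY);
    if(ptr){
        memcpy(ptr, g_vertex_buffer_data, sizeof(g_vertex_buffer_data));
        // glUnmapBuffer returns GL_FALSE if the store was corrupted; re-upload in that case
        glUnmapBuffer(GL_ARRAY_BUFFER);
    }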

  4. #14 malexander, Member Regular Contributor (Join Date: Aug 2009 | Location: Ontario | Posts: 328)
    Perhaps try breaking up the VBOs into smaller buffers and issuing several draw calls. I ran into a similar performance issue with a very large mesh (20M points, 80M verts). Breaking the mesh into submeshes of 1M points improved performance by 2-3x on Nvidia hardware. On AMD hardware, it's the difference between taking many seconds to display and displaying in realtime. Hardware seems to prefer drawing many smaller meshes rather than one equivalent large mesh.

    For example, you could break your grid into 100x100 subgrids, arranged in a 20x20 fashion. Toss each subgrid into its own VAO, then just bind, draw and repeat. This would have the added benefit of making it easy to frustum-cull the subgrids that are not visible. You get some duplication of data along the edges where the subgrids meet, but the small-batch and culling optimizations should more than make up for it.
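
    A rough sketch of what that could look like for the 2001*2001 non-indexed grid from this thread (my own illustration; TILES, SUB, the heightAt() helper and the single position attribute are assumptions, not code from the thread):

    Code :
    // 20x20 arrangement of subgrids, each 101x101 vertices (100x100 quads);
    // border rows/columns are duplicated between neighbouring tiles.
    const int TILES = 20;
    const int SUB   = 101;
     
    GLuint  vao[TILES * TILES];
    GLuint  vbo[TILES * TILES];
    GLsizei vertCount[TILES * TILES];
     
    glGenVertexArrays(TILES * TILES, vao);
    glGenBuffers(TILES * TILES, vbo);
     
    for(int ty = 0; ty < TILES; ++ty){
        for(int tx = 0; tx < TILES; ++tx){
            int t = ty * TILES + tx;
     
            // build the non-indexed triangle list for this tile (needs <vector>)
            std::vector<float> verts;
            verts.reserve((SUB - 1) * (SUB - 1) * 6 * 3);
            for(int y = 0; y < SUB - 1; ++y){
                for(int x = 0; x < SUB - 1; ++x){
                    float gx = (float)(tx * (SUB - 1) + x); // position in the full grid
                    float gy = (float)(ty * (SUB - 1) + y);
                    // two triangles per quad, six vertices, z from a heightmap lookup
                    const float quad[6][2] = { {gx, gy}, {gx, gy + 1}, {gx + 1, gy},
                                               {gx + 1, gy}, {gx, gy + 1}, {gx + 1, gy + 1} };
                    for(const auto & p : quad){
                        verts.push_back(p[0]);
                        verts.push_back(p[1]);
                        verts.push_back(heightAt(p[0], p[1])); // hypothetical heightmap function
                    }
                }
            }
            vertCount[t] = (GLsizei)(verts.size() / 3);
     
            // each subgrid gets its own VAO + VBO
            glBindVertexArray(vao[t]);
            glBindBuffer(GL_ARRAY_BUFFER, vbo[t]);
            glBufferData(GL_ARRAY_BUFFER, verts.size() * sizeof(float), verts.data(), GL_STATIC_DRAW);
            glEnableVertexAttribArray(0);
            glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, (void*)0);
        }
    }
     
    // render loop: bind/draw each subgrid; frustum culling could skip invisible tiles here
    for(int t = 0; t < TILES * TILES; ++t){
        glBindVertexArray(vao[t]);
        glDrawArrays(GL_TRIANGLES, 0, vertCount[t]);
    }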

  5. #15 Junior Member Newbie (Join Date: Jun 2014 | Posts: 5)

    Some update:

    I did some more tests and made some progress,
    at least enough to discover that my performance was even worse than I thought! Indeed, I hadn't noticed that I was doing clipping in my code, so NOT all polygons were being displayed on screen.
    After disabling clipping, my framerate now drops to less than 4 FPS to display 32 million triangles (16 million vertices, 96 million indices)!!
    As was guessed, the bottleneck is not the bandwidth between system and GPU; at least that is one hypothesis eliminated.
    But the question remains: what, then, is the culprit?

    @DarioCiao: thanks for trying; if you need help I can post the packages needed to compile under Linux.
    And I didn't try to use glMapBuffer since, as I know now, bandwidth is not the issue.
