Part of the Khronos Group
OpenGL.org


Thread: OpenGL v2.1 + GLSL v1.2 => OpenGL v3.1 + GLSL v1.4

  1. #1
    Intern Contributor
    Join Date
    Jan 2008
    Location
    phobos, mars
    Posts
    75

    OpenGL v2.1 + GLSL v1.2 => OpenGL v3.1 + GLSL v1.4

    We've been creating a portable 3D game/graphics/simulation engine for nearly two years now. Our computers were limited to OpenGL v2.10 and GLSL v1.20 until just today when we upgraded to GTX285 cards and drivers that support OpenGL v3.10 and GLSL v1.40.

    We worked hard to make our application as fast and efficient as possible given the OpenGL/GLSL versions we had, and now want to upgrade any part of our engine that can be made substantially faster or more flexible. Our current design is 100% based upon the fastest approach we could concoct with IBOs/VBOs/FBOs and v1.20 shaders.

    Our question now is this. What new capabilities and features in OpenGL v3.10 and GLSL v1.40 are potentially most fruitful to explore to make our engine faster and more flexible?

    Of course, feel free to link to articles and PDFs that already address these questions.

    Thanks in advance for all suggestions.

  2. #2
    Member Regular Contributor
    Join Date
    Oct 2006
    Posts
    352

    Re: OpenGL v2.1 + GLSL v1.2 => OpenGL v3.1 + GLSL v1.4

    Off-hand, some new stuff you can take advantage of:

    Vertex Array Objects (VAOs) can reduce the cost of changing vertex streams.

    Uniform Buffer Objects (UBOs) can reduce the cost of uniform updates (one of the slowest parts in modern OpenGL programs).

    DrawElementsInstanced can accelerate the rendering of similar models.

    You can rely on new texture formats (e.g. filtered, 32bit floating-point, RG-channel textures). You can take advantage of this to reduce memory consumption or increase speed e.g. in variance shadow mapping.
    [The Open Toolkit library: C# OpenGL 4.4, OpenGL ES 3.1, OpenAL 1.1 for Mono/.Net]

  3. #3
    Super Moderator Frequent Contributor Groovounet's Avatar
    Join Date
    Jul 2004
    Posts
    934

    Re: OpenGL v2.1 + GLSL v1.2 => OpenGL v3.1 + GLSL v1.4

    MappedBufferRange is a kick ass feature!

    But it can involve significant code changes for the buffer management.

  4. #4
    Intern Contributor Abdallah DIB's Avatar
    Join Date
    Feb 2009
    Location
    France
    Posts
    70

    Re: OpenGL v2.1 + GLSL v1.2 => OpenGL v3.1 + GLSL v1.4

    Bindless buffer objects can increase application performance by up to 7x.
    There is no need to bind your buffers each frame. The driver reports buffer addresses back to the GL application, so you can use C-style pointers to access buffers.

    ref http://developer.nvidia.com/object/b..._graphics.html
    GL_NV_shader_buffer_load and GL_NV_vertex_buffer_unified_memory

  5. #5
    Member Regular Contributor
    Join Date
    Oct 2006
    Posts
    352

    Re: OpenGL v2.1 + GLSL v1.2 => OpenGL v3.1 + GLSL v1.4

    Quote Originally Posted by Abdallah DIB
    bindless buffer object can increase application performance up to 7x.
    I'm pretty sure the OP asked for OpenGL 3.1 / GLSL 1.4 features, not vendor-specific extensions.

    If you have the motivation and man-power to utilize this sort of thing, you should also check EXT_geometry_shader and custom multisample resolves (both are NV-only at the moment).

  6. #6
    Senior Member OpenGL Guru
    Join Date
    May 2009
    Posts
    4,948

    Re: OpenGL v2.1 + GLSL v1.2 => OpenGL v3.1 + GLSL v1.4

    Quote Originally Posted by Abdallah DIB
    bindless buffer object can increase application performance up to 7x.
    To use this, you must limit your program to NVIDIA hardware. You must also rewrite all of your shaders to use a completely different style of data access.

    If you want to avoid the platform restriction, you must implement and maintain 2 different sets of shaders. One set works with bindless; the other set works with standard OpenGL.

  7. #7
    Junior Member Regular Contributor Heiko's Avatar
    Join Date
    Aug 2008
    Location
    the Netherlands
    Posts
    170

    Re: OpenGL v2.1 + GLSL v1.2 => OpenGL v3.1 + GLSL v1.4

    You could use texture arrays to limit the texture bind calls. In your shaders you can specify which texture layer from the texture array you want to use.

  8. #8
    Intern Contributor
    Join Date
    Jan 2008
    Location
    phobos, mars
    Posts
    75

    Re: OpenGL v2.1 + GLSL v1.2 => OpenGL v3.1 + GLSL v1.4

    Thanks for all the great feedback. Keep it coming! I'll ask followup questions to everyone in this one reply.

    --- VAOs ---
    Would I be correct to say that VAOs are simply VBOs with the offsets/strides/datatypes of the contained vertices permanently bound? Presumably this is done to eliminate the need to specify offsets/strides/datatypes each time you make a VBO active and render it. Is that all VAOs are, or did I miss something?

    --- UBOs ---
    Would I be correct to say that a UBO is the complete set of uniform variables the shaders expect? If I understand this correctly, an application would need to define a set of offsets/strides/datatypes for the individual elements in a specific "uniform buffer object" just once. Then just before the application calls an OpenGL render function, it would update the values in its "uniform buffer object" image in CPU memory, then tell the driver where it is before the render call. The driver would then load all uniform variables into the GPU before rendering begins. Is this approximately correct?
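    On the offsets/strides/datatypes part: with the std140 layout the offsets are fixed by the spec, so a CPU-side struct can mirror the block directly. Below is a minimal sketch, assuming a made-up block named PerFrame with four members (none of these names come from the engine described above) and 4-byte floats. It only checks the layout on the CPU. In a real program the filled struct would be uploaded to a buffer object bound to GL_UNIFORM_BUFFER, for example with glBufferSubData, instead of the driver reading application memory at draw time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* CPU mirror of a hypothetical GLSL 1.40 block:
 *
 *   layout(std140) uniform PerFrame {
 *       mat4  viewProj;
 *       vec4  tint;
 *       vec3  lightDir;
 *       float time;
 *   };
 */
typedef struct PerFrame {
    float viewProj[16]; /* mat4:  offset  0, 64 bytes                       */
    float tint[4];      /* vec4:  offset 64                                 */
    float lightDir[3];  /* vec3:  offset 80 (16-byte aligned under std140)  */
    float time;         /* float: offset 92, sits in the vec3's spare slot  */
} PerFrame;

/* Returns 1 when the C struct matches the std140 offsets listed above. */
int per_frame_matches_std140(void)
{
    return offsetof(PerFrame, viewProj) == 0
        && offsetof(PerFrame, tint) == 64
        && offsetof(PerFrame, lightDir) == 80
        && offsetof(PerFrame, time) == 92
        && sizeof(PerFrame) == 96;
}

/* Fills the CPU-side image; the whole struct can then go to the GPU in one
 * buffer update rather than one glUniform* call per variable. */
void per_frame_fill(PerFrame *pf, const float viewProj[16], float time)
{
    memset(pf, 0, sizeof *pf);
    memcpy(pf->viewProj, viewProj, sizeof pf->viewProj);
    pf->tint[0] = pf->tint[1] = pf->tint[2] = pf->tint[3] = 1.0f;
    pf->lightDir[2] = -1.0f; /* arbitrary default for the sketch */
    pf->time = time;
}
```

    Blocks that do not use std140 have implementation-defined offsets, which would have to be queried from the driver instead of hard-coded like this.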

    --- DrawElementsInstanced ---
    I recall reading somewhere that a new built-in shader variable came into existence in some version of OpenGL/GLSL after v2.10/v1.20 --- called a vertexID number or similar. My assumption is, this vertexID number identifies which vertex in the VBO is currently being processed by each vertex shader (starting at zero, I assume). I guess this would be the value fetched from the IBO (the VBO that contains indices into the vertex VBO). That would seem to provide what is required for "instancing". Thus I don't see a need for special instancing draw calls. What am I missing?
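    One way to see what the instanced call adds is to replay it on the CPU. The sketch below is an emulation only (the function and struct names are invented for illustration, and it ignores the post-transform vertex cache): the same index list is walked once per instance, the vertex ID repeats for every copy, and only the instance ID tells the shader which copy it is working on.

```c
#include <assert.h>
#include <stddef.h>

/* One simulated vertex-shader invocation. */
typedef struct Invocation {
    unsigned vertex_id;   /* what GLSL exposes as gl_VertexID   */
    unsigned instance_id; /* what GLSL exposes as gl_InstanceID */
} Invocation;

/* Emulates the invocation order of an instanced indexed draw on the CPU.
 * Writes at most `cap` records to `out` and returns the total number of
 * invocations the draw would generate. */
size_t simulate_draw_elements_instanced(const unsigned short *indices,
                                        size_t index_count,
                                        unsigned instance_count,
                                        Invocation *out, size_t cap)
{
    size_t n = 0;
    for (unsigned inst = 0; inst < instance_count; ++inst) {
        for (size_t i = 0; i < index_count; ++i) {
            if (n < cap) {
                out[n].vertex_id = indices[i];  /* value fetched from the IBO */
                out[n].instance_id = inst;      /* 0 .. instance_count - 1    */
            }
            ++n;
        }
    }
    return n;
}
```

    With the vertex ID alone, drawing N copies would need either N draw calls or N copies of the indices in the IBO. The instance ID lets one call index a per-copy transform, for example from a uniform buffer.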

    --- MappedBufferRange ---
    This I do not understand. Our engine calls glBufferSubData() regularly, which we assume updates a portion of a VBO (sometimes the entire IBO or VBO in our case). I must be missing something about the intent of this function.

    --- OpenGL standard versus extension ---
    You are both correct. We are interested in opportunities that are extensions today IF they are fairly likely to become core eventually (in similar form). As long as ATI remains a viable and popular source of high-performance video cards, we prefer not to lock our software into nvidia (even though we have been 100% nvidia since the beginning). nvidia has been great, but we have nothing against AMD --- all our CPUs are Phenom2s!

    --- bindless graphics ---
    I read the nvidia PDF and the two extension text files, but have not gotten my brain around this yet. First, I find it difficult to believe that cache misses in the driver caused by looking up GPU addresses can slow any application by 7%, much less 700%. However, I applaud on principle the practice of letting CPU software control the GPU on the lowest feasible level.

    It appears that VAOs eliminate the need to specify the offsets/strides/datatypes before each render. How much more efficiency does this extension offer over VAOs (which presumably are standard OpenGL)?

    --- texture arrays ---
    Is a texture array [?object?] different from a 3D texture? Are they different in the sense each texture in a texture-array can be different size [and format]? That would be very nice indeed, and much more convenient than our "hack" with 3D textures.

    --- maximum speed techniques ---
    Currently our engine has large IBOs and VBOs (65536 elements each), and typically we render each IBO/VBO pair in one or two OpenGL glDrawElements() or glDrawRangeElements() calls. We can do this because we make the CPU transform every vertex to world coordinates (because we need world-coordinates for collision detection and simulations of several physical processes). Our vertex contains a 16-bit field (now an integer!!!) of flag bits that can change the behavior of the shader. All this combines to let us render up to 65536 vertices per draw call, thereby amortizing the overhead involved in state changes over 65536 vertices. Every once in a while we think "maybe this way is a mistake", but so far our analysis and tests say this way is best, all things considered (for our engine, anyway).

    Lately we have been wondering whether we should take this approach even further, switch to 32-bit indices, and put all our vertices into one huge IBO/VBO pair (up to ~30 million vertices). We could render large subsets of the IBO/VBO by calling glDrawRangeElements(), then update vertices outside that range by calling glBufferSubData(). That's what we do now, except we always update the contents of each VBO before we render it (we never modify an IBO or VBO being rendered).

    Our main motivation is not to increase performance, since our batch size is already huge, so further increases would likely not improve throughput measurably.

    Instead, our main motivation is flexibility - to allow our engine to dynamically regroup "objects" in any way it wishes, simply by reloading modest subsections of the IBO only (vertices in the VBO never need to move when they are all inside one VBO).

    Why would we want to do this? Here is one possibility, for example. Imagine a cube/octahedron/icosahedron/opportunistic shape centered around the camera/viewpoint with the camera pointing through the center of one face (or through a vertex). This divides the universe into 6/8/20/more volumes, each containing the centroid of some subset of all [game/simulation] objects. The objects in several to many of these volumes are not visible given the direction the camera is pointing (and the field-of-view of the camera). The engine can simply NOT DRAW the objects in any portion of the IBO that corresponds to these invisible volumes.
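    For the cube case, assigning an object to a volume can be as cheap as finding the dominant axis of the vector from the camera to the object's centroid. A minimal sketch, assuming the cube is axis-aligned in world space rather than rotated with the camera (the enum and function names are made up for illustration):

```c
#include <assert.h>

/* The six pyramid-shaped volumes of a cube centered on the camera. */
typedef enum CubeVolume {
    VOL_POS_X, VOL_NEG_X,
    VOL_POS_Y, VOL_NEG_Y,
    VOL_POS_Z, VOL_NEG_Z
} CubeVolume;

static float absf(float v) { return v < 0.0f ? -v : v; }

/* Picks the volume whose face the camera-to-centroid vector passes through,
 * i.e. the axis with the largest absolute component. Ties go to X, then Y. */
CubeVolume classify_centroid(const float centroid[3], const float camera[3])
{
    float dx = centroid[0] - camera[0];
    float dy = centroid[1] - camera[1];
    float dz = centroid[2] - camera[2];
    float ax = absf(dx), ay = absf(dy), az = absf(dz);

    if (ax >= ay && ax >= az) return dx >= 0.0f ? VOL_POS_X : VOL_NEG_X;
    if (ay >= az)             return dy >= 0.0f ? VOL_POS_Y : VOL_NEG_Y;
    return dz >= 0.0f ? VOL_POS_Z : VOL_NEG_Z;
}
```

    Which volumes can be skipped for a given view direction and field of view is a separate test that this sketch does not cover.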

    As objects move around in the environment from frame to frame, zero to a few objects will pass from one volume into another volume on each frame. The object can be removed from one volume and put into another simply by moving the object indices from one section of the IBO to another (and recompacting the "from" section of the IBO).
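    The move-and-recompact step can be sketched on a CPU copy of the index array. This assumes each volume owns a fixed-capacity section of the IBO and each object's indices form one contiguous run. The Section struct and move_run function are hypothetical names, not part of any API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One volume's slice of the big index array. */
typedef struct Section {
    size_t start;    /* first slot of this section in the index array */
    size_t count;    /* slots currently in use                        */
    size_t capacity; /* slots reserved for this section               */
} Section;

/* Moves `n` indices that begin at section-relative `offset` in `from` to the
 * end of `to`, then closes the gap left behind in `from`.
 * Returns 0 on success, -1 if the run is out of range, `to` lacks room,
 * or both arguments name the same section. */
int move_run(unsigned *indices, Section *from, Section *to,
             size_t offset, size_t n)
{
    if (from == to) return -1;
    if (offset > from->count || n > from->count - offset) return -1;
    if (n > to->capacity - to->count) return -1;

    unsigned *src = indices + from->start + offset;

    /* Append the object's indices to the destination section. */
    memcpy(indices + to->start + to->count, src, n * sizeof *indices);
    to->count += n;

    /* Recompact the source section by sliding its tail down over the gap. */
    size_t tail = from->count - offset - n;
    memmove(src, src + n, tail * sizeof *indices);
    from->count -= n;
    return 0;
}
```

    After a move, only the touched ranges (the appended run in the destination, and everything from the gap to the end of the source section) would need re-uploading, for instance with glBufferSubData on the element array buffer. Each visible section is then drawn from its start for its count of indices.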

    This is just one of several opportunities we find interesting, none of which work without switching to a single huge IBO/VBO pair. Any ideas and comments are welcome.

  9. #9
    Junior Member Regular Contributor
    Join Date
    Aug 2007
    Posts
    121

    Re: OpenGL v2.1 + GLSL v1.2 => OpenGL v3.1 + GLSL v1.4

    Quote Originally Posted by bootstrap
    --- texture arrays ---
    Is a texture array [?object?] different from a 3D texture? Are they different in the sense each texture in a texture-array can be different size [and format]? That would be very nice indeed, and much more convenient than our "hack" with 3D textures.
    Texture arrays are different because the 3rd dimension is not reduced with increasing mipmap levels. The different layers are really 2D textures with no filtering between them. It is really an array of 2D textures.
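    The mipmap difference can be shown with a little arithmetic. The sketch below computes the size of a given mip level for both texture kinds, using the usual rule that each dimension halves per level and never drops below 1. The Extent type and function names are made up for illustration.

```c
#include <assert.h>

/* Width, height, and depth (3D texture) or layer count (2D array). */
typedef struct Extent { unsigned w, h, d; } Extent;

/* max(1, base / 2^level), guarded against over-wide shifts. */
static unsigned mip_dim(unsigned base, unsigned level)
{
    unsigned v = level < 32 ? base >> level : 0;
    return v ? v : 1;
}

/* 3D texture: all three dimensions shrink with the mip level. */
Extent mip_extent_3d(Extent base, unsigned level)
{
    Extent e = { mip_dim(base.w, level),
                 mip_dim(base.h, level),
                 mip_dim(base.d, level) };
    return e;
}

/* 2D texture array: width and height shrink, the layer count does not. */
Extent mip_extent_2d_array(Extent base, unsigned level)
{
    Extent e = { mip_dim(base.w, level),
                 mip_dim(base.h, level),
                 base.d };
    return e;
}
```

    For a 256x256 base with 8 slices, level 3 of the 3D texture is 32x32x1, while level 3 of the array is still 8 separate 32x32 layers.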

  10. #10
    Junior Member Regular Contributor Heiko's Avatar
    Join Date
    Aug 2008
    Location
    the Netherlands
    Posts
    170

    Re: OpenGL v2.1 + GLSL v1.2 => OpenGL v3.1 + GLSL v1.4

    Correct. But all textures must have the same dimensions and format if I recall correctly. So the only difference (I think) is that no filtering is possible in the 3rd dimension. And there is no such thing as border layers (not sure if these are available in 3D textures...).
