OpenGL.org


Thread: Geometry shader view frustum culling

  1. #1
    BionicBytes (Senior Member, OpenGL Pro)
    Join Date
    Mar 2009
    Location
    UK, London
    Posts
    1,169

    Geometry shader view frustum culling

    Hi everyone,

    I am currently rendering between 1,000 and 10,000 instances of a 1000+ triangle object using instancing (with the instance data in a texture buffer object), and this works just fine. However, at any one time only a fraction of these instances are visible, so I would now like to experiment with transform feedback and GPU frustum culling to generate a list of only the visible instances.
    My initial thought is to render each instance as a point (GL_POINTS) and use transform feedback to store the gl_InstanceID of each visible instance in a buffer object. By emitting the gl_InstanceID into a (texture) buffer object I can slot this new step into my existing pipeline and render the scene efficiently no matter how many instances are drawn.

    I have no experience with transform feedback, but that is not the area which I foresee as the main stumbling block. It's geometry shaders that are completely new to me, and I just don't have a clue how to write one. The goal of the geometry shader is to frustum cull the points and emit the gl_InstanceID of those that survive. Every object instance has the same radius, which could be passed as a uniform to aid the culling.
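
    As a rough illustration of the per-instance test such a culling shader would perform, here is a minimal CPU-side sketch in C of a sphere-versus-frustum check. The names (`Plane`, `sphere_in_frustum`) and the convention that plane normals point into the frustum are assumptions made for this sketch, not anything taken from the thread; the same arithmetic should carry over to GLSL almost line for line.

```c
#include <assert.h>
#include <stdbool.h>

/* Plane in the form n.p + d = 0, with the normal n pointing INTO the frustum
   (an assumption of this sketch). The normal is expected to be unit length. */
typedef struct { float nx, ny, nz, d; } Plane;

/* Signed distance from point (x, y, z) to the plane. */
float plane_distance(const Plane *p, float x, float y, float z)
{
    return p->nx * x + p->ny * y + p->nz * z + p->d;
}

/* Returns false only when the sphere lies completely behind one of the planes.
   This is conservative: a sphere near a frustum corner may still be reported
   as visible even though it is just outside. */
bool sphere_in_frustum(const Plane planes[6],
                       float cx, float cy, float cz, float radius)
{
    assert(radius >= 0.0f);
    for (int i = 0; i < 6; ++i) {
        if (plane_distance(&planes[i], cx, cy, cz) < -radius)
            return false;
    }
    return true;
}
```

    Because every instance shares one radius, that value would be the single uniform mentioned above; only the centre changes per point.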

    I've Googled a lot and can't really find any decent tutorials on geometry shaders and/or geometry shader view frustum culling. Transform feedback examples are easier to find: someone has kindly written the OpenGL Samples Pack series of GL 3/4 examples, and there's the particle system transform feedback link on the front page of OpenGL.org.

    So my question is: can anyone help explain how to implement geometry shader view frustum culling with instanced rendering in mind, perhaps with some examples?
    It would also help if someone could post some reader-friendly geometry shader articles to help idiots like me with the syntax and rules (apart from the geometry_EXT spec).

    OpenGL 3.3 compatibility profile
    GLSL 1.50/3.30 compatibility shaders

  2. #2
    Advanced Member Frequent Contributor
    Join Date
    Dec 2007
    Location
    Hungary
    Posts
    985

    You are lucky, as I wrote a demo that does what you are looking for and there are also accompanying articles.

    First of all, I would suggest using instanced arrays instead of a texture buffer object, because on DX10-class hardware they are slightly faster (though this difference is not present on DX11-class hardware).

    Here you can find the necessary information:
    - Instance culling using geometry shaders (http://rastergrid.com/blog/2010/02/i...metry-shaders/)
    - Instance Cloud Reduction reloaded (http://rastergrid.com/blog/2010/06/i...tion-reloaded/)
    - Nature Demo (http://rastergrid.com/blog/downloads/nature-demo/)

    And if you are even interested in further improving the technique with Hi-Z occlusion culling and/or geometry LOD then here are some further articles and another demo:
    - OpenGL 4.0 Mountains Demo (http://rastergrid.com/blog/2010/10/o...demo-released/)
    - Hierarchical-Z map based occlusion culling (http://rastergrid.com/blog/2010/10/h...usion-culling/)
    - GPU based dynamic geometry LOD (http://rastergrid.com/blog/2010/10/g...-geometry-lod/)

    The latter uses GL4 features, but it can also be implemented using only GL3 functionality.
    Disclaimer: This is my personal profile. Whatever I write here is my personal opinion and none of my statements or speculations are anyhow related to my employer and as such should not be treated as accurate or valid and in no case should those be considered to represent the opinions of my employer.
    Technical Blog: http://www.rastergrid.com/blog/

  3. #3
    BionicBytes (Senior Member, OpenGL Pro)

    @aqnuep
    Actually I have seen your demos and articles; very nice and well done to you! You seem to be the only person I can find implementing something close to what I need.

    Thinking back, your articles motivated me to introduce instanced rendering in my engine, and subsequently I posted my (disappointing) results on these forums: "instancing sucks".

    Regarding instanced arrays: I felt these were much more cumbersome to implement than I had originally thought. Two issues with this. A 4*4 matrix is required in my case, and this means 4 additional vertex attribute streams to set up; my models already have their own vertex attribute streams, so this is over and above the usual vertex, normal, tangent and texture. And although my engine supplied an integer list of all the instances to draw, I felt the additional CPU time to fetch each instance's 4*4 matrix and then pack it into a buffer object was too much, so I ended up sending all instance data to the GPU.

    Ultimately I ended up using a TBO instead. Texture buffer objects can hold massive amounts of (static) instance data which never needs changing. Each frame I only need to dynamically update a separate integer texture buffer object containing the visible gl_InstanceIDs based on some culling method as before, or just send all of them, knowing that at least the amount of data being uploaded would be smaller than the entire 4*4 transform matrix per instance in the case of instanced arrays.

    I'll be interested to hear your views on that...


    That was then, and now I have a slightly different case. I have no CPU culling available and many times the number of instances to draw.
    My concern with instanced arrays is as before: sending all the instance attribute data, but letting the GPU cull this time. However, sending 50,000+ sets of instance data just to ultimately discard most of them is not very efficient. So I'm looking to a TBO as the source of the instance data, with a single large static TBO holding the transform data and an integer TBO holding the list of visible instance IDs.

    In your articles which you point to, you said you performed visibility testing in the Vertex shader. I've just looked at your cull.vert shader and can see that you construct a BBOX from the MVP and instance position. That seems like a lot of extra matrix multiplies and if my models contain between 1000-2000+ triangles, that's a lot of vertex work to perform. Is there some advantage over doing the same in the Geometry shader?

  4. #4
    Advanced Member Frequent Contributor

    Quote Originally Posted by BionicBytes
    Regarding instanced arrays. I felt these were much more cumbersome to implement than I had originally thought. Two issues with this - a 4*4 matrix is required in my case and this means 4 additional vertex attribute streams to set up. My models already have their own vertex attribute streams so this is over and above the usual vertex, normal, tangent and texture.
    Well, that's a good point, but if the total number of vertex attributes is below 16 then I wouldn't worry about it. Also, you can use quaternions and thereby reduce the 4x4 floats to 4x2 floats (for example, one vec4 for the rotation quaternion and one for the translation).
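
    To make the 4x2-float idea concrete: a rigid transform (rotation plus translation, optionally with a uniform scale) fits in two vec4 attributes. The sketch below, written in C so it can be run on its own, shows one possible layout and the scale-rotate-translate step a vertex shader would then perform. The struct and function names are hypothetical, and it assumes the instance matrix contains no shear or non-uniform scale.

```c
#include <assert.h>

/* Hypothetical packed instance record: 8 floats instead of a 16-float mat4. */
typedef struct {
    float q[4];   /* unit quaternion (x, y, z, w) */
    float t[4];   /* translation (x, y, z); uniform scale in t[3] */
} PackedInstance;

/* Rotate v by the unit quaternion q using
   v' = v + 2*w*(u x v) + 2*(u x (u x v)), where u = q.xyz and w = q.w. */
void quat_rotate(const float q[4], const float v[3], float out[3])
{
    float len2 = q[0]*q[0] + q[1]*q[1] + q[2]*q[2] + q[3]*q[3];
    assert(len2 > 0.999f && len2 < 1.001f);             /* must be normalised */

    float c1[3] = { q[1]*v[2] - q[2]*v[1],
                    q[2]*v[0] - q[0]*v[2],
                    q[0]*v[1] - q[1]*v[0] };            /* u x v */
    float c2[3] = { q[1]*c1[2] - q[2]*c1[1],
                    q[2]*c1[0] - q[0]*c1[2],
                    q[0]*c1[1] - q[1]*c1[0] };          /* u x (u x v) */
    for (int i = 0; i < 3; ++i)
        out[i] = v[i] + 2.0f * (q[3] * c1[i] + c2[i]);
}

/* Object space -> world space: scale, rotate, then translate. */
void instance_transform(const PackedInstance *inst, const float v[3], float out[3])
{
    float scaled[3] = { v[0] * inst->t[3], v[1] * inst->t[3], v[2] * inst->t[3] };
    quat_rotate(inst->q, scaled, out);
    for (int i = 0; i < 3; ++i)
        out[i] += inst->t[i];
}
```

    With this layout the instance data needs two attribute slots rather than four, at the price of a few extra multiply-adds per vertex.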

    Quote Originally Posted by BionicBytes
    Although my engine supplied an integer list of all the instances to draw I felt the additional CPU time to fetch each instance 4*4 matrix and then pack into a buffer object was too much, so I ended up sending all instance data to the GPU.
    I think you are not doing something correctly here. When culling, you should emit the actual instance data that was fed to the culling pass. This way you avoid the additional indirection needed to load the instance data index first. Of course, the amount of data emitted by the culling pass increases this way, but culling is performed on a per-object basis so you shouldn't worry about that, whereas the indirection adds some cost to every vertex shader invocation during the actual rendering, and that can be a visible performance hit.
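
    The difference between the two approaches can be sketched on the CPU. In the C below, `cull_emit_ids` mimics a culling pass that writes only instance indices (the render pass must then look each one up), while `cull_emit_data` mimics a pass that writes the surviving instance records themselves, so the render pass can consume them directly. Both functions and the `Instance` struct are hypothetical stand-ins for what transform feedback would do on the GPU, and the visibility flags are simply passed in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-instance record; a real one might hold a full transform. */
typedef struct { float pos[3]; float radius; } Instance;

/* Variant 1: the cull pass emits indices only. The render pass must then
   fetch instances[ids[i]] itself, which is the extra indirection. */
size_t cull_emit_ids(const unsigned char *visible, size_t count, unsigned *ids_out)
{
    size_t written = 0;
    assert(visible != NULL && ids_out != NULL);
    for (size_t i = 0; i < count; ++i)
        if (visible[i])
            ids_out[written++] = (unsigned)i;
    return written;
}

/* Variant 2: the cull pass emits the surviving records themselves, so the
   output can be read straight through with no lookup. */
size_t cull_emit_data(const Instance *instances, const unsigned char *visible,
                      size_t count, Instance *out)
{
    size_t written = 0;
    assert(instances != NULL && visible != NULL && out != NULL);
    for (size_t i = 0; i < count; ++i)
        if (visible[i])
            out[written++] = instances[i];
    return written;   /* plays the role of the primitives-written count */
}
```

    Variant 2 writes more bytes per visible object, but it does so once per object, whereas the lookup in variant 1 is paid again for every vertex of every visible instance.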

    Quote Originally Posted by BionicBytes
    In your articles which you point to, you said you performed visibility testing in the Vertex shader. I've just looked at your cull.vert shader and can see that you construct a BBOX from the MVP and instance position. That seems like a lot of extra matrix multiplies and if my models contain between 1000-2000+ triangles, that's a lot of vertex work to perform. Is there some advantage over doing the same in the Geometry shader?
    Well, you misunderstood something here as well. That vertex shader is the one executed before the geometry shader that culls the instances, so it is executed only once per instance, not for all the vertices of the actual objects; it doesn't matter how complex the actual scene geometry is.

    Okay, the BBOX MV multiplication is a bit costly, but don't forget that this is done per object, not per vertex of the geometry. If you wish, you can of course use a sphere as the bounding volume, and then you don't need that many multiplications. However, as a general rule, don't try to optimize the culling shader; rather, make it even more sophisticated. The cost of the culling pass is so small that optimizing it won't change the overall rendering performance, but by simplifying the culling you may end up with more potentially visible objects and thus actually decrease overall rendering performance.
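
    A box test of the kind described here can be sketched as follows: transform the eight corners of the object-space bounding box by the combined model-view-projection matrix and reject the instance only if all eight corners fall outside the same clip plane. This is C rather than GLSL so that it runs on its own; the column-major layout matches OpenGL, but the function names and the `center`/`extent` parameters (box centre and half-size) are assumptions of the sketch, not code from the demos.

```c
#include <assert.h>
#include <stdbool.h>

/* m is a column-major 4x4 matrix (OpenGL convention): out = m * (x, y, z, 1). */
void transform_point(const float m[16], float x, float y, float z, float out[4])
{
    for (int r = 0; r < 4; ++r)
        out[r] = m[r] * x + m[4 + r] * y + m[8 + r] * z + m[12 + r];
}

/* Conservative box-vs-frustum test in clip space.
   center/extent describe the box in object space; mvp maps object to clip space. */
bool box_visible(const float mvp[16], const float center[3], const float extent[3])
{
    int outside[6] = { 0, 0, 0, 0, 0, 0 };   /* per-plane count of corners outside */
    assert(extent[0] >= 0.0f && extent[1] >= 0.0f && extent[2] >= 0.0f);

    for (int i = 0; i < 8; ++i) {
        float c[4];
        transform_point(mvp,
                        center[0] + ((i & 1) ? extent[0] : -extent[0]),
                        center[1] + ((i & 2) ? extent[1] : -extent[1]),
                        center[2] + ((i & 4) ? extent[2] : -extent[2]), c);
        if (c[0] < -c[3]) outside[0]++;   /* left   */
        if (c[0] >  c[3]) outside[1]++;   /* right  */
        if (c[1] < -c[3]) outside[2]++;   /* bottom */
        if (c[1] >  c[3]) outside[3]++;   /* top    */
        if (c[2] < -c[3]) outside[4]++;   /* near   */
        if (c[2] >  c[3]) outside[5]++;   /* far    */
    }
    for (int p = 0; p < 6; ++p)
        if (outside[p] == 8)
            return false;                 /* every corner is beyond this plane */
    return true;
}
```

    That is eight matrix-vector products per instance, which is the cost being weighed above against a single centre transform for a bounding sphere.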

    The original reason why I performed the actual culling in the vertex shader was that there was a bug in the geometry shader compiler of the AMD drivers, and this was the only workaround. However, in practice you should most probably do the same, because on early geometry shader implementations (most DX10-class hardware, especially NVIDIA's) geometry shaders couldn't be executed in parallel, so the more complex the geometry shader is, the more it becomes the bottleneck of the pipeline. But as I said, the culling pass will not determine the overall performance, so do it in whatever way best fits your design.

  5. #5
    BionicBytes (Senior Member, OpenGL Pro)

    When culling, you should emit the actual instance data that was fed to the culling pass. This way you can avoid the additional indirection that is needed to load the instance data index first.
    Yes, agreed; that may save all the messing about with fetching instance data and packing it into a buffer and/or creating an index list.

    Okay, the BBOX MV multiplication is a bit costly but don't forget that this is done per object
    Yes, I thought about this after I had posted. I forgot I'd be sending GL_POINTS and not the actual geometry, so the vertex shader with the culling code will only be executed once per instance.
    As a side note, if the TBO contains all the instance data when I render the cull pass with GL_POINTS, I don't have a vertex array containing the position - that's all in the TBO. Is there an easy way to reuse the TBO for the points rendering or do I have to create another buffer object containing just the point positions?
    Perhaps I could re-use the TBO as the source for the vertex array and specify a stride to offset the fact that I'm sending in 4*4 matrix data?

  6. #6
    Advanced Member Frequent Contributor

    Quote Originally Posted by BionicBytes
    As a side note, if the TBO contains all the instance data when I render the cull pass with GL_POINTS, I don't have a vertex array containing the position - that's all in the TBO. Is there an easy way to reuse the TBO for the points rendering or do I have to create another buffer object containing just the point positions?
    Perhaps I could re-use the TBO as the source for the vertex array and specify a stride to offset the fact that I'm sending in 4*4 matrix data?
    Exactly as you say. I did it the very same way in the Nature and Mountains demos. You can always use the same buffer object as both a VBO and a TBO; only the vertex attribute specification has to be correct, as you said.

    Actually, you'll have to make vertex attributes out of your instance data anyway (or at least out of part of it, unless you do attribute-less rendering), so instanced arrays are trivial to use in the very same fashion.
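
    Concretely, if each instance record in the shared buffer is a column-major 4x4 float matrix whose translation sits in the fourth column (an assumption; the thread does not spell out the layout), the point position for the cull pass starts 12 floats into each record and the stride is the full 16 floats. The C sketch below computes those two numbers and walks a raw buffer the way the vertex fetch would. The `glVertexAttribPointer` call appears only as a comment because it needs a live GL context, and `POSITION_ATTRIB` there is a placeholder attribute location.

```c
#include <assert.h>
#include <stddef.h>

enum {
    FLOATS_PER_MATRIX = 16,
    TRANSLATION_FIRST_FLOAT = 12     /* column 3, row 0 in column-major order */
};

/* Byte stride between consecutive instance records. */
size_t instance_stride(void)  { return FLOATS_PER_MATRIX * sizeof(float); }

/* Byte offset of the translation (x, y, z) inside one record. */
size_t position_offset(void)  { return TRANSLATION_FIRST_FLOAT * sizeof(float); }

/* What the vertex fetch does for point i: read 3 floats at offset + i * stride.
   With the buffer bound to GL_ARRAY_BUFFER the equivalent setup would be roughly:
     glVertexAttribPointer(POSITION_ATTRIB, 3, GL_FLOAT, GL_FALSE,
                           (GLsizei)instance_stride(),
                           (const void *)position_offset());                  */
const float *fetch_position(const void *buffer, size_t i)
{
    assert(buffer != NULL);
    const unsigned char *bytes = (const unsigned char *)buffer;
    return (const float *)(bytes + position_offset() + i * instance_stride());
}
```

    The same buffer can stay attached to the buffer texture for the render pass; only the attribute pointer setup differs between the two uses.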

  7. #7
    BionicBytes (Senior Member, OpenGL Pro)

    Given the following VS and GS snippets, how can I make the GS emit more instance data than it receives as input?
    In other words, I input a vec4 as instance data and would like to emit 2 * vec3 as output. How do I do that?

    VS snippet:
    Code :
    #version 330 compatibility
     
    uniform vec3 objectextent;		//BBOX extents of model
    uniform vec3 origin;			//to add to every instances' position; eg Sun's position
    uniform mat4 modelmatrix;		//model scale
    uniform mat4 cameraviewmatrix;
     
     
    in vec4 instanceposition;			// X=orbital angle (radians); A=orbital distance
    flat out vec4 emit_InstancePosition;		//pass through to Geometry shader
    flat out int objectVisible;			//geometry shader cull flag
     
    void main()

    GS snippet:
    Code :
    #version 330 compatibility
     
    layout(points) in;
    layout(points, max_vertices = 1) out;
     
     
    flat in vec4 emit_InstancePosition[1];
    flat in int objectVisible[1];
     
    out vec4 vertex_out;	//emit data to buffer object
     
    void main()

  8. #8
    Senior Member OpenGL Guru
    Join Date
    May 2009
    Posts
    4,948

    The same way you have your vertex shader emit more than one value. Or your fragment shader emit more than one value. You declare multiple output values.

    Now, if you're doing transform feedback, you need to associate those outputs with buffers using the TF mechanisms.
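
    To make both points concrete, here is a hedged sketch of a culling geometry shader with two outputs, kept as a C string the way host code would hand it to `glShaderSource()`. The input names follow the snippets posted earlier in the thread; the output names (`cull_out0`, `cull_out1`), the choice of what goes into each output, and the helper `varyings_declared` are invented for illustration and are not the original poster's shader.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical geometry shader source. A point is emitted only when the
   vertex shader flagged the instance as visible. */
const char *cull_gs_source =
    "#version 330 compatibility\n"
    "layout(points) in;\n"
    "layout(points, max_vertices = 1) out;\n"
    "\n"
    "flat in vec4 emit_InstancePosition[1];\n"
    "flat in int objectVisible[1];\n"
    "\n"
    "out vec3 cull_out0;   // first captured value\n"
    "out vec3 cull_out1;   // second captured value\n"
    "\n"
    "void main()\n"
    "{\n"
    "    if (objectVisible[0] != 0) {\n"
    "        cull_out0 = emit_InstancePosition[0].xyz;\n"
    "        cull_out1 = vec3(emit_InstancePosition[0].w, 0.0, 0.0);\n"
    "        EmitVertex();      // culled points emit nothing at all\n"
    "        EndPrimitive();\n"
    "    }\n"
    "}\n";

/* Names to pass to glTransformFeedbackVaryings() before linking. */
const char *cull_varyings[2] = { "cull_out0", "cull_out1" };

/* Sanity check: every captured varying should appear in the shader text. */
int varyings_declared(const char *source, const char **names, int count)
{
    assert(source != NULL && names != NULL);
    for (int i = 0; i < count; ++i)
        if (strstr(source, names[i]) == NULL)
            return 0;
    return 1;
}
```

    The string check is only a cheap guard against typos; whether the names really match is decided by the driver when the program is linked.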

  9. #9
    BionicBytes (Senior Member, OpenGL Pro)

    Right, I am using transform feedback.
    If I declare two out variables in the geometry shader, how do these get written to the buffer object? Do they get written sequentially into one buffer or into separate buffers? When I compiled the GS I specified separate buffers, but now I'm thinking I should be using interleaved mode instead, as I want to pack both out variables as two consecutive vec3s.

  10. #10
    Ilian Dinev (Senior Member, OpenGL Pro)
    Join Date
    Jan 2008
    Location
    Watford, UK
    Posts
    1,290

    http://www.opengl.org/sdk/docs/man3/...ckVaryings.xml
    It's quite easy.
    Code :
    // geometry shader
    out vec3 x1;
    out vec3 x2;
    ....

    Code :
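    // Select the geometry shader outputs to capture; this must happen before linking.
    const char* vars[2] = {"x1", "x2"};
    glTransformFeedbackVaryings(prog, 2, vars, GL_INTERLEAVED_ATTRIBS);
    glLinkProgram(prog);
     
    ...
    // Capture into "mybuffer" through transform feedback binding point 0.
    glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, mybuffer);
    glBeginTransformFeedback(GL_POINTS);
    glDrawArrays(....); // bind some VBO beforehand for these
    glEndTransformFeedback();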

    Done; "mybuffer" now holds your data. With a query (GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN), you can get the number of points written.
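
    With GL_INTERLEAVED_ATTRIBS and two vec3 outputs, each captured point occupies six consecutive floats in the one buffer, x1 followed by x2; with GL_SEPARATE_ATTRIBS each output would instead go to its own buffer binding. A hedged sketch of interpreting the interleaved data once it has been read back on the CPU follows. How the floats get there (mapping the buffer, `glGetBufferSubData`) is left out, and `CapturedPoint` and `view_captured` are names made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* One captured point when "x1" and "x2" (both vec3) are recorded with
   GL_INTERLEAVED_ATTRIBS: the components are tightly packed, x1 first. */
typedef struct {
    float x1[3];
    float x2[3];
} CapturedPoint;

/* View `float_count` floats read back from the feedback buffer as records.
   Returns the number of complete records; *records points into `data`. */
size_t view_captured(const float *data, size_t float_count,
                     const CapturedPoint **records)
{
    assert(data != NULL && records != NULL);
    /* Guard the assumption that the struct has no padding. */
    assert(sizeof(CapturedPoint) == 6 * sizeof(float));
    *records = (const CapturedPoint *)data;
    return float_count / 6;
}
```

    In practice the record count would come from the primitives-written query rather than from the buffer size, since the buffer is normally allocated for the worst case of every instance being visible.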
