Part of the Khronos Group
OpenGL.org

Thread: Official feedback on OpenGL 3.1 thread

  1. #111
    Advanced Member Frequent Contributor
    Join Date
    May 2001
    Posts
    566

    Re: Official feedback on OpenGL 3.1 thread

    Why the hell do I have 2 stars now instead of 3? Because I told the truth? Or because someone did not like my opinions and strong points? I thought these forums were open to ideas, and "open" as long as we stay polite and within the boundaries of interpersonal communication. Whatever...

  2. #112
    Senior Member OpenGL Guru
    Join Date
    Dec 2000
    Location
    Reutlingen, Germany
    Posts
    2,042

    Re: Official feedback on OpenGL 3.1 thread

    How would such an instance shader work? Just curious how the GPU should be able to decide to render an instance or not (occlusion query?)
    GLIM - Immediate Mode Emulation for GL3

  3. #113
    Intern Newbie
    Join Date
    Jun 2008
    Location
    Russia, Moscow
    Posts
    40

    Re: Official feedback on OpenGL 3.1 thread

    Quote Originally Posted by Jan
    How would such an instance shader work? Just curious how the GPU should be able to decide to render an instance or not (occlusion query?)
    You should read about NV_conditional_render...

  4. #114
    Senior Member OpenGL Guru
    Join Date
    Dec 2000
    Location
    Reutlingen, Germany
    Posts
    2,042

    Re: Official feedback on OpenGL 3.1 thread

    I know conditional render and use it myself. The thing is, if i render 100 instances, i can do an occlusion query to reject ALL of them. I cannot, however, somehow reject only single instances, because when i render them instanced, there is no way to do separate occlusion queries for each instance. It's always an all-or-nothing decision.

    I agree, that such a feature would be interesting and useful, but having dealt with that problem myself, i wonder how EXACTLY such a feature should work.

    Jan.

  5. #115
    Intern Newbie
    Join Date
    Jun 2008
    Location
    Russia, Moscow
    Posts
    40

    Re: Official feedback on OpenGL 3.1 thread

    Quote Originally Posted by Jan
    I know conditional render and use it myself. The thing is, if i render 100 instances, i can do an occlusion query to reject ALL of them.
    Does DX10 support separate occlusion queries per instance when instancing?

  6. #116
    Intern Contributor
    Join Date
    Aug 2004
    Posts
    52

    Re: Official feedback on OpenGL 3.1 thread

    3.1 is a nice release... and the spec is cleaner than I expected.

    Just one thing: clean headers, please. I have been waiting for them since 3.0 forward-compatible contexts.

    Until then, staying with D3D9.

  7. #117
    Senior Member OpenGL Guru
    Join Date
    Dec 2000
    Location
    Reutlingen, Germany
    Posts
    2,042

    Re: Official feedback on OpenGL 3.1 thread

    Quote Originally Posted by innuendo
    Quote Originally Posted by Jan
    I know conditional render and use it myself. The thing is, if i render 100 instances, i can do an occlusion query to reject ALL of them.
    Does DX10 support separate occlusion queries per instance when instancing?
    No, OQ and conditional render support is identical on OpenGL and D3D.

  8. #118
    Member Regular Contributor
    Join Date
    Oct 2006
    Posts
    353

    Re: Official feedback on OpenGL 3.1 thread

    Quote Originally Posted by EvilOne
    3.1 is a nice release... and the spec is cleaner than I expected.

    Just one thing: clean headers, please. I have been waiting for them since 3.0 forward-compatible contexts.

    Until then, staying with D3D9.
    Can you define clean? Because I *might* be able to do that.
    [The Open Toolkit library: C# OpenGL 4.4, OpenGL ES 3.1, OpenAL 1.1 for Mono/.Net]

  9. #119
    Junior Member Regular Contributor
    Join Date
    Aug 2007
    Posts
    121

    Re: Official feedback on OpenGL 3.1 thread

    Quote Originally Posted by Jan
    I know conditional render and use it myself. The thing is, if i render 100 instances, i can do an occlusion query to reject ALL of them. I cannot, however, somehow reject only single instances, because when i render them instanced, there is no way to do separate occlusion queries for each instance. It's always an all-or-nothing decision.

    I agree, that such a feature would be interesting and useful, but having dealt with that problem myself, i wonder how EXACTLY such a feature should work.
    Conditional rendering relies on occlusion queries and could work pretty well. This could be implemented as one occlusion query per instance on its bounding sphere. It could also be coupled with a test of the bounding sphere against the clip planes before the actual occlusion query, to skip the rasterization stage for instances that are outside the frustum anyway. I think only complex meshes would benefit from this, though, since it could be faster to draw simple meshes instanced directly and let them go through normal frustum clipping and early-Z tests.

    After writing all this however, I wonder: why not do coarse culling on the CPU and send the visible instance positions directly to the shader?
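
    The CPU-side alternative mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Vec3` and `Plane` types, the function names and the plane convention (unit normals pointing into the frustum) are hypothetical and not taken from any library, and every instance is assumed to share one bounding-sphere radius.

```cpp
#include <array>
#include <vector>

struct Vec3 { float x, y, z; };

// Plane in the form dot(n, p) + d = 0; n is assumed to be a unit normal
// pointing into the frustum (hypothetical convention for this sketch).
struct Plane { Vec3 n; float d; };

// Signed distance from point p to the plane (positive on the inside).
inline float signedDistance(const Plane& pl, const Vec3& p) {
    return pl.n.x * p.x + pl.n.y * p.y + pl.n.z * p.z + pl.d;
}

// True if the sphere lies completely on the outer side of at least one plane.
inline bool sphereOutsideFrustum(const std::array<Plane, 6>& frustum,
                                 const Vec3& center, float radius) {
    for (const Plane& pl : frustum) {
        if (signedDistance(pl, center) < -radius) return true;
    }
    return false;
}

// Returns the positions of the instances that survive the coarse test,
// ready to be uploaded as per-instance data for an instanced draw call.
inline std::vector<Vec3> cullInstances(const std::array<Plane, 6>& frustum,
                                       const std::vector<Vec3>& positions,
                                       float radius) {
    std::vector<Vec3> visible;
    visible.reserve(positions.size());
    for (const Vec3& p : positions) {
        if (!sphereOutsideFrustum(frustum, p, radius)) visible.push_back(p);
    }
    return visible;
}

// Axis-aligned box "frustum" [-h, h]^3, only meant for trying the code out.
inline std::array<Plane, 6> makeBoxFrustum(float h) {
    return {{
        {{ 1, 0, 0}, h}, {{-1, 0, 0}, h},
        {{ 0, 1, 0}, h}, {{ 0,-1, 0}, h},
        {{ 0, 0, 1}, h}, {{ 0, 0,-1}, h},
    }};
}
```

    The size of the returned vector becomes the instance count of the draw call. Note that this only rejects instances outside the frustum; it does nothing about instances hidden behind other geometry.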

  10. #120
    Senior Member OpenGL Pro Ilian Dinev
    Join Date
    Jan 2008
    Location
    Watford, UK
    Posts
    1,290

    Re: Official feedback on OpenGL 3.1 thread

    Quote Originally Posted by Jan
    I agree, that such a feature would be interesting and useful, but having dealt with that problem myself, i wonder how EXACTLY such a feature should work.
    Hmm. Scene graphs and their natively sequential traversal can be troubling. But if the cpu makes traversal linear and parallelisable (quite possible), considering what gpus can/should do, here's an idea on how an Instance Shader can realistically (imho) look in a next-gen gpu:

    Code :
    // this instance-shader is called once per instance
    // all of these uniforms below are user-specified, not expected by GL
    // added tokens: gl_OcclusionBBMin and gl_OcclusionBBMax
     
    uniform mat4  uniMVP; // matrix projection-view, or projection-view-world (in case of portals, clustering)
    uniform vec4  uniFrustumPlanes[6];
    uniform float uniBoundingSphereRadius;
     
     
    bindable uniform vec3 buniInstancePosition[]; // element at index gl_InstanceID is used here
    bindable uniform mat3 buniInstanceRotation[]; // element at index gl_InstanceID is used here
     
     
    uniform vec3 uniBoundingVolumeVerts[3*12]; // a convex box in this case. Could be something more obscure. Could be dependent on gl_InstanceID. 
     
    // prototypes: GLSL requires a function to be declared before it is called
    bool m_ClipSphereByFrustrumPlanes(in vec4 pos);
    mat4 m_Make4x4FromPosAndRot(in vec4 pos,in mat3 rot);
     
     
     
     
    void main(){
    	// untransformed position of this instance; w = 1.0 so that a mat4 can transform it.
    	// uniMVP is applied once, in nodeTransform below.
    	vec4 pos = vec4(buniInstancePosition[gl_InstanceID], 1.0);
     
    	if(m_ClipSphereByFrustrumPlanes(pos)){
    		clip();return;
    	}
     
    	mat3 rot = buniInstanceRotation[gl_InstanceID];
    	mat4 nodeTransform = uniMVP * m_Make4x4FromPosAndRot(pos,rot);
     
    	vec4 minXYZW = vec4(1.e+5,1.e+5,1.e+5,1.e+5);
    	vec4 maxXYZW = vec4(-1.e+5,-1.e+5,-1.e+5,-1.e+5);
     
    	//------[ secondary rough occlusion test via a lowest-poly mesh ]--------[
    	// a box, consisting of 12 triangles is used here, and 12 can be the 
    	// imposed maximum count of triangles to test occlusion with.
    	// Uses ZCULL and optionally EarlyZ
    	// (ZCULL being roughest, fastest z-culling test,
    	//  EarlyZ being fast but less rough z-culling test)
     
     
    	for(int tri=0;tri<12;tri++){
    		for(int v=0;v<3;v++){
    			vec4 vpos = nodeTransform * vec4(uniBoundingVolumeVerts[tri*3+v], 1.0);
    			gl_Position = vpos;
    			minXYZW = min(minXYZW,vpos);
    			maxXYZW = max(maxXYZW,vpos);
    			EmitVertex();
    		}
    		EndPrimitive();
    	}
    	//----------------------------------------------------------------------/
     
    	//----[ primary, roughest occlusion test via a screen-aligned quad ]---------[
    	// uses only ZCULL. If it doesn't pass ZCULL, the secondary test is skipped. 
     
    	gl_OcclusionBBMin = minXYZW;
    	gl_OcclusionBBMax = maxXYZW;
    	//---------------------------------------------------------------------------/
     
     
     
    }
     
     
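    bool m_ClipSphereByFrustrumPlanes(in vec4 pos){
    	// preliminary frustum culling with uniFrustumPlanes and uniBoundingSphereRadius.
    	// Assumes each plane is (normal.xyz, d) with a unit normal pointing into the frustum,
    	// given in the same space as pos. Returns true if the sphere is completely outside.
    	for(int i=0;i<6;i++){
    		if(dot(uniFrustumPlanes[i].xyz,pos.xyz) + uniFrustumPlanes[i].w < -uniBoundingSphereRadius) return true;
    	}
    	return false;
    }
     
    mat4 m_Make4x4FromPosAndRot(in vec4 pos,in mat3 rot){
    	// rotation in the upper-left 3x3, translation in the last column (GLSL matrices are column-major)
    	return mat4(vec4(rot[0],0.0), vec4(rot[1],0.0), vec4(rot[2],0.0), vec4(pos.xyz,1.0));
    }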
    Notes:
    The triangles from the secondary rough-occlusion test do not modify the z-buffer, color buffers or stencil buffer. Those triangles are not further transformed by the currently-bound vertex shader, and do not use the currently-bound fragment shader. The result from the shader is a single bool (stored internally in a bit, byte or int). The shader is executed before drawing the mesh-instance, or preemptively executed for several mesh-instances. The latter version improves usage of parallelism, but can give false positives (e.g. instance 4 occludes instance 7, but #7 is still regarded as visible, because the visibility of instances 0..10 was computed in one batch).

    Further improvement:
    Addition of "int gl_IBO_FirstIndex=0", "gl_IBO_Length" and "int gl_VBO_FirstIndex=0", to specify what range of the VBO and IBO (index buffer) this mesh-instance should use.
    This can be used to let the shader select a LOD version of a model, or use a different model altogether (but still with the same bound shaders, render-states and render-targets).
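
    To make the range idea a bit more concrete, here is a small CPU-side sketch of the kind of decision the instance shader would make when it picks a LOD. Everything in it is hypothetical: the `LodRange` struct merely mirrors the proposed gl_IBO_FirstIndex / gl_IBO_Length tokens, and choosing by camera distance is just one possible criterion.

```cpp
#include <cstddef>
#include <vector>

// One level of detail: a sub-range of a shared index buffer.
// Hypothetical names, mirroring the proposed gl_IBO_FirstIndex / gl_IBO_Length.
struct LodRange {
    std::size_t firstIndex;
    std::size_t indexCount;
    float maxDistance; // use this LOD up to this camera distance
};

// Picks the first LOD whose maxDistance covers `distance` and falls back to
// the coarsest one. `lods` must be non-empty and sorted by ascending maxDistance.
inline const LodRange& selectLod(const std::vector<LodRange>& lods, float distance) {
    for (const LodRange& lod : lods) {
        if (distance <= lod.maxDistance) return lod;
    }
    return lods.back();
}
```

    All LODs live in the same buffers, so switching between them changes only the range that is drawn, not the bound shaders, render-states or render-targets.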

    Further optional improvement:
    Have the gpu write results from the instance-shader to a byte-buffer-object created by the user. That buffer is initially reset to "true" for all instanceIDs, and is required to hold at least NumInstances bytes. If an object is occluded (as decided roughly by the instance-shader and its querying of ZCULL via those triangles), then gl_InstanceVisible[gl_InstanceID]=false; is set. The user can then use glMapBufferRange to retrieve the occlusion info.
    This can be used as feedback on which instances were drawn, and to do cpu-side computation regarding the result.
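
    On the CPU side, consuming such a buffer could look roughly like this. It is a sketch only: the `std::vector` stands in for the memory returned by glMapBufferRange, and the function name is made up for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Collects the IDs of all instances whose visibility byte is still non-zero.
// `visibility` stands in for the mapped contents of the byte buffer:
// one byte per instance, 1 = visible (the reset value), 0 = occluded.
inline std::vector<std::uint32_t> visibleInstanceIds(
        const std::vector<std::uint8_t>& visibility) {
    std::vector<std::uint32_t> ids;
    for (std::size_t i = 0; i < visibility.size(); ++i) {
        if (visibility[i] != 0) ids.push_back(static_cast<std::uint32_t>(i));
    }
    return ids;
}
```

    Keep in mind that mapping the buffer right after the draw would force the CPU to wait for the GPU, so in practice the result would probably be read a frame or so later.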

    Btw, see http://www.delphi3d.net/forums/viewtopic.php?t=183 : 13 million unique triangles in a very complex mesh, running at 200fps even on a GF8600GT. Sure, it's just one material, but awesome results nonetheless. Other than that, useful techniques are portals with hidden encapsulated volumes (i.e. approximations of big occluders such as walls as quads) drawn before everything else, and using many octrees to group stuff, with the occlusion results deciding which instances you put in a rough bucket-sort (PSX OrderingTable-style) by material.
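
    The rough bucket-sort by material could be sketched like this. The types and names are hypothetical, and material IDs are assumed to be small dense indices so that they can address a bucket directly.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct VisibleInstance {
    std::uint32_t instanceId;
    std::uint32_t materialId; // assumed to be a small dense index
};

// Distributes visible instances into one bucket per material, preserving the
// input order inside each bucket. Instances with an out-of-range material ID
// are skipped. No comparisons are needed: each instance is appended once.
inline std::vector<std::vector<std::uint32_t>> bucketByMaterial(
        const std::vector<VisibleInstance>& visible, std::size_t materialCount) {
    std::vector<std::vector<std::uint32_t>> buckets(materialCount);
    for (const VisibleInstance& v : visible) {
        if (v.materialId < materialCount) buckets[v.materialId].push_back(v.instanceId);
    }
    return buckets;
}
```

    Drawing bucket by bucket then means each material has to be bound only once per frame.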
