Thread: AMD Transform Feedback GPU Memory Leak

  1. #1
    Junior Member Newbie (Join Date: May 2013, Posts: 17)

    Question: AMD Transform Feedback GPU Memory Leak

    Hi everyone.

    The scenario is the following:
    I wrote a very simple example program that generates five triangles from five points with a geometry shader. I stream those five triangles into a vertex buffer via transform feedback and render them with glDrawArrays. The vertex buffer allocated for transform feedback is exactly as large as the geometry output requires, i.e. 5*3*24 bytes, where 24 bytes is two vec3s (position and color). The output is as expected: the five triangles are rendered.

    The observed problem:
    As I figured out, exactly when glDrawArrays is called DURING the transform feedback pass (I do NOT mean rendering the result), the observed GPU memory footprint of this little program grows by approx. 150 MB. I verified this with Process Explorer by putting a glFlush and glFinish after the mentioned glDrawArrays, i.e. the draw call that feeds the input vertices to the transform feedback stage(!).

    Pseudo:

    Code :
    // during initialization of program
    // transform feedback stage
     
    bind feedback
    render five points and stream out to vertex buffer   // here the memory grows by 150 MB
    unbind feedback
     
    // in runtime loop
    // render stage
     
    render result of transform feedback
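
    Roughly, the pseudo above maps to the following GL calls (just a sketch of my setup; names like tfb_vbo, points_vao and tfb_vao are placeholders, the real objects are created and filled during initialization):

    Code :
    // --- during initialization: capture the five generated triangles ---
    glEnable(GL_RASTERIZER_DISCARD);                            // optional: skip rasterization during the capture pass
    glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, tfb_vbo); // tfb_vbo sized to 5*3*24 bytes
    glBeginTransformFeedback(GL_TRIANGLES);                     // the GS emits strips of 3 vertices, i.e. triangles
    glBindVertexArray(points_vao);                              // the five input points
    glDrawArrays(GL_POINTS, 0, 5);                              // <-- the 150 MB growth shows up here
    glEndTransformFeedback();
    glDisable(GL_RASTERIZER_DISCARD);
    glFlush();
    glFinish();                                                 // force the driver to finish the work for the measurement
     
    // --- in the runtime loop: render the captured triangles ---
    glBindVertexArray(tfb_vao);                                 // VAO referencing tfb_vbo (position + color)
    glDrawArrays(GL_TRIANGLES, 0, 5 * 3);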

    When I check GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN via glGetQueryObjectuiv, I correctly get 5. When I release the transform feedback buffer along with the streamed-out vertex buffer, the 150 MB footprint doesn't disappear. Even when I release every single GL object in that simple program, the 150 MB footprint doesn't disappear.
    When I do not render with transform feedback, i.e. just render an empty buffer, the memory does not grow by 150 MB.
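
    The primitives-written check looks roughly like this (a sketch; the query object is created once with glGenQueries, and prim_query is a placeholder name):

    Code :
    glBeginQuery(GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN, prim_query);
    glBeginTransformFeedback(GL_TRIANGLES);
    glDrawArrays(GL_POINTS, 0, 5);
    glEndTransformFeedback();
    glEndQuery(GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN);
     
    GLuint written = 0;
    glGetQueryObjectuiv(prim_query, GL_QUERY_RESULT, &written);  // correctly reports 5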

    Additionally, I only observe that memory growth on my PC with an AMD Radeon HD 7700 Series card.
    When I check the same program on my other PC with an NVIDIA card (GeForce GTX 650 Ti), there is no growth in GPU memory.

    The real problem is in my production code, where I stream out approx. 350000 primitives (I assume a lot more) and the GPU memory (1 GB) runs full and wreaks havoc!
    350000 primitives * 3 (vertices per triangle) * 12 bytes (1x vec3) = 12.6 MB, right?
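
    Spelled out with the numbers from above (just the arithmetic, assuming tightly packed output):

    Code :
    // toy example: 5 triangles, two interleaved vec3 (24 bytes) per vertex
    GLsizeiptr toy_bytes  = 5      * 3 * 24;   // = 360 bytes
    // production case: ~350000 triangles, one vec3 (12 bytes) per vertex
    GLsizeiptr prod_bytes = 350000 * 3 * 12;   // = 12,600,000 bytes, i.e. ~12.6 MB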

    Has anyone observed the same problem?
    I updated to the latest drivers available today, Catalyst version 13.4.

    Thanks
    Alex

  2. #2
    Junior Member Newbie (Join Date: May 2013, Posts: 17)
    Ok, one step ahead.

    When I disable the geometry shader and just stream out two triangles via the vertex shader only, those 150 MB are not allocated. Here is the geometry shader I use that produces the 150 MB GPU memory allocation on my AMD card:
    Code :
    #version 150
     
    // can be specified here or is retrieved from programparameteri
    //layout(points) in;
    layout (triangle_strip) out;
    layout (max_vertices=3) out;
     
    in VertexData {
        vec3 color;
    } vertices_in[1];
     
    out vec3 oa_position;
    out vec3 oa_color;
     
    uniform mat4 u_mat_proj;
    uniform mat4 u_mat_view;
     
    void main()
    {
        //for( int i=0; i<1; /*gl_in.length();*/ ++i )
        int i = 0;
        {
            oa_position = vec3(gl_in[i].gl_Position);
            oa_color    = vertices_in[i].color;
            EmitVertex();
     
            oa_position = vec3(gl_in[i].gl_Position + vec4(1.0, 0.0, 0.0, 1.0));
            oa_color    = vertices_in[i].color;
            EmitVertex();
     
            oa_position = vec3(gl_in[i].gl_Position + vec4(-1.0, 1.0, 0.0, 1.0));
            oa_color    = vertices_in[i].color;
            EmitVertex();
        }
     
        EndPrimitive();
    }

    It should just produce one triangle per input primitive (point).
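
    For completeness, the program object is set up roughly like this (a sketch; program is a placeholder handle, the input primitive type is set via glProgramParameteriARB from GL_ARB_geometry_shader4 as noted in the shader comment, and the two varyings are captured interleaved):

    Code :
    const char* varyings[] = { "oa_position", "oa_color" };   // outputs of the geometry shader above
    glTransformFeedbackVaryings(program, 2, varyings, GL_INTERLEAVED_ATTRIBS);
     
    // input primitive type set via a program parameter instead of a layout qualifier
    glProgramParameteriARB(program, GL_GEOMETRY_INPUT_TYPE_ARB, GL_POINTS);
     
    glLinkProgram(program);                                   // varyings/parameters take effect at link time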

    Any idea?

    Thanks
    Alex

  3. #3
    Junior Member Newbie (Join Date: May 2013, Posts: 7)
    My guess would be that the additional GPU memory is needed by the driver for internal purposes of geometry shaders. I'd bet that if you had multiple different geometry shaders, the GPU memory usage wouldn't grow with them. The reason for not freeing that memory is probably to avoid allocating/deallocating those internal buffers every single time you use geometry shaders.

    Btw, I wouldn't consider it a memory leak if the memory is freed once you delete the context. Try deleting the context and checking the memory usage. I'd bet it will disappear.

    You cannot expect a driver not to need some GPU memory for internal purposes, for whatever reason, and you also cannot expect a feature to work exactly the same on two GPUs from different vendors.

  4. #4
    Junior Member Newbie (Join Date: May 2013, Posts: 17)
    Hi random_dude.

    Since I can specify exactly how much memory my app should use for the geometry shader and transform feedback, there is no need to allocate that much memory. Transform feedback is limited to the amount of memory you give it via the output VBO, and the geometry shader is limited by that implicitly and by the output layout specifier; there you specify how many output vertices the pipe gets per input primitive! 150 MB would equal several million vertices...
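
    To make that concrete: the feedback stage only gets the storage I hand it explicitly (a sketch; tfb_vbo is a placeholder, and the size follows from max_vertices and the input primitive count):

    Code :
    // max_vertices = 3 per input point, 5 input points, 24 bytes per captured vertex
    GLsizeiptr tfb_bytes = 5 * 3 * 24;
     
    glBindBuffer(GL_ARRAY_BUFFER, tfb_vbo);
    glBufferData(GL_ARRAY_BUFFER, tfb_bytes, NULL, GL_DYNAMIC_COPY);  // exactly the space the GS can fill
     
    // only this range is made available to transform feedback
    glBindBufferRange(GL_TRANSFORM_FEEDBACK_BUFFER, 0, tfb_vbo, 0, tfb_bytes);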

    Plus, that behavior gets worse if I use multiple transform feedback buffers. As said, the GPU memory runs full in no time.
    Last edited by noopnoop; 05-29-2013 at 08:22 AM.

  5. #5
    Junior Member Newbie (Join Date: May 2013, Posts: 7)
    Once again, I was talking about internal purposes, not about your application's data.

    If you would take the time and look up some publicly available documents, you would see that some GPUs require some internal buffers for communication between the geometry shader and previous/subsequent shader stages.

  6. #6
    Junior Member Newbie (Join Date: May 2013, Posts: 17)
    First of all, 150 MB is nothing I would consider "some internal storage", since GPU memory is a scarce resource. Plus, please reference those "publicly available documents" and specify the amount of memory required by that shader stage, instead of just claiming it so rudely.
    Second: playing around with the GL context is something I will not consider. I don't think it is very smart to invalidate or recreate a GL context just after using a geometry shader with transform feedback.

  7. #7
    Junior Member Newbie (Join Date: May 2013, Posts: 7)
    I never told you that you should recreate the context, I just told you that it's not a memory leak if the memory gets released with the context.
    And seriously, yes, 150 MB is not a small chunk of memory, but is it really that much with today's GPUs having GBs of it?
    I'm not trying to justify the use of that 150 MB of memory; what I'm saying is that it cannot be considered a memory leak if releasing the context frees it up.

  8. #8
    Junior Member Newbie (Join Date: May 2013, Posts: 17)
    Releasing the context would mean breaking the rendering loop, and I don't want to do that. Plus, think about it: how is the execution of some geometry shader/transform feedback combination related to freeing a whole context? For me, the rendering context is untouchable. It doesn't need to be released for the whole duration of the application, because there is simply no need for that. I would never consider touching glx/wgl because of pure GL calls; to me, that would be a bug. Plus, why can NVIDIA manage it without using 150 MB of GPU memory? Plus, my graphics card does not have tons of "GBs"; I only have 1 GB, and the system constantly uses 300 MB of it, leaving 700 MB of GPU memory for me and other apps!

    So please just cite those documents of yours if you would like to make an honest contribution, so maybe my questions will be answered!
    Or, if you don't want to reveal those docs, at least specify what that amount of memory would be used for "internally". All data that a GS produces goes straight to the buffer. So the buffers are the "internal" storage, and those buffers also limit the output of those stages. Read the specs for that.

    http://www.opengl.org/registry/
    and then
    GL_ARB_transform_feedback2 and
    GL_ARB_vertex_buffer_object and
    GL_ARB_geometry_shader4

  9. #9
    Senior Member OpenGL Guru (Join Date: May 2009, Posts: 4,948)
    Quote Originally Posted by random_dude View Post
    If you would take the time and look up some publicly available documents, you would see that some GPUs require some internal buffers for communication between the geometry shader and previous/subsequent shader stages.
    And if you would take the time and look up some publicly available documents, you would see that these "internal buffers" are internal to the GPU. They don't take up GPU memory and they don't take up system memory. They don't get reallocated every frame, and they don't hang around after a frame and accumulate.

    So no, this is not due to those "internal buffers" for inter-stage communication.

    Quote Originally Posted by random_dude View Post
    what I'm saying is that it cannot be considered a memory leak if releasing the context frees it up.
    That's a pointless conversation of semantics. It's clear what he means: the driver is constantly allocating more and more memory. Whether that memory is forgotten about or not, it shouldn't be constantly allocating new memory like that every frame.

    You can argue that it's technically not a "memory leak" by some arbitrary definition, but it's not helping to solve the problem.

  10. #10
    Junior Member Newbie (Join Date: May 2013, Posts: 7)
    Quote Originally Posted by Alfonse Reinheart View Post
    And if you would take the time and look up some publicly available documents, you would see that these "internal buffers" are internal to the GPU. They don't take up GPU memory and they don't take up system memory. They don't get reallocated every frame, and they don't hang around after a frame and accumulate.

    So no, this is not due to those "internal buffers" for inter-stage communication.
    You are confusing things with LDS or GDS. The communication buffer between geometry shaders and other shader stages is not on-chip memory but actual video memory.

    Quote Originally Posted by Alfonse Reinheart View Post
    That's a pointless conversation of semantics. It's clear what he means: the driver is constantly allocating more and more memory. Whether that memory is forgotten about or not, it shouldn't be constantly allocating new memory like that every frame.
    That's a pointless comment. He didn't say at all that it allocates it every frame. It allocates it only once and never allocates any more, regardless of how many frames/shaders you have.
