
Thread: Performance of FBOs in various situations.

  1. #1
    Intern Newbie
    Join Date
    Feb 2008
    Posts
    45

    Performance of FBOs in various situations.

    Hi! I remember reading in some old ATI or nVidia papers that FBOs should be created as early as possible in an application for maximum performance.
    Is this still true today? If I want to create a large FBO for a shadowmap after having loaded a few resources (buffer objects, textures, etc.), does it still hold?

    To add a bit more depth to the discussion: I am familiar with how ATI and nVidia hardware works, and I know that framebuffers are allocated in a video memory region called "Tiled Memory", which ensures that reading and writing pixels linearly is more cache friendly. I know textures are not stored there because they are instead swizzled on upload, and uploading buffer objects or shaders to tiled memory doesn't make any sense. Is this why papers recommend creating FBOs as early as possible? Is there any other reason? Or is it just not important or necessary nowadays?

    Thanks!

  2. #2
    Advanced Member Frequent Contributor
    Join Date
    Dec 2007
    Location
    Hungary
    Posts
    941
    There is no such thing as a "large FBO", as FBOs only hold state, not actual resources. The textures and renderbuffers you use as FBO attachments are the resources that can be "large". It is probably still preferable to allocate the textures and renderbuffers that you plan to use as FBO attachments as early as possible, since they are then more likely to fit in video memory rather than in GPU-addressable system memory. However, as long as you don't overrun your video memory budget, that shouldn't be an issue.

    Also, there is no such concept as "tiled memory". Textures and renderbuffers do have a tiled internal structure, i.e. the texels are not stored linearly but in a swizzled/tiled layout. However, that has no connection with the actual memory location of the resource itself, so they can reside in either video memory or system memory. Buffers and 1D textures do, of course, have a linear layout, but every other texture and renderbuffer will most likely use a different (tiled) layout.

    So to sum it up:
    - It is not FBO creation that matters but resource creation (i.e. texture and/or renderbuffer creation).
    - FBOs hold only state data, not resource data.
    - You shouldn't worry about the memory type used for the FBO attachment creation as long as you don't overrun your video memory budget with your buffers and textures.
    - There is no such thing as "tiled memory", only "tiled layout", which is independent of the memory type used.

    What paper did you read that recommends creating FBOs as early as possible?

    I think what confused you is that earlier hardware had a limit on how many depth textures could have Hi-Z support (due to the special on-chip memory used for them). If that's the case, you shouldn't worry about it: modern GPUs handle resources in a unified way and do the Hi-Z construction (including compression and decompression) on demand.
    Disclaimer: This is my personal profile. Whatever I write here is my personal opinion and none of my statements or speculations are anyhow related to my employer and as such should not be treated as accurate or valid and in no case should those be considered to represent the opinions of my employer.
    Technical Blog: http://www.rastergrid.com/blog/

  3. #3
    Advanced Member Frequent Contributor
    Join Date
    Jan 2007
    Posts
    964
    That's most likely based on advice from the DirectX SDK which recommends the very same for render target textures (and other default pool resources), with the vendors extrapolating to OpenGL too. Yes, the thinking was to ensure that they have a higher chance of being allocated in GPU memory.

  4. #4
    Newbie Newbie
    Join Date
    Aug 2012
    Location
    Germany
    Posts
    2
    Hi there,

    I'm trying to use OpenGL for video frame processing (inside a filter for frameserving). For this purpose, I wrote the following class for an offscreen OpenGL context on Windows:

    OGLContext.h
    Code :
    #pragma once
     
    #include <GLEW/glew.h>
    #include <GLEW/wglew.h>
    #include <GL/glu.h>
    #include <string>
     
    class OGLContext
    {
    	public:
    		OGLContext(unsigned int, unsigned int, GLenum, unsigned char);
    		~OGLContext();
    		void Activate(bool);
    		void ReadPixels(GLubyte*);
    		void DrawPixels(GLubyte*);
     
    	private:
    		// Windows resources
    		std::wstring inst_name;	// Unique class name
    		HWND hwnd;
    		HDC hdc;
    		HGLRC ctx;
    		// FBO resources
    		GLuint tex_color, fbo_transfer, rbo_color, rbo_depth_stencil,  fbo_render;
    		// Context
    		unsigned int width, height;
    		GLenum colorspace;
    };
    OGLContext.cpp
    Code :
    #include "OGLContext.h"
    #include "resources.h"	// holds DLL module handle 'void *dll_module'
    #include <cstring>	// memset
     
    //Window callback
    static LRESULT CALLBACK WndProc(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam){return DefWindowProc(hwnd, msg, wParam, lParam);}
     
    // Create and activate OpenGL context
    OGLContext::OGLContext(unsigned int width, unsigned int height, GLenum colorspace, unsigned char antiAliasing) : width(width), height(height), colorspace(colorspace){
    	// Find unique instance name
    	WNDCLASSEX wcx;
    	this->inst_name = L"YoukaOffscreen00";
    	for(unsigned char i = 0; i <= 100; i++){
    		if(i == 100)
    			throw "Cannot use more than 100 instances!";
    		inst_name[14] = 48 + (i/10);
    		inst_name[15] = 48 + (i%10);
    		if(!GetClassInfoEx(reinterpret_cast<HINSTANCE>(dll_module), this->inst_name.c_str(), &wcx))
    			break;
    	}
    	// Window class
    	wcx.cbSize = sizeof(WNDCLASSEX);
    	wcx.style = CS_OWNDC;
    	wcx.lpfnWndProc = WndProc;
    	wcx.cbClsExtra = 0;
    	wcx.cbWndExtra = 0;
    	wcx.hInstance = reinterpret_cast<HINSTANCE>(dll_module);
    	wcx.hIcon = LoadIcon(NULL, IDI_APPLICATION);
    	wcx.hCursor = LoadCursor(NULL, IDC_ARROW);
    	wcx.hbrBackground = (HBRUSH)GetStockObject(BLACK_BRUSH);
    	wcx.lpszMenuName = NULL;
    	wcx.lpszClassName = this->inst_name.c_str();
    	wcx.hIconSm = LoadIcon(NULL, IDI_WINLOGO);
    	RegisterClassEx(&wcx);
    	// Create window
    	this->hwnd = CreateWindowEx(0, this->inst_name.c_str(), this->inst_name.c_str(), WS_POPUP, 0, 0, this->width, this->height, NULL, NULL, reinterpret_cast<HINSTANCE>(dll_module), 0);
    	// Get window context
    	this->hdc = GetDC(this->hwnd);
    	// Set window context pixel format
    	PIXELFORMATDESCRIPTOR pfd;
    	memset(&pfd, 0, sizeof(PIXELFORMATDESCRIPTOR));
    	pfd.nSize = sizeof(PIXELFORMATDESCRIPTOR);
    	pfd.nVersion = 1;
    	pfd.dwFlags = PFD_DRAW_TO_WINDOW | PFD_SUPPORT_OPENGL;
    	pfd.iPixelType = PFD_TYPE_RGBA;
    	pfd.cColorBits = 32;
    	pfd.cRedBits = 8;
    	pfd.cGreenBits = 8;
    	pfd.cBlueBits = 8;
    	pfd.cAlphaBits = 8;
    	pfd.cDepthBits = 24;
    	pfd.cStencilBits = 8;
    	pfd.iLayerType = PFD_MAIN_PLANE;
    	int pformat = ChoosePixelFormat(this->hdc, &pfd);
    	if(!SetPixelFormat(this->hdc, pformat, &pfd)){
    		ReleaseDC(this->hwnd, this->hdc);
    		DestroyWindow(this->hwnd);
    		UnregisterClass(this->inst_name.c_str(), reinterpret_cast<HINSTANCE>(dll_module));
    		throw "Couldn't find a fitting pixel format!";
    	}
    	// Create OGL context
    	this->ctx = wglCreateContext(this->hdc);
    	// Initialize glew for OpenGL >1.1 and check needed version & extensions
    	this->Activate(true);
    	if(glewInit() || !GLEW_VERSION_2_1 || !GLEW_ARB_framebuffer_object){
    		this->Activate(false);
    		wglDeleteContext(this->ctx);
    		ReleaseDC(this->hwnd, this->hdc);
    		DestroyWindow(this->hwnd);
    		UnregisterClass(this->inst_name.c_str(), reinterpret_cast<HINSTANCE>(dll_module));
    		throw "Couldn't initialize GLEW or OpenGL 2.1 & ARB_framebuffer_object isn't supported!";
    	}
    	// Create transfer FBO
    	glGenTextures(1, &this->tex_color);	// Color
    	glBindTexture(GL_TEXTURE_2D, this->tex_color);
    	glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    	glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    	glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    	glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
    	glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, this->width, this->height, 0, GL_RGBA, GL_UNSIGNED_BYTE, NULL);
    	glBindTexture(GL_TEXTURE_2D, 0);
    	glGenFramebuffers(1, &this->fbo_transfer);	// Attach
    	glBindFramebuffer(GL_FRAMEBUFFER, this->fbo_transfer);
    	glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, this->tex_color, 0);
    	if(glCheckFramebufferStatus(GL_FRAMEBUFFER) != GL_FRAMEBUFFER_COMPLETE){
    		glBindFramebuffer(GL_FRAMEBUFFER, 0);
    		glDeleteFramebuffers(1, &this->fbo_transfer);
    		glDeleteTextures(1, &this->tex_color);
    		this->Activate(false);
    		wglDeleteContext(this->ctx);
    		ReleaseDC(this->hwnd, this->hdc);
    		DestroyWindow(this->hwnd);
    		UnregisterClass(this->inst_name.c_str(), reinterpret_cast<HINSTANCE>(dll_module));
    		throw "Bad framebuffer status!";
    	}
    	// Create render FBO
    	glGenRenderbuffers(1, &this->rbo_color);	// Color
    	glBindRenderbuffer(GL_RENDERBUFFER, this->rbo_color);
    	glRenderbufferStorageMultisample(GL_RENDERBUFFER, antiAliasing, GL_RGBA, this->width, this->height);
    	glGenRenderbuffers(1, &this->rbo_depth_stencil);	// Depth & stencil
    	glBindRenderbuffer(GL_RENDERBUFFER, this->rbo_depth_stencil);
    	glRenderbufferStorageMultisample(GL_RENDERBUFFER, antiAliasing, GL_DEPTH_STENCIL, this->width, this->height);
    	glBindRenderbuffer(GL_RENDERBUFFER, 0);
    	glGenFramebuffers(1, &this->fbo_render);	// Attach
    	glBindFramebuffer(GL_FRAMEBUFFER, this->fbo_render);
    	glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_RENDERBUFFER, this->rbo_color);
    	glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_RENDERBUFFER, this->rbo_depth_stencil);
    	if(glCheckFramebufferStatus(GL_FRAMEBUFFER) != GL_FRAMEBUFFER_COMPLETE){
    		glBindFramebuffer(GL_FRAMEBUFFER, 0);
    		glDeleteFramebuffers(1, &this->fbo_render);
    		glDeleteRenderbuffers(1, &this->rbo_color);
    		glDeleteRenderbuffers(1, &this->rbo_depth_stencil);
    		glDeleteFramebuffers(1, &this->fbo_transfer);
    		glDeleteTextures(1, &this->tex_color);
    		this->Activate(false);
    		wglDeleteContext(this->ctx);
    		ReleaseDC(this->hwnd, this->hdc);
    		DestroyWindow(this->hwnd);
    		UnregisterClass(this->inst_name.c_str(), reinterpret_cast<HINSTANCE>(dll_module));
    		throw "Bad framebuffer status!";
    	}
    	// All done; deactivate context for now
    	this->Activate(false);
    }
     
    // Deactivate and destroy OpenGL context
    OGLContext::~OGLContext(){
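    	// Make the context current first: the constructor leaves it deactivated, and GL calls need a current context
    	this->Activate(true);
    	// Free FBOs
    	glBindFramebuffer(GL_FRAMEBUFFER, 0);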
    	glDeleteFramebuffers(1, &this->fbo_render);
    	glDeleteRenderbuffers(1, &this->rbo_color);
    	glDeleteRenderbuffers(1, &this->rbo_depth_stencil);
    	glDeleteFramebuffers(1, &this->fbo_transfer);
    	glDeleteTextures(1, &this->tex_color);
    	// Free OGL context
    	this->Activate(false);
    	wglDeleteContext(this->ctx);
    	// Free window context
    	ReleaseDC(this->hwnd, this->hdc);
    	// Free window
    	DestroyWindow(this->hwnd);
    	// Unregister window class
    	UnregisterClass(this->inst_name.c_str(), reinterpret_cast<HINSTANCE>(dll_module));
    }
     
    // (De)Activates OpenGL context for current thread
    void OGLContext::Activate(bool active){
    	if(active)
    		wglMakeCurrent(this->hdc, this->ctx);
    	else
    		wglMakeCurrent(this->hdc, NULL);
    }
     
    // Reads image from framebuffer
    void OGLContext::ReadPixels(GLubyte *image){
    	glBindFramebuffer(GL_READ_FRAMEBUFFER, this->fbo_render);
    	glBindFramebuffer(GL_DRAW_FRAMEBUFFER, this->fbo_transfer);
    	glBlitFramebuffer(0, 0, this->width, this->height, 0, 0, this->width, this->height, GL_COLOR_BUFFER_BIT, GL_NEAREST);
    	glBindFramebuffer(GL_FRAMEBUFFER, this->fbo_render);
    	glBindTexture(GL_TEXTURE_2D, this->tex_color);
    	glGetTexImage(GL_TEXTURE_2D, 0, this->colorspace, GL_UNSIGNED_BYTE, image);
    	glBindTexture(GL_TEXTURE_2D, 0);
    }
     
    // Sends image to framebuffer
    void OGLContext::DrawPixels(GLubyte *image){
    	glBindTexture(GL_TEXTURE_2D, this->tex_color);
    	glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, this->width, this->height, this->colorspace, GL_UNSIGNED_BYTE, image);
    	glBindTexture(GL_TEXTURE_2D, 0);
    	glBindFramebuffer(GL_READ_FRAMEBUFFER, this->fbo_transfer);
    	glBindFramebuffer(GL_DRAW_FRAMEBUFFER, this->fbo_render);
    	glBlitFramebuffer(0, 0, this->width, this->height, 0, 0, this->width, this->height, GL_COLOR_BUFFER_BIT, GL_NEAREST);
    	glBindFramebuffer(GL_FRAMEBUFFER, this->fbo_render);
    }

    It's important for me to render with multisampling and to get good performance. Regrettably, pixel transfer through the member functions ReadPixels and DrawPixels is extremely slow, so streaming a video at 24 frames per second stutters a lot (with nothing more than a simple pixel transfer per frame, no drawing).
    In comparison: before, I tried a single FBO without multisampling, using glReadPixels + glDrawPixels for pixel transfer. That performed much better, with no stuttering.

    I don't want to require multisampled textures, but is there an alternative with better performance?
    Last edited by Youkakun; 08-09-2012 at 02:42 PM.
