Thread: the fastest way to upload large textures

  1. #1
    Junior Member Newbie
    Join Date
    Mar 2008
    Posts
    4

    the fastest way to upload large textures

    Hi everyone,

    I'm working on a video editing project which uses OpenGL to render the video to screen. As the videos that we need to process are often pretty large (HD 1080p or even larger), texture upload performance becomes critical.

    Currently we are using PBOs to upload the decoded video (on every video frame: map a buffer, decode into the buffer, unmap it, then use glTexSubImage2D to complete the upload), but it's much slower and consumes many more CPU cycles than its Direct3D counterpart (which basically does the same thing, only in D3D: lock a texture surface, decode into the surface, then unlock it). While the video is running, the GL code uses 20-25% CPU time while the D3D code uses only around 15%.

    So I would like to know whether there is any way to improve the OpenGL implementation so that it is at least on par with D3D, or whether OpenGL simply cannot beat D3D at this particular task. Thanks.

  2. #2
    Advanced Member Frequent Contributor yooyo's Avatar
    Join Date
    Apr 2003
    Location
    Belgrade, Serbia
    Posts
    872

    Re: the fastest way to upload large textures

    It depends on your PBO usage pattern. To get maximum performance you must avoid CPU/GPU stalls. My suggestion is to create a PBO pool (a pool with several PBO buffers). Keep in mind that PBO memory is not cacheable, so it is a bad idea to decode a frame directly into PBO memory. Let the decoder decode the frame (random access) in system memory, then copy the image data into PBO memory (a sequential access operation).
    The next problem: do not call glMapBuffer right after the glTexSubImage2D call. When using a PBO, all pixel transfer functions become non-blocking, but there is a catch if you try to map the PBO too early. If the pending operation is not finished, the app will stall until the PBO is free (on the GPU side). The best solution is to map the buffer much later, or on the next frame. A typical PBO usage pattern can be:
    1. Create the PBO.
    2. Map the buffer.
    3. Give the PBO pointer to the decoder thread (or put it in a pool of free PBOs).
    4. The decoder copies the frame into PBO memory and notifies the render thread, or the decoder asks the pool for a free PBO pointer, copies the image data, and notifies the render thread.
    5. The render thread unmaps the pointer and calls glTexSubImage2D.
    6. The render thread marks that PBO to be mapped again on the next frame (or two frames later).
    7. On the next frame (or two frames later), map the PBO again and give the pointer to the decoder thread (or the pool).

    Using a pool you can handle multiple video stream transfers.
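
    To make the bookkeeping in steps 1 to 7 concrete, here is a rough sketch of the pool as a small state machine. It is only a model: it makes no GL calls (the comments mark where glMapBuffer, glUnmapBuffer and glTexSubImage2D would go), every name in it (PboPool, pool_acquire, POOL_SIZE, REMAP_DELAY and so on) is made up for illustration, and it is single-threaded, so a real decoder/render thread pair would need a lock around these functions.

```c
#include <assert.h>

/* Hypothetical names and sizes; a model of the pattern, not driver code. */
enum { POOL_SIZE = 3 };
#define REMAP_DELAY 2ul   /* assumed: remap two frames after the upload */

typedef enum {
    SLOT_FREE,      /* mapped, pointer waiting in the pool       */
    SLOT_WRITING,   /* mapped, pointer held by the decoder       */
    SLOT_FILLED,    /* frame copied in, render thread notified   */
    SLOT_IN_FLIGHT  /* unmapped, upload issued, must not map yet */
} SlotState;

typedef struct {
    SlotState     state;
    unsigned long upload_frame;  /* frame on which the upload was issued */
} Slot;

typedef struct {
    Slot slots[POOL_SIZE];
} PboPool;

/* Steps 1-3: create every PBO, map it, and put it in the pool. */
void pool_init(PboPool *p)
{
    for (int i = 0; i < POOL_SIZE; ++i) {
        /* real code: glGenBuffers, glBufferData(..., NULL, ...), glMapBuffer */
        p->slots[i].state = SLOT_FREE;
        p->slots[i].upload_frame = 0;
    }
}

/* Step 4a: the decoder asks for a free mapped PBO. Returns -1 if none. */
int pool_acquire(PboPool *p)
{
    for (int i = 0; i < POOL_SIZE; ++i) {
        if (p->slots[i].state == SLOT_FREE) {
            p->slots[i].state = SLOT_WRITING;
            return i;
        }
    }
    return -1;
}

/* Step 4b: the decoder has copied the frame and notifies the render thread. */
void pool_submit(PboPool *p, int i)
{
    assert(p->slots[i].state == SLOT_WRITING);
    p->slots[i].state = SLOT_FILLED;
}

/* Steps 5-6: the render thread unmaps, uploads, and remembers the frame. */
void pool_upload(PboPool *p, int i, unsigned long frame)
{
    assert(p->slots[i].state == SLOT_FILLED);
    /* real code: glUnmapBuffer, then glTexSubImage2D sourcing from the PBO */
    p->slots[i].state = SLOT_IN_FLIGHT;
    p->slots[i].upload_frame = frame;
}

/* Step 7: once per frame, remap only the PBOs whose upload is old enough. */
void pool_recycle(PboPool *p, unsigned long frame)
{
    for (int i = 0; i < POOL_SIZE; ++i) {
        Slot *s = &p->slots[i];
        if (s->state == SLOT_IN_FLIGHT &&
            frame - s->upload_frame >= REMAP_DELAY) {
            /* real code: glMapBuffer; the transfer should be done by now */
            s->state = SLOT_FREE;
        }
    }
}
```

    The render loop would call pool_recycle once at the start of each frame and pool_upload for whatever the decoder has submitted. If pool_acquire returns -1, the pool is too small for the chosen delay and the decoder has to wait or drop a frame.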

    I'm not a D3D guy, so can you do a little test for me? Does the locked pointer (from the texture) change between two consecutive locks, or is it always the same?

  3. #3
    Junior Member Newbie
    Join Date
    Mar 2008
    Posts
    4

    Re: the fastest way to upload large textures

    The decoder's access to the target buffer is write-only; it never reads from the buffer, so I think it's fine to decode directly into the PBO, as this saves an extra copy from the decoding buffer to the PBO.

    Just playing the video is not a big issue, but we are also doing a lot of other processing at the same time, so we do need to squeeze out every last CPU cycle possible.

    And yes, it looks like D3D always returns the same memory address.

  4. #4
    Intern Contributor
    Join Date
    Feb 2005
    Posts
    90

    Re: the fastest way to upload large textures

    Hi,

    I can tell you that when I use PBOs on our system (Nvidia Quadro), I found that map/unmap is slow. I just use glBufferData (no glBufferSubData or anything); this is faster in my case. I assume map/unmap is slower because it reuses the same memory, whereas glBufferData can allocate a new buffer if the old one is in use.

    Hope it helps.
    Ido

  5. #5
    Junior Member Newbie
    Join Date
    Feb 2008
    Posts
    1

    Re: the fastest way to upload large textures

    Quote Originally Posted by Ido Ilan
    Hi,

    I can tell you that when I use PBOs on our system (Nvidia Quadro), I found that map/unmap is slow. I just use glBufferData (no glBufferSubData or anything); this is faster in my case. I assume map/unmap is slower because it reuses the same memory, whereas glBufferData can allocate a new buffer if the old one is in use.

    Hope it helps.
    Ido
    You can speed up glMapBuffer for VBOs and PBOs considerably if you call glBufferData and pass in NULL for the data before calling glMapBuffer. Passing NULL essentially tells the driver that the data in the buffer is invalid, so it doesn't have to stall attempting to preserve it. This is of course only useful when you don't need the old data in the buffer.

    This is essentially the same thing as passing the D3DLOCK_DISCARD flag when you lock a vertex buffer or texture resource in Direct3D.
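
    A minimal sketch of that call order, in case it helps. To keep it runnable without a GL context, the three entry points go through a small function table (BufferApi, map_discard and the fake_* stand-ins are names I made up; only the GL enum values are the real ones from glext.h). In a real program you would call glBindBuffer, glBufferData and glMapBuffer directly, and whether the driver actually hands back fresh storage is up to the implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Real enum values from the OpenGL headers; guarded in case glext.h is present. */
#ifndef GL_PIXEL_UNPACK_BUFFER
#define GL_PIXEL_UNPACK_BUFFER 0x88EC
#endif
#ifndef GL_STREAM_DRAW
#define GL_STREAM_DRAW 0x88E0
#endif
#ifndef GL_WRITE_ONLY
#define GL_WRITE_ONLY 0x88B9
#endif

/* Hypothetical table; the members mirror glBindBuffer, glBufferData, glMapBuffer. */
typedef struct {
    void  (*BindBuffer)(unsigned target, unsigned buffer);
    void  (*BufferData)(unsigned target, ptrdiff_t size,
                        const void *data, unsigned usage);
    void *(*MapBuffer)(unsigned target, unsigned access);
} BufferApi;

/* Re-specify the storage with NULL data, then map it for writing.
 * The old contents are discarded, so the driver has nothing to preserve. */
void *map_discard(const BufferApi *gl, unsigned pbo, ptrdiff_t size)
{
    assert(size > 0);
    gl->BindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    gl->BufferData(GL_PIXEL_UNPACK_BUFFER, size, NULL, GL_STREAM_DRAW);
    return gl->MapBuffer(GL_PIXEL_UNPACK_BUFFER, GL_WRITE_ONLY);
}

/* Stand-in entry points that only record the call order (not real GL). */
char          call_log[8];
int           call_count;
unsigned char fake_storage[16];

void fake_bind(unsigned target, unsigned buffer)
{
    (void)target; (void)buffer;
    call_log[call_count++] = 'B';
}

void fake_data(unsigned target, ptrdiff_t size, const void *data, unsigned usage)
{
    (void)target; (void)size; (void)usage;
    call_log[call_count++] = data ? 'U' : 'O';  /* 'O': NULL data, old storage discarded */
}

void *fake_map(unsigned target, unsigned access)
{
    (void)target; (void)access;
    call_log[call_count++] = 'M';
    return fake_storage;
}

const BufferApi fake_api = { fake_bind, fake_data, fake_map };
```

    Calling map_discard(&fake_api, 1, 16) records bind, discard, map in that order. The size passed to glBufferData should be the same as the original allocation so the driver can reuse a buffer of the same shape.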

  6. #6
    Junior Member Newbie
    Join Date
    Jun 2008
    Posts
    1

    Re: the fastest way to upload large textures

    Hi

    I have pretty much the same question, but I'll try to be a bit more specific. How can I obtain a pointer into video memory? This is what you get when you LockRect a surface in D3D8, if the texture was created in the default pool. I don't want a buffer in system memory, and I don't want any copying either.
    So I think using PBOs is not the solution I'm looking for.
    Thanks,
    Kornel

  7. #7
    Advanced Member Frequent Contributor yooyo's Avatar
    Join Date
    Apr 2003
    Location
    Belgrade, Serbia
    Posts
    872

    Re: the fastest way to upload large textures

    There is no way to get a pointer into video memory.

  8. #8
    Senior Member OpenGL Pro Zengar's Avatar
    Join Date
    Sep 2001
    Location
    Germany
    Posts
    1,932

    Re: the fastest way to upload large textures

    Mapping a buffer object may give you a pointer to video memory, but it depends on the driver, so there is no way to be sure that the buffer object is really in VRAM.

  9. #9
    Super Moderator OpenGL Guru
    Join Date
    Feb 2000
    Location
    Montreal, Canada
    Posts
    4,264

    Re: the fastest way to upload large textures

    I don't see how you can get a pointer to video memory. Your own process exists in its own memory space and that's all RAM.
    VRAM is only accessed by kernel-mode code.
    ------------------------------
    Sig: http://glhlib.sourceforge.net
    an open source GLU replacement library. Much more modern than GLU.
    float matrix[16], inverse_matrix[16];
    glhLoadIdentityf2(matrix);
    glhTranslatef2(matrix, 0.0, 0.0, 5.0);
    glhRotateAboutXf2(matrix, angleInRadians);
    glhScalef2(matrix, 1.0, 1.0, -1.0);
    glhQuickInvertMatrixf2(matrix, inverse_matrix);
    glUniformMatrix4fv(uniformLocation1, 1, FALSE, matrix);
    glUniformMatrix4fv(uniformLocation2, 1, FALSE, inverse_matrix);

  10. #10
    Senior Member OpenGL Pro Zengar's Avatar
    Join Date
    Sep 2001
    Location
    Germany
    Posts
    1,932

    Re: the fastest way to upload large textures

    I thought the driver was able to map parts of VRAM to virtual memory...
