Part of the Khronos Group
OpenGL.org



Thread: Creating objects in another thread

  1. #11
     Junior Member Newbie · Join Date: Oct 2013 · Posts: 13
    Yeah, the AMD guy said the same thing too.
    It's a bummer, but thanks.
    Last edited by septag; 01-28-2014 at 02:20 AM. Reason: Ranting sucks

  2. #12
     Junior Member Newbie · Join Date: Oct 2013 · Posts: 13
    Now I'm creating resources in the main thread and filling them in the loader thread. Here is the procedure:

    • Loader thread: load the file, read the buffers, and pass their data/params to a queue for the main thread
    • Main thread: read the queue and create the pending objects; for example, a VBO is created with glGenBuffers(1, &buff); glBindBuffer(GL_ARRAY_BUFFER, buff); glBufferData(GL_ARRAY_BUFFER, size, NULL, GL_STATIC_DRAW);
    • Main thread: map the created buffers with glMapBufferRange(GL_ARRAY_BUFFER, 0, size, GL_MAP_WRITE_BIT | GL_MAP_UNSYNCHRONIZED_BIT); and pass the mapped pointer to the loader thread
    • Loader thread: memcpy the file data into the mapped GL buffers
    • Main thread: unmap all buffers with glUnmapBuffer(GL_ARRAY_BUFFER); the data is then ready for rendering
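    The handoff above could be sketched roughly as follows. This is a minimal sketch, not the poster's actual code: the struct and names are my own, GL calls appear only as comments (the sketch runs without a GL context), and malloc'd memory stands in for the pointer that glMapBufferRange would return.

    ```c
    /* Sketch of the main-thread/loader-thread handoff described above.
     * GL calls are comments because this runs without a GL context;
     * plain malloc'd memory stands in for the mapped buffer store. */
    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    typedef struct {
        void  *mapped;              /* pointer from glMapBufferRange */
        size_t size;
        int    filled;              /* set by the loader when memcpy is done */
        pthread_mutex_t lock;
        pthread_cond_t  done;
    } upload_job;

    static const char payload[] = "vertex data";

    static void *loader_thread(void *arg)
    {
        upload_job *job = arg;
        /* Loader thread: only touches the mapped pointer, no GL calls here. */
        memcpy(job->mapped, payload, job->size);
        pthread_mutex_lock(&job->lock);
        job->filled = 1;
        pthread_cond_signal(&job->done);
        pthread_mutex_unlock(&job->lock);
        return NULL;
    }

    int main(void)
    {
        upload_job job;
        job.size   = sizeof payload;
        job.filled = 0;
        pthread_mutex_init(&job.lock, NULL);
        pthread_cond_init(&job.done, NULL);

        /* Main thread: create and map the buffer.
         * glGenBuffers(1, &buf);
         * glBindBuffer(GL_ARRAY_BUFFER, buf);
         * glBufferData(GL_ARRAY_BUFFER, job.size, NULL, GL_STATIC_DRAW);
         * job.mapped = glMapBufferRange(GL_ARRAY_BUFFER, 0, job.size,
         *                  GL_MAP_WRITE_BIT | GL_MAP_UNSYNCHRONIZED_BIT); */
        job.mapped = malloc(job.size);  /* stand-in for the mapped store */

        pthread_t t;
        pthread_create(&t, NULL, loader_thread, &job);

        /* Main thread: wait for the loader, then unmap. A real renderer
         * would poll job.filled once per frame instead of blocking, so the
         * render loop never stalls on the loader. */
        pthread_mutex_lock(&job.lock);
        while (!job.filled)
            pthread_cond_wait(&job.done, &job.lock);
        pthread_mutex_unlock(&job.lock);
        pthread_join(&t == NULL ? t : t, NULL);

        /* glUnmapBuffer(GL_ARRAY_BUFFER);  -- data now ready for rendering */
        printf("%s\n", (char *)job.mapped);
        free(job.mapped);
        return 0;
    }
    ```

    In real code the per-frame check on `filled` replaces the blocking wait shown here, which matches the poster's "no blocking in the main thread" goal.
    
    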


    Now, without the extra contexts, the frame rate is stable.

    But during loads I still get huge spikes in frame time, just like with the previous method.
    I use mutexes and condition variables to synchronize, but I never block in the main thread; all the waiting happens in the loader thread, where performance doesn't really matter to me.
    The OpenGL docs recommend never creating objects in the update/render loop, which is exactly what I'm doing now. So, has anyone experimented with this?
    Is object creation the problem? Do I have to create all GL objects at init time and manage them in some way so there are no glGen*/glBufferData calls at runtime?!
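    The "create everything at init" idea asked about above is often done with a free list of pre-created buffer objects. A minimal sketch, under my own assumptions (pool size, fake handle values, and a plain unsigned standing in for GLuint so it builds without GL):

    ```c
    /* Sketch of a buffer pool: generate a fixed set of buffer objects once
     * at init, then acquire/release handles at load time so no
     * glGenBuffers/glBufferData ever runs inside the render loop. */
    #include <assert.h>
    #include <stdio.h>

    #define POOL_SIZE 64

    typedef unsigned int handle;    /* stand-in for GLuint */

    static handle pool[POOL_SIZE];
    static int    free_top;

    static void pool_init(void)
    {
        /* Real code, once at init:
         * glGenBuffers(POOL_SIZE, pool);
         * for each buffer: glBindBuffer(GL_ARRAY_BUFFER, pool[i]);
         *                  glBufferData(GL_ARRAY_BUFFER, MAX_SIZE, NULL,
         *                               GL_DYNAMIC_DRAW); */
        for (int i = 0; i < POOL_SIZE; i++)
            pool[i] = (handle)(i + 1);  /* fake buffer names for the sketch */
        free_top = POOL_SIZE;
    }

    static handle pool_acquire(void)
    {
        assert(free_top > 0);       /* pool exhausted: grow it or stall */
        return pool[--free_top];
    }

    static void pool_release(handle h)
    {
        assert(free_top < POOL_SIZE);
        pool[free_top++] = h;
    }

    int main(void)
    {
        pool_init();
        handle a = pool_acquire();  /* hand out two buffers at "load time" */
        handle b = pool_acquire();
        pool_release(a);            /* return them when the asset is freed */
        pool_release(b);
        printf("free buffers: %d\n", free_top);
        return 0;
    }
    ```

    The trade-off is that every pooled buffer is allocated at a fixed maximum size up front, so this costs memory in exchange for avoiding allocation calls during the loop.
    
    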

  3. #13
     Intern Contributor · Join Date: Nov 2011 · Posts: 51
    I've been using your method for months, but I'm not sure it is REALLY thread-safe (even if it seems to work fine).
    Follow my reasoning:
    • Main thread: glMapBuffer/glMapBufferRange...
    • Uploading/loading thread: memcpy(...)
    • Main thread: glUnmapBuffer()

    While the CPU executes memcpy(...) in another thread, the main thread keeps drawing from a buffer WHOSE data the other thread may be changing at the same time... I'm not really sure that is sound.

    PS: Your "blocking situation" may depend on the driver: it may block the main thread (during rendering) because glMapBuffer refers to an address range that is in use by the other thread's memcpy... If that behaviour is confirmed, this method is nearly the same as executing memcpy() on the main thread.

    PS2: (ref: http://www.opengl.org/sdk/docs/man/x...lMapBuffer.xml) "A mapped data store must be unmapped with glUnmapBuffer before its buffer object is used", so in theory binding a VBO/VAO without having unmapped the buffer could generate an error, and that could happen every new frame until the other thread has finished its memcpy()...
    Last edited by tdname; 01-29-2014 at 04:32 AM.

  4. #14
     Junior Member Newbie · Join Date: Oct 2013 · Posts: 13
    Originally posted by tdname:
    "I'm already using your method from months but I'm not sure this method is REALLY thread-safe (even if it seems to work fine)."

    I'm using the method you said; check the previous post.

    But I still get severe spikes; I suspect they come from the glGenBuffers/glBufferData calls.
    Also, I'm mapping with GL_MAP_UNSYNCHRONIZED_BIT, which is not supposed to cause any blocking on the main thread.
    And I think the memcpy in another thread is thread-safe; the AMD driver developer recommended the same thing.

    Also, the CPU side of the mentioned GL calls is fast (I checked the delta time across the whole create/map operation), because those commands are obviously streamed to the driver for later processing. It seems the stall occurs in SwapBuffers or somewhere similar; I don't have the proper tools/knowledge to detect exactly where it stalls.
    Last edited by septag; 01-29-2014 at 06:07 AM.

  5. #15
     Intern Contributor · Join Date: Nov 2011 · Posts: 51
    I was just saying that I've been using the same method as you, so I know something about it and I'm not answering you with theory alone.

    GL_MAP_UNSYNCHRONIZED_BIT is not safe without using multiple buffers (upload into buffer 1 while buffer 2 is used for rendering, then swap them for the next operation) or a double-sized buffer (upload into the first half while rendering from the second half, then switch the halves), for the same reason I wrote before: you risk changing a buffer/memory while it is in use by the GPU.
    With those two methods the upload-to-GPU procedure is truly async without problems; otherwise the CPU, GPU and/or the driver must block and synchronize until you release the mapped buffer with glUnmapBuffer(). It is irrelevant whether the block occurs during SwapBuffers or MapBufferRange/memcpy, because either way it is a synchronized method from the GPU's point of view.
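    The ping-pong scheme described above could be sketched like this. It is only an illustration: the GL calls are comments (the sketch runs without a context), and two plain memory regions stand in for the two GL buffers.

    ```c
    /* Sketch of the "ping pong" scheme: buffer 0 receives the upload while
     * buffer 1 is rendered from, then the roles swap each frame. Plain
     * memory stands in for the two GL buffers. */
    #include <stdio.h>
    #include <string.h>

    #define BUF_SIZE 16

    static char bufs[2][BUF_SIZE];
    static int  upload_idx = 0;     /* the buffer currently being written */

    static void frame(const char *new_data)
    {
        /* Upload into bufs[upload_idx]:
         * glBindBuffer(GL_ARRAY_BUFFER, vbo[upload_idx]);
         * p = glMapBufferRange(..., GL_MAP_WRITE_BIT | GL_MAP_UNSYNCHRONIZED_BIT);
         * memcpy(p, new_data, ...); glUnmapBuffer(GL_ARRAY_BUFFER);
         * Unsynchronized mapping is safe here because the GPU only reads
         * from the OTHER buffer this frame. */
        memcpy(bufs[upload_idx], new_data, strlen(new_data) + 1);

        /* Render from the other buffer:
         * glBindBuffer(GL_ARRAY_BUFFER, vbo[1 - upload_idx]);
         * glDrawArrays(...); */
        printf("render: %s\n", bufs[1 - upload_idx]);

        upload_idx = 1 - upload_idx;  /* swap roles for the next frame */
    }

    int main(void)
    {
        memcpy(bufs[1], "init", 5);
        frame("frame0");            /* renders "init", uploads "frame0" */
        frame("frame1");            /* renders "frame0" */
        frame("frame2");            /* renders "frame1" */
        return 0;
    }
    ```

    Note that rendering always lags the upload by one frame; that one-frame latency is the price of never touching a buffer the GPU may still be reading.
    
    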

    (sorry for my English but it isn't my first language)

  6. #16
     Member Regular Contributor · Join Date: Apr 2004 · Posts: 250
    I was looking for the most asynchronous and efficient way to upload data to a GPU buffer and tried every variant I could think of, playing with the various usage and map flags.
    In the end the most efficient turned out to be plainly using glBufferSubData; every attempt to use glMapBufferRange produced worse results.
    GL_MAP_UNSYNCHRONIZED_BIT, at least on NVIDIA, doesn't seem to do what it is supposed to do. That is, their driver apparently still performs nasty synchronizations that cripple the frame rate.

    When I did the above "research" I was on NVIDIA, so the results may not apply to ATI. Now I have an ATI card too, but I'm too lazy to repeat the experiments.
    Last edited by l_belev; 01-30-2014 at 07:44 AM.

  7. #17
     Intern Contributor · Join Date: Nov 2011 · Posts: 51
    I'm not sure about your last sentence: GL_MAP_UNSYNCHRONIZED_BIT is designed for an environment that unmaps buffers after they're used, but the case discussed here supposedly leaves the buffer mapped "indefinitely" while the main (rendering) thread needs to use that buffer.

    GL_MAP_UNSYNCHRONIZED_BIT is a good flag when combined with multiple PBOs (the "ping pong" method) or a double-sized PBO, as I wrote before. In that case you can execute glMapBufferRange() + memcpy() + glUnmapBuffer() in the same function without disturbing the GPU's rendering (which uses the other PBO, or just the other half of the PBO).

  8. #18
     Member Regular Contributor · Join Date: Apr 2004 · Posts: 250
    Are you sure a buffer can stay permanently mapped with GL_MAP_UNSYNCHRONIZED_BIT even while the GPU is using it?
    I remember wondering about that too, but then I read the specification and it sounded impossible, so I gave up on the idea.
    Here is the relevant text from the 4.4 spec:

    6.3.1 Unmapping Buffers

    After the client has specified the contents of a mapped buffer range, and before the
    data in that range are dereferenced by any GL commands, the mapping must be
    relinquished by calling

    boolean UnmapBuffer( enum target );

    It unconditionally says that the buffer must be unmapped; there is no exception for the GL_MAP_UNSYNCHRONIZED_BIT flag.
    Have you really tried drawing from a mapped buffer and gotten it to work?

    Even if it happens to work on your driver, it's probably not a good idea to rely on it, because it appears to go against the specification
    and other drivers may not support it.

    I think allowing the GPU to use a buffer mapped with GL_MAP_UNSYNCHRONIZED_BIT would be a very useful feature, well worth posting a suggestion about.
    Last edited by l_belev; 01-30-2014 at 12:27 PM.

  9. #19
     Member Regular Contributor · Join Date: Apr 2004 · Posts: 250
    I just read in the spec that 4.4 adds a new flag to allow permanent mapping: GL_MAP_PERSISTENT_BIT.
    The spec is not clear, but it sounds like this flag may have performance implications. For example, there is also a function, FlushMappedBufferRange, which is mentioned together with GL_MAP_PERSISTENT_BIT in one place.
    I wonder why that function would be necessary if not to address some possible performance problems.
    Anyhow, I don't know about you, but I'm currently limited to OpenGL 3.3 because I need to support older hardware, and GL_MAP_PERSISTENT_BIT is only available starting with 4.4.
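    For reference, persistent mapping is typically used as a per-frame ring of regions within one mapped buffer. A minimal sketch of that idea; the region count/size and the fence strategy are my own assumptions, not from this thread, and a plain array stands in for the mapped store so the sketch runs without a GL context:

    ```c
    /* Sketch of GL 4.4 persistent mapping: the buffer is created with
     * glBufferStorage, mapped once with GL_MAP_PERSISTENT_BIT, and written
     * as a ring of per-frame regions so the CPU never writes a region the
     * GPU may still be reading. */
    #include <stdio.h>

    #define REGIONS      3          /* triple buffering */
    #define REGION_SIZE  256

    /* Real setup (once, at init):
     * glBufferStorage(GL_ARRAY_BUFFER, REGIONS * REGION_SIZE, NULL,
     *     GL_MAP_WRITE_BIT | GL_MAP_PERSISTENT_BIT | GL_MAP_COHERENT_BIT);
     * mapped = glMapBufferRange(GL_ARRAY_BUFFER, 0, REGIONS * REGION_SIZE,
     *     GL_MAP_WRITE_BIT | GL_MAP_PERSISTENT_BIT | GL_MAP_COHERENT_BIT);
     * Without GL_MAP_COHERENT_BIT, writes are made visible with
     * glFlushMappedBufferRange instead. */
    static char mapped[REGIONS * REGION_SIZE];

    static size_t region_offset(unsigned frame)
    {
        /* Each frame writes a different region; in real code a fence per
         * region (glFenceSync / glClientWaitSync) guards the wraparound
         * before the region is reused. */
        return (size_t)(frame % REGIONS) * REGION_SIZE;
    }

    int main(void)
    {
        for (unsigned f = 0; f < 5; f++)
            printf("frame %u writes at offset %zu\n", f, region_offset(f));
        return 0;
    }
    ```

    
    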

    Another consideration: if it were really possible (for the NVIDIA driver) to let the CPU access the same memory the GPU is currently drawing from, then there shouldn't be any synchronization
    when you map a buffer with GL_MAP_UNSYNCHRONIZED_BIT and then unmap it before drawing. But judging by the frame rate, there definitely are synchronizations (as I mentioned earlier).

    If the driver can't do that, then GL_MAP_PERSISTENT_BIT will be emulated by memcpy-ing on the CPU from the mapped buffer every time you issue a drawing operation that reads from it,
    which cannot be any better than plain glBufferSubData (and can be worse). The mapped buffer itself will be just plain system memory.
    Last edited by l_belev; 01-31-2014 at 03:00 AM.
