Hi All,

I ran into an interesting issue with mapped buffers and multithreaded writes to them, at least with the latest ATI drivers on Windows (other drivers still need to be verified). We have a multithreaded data producer (an unpacker) that writes to mapped memory. When the buffer is mapped with both the read AND the write bit, the map/write/unmap scope completes 30x faster than with the write bit alone.


Code :
// alloc buffer
glBindBuffer( GL_PIXEL_UNPACK_BUFFER, buffer_id );
glBufferData(GL_PIXEL_UNPACK_BUFFER, size, NULL, GL_STREAM_DRAW);

This scope takes 20 ms to complete:
Code :
{
void* b = glMapBufferRange( GL_PIXEL_UNPACK_BUFFER, 0, size, GL_MAP_WRITE_BIT|GL_MAP_READ_BIT );
copyTo( b, getRGBASize() ); // <- ca. 8 threads writing ca. 64MB data
glUnmapBuffer( GL_PIXEL_UNPACK_BUFFER );
}

This scope takes 600 ms to complete:
Code :
{
void* b = glMapBufferRange( GL_PIXEL_UNPACK_BUFFER, 0, size, GL_MAP_WRITE_BIT );
copyTo( b, getRGBASize() ); // <- 8 threads writing ca. 64MB data
glUnmapBuffer( GL_PIXEL_UNPACK_BUFFER );
}

Is it possible that in the latter (write-only) case we get a pointer into video memory, and the threaded writes thrash the memory access?
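
To make the write pattern concrete, here is a minimal sketch of a threaded copy in the spirit of copyTo above. The real copyTo is not shown in this post, so the name parallelCopy, its signature, and the 64-byte slice alignment are assumptions for illustration only. Each thread writes one contiguous slice front to back and never reads from the destination, which is the access pattern usually suggested for write-only mappings.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <thread>
#include <vector>

// Hypothetical helper (not from the original post): copies `size` bytes from
// `src` to `dst` using at most `threadCount` worker threads. Each thread
// writes one contiguous slice sequentially and never reads from `dst`.
void parallelCopy(void* dst, const void* src, std::size_t size, unsigned threadCount)
{
    assert(threadCount > 0);
    auto* out = static_cast<std::uint8_t*>(dst);
    const auto* in = static_cast<const std::uint8_t*>(src);

    // Round the slice size up to a multiple of 64 bytes so that, assuming
    // `dst` itself is 64-byte aligned, no two threads write into the same
    // 64-byte block (64 is an assumed, typical cache-line size).
    const std::size_t align = 64;
    std::size_t slice = (size + threadCount - 1) / threadCount;
    slice = (slice + align - 1) / align * align;

    std::vector<std::thread> workers;
    for (std::size_t begin = 0; begin < size; begin += slice) {
        const std::size_t len = std::min(slice, size - begin);
        workers.emplace_back([=] { std::memcpy(out + begin, in + begin, len); });
    }
    for (auto& w : workers) {
        w.join();
    }
}
```

With a helper like this, the copyTo call in the scopes above would become something like parallelCopy( b, unpackedData, size, 8 ), where unpackedData is a hypothetical system-memory source. If the timings still differ by 30x with such a strictly sequential, write-only pattern, that would point at the kind of memory the driver returns rather than at the copy routine.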