I am encountering some strange behaviour when working with atomic buffers.
If I'm only writing to them, creating them with GL_DYNAMIC_DRAW and mapping them with GL_MAP_WRITE_BIT, they work perfectly.
If I *also* try to read what is in them, creating them with GL_DYNAMIC_COPY and mapping with GL_MAP_READ_BIT | GL_MAP_WRITE_BIT, the mapping time varies: it increases linearly with the number of fragments that access the atomic counter on the GPU side.

The app works fine, with no GL errors.
I really don't understand this difference in behaviour.

Buffer creation (once):
Code :
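GLuint zero = 0;
glGenBuffers(1, &atomicbuffer);
glBindBufferBase(GL_ATOMIC_COUNTER_BUFFER, 0, atomicbuffer);
glBufferData(GL_ATOMIC_COUNTER_BUFFER, sizeof(GLuint), &zero, GL_DYNAMIC_COPY); // GL_DYNAMIC_DRAW in the write-only case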

Buffer usage (per frame):
Code :
glBindBuffer(GL_ATOMIC_COUNTER_BUFFER, atomicbuffer);
GLuint* ptr = (GLuint*)glMapBufferRange(GL_ATOMIC_COUNTER_BUFFER, 0, sizeof(GLuint), GL_MAP_WRITE_BIT | GL_MAP_READ_BIT); // <-- not OK speed
//GLuint* ptr = (GLuint*)glMapBufferRange(GL_ATOMIC_COUNTER_BUFFER, 0, sizeof(GLuint), GL_MAP_WRITE_BIT);                 // <-- OK speed, commented out
memory_fragmentcount = ptr[0];                          // fragment count from the counter
memory_necessary = memory_fragmentcount * FRAGMENTSIZE;
ptr[0] = 0;                                             // reset the counter for the next frame
glUnmapBuffer(GL_ATOMIC_COUNTER_BUFFER);                // unmap before the GPU uses the buffer again
glBindBufferBase(GL_ATOMIC_COUNTER_BUFFER, unit_atomic, atomicbuffer);

Btw, there is no difference whatsoever between using glMapBuffer on the entire buffer and using glMapBufferRange.
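
For anyone who wants to reproduce the timing, here is a minimal sketch of one way to measure how long the CPU spends inside the map call. It assumes a plain std::chrono wall clock around the call; `time_call_ms` is a hypothetical helper name used only for illustration, not something from the app above.

```cpp
#include <chrono>

// Hypothetical helper: runs `fn` once and returns the elapsed
// wall-clock time in milliseconds, measured on the CPU side.
template <typename Fn>
double time_call_ms(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();  // the call being timed, e.g. a lambda wrapping glMapBufferRange
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

It could be used by wrapping the map call in a lambda, for example `double ms = time_call_ms([&] { ptr = (GLuint*)glMapBufferRange(GL_ATOMIC_COUNTER_BUFFER, 0, sizeof(GLuint), GL_MAP_WRITE_BIT | GL_MAP_READ_BIT); });`. This only captures the time the calling thread is blocked in the call, so it says nothing about what the GPU or driver is doing during that time.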

Really, if this were a sync problem (more fragments, hardware units, etc. modifying the same resource, the atomic buffer), shouldn't the writing be more problematic or slow, or at the very least equally bad?