Implementation to achieve fastest brush drawing performance

Hi,
I am drawing into a 3D texture with a brush, and I am looking for the most efficient way to implement this so that there is minimal lag. My 3D texture can be as large as 512^3, and the brush is typically 1-30 pixels squared.
The brush can be any shape, so I do need to perform an OR operation when pasting the brush into the target texture.

Currently I use glGetTexImage to retrieve what is already in the texture, then I OR that with the brush (CPU side), and finally I send the result back with glTexSubImage3D.
Is this the quickest way?
I’m not seeing unreasonable performance (if the user moves the cursor quickly, the stroke will be discontinuous, but I guess this is due to the frequency of the mouse-motion callbacks), but I’d like to know if there is a quicker way to do it!

Thanks,
Soren

Here’s a quicker way: stop reading from OpenGL.

You gave OpenGL that pixel data. You therefore know exactly what it is. So… why are you reading it back if you already have it? And if you threw it away, why would you throw something away when you know that you’re going to have to get it back almost immediately?

Just keep around a copy of the pixel data and modify that. Then send it to OpenGL to use.
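
As a rough sketch of what that could look like (not a drop-in implementation): the code below keeps a CPU-side copy of the volume, ORs the brush into it, and returns the box that changed so only that part needs to be re-sent. The names Volume, Box and stamp_brush are made up for illustration, and it assumes a single-channel texture with one byte per voxel (about 128 MiB of system memory at 512^3). The upload is left as a comment because it needs a live GL context; GL_RED there is part of the same one-byte assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU-side shadow copy of the 3D texture (1 byte per voxel). */
typedef struct {
    int w, h, d;
    uint8_t *voxels;              /* w*h*d bytes: x fastest, then y, then z */
} Volume;

/* Region modified by a stamp; all sizes are 0 if nothing was touched. */
typedef struct { int x, y, z, sx, sy, sz; } Box;

static int clampi(int v, int lo, int hi) { return v < lo ? lo : (v > hi ? hi : v); }

/* OR a bw*bh*bd brush mask into the shadow copy, with the brush's minimum
   corner at (px, py, pz). Parts that fall outside the volume are clipped.
   Returns the box that must be re-uploaded. */
Box stamp_brush(Volume *vol, const uint8_t *brush,
                int bw, int bh, int bd, int px, int py, int pz)
{
    assert(vol && vol->voxels && brush);

    int x0 = clampi(px, 0, vol->w), x1 = clampi(px + bw, 0, vol->w);
    int y0 = clampi(py, 0, vol->h), y1 = clampi(py + bh, 0, vol->h);
    int z0 = clampi(pz, 0, vol->d), z1 = clampi(pz + bd, 0, vol->d);

    Box box = { x0, y0, z0, x1 - x0, y1 - y0, z1 - z0 };
    if (box.sx <= 0 || box.sy <= 0 || box.sz <= 0) {
        Box empty = { 0, 0, 0, 0, 0, 0 };
        return empty;
    }

    for (int z = z0; z < z1; ++z)
        for (int y = y0; y < y1; ++y)
            for (int x = x0; x < x1; ++x) {
                size_t vi = ((size_t)z * vol->h + y) * vol->w + x;
                size_t bi = ((size_t)(z - pz) * bh + (y - py)) * bw + (x - px);
                vol->voxels[vi] |= brush[bi];   /* no read-back from GL needed */
            }
    return box;
}

/* Upload sketch for a non-empty box (requires a GL context and the 3D
   texture bound to GL_TEXTURE_3D):

   glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
   glPixelStorei(GL_UNPACK_ROW_LENGTH, vol->w);     // stride of one row
   glPixelStorei(GL_UNPACK_IMAGE_HEIGHT, vol->h);   // rows per slice
   glTexSubImage3D(GL_TEXTURE_3D, 0,
                   box.x, box.y, box.z, box.sx, box.sy, box.sz,
                   GL_RED, GL_UNSIGNED_BYTE,
                   vol->voxels + ((size_t)box.z * vol->h + box.y) * vol->w + box.x);
*/
```

Setting the unpack row length and image height lets glTexSubImage3D read the sub-box straight out of the full-size array, so there is no need to repack it into a temporary buffer first.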

Thanks! Makes sense. I think I had some irrational fear that the CPU and GPU representations would get out of sync, since I only transmit small subsections to the GPU on every draw, but this should not really be a concern, so I will go with what you suggest.
Soren