glTexSubImage performance

I need to display some very large luminance images (nbX=1000 x nbY=2500 is typical). The image has to be refreshed each time the user modifies contrast/illumination.

I allocate an array m_pTextures of k subtextures of dimension GL_MAX_TEXTURE_SIZE x GL_MAX_TEXTURE_SIZE (enough of them to hold the large texture).
I do:

glPixelStorei(GL_UNPACK_ROW_LENGTH, nbX);
glPixelTransferf(GL_RED_SCALE, scale);
glPixelTransferf(GL_RED_BIAS, bias);
glPixelTransferf(GL_BLUE_SCALE, scale);
glPixelTransferf(GL_BLUE_BIAS, bias);
glPixelTransferf(GL_GREEN_SCALE, scale);
glPixelTransferf(GL_GREEN_BIAS, bias);

Then, for each subtexture I do:

glBindTexture(GL_TEXTURE_2D, m_pTextures[k]);
glTexImage2D(…NULL);
glPixelStoref(GL_UNPACK_SKIP_PIXELS, m_pOffsetTextureX[k]);
glPixelStoref(GL_UNPACK_SKIP_ROWS, m_pOffsetTextureY[k]);
glTexSubImage2D(…, data);
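
For readers following along, the per-tile offsets used above could be computed roughly like this. This is only a sketch under assumptions: the `Tile` struct and `compute_tiles` are made-up names for illustration (not from the original code), tiles are square with a positive `tileSize`, and edge tiles are simply clipped to the image.

```c
#include <stddef.h>

/* Hypothetical tile descriptor (illustration only, not from the post). */
typedef struct {
    int offsetX, offsetY;  /* candidates for GL_UNPACK_SKIP_PIXELS / GL_UNPACK_SKIP_ROWS */
    int width, height;     /* region size that would go to glTexSubImage2D */
} Tile;

/* Fills 'tiles' (capacity 'maxTiles') with the tiles covering an
   nbX x nbY image. Returns the tile count, or -1 if the array is too
   small. Assumes tileSize > 0. */
int compute_tiles(int nbX, int nbY, int tileSize, Tile *tiles, int maxTiles)
{
    int count = 0;
    for (int y = 0; y < nbY; y += tileSize) {
        for (int x = 0; x < nbX; x += tileSize) {
            if (count >= maxTiles)
                return -1;
            tiles[count].offsetX = x;
            tiles[count].offsetY = y;
            /* Clip the last tile in each direction to the image edge. */
            tiles[count].width  = (nbX - x < tileSize) ? nbX - x : tileSize;
            tiles[count].height = (nbY - y < tileSize) ? nbY - y : tileSize;
            ++count;
        }
    }
    return count;
}
```

With a 1000 x 2500 image and a tile size of 1024, this yields three tiles stacked vertically, the last one only 452 rows high.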

I could store data in a variety of types. Is there one that would speed up the transfer? I thought that GL_FLOAT would be faster, as the blue book says that the first step performed by glTexSubImage2D is conversion to floating-point, but the performance improvement is very small.

Try GL_UNSIGNED_BYTE.
There is a program that will let you benchmark all the various methods; search for glperf.

I believe that using bias and scale really slows things down, as it’s no longer a straight copy, just as when you use zoom in the pixel transfer.
BTW, I also understand GL_FLOAT to be the fastest data type for image transfers.

When the documentation says that conversion to float is the first step, they actually mean that conversion to the 0.0->1.0 floating-point range is CONCEPTUALLY the first step.

Most current hardware (except Radeon 9700 and some really high-end boards) stores textures as 8888 or 1010102, and does its math in some fixed-point format. Even if it supports the imaging subset, it’s still likely to be faster with fixed-point formats.

Also, if you think that 4 floats are faster to send across the bus than 4 bytes, then you ought to grab your calculator and try again :)
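
To put numbers on that for the image size in the question, here is a small sketch of the arithmetic. `upload_bytes` is a hypothetical helper, and it assumes tightly packed rows with no padding.

```c
#include <stddef.h>

/* Bytes needed to send a width x height image with 'components' channels
   of 'bytesPerComponent' bytes each (tightly packed, no row padding). */
size_t upload_bytes(size_t width, size_t height,
                    size_t components, size_t bytesPerComponent)
{
    return width * height * components * bytesPerComponent;
}
```

For a 1000 x 2500 single-channel luminance image, `upload_bytes(1000, 2500, 1, 1)` is 2,500,000 bytes as unsigned bytes, while `upload_bytes(1000, 2500, 1, 4)` is 10,000,000 bytes as 4-byte floats, so four times as much data has to cross the bus.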

Now, it’s quite possible that a majority of consumer cards implement RED_SCALE, RED_BIAS and friends using software (driver) scaling rather than hardware. Performance in that case depends mostly on the quality of the driver (and a bit on the total data size of your texture).
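
If the driver does handle scale and bias in software, one thing worth trying is to do that step yourself on the CPU and upload plain unsigned bytes with the pixel-transfer scale and bias left at their defaults (1 and 0). A minimal sketch, assuming 8-bit luminance source data; `build_lut` and `apply_lut` are hypothetical helper names, not OpenGL calls, and whether this actually beats the driver path would need to be measured.

```c
#include <stddef.h>

/* Builds a 256-entry table mimicking scale/bias on a normalized
   component: out = clamp(in * scale + bias, 0, 1), where in = v / 255. */
void build_lut(float scale, float bias, unsigned char lut[256])
{
    for (int v = 0; v < 256; ++v) {
        float c = (v / 255.0f) * scale + bias;
        if (c < 0.0f) c = 0.0f;
        if (c > 1.0f) c = 1.0f;
        lut[v] = (unsigned char)(c * 255.0f + 0.5f);  /* round to nearest */
    }
}

/* Remaps 'count' 8-bit pixels from src to dst through the table. */
void apply_lut(const unsigned char *src, unsigned char *dst, size_t count,
               const unsigned char lut[256])
{
    for (size_t i = 0; i < count; ++i)
        dst[i] = lut[src[i]];
}
```

The table only has to be rebuilt when the user changes contrast or illumination; the remapped buffer can then go to glTexSubImage2D as GL_LUMINANCE / GL_UNSIGNED_BYTE.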

Yes, it defies logic that float pixel transfer would be faster. That is certainly not the case on SGI or pre-NV30 cards.

On an NVIDIA GeForce2 Go (yes, I know), float is about 1/5 the speed of 8888 for glDrawPixels. Add a bias and float and 8888 are almost identical (i.e. really slow).

I really wish these card makers would look into hardware packing algorithms. The entire ARB imaging path on NVIDIA appears to be written in software.