View Full Version : accessing an array faster



divide
02-09-2004, 11:56 AM
I noticed that accessing
float test;
is at least 2 times faster than accessing an element of
float test[260000];

However, I need a storage area in memory that works like an array but can be accessed as fast as a simple float.
Do you know a method that could solve this performance issue?

crystall
02-09-2004, 01:57 PM
Originally posted by divide:
I noticed that accessing
float test;
is at least 2 times faster than accessing an element of
float test[260000];

However, I need a storage area in memory that works like an array but can be accessed as fast as a simple float.
Do you know a method that could solve this performance issue?

You should really post more of your code... From what I can see, in the first case you are using a single variable, which will probably end up in a register where it is accessed directly during the calculations. In the second case the compiler has to generate a load from memory to fetch the data before using it. There is no way to speed this up except by minimizing the amount of data you have to load from memory.
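To illustrate the register-versus-memory point, here is a minimal sketch. The function names are hypothetical and not taken from the original code, and what the compiler actually does depends on the compiler and its optimization settings. Both functions do the same multiplication, but the first keeps its result in a local scalar, while the second has to store every result to an array element.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: the result goes to a local scalar. The compiler is
   free to keep `last` in a register and never store intermediate values. */
float scale_into_scalar(const float *src, size_t n, float k)
{
    float last = 0.0f;
    assert(src != NULL);
    for (size_t i = 0; i < n; i++)
        last = src[i] * k;
    return last;
}

/* Hypothetical example: same arithmetic, but every result is stored to an
   element of dst, so each iteration produces a write to memory. */
void scale_into_array(float *dst, const float *src, size_t n, float k)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * k;
}
```

Timing these two against each other mostly measures the cost of the stores, which is likely what the original comparison was seeing.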

madmortigan
02-09-2004, 03:00 PM
Hi

I agree with crystall here: please tell us more about what you are doing and how you noticed it. How did you come up with the speed comparison? You may also find that the answer depends on the compiler you're using.

mad

divide
02-09-2004, 09:50 PM
Here's the code that has the performance issue:

for (ScanPixel=ScanStart;ScanPixel<ScanStop;ScanPixel++)
{
    CurrentCoord=ScanPixel+ScanLine*512;
    CurrentLayerOffset=bNumID[CurrentCoord]*LayerOffsetCoeff;
    bObject[CurrentLayerOffset+CurrentCoord]=ObjectRef;
    t2Current=(float)(ScanPixel-ScanStart)*t2InvertSize;
    t2PrimeCurrent=FastTPrime(t2Current,e2);

    bZ1[CurrentLayerOffset+CurrentCoord]=CutZ[0]*(1-t2PrimeCurrent)+CutZ[1]*t2PrimeCurrent;
    bU1[CurrentLayerOffset+CurrentCoord]=CutTex[0].U*(1-t2PrimeCurrent)+CutTex[1].U*t2PrimeCurrent;
    bV1[CurrentLayerOffset+CurrentCoord]=CutTex[0].V*(1-t2PrimeCurrent)+CutTex[1].V*t2PrimeCurrent;
    bW1[CurrentLayerOffset+CurrentCoord]=CutTex[0].W*(1-t2PrimeCurrent)+CutTex[1].W*t2PrimeCurrent;
}

(All of this is inside another loop, which increments the ScanLine variable.)

Filling the arrays bZ1, bU1, bV1 and bW1 takes much longer than if bZ1, bU1, bV1 and bW1 were simple floats.

crystall
02-10-2004, 02:06 AM
Originally posted by divide:
Here's the code that has the performance issue:

-- snip --

(All of this is inside another loop, which increments the ScanLine variable.)

Filling the arrays bZ1, bU1, bV1 and bW1 takes much longer than if bZ1, bU1, bV1 and bW1 were simple floats.

This is normal. If you just change bZ1, bU1, bV1 and bW1 to floats, they will probably be allocated to registers and none of your data will be written to memory. One thing that comes to mind is that your code may be storing data to memory inefficiently and could be made faster. For example, you are writing to 4 different arrays simultaneously, and in some cases this could be thrashing your processor caches quite badly. In that case, writing all your data sequentially into one array may be faster (if you don't need the four arrays to be separate, that is).
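As a rough sketch of the single-array idea: the struct, the function names and the inv_size parameter below are hypothetical, chosen only to mirror the Z/U/V/W values and t2InvertSize from the loop above. The FastTPrime step is left out because its definition was not posted, so the plain linear parameter t stands in for it. Each pixel's four values sit next to each other, so the loop writes one sequential stream instead of four.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record: one pixel's Z, U, V and W stored side by side. */
typedef struct {
    float z, u, v, w;
} PixelAttr;

/* Same blend as in the posted loop: a*(1-t) + b*t. */
static float interp(float a, float b, float t)
{
    return a * (1.0f - t) + b * t;
}

/* Fill `count` consecutive pixels, interpolating from *start to *end.
   inv_size plays the role of t2InvertSize (assumed to be 1/span length). */
void fill_span(PixelAttr *out, size_t count,
               const PixelAttr *start, const PixelAttr *end, float inv_size)
{
    assert(out != NULL && start != NULL && end != NULL);
    for (size_t i = 0; i < count; i++) {
        float t = (float)i * inv_size;
        out[i].z = interp(start->z, end->z, t);
        out[i].u = interp(start->u, end->u, t);
        out[i].v = interp(start->v, end->v, t);
        out[i].w = interp(start->w, end->w, t);
    }
}
```

Whether this is actually faster than four separate arrays depends on the cache and on how the data is read back later, so it is worth timing both layouts.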

divide
02-10-2004, 07:34 AM
Thanks, I'll try one array with a pointer to walk through it (if that's what you mean by sequential access).

crystall
02-10-2004, 03:34 PM
Originally posted by divide:
Thanks, I'll try one array with a pointer to walk through it (if that's what you mean by sequential access).

Yes, that's what I meant. Remember: if you don't need to vectorize your code later, try to keep as much of the data together as you can.