
Thread: Fast normalization - slightly off topic

  1. #11
    Senior Member OpenGL Pro
    Join Date
    Jun 2000
    Location
    Shreveport, LA, USA
    Posts
    1,757

    Re: Fast normalization - slightly off topic

    Here is a 3DNow! vector normalization:
    Code :
     
    #include <AMD3D/amd3dx.h> // 3DNow! opcode macros
     
    void Normalize3f_3DNow(float *vec)
    {
    	_asm
    	{
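    		femms				// clear MMX/FPU state on entry
    		mov	eax, dword ptr [vec]
    		movq	mm0, [eax]		// mm0 = (x, y)
    		movq	mm3, mm0		// keep a copy of (x, y)
    		pfmul	(m0,m0)			// mm0 = (x*x, y*y)
    		movd	mm1, [eax+8]		// mm1 = (z, 0)
    		movq	mm4, mm1		// keep a copy of z
    		pfmul	(m1,m1)			// mm1 = (z*z, 0)
    		pfacc	(m0,m0)			// mm0 = x*x + y*y in both halves
    		pfadd	(m0,m1)			// low half of mm0 = x*x + y*y + z*z
    		pfrsqrt (m1,m0)			// 15-bit estimate of 1/sqrt(length squared)
    		movq	mm2,mm1
    		pfmul	(m2,m2)			// square of the estimate
    		pfrsqit1 (m2,m0)		// first Newton-Raphson refinement step
    		pfrcpit2 (m2,m1)		// second step, full 24-bit result
    		punpckldq mm2,mm2		// broadcast the low result to both halves
    		pfmul	(m3,m2)			// scale x and y
    		movq	[eax],mm3
    		pfmul	(m4,m2)			// scale z
    		movd	[eax+8],mm4
    		femms				// clear MMX state before returning to FPU code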
    	}
    }
    I believe I simply copied that routine from the 3DNow! SDK. For maximum performance over a batch of vectors, it is much better to inline most of the above code and use prefetch (or prefetchw) to pull the next vector into cache while working on the current one. You can also drop the pfrsqit1, pfrcpit2, and one pfmul instruction if 15-bit precision is good enough.
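For readers without 3DNow! hardware, here is a rough portable C sketch of the same idea: a cheap reciprocal square root estimate, optional Newton-Raphson refinement, and a batch loop that prefetches the next vector. It is not the SDK routine. The function names are made up for illustration, the integer bit trick is only a stand-in for pfrsqrt (and is less accurate than its 15 bits), the zero-length guard is an addition that the assembly above does not have, and `__builtin_prefetch` is assumed to be available only on GCC/Clang.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Rough 1/sqrt(x) estimate using the well-known integer bit trick.
   Stand-in for pfrsqrt; this is NOT what the 3DNow! hardware does. */
static float rsqrt_estimate(float x)
{
    uint32_t bits;
    float y;
    memcpy(&bits, &x, sizeof bits);
    bits = 0x5f3759dfu - (bits >> 1);
    memcpy(&y, &bits, sizeof y);
    return y;
}

/* One Newton-Raphson step for y ~ 1/sqrt(x): y' = y * (1.5 - 0.5*x*y*y). */
static float rsqrt_refine(float x, float y)
{
    return y * (1.5f - 0.5f * x * y * y);
}

/* Normalize one 3-float vector in place.
   refine_steps = 0 keeps only the rough estimate (fastest, least precise). */
static void normalize3f(float *vec, int refine_steps)
{
    float len_sq = vec[0] * vec[0] + vec[1] * vec[1] + vec[2] * vec[2];
    float inv_len;
    int i;

    if (len_sq <= 0.0f)
        return; /* leave a zero vector untouched */

    inv_len = rsqrt_estimate(len_sq);
    for (i = 0; i < refine_steps; ++i)
        inv_len = rsqrt_refine(len_sq, inv_len);

    vec[0] *= inv_len;
    vec[1] *= inv_len;
    vec[2] *= inv_len;
}

/* Normalize count tightly packed vectors, hinting the next one into cache. */
static void normalize3f_batch(float *vecs, size_t count, int refine_steps)
{
    size_t i;
    for (i = 0; i < count; ++i) {
#if defined(__GNUC__)
        if (i + 1 < count)
            __builtin_prefetch(vecs + 3 * (i + 1), 1); /* 1 = prefetch for write */
#endif
        normalize3f(vecs + 3 * i, refine_steps);
    }
}
```

Calling `normalize3f(v, 0)` corresponds loosely to dropping the refinement instructions, while `normalize3f(v, 2)` spends two extra steps to get close to full single precision. Whether the prefetch hint helps at all depends on the CPU and the data layout, so measure before keeping it.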


    [This message has been edited by DFrey (edited 02-13-2001).]

  2. #12
    Senior Member OpenGL Guru Relic's Avatar
    Join Date
    Apr 2000
    Posts
    2,527

    Re: Fast normalization - slightly off topic

    There is a fastmath.cpp source on NVIDIA's developer relations page containing an approximation method for sqrt(): http://www.nvidia.com/Marketing/deve...Frame?OpenPage

  3. #13
    Senior Member OpenGL Guru Humus's Avatar
    Join Date
    Mar 2000
    Location
    Stockholm, Sweden
    Posts
    2,444

    Re: Fast normalization - slightly off topic

    DFrey:
    While your code probably works, I don't understand why you start with FEMMS. It should only be at the end. Aren't you signaling that you're exiting the multimedia state before you've even entered it?

  4. #14
    Senior Member OpenGL Guru Relic's Avatar
    Join Date
    Apr 2000
    Posts
    2,527

    Re: Fast normalization - slightly off topic

    No, that's ok.
    Read the 3DNow! specs that come with the SDK:

    "Like the EMMS instruction, the FEMMS instruction can be used to clear the MMX
    state following the execution of a block of MMX instructions. Because the MMX
    registers and tag words are shared with the floating-point unit, it is necessary to clear
    the state before executing floating-point instructions. Unlike the EMMS instruction,
    the contents of the MMX/floating-point registers are undefined after a FEMMS
    instruction is executed. Therefore, the FEMMS instruction offers a faster context
    switch at the end of an MMX routine where the values in the MMX registers are no
    longer required. FEMMS can also be used prior to executing MMX instructions where
    the preceding floating-point register values are no longer required, which facilitates
    faster context switching."

  5. #15
    Senior Member OpenGL Pro
    Join Date
    Jun 2000
    Location
    Shreveport, LA, USA
    Posts
    1,757

    Re: Fast normalization - slightly off topic

    That's how it was in the 3DNow! SDK. As I understand it, they tacked it onto the beginning just to put the MMX registers into a known (undefined) state. I understand perfectly why it is at the end, and I also thought it odd at first when I saw it at the start. But the white paper says the FEMMS instruction is there to facilitate "Faster Enter/Exit of MMX or floating-point state".

  6. #16
    Senior Member OpenGL Guru Humus's Avatar
    Join Date
    Mar 2000
    Location
    Stockholm, Sweden
    Posts
    2,444

    Re: Fast normalization - slightly off topic

    Hmm ... that's cool.

    One learns something new each day.
