# Thread: Spectrogram and FFT using OpenGL

1. Originally Posted by The Little Body
I don't understand why cos(angle) and cos(angle) + i sin(angle) give now exactly the same FFTs
Because adding the coefficients alone simply gives you the real part.

If you add the complete terms, you get:

(a+bi).e-ix
= (a+bi).(cos(-x)+i.sin(-x))
= (a+bi).(cos(x)-i.sin(x))
= (a+bi).cos(x)-(a+bi).i.sin(x)
= a.cos(x)+bi.cos(x)-ai.sin(x)+b.sin(x)
= (a.cos(x)+b.sin(x))-i.(a.sin(x)-b.cos(x))

(a-bi).eix
= (a-bi).(cos(x)+i.sin(x))
= (a-bi).cos(x)+(a-bi).i.sin(x)
= a.cos(x)-bi.cos(x)+ai.sin(x)+b.sin(x)
= (a.cos(x)+b.sin(x))+i.(a.sin(x)-b.cos(x))

Note that these two expressions are conjugates (identical real parts, imaginary parts have the same magnitude but opposite sign).

(a+bi).e-ix + (a-bi).eix
= (a.cos(x)+b.sin(x))-i.(a.sin(x)-b.cos(x)) + (a.cos(x)+b.sin(x))+i.(a.sin(x)-b.cos(x))
= (a.cos(x)+b.sin(x))+(a.cos(x)+b.sin(x)) - i.(a.sin(x)-b.cos(x))+i.(a.sin(x)-b.cos(x))
= 2.(a.cos(x)+b.sin(x)) + 0.i

Note that a.cos(x)+b.sin(x) = r.sin(x+p) where r2=a2+b2 and tan(p)=a/b.

But if you only want the magnitude, you can just find the magnitude of one of the coefficients and double it.

Essentially: if you graph y=e^ix (one axis for real x, two for complex y) you get a helix. If you graph y=e^-ix you get a helix with the opposite rotation (CW vs CCW). In much the same way that, for real numbers, f(x)=k+x is a translation by k and f(x)=k*x is a scaling, for complex numbers f(z)=k+z is a (2D) translation while f(z)=k*z is a rotation and scaling. So multiplying a helix by a complex amplitude rotates it, and rotating a helix is the same as translating it (i.e. a phase shift). If you multiply the helix e^-ix by (a+bi) you shift it in one direction; if you multiply the opposing helix e^ix by the conjugate (a-bi) you shift it in the same direction as the first (like turning a right-handed screw clockwise and a left-handed screw counter-clockwise). Regardless of the values a and b, the two always shift by the same amount, and their sum always results in the imaginary parts cancelling out to produce a real sinusoid.

2. Thanks for yours explanations, I understand now a lot more that before how this work

I have first thinking about using reals parts for to represent the left channel and the imaginary parts for the right channel on a stereo signal but this can't work as simply as this

I have too dream before about the use of something like quaternions for to can handle a quadriphonic channels signal but now I know that is not possible because a more simple stereo signal can not to be handled on a FFT like "simples" complex values

But if you only want the magnitude, you can just find the magnitude of one of the coefficients and double it.
It's now what I think to make because a simple multiply by two is very more speed than a squared root of the sum of two squared value

Can I delete alls computations on the y array into the FFT( int dir, int m, double *x, double *y ) call for to have a call that is about 2x more speed but with the same x output values because it work only with reals values and imaginary parts are "auto-canceled" because opposites ?

3. Originally Posted by The Little Body
It's now what I think to make because a simple multiply by two is very more speed than a squared root of the sum of two squared value
You need to do that anyhow to get the magnitude. If you want the square of the magnitude (which is quite common, as it represents the power), you can omit the square root and just multiply by 4.

Originally Posted by The Little Body
Can I delete alls computations on the y array into the FFT( int dir, int m, double *x, double *y ) call for to have a call that is about 2x more speed but with the same x output values because it work only with reals values and imaginary parts are "auto-canceled" because opposites ?
No. It is possible to optimise the FFT in the case of a real signal. You still get a complex output, but it only has N/2+1 elements (e.g. 256 real input samples gives you 129 complex amplitude samples).

4. FWIW, here is some GLSL FFT code I dug up. It isn't optimised at all.

Code :
```#version 430

layout(local_size_x=16) in;
layout(local_size_y=16) in;

layout(rg32f, binding=0) readonly uniform image2D grid_in;
layout(rg32f, binding=1) writeonly uniform image2D grid_out;

uniform int N;
uniform int Ns;
uniform int dir;

const float pi = 3.141592653589793;

vec2 cmul(vec2 a, vec2 b)
{
return vec2(a.x * b.x - a.y * b.y, a.x * b.y + a.y * b.x);
}

void main()
{
ivec2 ij = ivec2(gl_GlobalInvocationID.xy);
int i = ij.x;
int j = ij.y;
int k = Ns / 2;
int a = (i / Ns) * k;
int b = i % k;
int i0 = a + b;
int i1 = i0 + (N / 2);
float t = 2*pi*i/Ns;
vec2 f = vec2(cos(t), sin(t));
f.y *= dir;
vec2 x0 = imageLoad(grid_in, ivec2(i0, j)).xy;
vec2 x1 = imageLoad(grid_in, ivec2(i1, j)).xy;
imageStore(grid_out, ij, vec4(x0 + cmul(f, x1), 0, 0));
}```

Application code (Python):
Code :
```def do_fft(dir, tex):
w, h = 512, 512
gw, gh = 16, 16 # local size

program = glCreateProgram()
raise RuntimeError(glGetProgramInfoLog(program))

glUseProgram(program)

tex1,tex2 = glGenTextures(2)
glBindTexture(GL_TEXTURE_2D, tex1)
glTexStorage2D(GL_TEXTURE_2D, 1, GL_RG32F, w, h)
glBindTexture(GL_TEXTURE_2D, tex2)
glTexStorage2D(GL_TEXTURE_2D, 1, GL_RG32F, w, h)
glBindTexture(GL_TEXTURE_2D, 0)

log_w = int(math.ceil(math.log(w, 2)))

u_dir = glGetUniformLocation(program, "dir")
u_N   = glGetUniformLocation(program, "N")
u_Ns  = glGetUniformLocation(program, "Ns")

glUniform1i(u_dir, dir)

for logstep in xrange(log_w):
src, dst = tex1, tex2
if logstep == 0:
src = tex
step = 2 << logstep
glBindImageTexture(0, src, 0, False, 0, GL_READ_ONLY,  GL_RG32F)
glBindImageTexture(1, dst, 0, False, 0, GL_WRITE_ONLY, GL_RG32F)
glUniform1i(u_N , w)
glUniform1i(u_Ns, step)
nx, ny = w / gw, h / gh
glDispatchCompute(nx, ny, 1)
tex1, tex2 = tex2, tex1

glBindImageTexture(0, 0, 0, False, 0, GL_READ_ONLY,  GL_R8)
glBindImageTexture(1, 0, 0, False, 0, GL_READ_ONLY,  GL_R8)

glUseProgram(0)

return dst```

Input and output are GL_RG32F textures (R = real, G = imaginary); dir should be -1 for forward FFT, 1 for inverse FFT. Each row of the output is the FFT of the corresponding row of the input. The results agree with NumPy's FFT to within ~0.05% for most of the range (bear in mind that the GPU is using single precision).

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•