Thread: Convolution performance w/ non-separable filters

  1. #1
    Intern Newbie
    Join Date
    Jul 2002

    Convolution performance w/ non-separable filters

    I have a fragment shader that uses two textures: one is the base image and the other is a convolution kernel. The filter that the kernel is generated from is non-separable, so I'm using a simple brute-force approach: a nested loop that iterates over the kernel for each output pixel and sums the products. The kernel can be as large as 256x256, which means up to 65,536 texture fetches per pixel. Rendering is quite slow on my 7800 GTX. Are there any shader 'tricks' I should try to improve performance?
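
    For reference, the brute-force approach described above can be sketched on the CPU as follows. This is only an illustration in Python, not the poster's shader: the function names `convolve2d_direct` and `taps_per_pixel` are made up here, and zero padding at the image border is an assumption (the post does not say how edges are handled).

```python
def convolve2d_direct(image, kernel):
    """Brute-force 2D convolution; samples outside the image count as zero (assumed)."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    cy, cx = kh // 2, kw // 2  # kernel centre
    out = [[0.0] * iw for _ in range(ih)]
    for y in range(ih):              # every output pixel ...
        for x in range(iw):
            acc = 0.0
            for ky in range(kh):     # ... visits every kernel tap
                for kx in range(kw):
                    sy = y + cy - ky
                    sx = x + cx - kx
                    if 0 <= sy < ih and 0 <= sx < iw:
                        acc += image[sy][sx] * kernel[ky][kx]
            out[y][x] = acc
    return out


def taps_per_pixel(kernel_height, kernel_width):
    """Number of multiply-adds (and texture fetches) needed per output pixel."""
    return kernel_height * kernel_width
```

    The two inner loops are what make the cost grow with the kernel area, which is why a 256x256 kernel is so expensive per pixel.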

  2. #2
    Intern Newbie
    Join Date
    Apr 2004
    Cape Town

    Re: Convolution performance w/ non-separable filters

    If you don't want to go the full FFT route (which I think can be, and has been, hardware accelerated, BTW), consider a brute-force Fourier transform instead. The transform is separable, so you can compute the FT of the image and of the kernel with one pass over the rows and one pass over the columns, multiply the two results element by element, and then take the inverse FT of the product. That way you avoid any 2Dx2D loops.
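
    A minimal sketch of that idea, in plain Python on the CPU rather than in a shader. All names here (`dft1d`, `dft2d`, `circular_convolve_dft`, `circular_convolve_direct`) are hypothetical, and the sketch assumes the kernel has already been zero-padded to the image size. Note that multiplying the transforms gives a circular (wrap-around) convolution, so the image would need padding if wrap-around at the edges is not wanted.

```python
import cmath


def dft1d(values, inverse=False):
    """Brute-force 1D discrete Fourier transform, O(n^2)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += values[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def dft2d(grid, inverse=False):
    """Separable 2D DFT: transform every row, then every column."""
    rows = [dft1d(row, inverse) for row in grid]
    cols = [dft1d(list(col), inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row-major


def circular_convolve_dft(image, kernel):
    """Convolve via the DFT; image and kernel must have the same size."""
    f_image = dft2d(image)
    f_kernel = dft2d(kernel)
    product = [[a * b for a, b in zip(ri, rk)]
               for ri, rk in zip(f_image, f_kernel)]
    return [[v.real for v in row] for row in dft2d(product, inverse=True)]


def circular_convolve_direct(image, kernel):
    """Reference 2Dx2D loop with wrap-around, used only to check the result."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for ky in range(h):
                for kx in range(w):
                    out[y][x] += image[(y - ky) % h][(x - kx) % w] * kernel[ky][kx]
    return out
```

    Each 1D pass costs one loop over a row or column per output value, so the per-pixel work grows with the image width or height rather than with the kernel area. On a GPU, each pass would presumably map to its own render pass, which is an assumption about the implementation and not something stated in the thread.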
