Depth-of-field: almost there

I’m implementing DoF with a separable gaussian blur.
The radius of the CoC (derived from the depth buffer)
maps to the width of the blur kernel, so I have
variable-sized blurs per pixel. If the pixel is in
focus, I skip the blur entirely.
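
Roughly, one pass of the blur looks like this (a
simplified sketch; cocRadius() and the focusZ/cocScale
uniforms are placeholders for the actual depth-to-CoC
mapping):

    // One pass of the separable blur along 'dir' (sketch).
    uniform sampler2D colorTex;
    uniform sampler2D depthTex;
    uniform vec2  dir;        // (1/width, 0) horizontal pass, (0, 1/height) vertical
    uniform float focusZ;     // depth of the focal plane
    uniform float cocScale;   // placeholder mapping from depth difference to texels
    varying vec2 uv;

    const int MAX_RADIUS = 8;

    float cocRadius(float z)
    {
        return abs(z - focusZ) * cocScale;   // CoC radius in texels
    }

    void main()
    {
        float z = texture2D(depthTex, uv).r;
        float r = cocRadius(z);
        if (r < 0.5) {                       // in focus: early-out, no blur
            gl_FragColor = texture2D(colorTex, uv);
            return;
        }
        vec4  sum  = vec4(0.0);
        float wsum = 0.0;
        for (int i = -MAX_RADIUS; i <= MAX_RADIUS; ++i) {
            float w = exp(-float(i * i) / (2.0 * r * r));   // gaussian weight
            sum  += w * texture2D(colorTex, uv + dir * float(i));
            wsum += w;
        }
        gl_FragColor = sum / wsum;
    }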

Notice that I’ve solved the ‘halo effect’ for the image
in focus (pink flower in the middle): its edge remains
sharp. I do this by checking each sample’s Z during
blurring against the in-focus Z-range. If the sample
falls inside this range (i.e. inside the DoF), it
doesn’t contribute to the blur. In other words, I do a
check for every sample inside the blur kernel and
adjust the normalising divisor accordingly.
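
In shader terms, the blur loop from the sketch above
becomes roughly this (z1/z2 are assumed uniforms for
the near/far bounds of the in-focus range):

    // Per-sample coverage check (sketch): samples whose depth falls
    // inside the in-focus range [z1, z2] are rejected, and the
    // normalising divisor shrinks by the weights that were skipped.
    vec4  sum  = vec4(0.0);
    float wsum = 0.0;
    for (int i = -MAX_RADIUS; i <= MAX_RADIUS; ++i) {
        vec2  suv = uv + dir * float(i);
        float sz  = texture2D(depthTex, suv).r;
        if (sz >= z1 && sz <= z2) continue;   // sample is in focus: reject
        float w = exp(-float(i * i) / (2.0 * r * r));
        sum  += w * texture2D(colorTex, suv);
        wsum += w;
    }
    gl_FragColor = (wsum > 0.0) ? sum / wsum
                                : texture2D(colorTex, uv);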

Here’s the problem: the edge of the image (orange
flower) that sits in front of the focused image is
also sharp. It should be blurred, and there should
be ‘pixel bleeding’ into the focused image. The
front-most small image (red flower) shows the desired
effect on its edges. Any ideas? Thanks in advance!
Thanks in advance!

.rex

The problem stems from “If the sample falls inside
this range (i.e. inside the DoF), it doesn’t
contribute to the blur”. Replace the [z1;z2] “range”
with [z3;infinity).

Or, if the current fragment’s z is < z1, disable
that range-check.
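
In terms of that per-sample test, the second option is roughly (a sketch, reusing the names from the loop above):

    // When the current fragment sits in front of the focus range,
    // drop the range-check so its blur can bleed over the sharp
    // region behind it.
    bool applyCheck = (z >= z1);              // z = current fragment's depth
    if (applyCheck && sz >= z1 && sz <= z2) continue;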

Thanks, Ilian.

I was thinking of that, and was hoping there’s a
better way: solving it this way means there’s no
opportunity for an early-out on pixels in focus.
Instead, every pixel has to do the widest blur to
make sure the pixel bleed occurs.

Do you think there’s a way to make this DoF approach
faster? I started with correctness (knowing that a
gaussian is not technically correct) and will worry
about speed later.

.rex

With early-outs, I think you can mostly hurt the performance. Really, really hurt it.
Inside the shader, keep the number of texels fetched the same and just vary the CoC radius, as you’re already doing. This approach scales perfectly with peak GPU performance, and modern GPUs handle it easily.
The only real way to optimise the thing is to select a nice kernel-bin size: provide a smaller array size and a smaller Max_CoC for slower GPUs.
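
As a sketch, with a fixed tap count and only the spacing varying (NUM_TAPS and kernel[] are illustrative; cocRadius() as in the earlier sketch):

    uniform sampler2D colorTex;
    uniform sampler2D depthTex;
    uniform vec2  dir;
    uniform float maxCoC;               // Max_CoC, in texels
    varying vec2 uv;

    const int NUM_TAPS = 9;             // the kernel-bin size; tune per GPU tier
    uniform float kernel[NUM_TAPS];     // pre-normalised gaussian weights

    void main()
    {
        float z = texture2D(depthTex, uv).r;
        float r = min(cocRadius(z), maxCoC);
        // Fixed tap count: only the spacing varies with the CoC radius,
        // so every fragment does the same amount of work.
        vec2 tapStep = dir * (2.0 * r / float(NUM_TAPS - 1));
        vec4 sum = vec4(0.0);
        for (int i = 0; i < NUM_TAPS; ++i) {
            float o = float(i) - 0.5 * float(NUM_TAPS - 1);
            sum += kernel[i] * texture2D(colorTex, uv + tapStep * o);
        }
        gl_FragColor = sum;             // in-focus pixels degenerate to a no-op blur
    }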

I’ve removed the early-out for pixels in focus, so
each pixel now searches around itself for samples
that should bleed into it. But a new problem has
appeared: the blurred edges only blur along one
direction, and the corner case (bottom-right corner
of the front-most orange image) misses the blur
completely. This is a consequence of using a
separable blur: each pass can’t detect samples
along the other axis.
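
For reference, the gather in each pass now looks
roughly like this (a sketch, reusing the names from
the earlier ones; the sigma choice is crude):

    // Scatter-as-gather (sketch): every pixel, focused or not, walks
    // the widest kernel and accepts any sample in front of the focus
    // range whose own CoC radius reaches this pixel.
    vec4  sum  = vec4(0.0);
    float wsum = 0.0;
    for (int i = -MAX_RADIUS; i <= MAX_RADIUS; ++i) {
        vec2  suv = uv + dir * float(i);
        float sz  = texture2D(depthTex, suv).r;
        float sr  = cocRadius(sz);                // the sample's own CoC
        bool ownBlur = (abs(float(i)) <= r) && !(sz >= z1 && sz <= z2);
        bool bleed   = (sz < z1) && (sr >= abs(float(i)));
        if (!(ownBlur || bleed)) continue;
        float sgm = max(max(r, sr), 0.5);         // crude sigma; avoids div-by-zero
        float w = exp(-float(i * i) / (2.0 * sgm * sgm));
        sum  += w * texture2D(colorTex, suv);
        wsum += w;
    }
    gl_FragColor = (wsum > 0.0) ? sum / wsum : texture2D(colorTex, uv);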

Is there any solution to this? Is the approach
fundamentally flawed? The only obvious fix is to
drop the separable blur and fall back to the full
O(n^2) kernel for pixels in focus, but that is slow.

.rex

Yes, the two-pass approach has that flaw (it’s useful only when on each pass you squash the image along an axis).

Here’s an idea on how to get those wide-CoC blurs for cheap: create mipmaps in a special way, with 4x4 downsizing per pass instead of 2x2. On every pass, do that “coverage+normalization” thing, and put the minZ of those 4x4 texels into gl_FragColor.a. The first pass will be slightly different from the others, of course - fetching frag_z from the depth buffer instead of the current fragment’s alpha.
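
A sketch of one downsizing pass (prevLevel, texel and baseUV are placeholders; z1/z2 as before):

    // One 4x4 downsizing pass (sketch): coverage check + renormalisation,
    // with the block's minimum Z stored in the output alpha.
    uniform sampler2D prevLevel;   // previous (4x larger) level, Z in .a
    uniform vec2 texel;            // texel size of the previous level
    uniform float z1, z2;          // in-focus range
    varying vec2 baseUV;           // corner of this fragment's 4x4 block

    void main()
    {
        vec3  sum  = vec3(0.0);
        float wsum = 0.0;
        float minZ = 1.0;
        for (int y = 0; y < 4; ++y)
        for (int x = 0; x < 4; ++x) {
            vec4  s  = texture2D(prevLevel, baseUV + texel * vec2(x, y));
            float sz = s.a;        // first pass: fetch from the depth buffer instead
            minZ = min(minZ, sz);
            if (sz >= z1 && sz <= z2) continue;   // coverage check
            sum  += s.rgb;
            wsum += 1.0;
        }
        gl_FragColor = vec4(wsum > 0.0 ? sum / wsum : vec3(0.0), minZ);
    }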

There’s a cool technique from Pixar that uses, of all things, heat diffusion as the basis for DoF in interactive film previews. The gist of the bleed workaround is a layered render to handle the huge-CoC blur-behind artifacts: back, middle and foreground layers are dealt with separately, then composited.

Don’t have a link handy, but the 2006 paper is “Interactive DOF Using Simulated Diffusion on a GPU”.

Does this apply to the NV 200-series too?
What I’d imagine is that a warp will run as slow as
its slowest fragment. And if the average number of
samples the scene needs is less than the fixed
sample size, then there’s still an overall saving.
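
(As a worked example, assuming 32-wide warps: a warp’s
cost is the max of its fragments’ sample counts, not
the mean, so one 24-tap fragment next to thirty-one
4-tap fragments still costs 24 taps. The average only
wins if slow fragments cluster into the same warps.)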

The Pixar paper describes two things. The first is
using heat diffusion equations that can deal with
depth boundaries, so you don’t get the halo and
incorrect bleeding effects. A coverage-based gaussian
blur can do the same thing, as I’ve shown above.

The second is that you still have the problem of
missing information, due to (blurred) foreground
objects occluding pixels behind them. The heat
diffusion method doesn’t solve this. Two separate
passes are still required to capture the otherwise
occluded pixels, which is what the paper describes.
The configuration that shows the worst case is
blur, sharp, blur: midground objects that are sharp,
with blurred objects both in front and behind.

Here’s a new paper due for Siggraph Asia 09:
http://www.mpi-inf.mpg.de/~slee/pub/
“Depth-of-Field Rendering with Multiview Synthesis”

It uses a combination of rasterising and ray-tracing,
but the key element of the technique is an expanded
version of the Pixar one: split the view frustum
into N partitions, each covering a subset of the
[near, far] depth range. Each partition has its own
RGB layer (buffer), and pixels are sent to layers
based on their depth. The blur is then done by
casting rays through these layers and accumulating
their contributions.
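
A sketch of the layer assignment, assuming uniform
slices (the paper’s actual partitioning may differ):

    // Assign a pixel to one of N depth slices (sketch; uniform slicing
    // in view-space depth).
    const int N = 4;
    uniform float zNear, zFar;

    int layerIndex(float viewZ)
    {
        float t = clamp((viewZ - zNear) / (zFar - zNear), 0.0, 1.0);
        return int(min(t * float(N), float(N - 1)));
    }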

I’ve managed to move forward by separating the
foreground objects from the mid+background objects
and rendering them to separate layers. The foreground
layer is blurred on its own and then composited
over the mid+background layer. The previously
hidden pixels now show through after the foreground
object’s edges are blurred. The corners are fixed
and there’s no more incorrect bleeding.
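
The composite itself is a simple alpha blend (sketch;
fgTex holds the blurred foreground with its blurred
coverage in alpha, bgTex the blurred mid+background):

    // Composite the separately-blurred foreground over the
    // mid+background layer. The foreground's alpha is its blurred
    // coverage, so background pixels show through at softened edges.
    uniform sampler2D fgTex;
    uniform sampler2D bgTex;
    varying vec2 uv;

    void main()
    {
        vec4 fg = texture2D(fgTex, uv);
        vec4 bg = texture2D(bgTex, uv);
        gl_FragColor = vec4(mix(bg.rgb, fg.rgb, fg.a), 1.0);
    }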

Congrats :)!