Radiosity questions

Hi, I have a couple of questions about radiosity. Well, actually it’s about the hemicubes often used in radiosity, but that’s close enough to justify the attractive topic name.

My first concern is that when the light source gets further from the hemicube’s center, its projection on the hemicube gets smaller. Does this correctly account for attenuation, or should I do something about it?

Second, what is the minimum acceptable hemicube resolution? And how many hemicubes can you draw per second with a relatively simple scene, say 1000 faces? Does frustum culling with a BSP/octree speed things up with scenes that small? I’d be interested if someone had some results from the latest Radeon/GeForce cards.

Last, has anyone tried any alternatives to hemicubes? Like paraboloid projection, which would allow you to draw your scene only once per sampling point, as opposed to the five times required by a hemicube. Or at least using the top of a tetrahedron instead of a cube, which would be three faces.

-Ilkka

Originally posted by JustHanging:

My first concern is that when the light source gets further from the hemicube’s center, its projection on the hemicube gets smaller. Does this correctly account for attenuation, or should I do something about it?

Yes it does account for attenuation.


Second, what is the minimum acceptable hemicube resolution? And how many hemicubes can you draw per second with a relatively simple scene, say 1000 faces? Does frustum culling with a BSP/octree speed things up with scenes that small? I’d be interested if someone had some results from the latest Radeon/GeForce cards.

This is subjective, but I would say 128 is the minimum for good quality, though if you are using larger patches you can get away with less. The smaller the patches, the more resolution you will need. Quality is also affected by whether you have some kind of AA on or not (either polygon smooth or FSAA), so with FSAA on you might get away with less resolution.

On an XP2000 with a GF4 I can render 70,000 times a second, i.e. 14,000 hemicubes a second. This is CPU limited. That is for a 60-poly scene; I can increase the polys to about 1000 before it starts affecting performance. I do use culling, but not an octree. Culling allows me to have more polys in the scene before I hit a T&L bottleneck, but it doesn’t affect the more important CPU bottleneck.


Last, has anyone tried any alternatives to hemicubes? Like paraboloid projection, which would allow you to draw your scene only once per sampling point, as opposed to the five times required by a hemicube. Or at least using the top of a tetrahedron instead of a cube, which would be three faces.

I haven’t tried it but I have thought about using a vertex program for a hemispherical projection. I discounted it for the following reasons.

  1. You would need to tessellate the scene geometry (see the sketch after this list). Compromising on tessellation would have an impact on quality.
  2. You are no longer using fixed-function T&L, so there would be a performance hit.
  3. Some of my optimisations are tied in with the hemicube method, so there would be further performance impact for me.
  4. It’s difficult to be efficient about reading a spherical part of the frame buffer when trying to calculate the form factor.
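
For illustration, a rough sketch (plain C, names purely illustrative) of the per-vertex math such a hemispherical/paraboloid projection would need. Since only the vertices get remapped, the edges between them stay straight in the projected image, which is why reason 1 (tessellation) matters.

/* Illustrative sketch of a paraboloid-style hemispherical projection
 * applied per vertex (the kind of thing a vertex program would do).
 * Because only vertices are remapped, long edges do not curve in the
 * projected image, so the scene geometry needs fine tessellation. */
#include <math.h>

typedef struct { float x, y, z; } vec3;

/* Map a view-space position (eye at origin, hemisphere opening along +z)
 * onto the unit paraboloid disc; also return the distance for depth. */
static void paraboloid_project(vec3 p, float *u, float *v, float *depth)
{
    float len = sqrtf(p.x * p.x + p.y * p.y + p.z * p.z);
    float ix = p.x / len, iy = p.y / len, iz = p.z / len;

    /* Standard paraboloid mapping: divide by 1 + z. Directions on the
     * hemisphere boundary (z = 0) land on the unit circle. */
    *u = ix / (1.0f + iz);
    *v = iy / (1.0f + iz);
    *depth = len;
}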

[This message has been edited by Adrian (edited 02-05-2003).]

Originally posted by JustHanging:
My first concern is that when the light source gets further from the hemicube’s center, its projection on the hemicube gets smaller. Does this correctly account for attenuation, or should I do something about it?

As long as you apply the proper compensation for the hemicube shape (its difference from the hemisphere shape) and Lambert’s cosine law (attenuation caused by the angle between the light direction and the normal), you are set. See this excellent page for more information: http://freespace.virgin.net/hugo.elias/radiosity/radiosity.htm
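
To make the compensation concrete, here is a small sketch (plain C, resolution chosen arbitrarily) of how the top-face part of such a “multiplier map” could be precomputed; the side-face weights would use z*dA / (pi*(y*y + z*z + 1)^2) instead.

/* Sketch: precompute per-pixel weights (delta form factors) for the top
 * face of a hemicube. These weights combine the hemicube-vs-hemisphere
 * compensation with Lambert's cosine law. */
#define RES 128                          /* face resolution, just an example */

static float top_face_weights[RES][RES];

static void build_top_face_weights(void)
{
    /* The top face spans [-1,1]x[-1,1] at height z = 1 above the patch. */
    const float pi = 3.14159265f;
    float dA = (2.0f / RES) * (2.0f / RES);  /* area of one hemicube pixel */
    int i, j;

    for (j = 0; j < RES; j++) {
        for (i = 0; i < RES; i++) {
            float x = -1.0f + (i + 0.5f) * (2.0f / RES);
            float y = -1.0f + (j + 0.5f) * (2.0f / RES);
            float r2 = x * x + y * y + 1.0f;

            /* delta form factor for this pixel */
            top_face_weights[j][i] = dA / (pi * r2 * r2);
        }
    }
}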

Second, what is the minimum acceptable hemicube resolution?

That one is not very easy to answer. I think it depends very much on how small the details you need to capture are. A rule of thumb is that the smallest light source visible to a patch on a surface should be at least a few pixels in size in the hemicube rendering. If not, you may get irregular lighting across a surface due to the poor hemicube resolution. One way to solve this is to avoid small light sources (e.g. approximations of point lights). Another way is to calculate point lights separately from the hemicube rendering, i.e. use some technique similar to raytracing, but ONLY for the point light sources, and sum up the results from all light sources and the hemicube rendering.
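
Treating the point lights separately could look something like this sketch (plain C; visible() is a placeholder for whatever occlusion test you use, e.g. a shadow ray):

#include <math.h>

typedef struct { float x, y, z; } vec3;

extern int visible(vec3 from, vec3 to);  /* hypothetical occlusion query */

/* Direct contribution from one point light to one patch, to be added to
 * whatever the hemicube rendering gave for the same patch. */
static float point_light_irradiance(vec3 patch_pos, vec3 patch_normal,
                                    vec3 light_pos, float light_intensity)
{
    float lx = light_pos.x - patch_pos.x;
    float ly = light_pos.y - patch_pos.y;
    float lz = light_pos.z - patch_pos.z;
    float r2 = lx * lx + ly * ly + lz * lz;
    float r  = sqrtf(r2);
    float cos_theta = (patch_normal.x * lx +
                       patch_normal.y * ly +
                       patch_normal.z * lz) / r;

    if (cos_theta <= 0.0f || !visible(patch_pos, light_pos))
        return 0.0f;

    /* Lambert's cosine law plus inverse-square falloff. */
    return light_intensity * cos_theta / r2;
}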

And how many hemicubes can you draw per second with a relatively simple scene, say 1000 faces?

That depends on what technique you use. I have experimented with it a bit; I used 256x256 hemicube maps (I think) and a VERY simple scene (~10 surfaces). I also applied the “multiplier map” (hemicube compensation * Lambert’s cosine law) in hardware using blending. Finally, I used a custom floating point format (alpha channel = exponent) - be careful, as this requires some thinking to get it right. All in all, the only post-processing that had to be done was the custom floating point => IEEE floating point conversion and the sum of all pixels. Everything else was done in hardware using OpenGL.
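
The blending step could be set up in fixed-function OpenGL along these lines (only a sketch of the idea, assuming the multiplier map lives in a texture and ignoring the custom exponent channel):

#include <GL/gl.h>

/* Sketch: multiply the rendered hemicube face by the multiplier map by
 * drawing a textured quad with dst = dst * src blending. multiplier_tex
 * is assumed to hold the precomputed weights. */
static void apply_multiplier_map(GLuint multiplier_tex, int res)
{
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    glOrtho(0, res, 0, res, -1, 1);
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();

    glDisable(GL_DEPTH_TEST);
    glEnable(GL_TEXTURE_2D);
    glBindTexture(GL_TEXTURE_2D, multiplier_tex);
    glEnable(GL_BLEND);
    glBlendFunc(GL_DST_COLOR, GL_ZERO);   /* frame buffer *= texture */

    glBegin(GL_QUADS);
    glTexCoord2f(0, 0); glVertex2f(0.0f, 0.0f);
    glTexCoord2f(1, 0); glVertex2f((float)res, 0.0f);
    glTexCoord2f(1, 1); glVertex2f((float)res, (float)res);
    glTexCoord2f(0, 1); glVertex2f(0.0f, (float)res);
    glEnd();

    glDisable(GL_BLEND);
    glDisable(GL_TEXTURE_2D);
}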

So, what speed did I get? About 100-200 patches per second (on a Ti4200 + Athlon 700 MHz). A major bottleneck was glReadPixels and the sum loop performed by the CPU. I also rendered all five sides of the hemicube in “one go”, by using scissor tests to place them all in one frame buffer before reading back the pixels (five rendering passes, but only one glReadPixels call).
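
Roughly like this (a sketch of the packing, not the exact code; render_face() stands in for setting each face’s frustum, half-height for the side faces, and drawing the scene):

#include <GL/gl.h>

extern void render_face(int face);   /* placeholder: sets the face's frustum and draws the scene */

/* Sketch: pack the top face (res x res) and the four half-height side
 * faces (res x res/2) into one 3*res x res region, render with scissor/
 * viewport per face, then read everything back with ONE glReadPixels.
 * 'pixels' must hold 3*res x res RGBA pixels (3*res*res*4 bytes). */
void render_hemicube(int res, unsigned char *pixels)
{
    int half = res / 2;
    int x[5] = { 0,   res,  res,  2 * res, 2 * res };
    int y[5] = { 0,   0,    half, 0,       half    };
    int h[5] = { res, half, half, half,    half    };
    int face;

    glEnable(GL_SCISSOR_TEST);
    for (face = 0; face < 5; face++) {
        glViewport(x[face], y[face], res, h[face]);
        glScissor (x[face], y[face], res, h[face]);
        render_face(face);               /* five rendering passes... */
    }
    glDisable(GL_SCISSOR_TEST);

    /* ...but only one read-back for the whole packed layout. */
    glReadPixels(0, 0, 3 * res, res, GL_RGBA, GL_UNSIGNED_BYTE, pixels);
}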

Last, has anyone tried any alternatives to hemicubes? Like paraboloid projection, which would allow you to draw your scene only once per sampling point, as opposed to the five times required by a hemicube. Or at least using the top of a tetrahedron instead of a cube, which would be three faces.

No, but it should be possible for sure. You just have to figure out the compensation map.

Thanks a lot. That’s a great link, marcus.

As I said, this is not really about radiosity. The idea was to first render the scene without lighting, outputting depth and normal at each pixel. Then I would use hemicubes to calculate lighting at each pixel. That would be for direct light only, but the other advantages sound interesting.

This would give a pretty constant computing time for an unlimited number of free-form area lights, allowing for many great lighting effects. In addition, the lighting would be calculated with image-space accuracy and only for visible pixels. I wanted to know how close the latest graphics cards are to achieving this kind of lighting.

Adrian’s results sound quite promising; with some kind of adaptive sampling you might get close to 1 fps or so. I think the glReadPixels hassle and the hacked floats aren’t necessary here, since you can sum the light with multiplicative blending (like in my “lighting with exposure” thread) and get the exposed values straight from there.
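
Not the exact code from that thread, but the blending setup would be something along these lines: each pass draws its contribution already mapped through 1 - exp(-L), and the blend mode composes them so that dst’ = dst + src*(1 - dst) = 1 - (1 - dst)*(1 - src), i.e. the exposure of the summed light.

#include <GL/gl.h>

/* Accumulate light directly in "exposed" space: draw each light's
 * exposed contribution with this blend mode instead of reading back
 * linear values and summing on the CPU. */
static void begin_exposure_accumulation(void)
{
    glEnable(GL_BLEND);
    glBlendFunc(GL_ONE_MINUS_DST_COLOR, GL_ONE);  /* dst' = dst + src*(1-dst) */
}

/* ...draw each light's contribution here... */

static void end_exposure_accumulation(void)
{
    glDisable(GL_BLEND);
}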

-Ilkka

Adrian, that’s impressive. Did you use textured surfaces (I mean, in addition to the light maps)? Obviously you don’t use any “trick” to get a better range than 8 bits per channel(?) What kind of results do you get then? Acceptable? Limitations?

I suppose the CPU helps a lot too. I did not get much of an improvement moving from a GeForce256 to a GeForce4 Ti4200, which indicates that the CPU is doing most of the work. Plus I have slow PC100 memory, which is probably limiting glReadPixels etc. too.

[This message has been edited by marcus256 (edited 02-05-2003).]

Originally posted by JustHanging:
[b]Or at least using the top of a tetrahedron instead of a cube, which would be three faces.

-Ilkka[/b]

The single-plane method is very fast but has of course more aliasing problems than the hemicube. A modified single-plane method was presented by Recker in:
“Advanced Techniques for Progressive Refinement Radiosity”. SIGGRAPH 1990. The algorithm increases the resolution in the middle of the single plane as a second pass.

I would also try cubic tetrahedrons. Only 3 faces and they are identical. To account for the same resolution as the hemicube you would need to double the resolution of the faces compared to the top face of the hemicube.

Originally posted by JustHanging:
I think the glReadPixels hassle and the hacked floats aren’t necessary here, since you can sum the light with multiplicative blending (like in my “lighting with exposure” thread) and get the exposed values straight from there.

Would that suffice for multi-pass radiosity? I was aiming for at least three passes (by one pass, I mean updating the light maps for the entire scene). I would only use exposure for the result from the final pass. I think you need a linear scale with sufficient range in order to capture the effects of indirect lighting correctly.

One of the nice things with radiosity is that if you have static lighting, you can generate the light maps once and then use them as standard modulating textures, usually with >50 FPS display. The problem of course is dynamic light sources or moving objects that should cast shadows, but my idea was to do a mix (radiosity for static lighting, and OpenGL per-vertex lighting or something else for dynamic objects and/or light sources, etc.).

No, I don’t think it will work for indirect lighting, at least not in that simple form. The exposure would be applied to the light emitted from the walls, which is wrong. Exposure should be applied in the camera/eye, not in the middle of the light transport.

However, you might get it to work, somehow taking into account the fact that the light is accumulated in a nonlinear scale. I have no idea how, though.

-Ilkka

I do use a trick to effectively get more than 8 bits per channel.

At the moment I only calculate light contributions monochromatically. I could probably put the full colour calculation back in without too much performance loss, now that I’m doing things a different way.

Here’s a short animation of my radiosity renderer. It’s in black and white because it shows the light and shadows better. It was running at 2-3 fps. http://www.business-critical.co.uk/adrian/radiosity2.m1v

Here is a screenshot showing it in colour with per pixel lighting, bump and specular. There is a light source out of view to the right of the camera. http://www.business-critical.co.uk/adrian/Radiosity0003.JPG

The slight banding on the far wall is a result of the thin light on the left wall. I could fix it by giving the light some sides.

Originally posted by roffe:
A modified single-plane method was presented by Recker in:
“Advanced Techniques for Progressive Refinement Radiosity”. SIGGRAPH 1990.

Where can I get hold of this? I’ve tried google but no luck.

Nice shot Adrian.

I also notice some banding on the left wall (close to the camera). It looks as if it would be a result of 8 bits precision, or is it just JPEG distortion?

How many passes do you use (levels of indirect light)?

I wanted to accommodate multichromatic indirect lighting, including modulation by textures (such as the checkered floor in your shot), which means that I use multitexturing (surface texture * light map).

The trick is, of course, that since I use the alpha channel of the light map (and the frame buffer) as an exponent, the alpha channel must not be modulated. This is the format I use (basically):

red_float = red_8 * 2^(alpha_8 - 128)
green_float = green_8 * 2^(alpha_8 - 128)
blue_float = blue_8 * 2^(alpha_8 - 128)

(the -128 bias is not really important - only there to get some reasonable range - I use a LUT for the exponent part anyway)

It gives a tremendous range, so it should be similar to (256+8)*3 bits per pixel.
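
Decoding it on the CPU (using a LUT for the exponent, as mentioned above) boils down to something like this sketch:

#include <math.h>

/* Sketch: decode the custom pixel format above - mantissas in R/G/B,
 * shared exponent in alpha, bias 128. The LUT holds 2^(alpha - 128). */
static float exp_lut[256];

static void init_exp_lut(void)
{
    int a;
    for (a = 0; a < 256; a++)
        exp_lut[a] = powf(2.0f, (float)(a - 128));
}

static void decode_pixel(const unsigned char rgba[4],
                         float *r, float *g, float *b)
{
    float scale = exp_lut[rgba[3]];
    *r = rgba[0] * scale;
    *g = rgba[1] * scale;
    *b = rgba[2] * scale;
}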

Originally posted by marcus256:
I also notice some banding on the left wall (close to the camera). It looks as if it would be a result of 8 bits precision, or is it just JPEG distortion?

Is that the horizontal banding? I’m not sure what’s causing that, it’s not from the JPEG compression since it’s on the original bmp.


How many passes do you use?

Two.


I wanted to accommodate multichromatic indirect lighting, including modulation by textures (such as the checkered floor in your shot), which means that I use multitexturing (surface texture * light map).

I was thinking of doing a similar thing.

Originally posted by Adrian:
Where can I get hold of this? I’ve tried google but no luck.

Hmm, wrong title. Here is a link: http://portal.acm.org/citation.cfm?id=91385.91419&coll=portal&dl=ACM&CFID=7491316&CFTOKEN=12633262

The article is part of the proceedings of the 1990 Symposium on Interactive 3D Graphics, which is not available on the web. The University of Washington handles the ACM depository. http://rss.lib.washington.edu/rss

And no, I don’t have the article, sorry.

I just did a quick implementation of the floating point conversion (the heaviest part of my algo), and it turns out that on a 2 GHz P4 I can do about 600 buffer conversions per second for a 256x256 buffer (or 120 million floating point conversions per second). There is probably some clever way of doing this faster, but not by much, so with my implementation this is the major bottleneck.
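
The loop being timed is essentially of this shape (a sketch, not the exact code; here the per-channel sum is folded into the same pass, since that is what the form factor ultimately needs):

/* Sketch: convert one read-back buffer from the custom format to floats
 * and accumulate the per-channel sums in the same loop. */
extern float exp_lut[256];   /* 2^(alpha - 128), as in the earlier sketch */

static void convert_and_sum(const unsigned char *buf, int npixels,
                            float sum[3])
{
    int i;
    sum[0] = sum[1] = sum[2] = 0.0f;

    for (i = 0; i < npixels; i++, buf += 4) {
        float scale = exp_lut[buf[3]];
        sum[0] += buf[0] * scale;
        sum[1] += buf[1] * scale;
        sum[2] += buf[2] * scale;
    }
}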

The advantages are that you get similar resolution as if you had had a single precision floating point frame buffer, and you can still do most OpenGL-ish things with the light map (like modulating it with a surface color/texture).

What “trick” did you use, Adrian? (if you want to give it away)

hm… 120 million RGBE to float4 conversions (assuming you store 3 mantissas and one exponent, each 8 bits - that format is, as far as I know, called RGBE) with a 2 GHz processor? that’s 16 cycles per conversion, more or less, right?..
can that be faster?.. should be possible, yes… dunno by how much, though… and possibly you don’t get much more because of memory bandwidth anyway, dunno…

SSE3 could have an expansion from RGBE to a float4, and vice versa… would be nice

hm… 120 million RGBE to float4 conversions (assuming you store 3 mantissas and one exponent, each 8 bits - that format is, as far as I know, called RGBE) with a 2 GHz processor? that’s 16 cycles per conversion, more or less, right?..

Yes, more like 17 cycles, but anyway…

can that be faster?.. should be possible, yes…

I was a bit surprised myself, but it seems to be limited solely by memory bandwidth, because no matter what arithmetic I did in there, I ended up with the same figure (even a basic load -> store without processing). I couldn’t get the figures to add up, though, since I have 600*256*256 ~= 40 Mpixels/s, and each pixel requires 32 bits read and 3x32 = 96 bits written. In other words, 40*(32+96) ~= 5000 Mbit/s = 625 Mbyte/s total (or 5000/64 = 78 MHz on a 64-bit memory bus), which is much lower than the memory bus performance.

I will look at it some more…

…I know this is getting OT, but anyway…

I gave it one more try, now on my Athlon 700 MHz / PC100 memory, and I peak at 44.5 Mpixels/s (or 132 Mfloatconversions/s). Actually, I also included the sum of each component in the loop, which costs practically nothing.

If I only do an integer sum (cast the 8-bit components to 32 bits each, and then sum up all the pixels), I get 210.0 Mpixels/s, which is almost 5 times higher. So an SSE3 extension would be useful.

Originally posted by Adrian:
Here is a screenshot showing it in colour with per pixel lighting, bump and specular. There is a light source out of view to the right of the camera.

Very nice shot.

What’s the black “shape” on the ceiling above the black pillar? I presume that’s not the banding you are talking about (I’m looking at the image on a monitor which has a nice shiny screen with a large window behind me - I think I can see some vertical banding in the shadow behind the pillar…) (squint squint)…

Hey marcus, since most of your problems are related to gaining high precision and reading back the images, you might want to give the multiplicative blending thing a try after all.

Because when you think about it, you mostly just add lighting values anyway, so actually there shouldn’t be that big a problem in doing that in the exposed space from the beginning. The only troublesome part is modulating with the compensation map; I don’t know exactly how that maps to the exposed space. I would try just modulating with it; even if it causes some distortion, the results might look good anyway.

-Ilkka

Originally posted by rgpc:
What’s the black “shape” on the ceiling above the black pillar? I presume that’s not the banding you are talking about (I’m looking at the image on a monitor which has a nice shiny screen with a large window behind me - I think I can see some vertical banding in the shadow behind the pillar…) (squint squint)…

The black shape is an area light that is off. In the animation that light flickers on and off. Yes, the vertical banding behind the pillar is the banding I was referring to. It shows up more in the animation. It’s caused by the light polygon disappearing and reappearing when rendered from an acute angle. FSAA would help remove this, but my app stops working when FSAA is switched on for some reason. I was using polygon smooth, but there are issues with that specific to my app.

The biggest problem right now is the bump and specular calculations. I currently dot with an average light vector, but that approach has obvious flaws. It is going to be too expensive in texture memory to store a light vector and a half-angle vector for each light.