a good lighting model (dot3)

Ahoy!

The basic idea behind dot3 lighting seems to be to have a normal map, send the normalized vertex-to-light vector through glColor, have them dotted together, then modulate with the texture.

This already consumes 2 tex units, and it only does diffuse lighting.

Is there a better way? How do you add specular, shininess, ambient, emission?

And that method requires us to compute the vertex-to-light vector every frame, and it only handles one light source.

V-man

Multipass, my friend!

On a Radeon 8500, which has 6 texture units, you can allegedly cram ambient+bumped diffuse+bumped specular with separate diffuse & specular color maps, plus projected light with look-up attenuation, into one pass.

Needing one pass per light isn’t as bad as you’d think, because you need to do that anyway to make stencil shadow volumes work right.

Hey,
As far as I understand, dot3 lighting takes up 3 texture units, not 2: one for the normal map, and two for dot products with that map, one dot product for each coordinate of the texture to be addressed.

If I have this wrong, please let me know. thx.

As far as specular goes, it can be encoded into the same lookup texture as the diffuse: the diffuse can be stored in the RGB channels, the specular in the alpha.

The ambient can be the constant in a register combiner (this constant can be changed per frame if necessary).

zroo

>>>separate diffuse & specular color maps<<<

This I don’t know about. Is there a page that explains this? How to sum up the passes and do the calculations efficiently, letting the GPU handle most parts.
No code necessary, just theory.

Kind of strange that the tex units are limited to 2 on the original GeForce and GF2. Why not have the hardware reuse units and provide functionality for all 32 texture unit defines of GL 1.2?
Did performance issues stop them, or what?
If anyone knows a registry hack for GeForce …

I’m aiming for 2 tex unit hardware at minimum.

V-man

Originally posted by V-man:
The basic idea behind dot3 lighting seems to be to have a normal map, send the normalized vertex-to-light vector through glColor, have them dotted together, then modulate with the texture.

It’s not a good idea to send the light vectors through glColor ( at least not for specular ). The reason is that the precision is only 8 bits per component. For highest quality, use texture coordinates - these are usually 32-bit floating point per component.

You don’t have to recalculate the light vector every frame unless the light moves. If you have a lot of lights in your scene, most of them are probably stationary. You can still change brightness and range and add some flickering to make them seem more dynamic.
For specular you will need to update the half-angle vector every frame, since it’s view dependent ( and the viewer probably won’t stay in the same spot for long ). Specular is hard to get right - I have it working artifact-free on the Radeon 8500, but on all other hardware there are a couple of problems ( interpolation and banding artifacts ).

Anyway, it’s a large topic, and dynamic lighting is quite expensive ( I use 4-5 passes on GeForce1/2 ). The best source of information on this is still NVIDIA’s developer site.

Originally posted by V-man:
Kind of strange that the tex units are limited to 2 on the original GeForce and GF2. Why not have the hardware reuse units and provide functionality for all 32 texture unit defines of GL 1.2?
Did performance issues stop them, or what?

Pretty much. The difficulty with using a feedback into the pipeline for a second/third/etc pair of textures is that all of those textures need to be accessible. Even texturing from video memory is painfully slow, so graphics hardware uses a texture cache to buffer small parts of a texture for extremely fast access. When you do a feedback loop, you need one of the following:
A) For every pixel, load the first texture set, draw, flush the texture cache and load the next texture, draw, repeat.
B) Have a mechanism for sharing the cache among multiple textures
C) Have separate caches for each texture.

The problem with A is that it’s a performance waster. You waste tons of bandwidth loading and unloading textures between video memory and the cache. On top of that, you get excessive idle time while waiting for the textures to load, unless you build some type of batching system into the hardware (i.e. process 100 pixels partway, then switch textures and continue). Even then it’s still a memory access hog.

The problem with B is that you then only have half the cache available for each texture stage (or, if you want to loop back more than once, only 1/3, 1/4, etc. of the cache).

C is about the best option, but I think by the time you get that far in the hardware, you are probably a large portion of the way to just making those fully separate texture units.

>>>It’s not a good idea to send the light vectors through glColor ( at least not for specular ). The reason is that the precision is only 8 bits per component. For highest quality, use texture coordinates - these are usually 32-bit floating point per component.<<<

You lost me. What do you do with the texture coordinates, and what extension are you talking about? Texture shaders?
Another question that recently got me rethinking: should I use glColor4ub, or is there some hardware that could benefit from floats? The same could be said about textures - it may be time for more precision in textures and the various buffers.
Video memory is getting plentiful.
V-man

Originally posted by LordKronos:

A) Every pixel, load the first texture set, draw, flush the texture cache & load the next texture, draw, repeat.
B) Have a mechanism for sharing the cache among multiple textures
C) Have separate caches for each texture.

I don’t know much about the general architecture of 3D processors, but the processor clock and the memory clock seem to be quite close. I wish I could disable the cache and see its effect. How does overclocking the GPU/memory work anyway? The driver sends a signal???

Also, does each tex unit have its own cache, or is it one combined cache for textures/data/framebuffers?

V-man

I’m not a hardware designer or anything, so don’t take what I say as bible…
The main problems are bandwidth and latency.

Bandwidth:
Let’s just look at raw texture sampling and ignore everything else that has to contend for memory bandwidth. With a 32-bit texture, bilinear filtering requires 4 samples of 4 bytes for every pixel drawn. For a 100 MPixel/sec system (very low by today’s standards) you need 100M * 4 * 4 bytes/sec = 1.6 GB/sec of bandwidth. Now go to a 1 GPixel/sec system and you need 16 GB/sec. You want to make that 4 texture units? 64 GB/sec. Want to make that trilinear filtered (8 samples)? 128 GB/sec. Oh, what’s that? You want anisotropic filtering too? And don’t forget to count framebuffer and depth/stencil reads and writes, plus your vertex and index traffic. Starting to get pretty high bandwidth requirements, huh?

Latency:
Even video memory has several cycles of latency to it (not sure exactly how much). Every one of those cycles is time the pipeline is sitting around waiting (unless you design some type of “threading” system into the hardware, which I doubt would be practical). Cache usually has a single cycle of latency, and thus reduces the number of cycles it takes to complete a pixel.

[This message has been edited by LordKronos (edited 06-19-2002).]

I haven’t done much lighting (at least real-time) till now. I know it is generally better to do your own lighting, but won’t OpenGL lighting on TnL cards be faster? This method doesn’t take advantage of that, does it? What’s the best way?

Originally posted by V-man:

You lost me. What do you do with the texture coordinates, and what extension are you talking about? Texture shaders?
Another question that recently got me rethinking: should I use glColor4ub, or is there some hardware that could benefit from floats? The same could be said about textures - it may be time for more precision in textures and the various buffers.
Video memory is getting plentiful.
V-man

The texture coordinates can be used with a normalization cubemap, but you could use them with texture shaders for even higher-quality calculations. I think floats in glColor are converted to bytes, so if you use 4ub you’ll probably save some work. Higher-precision textures are good for specular; for diffuse, even a GeForce1 is enough for good results.