Originally posted by jwatte:
[b]> However, the real value of ‘state-driven’
> mechanisms over ‘compiled’ instructions is
> that it is often cheaper/easier to
> dynamically & programmatically alter them.
Perhaps it’s my tools/compilers background, but I don’t buy that. Ever heard of “sprintf()”?
[/b]
Hi jwatte,
Guess I didn’t make my point clear about ‘permutation management’ being better with ‘state-driven’ changes than with ‘compiled symbols.’ Note: this only applies when the formula for object shading changes dynamically over time.
I’ll try to illustrate, using NV_register_combiners as the ‘state-driven’ example and DX8 pixel shaders as the ‘compiled symbols’ example.
To make this example simple, we’ll assume all the vertex processing is done in custom software and we’re just interested in the pixel processing.
If we have an object that we shade with decal + diffuse & specular bumpmap + 1 shadowmap, we can easily write a DX8 pixel shader for it, and less easily (until we’ve written some decent support macros) code the same thing in Register Combiners.
All set. But let’s say that on the fly we may add a second or third light, which changes the number of shadows and affects the specular bumps. We’ll need to write another set of shaders for 2 lights and another for 3 lights. (Yes, arguably we could code a pixel shader for 3 lights and use bogus values to simulate just 1 or 2, but that’s rather impractical.)
Now let’s say that each light can project its image, in a ‘slideshow’ projector fashion. Now we need shaders for 1, 2, or 3 lights with 0, 1, 2, or 3 projectors. With reordering, that’s 12 potential shaders. As we add more possible combinations, such as blending between 2 decal or bump maps (versus just a static one), we create a lot of permutations.
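To put numbers on that growth, here is a rough Python sketch (Python only for brevity). It counts every pairing of light count and projector count the same way the figure of 12 above does, and the ‘decal mode’ option is just a hypothetical stand-in for the extra blending choice:

```python
from itertools import product

LIGHT_COUNTS = (1, 2, 3)            # lights affecting the object
PROJECTOR_COUNTS = (0, 1, 2, 3)     # lights that also project an image
DECAL_MODES = ("static", "blend")   # hypothetical: one decal vs. blending two


def shader_variants(*option_sets):
    """Return every combination of the given independent options."""
    return list(product(*option_sets))


base = shader_variants(LIGHT_COUNTS, PROJECTOR_COUNTS)
with_decal = shader_variants(LIGHT_COUNTS, PROJECTOR_COUNTS, DECAL_MODES)
print(len(base), len(with_decal))
```

Each extra independent option multiplies the count, which is why storing a prebuilt shader per combination gets out of hand quickly.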
It just isn’t practical to precompute and store them all, so we need another solution. Our engine stores shader fragments (‘sub-shaders’) and links/generates the complete set of shaders/passes on the fly.
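Roughly, the linking idea looks like the sketch below. This is not our engine code: the fragment names and the step strings are made-up placeholders (not real DX8 or Register Combiner instructions), and I’m assuming the first lights in the list are the ones carrying projectors.

```python
# Hypothetical sub-shader fragments; the step strings are placeholders only.
FRAGMENTS = {
    "decal": ["color = decal_map"],
    "bump": ["color *= diffuse_bump(light {i})",
             "color += specular_bump(light {i})"],
    "shadow": ["color *= shadow_map(light {i})"],
    "projector": ["color *= projected_image(light {i})"],
}


def link_shader(num_lights, num_projectors):
    """Concatenate fragments into one step list for the requested setup."""
    steps = list(FRAGMENTS["decal"])
    for i in range(num_lights):
        names = ["bump", "shadow"]
        if i < num_projectors:  # assumption: the first lights are the projectors
            names.append("projector")
        for name in names:
            steps += [step.format(i=i) for step in FRAGMENTS[name]]
    return steps


one_light = link_shader(1, 0)
three_lights = link_shader(3, 2)
```

Only the handful of fragments is stored; any particular combination is assembled when the scene asks for it.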
Here’s the catch though:
With DX8 we have to generate the shader program symbols and compile them, which can cause stalls and poor performance if you’re generating many shaders (even when caching a set of the 32 most recently used shaders).
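For reference, the caching scheme is along these lines. It’s a minimal sketch, not our actual code: the `compile_fn` callback stands in for the real assemble/compile step, and the class and attribute names are invented for illustration.

```python
from collections import OrderedDict


class ShaderCache:
    """Keep the most recently used compiled shaders; evict the oldest."""

    def __init__(self, compile_fn, capacity=32):
        self._compile = compile_fn      # stand-in for the costly compile step
        self._capacity = capacity
        self._entries = OrderedDict()   # ordered oldest -> newest use
        self.compile_count = 0

    def get(self, key):
        if key in self._entries:
            self._entries.move_to_end(key)      # mark as most recently used
            return self._entries[key]
        self.compile_count += 1                 # cache miss: pay for a compile
        shader = self._compile(key)
        self._entries[key] = shader
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)   # drop the least recently used
        return shader


cache = ShaderCache(lambda key: ("compiled", key), capacity=2)
for key in ["a", "b", "a", "c", "b"]:
    cache.get(key)
```

Every miss still pays the full compile cost, so once the scene cycles through more combinations than the cache holds, the stalls come back.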
With Register Combiners, we generate our shaders from linked lists of state changes and simply apply them. There’s not much of a performance cost here, and the code to handle this is much simpler than the DX8 version, as we have a rule set that handles the quirks of Register Combiners and of linking their states.
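In sketch form, the state-driven path is just a walk over the lists. Everything named here is hypothetical: the device class only records what it is told (no real GL calls), the state names and values are invented, and plain Python lists stand in for the linked lists.

```python
class RecordingDevice:
    """Stand-in for the graphics API: it only records the state it is given."""

    def __init__(self):
        self.state = {}
        self.calls = 0

    def set_state(self, name, value):
        self.state[name] = value
        self.calls += 1


# Hypothetical fragments: each is an ordered list of (state name, value) pairs.
DECAL = [("combiner0.input_a", "texture0"),
         ("combiner0.input_b", "primary_color")]
SHADOW = [("combiner1.input_a", "spare0"),
          ("combiner1.input_b", "texture2")]


def apply_fragments(device, fragments):
    """Apply each fragment's state changes in order; no compile step involved."""
    for fragment in fragments:
        for name, value in fragment:
            device.set_state(name, value)


device = RecordingDevice()
apply_fragments(device, [DECAL, SHADOW])
```

Swapping a fragment in or out only changes which lists get walked, so there is nothing to regenerate or recompile.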
If your 3D scene doesn’t require creating shading dynamically, then a compiled-symbols approach is great. If you do require mixing and matching shading fragments on the fly, then a state-driven approach works well.
Note: the only ‘state-configurable’ graphics HW controls that I know of are the ‘traditional fixed pipeline’ and Register Combiners, whereas everything else is compiled symbols, such as DX8 & NV_vertex_program. That’s why I was excited that maybe ATI’s approach would enable better ‘permutation management’ by making it easier to generate shaders on the fly.
See also:
NVLINK/NVASM
Final note:
It’s not the ‘state-driven’ versus ‘compiled symbols’ issue that matters here; it’s simply how easy/costly it is to generate new permutations on the fly.
Sorry this is long, but please lemme know if this is clear (or if I’m missing something).