Difference between register combiners and fragment programs



default
03-27-2003, 10:16 AM
Hi, I'm just wondering about the differences between fragment programs and register combiners. Are combiners used only when combining multiple textures in the same pass? It seems to me that fragment programs can do everything register combiners can do. Am I correct in this? If so, do fragment programs replace the need to use register combiners? Thanks.

Mark Kilgard
03-27-2003, 10:34 AM
Fragment programs are more powerful than register combiners, but fragment programs (whether using the multi-vendor ARB_fragment_program extension or the more powerful NV_fragment_program extension) require more recent GeForce FX GPUs for hardware acceleration. Users of earlier GeForce GPUs can enable the extensions via software emulation for development purposes, but software emulation is too slow for production use.

If you want to target the widest range of NVIDIA GPUs, use NV_register_combiners. If you want all the latest functionality, use ARB_fragment_program, or better yet, NV_fragment_program.

If you program with Cg (C for Graphics), you can write your programs in Cg and then simply indicate with your target profile whether to target register combiners or fragment programs. The "fp20" profile targets register combiners and texture shaders; the "arbfp1" profile targets ARB_fragment_program; and the "fp30" profile targets NV_fragment_program.

One problem is how to deal with this variety of profiles supported by various GPU generations. One solution is writing your Cg shader programs such that a single program can be recompiled, via conditional compilation, to target a basic profile such as fp20 for basic rendering but use the fp30 or arbfp1 profiles if more capable hardware is detected. This can allow you to adapt the shader quality to the available hardware.
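
The fallback idea can be sketched as a small selection routine. This is only an illustrative Python model, not Cg runtime code: `pick_fragment_profile` is a hypothetical helper, its input is assumed to be the space-separated string returned by glGetString(GL_EXTENSIONS), and the requirement that fp20 needs both GL_NV_register_combiners and GL_NV_texture_shader is an assumption drawn from the profile description above.

```python
# Hypothetical helper: map advertised OpenGL extensions to a Cg fragment profile.
# Preference order (most capable first) follows the post: fp30, arbfp1, fp20.

def pick_fragment_profile(extension_string):
    """Return the name of the most capable Cg fragment profile, or None."""
    extensions = set(extension_string.split())
    if "GL_NV_fragment_program" in extensions:
        return "fp30"
    if "GL_ARB_fragment_program" in extensions:
        return "arbfp1"
    # Assumption: fp20 needs both register combiners and texture shaders.
    if {"GL_NV_register_combiners", "GL_NV_texture_shader"} <= extensions:
        return "fp20"
    return None
```

An application would then compile the same Cg source for whichever profile name comes back, and fall back to fixed-function rendering when the result is None.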

The CgFX file format is really useful for this purpose. You can write a shader with CgFX and write multiple "techniques" in Cg embedded within the CgFX file for various GPU capabilities.

I hope this helps.

- Mark

Arath
03-27-2003, 11:06 AM
Mark, the Cg solution doesn't work with the ATI 8500 at this time (and don't tell me that one can use ARB_texture_env_combine). BTW, is the final combiner the only difference between DX8.1 pixel shaders and register combiners? I mean, nvparse can parse either code for a GF3, so is there any difference in terms of capabilities?

Arath

jra101
03-27-2003, 12:01 PM
Register combiners are a bit more powerful than DX8 pixel shaders, as they directly expose the GeForce3+ hardware.

vincoof
03-27-2003, 01:07 PM
Jason, correct me if I'm wrong, but NV_register_combiners doesn't expose the dot3_rgba functionality which is available in DX8.1 ps.

jra101
03-27-2003, 01:12 PM
dp4 in ps1.2+ is essentially the same as this:

dp3 r0.rgb, t0, t1
mad r0.a, t0.a, t1.a, r0.b

It's just internally changed into multiple instructions (it counts as 2 instructions).
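
To see why the dp3/mad pair gives a 4-component dot product, here is a small Python model of the two instructions. It is a sketch only: the function names are made up for illustration, colors are plain 4-element sequences in (r, g, b, a) order, and hardware clamping and fixed-point precision are ignored.

```python
def dp3(a, b):
    """3-component dot product of the rgb parts."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def dp4_two_step(t0, t1):
    # dp3 r0.rgb, t0, t1 : the scalar result is replicated into r, g and b
    d = dp3(t0, t1)
    r0 = [d, d, d, 0.0]
    # mad r0.a, t0.a, t1.a, r0.b : multiply the alphas, then add the dp3 result
    r0[3] = t0[3] * t1[3] + r0[2]
    return r0

def dp4_reference(t0, t1):
    """Plain 4-component dot product, for comparison."""
    return sum(x * y for x, y in zip(t0, t1))
```

Note that in this pair the full 4-component result ends up only in r0.a, while r0.rgb still holds the 3-component result.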

Coriolis
03-27-2003, 01:25 PM
You can do a 3 component dot product whose results get stored in all components of rgba with register combiners. If you use two combiners, you can also do a full 4-component dot product.

vincoof
03-27-2003, 11:45 PM
Originally posted by Coriolis:
You can do a 3 component dot product whose results get stored in all components of rgba with register combiners.

With two combiners I clearly see how, but I'd be glad to know how with one combiner.


Jason, if the dp4 instruction can be emulated using 2 combiners, imagine a pixel shader that issues 8 consecutive dp4 instructions: does that mean a GF3/4 needs 16 combiner stages to emulate such a shader?

[This message has been edited by vincoof (edited 03-28-2003).]

Asgard
03-28-2003, 12:52 AM
You can't have 8 dp4s in a D3D 8.1 pixel shader version 1.2 or 1.3.
Quote from the Direct3D documentation:

"For pixel shader version 1.2 and 1.3, dp4 counts as two arithmetic instructions."

Regards.

Asgard
03-28-2003, 02:17 AM
On the topic of reg combiners/D3D pixel shaders: Does anybody know how to properly implement the D3D pixel shader 'cmp' instruction with register combiners?

V-man
03-28-2003, 08:43 AM
In reg combiners, I think all you can do is use mux to get a cmp-like function.

Asgard
03-28-2003, 09:02 AM
Hmm, can you please be a bit more specific on how to do this? I'm looking for a good (or at least very close) replication of cmp's functionality.
If D3D pixel shaders can do it with two instructions, there should be a way to do it in two combiners, but I don't see how. Or maybe D3D pixel shaders use some GF3/4 feature not available through register combiners...

jra101
03-28-2003, 09:13 AM
Originally posted by Asgard:
On the topic of reg combiners/D3D pixel shaders: Does anybody know how to properly implement the D3D pixel shader 'cmp' instruction with register combiners?

You would use the mux operation. Here's an example that chooses between two textures depending on the value stored in the alpha of the primary color:



!!RC1.0
{
    alpha
    {
        spare0 = col0.a;
    }
}
{
    rgb
    {
        discard = tex1.rgb;
        discard = tex0.rgb;
        spare0 = mux();
    }
}

If col0.a is greater than 0.5, tex0 is stored in spare0; otherwise tex1 is used.
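
The selection logic of that combiner program can be modeled in a few lines of Python. This is a hedged sketch rather than anything from the driver or nvparse: the function names are invented, colors are (r, g, b, a) tuples, and the behavior at exactly 0.5 (treated here as selecting the second product, CD) is an assumption about the mux threshold.

```python
def mux(spare0_alpha, ab, cd):
    """Model of the combiner mux: pick AB when spare0.a < 0.5, else CD."""
    return ab if spare0_alpha < 0.5 else cd

def select_texture(col0, tex0, tex1):
    # Stage 1 (alpha portion): spare0.a = col0.a
    spare0_alpha = col0[3]
    # Stage 2 (rgb portion): AB = tex1.rgb, CD = tex0.rgb, spare0.rgb = mux()
    return mux(spare0_alpha, tex1[:3], tex0[:3])
```

The key limitation visible here is that a single scalar, spare0.a, decides the outcome for all three color components at once.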

Asgard
03-28-2003, 09:18 AM
Thanks Jason, that emulates D3D's cnd instruction, but not the cmp instruction (which performs a component-wise comparison >= 0). I don't really see a good way how this could be done (even when using more than one combiner stage).

vincoof
03-28-2003, 10:40 AM
With lots of combiner stages you could subtract the two colors to compare, then add 0.5, and for each component try the mux() instruction.
For a four-component component-wise comparison you need at least 5 combiner stages, but it is still possible.
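
The bias-then-mux idea can be checked against the cmp semantics described in this thread (per component, src1 if src0 >= 0, else src2) with a small Python model. This is a sketch under assumptions: the function names are hypothetical, src0 is compared against zero directly rather than being the difference of two colors, and the model neither counts combiner stages nor reproduces the hardware's clamping and limited precision, so values very close to zero could behave differently on a real GPU.

```python
def mux(spare0_alpha, ab, cd):
    """Model of the combiner mux: pick AB when spare0.a < 0.5, else CD."""
    return ab if spare0_alpha < 0.5 else cd

def cmp_reference(src0, src1, src2):
    """Per component: src1 if src0 >= 0, else src2."""
    return tuple(b if a >= 0.0 else c for a, b, c in zip(src0, src1, src2))

def cmp_via_mux(src0, src1, src2):
    result = []
    for i in range(4):
        # Bias the tested component so that "x >= 0" becomes "x + 0.5 >= 0.5".
        spare0_alpha = src0[i] + 0.5
        # One scalar mux per component: AB is the "negative" choice, CD the other.
        result.append(mux(spare0_alpha, src2[i], src1[i]))
    return tuple(result)
```

Each loop iteration stands for at least one mux of its own, which is why a four-component comparison costs several combiner stages instead of the two instruction slots D3D charges for cmp.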

jra101
03-28-2003, 02:31 PM
Originally posted by Asgard:
Thanks Jason, that emulates D3D's cnd instruction, but not the cmp instruction (which performs a component-wise comparison >= 0). I don't really see a good way how this could be done (even when using more than one combiner stage).

This is ps1.2 (NV25+) functionality that isn't exposed in OpenGL. You'll need to do it using multiple muxes.

Asgard
03-28-2003, 03:08 PM
Jason,
Thanks for the info. And I thought I was just too stupid to find a solution with two combiners :-)
Cheers.

jra101
03-28-2003, 03:12 PM
I was thinking the same thing, had to ask around to figure out what was up :-)

vincoof
03-31-2003, 01:33 AM
If OpenGL vendor-specific extensions directly expose the hardware functionality, how come some D3D features aren't available in OpenGL? Do we need some kind of NV_register_combiners3 extension?

Korval
03-31-2003, 09:30 AM
Originally posted by vincoof:
If OpenGL vendor-specific extensions directly expose the hardware functionality, how come some D3D features aren't available in OpenGL? Do we need some kind of NV_register_combiners3 extension?

Like what? What does D3D expose that OpenGL does not? If you're talking about that one opcode (the name of it slips my mind), bear in mind that it takes 2 instruction slots in D3D, which is exactly what RC's would take.

jra101
03-31-2003, 02:28 PM
Originally posted by Korval:
Like what? What does D3D expose that OpenGL does not? If you're talking about that one opcode (the name of it slips my mind), bear in mind that it takes 2 instruction slots in D3D, which is exactly what RC's would take.

The cmp instruction equivalent in ps1.2 is not exposed in OpenGL register combiners.

Asgard, was there a particular problem you needed the GL equivalent to the cmp instruction for?

Asgard
03-31-2003, 03:02 PM
Originally posted by jra101:
Asgard, was there a particular problem you needed the GL equivalent to the cmp instruction for?

Not really a particular problem. I just wanted to implement support for the cmp instruction in my DirectX pixel shader -> OpenGL reg combiners/texture shader translator that is part of my XEngine project (http://xengine.sourceforge.net).
It's not really all that important. Just a nice-to-have feature.

cass
03-31-2003, 03:23 PM
That's one of those things that I'd rather *not* have a special-purpose extension for. :-)

MZ
03-31-2003, 04:44 PM
Well, I used to have a problem. IIRC, I tried to implement addition with 16-bit per channel precision. Mux was needed to handle carry, and without a component-wise mux I could hardly fit all the computations in 8 combiners. Anyway, I abandoned this area of research.

V-man
03-31-2003, 08:02 PM
Originally posted by cass:
That's one of those things that I'd rather *not* have a special-purpose extension for. :-)



In other words, the list of extensions isn't going to be skyrocketing anymore.

Ah crap! That's the measure I used to buy my card.

cass
03-31-2003, 08:35 PM
I didn't say that either. :-)

There will still be extensions, but I think we've reached the end of the road for exciting variations on NV_register_combiners. New functionality will concentrate on extending the programming models in ARBvp and ARBfp and introducing new programming models as more units become programmable.

There will certainly be interesting features that will be vendor-specific, at least for a time. And, of course, some extensions will remain vendor- and/or hardware-specific.

Cass

Zengar
03-31-2003, 10:37 PM
Asgard, what about NVParse? I thought it could parse DirectX shaders.

Arath
03-31-2003, 11:13 PM
NVParse can only parse pixel shader 1.1, as stated in its documentation.

Cheers
Arath

vincoof
04-01-2003, 12:15 AM
Korval, the register combiner's mux instruction operates on the alpha component of the spare register, which means one comparison at a time. The pixel shader's cmp instruction performs a four-component component-wise comparison. So, IMHO, you need at the very least 4 combiner stages to compare four components, whereas the pixel shader counts it as two instructions. Maybe there's a trick that allows the same operation to be performed in two combiner stages, and in that case I'd be very glad to know that trick!

Anyway, I think that cass answered my question indirectly.
Thanks cass !

Asgard
04-01-2003, 01:50 AM
Originally posted by Zengar:
Asgard, what about NVParse? I thought it could parse DirectX shaders.

Like Arath said, it only supports up to ps_1_1 and there are quite a few bugs in the ps_1_1 support of nvparse. Additionally, nvparse doesn't perform all necessary semantic checks (if you look at my parser it is quite a bit more complicated than nvparse).
Therefore, I wrote my own translator quite a while back, but it's not possible to support all ps_1_2 and ps_1_3 features with register combiners/texture shaders.

In particular:
- Only the texm* texture addressing instructions can use the _bx2 modifier (just as in ps_1_1).
- The texm3x3 and cmp instructions are not supported.
- It's not possible to use the negation (-r0) or negation+bias (-r0_bias) source register modifiers on a source register that was used as a destination register of an instruction using the _sat modifier before (just as in ps_1_0 and ps_1_1).

Maybe I'll implement the cmp instruction one day using multiple combiners. Also the last issue could probably be resolved by using an intermediate combiner. I don't know why I didn't do that when I wrote the translator...it's been a while since then and I can't remember all the details.

[This message has been edited by Asgard (edited 04-01-2003).]

Arath
04-01-2003, 02:16 AM
Asgard, you're right, many things are not possible in OpenGL with RC/TS. But on the other hand, we can do two instructions in one stage (like a double dot product) and we've got control over the final combiner. In fact, register combiners are like the pixel shader instructions on the Xbox. And as cass suggested, everyone should update to fragment programs soon, but I still code for the GF3 owner (as I own a GF4 at this time).

Cheers,
Arath

Asgard
04-01-2003, 02:22 AM
Arath, register combiners are in a lot of ways more powerful than D3D shaders. Don't get me wrong. I'm not complaining ;-) I just wanted to know if somebody knew how to implement the D3D cmp instruction with register combiners, because I thought I was too stupid to see how it could be done with 2 combiners (since that's what the D3D docs imply). Now that I know that D3D uses functionality not exposed in the register combiners I can happily live with it...knowing that I wasn't too stupid to see a solution with 2 combiners ;-)

vincoof
04-01-2003, 02:27 AM
Fragment programs are not going to be widely supported for a while. Lots of people bought a GF3/4 or Radeon 8500 recently and won't "upgrade" for a year or two (or even more).

ScottManDeath
04-02-2003, 01:48 AM
Hi

There is the ATI_text_fragment_shader extension for the R2xx chips.

Perhaps nVidia can make a {ARB|NV}_fragment_program based (program management ...) extension with a text interface for NV2X (RC, TS) chips. It could be based on the nvparse syntax and add further hardware functionality (such as the mentioned cmp instruction). IMO it would be useful for collaboration with fragment combiner programs and Cg.

Bye
ScottManDeath

vincoof
04-02-2003, 01:52 AM
ScottManDeath, Just a side note - I've got the latest drivers for ATI Radeon 8500 and the ATI_text_fragment_shader extension is still not supported.

ScottManDeath
04-02-2003, 06:33 AM
Originally posted by vincoof:
ScottManDeath, Just a side note - I've got the latest drivers for ATI Radeon 8500 and the ATI_text_fragment_shader extension is still not supported.

Hi

IIRC it's only supported on the Mac. Perhaps someone should persuade the ATI && nVidia driver guys to provide Win32 & Linux users with a proper text-based extension for DX8-class hardware with ARB program management.

Bye
ScottManDeath

MZ
04-02-2003, 07:26 AM
Originally posted by ScottManDeath:
Perhaps someone should persuade the ATI && nVidia driver guys (...)

LOL, we had been trying hard to do so for over a year, haven't you noticed? :-) Now you may just forget it.