NOP equivalent for ATI_fragment_shader?



h2
09-26-2003, 03:08 AM
Hi,
What's the right way to handle the situation where I don't have any color instructions in the first pass? Obviously, I can insert something like MOV R0,R0, but maybe there's a better way to tell the driver that the second pass has started?

Korval
09-26-2003, 07:53 AM
That's something the driver should be doing invisibly. That way, it has the freedom to put code where it wants, since the user is not necessarily aware of the performance ramifications of various actions.

[edit]
Wait. I thought you said ARB_fragment_program. Never mind, then.

[This message has been edited by Korval (edited 09-26-2003).]

vincoof
09-26-2003, 09:23 AM
I don't know about ATI_text_fragment_shader, but ATI_fragment_shader requires a no-op before going into the second pass.
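
As a rough illustration of that requirement, here is a toy model in C. It assumes that a routing instruction (SampleMap/PassTexCoord) opens the second pass only when it follows at least one arithmetic instruction, which is why a first pass with no color ops needs a no-op. The names InstrKind, ROUTE, ARITH and count_passes are hypothetical and are not part of the extension's API; this is a sketch of the rule, not driver code.

```c
#include <stddef.h>

/* Hypothetical instruction kinds, NOT the real ATI_fragment_shader enums.
 * ROUTE stands for SampleMap/PassTexCoord, ARITH for a color/alpha op. */
typedef enum { ROUTE, ARITH } InstrKind;

/* Count passes under the assumed rule: a routing instruction that
 * follows an arithmetic instruction opens a new pass. */
static int count_passes(const InstrKind *prog, size_t n)
{
    int passes = (n > 0) ? 1 : 0;
    int seen_arith = 0;

    for (size_t i = 0; i < n; ++i) {
        if (prog[i] == ARITH) {
            seen_arith = 1;          /* pass now has an arithmetic op */
        } else if (seen_arith) {
            ++passes;                /* routing after arithmetic: next pass */
            seen_arith = 0;
        }
    }
    return passes;
}
```

Under this model, ROUTE, ROUTE, ARITH is a single pass, while ROUTE, ARITH (the no-op), ROUTE, ARITH is two passes.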

zeckensack
09-26-2003, 10:15 AM
Doesn't the driver know when the second pass starts?
I mean, PassTexCoord should be enough for the driver, if it depends on a register that has already been written by a previous SampleMap.

vincoof
09-26-2003, 10:27 AM
> if it depends on a register that has already
> been written by a previous SampleMap

But then you're assuming that the driver keeps track of used registers in order to know when one is being used twice. Well, granted, that would be possible.

But what's annoying about this double-register usage is that you must use a register twice for the driver to detect that the second pass starts. What if, for instance, you want to use registers 0 and 1 in the first pass and registers 2 and 3 in the second pass? How could the driver detect that the second pass should start?

Korval
09-26-2003, 10:34 AM
> What if, for instance, you want to use registers 0 and 1 in the first pass and use registers 2 and 3 in the second pass?

There's no reason to want that. If your ALU instructions don't do anything to registers 2&3, then all you're doing by putting 2&3 in the second pass is losing performance. You don't gain anything by it; you just lose performance.

zeckensack
09-26-2003, 10:45 AM
Originally posted by vincoof:
> But then you're assuming that the driver keeps track of used registers in order to know when one is being used twice. Well, granted, that would be possible.

Not exactly. "Use twice" might refer to arithmetic, and that won't require a phase transition. Dependent reads (or simply hitting the instruction count limit) will.
A dependent read is simply using a register as a texture address. Easy enough to detect.

By extension, PassTexCoord can access only texture coordinates in the first phase. If you start passing register contents, you must be starting the second phase.

If you use registers (as opposed to interpolators) as source operands to PassTexCoord or SampleMap, you enter the second phase. It really is that simple.
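
To make the proposed rule concrete, here is a toy model in C. It only looks at the source operand of each routing instruction (SampleMap/PassTexCoord) and reports where the second phase would begin if a driver followed this rule. SrcKind, SRC_INTERPOLATOR, SRC_REGISTER and second_phase_start are hypothetical names for illustration; this models the rule as stated here and says nothing about what any actual driver does.

```c
#include <stddef.h>

/* Hypothetical operand kinds: an interpolated texture coordinate
 * or a shader register written earlier. */
typedef enum { SRC_INTERPOLATOR, SRC_REGISTER } SrcKind;

/* Return the index of the first routing instruction whose source is a
 * register (where the second phase would start under this rule),
 * or -1 if every source is an interpolator (single phase). */
static int second_phase_start(const SrcKind *routing_srcs, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (routing_srcs[i] == SRC_REGISTER) {
            return (int)i;
        }
    }
    return -1;
}
```

For example, sources interpolator, interpolator, register, register would put the phase boundary at index 2.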


[This message has been edited by zeckensack (edited 09-26-2003).]

h2
09-26-2003, 11:43 AM
Ok, thanks guys.
So I'll just move PassTexCoord/SampleMap with a register source to the beginning of the second texture sampling block.

vincoof
09-26-2003, 12:37 PM
Hey h2, wait a sec. We're not even sure this will work :)
I think we agree that a driver could detect when a second pass starts without requiring a no-op, but we (at least, I) don't know whether such clever behaviour is currently implemented in the driver.

Anyway, what bothers me with that kind of automatic second pass detection is that it makes the calling order of glSampleMap/glPassTexCoord very important, whereas the second pass detection by no-op leaves glSampleMap/glPassTexCoord calling order optional.

h2
09-26-2003, 02:39 PM
Indeed, our method doesn't work :(
I ended up inserting MOV R0,R0. After all, it's ATI's problem if they write such crappy drivers/extensions.

vincoof
09-26-2003, 10:41 PM
First, crappy is not really the right word. The case where the first pass uses no operations is a pretty limited one.
Secondly, provided the hardware is well configured, that no-op may be free, in which case there would be no performance decrease at all.
That's just a suggestion, though; if someone from ATi could reply, maybe they could confirm that.
Otherwise, email devrel@ati.com; they will probably be able to answer that.

h2
09-27-2003, 02:36 AM
Ok, maybe "crappy" is the wrong word. But please don't advocate for ATI; it's their fault that they don't handle this case.

vincoof
09-27-2003, 03:09 AM
Looking over the ATI_text_fragment_shader extension, it looks possible to start a second pass without useless no-ops. Even if this is true, though, it limits usage to Mac OS X, since that's the only platform that supports the text version of fragment shaders, as far as I know.