vertex/fragment program performance



Roman Grigoriev
10-07-2003, 10:33 PM
Hi guys!
Could you please tell me whether there is a performance difference on NV30, in vp2.0 and fp1.0,
if I use, for example,
MUL R1.x,R0.x,c[1].x;
and
MUL R1.xxxx,R0.xxxx,c[1].xxxx;
?

Korval
10-07-2003, 10:52 PM
If you're asking about the performance difference between a scalar operation and a vector operation, there usually isn't one.

However, an R300-based card's fragment programs can co-issue many scalar operations with 3-vector operations, so that both execute simultaneously. It has been speculated that an NV30-based card can do so as well, though there is no proof of this yet.

Relic
10-08-2003, 03:11 AM
The difference is that the second line is infinitely slower: it wouldn't compile, because .xxxx is not a valid optional write mask. ;)
In a swizzle suffix, .x is just an abbreviation for .xxxx.

If you're interested in the differences between
MUL R1.x,R0.x,c[1].x;
and
MUL R1,R0.x,c[1].x;
there are some. For example, the second form writes all four components of R1, so you have three fewer components left for other values. That can make a difference in long programs.

Aside from that, there's not much to add to the previous post.
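
As an illustration (not part of the original posts), here is a minimal Python sketch of the semantics described above. It is a toy emulation, not a real assembler or driver API: the helper names `swizzle`, `is_valid_mask` and `mul` are hypothetical, and the mask rule (each component at most once, in x, y, z, w order) is assumed from the NV program grammar.

```python
# Toy model of swizzle and write-mask semantics; all names here are
# hypothetical helpers for illustration, not any real API.
COMPONENTS = "xyzw"

def swizzle(reg, suffix=""):
    """Return the 4-component source operand selected by a swizzle suffix."""
    if suffix == "":
        suffix = "xyzw"        # no suffix: identity swizzle
    if len(suffix) == 1:
        suffix = suffix * 4    # ".x" is shorthand for ".xxxx"
    if len(suffix) != 4 or any(c not in COMPONENTS for c in suffix):
        raise ValueError("invalid swizzle ." + suffix)
    return [reg[COMPONENTS.index(c)] for c in suffix]

def is_valid_mask(mask):
    """Assumed rule: each component at most once, in x, y, z, w order."""
    if any(c not in COMPONENTS for c in mask):
        return False
    idx = [COMPONENTS.index(c) for c in mask]
    return all(a < b for a, b in zip(idx, idx[1:]))

def mul(dst, mask, a, b):
    """Emulate MUL dst.mask, a, b: only masked components of dst are written."""
    if not is_valid_mask(mask):
        raise ValueError("invalid write mask ." + mask)
    mask = mask or "xyzw"      # an omitted mask writes all four components
    out = list(dst)
    for i, c in enumerate(COMPONENTS):
        if c in mask:
            out[i] = a[i] * b[i]
    return out

R0 = [2.0, 3.0, 4.0, 5.0]
c1 = [10.0, 20.0, 30.0, 40.0]
R1 = [-1.0, -1.0, -1.0, -1.0]  # pretend R1 already holds other values

# MUL R1.x, R0.x, c[1].x;  -> only R1.x changes
masked = mul(R1, "x", swizzle(R0, "x"), swizzle(c1, "x"))
# MUL R1, R0.x, c[1].x;    -> all four components are overwritten
full = mul(R1, "", swizzle(R0, "x"), swizzle(c1, "x"))

print(masked)                  # [20.0, -1.0, -1.0, -1.0]
print(full)                    # [20.0, 20.0, 20.0, 20.0]
print(is_valid_mask("xxxx"))   # False: fine as a swizzle, not as a mask
```

The masked form leaves R1.y, R1.z and R1.w untouched, which is why it keeps three more components available for other values.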

[This message has been edited by Relic (edited 10-08-2003).]

vincoof
10-08-2003, 09:51 AM
Originally posted by Relic:
The difference is that the second line is infinitely slower: it wouldn't compile, because .xxxx is not a valid optional write mask. ;)


Do you mean as an input or as an output? My guess is that .xxxx is perfectly valid on an input (as a swizzle) and not valid on an output (as a write mask).