
OT - Nvidia CineFX Architecture (R300 response?)



pocketmoon
07-22-2002, 02:01 AM
http://developer.nvidia.com/docs/IO/3121/ATT/cinefx_whitepaper.pdf

Nvidia perhaps trying to take the wind out of the R300's sails? ;)

A glorious piece of engineering nonetheless. *drool*

Maj
07-22-2002, 03:33 AM
Mmm, very nice.

Just one thing caught my eye: vertex shaders - temporary registers - 16 up from 12 (page 8). Seems a bit low, especially since every other number has jumped massively (1024 fragment program instructions?!).

PH
07-22-2002, 03:54 AM
That's the first time NVIDIA has disclosed anything related to future products. Well, they _have_ to exceed the Radeon 9700. ATI will apparently launch the Radeon 10000 around the same time as the NV30.

Do the R300 + NV30 have floating point frame buffers? A pbuffer with render_to_texture might work if not (since they have 128bit FP textures :) ).

EDIT: They do support 128bit FP textures, right?

[This message has been edited by PH (edited 07-22-2002).]

PH
07-22-2002, 05:05 AM
According to this article,
http://www.extremetech.com/article2/0,3973,388801,00.asp

128 bit data can be written to scratch memory in the frame buffer. Of course my interest in this is for multipass shaders.

zeroprey
07-22-2002, 08:34 AM
Is there any next gen card that will support OpenGL 2 or at least the current specs of it? 3Dlabs, I think, was claiming to, but I just looked at the spec again and it requires float color. I thought Nvidia would do it with theirs, but they say no loops in fragment shaders. Not to say that I'm not excited about this new gen coming, just that I thought I could see OpenGL 2 soon.

davepermen
07-22-2002, 09:11 AM
if you read the documents and the other topics and posts, you would know that the next gen will not yet provide full gl2 power, but dx9 power. dx9 is close to gl2 in feature list, except for branching in pixel shaders. but they have to have floating-point colors, and they do have them. hopefully there will be as many GL_GL2_ extensions as possible in the upcoming hw gen, and, as ati stated, the r300 will not be gl2, but the next one most likely will be..

PH
07-22-2002, 09:16 AM
I don't know about the Wildcat VP card ( regarding floating point color ) but it seems likely since 3Dlabs designed the GL2.0 proposal.

Looping can be implemented without an explicit looping instruction (by inlining) so it should be possible. With high precision floating point frame buffers and dependent texture reads, a multipass shader could implement all of the GL2.0 HLSL (I think :) ).
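To make the inlining idea concrete, here is a minimal sketch in Python of a compiler-side unroller. It assumes the trip count is known at compile time, and the instruction strings (DP3, MAD, lightdir, lightcol) are made up for illustration rather than taken from any real shader assembly.

```python
def unroll(body, count):
    """Expand a loop body into straight-line code.

    The Python loop below runs at "compile time"; the emitted program
    contains no loop or branch instruction at all.
    """
    program = []
    for i in range(count):
        program.extend(body(i))
    return program


def light_body(i):
    # Hypothetical per-light instructions, named for illustration only.
    return [
        f"DP3 tmp, normal, lightdir[{i}]",
        f"MAD color, tmp, lightcol[{i}], color",
    ]


program = unroll(light_body, 3)
for line in program:
    print(line)
```

The catch is that the program length grows with the trip count, so an unrolled loop eats into whatever instruction limit the hardware has, and a loop whose count depends on per-fragment data cannot be handled this way.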

Korval
07-22-2002, 09:55 AM
It's kinda sad that nVidia has been forced to react to the R300 like this. This "CineFX" merely describes the capabilities of the R300, but with higher instruction counts. That is the best advantage nVidia can come up with, when their card launches 4 months after ATi's. And they didn't even increase the number of constants all that much over NV2X (256. Still not enough to store all the matrices for high-quality skinning).

JackM
07-22-2002, 08:13 PM
It appears that Nvidia will be targeting the renderfarm business, as they acquired ExLuna today.

That's where those extra instructions will come in handy.

kon
07-22-2002, 11:45 PM
16 texture units?! Start writing some macros for enabling/disabling texstages ;)

Nutty
07-23-2002, 12:43 AM
What I want to know is, can we use 16 textures at full speed? Or is it half performance once you go beyond 8?

The majority of the time on the GF4, it's pointless using 4 textures, as it's faster to multipass, provided you can fit your algorithm into the constraints of multi-pass.

Should be sweet tho.. :)

Nutty

MZ
07-23-2002, 06:51 AM
Originally posted by kon:
16 texture units ?! Start writing some macros for enabling/disabling texstages ;)

oh yes, and don't forget about managing 6 texture-targets for each unit:
1D, 2D, 3D, cube, rectangle_NV, rectangle_EXT.
Great, isn't it? :rolleyes:

SirKnight
07-23-2002, 08:31 AM
I'm planning on getting an NV30 later when it comes out and I REALLY REALLY hope this card, and even the Radeon 9700 for that matter, will have double sided stencil support. I'm just dying to use that for my shadows. :)

-SirKnight

zeroprey
07-23-2002, 10:56 AM
In "SIGGRAPH 2002: Interactive Geometric Computations Using Graphics Hardware" on the nvidia dev page they say in the "graphics hardware futures" topic that there will be 2 sided stencil testing. Now this may be beyond nv30, but all the rest of the things they mention describe the same thing as the cinefx paper. So I guess an unofficial answer to that is very likely yes.

davepermen
07-23-2002, 11:02 AM
as i remember from nvidia, even the official answer is yes. ati stated it somewhere as well. sure, never in an official press release, but they do talk about it in papers as a feature of nextgen hw :)

oh, and btw. to all you fancy guys.. now with the ps2.0 powered gpu's we can do 8 or more fully accurate per-pixel phong shaded lights. in one pass. what do we do about the shadows? 8 stencil volume passes, or 24 cubemap shadowmap passes are needed for this. we're still FAAAAAAAAAAAAAR away from soft shadows as far as i can see.. any ideas? (except doing raytracing of the shadows at low res, there you could do all the 8 light sources in only 2 passes.. :) )

zeroprey
07-23-2002, 11:14 AM
How would you do the stencil test for multiple lights? I can see shadow buffers, but not more than one stencil shadowed light in one pass. Maybe you could put it in other buffers? That would be great if you can.

Nakoruru
07-23-2002, 12:26 PM
I was just thinking about how awesome it was going to be to be able to do so many lights in one pass, and now davepermen has to go and ruin it all! ^_^

But he is right. This plays right into what I was thinking the other day. I realized that shadow maps would eventually 'win', because the situation is analogous to the 'battle' between the z-buffer solution (which is fragment based like shadow mapping) and analytical solutions to hidden surfaces.

The z-buffer was initially slower than the painter's algorithm (or other solutions), but its complexity grew linearly with the number of polygons. The cost of sorting polygons grows faster than linearly (O(n log n) for the sort alone). So eventually the z-buffer became cheaper, and its simplicity allowed it to be hardware accelerated easily.

The same is true of shadows. While the accuracy and fill-rate requirements of shadow maps leave a little to be desired currently (IMHO), eventually these will be solved. Also, as polygon counts go up, the burden of silhouette finding goes up significantly, and ways to hardware accelerate the process are hard to imagine (at least for me). One would not dream of using stencil shadows with hardware displacement mapping (if you have had such dreams, please tell me how ^_^).

For these reasons, I think that shadow maps will eventually be -the- solution for shadows. I think that Doom 3 is pretty much the greatest use we will ever see of stencil shadows.
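As a toy illustration of the z-buffer versus painter's algorithm comparison, here is a sketch on a 1-D "scanline" where each fragment is an (x, depth, color) tuple. The function names and the fragment layout are assumptions made up for this example; a real painter's algorithm sorts whole polygons, not fragments.

```python
def zbuffer_resolve(fragments, width):
    """One depth test per fragment, in any order: linear in fragment count."""
    depth = [float("inf")] * width
    color = [None] * width
    for x, z, c in fragments:
        if z < depth[x]:  # keep the nearest fragment seen so far
            depth[x] = z
            color[x] = c
    return color


def painter_resolve(fragments, width):
    """Sort back to front first (O(n log n)), then overwrite blindly."""
    color = [None] * width
    for x, z, c in sorted(fragments, key=lambda f: f[1], reverse=True):
        color[x] = c
    return color


frags = [(0, 5.0, "far"), (0, 1.0, "near"), (1, 2.0, "only")]
print(zbuffer_resolve(frags, 2))
print(zbuffer_resolve(list(reversed(frags)), 2))
print(painter_resolve(frags, 2))
```

The z-buffer version gives the same image whatever order the fragments arrive in, which is the property that makes it easy to build in hardware; the painter version depends entirely on the sort.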

davepermen
07-23-2002, 12:31 PM
Originally posted by Nakoruru:
I was just thinking about how awesome it was going to be to be able to do so many lights in one pass, and now davepermen has to go and ruin it all! ^_^
sorry.. :)

But he is right. This plays right into what I was thinking the other day. I realized that shadow maps would eventually 'win' because they are analogous to the 'battle' between the z-buffer solution (which is fragment based like shadow mapping) and analytical solutions to hidden surfaces.

The z-buffer was initially slower than the painter's algorithm (or other solutions), but its complexity grew linearly with the number of polygons. The cost of sorting polygons grows faster than linearly (O(n log n) for the sort alone). So eventually the z-buffer became cheaper, and its simplicity allowed it to be hardware accelerated easily.

The same is true of shadows. While the accuracy and fill-rate requirements of shadow maps leave a little to be desired currently (IMHO), eventually these will be solved. Also, as polygon counts go up, the burden of silhouette finding goes up significantly, and ways to hardware accelerate the process are hard to imagine (at least for me). One would not dream of using stencil shadows with hardware displacement mapping (if you have had such dreams, please tell me how ^_^).

For these reasons, I think that shadow maps will eventually be -the- solution for shadows. I think that Doom 3 is pretty much the greatest use we will ever see of stencil shadows.
sorry again :)

i hope not.. at least, for real shadowmaps, you need cubemaps. cubemaps mean 6 renderings for a single pointlight. a simple shadowvolume pass plus a lighting pass eats up much less.. on the other hand, if we can get the boolean comparison BEFORE the filtering but right AFTER the sampling, we can get smooth shadows quite simply by sampling over the right area of a texture (with 32 samples or more from the anisotropic filter it is no problem to sample a long line..)

we'll see, but i think currently shadowvolumes can win the race.. for quite a while..

it just depends on the light sources (a lot of light sources can actually be projective)
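The compare-before-filter idea from the post above is commonly known as percentage-closer filtering. Here is a small sketch of why the order matters, using four made-up shadow map depth samples; the function names and values are assumptions for illustration only, not any hardware's actual filter.

```python
def filtered_then_compared(depths, receiver_depth):
    """Average the stored depths first, then do a single comparison.
    The result is always a hard 0.0 or 1.0."""
    avg = sum(depths) / len(depths)
    return 1.0 if receiver_depth <= avg else 0.0


def compared_then_filtered(depths, receiver_depth):
    """Compare each sample first, then average the boolean results.
    The result is the fraction of samples that see the light."""
    lit = [1.0 if receiver_depth <= d else 0.0 for d in depths]
    return sum(lit) / len(lit)


# Two samples hit a near occluder (0.2), two hit the far background (0.9).
samples = [0.2, 0.2, 0.9, 0.9]
receiver = 0.5

print(filtered_then_compared(samples, receiver))
print(compared_then_filtered(samples, receiver))
```

With the comparison done per sample, a receiver on the shadow edge ends up half lit, which is the smooth penumbra-like falloff being asked for; filtering the depths first just moves the hard edge.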

Funk_dat
07-23-2002, 12:58 PM
linky change...
http://developer.nvidia.com/docs/IO/3121/ATT/CineFX-TechBrief.pdf

Funk_dat
07-23-2002, 01:06 PM
hehe..ya gotta love Nvidia's marketing.

"NVIDIA's "CineFX" architecture enables real-time cinematic-quality rendering for the first time ever!"

Didn't they say that last time?

davepermen
07-23-2002, 01:18 PM
annoying...

good night :)

Nakoruru
07-23-2002, 02:13 PM
Yeah, 6 renderings per light is hefty. But I do not think that the cost grows as fast as scenes get more complex. That is the key.

Question, say we have a 200,000 polygon scene and a 2 million polygon scene. Is the cost of finding all the silhouette edges and optimizing the shadow volumes 10 times greater for the 2 million polygon scene, or much more?

I am pretty sure that the cost will grow only 10 times using shadow maps, because even if we have to render the scene 56 times for 8 lights, the only increase is in the number of polygons.

I am guessing that the cost will be greater than 10 times using volumes. This is because I doubt that the silhouette edge finding algorithms are O(n) or better. Since I have not looked at optimized methods for finding silhouette edges I admit I could be wrong.

Oh well, I'm reluctant to go into more detail because it seems this deserves its own topic ^_^

I would tend to agree that it may be premature to declare Doom 3 the great last gasp for stencil shadows, because rendering the scene 56 times for 8 point-lights seems like it's a little far off ^_^

Of course, doing 8 passes when you could be doing 1 seems like a problem too.

You may be able to only render the cube faces that contain parts of the scene you are rendering. You could also cache the shadow maps and only rerender the faces which have moving objects in them. Hmmm, lots of possibilities ^_^
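One way to read the 56 figure above is 8 lights x (6 cube faces + 1 lighting render). That reading is an assumption, not something stated in the post, and the function names below are made up for this sketch of the arithmetic.

```python
def cube_shadow_map_renders(lights, faces_per_light=6, lighting_renders_per_light=1):
    """Scene renders per frame when every point light gets a full cube shadow map."""
    return lights * (faces_per_light + lighting_renders_per_light)


def polygons_submitted(lights, scene_polygons, faces_per_light=6):
    """Total polygons pushed through the pipeline per frame (no culling)."""
    return cube_shadow_map_renders(lights, faces_per_light) * scene_polygons


print(cube_shadow_map_renders(8))                      # all six faces per light
print(cube_shadow_map_renders(8, faces_per_light=2))   # if only 2 faces need redrawing
print(polygons_submitted(8, 2_000_000) / polygons_submitted(8, 200_000))
```

The render count is fixed by the number of lights and faces, so under these assumptions going from 200,000 to 2 million polygons multiplies the work by 10 and no more. Culling or caching faces, as suggested above, only lowers faces_per_light.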

folker
07-23-2002, 02:37 PM
Originally posted by Nakoruru:
I am guessing that the cost will be greater than 10 times using volumes.

The typical brute-force shadow volume algorithm (including extrusion in hardware by vertex programs, see for example the ATI paper) has a trivial O(n) edge detection. And the fill-rate will remain the same. Thus, such shadow volumes scale linearly with the number of scene polygons.
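For anyone who wants to see why edge detection can be linear, here is a CPU-side sketch in Python. It is not the vertex-program approach from the ATI paper; it only shows the underlying test: classify each face as facing the light or not, then flag every edge whose two faces disagree. It assumes a closed triangle mesh with outward (counter-clockwise) winding, and all names are made up for the example.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def silhouette_edges(vertices, faces, light_pos):
    # Pass 1: one facing test per triangle.
    facing = []
    for i, j, k in faces:
        normal = cross(sub(vertices[j], vertices[i]), sub(vertices[k], vertices[i]))
        facing.append(dot(normal, sub(light_pos, vertices[i])) > 0)

    # Edge -> adjacent faces. This map does not depend on the light,
    # so it could be built once per mesh instead of per frame.
    edge_faces = {}
    for f, (i, j, k) in enumerate(faces):
        for a, b in ((i, j), (j, k), (k, i)):
            edge_faces.setdefault((min(a, b), max(a, b)), []).append(f)

    # Pass 2: one test per edge.
    return sorted(edge for edge, fs in edge_faces.items()
                  if len(fs) == 2 and facing[fs[0]] != facing[fs[1]])


# Tetrahedron with outward-facing winding.
verts = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
tris = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
print(silhouette_edges(verts, tris, (5, 5, 5)))
```

Both passes touch each face and each edge once, so with a hash map for the adjacency the work grows linearly with the polygon count.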

Nakoruru
07-23-2002, 08:20 PM
folker,

That's good, because it is always good to have more than one solution to a problem. My first, naive guesses about edge detection gave me the intuition that it was not linear. But after more thinking I guess it can be done with one test per edge. But I think the CPU should be doing something better with its time.

The problems I see remaining are how to do multiple lights in a single pass, and how to do it when you have hardware displacement mapping or pn-triangles, or any other type of hardware generated geometry.

I see shadow volumes requiring more fill rate and polygon throughput than shadow maps. But, shadow maps require much more memory and texture units. Maps just seem to be a more general solution, they just shadow whatever you draw to the frame buffer (as long as it's not semi-transparent, I have not heard anyone talk about a semi-transparent solution for either type of shadow).

I am thinking that next generation hardware will be able to do shadow maps with enough precision and speed that they will become the favored solution, but that's just a guess ^_^

I have favored stencil for a while, but I can't help but feel they are doomed (no pun intended ^_^)

gking
07-23-2002, 10:12 PM
Shadow maps are very nice for shadowing, but they don't solve as many problems as stencil shadow volumes can, particularly with regard to volumetric effects.

For any technique where the volume of light is important (i.e., spotlights cutting through fog), or perhaps better worded, "the distance light travels through a scattering medium," you will need to be able to extract the analytical shape to get a good answer. Deep shadow maps are an option for software renderers, but support isn't quite there in hardware yet.

davepermen
07-23-2002, 10:15 PM
Originally posted by Nakoruru:
But after more thinking I guess it can be done with one test per edge. But I think the CPU should be doing something better with its time.

the technique folker refers to is entirely on the gpu, done in a simple and quite short vertexshader.

and you can even blur the shadows afterwards with image post processing (blur them 3d in imagespace according to the relation of distance to occluder and to light). will see if i can get this working when i have a vpu..

and shadowvolumes mean 1 pass each light. possibly without stencilbuffer even (we have now 4 free buffers to draw into), and possibly we can encode them all some way into the final buffer to store say 8, or up to 32 shadows in a texture. then we could do the lighting in one pass again..
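One way the "up to 32 shadows in a texture" idea could work is one bit per light in a 32-bit value per pixel. This is only a sketch of the packing, in Python rather than shader code; the function names and the bit layout are assumptions, and whether the hardware can do the bit tests in a fragment program is a separate question.

```python
def pack_shadow_bits(in_shadow_flags):
    """Pack one boolean per light into an integer, bit i = light i is shadowed."""
    mask = 0
    for light_index, shadowed in enumerate(in_shadow_flags):
        if shadowed:
            mask |= 1 << light_index
    return mask


def lit_by(mask, light_index):
    """True if the pixel is NOT in the shadow of the given light."""
    return ((mask >> light_index) & 1) == 0


def shade(mask, light_contributions):
    """Single lighting pass: sum only the lights this pixel can see."""
    return sum(c for i, c in enumerate(light_contributions) if lit_by(mask, i))


mask = pack_shadow_bits([True, False, True])  # shadowed from lights 0 and 2
print(bin(mask))
print(shade(mask, [1.0, 2.0, 4.0]))
```

The shadow volume passes would each write their own bit, and the final lighting pass reads the packed value once per pixel, which is where the 8+1 pass count would come from.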

davepermen
07-23-2002, 10:23 PM
Originally posted by Nakoruru:
The problems I see remaining are how to do multiple lights in a single pass, and how to do it when you have hardware displacement mapping or pn-triangles, or any other type of hardware generated geometry.
i think, given my post above (it's all in hw), you know the answer for the dispmap and the rest..


I see shadow volumes requiring more fill rate and polygon throughput than shadow maps. But, shadow maps require much more memory and texture units. Maps just seem to be a more general solution, they just shadow whatever you draw to the frame buffer (as long as its not semi-transparent, I have not heard anyone talk about a semi-transparent solution for either type of shadow).

generic shadow maps require much more fillrate and 6 times the geometry processing if you want them to look nice. as long as you can't render directly to a cubemap, no, shadowmaps are no generic solution. and about the transparent thingies: for alpha-tested transparency, shadowmaps are better, but for the rest, since shadowvolumes can solve analytical problems, shadowvolumes can possibly help there (together with a projective image of the geometry, dunno ;) )


I am thinking that next generation hardware will be able to do shadow maps with enough precision and speed that they will become the favored solution, but thats just a guess ^_^
24 passes on ever-changing rendertargets + 1 pass for the lighting of those 8 lights = 25 passes. shadowvolumes with one pass for the shadowvolume and one for the lighting will be 16 passes. and, as i told you, we can possibly encode the shadowvolumes of several lights into one texture (or simply bind say 8 shadowvolume textures), then we have 8+1 passes..

imho its much less stressing..


I have favored stencil for a while, but I can't help but feel they are doomed (no pun intended ^_^)
i don't think they are doomed yet. they are the only generic solution currently existing that is useful in realtime. but yes they get doomed, it's called doom3.. :)