when will software renderers be viable?



imr1984
02-04-2004, 09:13 AM
How soon will general purpose CPUs be fast enough at doing 3D math that we won't require graphics cards anymore? This will be great because we will be able to implement software renderers and have absolute control over what we want to do - we won't be limited by the graphics card's capabilities or bad driver support, for example.

I'm sorry this isn't specifically an OpenGL question, but it is an interesting one - don't flame me for being such a visionary :)

al_bob
02-04-2004, 09:35 AM
I'd say before 1996 or so. At that point, the release of the upcoming 3Dfx Voodoo will obliterate any form of CPU-based rendering in terms of speed (at comparable technology levels).

From then on, GPUs are poised to increase in rendering speed faster than CPUs, meaning CPUs will likely never catch up(*). GPUs are also becoming more programmable with time. Notice the inclusion of basic combiners in the upcoming TNT2, which will revolutionize graphics. This is only the first step: future products will increase programmability until eventually whole programming languages specifically geared towards graphics will need to be introduced.


(*) At least until "Good Enough" is reached, but that's not likely to happen any time soon.


maximian
02-04-2004, 09:35 AM
Not for the foreseeable future. I read an article very recently which said GPUs are capable of calculating at over 300x the rate of the CPU in certain special cases. Add to that, of course, the need for high-speed memory with low latency (rare for the CPU). I simply do not see it happening.

mikegi
02-04-2004, 09:41 AM
Answer: never ... at least for comparable generations of hardware. Way back in the early 1990s, I had a similar discussion with coworkers. We did drivers for PC graphics hardware. Many thought that the Pentium - and certainly the Pentium II - when combined with the new PCI bus would eliminate the need for dedicated graphics chips.

My reply was that graphics chip companies' very livelihoods depended on making their systems faster than a CPU drawing into a DIB. We were mainly concerned with 2D rendering. With 3D graphics the GPU utterly blows away a general-purpose CPU, and will for as far ahead as I can see. There are too many advantages to a dedicated 3D graphics subsystem: memory access, parallel processing, targeted caches, etc.

If you want to be a "visionary", come up with a good, hardware friendly OpenRT (Open RayTracer) spec. Something that has most of the flexibility of a purely CPU implementation but can be accelerated with hardware.


Originally posted by imr1984:
How soon will general purpose CPUs be fast enough at doing 3D math that we won't require graphics cards anymore? This will be great because we will be able to implement software renderers and have absolute control over what we want to do - we won't be limited by the graphics card's capabilities or bad driver support, for example.

I'm sorry this isn't specifically an OpenGL question, but it is an interesting one - don't flame me for being such a visionary :)

endash
02-04-2004, 10:02 AM
I'll just add as evidence this page: http://www.gpgpu.org/ which is all about general-purpose computation on GPUs. So yeah, for the time being the trend is moving computation onto the GPU, not off of it. Basically, the GPU allows for parallelism that CPUs don't offer. For example, you may have a SIMD processor on your desktop, but it's not your CPU.

zeckensack
02-04-2004, 10:19 AM
There's 35 Gflops in a Radeon 9700 Pro (at its 315 MHz default clock speed), and that's only counting the shaders. There's yet more processing power in the rasterizer, the various interpolators, blend units, etc.

I don't see general-purpose processors anywhere near that. What's the peak throughput of a P4 at 3.2 GHz? 6.4 Gflops?

* A vec4 operation counts as four scalar ops, obviously => 8*4*3 per clock for the fragment shaders (a MAD counts as two, plus a MUL in the mini-ALU), and 4*4 for the vertex shaders.
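Spelling that estimate out, using the per-clock op counts from the footnote (these unit counts are zeckensack's estimates, not vendor specs):

```cpp
// Back-of-the-envelope check of the ~35 Gflops figure above, using the
// counting from the footnote (estimated unit counts, not vendor specs).
#include <cstdio>

int main() {
    const double clock_hz = 315e6;          // R300 default core clock
    const double fragment = 8 * 4 * 3;      // 8 pipes * vec4 * (MAD = 2 ops + mini-ALU MUL)
    const double vertex   = 4 * 4;          // 4 vertex units * vec4
    const double gflops = (fragment + vertex) * clock_hz / 1e9;
    std::printf("~%.1f Gflops (shader ALUs only)\n", gflops);  // ~35.3
    return 0;
}
```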

daveperman
02-04-2004, 10:21 AM
GPUs are better only at rasterizing. It doesn't matter how fast GPUs are, they will never match a raytracer on the CPU in either quality or speed. Raytracers are the future. CPUs also have much better precision, which is a very serious matter for raytracers :D

endash
02-04-2004, 10:30 AM
daveperman, GPUs can use full 32-bit floats, can they not? Granted, mapping the raytracing problem into the graphics card domain isn't trivial, but there is an enormous amount of parallelism available on a GPU that you don't have on the CPU. Just for starters, it would be easy to initialize the rays for a raytracer on the GPU. But if you think raytracing is needed for good graphics, talk to Pixar. I don't believe RenderMan is a raytracer.

Ysaneya
02-04-2004, 11:06 AM
I'd also say never, at least as long as we have a separate, dedicated and highly specialized graphics processor. You can always say "today" that in 5 years, CPUs will be fast enough to do what we can do now on our graphics cards. But in the meantime, the standard for graphics will have evolved too. It's possible on today's CPUs to achieve Voodoo2/TNT-level graphics quality and performance, but who'd like to play a game with Quake 1/2's look when Far Cry, Half-Life 2 and Doom 3 are around the corner?

Y.

dorbie
02-04-2004, 12:27 PM
This is a pointless way of looking at the problem; the question never changes and has been asked for years. In some ways software has been viable for years (maybe even more viable in the past, relatively speaking), in other ways software never will be viable.

The first 3D engines on PCs were software only and were viable for Doom & Quake style graphics.

Dedicated hardware will always outperform general-purpose hardware, so software-only will always be at a disadvantage compared to the GPU. For some this means it isn't viable, because what they mean by viable is determined by the best currently available hardware capabilities. GPUs outperform CPUs even with SIMD instructions in the CPU and very programmable GPUs; that's not going to change, because the hardware is designed with different priorities.

Software today can outperform some of the hardware of yesterday; does that mean yesterday's hardware wasn't viable? No, it means that for some, software will never be viable, because determining viability requires a definition, and that definition changes with graphics hardware and the evolving graphics applications it enables.

So, it's a pointless question.

gltester
02-04-2004, 01:08 PM
I think the interesting question is:
"When will the CPU's 3D graphics be so good that a dedicated card has nothing extra of importance to offer us?"

My feeling is that this will occur within a few years. Have you seen the Doom 3 screenshots?

In a couple years a CPU will be able to do that without a GPU. And not long after that, it will be able to do photorealistic rendering. And then the cost of a dedicated GPU will no longer be justified for most people.

Jan
02-04-2004, 01:32 PM
The CPU is a general-purpose processor. It is able to do everything.
However, it is designed to be able to do everything, not to do one specific thing. Therefore it will ALWAYS be slower than something which is designed to do a specific thing.
This is not only the case in computing, but in everything. Evolution is the best example: the human being is able to do everything (which other animals are capable of), but it usually takes a LOT more energy, brainpower, etc. to accomplish it.

Therefore there will always be specialized technologies to do the same thing with less power and usually for less money. A CPU fast enough to do what today's combination of a CPU and a graphics card is capable of would cost a lot more, and that will never change.

However, graphics cards might change a lot. Maybe we will have chips able to do raytracing in a few years; we'll see. That's a big step, but it is a logical one.

And take a look at modern PCs. There is not only the CPU and the GPU. There is the CPU, the GPU, the soundcard (with its own, SPECIALIZED processor), there is the network-card (with its own SPECIALIZED processor/chip), there is a north- and a south-bridge, etc. etc. etc.

EVERYTHING in a PC is specialized. Even the CPU is specialized on being a general purpose CPU and FPU.

Having ONE processor which handles everything just doesn't make sense. Having a specialized graphics card which is fully programmable makes sense. It is faster and cheaper.

Jan.

gltester
02-04-2004, 02:06 PM
> Having ONE processor which handles
> everything just doesn't make sense.

Sure it does. We got by in single-processor mode for decades. We did have the FPU on a separate chip for a while, but look where that ended up!

The GPU will need its own chip only as long as there is still active research & development going on in the area of 3D graphics.

Once everything settles down and people are satisfied with the level of graphics quality, Intel and AMD will be dying to move the GPU onto the CPU, too.

dorbie
02-04-2004, 02:17 PM
gltester, you should read some of the posts in this thread (again?). When CPUs can do what Doom 3 does, GPUs will do more than that and Doom 3 graphics won't be enough. Extensions like MMX, SSE and SSE2 have tried to close the gap a bit, but they just can't get you there; there are very domain-specific, pipelined hardware optimizations dedicated to graphics performance on a GPU.

It's perfectly obvious, and the same argument has been doing the rounds for years. It's a bit like saying you'll never need more than 64K: equally shortsighted and assuredly wrong. It won't happen anytime soon.

Interestingly people are also asking when GPUs will become programmable enough that they replace the CPU for a lot of compute intensive stuff and put Intel out of business.

About this point in the discussion my head hurts as I picture two snakes swallowing each other tail first.

Ostsol
02-04-2004, 02:19 PM
Originally posted by gltester:
Sure it does. We got by in single processor mode for decades.

Um... yeah, but if you compare all those decades to the past 5 years, you'll have to note that there's a freaking massive difference in what we're expecting computers to do.


V-man
02-04-2004, 03:50 PM
Just curious: how fast can a P4 3.0 GHz (or equivalent AMD, G5, ...) render a single-textured, unlit cube in software mode?

Can it do a minimum of 60 FPS?

I'm thinking of Mesa here, but I guess Mesa has little optimization.

I too see the GPU taking over the CPU's functions, but I think the current PC design will last a very long time.

EG
02-05-2004, 01:47 AM
The latest CPUs still fall quite short of a TNT2 in terms of rendering speed, especially when you throw in some texture filtering or multitexturing (as far as software DirectX or Mesa allows us to judge it).

And given what we have seen during the last year or so, CPUs may well have fallen off the Moore's Law curve. I reckon the gap between CPUs and GPUs for 3D rendering will only get worse, as GPUs are moving up faster in terms of clock speed, parallelism and features.

crystall
02-05-2004, 02:33 AM
I am in the process of writing a software rasterizer, keeping it as close as possible to the OpenGL API so that apps can be recompiled against it easily, so I have a little bit of experience with this topic. I sincerely don't think that software rendering will ever be able to match the speed of dedicated hardware, simply because dedicated hardware is meant for that purpose and nothing else. Even 256-bit-wide buses attached to high-speed memory wouldn't put our CPUs on the same level as GPUs; their design is fundamentally different, as is their purpose. But software rendering has its advantages and is vastly superior on some fronts. Scalability is one of them: carefully optimized software rasterizers are not limited by the amount of data you push through them. I can easily increase the number of polygons I send to my rasterizer by a factor of ten and incur only a minor performance hit (mainly due to the polygon setup routines).
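For readers who have never written one, here is a minimal sketch of the kind of inner loop a software rasterizer is built around: a half-space (edge-function) fill of a single 2D triangle. It is purely illustrative - no clipping, no attribute interpolation, no optimization - and is not taken from the rasterizer described above.

```cpp
// Minimal half-space triangle fill: an illustrative sketch only. The core
// question per pixel is simply "is this sample inside all three edges?"
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

struct Vec2 { float x, y; };

// Twice the signed area of (a,b,c); > 0 when c lies to the left of edge a->b.
static float edge(const Vec2& a, const Vec2& b, const Vec2& c) {
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

void fillTriangle(std::vector<std::uint32_t>& fb, int width, int height,
                  Vec2 v0, Vec2 v1, Vec2 v2, std::uint32_t color) {
    if (edge(v0, v1, v2) < 0.0f) std::swap(v1, v2);  // enforce CCW winding

    // Bounding box, clamped to the framebuffer.
    int minX = std::max(0, (int)std::min({v0.x, v1.x, v2.x}));
    int maxX = std::min(width  - 1, (int)std::max({v0.x, v1.x, v2.x}));
    int minY = std::max(0, (int)std::min({v0.y, v1.y, v2.y}));
    int maxY = std::min(height - 1, (int)std::max({v0.y, v1.y, v2.y}));

    for (int y = minY; y <= maxY; ++y) {
        for (int x = minX; x <= maxX; ++x) {
            Vec2 p{x + 0.5f, y + 0.5f};              // sample at pixel center
            if (edge(v0, v1, p) >= 0.0f &&
                edge(v1, v2, p) >= 0.0f &&
                edge(v2, v0, p) >= 0.0f)
                fb[y * width + x] = color;
        }
    }
}

int main() {
    const int W = 64, H = 64;
    std::vector<std::uint32_t> fb(W * H, 0);
    fillTriangle(fb, W, H, {8.f, 8.f}, {56.f, 16.f}, {24.f, 56.f}, 0xFFFFFFFFu);
    return 0;
}
```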

Tzupy
02-05-2004, 02:34 AM
The x86 CPUs with 2-3 pipes are obviously not a challenge to the 8-pipe GPUs. But an interesting approach is to integrate the GPU together with the CPU (it's already happening with the chipsets, but at low speed and with unified memory, so performance is not great).
The advantage of the CPU versus the GPU is raw clock speed: 3.4 GHz versus 500 MHz.
What the CPU lacks is not so much instructions per second (a fine assembly program can do marvels) as memory bandwidth: 6.4 GB/s versus 32 GB/s.
Instead of increasing the L2 cache size of the CPU (or adding a 2 MB L3 cache), it would be possible to integrate a 2-4 pipe programmable GPU in the same die area.
Now, 2 pipes running at 3.4 GHz (or 4 pipes running at 1.7 GHz) would be faster than 8 pipes running at 500 MHz, IF the integrated CPU + GPU had a 256-bit graphics memory interface besides the 128-bit main memory one. Of course, a new motherboard design would be required to make it work (there would be a lot more pins and connections).
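Spelling out that pipe arithmetic as a very rough sketch (it treats pipes x clock as peak fill rate, which, as the next reply points out, glosses over most of what actually makes a GPU fast):

```cpp
// "Pipes times clock" as a crude stand-in for peak fill rate.
// This is only the multiplication from the post above, nothing more.
#include <cstdio>

int main() {
    std::printf("2 pipes @ 3.4 GHz : %.1f Gpixels/s\n", 2 * 3.4);  // 6.8
    std::printf("4 pipes @ 1.7 GHz : %.1f Gpixels/s\n", 4 * 1.7);  // 6.8
    std::printf("8 pipes @ 0.5 GHz : %.1f Gpixels/s\n", 8 * 0.5);  // 4.0
    return 0;
}
```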

Won
02-05-2004, 05:16 AM
CPU pipes are not really comparable to GPU pipes, so your exercise in multiplication doesn't actually mean anything. For example, the CPU would have to serialize the various OpenGL stages (think: vertex v. primitive v. fragment &c) that execute in parallel on a GPU, so the GPU has a pretty big advantage there.

And who says even Intel would be able to make a GPU run at the speeds they get their CPUs to run? And are you counting the number of pins this package would require? Power? Cooling? Cost? Yield? Lots of issues to be addressed here.

You CAN design a processor that is parallel and pipelined enough (like the chaining on the Cray-1) to perform graphics computation efficiently. Vector or stream models come to mind. I'm guessing the IBM/Sony/Toshiba Cell processor is going to support something like this; not only are the individual processor elements pipelined, but you can build a pipeline out of the processor elements.

There are various things (such as depth-buffer compression) that are very useful on GPUs but still might not be easy to implement on such an architecture. Also, a big part of what is unique to GPU design is how the FIFOs/caches are tuned to the behavior of graphics processing. Unless your caches have some kind of programmable associativity or eviction policy, you're basically SOL in that regard.

-Won

Zengar
02-05-2004, 08:14 AM
Originally posted by V-man:
Just curious: how fast can a P4 3.0 GHz (or equivalent AMD, G5, ...) render a single-textured, unlit cube in software mode?

Can it do a minimum of 60 FPS?


I was getting about 40 fps on a Pentium MMX with the software OpenGL implementation from SGI, so I guess a P4 would be much, much faster.
Half-Life 2 was playable in software, and looked not too ugly, on a Celeron 900 at 640x480 some years ago.

al_bob
02-05-2004, 08:27 AM
Half-Life 2 was playable in software, and looked not too ugly, on a Celeron 900 at 640x480 some years ago.
Really? :D

pkaler
02-05-2004, 11:51 AM
Run your favorite app with Mesa and then with your video card's drivers. That'll show you the magnitude of the difference.

gltester
02-05-2004, 11:52 AM
> Extensions like MMX, SSE and SSE2 have
> tried to close the gap a bit, but they
> just can't get you there; there are
> very domain-specific, pipelined
> hardware optimizations dedicated to
> graphics performance on a GPU.

dorbie,

I can read, I promise. I think you are missing my point, which is that there is a limit to how much graphics quality most people are going to want. How much better than photorealistic can you possibly ask for?

In only a few years, I think a general-purpose CPU will provide the best photorealistic images possible on existing displays. At that point I think it will make sense to have an integrated GPU.

davepermen
02-05-2004, 12:04 PM
I just realised this forum has a "davepermAn" in it...

Don't mix the two of us up, please... thanks :D

davepermen
02-05-2004, 12:14 PM
How much better than photorealistic can you possibly ask for?

Much better than Doom 3.

Oh, and all in all:
Dedicated HW will always outperform GPUs (general processor units :D :D) in terms of efficiency (if not performance, then energy consumption, heat, etc.).
GPUs, on the other hand, will always be the place to develop new stuff, because they always outperform dedicated HW in terms of scalability of features.

The question is: will GPUs one day be fast and efficient enough to drop the need for dedicated HW?
It always depends on the situation.

If it gets fast enough one day, then it will have to scale down energy consumption too, for laptops to be useful there as well.

If it gets that, we want it scaled down even further, for PDAs, and cellphones, and just about everything (cellphones merely need voice2text/text2voice and similar, all with high compression, etc.).

Dedicated HW will never die. The GPU will never die either.


This topic is not about rasterizing vs. raytracing, or precision, or anything like that, as davepermAn framed it.

Oh, and if you know SoftWire (softwire.sf.net), then you know what a P4 can do today. Yes, a simple cube can run at 60 fps or more :D, at high resolutions too. You can have Q3 levels running smoothly. Today's CPUs are rather fast - no GeForce FX / 9800 replacement, but anyway, combined with the scalability in features (which allows much more efficient processing of data, for example direct access to all data from the engine and the renderer at the same time, and other stuff), it's quite nice.

Zeno
02-05-2004, 12:15 PM
Just curious: how fast can a P4 3.0 GHz (or equivalent AMD, G5, ...) render a single-textured, unlit cube in software mode?

Pretty fast. Download the pixomatic demo and see: http://www.radgametools.com/pixomain.htm


I can read, I promise. I think you are missing my point, which is that there is a limit to how much graphics quality most people are going to want. How much better than photorealistic can you possibly ask for?

In only a few years, I think a general-purpose CPU will provide the best photorealistic images possible on existing displays. At that point I think it will make sense to have an integrated GPU.

I think you have a naive idea about the computational power required for "photorealistic" rendering. We're not two or four years away from it as you seem to think.

For example, the average render time per frame of LOTR was 2 hours (see http://www.wired.com/wired/archive/11.12/play.html?pg=2 ). If you want to do that at 30 FPS, you need to speed things up by a factor of 216,000. If CPU speed doubles every two years, you'll have to wait about 35 years before those frames come out in "real time".
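Spelling that estimate out as a quick check (using the 2-hour figure above and a doubling-every-two-years assumption, nothing more):

```cpp
// 2 hours per offline frame vs. a 1/30 s frame budget, and how many
// CPU doublings (at one every two years) it would take to close the gap.
#include <cmath>
#include <cstdio>

int main() {
    const double offline_s = 2.0 * 3600.0;          // 2 hours per frame
    const double budget_s  = 1.0 / 30.0;            // 30 fps target
    const double speedup   = offline_s / budget_s;  // 216,000x
    const double years     = 2.0 * std::log2(speedup);
    std::printf("speedup needed: %.0fx, ~%.0f years of doubling\n",
                speedup, years);                    // 216000x, ~35 years
    return 0;
}
```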

You are right to say that, once photorealistic rendering can be achieved on the CPU as well as the GPU, we won't need a separate GPU. You are way off, though, in how long you think it will take for that to happen.




gltester
02-05-2004, 12:26 PM
> For example, the average render time
> per frame of LOTR was 2 hours

LOTR was probably ray traced, I'm guessing. I don't think our computers will be able to ray trace in real time with photorealistic quality anytime in the next decade - not while using known ray-tracing algorithms, anyway. That's not even on the radar, in my opinion.

So when I said photorealistic, I meant a high enough resolution and high enough quality textures that the images look photographic even using standard lighting techniques.

Just to test whether I was really far off, I dug out Ye Olde WinQuake and ran a timedemo. Quake 1 was released when 320x240 was considered a "good" resolution and 640x480 was totally unplayable.

On my 650 MHz AMD PC I just ran a 640x480 WinQuake timedemo at 55 fps. On a hot new PC I bet WinQuake could easily break 100 fps. When did Quake 1 come out? Mid-'90s? So it's been around 8 years, I guess.

All else being equal, a simple projection suggests that in 8 years we will be able to run the current hot new game (Doom 3, almost out now) in a (theoretical) software-only mode.

Of course all else is not equal, but I'm guessing that the 3D industry is much more mature now than it was in the '90s. I think 3-4 years is realistic for expecting a CPU to do Doom 3-quality graphics, especially if Intel and AMD keep enhancing the SSE-type instruction sets.



ZbuffeR
02-05-2004, 12:53 PM
LOTR was probably ray traced I'm guessing.

True raytracing is almost never used in production - too slow and not useful enough, apart from doing realistic glass or water. Pixar's RenderMan is used very often, and it is not a raytracer at its core.

gltester
02-05-2004, 01:19 PM
I wonder what resolution they render at for movie screens (film)? I don't know, but it's probably massively higher than a computer screen's.

Graphics cards seem to have been advancing much faster than computer screens in both image quality and frame rate, and if so, GPUs will catch up with (and be limited by) screen quality sometime soon.

People say "oh my L33T 3D card runs Quake 3 at 500 fps". Maybe, but their whole computer didn't. Maybe it responded to their inputs 500 times per second but the monitor or LCD has a fixed refresh rate much lower than that...



endash
02-05-2004, 01:37 PM
At the lowest level, rendering will always be a massively parallel problem. (Just think how many photons are used to "render" a scene in real life.) For that reason, it will always be massively inefficient to have a single CPU do the job. Perhaps a super-ultra-hyper-threaded processor could, but then you have a GPU.

Also, whereas floating point arithmetic is a solved problem, rendering may never be. You can always add more complexity to a scene and always better approximate the way light behaves, but it will always be an approximation. For that reason I don't see the FPU/GPU analogy holding water.

gltester
02-05-2004, 01:47 PM
Originally posted by endash:
You can always add more complexity to a scene and always better approximate the way light behaves, but it will always be an approximation.


That I do agree with completely. Even if the GPU is effectively limited by the resolution and speed of the display device, you can still keep adding geometry, as much as you want.

Of course, there is a limit on the geometry too. Not in how much you can add, but in how much people will notice and/or care about.

When you are to the point that you are rendering every last little nail and tack in a building or screw in a vehicle, when every bone and ligament in the human body is being modeled somewhat accurately, does anybody care if you can add more detail? Will increasing the geometry help you sell more games? Nope.

Not while we are still using sub-4-megapixel displays, not one bit.

crystall
02-05-2004, 01:47 PM
Originally posted by gltester:
I wonder what resolution they render at for movie screens (film)? I don't know, but it's probably massively higher than a computer screen's.

Common movie resolution is usually (depending on your budget) 2k, 4k, 8k or 16k. That is 2, 4, 8 or 16 thousand pixels wide with the number of vertical pixels depending on the required aspect ratio.

gltester
02-05-2004, 02:19 PM
Originally posted by crystall:
Common movie resolution is usually (depending on your budget) 2k, 4k, 8k or 16k. That is 2, 4, 8 or 16 thousand pixels wide with the number of vertical pixels depending on the required aspect ratio.

So LOTR was probably 144 megapixels, by my estimation (16:9 aspect ratio at a high-budget 16k pixels wide). A 1600x1200 computer screen is less than 2 megapixels. A very simplistic estimate would be that the LOTR renderer could have generated one frame at computer-screen quality in around a minute and a half.
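For what it's worth, here is that estimate spelled out; the 16k-wide frame is the high end of the figures quoted above (film work was more commonly rendered at 2K-4K), so treat the result as an order-of-magnitude guess at best:

```cpp
// Pixel-count scaling of the 2-hour LOTR frame time down to desktop
// resolution. The 16k x 9k frame size is an assumption from the thread.
#include <cstdio>

int main() {
    const double film_px    = 16000.0 * 9000.0;      // assumed 16:9 at 16k wide
    const double monitor_px = 1600.0 * 1200.0;        // typical desktop screen
    const double ratio      = film_px / monitor_px;   // ~75x fewer pixels
    const double minutes    = 120.0 / ratio;           // 2 h scaled down
    std::printf("%.0fx fewer pixels -> ~%.1f minutes per frame\n",
                ratio, minutes);                        // ~75x, ~1.6 min
    return 0;
}
```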

Still way too slow for animation, and they probably had a whole render farm doing the processing instead of a single PC, but I hope somebody can see my point, which is that we are not all that far away from a practical quality limit. In a few years, increasing your game quality will require:


1) actual game design, not just hot graphics
2) waiting for hardware advancements other than the 3D card (monitor, CPU)


In many/most cases the CPU should be able to generate as much quality as the screen can reasonably represent. Even if not, we are talking about adding a cheap $50 3D card, not a $300 monster, in order to max out the display's abilities.

my opinion anyway

dorbie
02-05-2004, 02:57 PM
Possible improvements: increased resolution, antialiasing, lighting, global illumination, high-fidelity BRDFs, caustics, atmospherics and volumetric illumination, weather, motion blur, HDR and adaptation, depth of field, hair and cloth with full collision, global physics and materials, fluid dynamics, all of the above interacting... the list could go on ad nauseam, and all are problems that may be solved on the graphics cards of the future, but always with compromises, because the computational requirements are effectively limitless.

Do not assume that LOTR is the pinnacle of graphics achievement; it won't be, despite how impressive we all find it. Movies take all sorts of shortcuts that true 3D environments cannot; in addition they are often hand-crafted in many ways, rendered in pieces and composited.

More importantly, LOTR was a movie: it was not ray traced or rendered, it was predominantly *FILMED*. It could not have been made entirely with CGI even with today's best technology. Looking at a movie and citing it as an example of where 'real-time' graphics can go shows that even current offline rendering technology doesn't meet your standard for 'viability'. It reinforces the belief that it probably never will.

I'm not even going to comment on the jaded gameplay criticism except to say you don't get a lot of people playing space invaders these days. There's a good reason for that.

It just boggles the mind that someone would say we will be waiting on advances in the CPU rather than the graphics and that the CPU will outstrip GFX performance. Ignoring the contradiction, we already wait for CPU memory and bus advances. See all the comments above as to why this is unlikely to happen any time soon, none of which you've directly addressed.


gltester
02-05-2004, 03:34 PM
This has dragged out longer than I intended with my original comment, but as an example of where I'm coming from, I'll pick on your example of caustics.

I've seen pretty cool water effects using nothing but multitexturing. Sure, there may be a million ways to make water look better, but we are no longer talking about advances as significant as, say, transparent water was in the original GLQuake.

What I'm saying is that all the low-hanging graphics fruit has already been picked. The jump from Doom 3 to Doom 5 (or whatever) will be a tiny hop compared to the jump from Doom to Quake 3. It is a mathematical certainty that people are going to notice less and less every time we come up with yet another effect, and not just because they are jaded, but because we are approaching a limit called "looking a lot like reality".

When Space Invaders was new people were like "WOW a computer that draws pictures!!!"

Now we are at "oh, this computer's pictures have improved light reflections in the pools, and look, the people's joints move realistically now". Not quite as exciting, and not as important to sales, IMO.



Zengar
02-05-2004, 03:54 PM
Originally posted by al_bob:

Half-Life 2 was playable in software, and looked not too ugly, on a Celeron 900 at 640x480 some years ago.
Really? :D

Yeah, I got my hands on some secret alpha code :D

Damn typos... I guess I've just been waiting too long for this game now...

gltester
02-05-2004, 04:36 PM
Thought maybe a concrete example would help since I don't seem to have explained myself in a way that you can at all respect:
http://graphics.ucsd.edu/~henrik/images/metalring.jpg

This is basically a small ray-traced picture with some caustics. You and I look at this copper ring with its reflected light pattern and think "coooooool", but most of the public (the mass market) probably wouldn't even notice if the light pattern was missing. Even without using any ray tracing, Doom 3 looks almost as good as this picture does.

All I'm saying is we are rapidly approaching a point where GPUs will be able to easily do all of the effects that most people notice, and that CPUs won't be far behind afterwards. Somewhere in there we may move the GPU onto the same die as the CPU. Or possibly we'll just drop the GPU altogether and do CPU-only rendering. Most people will be happy with that. The rest of us will shell out for hot graphics cards.

I've known plenty of people who are totally unable to tell the difference between a game running at 25 fps and one running at 100 fps. One programmer I used to work with had his damn monitor hooked up through a damaged cable causing visual echoes, and configured at a 60 Hz refresh rate, for like a year without noticing it, until finally I made him switch cables.

People are not as perceptive as we are here, and the state of the art is very close to the limit of what most people will spend $300 to "fix". In a couple of years, standing 10 feet away from the monitor, you won't be able to tell Doom from a photograph.

dorbie
02-05-2004, 05:07 PM
Yes, but my projective-texture caustics weren't real caustics. They don't model true refraction, and that is a class of problem that, if you wanted to go beyond eye candy and do it correctly, would be more expensive. Consider caustics from a dynamic surface simulation with a real interacting object and accurate treatment of depth: it's a volumetric problem. With all of these things there are degrees of fidelity, and you can always crank it up a notch to get things looking or behaving more accurately. For some stuff it doesn't matter; for other features and scenarios it's vitally important.

gltester
02-05-2004, 06:06 PM
Originally posted by dorbie:
... For some stuff it doesn't matter; for other features and scenarios it's vitally important.


Yes that's all true.

And maybe there are some games where caustics really would matter - not your typical FPS or flight sim, though. I could see how a game like Myst could look AWESOME with some good caustics, and it might even be a critical part of the game.

RTS games might reasonably need to show 100,000 troops moving in different formations on screen (much like LOTR). I can see that. But even a couple of years from now, computer screens will still only be maybe, say, 3200x2400 if we are lucky.

That's only 77 pixels per troop, with no room left over for ground or anything. I don't care if you've got the frigging CIA's secret supercomputer in your GPU; you are still limited in visual quality by the monitor at that point, and there's nothing you can do about it. No amount of geometry jammed into the GPU will make up for the shortcomings of the monitor.

AND once we hit that point, GPUs won't need to get much better, and CPUs will start catching up in abilities.



arekkusu
02-05-2004, 09:34 PM
Originally posted by gltester:
That's only 77 pixels per troop

Don't forget to factor in the mandatory 8 subpixel bits of precision (equivalent to 256x FSAA with ATI's crappy implementation) or more, once we finally ditch 24-bit framebuffers.

Also I think a lot of people in this thread are concentrating on only the APPEARANCE of the imagery and not the BEHAVIOR. It's one thing to render 100,000 troops. It's an entirely different thing to simulate the physics of them marching and fighting.

Given that accurate physics simulation can be just as computationally intensive as rendering (think: particle physics) wouldn't you rather have the CPU doing that while the GPU concentrates on tasks it is suited for? Or perhaps you'd prefer buying a dual-proc machine, or waiting an extra 18 months?

Specialized hardware isn't going away anytime soon.

jwatte
02-05-2004, 09:35 PM
Think of it from a memory bandwidth perspective.

The Pentium IV has 6.0 GB/s. The best graphics cards claim to get close to 30.0 GB/s. And, when you have that with your graphics card, you can use the CPU for animation, AI and physics.

Personally, I think the next big step will come in production values, rather than glitz: making sure humanoids move like humans, not like robots from a '40s sci-fi movie; deriving subtle facial expressions from context; that kind of thing. Making sure that all art follows a common style, is similarly proportioned/dense, etc.

Korval
02-05-2004, 10:05 PM
equivalent to 256x FSAA with ATI's crappy implementation

Huh? I thought it was generally accepted that ATi had very good antialiasing. They have that shifted-grid sampling thing going on. And the gamma-correct thing.


Also I think a lot of people in this thread are concentrating on only the APPEARANCE of the imagery and not the BEHAVIOR.

That's a really good point. It takes movie animators weeks to get a 1-minute section of film together, for realistic movement. Computers have precisely 1 minute (in approximately 33 ms or less increments) to figure out how to animate everything, collide it, etc.

Better to split up the tasks and parallelize.


Personally, I think the next big step will come in production values, rather than glitz.

I would say that that time has always been here. I've found that games with good art look much better than games with <insert effect here>. More important than caustics and so forth is basic consistency and beauty.

V-man
02-05-2004, 10:22 PM
gltester, I don't disagree with you when you say eventually the CPU will be fast enough, but the fact is, the GPU can get there first and may take over the CPU's functions.

I don't find Doom 3 satisfactory in terms of graphics. There is much room for improvement, not to mention AI and physics.

One thing is for sure. When I look at CGI, I can tell it is CGI. Even those fancy movies aren't photorealistic, except for certain shots.

dorbie
02-05-2004, 10:53 PM
When you can have a simulated physical world, and when, in it, you can incidentally wander up to a fire engine, grab a hose and start blasting dirt apart stone by stone and walls brick by brick, meanwhile enjoying the entirely incidental rainbow effect caused by light refracting in the stray spray as you slip around in the mud trying to keep your footing, we'll have gone a small way towards simulating physical reality in sufficient detail.

I think you underestimate the point at which people will say things are good enough; it is an old mistake. Even when things look and behave realistically, it won't be good enough for all applications.



arekkusu
02-06-2004, 01:30 AM
Originally posted by Korval:
I thought it was generally accepted that ATi had very good antialiasing.

Off-topic for this thread, but not in my experience (http://homepage.mac.com/arekkusu/bugs/invariance/FSAA.html). Maybe it depends on your definition of "good"... compared with Quartz, or with raytracing, where you can shoot an arbitrary number of rays through every pixel, it totally sucks.

ZbuffeR
02-06-2004, 02:36 AM
>> you underestimate the point at which people will say things are good enough, it is an old mistake.

Completely true. I remember seeing screenshots and videos of the original Ultima Underworld, back in 1992, and saying to myself "Wow, there is no need for better realtime graphics than THIS!" ... I was incredibly wrong ... And more than 10 years later, see this: http://www.pocket.at/pocketpc/ultimaunderworld.htm

Tzupy
02-06-2004, 03:50 AM
Originally posted by Won:
CPU pipes are not really comparable to GPU pipes, so your exercise in multiplication doesn't actually mean anything. For example, the CPU would have to serialize the various OpenGL stages (think: vertex v. primitive v. fragment &c) that execute in parallel on a GPU, so the GPU has a pretty big advantage there.

And who says even Intel would be able to make a GPU run at the speeds they get their CPUs to run? And are you counting the number of pins this package would require? Power? Cooling? Cost? Yield? Lots of issues to be addressed here.

You CAN design a processor that is parallel and pipelined enough (like the chaining on the Cray-1) to perform graphics computation efficiently. Vector or stream models come to mind. I'm guessing the IBM/Sony/Toshiba Cell processor is going to support something like this; not only are the individual processor elements pipelined, but you can build a pipeline out of the processor elements.

There are various things (such as depth-buffer compression) that are very useful on GPUs but still might not be easy to implement on such an architecture. Also, a big part of what is unique to GPU design is how the FIFOs/caches are tuned to the behavior of graphics processing. Unless your caches have some kind of programmable associativity or eviction policy, you're basically SOL in that regard.

-Won

Integrating the CPU and GPU on the same die would be a huge step indeed, but a new design for GPUs that would allow them to run at higher frequencies might come soon. ATI just licensed from Intrinsity a technology that should allow them to quadruple the frequency at which the processing units run. The performance of a 4-pipe GPU running at 1 GHz would be better than the performance of an 8-pipe GPU running at 500 MHz, due to the lower utilization of the latter. So it might be possible to see a GPU derived from the Radeon 9600 (but with 256-bit memory) that runs at 1-1.5 GHz, performing better than the R420.

In terms of die size, the 2 MB of L3 cache of the P4EE requires some 120M transistors. The R300 uses about 110M and the NV35 about 130M. I do believe that once the 90 nm process matures, it will be possible to integrate a 4-8 pipe GPU on the same die as the CPU. Having the CPU read/write directly to the graphics memory, and the GPU to the main memory, would cut a lot of the driver and AGP latencies (which are becoming increasingly important - read BatchBatchBatch.pdf from GDC 2003). It would also mean that one could write assembly code that runs directly on the GPU (bypassing the driver).

If you think only about the downsides of integrating the CPU and GPU, then there's no point in studying the issue thoroughly. But if, in the next 2-3 years, this becomes feasible, and someone actually does it, then they will hit the jackpot!

mikegi
02-06-2004, 06:46 AM
I wrote a photon-mapping raytracer shortly after reading Jensen's original paper. It could produce very nifty effects -- my favorite was having two glass spheres, with the caustics from one being focused by the other. I also had decent dispersion going, although it was difficult to eliminate "speckling" in the prism's spectrum. Anyway, to see the real power of RT you need to look at the work of Gilles Tran:
3D Images 1993-2003 (http://www.oyonale.com/histoire/english/index.htm) .

The good news is that all the programmability being added to GPUs will help get us to limited hardware-assisted RT. We need a more mechanistic approach to RT to get there, though, since the nifty things in RT usually involve mathematical techniques (isosurfaces, etc.). That's why I said we need an OpenRT spec, something HW manufacturers could shoot for!

For example, with a HW interval arithmetic unit (IAU), isosurfaces could be done on the GPU. Your app sends the function to a GPU compiler and the HW calls it repeatedly to do the convergence.
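To make the IAU idea concrete, here is a rough CPU-side sketch of the convergence loop such a unit would run: bound the implicit function over an interval of the ray parameter with interval arithmetic, discard intervals whose bounds exclude zero, and bisect the rest. The surface (a unit sphere) and every name in it are made up for illustration.

```cpp
// Interval-arithmetic isosurface intersection along a ray: bound f(ray(t))
// over [t0,t1]; if the bounds exclude 0, skip the interval, otherwise bisect
// until it is narrow enough. Illustrative sketch only.
#include <cstdio>

struct Interval { double lo, hi; };

Interval add(Interval a, Interval b) { return {a.lo + b.lo, a.hi + b.hi}; }
Interval addc(Interval a, double c)  { return {a.lo + c, a.hi + c}; }
Interval scale(Interval a, double c) {
    return c >= 0 ? Interval{a.lo * c, a.hi * c} : Interval{a.hi * c, a.lo * c};
}
Interval square(Interval a) {                 // tight bound on a*a
    double l = a.lo * a.lo, h = a.hi * a.hi;
    double hi = l > h ? l : h;
    double lo = (a.lo <= 0 && a.hi >= 0) ? 0 : (l < h ? l : h);
    return {lo, hi};
}

// f(p) = |p|^2 - 1 (the unit sphere), evaluated over the ray o + t*d.
Interval fOnRay(const double o[3], const double d[3], Interval t) {
    Interval sum{0, 0};
    for (int i = 0; i < 3; ++i)
        sum = add(sum, square(addc(scale(t, d[i]), o[i])));
    return addc(sum, -1.0);
}

// Returns true (and the entry t) once an interval that may contain the
// surface has been narrowed below the tolerance.
bool intersect(const double o[3], const double d[3], Interval t, double* hit) {
    Interval f = fOnRay(o, d, t);
    if (f.lo > 0 || f.hi < 0) return false;       // bounds exclude the surface
    if (t.hi - t.lo < 1e-6) { *hit = t.lo; return true; }
    double mid = 0.5 * (t.lo + t.hi);
    return intersect(o, d, {t.lo, mid}, hit) ||   // near half first
           intersect(o, d, {mid, t.hi}, hit);
}

int main() {
    const double o[3] = {0, 0, -3}, d[3] = {0, 0, 1};
    double t;
    if (intersect(o, d, {0.0, 10.0}, &t))
        std::printf("hit at t = %.6f\n", t);      // expect ~2.0
    return 0;
}
```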


Originally posted by gltester:
Thought maybe a concrete example would help since I don't seem to have explained myself in a way that you can at all respect:
http://graphics.ucsd.edu/~henrik/images/metalring.jpg

This is basically a small ray-traced picture with some caustics. You and I look at this copper ring with its reflected light pattern and think "coooooool", but most of the public (the mass market) probably wouldn't even notice if the light pattern was missing. Even without using any ray tracing, Doom 3 looks almost as good as this picture does.




maxuser
02-06-2004, 07:23 AM
Carmack pointed out (I think sometime during the development of Quake 3) that although he worked on first-order visual effects in the original Doom, he now focuses on second- and third-order effects, since advances in hardware have solved the first-order problems with finality. And as we've all experienced in watching the evolution of games over the years, the effects are getting more subtle and more detailed, allowing increased complexity, and we're taking more and more of those things for granted.

HOWEVER...

The human visual system is perhaps the most developed and most complex of all known biological systems. It has evolved over millions of years to be able to pick out the most subtle details of the most complex scenes. We can pat ourselves on the back and marvel at our technological advances in generating life-like images on a computer screen (believe me, I'm as excited as anyone about this stuff!), but even the most impressive screenshots of Doom 3 have fewer than a dozen characters on screen. Gollum in LotR is a *single* character, and he took hours to render for a single frame. Now imagine rendering human beings with the quality and resolution of Gollum; not one, but *thousands*, as they walk and bump into each other on the bustling streets of midtown Manhattan during rush hour, all at 16Kx9K, 60 fps. 216,000 times today's processing power? Try billions.

The "graphics innovations are almost dead" crowd has been reiterating that we're nearing the point where hardware is running out of interesting things to do, and CPU's will eventually catch up. The fatal flaw in that argument is that because we've come such a long way, advances are starting to plateau, therefore innovation and interest in further advancement will cease. True, advances are seeming to plateau, but it's not a plateau but an elbow in the curve. What it really means is that we're just now getting to the hard part. The easy problems have been solved, and what's left is the long, trudge ahead that step-by-step will get us ever closer to visual realism. But that path will likely not end in our lifetimes, if ever.

dorbie
02-06-2004, 08:31 AM
I have to laugh at subjective phrases like "first-order effects" vs. "second-order effects" when used like this. I think it's a useful observation in some ways; however, I would say that reasonably accurate shadows everywhere, with correct dynamic lighting, are indisputably a first-order effect, and Carmack has never delivered them in a published title, so presenting this statement retrospectively, as it applies to earlier titles, shows how dated it is now.

What is defined to be first order vs. second order vs. third order also tends to be determined by the technology and the problems a particular individual is focused on. Depending on your approach, second-order effects can become first-order effects and underpin or even replace your original first-order effects.


gltester
02-06-2004, 10:10 AM
I still think you guys are looking at the problem backwards!

I keep seeing different arguments based on the fact that current processors are eons away from being powerful enough to do *all* of the effects we wish we could do. Now look at it from the reverse (more correct) point of view. We don't generally need our pictures to look better than reality, so there's our ultimate goal. If our rendered animations on screen look as good as a photograph/movie displayed on screen, we are done.

How close are we to that limit now?

Another way of putting that is: If you could list all the possible features of an image that make an image look good, you'd have a finite list. At the top of the list, the most important things would be "perspective", "frame rate at least 75 fps", "24-bit color", "reasonable-looking physics", "reasonable lighting" etc. etc. Somewhere in the middle are "shadows" and "curved surfaces". At the bottom end of the list, the least important to general image quality are things like "specular highlights" and "perfect lighting" and "caustics". Now to rephrase the above question: how much of this list do we already have available?

I haven't tried to write out this list, but I'd say, with Doom 3, our image quality is 75% as good as the ultimate image quality we can imagine. In two years, I predict we will be at 90%.

Images looking 90% as good as reality are PRETTY DAMN GOOD. Yes, to get that last 10% we will still need processing power to go up by a factor of like a million, and you guys can spend the rest of your lives maybe designing clever ways to tweak the last little bit of quality out of the available hardware power. But who will care? People will already have PRETTY DAMN GOOD images on their screens, and they won't want to pay very much to get that last 10% of quality that you all seem so concerned about.

Someone mentioned being able to pick up the fire hose from a truck and turn it on and spray it around everywhere, seeing rainbows. Seriously, how many video games is that going to be an important feature for? Your only input method is arrow keys and a mouse!!! It's not like you can shuffle a deck of cards in-game with control over every last card. How are you going to select every feature in a game world, and why would you even want to? People don't care about that stuff very much! Even if there was some adventure/Myst-type game where you really needed to have complete fire-hose access, is it really going to hurt sales that much if the rainbows aren't there in the sprayed water? It's probably a tiny little obscure feature buried somewhere in a game.

I could see being this interested in every last image detail if we were talking about games using full-sensory VR gear, or even if we were talking about generating full-size Hollywood movies in real time for some reason (maybe a rentable video-game-playing theater?), but we are talking about PC video games, on a roughly 20-inch MONITOR, with a 15-year-old keyboard/mouse design for input. We are near the limit of this hardware, folks, no matter how fast the GPU may get.

Of course my fairly subjective guesses of 75% and 90% might be somewhat off, but I'm not that far off. And I probably agree with the mass market's perceptions more than with yours; they are the ones ultimately funding video game development.

It's not the end of the world. People will still need good video game programmers even after the graphics are good enough.

Won
02-06-2004, 11:16 AM
maxuser --
The human visual system is actually quite interesting. Biologically, it appears to be quite limited, but the visual part of your brain does quite a bit to make you see what you see.

gltester --
Your thesis seems to be based on the idea that there is such a thing as "good enough" AND that we are most of the way there. Man, I thought the only double whopper I was going to have today was lunch! ;) You are entitled to your heretic's opinion, but it shouldn't come as a surprise to you that lots of people have lots of reasonable objections.

You're right about the display hardware. For example, view resolution has been basically CONSTANT throughout the life of computer graphics. But that doesn't mean that future display technologies won't require additional work, or that future consumers won't be more sensitive to various levels of realism.

Tzupy --

Things to think about:

Do you really believe you can do an apples-apples comparison between transistors in cache (regular SRAM cells) and transistors in a GPU (all computation)?

Are you suggesting that the GPU and CPU share a single memory bus? Perhaps just for communication, like fast AGP memory? Would the CPU and GPU still have their own dedicated memory buses? If so, how many pins would this package need? How much would it cost in terms of packaging and system integration? If not, what limitations would the bus contention cause?

What are the differences between the CPU business and the GPU business? What are the margins on CPUs vs. GPUs? Does it make business sense for a CPU manufacturer to integrate GPUs? The other way around? What market would this hybrid G/CPU serve?

-Won

pkaler
02-06-2004, 11:43 AM
Originally posted by gltester:

Someone mentioned being able to pick up the fire hose from a truck and turn it on and spray it around everywhere, seeing rainbows. Seriously, how many video games is that going to be an important feature for?

Speak for yourself. If the image on the display in front of me is not pixel-for-pixel as good as the view through the window beside me, then we are not there yet. The gold standard is reality.

And the 90% estimate that you talk about is an exaggeration.

Just sit in front of your window for 30 minutes and just watch. Or even better, go outside. Take a serious look at all the subtle beauty of the world.

You say we are 90% there. I say we are more than 90% away.

dorbie
02-06-2004, 12:07 PM
gltester, the fire hose was an example of the complexity of the world. Pick any other suitable example; it's not a game pitch, it's an illustration, and a valid one. The point is that there is immeasurable physical and visual complexity in the real world, and developers will pursue it. You're just wrong about this on so many levels.

We're not almost there, we're not even close. You still haven't addressed the fact that movies are, and must be, *filmed*. There is a good reason for that. We're not even close in offline rendering, and that's just for pre-scripted, non-interactive stuff.

davepermen
02-06-2004, 01:26 PM
gltester, just take a simple game which has a realistic scenario, then try to get the best graphics possible for it in real time, and then compare it to what the real world gives you.

Yes, we're possibly at 50% (or, as you say, 90%) of fully realistic real-time rendering of indoor buildings with walls, and... floors... and... all that. But there is at least a whole planet of other scenarios you could play in (and then there is the whole remaining space of imagination). In just about all other scenarios we are not even close.

For a good example game, take Zelda: Ocarina of Time.

First: you're in the woods. Go into real woods, rebuild that small region, without even the Deku Tree itself, and then try to give me a real-time application (with a huge PC network if you want) that gives me the illusion of looking natural. I'm not even talking about realistic.

Second: Link himself, and all the other characters. Make them move, interact, and look natural, with Gollum as the minimum requirement for looking realistic.

Third: places like the water castle - impossible to make look realistic with today's GPUs.
Not to mention actually simulating the water so that it behaves naturally... ugh. Have fun with 3D FFTs on high-res 3D grids :D

The same goes for the ice palace. There, you need at least photon mapping and raytracing with more than just RGB to make it look believable (there, to look good, refraction and prism-like light-colour-scattering effects could be used immensely to make it look beautiful, magical, and natural at the same time).

You can pick any other place in Zelda; there is just about none that is even close to being implementable realistically today.

Graphics are very far away from what could be called realistic. Doom 3 doesn't look any better than Zelda to me - actually worse :D but that's a personal opinion.

Korval
02-06-2004, 01:27 PM
An off-topic note:


Off-topic for this thread, but not in my experience. Maybe it depends on your definition of "good"... compared with Quartz, or with raytracing, where you can shoot an arbitrary number of rays through every pixel, it totally sucks.

Sure, compared to raytracing which isn't designed for real-time, yes. But, for a triangle rasterizer, ATi's R300 anti-aliasing is the best around.

Now, back on-topic.


but a new design for GPUs that would allow them to run at higher frequencies might come soon. ATI just licensed from Intrinsity a technology that should allow them to quadruple the frequency at which the processing units run. The performance of a 4-pipe GPU running at 1 GHz would be better than the performance of an 8-pipe GPU running at 500 MHz, due to the lower utilization of the latter.

While it is true that fewer pipes at a higher clock speed is better, you have to realize that


Having the CPU read/write directly to the graphics memory, and the GPU to the main memory, would cut a lot of the driver and AGP latencies (which are becoming increasingly important - read BatchBatchBatch.pdf from GDC 2003).

First, this .pdf is directed at D3D only. Secondly, the reason it is directed at D3D only is because the IHV-portion of D3D isn't allowed to do marshaling of calls itself; the D3D runtime does it (and not in a particularly thoughtful way). This has nothing to do with bandwidth between the card and the CPU.


It would also mean that one could
write assembly code that runs directly on the GPU (bypassing the driver).

Neither OpenGL nor D3D is ever going to provide an API for that. Nor should they.


That's why I said we need an OpenRT spec, something HW manufacturers could shoot for!

Why would they want to? Remember, even high-end movie FX houses don't use ray tracing that frequently. So why should they provide that ability in consumer-level cards? If it's good enough for Pixar, isn't it good enough for everybody else?


The "graphics innovations are almost dead" crowd has been reiterating that we're nearing the point where hardware is running out of interesting things to do, and CPU's will eventually catch up. The fatal flaw in that argument is that because we've come such a long way, advances are starting to plateau, therefore innovation and interest in further advancement will cease. True, advances are seeming to plateau, but it's not a plateau but an elbow in the curve. What it really means is that we're just now getting to the hard part. The easy problems have been solved, and what's left is the long, trudge ahead that step-by-step will get us ever closer to visual realism. But that path will likely not end in our lifetimes, if ever.

In general, my belief is that the only difference between the various cards in 3-5 years will be performance. And that will be the only impetus to upgrade as well. Graphics cards will be feature-complete.


I haven't tried to write out this list, but I'd say, with Doom 3, our image quality is 75% as good as the ultimate image quality we can imagine.

Lol!

We aren't even 25% of the way there. What we have done is, as someone said, "picked the low-hanging fruit." We've done the easy stuff: texturing to add detail, surface detail interacting with lights (bump mapping), reasonably correct shadows. Now comes all the really hard, but really important, stuff.

Take shadows, for instance. Neither shadow maps nor shadow volumes provide for easy soft shadows. But soft shadows are vital for photorealism. Doing soft shadows is hard. We did the easy part: hard shadows. Now comes the difficult, yet vital, part.

These kinds of subtle things are what separate "that's pretty decent CGI" from "that's CGI?!" Without these subtle interplays of light, you aren't getting the job done.

V-man
02-06-2004, 09:29 PM
Besides what is good enough being in the eye of the beholder, the future is uncertain. Someone might decide that 1k x 1k or even 4k x 4k is not enough, and that they want to project real 3D images in mid-air.
How advanced are those now? Not very, it seems.

This discussion is very narrow-minded, I think.

Let me give an example. IBM was researching a kind of display that would host billions of pixels (microscopic pixels) in an effort to duplicate paper.

<sarcastic time>
we're gonna need CPUs that can execute 40 trillion instructions per nanosecond someday!
</sarcastic time>

tayo
02-07-2004, 03:01 AM
if you are talking about real time photorealism and software rendering, let's have a look at the screenshots here http://www.openrt.de/Gallery/IGI2/.

It is based on a distributed ray tracer (up to 24 dual Atlhon MP :-)). They obtain 2fps on a very complex scene (50M of triangles!) with global illumination.
It is very interesting but it is still so far from reality...

Nutty
02-08-2004, 10:48 AM
What a stupid thread.

Custom hardware will always be better in a specialised task, therefore there will always be gpu's for graphics. Regardless of whether its scanlining or raytracing.

A P4 might be able to come close to certain operations quickly. I recall the very same comparison of Vertex Shaders on P4 and GPU. The point is, the P4 was probably at 100% usuage to match the gpu, of which theres no time left for Sound, AI, collision, input handling, simulation, networking, etc....

DarkWIng
02-08-2004, 11:17 AM
Originally posted by gltester:
I haven't tried to write out this list, but I'd say, with Doom 3, our image quality is 75% as good as the ultimate image quality we can imagine. In two years, I predict we will be at 90%.
I just hope you are joking on this one. We're more like 10% there... Well Doom3 is even a few % less.

SirKnight
02-08-2004, 01:04 PM
Originally posted by Nutty:
What a stupid thread.

Custom hardware will always be better in a specialised task, therefore there will always be gpu's for graphics. Regardless of whether its scanlining or raytracing.

A P4 might be able to come close to certain operations quickly. I recall the very same comparison of Vertex Shaders on P4 and GPU. The point is, the P4 was probably at 100% usuage to match the gpu, of which theres no time left for Sound, AI, collision, input handling, simulation, networking, etc....


Exactly.

And what is this about doom 3? Doom 3 looks good but not THAT good. As far as PC games go, yes it's one of the best looking games right now, but to say it's anywhere near "perfect reality rendering" is absurd. Doom 3 is just simple perpixel bump mapping with some shadow volumes in a pretty detailed (for a pc game currently) 3d world. The lighting model is a very simple one, not anywhere near a "real life" model. Dot 3 bump with specular and maybe a few other simple things, but it does not take into account any complex light surface interaction like you may see with BRDF and friends for instance. The lighting model does not take into account what happens to the light when it penetrates a surface and bounces around and whatnot. The list is endless. And the shadows, while makes a moody game, are not even close to how shadows are generated in the real world. If we ever can render a world in real time that makes you think you're looking into a window out at the real world with all the details dorbie mentioned (and more!), don't expect it in your life time. http://www.opengl.org/discussion_boards/ubb/smile.gif There is just way too much to do.


-SirKnight

harsman
02-09-2004, 01:11 AM
gltester has a point in that it's only natural for graphics to suffer from diminishing returns. The difference between one billion triangles in view and ten billion triangles probably isn't that big. Some of todays games can certainly be mistaken for TV if you stand a couple of metres away. We still have a long way to go though, especially if we want decent lighitng and believable characters. Animation is largely an unsolved problem for one. However, the graphics problem is inherently massively parallel and I doubt CPU's will be able to render games without hardware assitance within the next 10 years.

jeickmann
02-09-2004, 02:48 AM
This presentation covers some aspects that could be improved and how much they will cost you performance-wise: http://developer.nvidia.com/docs/IO/8343/Elegance-of-Brute-Force.pdf

I know, you can always say: "It's from NVIDIA, and they want to keep on selling GPUs!" but I think it's very interesting to read.

Jan

KuriousOrange
02-09-2004, 05:34 AM
Originally posted by SirKnight:

Exactly.

And what is this about doom 3? Doom 3 looks good but not THAT good. As far as PC games go, yes it's one of the best looking games right now, but to say it's anywhere near "perfect reality rendering" is absurd. Doom 3 is just simple perpixel bump mapping with some shadow volumes in a pretty detailed (for a pc game currently) 3d world. The lighting model is a very simple one, not anywhere near a "real life" model. Dot 3 bump with specular and maybe a few other simple things, but it does not take into account any complex light surface interaction like you may see with BRDF and friends for instance. The lighting model does not take into account what happens to the light when it penetrates a surface and bounces around and whatnot. The list is endless. And the shadows, while makes a moody game, are not even close to how shadows are generated in the real world. If we ever can render a world in real time that makes you think you're looking into a window out at the real world with all the details dorbie mentioned (and more!), don't expect it in your life time. http://www.opengl.org/discussion_boards/ubb/smile.gif There is just way too much to do.


-SirKnight

You haven't seen lord of the rings then? If it's possible offline now, then expect it in realtime within 10 years.

SirKnight
02-09-2004, 05:54 AM
Yes I have seen LotR and it was not completely computer generated. Only a few characters and some effects. Nothing big there. I'm talking about the kind of rendering where the whole movie would be computer generated (where no real actors or real cameras needed) and you would not know it unless you were told. That's what I mean by "perfect reality rendering."


-SirKnight

divide
02-09-2004, 09:50 AM
why software rendering could have avantages over hardware accelerated rendering : http://www.realstorm.de/Whyrealtime.html

maximian
02-09-2004, 10:24 AM
As I see it there are to factors.

1. Level of detail - a really powerful(advanced) graphics system would be able to dynamically scale detail, that is subdivide, or decimate meshes as you get closer and closer. I have not seen, and will not for a long time see that happen in real time. If you look at a tree bark and close in on it, you see incredible complexity of structure and color variations. Add to that lighting, and realistic collision detection and I see a need for multiple gpus + CPU(s).

2. Scalability, Doom is fantastic. But exactly how many total polygons are being rendered. Not in the scene, not culled, actually visible and rendered. Sure texture mapping + bump mapping is a good approximation. But I think everyone will agree that the best possible, most realistic game will actually have that detail, ie Millions of polygons. Consider something like an rts, with 100 + units on screen. Now we always trade of by making these units pretty simple. Less than a few hundred polygons. But we could certainly tell, even with small units, the difference between a tank with 10,000 polygons, and one with 100.

While I agree with gltester on diminishing returns, witness the fiasco from 9700-9800XT, or Geforce FX5800-FX5950. To say innovation is dead is parochial and short sighted.

Zengar
02-09-2004, 10:50 AM
Originally posted by maximian:
But exactly how many total polygons are being rendered. Not in the scene, not culled, actually visible and rendered. Sure texture mapping + bump mapping is a good approximation. But I think everyone will agree that the best possible, most realistic game will actually have that detail, ie Millions of polygons. Consider something like an rts, with 100 + units on screen. Now we always trade of by making these units pretty simple. Less than a few hundred polygons. But we could certainly tell, even with small units, the difference between a tank with 10,000 polygons, and one with 100.


I don't get your point. Per-pixel lighting on a quad with 2 tringles look the same way as on a quad with 10,000. That's the modern way of life, less polygons, more fragments.
If it goes on, we will be drawing one polygon, and building a complicated mesh out of it per-fragment. I think, it's cooler then vertices http://www.opengl.org/discussion_boards/ubb/biggrin.gif

KuriousOrange
02-09-2004, 11:01 AM
Originally posted by SirKnight:
Yes I have seen LotR and it was not completely computer generated. Only a few characters and some effects. Nothing big there. I'm talking about the kind of rendering where the whole movie would be computer generated (where no real actors or real cameras needed) and you would not know it unless you were told. That's what I mean by "perfect reality rendering."


-SirKnight

Ok then, Final Fantasy. 100% CG. Not that I've seen it, of course http://www.opengl.org/discussion_boards/ubb/smile.gif

Adrian
02-09-2004, 11:35 AM
Originally posted by divide:
why software rendering could have avantages over hardware accelerated rendering : http://www.realstorm.de/Whyrealtime.html

They make two points, one is that 'beyond a certain level of detail, a polygon has to be smaller than a pixel on the screen'. This assumes that polygons are only useful for rendering on screen. Future unified lighting/shadowing algorithms may use offscreen polygon rendering and readback. In the engine I am working on you only see .1% of the polygons rendered the rest are used in the light/shadow calculations. I could make use of >1000x the tnl power we have today.

The second argument is that some of their features arent possible with graphics hardware. That's true but the only important one missing imo is proper reflections.

Adrian
02-09-2004, 11:58 AM
Originally posted by Zengar:
I don't get your point. Per-pixel lighting on a quad with 2 tringles look the same way as on a quad with 10,000. That's the modern way of life, less polygons, more fragments.
If it goes on, we will be drawing one polygon, and building a complicated mesh out of it per-fragment. I think, it's cooler then vertices http://www.opengl.org/discussion_boards/ubb/biggrin.gif



Why would you want to reduce poly counts if you have spare tnl capacity? I see more disadvantages than advantages of using bumpmapping over true geometry.

SirKnight
02-09-2004, 01:09 PM
I saw like the first 20 min final fantasy (I had to leave, couldn't finish it unfortuantly) and yes it was all CG...BUT, one can look at it and instantly tell it's CG. Now I do think the rendering looked quite bad ass and the animation was some of the best I have seen for this kind of thing, but we are no where near the point where a movie can be rendered like that and look 100% real. Not only in how things look but how objects and things behave. Also keep in mind that there were plenty of "hacks and tricks" used in the making of that movie, just as it is in every movie with SFX. I really need to rent the movie or something because I really liked what I saw so far. But I usually never remember to look for FF when I'm out to rent something.


-SirKnight

Nutty
02-09-2004, 02:20 PM
I think there are 2 issues.

Writing "software renderers", and rendering in software on the cpu.

As far as rendering on the cpu goes, its a stupid idea. It simply wont ever be anywhere near as fast as gpu's. Eventually we will most likely have ray-tracing gpu's, so the argument that cpu ray-tracing will overtake gpu scanlining is rubbish.

The actual issue of "software renderers" is coming to pass as we speak. We're already starting to write software to govern how fragments are created, soon I suspect we'll be writing software to construct our polygons, and how to perform AA. So in actual fact we will be using software renderers, but not on the cpu.

crystall
02-09-2004, 02:26 PM
Originally posted by jeickmann:
This presentation covers some aspects that could be improved and how much they will cost you performance-wise: http://developer.nvidia.com/docs/IO/8343/Elegance-of-Brute-Force.pdf

I know, you can always say: "It's from NVIDIA, and they want to keep on selling GPUs!" but I think it's very interesting to read.

Jan

It is an interesting read but I find quite a bit of the quotes in the text misleading if not completely false. For example the depth-buffer approach is defined as 'robust'. What!? It is so robust that we still have to sort transparent polygons in back-to-front order by hand and when two of them intersect the hardware is unable to draw them properly.
Also the graphics pipeline is pictured as 'locality optimized by architecture'. Yeah, sure, I'll send you some nice slivers in and the rasterizer will be forced to walk frame-buffer memory with awful strides whithout gaining any benefits from the caches.
Quite frankly I prefer deferred tile-based rasterizers to immediate-mode rasterizers. They offer more parallelism, more locality are far more robust than depth-buffers and are well suited for both hardware and software implementations contrary to brute-force methods. They have their drawbacks (limited size of the number of primitives/textures they can buffer, etc...) but they can be solved contrary to depth buffers' ones.

[This message has been edited by crystall (edited 02-09-2004).]

dorbie
02-09-2004, 04:43 PM
You aren't saying you'd buy a tiled system in preference to a faster alternative (which I like to visualize:-)).

You're claiming tiled is innately faster, which has been the same unsupported claim some people have been making for 15 years despite famous failures. If it was faster at least one company would be winning on performance with it, not building monstrous systems that can only compete on scalability and draw enough power to cause a chassis meltdown.

When/if tiling is a win it will be because it is faster and more practical when taken as a whole, there are many considerations beyond framebuffer locality, and several solutions to the same problem of order invariance if that's your goal. If you have that kind of a huge fifo anyone can solve order invariance by one means or another, especially if you consider the state & context storage issues. Tiling of some sort may have it's day but until now nothing has worked as well or as efficiently that those systems tiling advocates complain about, despite all the unfounded hype.

As for resolving depth order, again there are solutions, but infact buffer sizes & fragment sorts hugely complicate tiled approaches not the other way around. Transparency has actually undermined some tiled architectures in the past and has caused serious fragment storage and sort problems for them. On top of this extant software implementations require conventional zbuffered rendering paradigms making a transition to fragment sort an issue of momentum, that must follow tiled implementation not preceed it, (chicken and egg).

It's fun to hear the same hoary old chestnut though, reminds me of the good old days, when I sometimes fell for it. Now it's best to ignore the hype and look at the numbers & practicality when delivered. I happily leave the implementation & architectural decisions to the guys who've been building and winning in graphics for a decade. And if they slip on architecture I'll happily move to the next card vendor. Will I ask if it's tiled? Hell no, I'll ask how fast & how much.


[This message has been edited by dorbie (edited 02-09-2004).]

KuriousOrange
02-09-2004, 10:28 PM
Divide, that is really good stuff you linked to. Really good stuff indeed. One can't help wondering about the limitations of having to use spheres, cylinders and cubes when modelling a real world application, such as a game or some kind of simulation, though. If you were tracing into a polygon soup with those kinds of frame rates, it would be more stunning.

crystall
02-09-2004, 11:20 PM
Originally posted by dorbie:
You aren't saying you'd buy a tiled system in preference to a faster alternative (which I like to visualize:-)).

I cannot buy a tiled system ATM. Nobody offers them.



You're claiming tiled is innately faster, which has been the same unsupported claim some people have been making for 15 years despite famous failures.

I hope to prove it as soon as my tile-based software rasterizer is ready, it's not my first try mind you, I have tested almost everything from z-bufferring to edge-buffering with all sorts of fragment/span-buffers in the middle.



If it was faster at least one company would be winning on performance with it, not building monstrous systems that can only compete on scalability and draw enough power to cause a chassis meltdown.

The Kyro series quite proved that a decent implementation could easily compete with boards which sported twice or thrice its memory bandwidth. Wasn't that a success?



Tiling of some sort may have it's day but until now nothing has worked as well or as efficiently that those systems tiling advocates complain about, despite all the unfounded hype.

Immediate-mode rasterizers work well, not efficiently, they still waste a lot of their potential resources.


As for resolving depth order, again there are solutions, but infact buffer sizes & fragment sorts hugely complicate tiled approaches not the other way around.


Sorting doesn't seem a problem to me, my visibility loop is ~20 instructions and contemplates all the possible depth functions avaible in OpenGL. Doesn't seem to much complicated to me.



Transparency has actually undermined some tiled architectures in the past and has caused serious fragment storage and sort problems for them.

Yeah, that's a problem I had too, but I've come up with a decent solution which needs an extra buffer (usually 4KB) and 4 instructions more than the opaque polygons loop. Sort speed for transparent polygons is only marginally slower (15-20%).


On top of this extant software implementations require conventional zbuffered rendering paradigms making a transition to fragment sort an issue of momentum, that must follow tiled implementation not preceed it, (chicken and egg).

It depends on what you call z-buffered rendering paradigms. You always have to calculate z-values if you want to sort fragments. I believe the greatest advantage of tile-based renderers is the vast parallelism they offer during the sorting phase. In my implementation I do depth-sorting on 16 pixels at a time, *independently* of the size or shape of the polygons which is a big plus for software renderers IMHO. I could well do it on all the pixels of the tile but there is no CPU around with enough registers for it.

[This message has been edited by crystall (edited 02-10-2004).]

EG
02-09-2004, 11:47 PM
>The Kyro series quite proved that a decent
>implementation could easily compete with
>boards which sported twice or thrice its
>memory bandwidth. Wasn't that a success?

It also proved that some transparency and sorting-related artefacts could never be ironed out.

>Immediate-mode rasterizers work well, not
>efficiently, they still waste a lot of
>their potential resources.

As for OpenGL, you can use immediate mode and still achieve parallelism between CPU & GPU, you also get better predictability of the output with immediate mode rendering.

Another side effect of "smarter" rendering schemes is that they incur a cost for their "smart" aspect (performance and complexity cost): if you assume each and every software sends dumb geometry to the hardware all the time, then a smart scheme can bring you benefits. But when you're facing a less-dumb software, your scheming will only get in the way, and will itself become a waste of time and resources.

>Sort speed for transparent polygons is only marginally slower (15-20%).

When measured *independantly*. But whenever sorting is involved, it means you have to wait for the last item before you can start working (unless you go the speculative execution route, but that's entirely new can'o worms).
Don't forget that an immediate mode paradigm means the hardware can start working as soon as it gets its first instruction, making parallel execution that much simpler to achieve when writing your rendering code.

Korval
02-10-2004, 12:17 AM
How we got on tile-based renderers, I'll never know.

More than sort speed, more than anything, the biggest problem with differred tile-based renderers is scalibility.

A regular rasterizer takes up X amount of memory for geometry data, textures, and (nowadays) shaders. If I render the same model 20 times, X does not change; the size of the memory is still the same.

Because a differred tile-based renderer must do T&L and store all triangles before actual rasterization, X changes. So, let us define X as being the amount for the primary data, and Y as the memory taken up by the rendering buffer. For regular scan converters, Y = 0.

Now, if I do the same 20x rendering of a single model, it takes up 20x the room in Y. Also, if I did some texture-coordinate generation to save bandwidth, those generated texture coordinates have to be saved into the buffer, thus increasing Y.

Effectively, you take up a lot of memory. If I double the vertex count of a mesh, I double the memory cost, which is already doubled compared to a regular rasterizer. Double goes into X, and double goes int Y. As you can see, this scales very poorly to higher vertex count.

Also, note that this vertex data is read into the T&L unit, then written to video memory, then read again later for rasterization. Eats up significant bandwidth.

So, clearly, TBR's scale rather poorly.

crystall
02-10-2004, 12:39 AM
Originally posted by EG:
It also proved that some transparency and sorting-related artefacts could never be ironed out.

That was a problem with the limited size of the bin used to buffer per-tile polygons. The newer cores from Imagination Technologies for embedded systems have solved this problem although I am not sure how. For a software-based rasterizer this is a non-problem since tile-buffer can be grown at will.



As for OpenGL, you can use immediate mode and still achieve parallelism between CPU & GPU, you also get better predictability of the output with immediate mode rendering.


This is not an advantage of immediate mode renderers, deferred ones have it too. I've done a bit of coding on the Dreamcast and you get a lot of parallelism going on between the SuperH and the Kyro. Usually while the Kyro is drawing, the tile-processor is binning (ie setting up) polygons for the next frame (feeded thru DMA) and the CPU is transforming/lighting vertices for the 3rd frame. Think of it as a 3-stage graphics pipeline.



Another side effect of "smarter" rendering schemes is that they incur a cost for their "smart" aspect (performance and complexity cost): if you assume each and every software sends dumb geometry to the hardware all the time, then a smart scheme can bring you benefits. But when you're facing a less-dumb software, your scheming will only get in the way, and will itself become a waste of time and resources.


This is a very interesting point, I'm glad you made it. Smarter scenes means more burden on the programmer, less on the rasterizer implementation. This sounds to me like laziness on the rasterizer side. I sincerely believe that a solid and *scalable* rasterizer should cope just as well with 'good' and 'bad' scenes. There are a lot of things that an application programmer has to do when writing a game / app / demo and his time is better spent on adding features and polishing the product than reordering scenes to cope with the flaws of the underlying rasterizer. Same goes for the artists. To me it is not important to take advantage of best-cases, the important thing is to have consistent performance across the widest possible situations. Also some of the extra infrastructure required for building a tile-based rendering actually saves some time compared to an immediate-mode renderer. For example there is no need for clipping in a tile-based renderer.



When measured *independantly*. But whenever sorting is involved, it means you have to wait for the last item before you can start working (unless you go the speculative execution route, but that's entirely new can'o worms).


I was talking about my software rasterizer, pure visibility sorting on my 1 GHz G4 is between 400 and 450 mega-pixels per second on opaque polygons and roughly 15-20% less for transparent polygons.



Don't forget that an immediate mode paradigm means the hardware can start working as soon as it gets its first instruction, making parallel execution that much simpler to achieve when writing your rendering code.

In my case this is not a problem, a single-threaded software renderer will always start working as soon as it gets its data.

crystall
02-10-2004, 01:15 AM
Originally posted by Korval:
How we got on tile-based renderers, I'll never know.

It's because of one of the documents posted in the previous posts and it is because I am in the process of writing an almost-OpenGL compliant tile-based software rasterizer



More than sort speed, more than anything, the biggest problem with differred tile-based renderers is scalibility.

That's what I usually say of immediate-mode rasterizers.



A regular rasterizer takes up X amount of memory for geometry data, textures, and (nowadays) shaders. If I render the same model 20 times, X does not change; the size of the memory is still the same.

Because a differred tile-based renderer must do T&L and store all triangles before actual rasterization, X changes. So, let us define X as being the amount for the primary data, and Y as the memory taken up by the rendering buffer. For regular scan converters, Y = 0.

Now, if I do the same 20x rendering of a single model, it takes up 20x the room in Y. Also, if I did some texture-coordinate generation to save bandwidth, those generated texture coordinates have to be saved into the buffer, thus increasing Y.

Textures are definetely not stored once for every sent polygon. Only vertex data is (in my case, polygon gradients). In my case this amounts to 64 bytes per polygon plus 12 bytes per each interpolated component. At 10 milion polygons per second (hardly a quantity I could reach) it makes up 640 MB/s of transfer rate. Not much by today standards. Compare it with the bandwidth required by reading and writing to the depth buffer.



Effectively, you take up a lot of memory.


Quantify 'a lot'. My machine has 768 MB of ram, that's more than any gfx card out there and I hardly believe I will be able to fill it with textures, let alone the tile-bin.



Also, note that this vertex data is read into the T&L unit, then written to video memory, then read again later for rasterization. Eats up significant bandwidth.


Yes and no. In my case I can transform and setup polygons w/o touching memory but I am not sure to create functions to access such functionality since the performance gains are not really worth it in my experience.

dorbie
02-10-2004, 08:27 AM
"I hope to prove it as soon as my tile-based software rasterizer is ready"???

LOL, that is just hillarious. Software is not equal to hardware by any stretch.

As for competing with other systems, failed systems prove nothing. You just can't get away with specious claims like some hardware came close despite some missing desirable quality, there is often a reason that feature is missing, either in terms of cost or feasibility.

That has been the history of tiled rendering. They were all close but no cigar.

Like I said I'm agnostic. Maybe it can maybe it can't, but it will be judged based on delivered performance, not inflated claims for one architectural feature.

[This message has been edited by dorbie (edited 02-10-2004).]

jwatte
02-10-2004, 08:36 AM
Interesting that nobody commented on the difference in available memory throughput.

Oh, and putting blend circuitry in the memory controller means that hardware, as opposed to general purpose CPU, can use that memory more efficiently, to boot.

Tiled renderers? If they're so great, you'll just have to sit back and wait until they dominate the world. Just like the Worker's Revolution.

dorbie
02-10-2004, 09:19 AM
Exactly, I wish I could have said it as succinctly.

V-man
02-10-2004, 09:20 AM
The greatest advantage of soft renderer is that even you have a unclear spec (that is if you write one), your code will give the same result everywhere, even if someone takes your code and optimizes it. Version X of your renderer will always give the same result everywhere.

A major disadvantage of immediate mode is feedback.
Reflective objects in the scene may require recursion so you are forced to render multiple times and even then, your cubemaps (or whatever) may not be perfect. They will just be good enough.

Next is transparent object. You will need to figure how thick an object is (I gave a solution in the past) and then you will need to do figure out caustics.

I think that other light phenomena (interference) is not that important for the moment.

Yea I know, the thread became a CPU vs GPU vs immediate mode vs raytracer vs TLB vs non-TLB vs "who can predict the future". http://www.opengl.org/discussion_boards/ubb/smile.gif

Some people here are not considering the physical limitations that we will hit someday. You certainly can't make circuit traces smaller than dozens of atoms. How thick are transistors now? 50 nm with a 90 nm process.

An Si atom is about 100 pm. Atoms vary from 50 pm to 150 pm I beleive.
100 pm means 0.1 nm

50 nm means we have 50/0.1 = 500 atoms

In 2005, we will have 65 nm process.
In 2007, we will have 45 nm process. First 1 billion transistor cpu.

Let's do a rough estimate :
500 atoms with 50 nm transistor today.
375 atoms by 2005
282 atoms by 2007
212 atoms by 2009
159 by 2011
120 by 2013
90 by 2015 (oh oh!)
67 by 2017
51 by 2019
38 by 2021
28 by 2023
21 by 2025
http://www.intel.com/research/silicon/nanometer.htm

dorbie
02-10-2004, 01:27 PM
Mmmmmm... that's only if there is one software implementation. There are examples where there are multiple software implementations of a poor spec and you wind up with a compatability disaster. Look at SVG for example.

maximian
02-10-2004, 01:57 PM
I for won am a firm believer that another kind of processing tech will suplant current design.
But, worse scenario, cpu manufacturers could start design multilayers chips(AMD does this already in a fashion). So maybe instead of a flat die, we could have a cubical die with k-layers, perhaps each with 60 nm process.

Stop predicting the end of the "world". This has been predicted since well before this decade. It still has not come to pass.

jwatte
02-10-2004, 03:45 PM
Seeing as this is cpureview.net, I'd like to make the point that, actually, during the last year, processor performance improvements have been nowhere near Moore's law, and price/performance has done even worse, because all the chips you couldn't buy a year ago are $700 and up. (And they're only 20% faster than the chips you could buy a year ago)

Adrian
02-10-2004, 04:20 PM
Moore's law relates to the number of transistors on a chip, not performance.

In late 2000 the P4s had ~40,000,000 transistors, prescott has 125,000,000. So 3x the transistors in just over 3 years.

[This message has been edited by Adrian (edited 02-10-2004).]

dorbie
02-10-2004, 05:07 PM
Yup, we're also on the cusp of 64 bit mainstream and PCI Express. The most amazing development IMHO is the rate at which memory bandwidth has been increasing, that seems unprecedented. Now they're going to blow the doors off the i/o bus which has been an annoying problem with better options available but not reaching the mainstream. It'd be nice to be able to fit real raid cards & gigabit ethernet or better on your standard consumer PC mobos without wasting your time.

64 bit AMD cpus of various types seem to perform quite nicely and come in at pretty good price points so I don't necessarily agree even with the original performance claim. OK mainstream improvements from Intel have been a bit thin lately but AMD seems to be executing well. You're also supposed to look at the results over the long term, competing designs and processes play leapfrog and a brief lull from Intel doesn't mean a great deal, they're allegedly getting a new process up to speed now.

[This message has been edited by dorbie (edited 02-10-2004).]

Won
02-11-2004, 05:04 AM
crystall --

Hopefully the strong dissent hasn't scared you off. Personally, I am VERY interested in what your tile-based renderer is up to, and what sort of capabilities it has. Will you volunteer linkage?

For a software-based rasterizer (given the bandwidth/storage properties), I'm not at all surprised that tile based deferred is the way to go, but I'd argue that the stream-based (of which, immediate-mode is a degenerate case) IS the more scalable solution, despite the problem of geometry sorting. But really, there is a continuum of design choices in architecture.

Immediate mode is strictly causal, and deferred has no such restriction at the cost (marginal on a CPU compared to dedicated hardware) of binning/buffering. But it shouldn't be difficult to imagine a semi-causal graphics pipeline that allowed finite look-ahead (as opposed to the effectively infinite look-ahead in the deferred renderer) in the graphics stream. In the canonical case of rendering transparent primitives, the load of sorting primitives is shared between the client and server.

Like most engineering decisions, the best choice is rarely an extreme.

-Won

Edit: In fact, that is exactly what was proposed in the 2003 SIGGRAPH paper ...foo... forgot the title, but it was written by a BitBoy. http://www.opengl.org/discussion_boards/ubb/smile.gif

[This message has been edited by Won (edited 02-11-2004).]

[This message has been edited by Won (edited 02-11-2004).]

crystall
02-11-2004, 11:26 AM
Originally posted by Won:
crystall --

Hopefully the strong dissent hasn't scared you off.

I'm used to it, when you start writing a software-rasterizer these days you have to excpect to be almost alone at it.


Personally, I am VERY interested in what your tile-based renderer is up to, and what sort of capabilities it has. Will you volunteer linkage?

What do you mean by linkage? Unfortunately english is not my native language so sometimes I fail to understand a concept.
Anyhow I'd be glad to discuss it with you and as soon as it is finished I hope to be able to released it under an open-source license.


For a software-based rasterizer (given the bandwidth/storage properties), I'm not at all surprised that tile based deferred is the way to go, but I'd argue that the stream-based (of which, immediate-mode is a degenerate case) IS the more scalable solution, despite the problem of geometry sorting. But really, there is a continuum of design choices in architecture.

Immediate mode is strictly causal, and deferred has no such restriction at the cost (marginal on a CPU compared to dedicated hardware) of binning/buffering. But it shouldn't be difficult to imagine a semi-causal graphics pipeline that allowed finite look-ahead (as opposed to the effectively infinite look-ahead in the deferred renderer) in the graphics stream. In the canonical case of rendering transparent primitives, the load of sorting primitives is shared between the client and server.

This would be an interesting idea, possibly solving most of the problems related to the amount of storage needed by deferred rasterizers.


Like most engineering decisions, the best choice is rarely an extreme.

-Won

Edit: In fact, that is exactly what was proposed in the 2003 SIGGRAPH paper ...foo... forgot the title, but it was written by a BitBoy. http://www.opengl.org/discussion_boards/ubb/smile.gif



I will look for it, thanks for the tip.

dorbie
02-11-2004, 11:52 AM
Rubbish, we're challenging unfounded assertions about tiled hardware, this has nothing to do with software rendering.

I have infact recommended deferred tiled shading approaches in software rendering projects at a company I worked for in the past.

Take your licks and don't misrepresent the opposition :-)

Won
02-11-2004, 12:03 PM
By "linkage" I meant a web link or a pointer to more information. Your project page, for example. I bet a multi-processor AMD64 machine would make a decent rendering prototype platform considering the potential aggregate memory bandwidth. It would be interesting if you could "pipeline" multiple AMD64 processors over hypertransport and send command streams (vertices, primitives, fragments) from one processor to another.

The title is "Delay Streams for Graphics Hardware". The authors (bunch of Finns)suggest that it would be useful for improving/implementing occlusion culling, order-independent transparency and anti-aliasing/adaptive supersampling. They make estimates of how big/long the delay-stream FIFO needs to be for it to be useful. They conclude that it has an attractive cost vs. benefit to GPUs.

I wonder if the sometimes-mentioned F-buffer is something like this?

-Won

nostgard
02-11-2004, 12:09 PM
I was reading through this thread and just wanted to throw in my two cents.

gltester, You keep saying that people don't need the most realistic portrayal, and that they will be happy with something that isn't completely realistic because people won't notice the little details (such as caustics).

While I agree that people don't necessarily know what makes something look wrong, and they more than likely won't notice the details that we're talking about, I don't agree that people will be satisified IF they can see what it SHOULD look like. And, due to the fact that the film industry doesn't have the limitations we have for realtime rendering, they always will be able to see what computer generated graphics COULD look like.

Case in point: just a week or so ago, I was talking to a gamer friend of mine, who is always trying to get the best graphical performance and appearance out of his computer. I was really shocked when he said that he thought new games coming out, like Doom 3 and Half Life 2, looked really good and he didn't think they would be able to do much better. After arguing about it with him for the next 15 minutes or so, I found out he didn't know what global illumination was, and that I would have to show him an example to convince him. So I went and found some videos showing the difference for a scene in local illumination and global illumination. His reaction was just what I was hoping for - his jaw _literally_ dropped. I just don't think he realized how much of a difference one part of the rendering puzzle like that could make on the overall picture. I think it's the same way with a majority of the general population.

I don't see a reason for GPUs ever getting pushed out of the equation, either - at least not for a very long time. Having a specialized seperate processor to work in parallel with the CPU is what has pushed the quality of gaming graphics so much higher over such a short period of time. If anything, I can only see the GPU becoming more and more flexible - a trend we can already observe with the continual evolution of the programmable pipeline. Offloading any portion of the work for a specific process onto another piece of hardware leaves you more time to do other important tasks on the primary hardware. Sure, there are some realtime raytracing and fast GI solutions out there, done in software, but do these engines have anywhere near the feature set of other gaming engines out there? If you're having trouble just getting the CPU to render the scene in realtime, what happens to all of the other things that have improved gaming realism, such as physics?

What I would like to see happen in the future is for the hardware companies to continue listening to the developers, and extend their hardware to meet these needs - instead of coming up with flashy product selling technologies like Truform, which have seen little use in retail products. How much better would it be if, instead of having to get rid of the GPU, the GPU was extended in such a way to provide a better set of tools for those trying to do realtime raytracing? It seems like a much more practical solution, as opposed to simply throwing out such a useful tool.

crystall
02-11-2004, 01:13 PM
Originally posted by Won:
By "linkage" I meant a web link or a pointer to more information. Your project page, for example.

Oh, understood. Unfortunately I've been too busy writing the code and I've got little well-written documentation (I'm working on it in fact but it will take time http://www.opengl.org/discussion_boards/ubb/wink.gif ). If you are interested I can mail you privately (as it would be fairly off-topic here) a description of its features and implementation. Eventually I will release the code under an open-source license but I don't feel like releasing it in the wild before I iron out a bit of quirks and optimize it up to a certain point, it's too early for sourceforge http://www.opengl.org/discussion_boards/ubb/smile.gif.


I bet a multi-processor AMD64 machine would make a decent rendering prototype platform considering the potential aggregate memory bandwidth. It would be interesting if you could "pipeline" multiple AMD64 processors over hypertransport and send command streams (vertices, primitives, fragments) from one processor to another.

It would be a very nice platform, parallelization would be fairly easy and very satisfying.


The title is "Delay Streams for Graphics Hardware". The authors (bunch of Finns)suggest that it would be useful for improving/implementing occlusion culling, order-independent transparency and anti-aliasing/adaptive supersampling. They make estimates of how big/long the delay-stream FIFO needs to be for it to be useful. They conclude that it has an attractive cost vs. benefit to GPUs.

I wonder if the sometimes-mentioned F-buffer is something like this?

-Won

I'll take a look at it, thanks!

gltester
02-11-2004, 04:40 PM
Originally posted by nostgard:
gltester, You keep saying that people don't need the most realistic portrayal, and that they will be happy with something that isn't completely realistic because people won't notice the little details (such as caustics).



wow this thread is still alive?

The sad reality is that most people won't notice the little details like caustics, except in very rare situations in rooms specifically set up to show off a feature. (Like imagine a room with alot of curved glass... caustics would make a difference).

I'm not picking on you in particular, nostgard, but several people have been misunderstanding and therefore misrepresenting my previous statements.

Somebody mentioned comparing looking at a picture on the screen to looking out the window. They seem to be ignoring the fact that a computer monitor has a FAR lower resolution than the human eye+the real world.

None of today's monitor technology will ever be able to duplicate the full quality of a real world image. This is what I keep saying it seems like a million different ways now. I'm not saying GPU graphics can't get better. I'm saying that GPU graphics are inherently limited by the other parts of the computer. Soon we will reach a point where GPU improvements don't make that much difference to the quality of the final image.

Also, because we are so close to the practical quality limit of the monitor hardware, people are going to stop wanting to pay $200, $300 for a new GPU when the upgrade only tweaks a few minor lighting issues here and there (or whatever issues). Most of the graphics tricks that can be done already have been done.

Look at these pictures, people:

[ UT 2004 ] http://www.shacknews.com/screens.x/ut2004//1/thumbs/15b.jpg

[ Doom 3 ] http://doom3.com/getdesktop.asp?num=3&size=lg


Although the colors and textures in these pictures still have an obvious "cartoon" quality to them, they are in fact far superior in detail to almost any hand-drawn cartoon.

FOLKS WE ARE ONE GPU HARDWARE GENERATION AWAY FROM HAVING THESE PICTURES BE MISTAKABLE FOR PHOTOGRAPHS!!!

One year! I'm not saying they will look as good as photographs, I'm saying that unskilled observers or people who aren't paying attention, will be able to sometimes mistake the graphics for an on-screen photograph.

Look at the lighting in the Doom 3 pic! It's good! Not perfect, but good. If the texture resolution was higher, and therefore the colors a little more realistic, and maybe if they did something about the strange orange halos on the lights (is that fog or what?) it would look very close to a photograph.

Assuming you could actually take a photograph of a 10-foot-tall demon, that is. http://www.opengl.org/discussion_boards/ubb/smile.gif


[This message has been edited by gltester (edited 02-11-2004).]

[This message has been edited by gltester (edited 02-11-2004).]

endash
02-11-2004, 05:06 PM
FOLKS WE ARE ONE GPU HARDWARE GENERATION AWAY FROM HAVING THESE PICTURES BE MISTAKABLE FOR PHOTOGRAPHS!!!
And what a lot of other people are saying is "No". And I would agree with them.

Still, that is a matter of telling the future.

But as for the "...and then CPUs will catch up" argument, I don't see it happening. My argument is that pipelined hardware will always be faster at rendering for a pipelined API like OpenGL. Perhaps some day we'll see this sort of thing find its way onto CPU silicon, but I'm not holding my breath.

Basically, the answer to the original question is "current trends are toward continued use of GPUs", and that those trends appear to project far enough to be beyond divination.

neomind
02-11-2004, 10:53 PM
I'll take my shot at lengthening this too long thread =)

A problem with the reasoning of many in this thread is that they assume that the hardware of tomorrow is the same as the hardware we have today, only faster.

I am pretty sure that the CPU will replace the GPU, but not by being what it is today.

Imagine a CPU with a built-in FPGA (or something like it) specialized for DSP/floating point operations. Also, assume that the FPGA could be reprogrammed on the fly in software. That would be (in my mind) the best of both worlds. It would exist in the CPU and remove the graphics bus from the system. It would be fully software controlled so it would be as flexible as a CPU. But in a way it would still be a hardware solution.

Also, allow me to agree with those who have said that photo-realistic rendering is far away. I have not yet seen a movie with photo-realistic CG effects, so doing it it real-time anytime soon is probably absolutely impossible.

On the other hand, to my great joy, the general public seems to be too wise to go for looks alone in computer games at least. Neither Counter-Strike, The Sims nor Battlefield 1942 are visually stunning, far from it to be honest.


[This message has been edited by neomind (edited 02-11-2004).]

harsman
02-12-2004, 01:23 AM
Originally posted by Won:
I wonder if the sometimes-mentioned F-buffer is something like this?
-Won[/B]

The f-buffer is essentially a FIFO buffer for fragements that enables application transparent, order independent multi pass. Output fragmetns are stored in the f-buffer instead of the framebuffer and are then read in the original order by the fragment processor for another pass. See this paper: http://graphics.stanford.edu/projects/shading/pubs/hwws2001-fbuffer/

dorbie
02-12-2004, 07:40 AM
It's not clear that the F-buffer is always application transparent. The original paper discusses several possible implementations of a more generic idea some of which aren't transparent for some applications.

More generally I'd say the F-buffer allows you to store a flexible amount of fragment data in fast (on chip?) memory so it can be shared across sets of shading operations that would otherwise be limited by hardware. Its main strength is also its largest weakness. Being a FIFO rather than an auxiliary buffer, you can store a lot of data per fragment but that also means that the number of fragments that can be processed (at least efficiently) in each 'pass' varies with the number of registers stored by the fragments between each pass.

There is a heck of a lot of stuff about an F-buffer that isn't tied down by the name. Two implementations could be radically different in many ways and still legitimately claim to be F-buffers.


[This message has been edited by dorbie (edited 02-12-2004).]

Nutty
02-12-2004, 09:21 AM
Imagine a CPU with a built-in FPGA (or something like it) specialized for DSP/floating point operations. Also, assume that the FPGA could be reprogrammed on the fly in software. That would be (in my mind) the best of both worlds. It would exist in the CPU and remove the graphics bus from the system. It would be fully software controlled so it would be as flexible as a CPU. But in a way it would still be a hardware solution.

But then someone will take that cpu, increase the power by making certain assumption/restrictions, store the entire scene database in local RAM/cache (to remove need for fast bus) and call it a gpu. http://www.opengl.org/discussion_boards/ubb/smile.gif

I dont understand why you think that would be soo great? Why not have all that functionality custom built specifically for graphics in a gpu, and use the cpu for something else? Trust me, by the time we have graphics as good as photo's we'll be using much more advanced physics, AI on even the smallest of scene objects which will require alot of cpu power.

Even Doom3 now can bring top end AMD64's to their knees with certain physics situations, they even had to take things out to not kill the cpu soo much.

zeckensack
02-12-2004, 09:55 AM
gltester,
We are, in fact, years beyond that point. I remember that the first graphics card I had that could display photographs with convincing quality was based on a Tseng Labs (may they rest in peace) ET6000 and had 4MB of memory. 1024x768x32, yay! Enough to display anything. Really.

Problem is, the thing couldn't render the images, only display them. Do you get this point now, or do you wish to be slapped with a dead trout?

davepermen
02-12-2004, 12:36 PM
Originally posted by gltester:

The sad reality is that most people won't notice the little details like caustics, except in very rare situations in rooms specifically set up to show off a feature.

how much real life situations are in ROOMS?! yes, doom is rooms, but real worlds are outside. nature, islands, forests, mountains, cities, villages, etc.

rendering a room realistic isn't a problem. it's just 6 flat walls with some textures. that is nothing in complexity compared to any other scene.

simulating a room really isn't a big issue anymore. and yes, doom3 looks neat for it. but it lacks tons of visual elements that make you believe it's real. it looks rather cartoony and unrealistic.

nutball
02-13-2004, 05:21 AM
Originally posted by gltester:
FOLKS WE ARE ONE GPU HARDWARE GENERATION AWAY FROM HAVING THESE PICTURES BE MISTAKABLE FOR PHOTOGRAPHS!!!

NO WE AREN'T!!!

That's what people here have been saying, and you don't seem to be taking on-board.

None of the images linked to from this thread have looked anything like photo-realistic to my eye (and I'm a photographer amongst other things, so I see enough photos to know).


Look at the lighting in the Doom 3 pic! It's good! Not perfect, but good. If the texture resolution was higher, and therefore the colors a little more realistic, and maybe if they did something about the strange orange halos on the lights (is that fog or what?) it would look very close to a photograph.

Frankly I think the graphics in Doom 3 are very over-rated. Likewise Final Fantasy blah blah. Sure, they may be the best that can be achieved now in their own genres, and are very impressive for that.

But they are still very obviously CGI, not even close to looking like photos. I just find it rather irritating when people look at a Doom 3 screenshot and come out with "right, stick a fork in it, we're done, CGI is solved and we can all go down the beach".

Look back 20 years at the state-of-the-art graphics being produced then. I can guarantee that Doom3/Final Fantasy will look just as dated in 20 years time.

Stephen Webb
02-16-2004, 10:34 PM
Boy, I never thought I'd be talking this way.

A few observations:
I remember when the 8-bit nintendo was delivering what was very clearly the best we would ever get in the way of computer graphics.

I remember when the 64 bit nintendo was perfection. I could not imagine anything ever being better. How could it be?

I remember when 14,400 baud was actually faster, somehow, than the supposed theoretical maximum of our phone system.

I remember when 300 dots per inch was eye-limiting resolution.

I remember when 100 MHz was the theoretical maximum frequency for CPUs.

When 1024k = infinity...

When MMX, and then SSE, would stream so much multimedia so fast I'd have it coming out my nose.

I remember when airplanes couldn't go faster than sound...

well, maybe I don't remember that one...

The screenshots of the latest/greatest games are great, but not a GPU gen away from being mistaken for a photograph.

3 generations from now, the games will look better perhaps in ways you can't put your finger on today, compared to the state of the art today...but I suspect you will be able to subjectively tell the difference..

And, finally, I close with a question:

How much longer will it be before we are doing all of our work on the GPU, so we can finally get rid of the CPU?

(I'm not trying to take a position one way or the other BTW...though for the forseeable future (my horizon is fairly humble), I suspect we will all be buying GPUs)

-Steve

V-man
02-17-2004, 07:16 AM
Originally posted by Stephen Webb:
Boy, I never thought I'd be talking this way.

A few observations:
I remember when the 8-bit nintendo was delivering what was very clearly the best we would ever get in the way of computer graphics.

I remember when the 64 bit nintendo was perfection. I could not imagine anything ever being better. How could it be?

I remember when 14,400 baud was actually faster, somehow, than the supposed theoretical maximum of our phone system.

I remember when 300 dots per inch was eye-limiting resolution.

I remember when 100 MHz was the theoretical maximum frequency for CPUs.

When 1024k = infinity...

When MMX, and then SSE, would stream so much multimedia so fast I'd have it coming out my nose.

I remember when airplanes couldn't go faster than sound...

well, maybe I don't remember that one...

The screenshots of the latest/greatest games are great, but not a GPU gen away from being mistaken for a photograph.

3 generations from now, the games will look better perhaps in ways you can't put your finger on today, compared to the state of the art today...but I suspect you will be able to subjectively tell the difference..

And, finally, I close with a question:

How much longer will it be before we are doing all of our work on the GPU, so we can finally get rid of the CPU?

(I'm not trying to take a position one way or the other BTW...though for the forseeable future (my horizon is fairly humble), I suspect we will all be buying GPUs)

-Steve

All of those examples show short sightedness and are plain stupid.
I'm sure that while some were saying 640K is plenty for everything, others already knew that they will need multimegabytes, and some multiterabyte systems.

And up until recently, 3D games were flat and boring. Everything has a 100 poly budget and the most detailed texture is 64x64

Now they are interesting since many development houses are putting an effort into having decent shadows and many other effects.
The name of the game for the future will be detail me thinks in terms of graphics.

I think that hw is always behind software. Developers have to limit their engines and spend a lot of time optimizing. And this is by todays standards.

The previous threads already give opinions on the future, so read those.

dorbie
02-17-2004, 12:56 PM
Originally posted by V-man:

All of those examples show short sightedness and are plain stupid.

That is exactly the point he is making V-man. He's using examples to support the assertion that current claims are equally short sighted and foolish, and will appear so with hindsight to everyone who does not already see it.

marcus256
02-18-2004, 03:11 AM
Wow, this is the kind of thread were you simply can't read all the posts (I think I read about 50%). Anyway, here are my two cents:

1) Graphics processing/rendering is an inherently parallel process. No matter what kind of rendering technology we are talking about (raytracing, immediate, tile-based, radiosity...). More transistors equals more power.

2) The maximum level of parallellism that can be extracted from a general software program has been achieved by CPUs for some time now (or at least within percents from the theoretical max). The only thing that can go up for CPUs is the frequency. The new P4 shows this in a painful way (>30 pipeline steps!) - Intel is clearly aiming for high frequencies and DSP-like software (Mpeg encoding etc - AMD still shines for complex software such as compilers).

3) General CPUs are handcrafted; each transistor is a careful design decision. GPUs still use non-custom processes, and thus they don't reach clock frequencies as high (300 MHz vs. 3000 MHz). GPUs have a lot of maturing to do in terms of manufacturing and design (today the short generation cycles of GPUs don't permit the kind of tailoring that a CPU enjoys).

4) The argument that we are seeing "near photorealistic" rendering today is nonsense. Come back when we have real-time global illumination (not just precalculated lightmaps) in a full-blown outdoor scene (like outside my window: 100s of cars, 100s of trees with waving leaves, waving flags, a dozen people walking around, hills and forests in the background, 1000s of buildings), and I will join the optimists.

5) Games have always been designed carefully to express what the hardware is good at at the moment. Descent, back in the Glide days, was really fast and good looking. No wonder: they only had a few textured, lit quads in view. "When we have the ultimate rendering power", games will look nothing like the limited scenes in D3 & HL2 etc. Fancy per-pixel lighting/reflection/transparency alone simply won't cut it. We need massive polygon/surface/data/geometry processing power, and the memory to store and access it fast enough.

6) I wouldn't be surprised if raytracing as a rendering primitive becomes the de facto standard in the future. The hardware of today is still focused on fragment operations (texturing, fragment programs etc.) and "fillrate", and it is utterly unable to deal with massive geometry - complexity grows linearly with geometry load. In raytracing, on the other hand, complexity grows linearly with the number of cast rays (i.e. the resolution of the image). Tree structures offer very efficient culling in RT (complexity O(log n)) that simply isn't possible with immediate rendering (see the sketch after this list). See http://www.saarcor.de/ - try the sunflower scene on an nVidia/ATI chip...

7) In contrast to the GPU-on-CPU advocates, I believe in a third processing unit: either some really general-purpose hardware, like programmable FPGAs with a fast memory bus that can be easily programmed through a nice OpenFPGA API or something, or a massively parallel, programmable, high-precision floating-point unit, similar to the pixel/vertex shaders we are seeing today but not dedicated to graphics. Think: physics, neural networks, radiosity, etc.
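To make the complexity argument in point 6 concrete, here is a minimal brute-force C++ sketch (not from the original post, and simplified to the point of being a toy). The outer loop is over rays, so the per-frame work is tied to the image resolution; the inner test is the part an acceleration tree would shrink from O(n) primitives to roughly O(log n) per ray.

// Minimal sketch: per-ray cost in a ray tracer. Work scales with the
// number of rays cast (image resolution); a tree over the scene would let
// each ray visit only a logarithmic subset of the n primitives instead of
// testing all of them, as the brute-force loop below does.
#include <cmath>
#include <cstdio>
#include <vector>

struct Vec { float x, y, z; };
static Vec sub(Vec a, Vec b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(Vec a, Vec b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Sphere { Vec c; float r; };

// Brute force: O(n) per ray -- every primitive tested for every ray.
static bool hitAny(const std::vector<Sphere>& scene, Vec o, Vec d)
{
    for (const Sphere& s : scene) {
        Vec oc = sub(o, s.c);
        float b = dot(oc, d);
        float disc = b * b - (dot(oc, oc) - s.r * s.r);
        if (disc >= 0.0f && -b - std::sqrt(disc) > 0.0f) return true;
    }
    return false;
}

int main()
{
    // A row of unit spheres receding along +z; a real renderer would put
    // these in a BVH/kd-tree so each ray touches only a few of them.
    std::vector<Sphere> scene;
    for (int i = 0; i < 1000; ++i)
        scene.push_back({{0.0f, 0.0f, 5.0f + 2.0f * i}, 1.0f});

    const int W = 64, H = 64;   // total cost is W*H rays, regardless of
    int hits = 0;               // how much geometry is actually visible
    for (int y = 0; y < H; ++y)
        for (int x = 0; x < W; ++x) {
            Vec d = {(x - W / 2) / float(W), (y - H / 2) / float(H), 1.0f};
            float len = std::sqrt(dot(d, d));
            d = {d.x / len, d.y / len, d.z / len};
            hits += hitAny(scene, {0, 0, 0}, d) ? 1 : 0;
        }
    std::printf("%d of %d primary rays hit the scene\n", hits, W * H);
}

Swapping the linear hitAny() for a kd-tree or BVH traversal is what raytracing-hardware proposals such as the SaarCOR project linked above lean on: the per-ray cost then barely changes as the scene grows.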


[This message has been edited by marcus256 (edited 02-18-2004).]

davepermen
02-18-2004, 04:39 AM
I don't believe in a future for GPUs the way they are now, as external pluggable cards. They are too far from the rest of the hardware, much too split off. Latency is a problem, and always will be, if you want them to talk rather directly with the CPU (i.e. share data directly).

There are tons of two-way algorithms out there that perform badly because of this.
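As a rough, hypothetical example of the kind of two-way algorithm meant here, consider an OpenGL pass whose result the CPU has to read back before it can continue. In practice the synchronous read forces the driver to drain the whole pipeline and drag the data across the (AGP/PCI) bus; the sketch below assumes a working GL context and framebuffer and is not a complete program.

// Minimal sketch of a "two-way" step: render on the GPU, then pull the
// result back to the CPU for further work.
#include <GL/gl.h>
#include <vector>

std::vector<unsigned char> readbackStep(int width, int height)
{
    // ... draw calls for this pass would go here ...

    std::vector<unsigned char> pixels(width * height * 4);

    // glReadPixels does not return until the queued commands have executed
    // and the pixels have been copied back across the bus; only then can
    // the CPU-side half of the algorithm continue.
    glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, pixels.data());

    return pixels; // CPU now processes data the GPU just produced
}

On a shared-memory design like the one described in the next paragraph, that hand-off would be an ordinary memory access rather than a bus transfer.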

But yes, dedicated rendering hardware is the way to go anyway. The best solution, IMHO, would currently be a dual-Opteron motherboard with one Opteron and one dedicated SPU chip in the second socket. That way they share memory and have both high bandwidth and low latency when talking to memory or to each other. Great for parallelism (instead of the serialism we actually have today with GPUs: you just send the data and let them do the rest... rather delayed...).

Otherwise I agree with marcus on every point. Games are very restricted even today by what the GPU can present... and what it can't.

V-man
02-18-2004, 05:02 PM
Maybe this article about PCI Express, 64-bit processors and the future of graphics will interest some of you.
Here is a section from page 2:

"Tamasi extrapolated these trends a decade into the future, to illustrate what PCs and PC graphics cards would be capable of in 2014 (provided that technology continues to march apace): CPUs operating at 100GHz with 10 terabyte hard disks, 44GHz system RAM with bandwidth of 160GB/sec. Graphics cards will be able to handle 127 billion vertices and fill 270 billion pixels a second with over 3 terabytes a second of memory bandwidth and frame buffers of around 32 gigabytes. The computational power will be around 10 teraflops – enough to be ranked one of the top ten large-scale supercomputers by today's standards. This would be enough to render Shrek in pixel-perfect detail in real-time, with power to spare. Ten years may seem like a long way off, but when put in terms of technological evolution, we'll have amazing computing power on our desktops before we know it. "
http://www.extremetech.com/article2/0,3973,1529341,00.asp

gltester
03-29-2004, 05:50 PM
Originally posted by marcus256:
4) The argument that we are seeing "near photorealistic" rendering today is nonsense. Come back when we have real-time global illumination (not just precalculated lightmaps) in a full-blown outdoor scene (like outside my window: 100s of cars, 100s of trees with waving leaves, waving flags, a dozen people walking around, hills and forests in the background, 1000s of buildings), and I will join the optimists.
/me beats a dead horse ;)

NV40 demoed on next Unreal engine?
http://www.techreport.com/onearticle.x/6499

The backgrounds and outdoor areas looked very natural and realistic. However, the biggest surprise was the character models themselves. The engine was running characters with 6 million polygons. Yes, you read that right, 6 million polygons. The characters were extremely detailed, lit, the works.
We are probably nowhere near real-time raytracing, but we are going to get near-photo-quality rendering in just a few years anyway, even without raytracing. That is the point I kept making.

Adrian
03-29-2004, 06:45 PM
Polygon detail is just one aspect of computer graphics. The next Unreal engine will still rely on hacks and precalculations for shadowing and lighting. Will it accurately calculate the amount of indirect light arriving at each surface in a fully dynamic scene? I doubt it. You can have photorealism today if you put enough restrictions on the scene, but the goal is photorealism in a fully dynamic scene, and we are still a long way from achieving that.

BTW, the six million polygons was the polygon count of the original model; only 6,500 polygons were rendered in the demo.
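For a sense of what "indirect light arriving at each surface" actually entails, here is a minimal C++ sketch (not from the thread; sampleHemisphere and traceRadiance are placeholder stubs standing in for a real renderer) of one-bounce Monte Carlo gathering. Every shaded point needs many rays traced into the scene, and the result is invalidated the moment anything moves - which is why engines bake it into lightmaps instead.

#include <cstdio>

struct Vec { float x, y, z; };

// Placeholder: a real implementation returns a random cosine-weighted
// direction above the surface; here we just return the normal itself.
static Vec sampleHemisphere(Vec n) { return n; }

// Placeholder: a real implementation traces the ray and shades whatever it
// hits (a full visibility + shading query per sample).
static Vec traceRadiance(Vec /*origin*/, Vec /*dir*/) { return {0.1f, 0.1f, 0.1f}; }

// Indirect light arriving at one surface point: the average over N sampled
// directions. Multiply by the number of visible surface points per frame to
// see why this is not done dynamically on 2004-era hardware.
static Vec indirectLight(Vec point, Vec normal, int samples)
{
    Vec sum = {0.0f, 0.0f, 0.0f};
    for (int i = 0; i < samples; ++i) {
        Vec dir = sampleHemisphere(normal);
        Vec L = traceRadiance(point, dir);
        sum = {sum.x + L.x, sum.y + L.y, sum.z + L.z};
    }
    return {sum.x / samples, sum.y / samples, sum.z / samples};
}

int main()
{
    Vec e = indirectLight({0, 0, 0}, {0, 1, 0}, 256);
    std::printf("indirect estimate: %f %f %f\n", e.x, e.y, e.z);
}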