One Pass Rendering Pipeline!



Golgoth
06-12-2006, 05:20 PM
Hi all!

I've been orienting my rendering pipeline toward GLSL for a while now, after reading two-year-old posts in here… I can easily say that I'm behind! But I'd like to catch up on the rendering architecture!

My main concern is how lighting and shadows are handled!

Well, I'm one of those who thinks laziness is the foundation of efficiency… that said, multipassing is not an alternative I'll consider until further arguments are brought to the table… I think it's a workaround for bad design. The first thing I hear on this is "vertices are cheap, blah blah"… maybe, but my brain won't allocate a memory block for this pointer and will eventually crash!

The first question would be:

I've heard that professional engines render one pass per light and blend each pass into the frame buffer. Is that correct?

If yes, I still can't figure out why… but I should be pretty close to getting my answer. So far I calculate lighting in one pass... lights are accumulated in the shader… and I don't see why it should be done otherwise… 8 lights max, because I'm using gl_ states in the shaders temporarily, I hope…

This leads to another question:

How hard would it be to raise the maximum number of lights in OpenGL to, say, 128? Why 8 lights maximum? And please, I don't want to hear "you shouldn't need more than 8 lights at a time", or any hocus-pocus hack to avoid needing them… I want the truth!

Now, to answer the first thing that comes to your mind: I know I can use any number of lights using shaders… but lights have a lot of parameters to deal with… even though you have to set the OpenGL light states anyway, sending them through shader parameters is more expensive (you may prove me wrong on this, but I doubt it)… so I choose not to for the moment. I'm aware that using gl_ states contradicts my master plan of eventually using an infinite number of lights, but I've chosen to reject that idea so far for simplicity's sake, until further developments arise on the subject.

In my research so far into a one-pass rendering pipeline, it all comes down to the same bottleneck: OpenGL/hardware fails to fulfill my needs (welcome to the club, you'll say)… first, the maximum number of lights; second, the max texture matrix stack, which is 10 on my GF 7800, and 8 texture units… why clamp those values so low? What is the problem with those? I still want the cruel truth here! How can OpenGL developers not do anything about this aberration… what's going on, who's in charge here? <-- Mad Golgoth!

Casting shadows… still with the idea of using one single pass in the rendering pipeline, I managed to make an ugly compromise… and again, because of the low number of texture matrices and texture units available in the OpenGL FFP, which I use in the shader: the first 4 matrices/texture units are used by the color, gloss, environment and bump maps, and the next 4 units are used by shadow maps… which leads to a maximum of 4 shadow maps per primitive. And to answer the fire glowing in your eyes: yes, I want to be able to use 4 shadow maps per light, and I wish I could use more… it may sound heretical to most of you, and I'm aware of that too.

Now what:

What would be the smart thing to do:

1- Do one pass per light – easier shader-wise, but is it worth the multiple passes?
2- Do one pass for all the lights + one pass for texturing.
3- One pass does it all.

If One pass does it all:

1- Wait for OpenGL 3.0 and hope the max units go up – bye bye backward compatibility…
2- Send light parameters directly through shader uniforms and suffer the cost of sharing data between client and shader.
3- Don't bother, do as everyone else… forget about it.

Some say we can store data and functions in a texture handle in a shader... I'm not sure what that is or how it's done... can anyone help me clear this up?

Hope this is not too heavy… thanks for reading!

Still digging!

sqrt[-1]
06-12-2006, 06:14 PM
I think your problem is that you are trying to tack GLSL onto the old fixed-function pipeline. That interface is only really useful for porting existing programs, not for writing new (advanced) ones.

If you use the proper GLSL interfaces you have:
16 - Texture binding points (image units)
>40 - Texture matrices (uniforms)
8 x vec4 - Texture coordinate interpolators
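
For what it's worth, these limits can be queried at runtime instead of assumed; a minimal C++ sketch, assuming a GL 2.0 context and headers that expose these enums:

#include <cstdio>
#include <GL/gl.h>
#include <GL/glext.h>  // GL 2.0 enums on older headers

void printShaderLimits()
{
    GLint imageUnits = 0, varyingFloats = 0, attribs = 0;
    glGetIntegerv(GL_MAX_TEXTURE_IMAGE_UNITS, &imageUnits);  // sampler binding points
    glGetIntegerv(GL_MAX_VARYING_FLOATS, &varyingFloats);    // interpolator capacity
    glGetIntegerv(GL_MAX_VERTEX_ATTRIBS, &attribs);          // generic vertex attributes
    std::printf("image units: %d, varying floats: %d, attribs: %d\n",
                imageUnits, varyingFloats, attribs);
}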

rgpc
06-12-2006, 07:08 PM
I think the problem is that there is a distinct lack of understanding, on Golgoth's part, as to how shadowing is actually done. Not to mention a lack of understanding of what actually makes up a "texture unit", etc.

Golgoth
06-12-2006, 07:29 PM
Thanks sqrt... you're a straight shooter and I like it!

How about sending data to the shader:

If the texture has a gloss map... if the texture has an environment map, then if cube else if spherical... if the light type is ambient else spot… and all that sort of thing... specular, diffuse, emission, spot cutoff and so on, times n lights and/or n textures... everything has to be sent to one shader, not one shader per case... it is a lot of data to carry... I said this before, I'm still in the dark about the texture-handle data, but... ideally I'd push for a single shader that can handle every possible case; why would you want to do it otherwise? Plus, you can obviously offer a per-primitive shader that replaces the default shader for that primitive... but mainly, design-wise, the engine should have a default shader that maxes out all the default render states like we used to do with the FFP... why bother with a zillion shaders? Let's go straight to the point: bring the entire render state into one shader and stop accumulating data-traffic overhead… what do you guys think about that?

To be honest, I'm not up to date on recent developments, but in spite of all that, I would rather OpenGL developers gave serious thought to making the GPU accessible through a more open… hmm… HFP (Hybrid Functionality Pipeline) that can do what shaders are all about in the first place!

Thanks again for hearing me whine!

Golgoth
06-12-2006, 08:06 PM
I think the problem is that there is a distinct lack of understanding, on Golgoth's part, as to how shadowing is actually done.

As for shadows, using 1 depth map per light as in OpenGL® Shading Language, Second Edition, I think I have that nailed down... still digging, but I have some ideas on combining all the shadow maps into one texture unit... any hint would be welcome!


Not to mention a lack of understanding as to what it is that actually makes a "Texture Unit" etc.

A uint used as a memory address where texture data is stored!? What am I missing here?

On a GF 7800:
GL_MAX_TEXTURE_IMAGE_UNITS: 16 (shader)
GL_MAX_TEXTURE_UNITS: 4 (FFP)

correct?

Ouch... -.- How off am I? Please, strike me!

Obli
06-13-2006, 02:51 AM
Originally posted by Golgoth:
ideally I'd push for a single shader that can handle every possible case; why would you want to do it otherwise?...

You probably know the IHVs keep telling you to batch as much as you can.
Well, maybe you're stretching it a bit.
Although ubershaders do really help, I hardly believe a single ubershader would make sense (at least for now). The reasoning I found is design-driven, but since it's my own consideration, you're encouraged to take it with a grain of salt (it wouldn't be the first time I'm wrong).

Most of the time, the "world" must be realistic. To do that, it must be coherent. You wouldn't really put per-pixel-lighted polys near vertex-lighted ones.
It happens that most of the polys have similar properties. That leaves a restricted set of polys for special effects: to render them, some sort of state change is often needed, so ubershading them wouldn't be a real win.

The bottom line is that, to a certain degree, your engine must manage render states correctly so that you don't need to tell the shader "this surface has a gloss map" (information). Instead, you do the same thing with meta-information embedded in the shader: by using a shader which looks up gloss maps.


Originally posted by Golgoth:
why bother with a zillion shaders? Let's go straight to the point: bring the entire render state into one shader and stop accumulating data-traffic overhead… what do you guys think about that?

I think being able to set BLEND (for example) through a shader would be great (3DLabs originally proposed this). The point is that a shader does contain meta-information, such as "this surface does not receive shadows" (so no shadow maps are looked up). I think replacing all the shaders with a single parameter-driven one will likely increase the overhead. I am missing your point here.

Originally posted by Golgoth:
...making the GPU accessible through a more open… hmm… HFP (Hybrid Functionality Pipeline) that can do what shaders are all about in the first place!

I don't get you there. Current pipelines are really "hybrid" already; I think in a truly programmable environment, graphics would be mapped to stream-processing problems.

V-man
06-13-2006, 06:37 AM
What would be the smart thing to do:

1- Do one pass per light – easier shader-wise, but is it worth the multiple passes?
2- Do one pass for all the lights + one pass for texturing.
3- One pass does it all.

It might be possible to do 2 lights per pass, or even more. It depends on how many instructions the GPU supports. The current generation supports looping and conditionals, so it's possible to do plenty of lights per pass. There was an NV demo that showed this: the teapot with many point lights orbiting it.
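
A minimal sketch of that idea, several point lights accumulated in one pass; the shader source is illustrative (the uniform names and the diffuse-only lighting are assumptions), written as a C++ string since loop bounds must be compile-time constant on this generation:

// Hypothetical single-pass multi-light fragment shader.
const char* multiLightFrag =
    "#define MAX_LIGHTS 8\n"
    "uniform int numLights;              // actual light count\n"
    "uniform vec3 lightPos[MAX_LIGHTS];\n"
    "uniform vec3 lightColor[MAX_LIGHTS];\n"
    "varying vec3 position;              // eye-space, from the vertex shader\n"
    "varying vec3 normal;\n"
    "void main()\n"
    "{\n"
    "    vec3 n = normalize(normal);\n"
    "    vec3 accum = vec3(0.0);\n"
    "    for (int i = 0; i < MAX_LIGHTS; ++i) {\n"
    "        if (i >= numLights) break;   // early out on the real count\n"
    "        vec3 l = normalize(lightPos[i] - position);\n"
    "        accum += lightColor[i] * max(dot(n, l), 0.0);\n"
    "    }\n"
    "    gl_FragColor = vec4(accum, 1.0);\n"
    "}\n";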

One shader that does it all? You will lose some performance even on a GF 6200 and above.
For older cards (GF FX 5800 and Radeon 9700) you have to keep your shaders lightweight. I know the 9700 is very limited in terms of instructions, and it is also limited in features.

rgpc
06-13-2006, 06:44 AM
Originally posted by Golgoth:
uint used as a memory address where texture data is stored!? What am I missing here?

Silicon, for one thing.

Golgoth
06-13-2006, 11:25 AM
I think replacing all the shaders with a parameter-driven single one will likely increase the overheads. I am missing your point here.
If you compare the ubershader with a regular one, yes, it will increase overhead... but we have to decide whether or not to process a state somewhere. Let's take light types, for instance... AFAICS, the tendency now is to make one shader per light type…

If we compare the overall rendering time of a single frame:

first - 2 shaders, one doing a point light and the other doing a spot light…
second - 1 shader with a single if statement selecting point or spot light…

who wins?

Without mentioning the work involved in tracking files between artists, scripters and programmers, and the code duplication… one thing is for sure: when dealing with thousands of assets, not using the ubershader is definitely a workflow overhead…


I don't get you there. Current pipelines are really "hybrid" already; I think in a truly programmable environment, graphics would be mapped to stream-processing problems.

I meant extending GL extensions/functionality to access the GPU through GL calls, so that we can stay in the same development environment, in my case Visual Studio, instead of creating a new branch of languages/tools…

Like:

glTransformLogic(gl_Vertex * gl_ModelViewMatrix);
glFragColor(put color here);

How hard is that? I must be missing a huge piece of the puzzle, because this is driving me insane.

Korval
06-13-2006, 12:37 PM
The multiple-pass approach is justified by many things.

Shadowing can be done in two ways: stencil and shadow map. The stencil method requires multiple passes because there's only ever one stencil buffer. You need 2n+1 passes, where n is the number of lights.

Shadowmapping can, at a minimum, use n+1 passes. You still need one pass per light to generate the shadow maps in the first place.

Given that, you need at least one pass per light, either to generate a stencil buffer or to generate a shadow map.

Now, let's forget shadowing. Let's assume you're using shadow maps, and focus on lighting. Basically, the problem is simple: how many shaders do you want?

Lighting equations come in a huge variety. But, basically, all of them have a few inputs: light direction, light distance, surface normal, diffuse/specular surface color. Possibly a few other things.

Getting those parameters is very difficult, and it changes for each type of light. The light direction for a directional light is just a constant vector, whereas for point lights it needs to be an interpolant.

The surface normal, in the smooth case, is an interpolant. In a bump-mapped case, it's a modification of this interpolant (several interpolants, actually) based on a texture. Relief mapping goes even further in computing the normal.

There are a variety of lighting equations. From basic Blinn/Phong, through to complicated BRDFs and various other things.

There are several ways to handle this complexity. One is to build a megashader which can do everything; parameters determine which features are on/off for every invocation of the shader. This doesn't work well, because such a shader is brutally inefficient on modern glslang hardware.

There's the dynamic multipass approach. That is, for each combination of light type, surface type, and lighting equation, generate a shader. Then, for every kind that acts on a particular mesh, you do a pass. Hence the one-pass-per-light method. It's probably more efficient than the megashader approach, despite the multiple passes.

Then, there's what I would suggest. Figure out exactly how much stuff you want to interact with each object in a scene. Say, 1 shadowed (mapped) directional light and up to 2 directional lights with no shadows. Determine which lighting equation you will use. Then, build a shader for it. For each object, build the shader that you would want to use in that instance. There can be shader sharing, of course, where appropriate (say, a shader for every character).

The idea with the latter approach is to avoid multipassing on the "+1" step for shadow mapping. To me, the principal advantage of shadow mapping over stencil shadows is that shadow maps can render stuff with fewer passes. Doing an additional pass per light with shadow mapping makes no sense. And it avoids the "megashader" problems, because the shader is hand-crafted for each application.

Golgoth
06-13-2006, 01:46 PM
Thx Korval!


The stencil method requires multiple passes because there's only ever one stencil buffer.

That is so true... here is one piece I needed to catch up on... I've never used stencil shadows. I must say that FEAR did a great job with them!


Shadowmapping can, at a minimum, use n+1 passes. You still need one pass per light to generate the shadow maps in the first place.

Good point here… I didn't consider shadow-map generation a pass per se, for many reasons: there are no render states attached to it, it only writes to the depth buffer, the pass size is based on the map resolution, and it is rendered from the light's POV… so it's not part of the final result but a lighting calculation. To be clear, multipassing includes only passes from the eye's POV in my book. What I'm referring to as multipassing regarding lighting is this approach: calculate one light at a time in a shader, then draw the scene… go to the next light, blend the frame buffer, draw the scene, and so on for all lights… that's what I'm not crazy about… instead of combining all the lights in the same shader and drawing the scene once.


There are several ways to handle this complexity. One is to build a megashader which can do everything; parameters determine which features are on/off for every invocation of the shader. This doesn't work well, because such a shader is brutally inefficient on modern glslang hardware.

This doesn't work well? Why?


That is, for each combination of light type, surface type, and lighting equation, generate a shader.

You mean determine which features are on/off for every invocation client-side, then generate, compile, link and use the shader… even doing this each frame if needed?
How could this be more efficient than sending a true-or-false uniform to the ubershader?


Figure out exactly how much stuff you want to interact with each object in a scene. Say, 1 shadowed (mapped) directional light and up to 2 directional lights with no shadows.

I can't take this approach; this matter is in the artists' hands… the engine must allow the widest range of possibilities they can come up with, without modifying a line of code. I never saw texture artists writing shaders in production… they shouldn't have to… and that's pretty much the bottom line on this matter.

regards

Overmind
06-13-2006, 02:08 PM
This doesn't work well? Why?

What input parameters would your megashader have? That is, what are the attributes, varyings, uniforms, textures?

You're not done with a single "type" uniform that selects a particular equation. You also need every parameter of ALL possible equations, because you don't know which ones you'll need until shader execution.

Also, with one pass per light you have the additional advantage that you can cull objects that are outside the range of the light, so you do a lot less work if you have many lights with a low range. Google "deferred shading" to see how to take this to the extreme (think hundreds of visible lights).
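
A minimal sketch of that per-light culling, bounding spheres only; the type and the treatment of the light's range as a sphere radius are assumptions:

// Skip an object for this light's additive pass if its bounding sphere
// cannot intersect the light's sphere of influence.
struct Sphere { float x, y, z, radius; };

bool lightTouchesObject(const Sphere& light, const Sphere& object)
{
    float dx = object.x - light.x;
    float dy = object.y - light.y;
    float dz = object.z - light.z;
    float reach = light.radius + object.radius;  // light.radius = light range
    return dx*dx + dy*dy + dz*dz <= reach * reach;
}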

Golgoth
06-13-2006, 03:16 PM
What input parameters would your megashader have?

More or less what all the shaders combined would have… whatever you need to process any given state…


You're not done with a single "type" uniform that selects a particular equation. You also need every parameter of ALL possible equations, because you don't know which ones you'll need until shader execution.

I was expecting this one… you'll have to go through all of it client-side anyway… client-side, if a state is enabled, you send what the shader needs for that state… if not, the ubershader has the variables handy but doesn't use them to compute the final result… they can just sit there and wait for further instructions… off the top of my head, we could do some sort of variable pooling… more like generic variables…

e.g.:

uniform float var1;

if (state1 == 1)
    float reflection_indice = var1;
else if (state2 == 1)
    float opacity = var1;

Plus, what about sending diffuse, specular, ambient and emission in a single 4x4 matrix…


with one pass per light you have the additional advantage that you can cull objects that are outside the range of the light

Not sure how multipassing benefits from this… you can cull objects that are outside the range of the light without rendering anything… but I'll take a closer look at deferred shading for sure.

Thx again

Korval
06-13-2006, 03:23 PM
This doesn't work well? Why?

In addition to what Overmind said, only the most advanced graphics cards can handle the complex conditional branching necessary to do what you suggest. And using that conditional branching can cause substantial performance penalties.


How could this be more efficient than sending a true-or-false uniform to the ubershader?

Because you can precompile all the shaders that you will ever need. The number of combinations is pretty small:

If you have 2 light direction generators (directional and point), 2 surface normal generators (bump and smooth), 3 color generators (interpolated vertex, texture, and parallax&texture), and 3 lighting equations, then you have only 36 possible shaders. And the shaders themselves are pretty small.


I can't take this approach; this matter is in the artists' hands… the engine must allow the widest range of possibilities they can come up with, without modifying a line of code.

Then you're going to have to take the multipass approach. If you aren't allowed to restrict what the artists can do, you're going to have to sacrifice performance. You can't get without getting.

Golgoth
06-13-2006, 04:27 PM
Thx again for your answers!


And using that conditional branching can cause substantial performance penalties.

What kind of penalties? If we're talking about a 50% global speed hit, I'll forget about it… but for a ~5% hit from the ubershader's conditional branching versus none, it sounds reasonable for increasing our day-to-day development quality… the speed trade-off is not at all costs… at least I think so…


Because you can precompile all the shaders that you will ever need.

That's probably the part I'm scared of… I'm not sure if you mean hand-coding 36 shaders or auto-compiling them at engine initialization… in both cases, I can hardly imagine it… hand-coding is an absolute no-go… and as for the auto-compile idea… hmm… I'll have to meditate on that… maybe interesting… I still have no clue how.


If you aren't allowed to restrict what the artists can do, you're going to have to sacrifice performance. You can't get without getting.

If I can buy peace this way, it's a done deal for me…

regards

Korval
06-14-2006, 12:57 PM
What kind of penalties? If we're talking about a 50% global speed hit, I'll forget about it… but for a ~5% hit from the ubershader's conditional branching versus none, it sounds reasonable for increasing our day-to-day development quality… the speed trade-off is not at all costs… at least I think so…

It depends on what hardware you're talking about, and on how you write your megashader. Don't forget: quite a bit of the glslang-capable hardware can't do conditional branching in the fragment shader, period.

If you want exact answers, you'll need to benchmark it.


That's probably the part I'm scared of… I'm not sure if you mean hand-coding 36 shaders or auto-compiling them at engine initialization… in both cases, I can hardly imagine it… hand-coding is an absolute no-go… and as for the auto-compile idea… hmm… I'll have to meditate on that… maybe interesting… I still have no clue how.

I do not understand what you're trying to say here. Your artists aren't writing shaders, by your own admission. So you, or someone much like you, are going to have to write these shaders, whether it's a megashader or smaller ones.

It's only 36 compiled shaders. At a rate of, say, 4 shaders a day (written, tested, debugged), it wouldn't take you longer than 2 weeks to do it.

Not only that, the shader pieces are all swappable. It wouldn't be too hard to come up with some shader conventions so that all you do is compile (now using glslang terminology) the individual shaders and combine them at the program-linking stage into the 36. That way, you only need to write 2 + 2 + 3 + 3 = 10 shaders.


You can't get without getting.

I meant to say "You can't get without giving," btw.

Golgoth
06-14-2006, 03:56 PM
Korval, you have been a great contribution to this thread; thanks again!


Don't forget: quite a bit of the glslang-capable hardware can't do conditional branching in the fragment shader, period.

It is not a problem for my needs yet, since I'm nowhere near ready for a release… plus, I'm targeting a non-public market at the moment… current dev is done on a GF 7800, and I'm not planning on targeting anything lower.


It's only 36 compiled shaders. At a rate of, say, 4 shaders a day (written, tested, debugged), it wouldn't take you longer than 2 weeks to do it.
I agree with you to a certain level here... it's not a big deal once your engine is ready for release... but most of the time we're in dev mode... especially with shaders... they're hardly ever final in my case... so maintaining all those shaders would be a real waste of time in dev mode... That said, you seem to have brought to light an interesting idea that I'm clearly still in the dark about…


Not only that, the shader pieces are all swappable.

What would this mean?

I know we can attach more than one shader to a GLSL program; does this have anything to do with that?
AFAIK, when I tried attaching multiple fragment shaders, it turned out that the last fragment shader overwrote the first one… which makes me think it could be used for multipassing… am I correct? Or did you mean something else?



It wouldn't be too hard to come up with some shader conventions so that all you do is compile (now using glslang terminology) the individual shaders, and combine them at the program linking stage into the 36.
OK, now it's getting even more interesting; I can almost see a sparkle. Can you elaborate on this, if it's not too much trouble? Is this topic covered anywhere? OpenGL® Shading Language, Second Edition barely goes over it.

Thx again

regards

Korval
06-14-2006, 06:19 PM
AFAIK, when I tried attaching multiple fragment shaders, it turned out that the last fragment shader overwrote the first one… which makes me think it could be used for multipassing… am I correct?

It's exactly like building a regular C program.

(note: now using glslang terminology).

You build a shader from one or more text files. This is analogous to having a .c/.cpp file that includes one or more .h files. The shader text files are compiled in order, and can include header-type information (forward declarations of functions).

A built shader, a glslang shader object, is like a .o/.obj file in C. It isn't a program yet, and you can't use it directly.

A full glslang program is what is created when you take one or more shader objects and link them together. Now, you know that a glslang program (one that fully overrides the old pipeline) consists of a vertex shader and a fragment shader that link together. You know that these two shaders need to agree on the names of the varyings passed between them.

What you may not know is that you can take two vertex shaders and one fragment shader and link them together into one program. When you do that, the two (or more) vertex shaders are combined exactly like .o/.obj files are combined into executables.

One of those vertex shaders can call a function in the other. As long as the function was declared when the vertex shader was built, it can call it. But it doesn't need to know which compiled shader is going to implement it; as long as the function matches the declaration, everything is fine.

It's easy to apply this to our case; do it C-style.

You have a main fragment shader text file. It implements the main function for fragment shaders, and it never changes for any fragment shader program you create.

In my earlier post:


If you have 2 light direction generators (directional and point), 2 surface normal generators (bump and smooth), 3 color generators (interpolated vertex, texture, and parallax&texture), and 3 lighting equations, then you have only 36 possible shaders. And the shaders themselves are pretty small.

I defined the 4 stages of a fragment light program:

light direction generation
surface normal generation
color generation
lighting equation

So, your main shader text file looks something like this (in pseudo-code):


vec3 GetLightDirection();
vec3 GetSurfaceNormal();
vec4 GetColor();
vec4 ComputeLighting(vec3 lightDirection, vec3 surfaceNormal, vec4 color);

void main()
{
    vec3 lightDirection = GetLightDirection();
    vec3 surfaceNormal = GetSurfaceNormal();
    vec4 color = GetColor();
    gl_FragColor = ComputeLighting(lightDirection, surfaceNormal, color);
}

There's your main. That shader text (if it were actual glslang and not pseudo-code) would compile into a shader object. But it would not link by itself into a program, because it calls functions that aren't defined in any shader object.

So, you write a shader object that implements GetLightDirection for a directional light. It might look like:


uniform vec3 myLightDirection;
vec3 GetLightDirection()
{
    return myLightDirection;
}

So, whenever you use a program built from this shader object, you need to make sure that you set the myLightDirection uniform. Now, maybe the light direction is passed through an interpolant, for a point light. That might look like:


varying vec3 myLightDirection;
vec3 GetLightDirection()
{
    return myLightDirection;
}

Whenever you link this fragment shader object, you need to make sure to use a vertex shader object that provides the myLightDirection varying, of course.

Now, when you go to build your fragment program, you take your main shader object and one of each kind of the other shader objects, covering the functions the main shader object needs. You link them together, and you get a viable program. 36 of them, from 1 main shader object and 10 others.

The reason the last shader in your experiments kept overriding the previous ones was, I assume, that each one implemented its own main() function. So, much like with a C linker, you multiply defined the same function. It gave you some warnings (maybe?) about multiple definitions, and then kept only the last one.
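
Client-side, the mix-and-match build could look something like this; a minimal sketch using the GL 2.0 entry points, with error checking omitted and the object names illustrative:

GLuint compileObject(GLenum type, const char* src)
{
    GLuint sh = glCreateShader(type);
    glShaderSource(sh, 1, &src, 0);
    glCompileShader(sh);               // one shader object, like a .obj file
    return sh;
}

GLuint buildProgram(GLuint vertObj, GLuint mainFrag, GLuint lightDirObj,
                    GLuint normalObj, GLuint colorObj, GLuint equationObj)
{
    GLuint prog = glCreateProgram();
    glAttachShader(prog, vertObj);
    glAttachShader(prog, mainFrag);     // the only main(); calls the four hooks
    glAttachShader(prog, lightDirObj);  // one GetLightDirection() implementation
    glAttachShader(prog, normalObj);    // one GetSurfaceNormal()
    glAttachShader(prog, colorObj);     // one GetColor()
    glAttachShader(prog, equationObj);  // one ComputeLighting()
    glLinkProgram(prog);                // calls resolved here, like a C linker
    return prog;
}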

Golgoth
06-14-2006, 09:12 PM
I must say, I was literally glued to the screen; it was really interesting reading you!

Let's see if I get this right; here is a pseudo wrap-up:

Main.frag

So, it gave you some warnings (maybe?) about multiple definitions

I just tested it again; my bad, I thought it did overwrite, but it does not. My bet is that, at the time I tried it, my setup just deleted the current shader and replaced it with the new one internally. Just for the record, it does not compile; here is what glIntercept returns:

(4) : error C2002: duplicate function definition of main (previous definition at :45)
(4) : error C2001: incompatable definition for main (previous definition at :45)

zed
06-14-2006, 10:14 PM
You could do what I've done:
write a small app that generates a shader string, which you then create a GLSL shader from

string src = create_shader( dir_light | diffuse_texture | normal_texture /* etc. */ );

Well, I don't do exactly this; I fill a struct with the required data,
e.g. frag_shader.num_lights = 1;
and then let the app spit out the string based on what the struct contains.
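
A sketch of the flag-driven variant; the flag names and the chosen snippets are illustrative:

#include <string>

enum { DIR_LIGHT = 1, DIFFUSE_TEXTURE = 2, NORMAL_TEXTURE = 4 };

// Concatenate only the snippets the requested feature set needs.
std::string create_shader(unsigned flags)
{
    std::string src;
    if (flags & DIFFUSE_TEXTURE) src += "uniform sampler2D diffuseMap;\n";
    if (flags & NORMAL_TEXTURE)  src += "uniform sampler2D normalMap;\n";
    src += "void main()\n{\n"
           "    vec4 color = vec4(1.0);\n";
    if (flags & DIFFUSE_TEXTURE)
        src += "    color *= texture2D(diffuseMap, gl_TexCoord[0].xy);\n";
    // ... lighting snippets selected the same way for DIR_LIGHT etc. ...
    src += "    gl_FragColor = color;\n}\n";
    return src;
}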

Brolingstanz
06-14-2006, 10:25 PM
I second the text magic. You could parse any old script into something a GLSL compiler will understand. You could even add game-event logic, but that might hit you hard in the combinatorials.

P.S. Zed, just so you know, I've reverse-engineered your game engine and I'm releasing my first title next week.

CrazyButcher
06-15-2006, 01:23 AM
The automatic precompiling also worked well for FarCry. It has something like thousands of precompiled Cg shaders ;)

zed
06-15-2006, 09:56 PM
Originally posted by Leghorn:
P.S. Zed, just so you know, I've reverse-engineered your game engine and I'm releasing my first title next week.

You're doing better than me, then. Looks like this weekend's not gonna be too productive either; I buggered my left hand and am forced to type with just my right.
You would have thought that after all those years of wanking my wrists would have been stronger.
PS: latest in-game video
http://www.filefactory.com/?35b6ab
sorry I'm not singing in this one

Golgoth
06-16-2006, 11:54 AM
The automatic precompiling also worked well for FarCry.

Almost 3k shaders, AFAICS... design-wise, I'm definitely not convinced... but it does work… I'm going to get to the bottom of the ultimate ubershader… and see how it goes… still hoping for Korval's feedback… till then, keep it up guys, and thanks for the input!

SkeletalMesh
06-30-2006, 08:52 AM
:eek:

The first post in this thread (all posts by the starter of this thread) is the funniest 'Guru Pretender' stuff I ever read.. :D

He is completely confused about what textures are, how GPUs work and what programmable shading is... but thinks he is a next-gen technology architect... Maybe he wished that he would some day write something like Carmack's .plan file, but is too impatient to wait until he can get to that level..

I really wonder why you guys are trying to explain things to him with such effort. How can he even 'GET' anything?

I know I'm hitting very hard.. but seriously.. this is a complete waste of time.. and pretension...

Golgoth: Dude... I'd say you should invest a bit of time making your understanding of the rendering pipeline thorough, starting from the FFP and moving towards the programmable one..
My guess is you are an experienced game coder..
It shouldn't take you much time.. With a thorough grip on the basics, the time you spend on design & prototyping of the rendering engine would be really short..

Another suggestion is to download Ogre3D or some other open-source rendering engine out there... Ogre3D is definitely worth a try..
I'd say 'Why re-invent the wheel, when you already got a radial tyre for free????'

You could rather use your time on the many other important tasks related to engine development: setting up efficient production pipelines, artist tools, etc.. instead of getting into the low-level nitty-gritty details of next-generation rendering pipelines..

Well, if you really love getting deep into this stuff (just like me), you can.. but your R&D time is also gonna burn project budgets.. You can evaluate in that direction.. Not to mention the risk of running into dead-end problems in the middle of production, where some brand-new bug.. stops your artists.. It can be disastrous..

Commercial engines, or free ones like Ogre3D which are of commercial quality, are the result of many man-years of work.. feature additions.. bug-fixing cycles.. Are you sure you wanna go through all that at the cost of your project time????

Korval
06-30-2006, 12:23 PM
The first post in this thread (all posts by the starter of this thread) is the funniest 'Guru Pretender' stuff I ever read..

This is not a helpful message. If you're just going to attack someone for their lack of knowledge, your presence is not required here.

ebray99
06-30-2006, 12:42 PM
I agree with Korval. This does nothing for the discussion.

Korval: above you said stencil must be done in 2n+1 passes? I assume the +1 is the z-prepass (or ambient pass), but what is the 2 for? In my experience it requires n+1 passes.

Kevin B

Korval
06-30-2006, 05:52 PM
Korval: above you said stencil must be done in 2n+1 passes? I assume the +1 is the z-prepass (or ambient pass), but what is the 2 for? In my experience it requires n+1 passes.

The +1 is the ambient pass. If you do a z-prepass before this, it is +2. I suppose you could fold the ambient pass into one of the other lighting calculations, thus reducing it to +1.

And I consider the rendering of each light's stencil shadows a pass. It's not a full shader pass, but it takes up loads of fillrate due to the long, multiple, likely overlapping volumes being rendered.

SkeletalMesh
07-01-2006, 06:48 AM
Originally posted by Korval:

The first post in this thread (all posts by the starter of this thread) is the funniest 'Guru Pretender' stuff I ever read..

This is not a helpful message. If you're just going to attack someone for their lack of knowledge, your presence is not required here.

Agreed, it does not 'add' to the discussion and might sound like attacking someone. But you haven't figured out my intention. We all came from the same stage of 'lack of knowledge'.. including me.

Why would anyone think asking questions is something to point at ??

But these kinds of statements were really irritating..


OpenGL/hardware fails to fulfill my needs (welcome to the club, you'll say)… first, the maximum number of lights; second, the max texture matrix stack, which is 10 on my GF 7800, and 8 texture units… why clamp those values so low? What is the problem with those? I still want the cruel truth here! How can OpenGL developers not do anything about this aberration… what's going on, who's in charge here?

Anyways, it looks like you ignored the majority of my post, which was addressed to Golgoth and the best approach he can take.. I stand by my point that it would be more helpful & productive to re-use free resources, as building an engine from scratch would be a significant time investment given the level of know-how..

Trying to explain cache coherence, teach real-time shadowing algorithms and next-gen engine design in forum threads just does not make sense to me.. nor do I think it is helpful to anyone...

Golgoth
07-01-2006, 11:47 AM
Hi everyone!


The first post in this thread (all posts by the starter of this thread) is the funniest 'Guru Pretender' stuff I ever read..

That is one way to see it! English is not my first language, so some comments may come out the wrong way… in my defense, first post, first paragraph:


I can easily say that I'm behind! But I'd like to catch up on the rendering architecture!

I'm impatient and I'm clearly not a guru, but I have some ideas!


My guess is you are an experienced game coder…

Wrong; before I started C++ tutorials, I was a full-time artist in the game industry for 10 years; you have probably played games I've worked on. Anyways… I didn't know the rabbit hole went that deep… and you are right… there are pieces of the puzzle I don't have yet, and that's why I'm here…


I'd say you should invest a bit of time making your understanding of the rendering pipeline thorough

I'm working on it... every day…


OpenGL/hardware fails to fulfill my needs (welcome to the club, you'll say)… first, the maximum number of lights; second, the max texture matrix stack, which is 10 on my GF 7800, and 8 texture units… why clamp those values so low? What is the problem with those? I still want the cruel truth here! How can OpenGL developers not do anything about this aberration… what's going on, who's in charge here?

Everyone seems to know the answer… I don't… and I do not want to design hardware… but I may consider introducing a new chapter in my .MasterPlan file… : )


I stand by my point that it would be more helpful & productive to re-use free resources, as building an engine from scratch would be a significant time investment given the level of know-how..

Even though you make it sound like I've just opened the first NeHe tutorial… thanks for the tip, but it is too late for me now… I know what I'm looking for, and I've dug too deep to go back… I was hoping for the big picture, a sequence of bullet points, on how lighting is currently handled in a rendering pipeline, and to discuss next-gen possibilities with experienced coders! I have it working in different ways… but I'm still not happy with it… as per the title of this thread: without including shadow calculations, is a one-pass rendering pipeline possible?


Trying to explain … next-gen engine design in forum threads just does not make sense to me.. nor do I think it is helpful to anyone...

I'm really interested: why would you say that? Because no one is really sure where it is going, or because the people who can answer are bound by NDAs? If we can't discuss it here, where then?

regards

Golgoth
07-01-2006, 12:05 PM
The +1 is the ambient pass. If you do a z-prepass before this, it is +2. I suppose you could fold the ambient pass into one of the other lighting calculations, thus reducing it to +1.

If by ambient pass you mean controlling the color and shadow opacity with the FFP... you may find this interesting:

GL_TEXTURE_COMPARE_FAIL_VALUE_ARB

Unfortunately the extension is not supported on my GF 7800 for some reason... so I couldn't validate it... but it should save you the ambient pass.
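
For reference, the parameter comes from ARB_shadow_ambient and is set per texture; a minimal sketch, assuming a bound depth texture with depth compare enabled (the texture object name is illustrative):

// A failed depth compare now returns 0.5 instead of 0.0, leaving
// shadowed texels half-lit and saving the separate ambient pass.
glBindTexture(GL_TEXTURE_2D, shadowMapTex);
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_COMPARE_FAIL_VALUE_ARB, 0.5f);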

V-man
07-01-2006, 12:27 PM
Although years have passed and are passing, some things are not likely to change much. The top game makers don't absolutely need a huge matrix stack, a zillion lights, 100 texture units, etc. The HW vendors have to make sure they balance things out so that decent cost/performance can be achieved, even if it means multipassing.

"the maximum of lights"

Several reasons:
1. The minimum GL asks for is 8, so most offer 8.
2. GL does vertex lighting. Who likes that?
3. Shaders are the future. Code your 10,000 lights yourself. Do it per pixel.

"8 texture units"
1. Today, it's called a texture image unit. That means a texture unit is no longer tied to a texture coordinate.
2. 8 is not bad by today's standards. ATI and NV offer 16, and they offer 8 tex coords.
3. Use generic vertex attributes.


"matrix stacks"
1. Does it matter? Does it improve performance?
2. Pretend there is no stack at all.
3. Write your own software stack

You could always consider a 3rd-party solution, because all this may be above you.
If you are interested in learning GL programming, you have to code it yourself.

Golgoth
07-01-2006, 02:13 PM
The hw vendors have to make sure they balance things out so that decent cost/performance can be achieved, even if it means multipassing.

I disagree with this; it should be up to the developers to decide what's most important… HW vendors should leave the door open.

So my question is: what is it that clamps texture or light units to a certain number? What difference does it make to HW vendors? Why not have 128 texture units, for instance… is it just physically impossible, a maximum instruction count on the chip? If yes, why can't the hardware take care of multipassing on its own?


GL does vertex lighting. Who likes that?
The reason I'd like more vertex lights is only for the gl_ built-in variables that let us retrieve light info in shaders... since I'm doing all lights in one pass, using light structs in shaders will overload the shader quickly... plus, sending light variables to shaders via uniforms is costly... don't you agree? I know the FFP has to do it as well... but it seems to do it in a more efficient way... and it does not influence the max shader instructions… maybe someone can confirm this.


8 is not bad by today's standards. ATI and NV offer 16, and they offer 8 tex coords. Use generic vertex attributes.

On current NVIDIA HW there are 16 attributes, and 13 are already reserved by the standard vertex attributes… 1, 6 and 7 are free… and I've filled those up already… so I have to make cruel choices if I'm going to use the glTexCoord attributes for something other than what they're meant for… I heard there is no standard/generic attribute overlap on ATI HW… which leads to 16 + 13 attributes… right?


Does it matter? Does it improve performance?

Yes, it does… it says 10, but I can't go past 8 for some reason or GL crashes… if you have 2 shadow maps and 1 occlusion map, that leaves 5 for textures… the problem with this is that 3D assets sometimes need several UV sets for artists to work on texturing… to overcome the limitation we must flatten the work before export… but once it is flattened it is a real pain to do more work on the asset's UVs… and rolling back each asset before flattening is nonsense… which is a major downside for workflow performance. 8-10 is a tight closet… at 16 (which would make sense, to match the texture units, wouldn't it?) we would start to breathe a little! Each texture unit should have its own matrix… if I wrote my own stacks I'd have uniform overflow.


You could always consider a 3rd-party solution, because all this may be above you.

The whole thing is HW-related, AFAICS; a 3rd party would still have to deal with multipassing… see the title of this thread! If "vendors have to make sure they balance things out so that decent cost/performance can be achieved" is the answer to all this… that's all I needed… I'm going to put this one-pass thing on hold and hope for more power under the hood in the next generation of video cards… till then, do as everyone else does and go multipass… and this thread is closed!

ZbuffeR
07-01-2006, 02:31 PM
what is it that clamps texture or light units to a certain number? What difference does it make to HW vendors? Why not have 128 texture units, for instance… is it just physically impossible, a maximum instruction count on the chip? If yes, why can't the hardware take care of multipassing on its own?
sending light variables to shaders via uniforms is costly...

You almost answered yourself. You can do it currently, but it is slow. Why? Because there is no ultra-fast hardware support for it. Why? Because more hardware means more silicon transistors. Which means more potential failures, which means higher production costs.

Of course, if there is a market, it gets done. The first GeForce FX batches had almost 50% reject rates at the end of the line...

SkeletalMesh
07-01-2006, 02:34 PM
what is it that clamps texture or light units to a certain number? What difference does it make to HW vendors? Why not have 128 texture units, for instance… is it just physically impossible, a maximum instruction count on the chip? If yes, why can't the hardware take care of multipassing on its own?

It's expensive.. transistor-count wise.. chip-manufacturing-cost wise.. heat-emission wise, etc..
Physically impossible.. in a way, yes...

They can only put n transistors on a chip.. As technology moves ahead, they reduce the transistor size (the fabrication process) and are able to fit more transistors on the chip...

Basically, 'texture units' or 'light computation units' are, at the lowest level, bunches of transistors laid out to compute some stuff..

If you look at how the hardware evolved, it becomes obvious..

Riva 128 = 1 texture unit
Riva TNT = 2 texture units
Riva TNT2 = 2 texture units, but faster & more features
GeForce = 2 texture units, faster, lighting units added (hardware T&L)
(till this point these texture units could only do a bunch of 'fixed ways of handling texels', fixed function that is: 'add these two textures', 'multiply these', 'multiply 2X', etc.)

GeForce 3 = 4 texture units.. and also programmable..

With each generation there are more transistors you can put on a chip.. and it's not just texture units: there are a lot more things GPUs need to do.. so GPU designers break their heads over how to best use the given transistor count..

It's a balancing act..

Komat
07-01-2006, 02:54 PM
Originally posted by Golgoth:

So my question is: what is it that clamps texture or light units to a certain number? What difference does it make to HW vendors? Why not have 128 texture units, for instance… is it just physically impossible, a maximum instruction count on the chip? If yes, why can't the hardware take care of multipassing on its own?

Anything you add to the HW increases the cost of the chip, both directly (fewer chips per silicon wafer, higher chance of waste) and indirectly (higher cost of development and testing). Unless there is high demand and a reasonable use for having 128 separate textures accessible from the shader, that cost is simply not justified.



The reason I'd like more vertex lights is only for the gl_ built-in variables that let us retrieve light info in shaders... since I'm doing all lights in one pass, using light structs in shaders will overload the shader quickly... plus, sending light variables to shaders via uniforms is costly... don't you agree?

What you really need is uniforms shared between programs, which were unfortunately not included in the GLSL API.
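
A common workaround, in the absence of shared uniforms, is a helper that pushes the same value into every program that uses it; a minimal sketch (the program list and uniform name are illustrative):

#include <vector>

// Set the same vec3 uniform in every program that declares it.
void setSharedUniform3f(const std::vector<GLuint>& programs,
                        const char* name, float x, float y, float z)
{
    for (size_t i = 0; i < programs.size(); ++i) {
        GLint loc = glGetUniformLocation(programs[i], name);
        if (loc < 0) continue;       // this program doesn't use the uniform
        glUseProgram(programs[i]);   // glUniform* affects the bound program
        glUniform3f(loc, x, y, z);
    }
}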

Each texture or lighting unit has nontrivial semantics and state associated with it for the purposes of the FF pipeline, beyond the values you access from the shader. If the driver reported, say, 128 lights, it would have to support the FF pipeline with all those lights enabled, with maximum features.



On current NVIDIA HW there are 16 attributes, and 13 are already reserved by the standard vertex attributes… 1, 6 and 7 are free… and I've filled those up already… so I have to make cruel choices if I'm going to use the glTexCoord attributes for something other than what they're meant for…

You can use a generic attribute that nVidia aliases with a conventional attribute; you simply cannot use that conventional attribute at the same time. To simplify things, do not use the conventional attributes at all and send all input through generic attributes.
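
A minimal sketch of the all-generic route; the attribute name, slot number, program object and data pointer are all illustrative:

// Bind a name to a generic slot before linking, then feed it at draw time.
glBindAttribLocation(prog, 8, "myTangent");
glLinkProgram(prog);

// Later, when setting up vertex arrays:
glEnableVertexAttribArray(8);
glVertexAttribPointer(8, 3, GL_FLOAT, GL_FALSE, 0, tangentData);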

Golgoth
07-01-2006, 02:55 PM
Thanks guys!

This is really interesting.

Don't forget, about 12-15 years ago an SGI workstation geared up for Photoshop and Softimage on Unix cost around 50 to 100 thousand... my question is, beyond the mainstream market right now, is anyone aware of a high-end video card that is pushing the current limits? Some card that would be mainstream in 2-3 years, maybe?

SkeletalMesh
07-02-2006, 01:15 AM
The whole thing is HW-related, AFAICS; a 3rd party would still have to deal with multipassing…

Exactly.. So the same issues exist even if you use Direct3D for the rendering engine..

It might make things much clearer if you spend some time on the Direct3D API as well.. You would see that the numbers (texture units, texcoord sets/interpolators, uniforms/constant regs) are exactly the same there.. It would help you separate API limitations from hardware limitations..
Though I don't think either API adds limitations.. You can do the same things in either API; the convenience differs. Engine design is purely constrained by the hardware architecture..

Basically we are programming the same hardware, but with a different API.. The same thing applies to the shading languages.. GLSL / HLSL.. the syntax changes here and there..

You might also find the Direct3D 10 preview & documentation a very interesting read.. the next generation of hardware to come.. D3D10 hardware (by which I mean d3d10-generation hardware, or hardware that conforms to the Direct3D 10 specifications) is gonna be a happy time for us coders and artists.. as most of the limiting numbers of this generation that you talked about are going to be magnitudes higher. Also, there is a new shader type: the Geometry Shader, which sits between vertex & fragment processing and can be used for lots of exciting stuff.

SkeletalMesh
07-02-2006, 01:39 AM
Originally posted by Golgoth:

my question is, beyond the mainstream market right now... is anyone aware of a high-end video card that is pushing the current limits? Some card that would be mainstream in 2-3 years, maybe?

2-3 years? Then D3D10 HW is probably what you are looking for.. You will find these very interesting..

http://www.gamedev.net/reference/articles/article2316.asp

http://enthusiast.hardocp.com/article.html?art=MTA0NSwxLCxoZW50aHVzaWFzdA

While the PlayStation 3 GPU (nVidia RSX) and the ATI Xenos GPU in the Xbox 360 are pretty much current-gen hardware (D3D9c / GL 2.0), Xenos is something in between D3D9 & D3D10, more towards D3D9.

The next next-gen of 3D hardware could be something completely different from the rasterizer GPUs we have today..

See.. rasterizer 3D rendering is basically a hack to show 3D worlds at interactive speeds.. In reality, shadows are the absence of light.. but that does not apply to our rendering techniques, because the lighting & shadowing we do now are basically two different hacks stacked up to make believe they are inter-related.

Whereas ray tracing is more like how lighting/shadows work in the real world.. Ray tracing would fundamentally solve issues which are so hard to achieve on today's GPUs.. like caustics, complex fully dynamic lighting, soft shadows, real-time global illumination, etc..

Things like area lights, reflections, inter-reflections, colored shadows, etc., which are so complex & expensive to implement even as rough approximations, come inherently with ray-tracing solutions..

There is real-time ray-tracing hardware in development, but only in university labs.. There are prototype hardware & demos (SaarCOR) which are already showing incredible images almost impossible on GPUs.. I am not sure if nVidia & ATI are developing something RT in their backyard, but, if properly funded, RT hardware would truly be a revolution in real-time rendering..

KC

Overmind
07-02-2006, 03:50 AM
That's not entirely true. Ray tracing is just another approximation of the real world. Of course, some things, like arbitrary surfaces, reflection and refraction, are better described by ray tracing.

But diffuse lighting, caustics and realistic (soft) shadows are still a weakness of ray tracing. Of course, we still have some hacks to solve these. For diffuse lighting, the same approximation we use now is used. Caustics and soft shadows can be done by casting a lot of rays, but that's still an approximation.

I also disagree that colored shadows and so on are hard to implement with a polygon renderer. Hard shadows are something that polygon renderers can currently do very well, and doing soft shadows is equally hard on a raytracer.

I agree that a hardware raytracer would be nice to have. But I don't think quality would improve that much, because all these *hacks* work really well. Raytracing is not really more realistic; it is just a different approach, not necessarily better or worse than polygon rendering, with each method having its own strengths and weaknesses...

V-man
07-02-2006, 07:06 AM
Originally posted by Golgoth:

Does it matter? Does it improve performance?

Yes, it does… it says 10, but I can't go past 8 for some reason or GL crashes… if you have 2 shadow maps and 1 occlusion map, that leaves 5 for textures… the problem with this is that 3D assets sometimes need several UV sets for artists to work on texturing… to overcome the limitation we must flatten the work before export… but once it is flattened it is a real pain to do more work on the asset's UVs… and rolling back each asset before flattening is nonsense… which is a major downside for workflow performance. 8-10 is a tight closet… at 16 (which would make sense, to match the texture units, wouldn't it?) we would start to breathe a little! Each texture unit should have its own matrix… if I wrote my own stacks I'd have uniform overflow.

You do know I was talking about the matrix stacks, right? There are the projection matrix, the modelview matrix, the texture matrices, and I believe one for the color matrix, which is part of the imaging extension.

"If 10 is not enough, then you can write your own software stack" means you pretend the stack size is one and you code yourself something that handles the stacks in your own memory buffers.

Golgoth
07-02-2006, 11:26 AM
2-3 years? Then D3D10 HW is probably what you are looking for.. You will find these very interesting..

Indeed!


ATI is working closely with Microsoft to make sure the DirectX 10 API and their GPU programmability are accessible to game developers.

Developing with OpenGL on NVIDIA hardware leaves me a little perplexed… but it really sounds promising! Regarding new architectures, what is the juicy OpenGL 3.0 stuff? DX10 vs. OpenGL 3… a new battle of titans?


DirectX 10 is deeply embedded into Windows Vista operation, and we currently know of no plans by Microsoft to allow Windows XP to officially support the new API.

They are going right for our pockets; let's face it, what they really care about is their .ConquerTheWorldPlan file.


What we mean by API object overhead is that the API is using CPU cycles to achieve tasks necessary for rendering before output is sent to the video card for drawing. When rendering a game, the application first has to call the API, and then the API calls the driver before it ever interacts with your video card's GPU. These calls are all handled by the CPU, using valuable resources and creating a potential bottleneck.

Isn't that why UNIX systems are so stable… because no CPU cycles are needed to achieve tasks on the video and sound cards?

ebray99
07-02-2006, 11:39 AM
Isn't that why UNIX systems are so stable… because no CPU cycles are needed to achieve tasks on the video and sound cards?

It doesn't matter what OS you use... modern PC architecture requires the processor to dispatch commands to the hardware. As far as the API is concerned, it simply provides a standard "interface" to application developers. Under the hood, the API is sending commands to another standard interface in the driver. However, it's not a simple indirection, but rather a slew of messages, spin locks, semaphores, etc., really quite a bit of "stuff". Unix probably doesn't have as many things to synchronize and schedule as far as graphics and rendering are concerned, but it still requires the CPU to get involved. In short, Unix probably just does less.

Kevin B

Golgoth
07-02-2006, 09:59 PM
Oops, sorry V-man, your last post slipped my attention...


You do know I was talking about the matrix stacks, right?

I kind of mixed up matrix count and matrix stack depth there... but I was indeed thinking about the matrix stack depth...


and I believe one for the color matrix

glMatrixMode(GL_COLOR) confirmed!


"If 10 is not enough, then you can write your own software stack" means you pretend the stack size is one and you code yourself something that handles the stacks in your own memory buffers. Hum… I m going to medidate on this for a while... just cant figure this out atm...

BTW: I'm still hoping to get Korval's feedback on the post starting with:


I must say, I was literally glued to the screen; it was really interesting reading you!

If it is not too much to ask.

regards

Korval
07-03-2006, 02:43 AM
Why can't the hardware take care of multipassing on its own?

Because, when it comes down to it, it would be a really bad idea.

The ARB considered requiring glslang implementations to accept any valid shader, to completely virtualize all hardware limitations and just let the driver figure out how to handle it. Idealistic.

Idealism must give way to practicality. The fact is, hardware really needs to expose those limits to the user. A 5600 can't do looping at all in the fragment shader, but a 6600 can. A shader based significantly on looping will be virtually unusable on the 5600; you may as well not have bothered to compile it.

Furthermore, the program itself has no indication that the shader is sub-optimal for said hardware. As such, it can't know that the 5600-compiled shader is going to be horrifically slow without trying it first. Trial and error is not the best way to tell whether something is going to work well.

Now, let's say you, as the developer, are aware that the 5600 shader will be really slow, and the 6600 one will be fast enough. So you develop an approximation that you would like to use on the 5600, something that looks good, but not as good as it could.

Well, now what? You have no way to tell if you're executing code on a 5600 or not.

By exposing limits, you allow a program to provide alternative shaders for various levels of hardware.

Oh, a driver could potentially virtualize hardware limits (or some of them, at least; I seriously doubt that virtualizing and multipassing on the number of attributes or varyings is even possible for all shaders). But it would create more problems than it solved.


doing soft shadows is equally hard on a raytracer.

No, it isn't.

It's expensive, but it's not hard. Indeed, it just requires firing more shadow rays. Whereas, in scan conversion, you have to come up with entirely new shadow algorithms.


BTW: I'm still hoping to get Korval's feedback on the post starting with:

I don't know what you want me to say. I explained what I was talking about, and you seemed to get it. Whether your code was correct or not, I can't say offhand, but that was the general idea.

Golgoth
07-03-2006, 10:48 AM
I seriously doubt that virtualizing and multipassing on the number of attributes or varyings is even possible for all shaders

With the arguments brought to the table... I understand why now... well, count me as one of those demanding more transistors.

ideally:

count = GL_MAX_TEXTURE_IMAGE_UNITS = 16 // texture count based algo.

GL_MAX_LIGHTS = count
GL_MAX_VARYING_FLOATS = count * 2
GL_MAX_VERTEX_ATTRIBS = count // no overlaps with standard attributes on nvidia.
GL_MAX_VERTEX_UNIFORM_COMPONENTS = count * count * 2
GL_MAX_PROJECTION_STACK_DEPTH = count / 2
GL_MAX_TEXTURE_STACK_DEPTH = count
GL_MAX_COLOR_MATRIX_STACK_DEPTH = count

This would probably set the standard for a one-pass rendering pipeline, as far as I'm concerned...

Here is the "Guru Pretender" call SkeletalMesh was waiting for:

Furthermore, and unlikely to happen, unfortunately: push the HW to its limits and draw the line between "it is physically not possible yet" and "we can do it, but it would be expensive"… at least developers could work on yet-to-come mainstream hardware... while cooking 1-3 year projects, for instance... the title could be up to date at release and for years after...



I don't know what you want me to say. I explained what I was talking about, and you seemed to get it. Whether your code was correct or not, I can't say offhand, but that was the general idea.

Fair enough!

Thanks again, gentlemen!

Korval
07-03-2006, 01:58 PM
Here's what you're not understanding.

All of these constants:

GL_MAX_LIGHTS = count
GL_MAX_PROJECTION_STACK_DEPTH = count / 2
GL_MAX_TEXTURE_STACK_DEPTH = count
GL_MAX_COLOR_MATRIX_STACK_DEPTH: count

are meaningless. They cover driver-side stuff that you could just as easily do yourself. They are not stopping you from doing what you want.

The number of attributes, varyings, and uniforms are the actual hardware restrictions.

V-man
07-03-2006, 02:00 PM
Originally posted by Korval:
Well, now what? You have no way to tell if you're executing code on a 5600 or not.

glGetString(GL_RENDERER) should return GeForce 5600 along with other substrings, so you are technically wrong.

At least GLSL offers some things that can be queried.
One can also query for info using ARB_vp and ARB_fp. If you have the NV extensions, query those.

Those numbers give an idea on which path your engine should take.

I like those low-level shaders. They offered a lot of low-level info. They told you when a shader isn't native (runs in software):
how many temps, how many parameters, environment parameters, branching depth.

Golgoth
07-03-2006, 02:39 PM
The number of attributes, varyings, and uniforms are the actual hardware restrictions.

Got it!

Hmm… interesting… custom matrix stacks are one of the keys… I still have no clue how, but I'll dig into this… I'm doing an export/import plugin at the moment, but… I'll be back. :cool:

Thanks for all the great input!

cheers

knackered
07-03-2006, 03:33 PM
#include <stack>

// Assumes some mat4 type that converts to const GLfloat* for glLoadMatrixf;
// in real code, rename these functions so they don't clash with GL's own
// glPushMatrix/glPopMatrix entry points.
std::stack<mat4> matrixStack;
mat4 currentMatrix;

void glPushMatrix()
{
    matrixStack.push(currentMatrix);    // save the current matrix
}

void glPopMatrix()
{
    currentMatrix = matrixStack.top();  // read the saved matrix first...
    matrixStack.pop();                  // ...then discard it from the stack
    glLoadMatrixf(currentMatrix);       // re-upload the restored matrix
}
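
A usage sketch, assuming currentMatrix starts out as the identity and that mat4 supports multiplication (both assumptions, as is the objectLocal matrix):

// Wrap an object's transform exactly as with the FFP stack.
glPushMatrix();                                // save the parent matrix
currentMatrix = currentMatrix * objectLocal;   // compose on the CPU
glLoadMatrixf(currentMatrix);                  // upload once before drawing
drawObject();
glPopMatrix();                                 // restore and re-upload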

Komat
07-03-2006, 07:27 PM
Originally posted by V-man:
glGetString(GL_RENDERER) should return GeForce 5600 along with other substrings, so you are technically wrong.

The problem is that the format of that string is not standardized, so it may theoretically change between drivers, and it is not required to contain a reference to the HW type at all. You also have to make sure that you do not, through a parsing error, include a future "GeForce 56000" or exclude a "GeForce 5600XT", "GeForce 5600 Ultra" or "GeForce 5600/AGP".
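
The kind of fragile substring check being discussed, as a minimal sketch; per the warning above, the renderer string format is not standardized, so treat this as a heuristic only (the exact substring is an assumption):

#include <cstring>

// Heuristic, not a reliable hardware ID: the string may change between
// drivers and variants like "5600 Ultra" or "5600/AGP" also match.
bool looksLikeGeForce5600()
{
    const char* renderer = (const char*)glGetString(GL_RENDERER);
    return renderer && std::strstr(renderer, "GeForce FX 5600") != 0;
}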