
Shader Model 3.0 in OpenGL



armored_spiderman
01-18-2006, 08:17 AM
Can anyone tell me since when OpenGL has supported Shader Model 3.0 (all its features), and whether it is exposed through ARB extensions or is already in the core?

Sorry if I said something silly... it's because I really want to find out who provided SM 3.0 support first, OpenGL or DX, and maybe some of you can help me.

thanks

KRONOS
01-18-2006, 11:54 AM
Shader Model (1.0, 2.0 or 3.0) is just D3D's way of grouping certain capabilities a given card has. OpenGL doesn't work this way. You can access everything SM3.0 provides for D3D in GLSL. You write your GLSL shaders and they run (or not): no shader models involved.

If you want, you can access the SM2.0 feature set with ARB_[vertex,fragment]_program. For features beyond SM2.0 (like SM3.0) you have to work with GLSL or use NVIDIA's extensions.
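For illustration, a minimal sketch (the helper name has_extension is made up, and it assumes a GL context is already current) of checking whether the driver exports those ARB program extensions:

#include <GL/gl.h>
#include <string.h>
#include <stdio.h>

/* Rough substring check against GL_EXTENSIONS. Note: strstr() can also match
   a longer extension name that merely contains the requested one, so a real
   implementation should tokenize the string on spaces. */
static int has_extension(const char *name)
{
    const char *extensions = (const char *)glGetString(GL_EXTENSIONS);
    return extensions != NULL && strstr(extensions, name) != NULL;
}

void report_arb_programs(void)
{
    if (has_extension("GL_ARB_vertex_program") &&
        has_extension("GL_ARB_fragment_program"))
        printf("SM2.0-level assembly programs available.\n");
    else
        printf("No ARB assembly programs; fall back to GLSL or vendor extensions.\n");
}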

armored_spiderman
01-18-2006, 03:19 PM
Originally posted by KRONOS:
Shader Model (1.0, 2.0 or 3.0) is just D3D's way of grouping certain capabilities a given card has. OpenGL doesn't work this way. You can access everything SM3.0 provides for D3D in GLSL. You write your GLSL shaders and they run (or not): no shader models involved.

If you want, you can access the SM2.0 feature set with ARB_[vertex,fragment]_program. For features beyond SM2.0 (like SM3.0) you have to work with GLSL or use NVIDIA's extensions.
Hmm, I understand... I already knew that OpenGL doesn't use the name SM 3.0; it just has the features to do it.

I want to know since when GLSL has been capable of doing things that are only possible with "SM 3.0"... like dynamic branching, dynamic flow control, texture lookup, the vertex frequency stream divider, geometry instancing, etc.... the things that Microsoft and NVIDIA consider Shader Model 3.0.

KRONOS
01-19-2006, 01:58 AM
Originally posted by armored_spiderman:
I want to know since when GLSL has been capable of doing things that are only possible with "SM 3.0"... like dynamic branching, dynamic flow control, texture lookup, the vertex frequency stream divider, geometry instancing, etc.... the things that Microsoft and NVIDIA consider Shader Model 3.0.
Dynamic flow control and texture lookup (in the vertex shader) are possible with GLSL.
The vertex frequency stream divider and geometry instancing are not available in GL because, well, they are not needed (check out the ARB meeting notes and NVIDIA's developer site).

And I would be careful, because SM3.0 doesn't mean, for example, that you are able to do a texture fetch in a vertex shader. ATI's latest card, for example, "provides SM3.0", and thus texture fetch. But ATI doesn't provide any texture format usable for vertex texture fetch, making it useless. SM3.0 doesn't mean that you automatically have texture lookup in the vertex shader.
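For reference, a minimal, hypothetical sketch of what those two features look like in GLSL (the uniform names heightMap and displacementScale are made up, and it assumes the driver actually exposes vertex textures; see the query discussed further down): a vertex shader with a dynamic branch and a vertex texture fetch, written as the C string it would be handed to glShaderSource().

#include <GL/glew.h>   /* assumed: GLEW (or similar) exposes the GL 2.0 shader entry points */

static const char *vertexShaderSource =
    "uniform sampler2D heightMap;      /* vertex texture (hypothetical) */   \n"
    "uniform float displacementScale;                                        \n"
    "void main()                                                             \n"
    "{                                                                       \n"
    "    vec4 pos = gl_Vertex;                                               \n"
    "    /* dynamic branch: the condition is not known at compile time */    \n"
    "    if (displacementScale > 0.0)                                        \n"
    "    {                                                                   \n"
    "        /* vertex texture fetch: the Lod variant is required here,      \n"
    "           since derivatives don't exist in the vertex shader */        \n"
    "        float h = texture2DLod(heightMap, gl_MultiTexCoord0.xy, 0.0).r; \n"
    "        pos.xyz += gl_Normal * h * displacementScale;                   \n"
    "    }                                                                   \n"
    "    gl_Position = gl_ModelViewProjectionMatrix * pos;                   \n"
    "}                                                                       \n";

GLuint compile_displacement_vs(void)
{
    GLuint vs = glCreateShader(GL_VERTEX_SHADER);
    glShaderSource(vs, 1, &vertexShaderSource, NULL);
    glCompileShader(vs);
    return vs;   /* a real program would check GL_COMPILE_STATUS and the info log */
}

Whether the shader actually runs in hardware depends on the card and driver, which is exactly the point made above.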

armored_spiderman
01-19-2006, 02:33 AM
Hmm, but I want to know since when it has been possible to do dynamic flow control and texture lookup with GLSL.

And where can I find the notes that explain why geometry instancing and the vertex stream divider are considered unnecessary? I read about them on NVIDIA's site... if they are useless, why would they put them there?

Sorry if I'm bothering you with stupid questions... I'm just a newbie at the shading language trying to understand something... thanks for the help.

Relic
01-19-2006, 02:44 AM
GLSL is just a language. It supports the flow control and stuff just by design.

What the underlying hardware can do can be queried through the OpenGL extensions, independent of the GLSL language itself.
For example check out this document on how to distinguish different NVIDIA architectures by looking at the OpenGL extensions:
http://developer.nvidia.com/object/nv_ogl2_support.html

Other ways to query for features include the usual glGet* mechanism, for example vertex texture lookup capabilities would be queried with GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS. 0 means no vertex textures possible.
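A minimal sketch of that query (it assumes a GL 2.0 context is already current; on pre-2.0 headers the GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS token comes from glext.h):

#include <GL/gl.h>
#include <GL/glext.h>   /* for GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS on older gl.h headers */
#include <stdio.h>

void report_vertex_texture_support(void)
{
    GLint maxVertexTextureUnits = 0;
    glGetIntegerv(GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS, &maxVertexTextureUnits);

    if (maxVertexTextureUnits == 0)
        printf("No vertex texture image units: vertex texture fetch unavailable.\n");
    else
        printf("Vertex texture fetch available: %d image units in the vertex shader.\n",
               (int)maxVertexTextureUnits);
}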

shelll
01-19-2006, 04:45 AM
Instancing is very usable in D3D; in OGL it won't provide such a big performance difference. But IMHO it would still be good to have, because it can significantly reduce function calls: in my case, from millions to maybe 20, and that is a very different CPU load. My app is fillrate bound, though, so it won't help me, but...

armored_spiderman
01-19-2006, 05:57 AM
Originally posted by shelll:
Instancing is very usable in D3D; in OGL it won't provide such a big performance difference. But IMHO it would still be good to have, because it can significantly reduce function calls: in my case, from millions to maybe 20, and that is a very different CPU load. My app is fillrate bound, though, so it won't help me, but...
Hmm, I will read the NVIDIA OpenGL 2.0 support PDF... any questions I will bring to you guys ;) thanks for the help.

Can you explain better to me how instancing can be good in D3D and not in OGL? And what about High Dynamic Range? Is it used in OpenGL? If so, how is it done? And is there any feature of D3D that OpenGL can't do at all?


shelll
01-19-2006, 08:28 AM
HDR can be done in OGL ;) just Google it. For the instancing thing, just search this forum, you will find the answer. But I still want instancing in OGL...

Relic
01-19-2006, 09:10 AM
Originally posted by armored_spiderman:
Can you explain better to me how instancing can be good in D3D and not in OGL?
Explained here:
http://download.developer.nvidia.com/dev..._instancing.pdf (http://download.developer.nvidia.com/developer/SDK/Individual_Samples/DEMOS/OpenGL/src/glsl_pseudo_instancing/docs/glsl_pseudo_instancing.pdf)

armored_spiderman
01-19-2006, 10:14 AM
Originally posted by Relic:

Originally posted by armored_spiderman:
Can you explain better to me how instancing can be good in D3D and not in OGL?
Explained here:
http://download.developer.nvidia.com/dev..._instancing.pdf (http://download.developer.nvidia.com/developer/SDK/Individual_Samples/DEMOS/OpenGL/src/glsl_pseudo_instancing/docs/glsl_pseudo_instancing.pdf)
I have read that ATI claims she doesn't use vertex texture fetch because she has another way to do it that is better and faster than VTF... it is called R2VB... of course ATI could just be defending herself... that's why I want your opinion about it.

She also claims that NVIDIA hardware doesn't have enough power to use vertex texture fetch massively, like in a game... the performance would be poor... apparently NVIDIA implemented it but didn't put too many resources into it.

Relic
01-19-2006, 11:08 PM
"She"? ;)
Dude, if R2VB means render to vertex buffer, that's something different than texture fetches inside the vertex pipeline.
Vertex textures are normally used for displacement mapping, but can also be applied for general purpose stuff.
I prefer having a hardware feature over not having it. It means I can use it creatively today.

execom_rt
01-19-2006, 11:49 PM
Originally posted by shelll:
HDR can be done in OGL
Except that ATI doesn't support alpha blending of floating-point textures in OpenGL, even with the Radeon X1800 (it does, but in software rendering)...

shelll
01-20-2006, 12:18 AM
execom_rt: that is poor, if it's true that even the X1800 can't do that in HW :D ... but you can write your own blending shader ;)

Obli
01-20-2006, 01:18 AM
Originally posted by armored_spiderman:
Can you explain better to me how instancing can be good in D3D and not in OGL?
It's not that it's not useful; it's just much less useful than in D3D.

This question comes up regularly in other forms... try searching for it, but the outline follows:
Some OpenGL users noticed that GPU performance improves with batch size and that there's no point in using 2000 function calls when you can do, say, just 400.

Some other users noted however that GL has marshalling. Marshalling allows the driver to take a certain number of calls (say 150, but that's just a random number) and send them all at once to the video card, thus eating much less CPU power than the equivalent individual calls. Because of this, the improvement from instancing is expected to be so low that it's not worth implementing.

I also would like to have instancing but I see they have good reasons for not proposing it.
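For what it's worth, a rough sketch of the "pseudo-instancing" idea from the NVIDIA paper linked above (the Instance struct, the attribute locations and the helper name are made up; the locations are assumed to come from glGetAttribLocation on an instancing-aware shader): the shared mesh stays in one VBO and only two cheap per-instance attribute calls change between draws, which is part of why the thread argues a dedicated instancing API buys less in GL than in D3D.

#include <GL/glew.h>   /* assumed loader for the GL 1.5/2.0 entry points */

typedef struct {
    float positionScale[4];   /* xyz position + uniform scale, read by the shader */
    float color[4];
} Instance;

/* Draw the same indexed mesh many times. Per instance, only two "current
   attribute" calls change; the mesh data itself is never re-specified. */
void draw_instances(GLuint meshVbo, GLuint meshIbo, GLsizei indexCount,
                    const Instance *instances, int instanceCount,
                    GLint positionAttrib, GLint colorAttrib)
{
    int i;

    glBindBuffer(GL_ARRAY_BUFFER, meshVbo);
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, meshIbo);
    glVertexPointer(3, GL_FLOAT, 0, 0);
    glEnableClientState(GL_VERTEX_ARRAY);

    for (i = 0; i < instanceCount; ++i)
    {
        /* Cheap per-instance state: sets the "current" value of each generic
           attribute, which the vertex shader sees for every vertex of this draw. */
        glVertexAttrib4fv(positionAttrib, instances[i].positionScale);
        glVertexAttrib4fv(colorAttrib, instances[i].color);

        glDrawElements(GL_TRIANGLES, indexCount, GL_UNSIGNED_SHORT, 0);
    }

    glDisableClientState(GL_VERTEX_ARRAY);
}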

armored_spiderman
01-20-2006, 02:00 AM
Originally posted by Obli:

Originally posted by armored_spiderman:
Can you explain better to me how instancing can be good in D3D and not in OGL?
It's not that it's not useful; it's just much less useful than in D3D.

This question comes up regularly in other forms... try searching for it, but the outline follows:
Some OpenGL users noticed that GPU performance improves with batch size and that there's no point in using 2000 function calls when you can do, say, just 400.

Some other users noted however that GL has marshalling. Marshalling allows the driver to take a certain number of calls (say 150, but that's just a random number) and send them all at once to the video card, thus eating much less CPU power than the equivalent individual calls. Because of this, the improvement from instancing is expected to be so low that it's not worth implementing.

I also would like to have instancing but I see they have good reasons for not proposing it.
Hmm, thanks... so because of marshalling, instancing becomes "useless", since marshalling can do the same thing and do it faster?... that is what I have understood.

I'm used to calling ATI "she" =P ... From what I am reading here, it seems that ATI hardware is more for gaming than for developing... it looks like the hardware lacks support for some OpenGL features... Are these features that ATI hardware lacks very important for 3D development? For you guys who work with this (I'm just a gamer trying to understand things better)... do any of you use ATI for work? In this area, how does NVIDIA compare with ATI?

I don't want to start an argument... just trying to get some information =)

Again, I thank all of you who are helping.

armored_spiderman
01-20-2006, 02:06 AM
Originally posted by Relic:
"She"? ;)
Dude, if R2VB means render to vertex buffer, that's something different than texture fetches inside the vertex pipeline.
Vertex textures are normally used for displacement mapping, but can also be applied for general purpose stuff.
I prefer having a hardware feature over not having it. It means I can use it creatively today.
Yes, I know it's not the same thing... but ATI says the result will be the same (what can be done with vertex textures can be done with R2VB)... but I agree that having it is better than not having it.

I just don't understand why NVIDIA didn't implement High Dynamic Range with FSAA... it turned out to be a great point in favor of ATI hardware... at least for gamers.

yooyo
01-23-2006, 06:49 AM
I just don't understand why NVIDIA didn't implement High Dynamic Range with FSAA... it turned out to be a great point in favor of ATI hardware... at least for gamers.
HDR & FSAA are not related. You can play games with or without HDR and with or without FSAA. The combination of the two may or may not work on some hardware.

The bigger problem is the huge and growing HW architecture difference. This forces developers to write separate codepaths for ATI and NVIDIA. The results can be:
1. The game works perfectly on hw-1 but runs sloppily on hw-2
2. The game works nicely on both, but doesn't use the latest features and effects
3. The game doesn't use the extra features of hw-1 and hw-2... i.e. useless transistors on the GPU :)

Because of this, most games don't use 3Dc or vertex texture fetch. Now we are facing the texture filtering issue on ATI. R2VB is a nice feature, but I'm afraid it will be used in very rare situations... maybe in some very HW-dependent software.

HW vendors MUST agree about some features. For example FP textures on ATI... If something is a texture it MUST have all texture properties and behave like a regular texture object, no matter whether it is a byte, short, half or float pixel format. Lack of filtering is a very serious limitation.

HW vendors have two options: make some deals about future HW feature sets, or pay more people to go to game developer companies and offer optimizations for free, just to be sure that some new AAA game runs better on their hardware.

yooyo

armored_spiderman
01-23-2006, 08:04 AM
Originally posted by yooyo:

I just don't understand why NVIDIA didn't implement High Dynamic Range with FSAA... it turned out to be a great point in favor of ATI hardware... at least for gamers.
HDR & FSAA are not related. You can play games with or without HDR and with or without FSAA. The combination of the two may or may not work on some hardware.

The bigger problem is the huge and growing HW architecture difference. This forces developers to write separate codepaths for ATI and NVIDIA. The results can be:
1. The game works perfectly on hw-1 but runs sloppily on hw-2
2. The game works nicely on both, but doesn't use the latest features and effects
3. The game doesn't use the extra features of hw-1 and hw-2... i.e. useless transistors on the GPU :)

Because of this, most games don't use 3Dc or vertex texture fetch. Now we are facing the texture filtering issue on ATI. R2VB is a nice feature, but I'm afraid it will be used in very rare situations... maybe in some very HW-dependent software.

HW vendors MUST agree about some features. For example FP textures on ATI... If something is a texture it MUST have all texture properties and behave like a regular texture object, no matter whether it is a byte, short, half or float pixel format. Lack of filtering is a very serious limitation.

HW vendors have two options: make some deals about future HW feature sets, or pay more people to go to game developer companies and offer optimizations for free, just to be sure that some new AAA game runs better on their hardware.

yooyo
Hmm, first of all I want to thank all of you who have spent time answering my newbie questions; seeing people deeply involved in 3D development and GLSL take the time to answer such common questions is really rare, and I appreciate it... forgive my English... I know it sucks =P

What I have been reading is that this HDR+AA capability that ATI claims to have needs some work on the software side too... just like the patches that have been released to make NVIDIA run HDR + AA in some specific games... so it's starting to get hard to understand how this works on ATI hardware...

armored_spiderman
01-24-2006, 05:44 AM
To not stray from the topic of the thread... let's go back to GLSL :)

Can anyone explain to me the difference between static branching and dynamic branching? :D

Humus
01-24-2006, 08:59 PM
Originally posted by Relic:
Dude, if R2VB means render to vertex buffer, that's something different than texture fetches inside the vertex pipeline.
Yes, but it can implement everything that VTF can, and is much faster.

Humus
01-24-2006, 09:00 PM
Originally posted by execom_rt:
Except that ATI doesn't support alpha blending of floating-point textures in OpenGL, even with the Radeon X1800 (it does, but in software rendering)...
It has been implemented and should appear in a future driver.

armored_spiderman
01-26-2006, 09:13 AM
Originally posted by Humus:

Originally posted by execom_rt:
Except that ATI doesn't support alpha blending of floating-point textures in OpenGL, even with the Radeon X1800 (it does, but in software rendering)...
It has been implemented and should appear in a future driver.
Humus, you know ATI hardware pretty well... can you explain to me how HDR + AA works on ATI hardware, the difference it has from NVIDIA's HDR, and how Valve implemented HDR + AA on NVIDIA?
It seems that even on ATI cards, supporting HDR + AA requires some programming... just like NVIDIA... but on ATI hardware it's easier, though it still needs extra programming. Or am I wrong, and can ANY game that has HDR implemented use AA without any other work on the code?

thanks for the help ^^

Humus
01-26-2006, 06:21 PM
You need to write code for it. It would be nearly impossible for the driver to override it, since you don't do the HDR in the backbuffer but with a separate FP16 render target. The driver could make some educated guesses when to override a render target with a multisampled version, but that would be prone to failure.

Instead the app has to create a multisampled render target and do the resolve blit itself. This is very easy though, so it should not be a problem.
The difference between ATI and Nvidia here is that the X1K series can multisample FP16 surfaces while Nvidia can't. This means that if you do HDR the regular way with FP16 render targets you get no AA on Nvidia. That it works in Valve's implementation is because it's kinda hackish. They don't use FP16, but regular RGBA8, but append tonemapping in the end of the shaders AFAIK. This means they lose linearity with blending and other nasty stuff, but with some tweaking from the artists they can make it look quite good anyway.
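A rough sketch of that setup (assuming the EXT_framebuffer_object, EXT_framebuffer_multisample, EXT_framebuffer_blit and ARB_texture_float extensions are present and their entry points are loaded, e.g. via GLEW; all names here are made up): render into a multisampled FP16 renderbuffer, then resolve it into a single-sample FP16 texture with a blit before the tonemapping pass.

#include <GL/glew.h>

static GLuint msFbo, msColorRb, msDepthRb;   /* multisampled HDR render target */
static GLuint resolveFbo, resolveTex;        /* single-sample target for the resolve */

void create_hdr_targets(int width, int height, int samples)
{
    /* Multisampled FP16 color renderbuffer plus a matching depth buffer. */
    glGenRenderbuffersEXT(1, &msColorRb);
    glBindRenderbufferEXT(GL_RENDERBUFFER_EXT, msColorRb);
    glRenderbufferStorageMultisampleEXT(GL_RENDERBUFFER_EXT, samples,
                                        GL_RGBA16F_ARB, width, height);

    glGenRenderbuffersEXT(1, &msDepthRb);
    glBindRenderbufferEXT(GL_RENDERBUFFER_EXT, msDepthRb);
    glRenderbufferStorageMultisampleEXT(GL_RENDERBUFFER_EXT, samples,
                                        GL_DEPTH_COMPONENT24, width, height);

    glGenFramebuffersEXT(1, &msFbo);
    glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, msFbo);
    glFramebufferRenderbufferEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT,
                                 GL_RENDERBUFFER_EXT, msColorRb);
    glFramebufferRenderbufferEXT(GL_FRAMEBUFFER_EXT, GL_DEPTH_ATTACHMENT_EXT,
                                 GL_RENDERBUFFER_EXT, msDepthRb);

    /* Single-sample FP16 texture that the tonemapping pass will read later. */
    glGenTextures(1, &resolveTex);
    glBindTexture(GL_TEXTURE_2D, resolveTex);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA16F_ARB, width, height, 0,
                 GL_RGBA, GL_HALF_FLOAT_ARB, NULL);

    glGenFramebuffersEXT(1, &resolveFbo);
    glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, resolveFbo);
    glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT,
                              GL_TEXTURE_2D, resolveTex, 0);
}

/* The "resolve blit": averages the samples into the single-sample target. */
void resolve_hdr(int width, int height)
{
    glBindFramebufferEXT(GL_READ_FRAMEBUFFER_EXT, msFbo);
    glBindFramebufferEXT(GL_DRAW_FRAMEBUFFER_EXT, resolveFbo);
    glBlitFramebufferEXT(0, 0, width, height, 0, 0, width, height,
                         GL_COLOR_BUFFER_BIT, GL_NEAREST);
}

Whether the multisampled FP16 renderbuffer is actually supported is exactly the hardware difference described above: on hardware that can't multisample FP16 surfaces, the renderbuffer creation or framebuffer completeness check will fail.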

armored_spiderman
01-27-2006, 01:44 AM
Originally posted by Humus:
You need to write code for it. It would be nearly impossible for the driver to override it, since you don't do the HDR in the backbuffer but with a separate FP16 render target. The driver could make some educated guesses when to override a render target with a multisampled version, but that would be prone to failure.

Instead the app has to create a multisampled render target and do the resolve blit itself. This is very easy though, so it should not be a problem.
The difference between ATI and Nvidia here is that the X1K series can multisample FP16 surfaces while Nvidia can't. This means that if you do HDR the regular way with FP16 render targets you get no AA on Nvidia. That it works in Valve's implementation is because it's kinda hackish. They don't use FP16, but regular RGBA8, but append tonemapping in the end of the shaders AFAIK. This means they lose linearity with blending and other nasty stuff, but with some tweaking from the artists they can make it look quite good anyway.
Thanks a lot for taking the time to answer my newbie question; now the difference is pretty clear. It looks like the NVIDIA G71 will come with AA on FP16 surfaces too... that would be nice :D

A patch has been released for Far Cry that enables AA + HDR on NVIDIA hardware... was Far Cry's HDR always RGBA8, or did they have to change all the code to get AA + HDR working on NVIDIA?

It seems that even if they can achieve AA + HDR on NVIDIA, it won't have the same quality as using INT10 or FP16 HDR + AA on ATI.

http://prohardver.hu/c.php?mod=20&id=996&p=6

In this interview Eric Demers says that 10-bit HDR can have the same precision and more speed than the FP16 that is used today.

And in the same interview it seems that Adaptive AA can be used on older ATI architectures:

"Some form of adaptive AA is going to be made available on previous architectures. Our fundamental AA hardware has always been very flexible and gives the ability to alter sampling locations and methods. In the X1k family, we've done more things to improve performance when doing Adaptive AA, so that the performance hit there will be much less than on previous architectures. However, the fundamental feature should be available to earlier products. I'm not sure on the timeline, as we've focused on X1k QA for this feature, up to now. Beta testing is ongoing, and users can use registry keys to enable the feature for or older products, for now."

The question is whether the old architectures will have enough power to sustain adaptive AA... it seems to be more demanding on the hardware.