Siggraph06

Here’s an outline of the D3D10 system, covered in the Siggraph06 paper:
http://download.microsoft.com/download/f/2/d/f2d5ee2c-b7ba-4cd0-9686-b6508b5479a1/Direct3D10_web.pdf

I thought the pipeline overview was interesting. Here’s a quick take on it:

Input Assembler (IA):

  • Gathers data from up to 8 streams (VBs); each stream has up to 16 fields of 1-4 elements
  • Instancing address mode allows n repeats of a block of k verts
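To make the instancing address mode concrete, here's a rough Python model of how the IA could replay a block of k verts n times (purely illustrative; the function name and shape are my own invention, not any real API):

```python
def expand_instanced_draw(k_verts, n_instances):
    """Model of instancing in the IA: the same block of k_verts vertices
    is fetched once per instance, and each fetch is paired with an
    instance index (which can address per-instance vertex streams)."""
    return [(instance, vertex)
            for instance in range(n_instances)
            for vertex in range(k_verts)]

# 3 instances of a 2-vertex block -> 6 vertex fetches
print(expand_instanced_draw(2, 3))
# [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1)]
```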

Vertex Shader (VS):

  • 1 vertex in, 1 vertex out
  • New integer ops and expanded float ops (transcendental functions)
  • Memory access to up to 128 buffers (textures) and 16 constant buffers of 4096 elements each

Geometry Shader (GS):

  • Single primitive in (point, line, triangle), 0 or more primitives out
  • In/out primitive types need not match
  • Max of 1024 32-bit values of vertex data output
  • Per input primitive adjacency information available to shader
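That 1024-scalar cap bounds how many vertices a single GS invocation can emit. A back-of-the-envelope helper (my own illustration, counting every output attribute as a 32-bit scalar):

```python
def max_gs_vertices(scalars_per_vertex, output_cap=1024):
    """Upper bound on vertices one GS invocation can emit, given the
    1024 x 32-bit output limit described in the paper."""
    return output_cap // scalars_per_vertex

# Position (4 floats) + one 4-float texcoord = 8 scalars per vertex
print(max_gs_vertices(8))  # 128
# Position only
print(max_gs_vertices(4))  # 256
```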

Stream Out (SO):

  • Copies a subset of GS output to up to 4 1D buffers.
  • Ideally it would work like IA (8x16), but is fixed at only 1x16 or 4x4 elements max.

Rasterization (RS):

  • Verts and attributes of single primitive in, fragments out
  • Fixed function clipping, culling, perspective divide, viewport transform, primitive setup, scissor, depth (polygon) offset

Pixel Shader (PS):

  • Single fragment in, single fragment out
  • Writes up to 8 color values (render targets) and depth (discard and depth replace are still a z-cull hazard)

Output Merger (OM):

  • Performs depth and stencil test (unified depth/stencil target)
  • Render target blending
  • Up to 8 render targets (attribute buffers)
  • Blend enable/disable for each render target (shared blend function)
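In software terms, the OM blend behavior described above might be modeled like this (a sketch under my own assumptions; additive blending stands in for whatever shared function is bound):

```python
def output_merge(ps_outputs, target_values, blend_enabled, blend_func):
    """Model of D3D10-style OM blending: all render targets share one
    blend function; each target can only toggle it on or off."""
    merged = []
    for src, dst, enabled in zip(ps_outputs, target_values, blend_enabled):
        merged.append(blend_func(src, dst) if enabled else src)
    return merged

# Shared additive blend, enabled on RT0 but disabled on RT1
additive = lambda s, d: s + d
print(output_merge([0.25, 0.5], [0.5, 0.5], [True, False], additive))
# [0.75, 0.5]
```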

Anyway, this is a glimpse at what the next-gen hardware can do, so it may serve as a prelude to what we can expect/desire in OpenGL in the future.

P.S. It looks like our constant/uniform storage worries are over :wink:

I just hope it doesn’t take too long to expose this functionality with OpenGL.

Very good link, leg

A problem I find there is that the geometry shader only knows the adjacent triangles and not the entire “batch” mesh.

SM4.0 is looking good, with sampler offset indexing, bitwise instructions, etc. finally in… But I don’t see any mention of depth/stencil/color framebuffer access inside fragment shaders yet?

And yep, the constant buffers are cool too.

Now let’s pray for some OpenGL extensions fast, and for good Windows Vista drivers…

A problem I find there is that the geometry shader only knows the adjacent triangles and not the entire “batch” mesh.
Yeah, there are a few more limitations than I imagined. I was thinking the GS would end up being more like a full-fledged, no-holds-barred tessellator, and apparently they considered it, but it just wasn’t feasible (maybe in the 5th generation). Still, this ain’t too shabby :wink:

But I don’t see any mention of depth/stencil/color framebuffer access inside fragment shaders yet?
Was this planned? I never heard about something like that being planned for SM4, and personally I would be very surprised if we get it any time soon.

What I’m missing is a bit more advanced blend stage. Simply enabling/disabling blend per render target is not exactly what I had in mind :wink:

Was this planned?
Who knows? Obviously it’d be a cool feature. The problem is the performance hit in the current architecture… too expensive to get around right now. But it’s eminently doable.

Simply enabling/disabling blend per render target is not exactly what I had in mind
What did you have in mind?

A blend shader stage. Something that gets all outputs of the fragment program and all current buffer values, and outputs updated buffer values.

It wouldn’t need to be a fully programmable shader; I’d be content with combiner-like functionality. The main point is being able to freely combine all these values, not only the source and destination of a single buffer.

And if that’s still too much to ask for, at least a separate blend function for each render target would have been nice :wink:
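For illustration, a combiner-like blend stage along those lines could look something like this in Python (entirely hypothetical; nothing like this exists in D3D10 or current GL):

```python
def blend_stage(frag_outputs, buffer_values, combiner):
    """Hypothetical programmable blend: the combiner sees ALL fragment
    outputs and ALL current buffer values at once, and returns the full
    set of updated buffer values."""
    return combiner(frag_outputs, buffer_values)

# Example combiner: blend buffer 0 using buffer 1's current value as
# the blend factor -- impossible with per-target source/dest blending.
def cross_buffer_combiner(src, dst):
    t = dst[1]
    return [src[0] * t + dst[0] * (1.0 - t), src[1]]

print(blend_stage([1.0, 0.25], [0.0, 0.5], cross_buffer_combiner))
# [0.5, 0.25]
```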

I believe they actually debated making the OM (in addition to some others) stage programmable, but it just didn’t happen. I’m sure most if not all stages will be at least partly programmable in the not too distant future. The problem seems to be one of maintaining performance (which of course is key) while keeping the hardware relatively inexpensive. I agree, though, programmable blending and indeed everything else would be sweet.

Originally posted by Leghorn:
I believe they actually debated making the OM (in addition to some others) stage programmable, but it just didn’t happen. I’m sure most if not all stages will be at least partly programmable in the not too distant future.
In the paper linked in the first post it is mentioned that they thought about merging the pixel/fragment shader stage with the ROP stage (OM) but decided against it, for obvious reasons. I hope to see a programmable ROP stage in the near future (a ROP stage not merged with the PS stage). But as DX9 has been around for almost 5 years now, I don’t see a DX11 tech update happening in the not-so-distant future.

As for the paper: a very interesting read, and it will be very exciting to work with all these new features. If you look at this [1], you can see that OpenGL 3.0 LM has much in common with the features described in the DX10 paper, so I think a lot of work has already been done on OpenGL. I hope that OpenGL 2.x and 3.0 will start very soon and live in parallel for some time, so that 3.0 can mature into a solid standard…

[1] http://www.gamedev.net/columns/events/gdc2006/article.asp?id=233

i hope that opengl 2.x and 3.0 will start very soon
I rather hope that OpenGL 3.0 will be released as individual extensions very soon, so it can mature into a solid standard before the whole thing is made into core features. But afaik that’s the plan anyway…

Originally posted by Overmind:

[quote]i hope that opengl 2.x and 3.0 will start very soon
I rather hope that OpenGL 3.0 will be released as individual extensions very soon, so it can mature into a solid standard before the whole thing is made into core features. But afaik that’s the plan anyway… [/QUOTE]Yep, but extensions to OpenGL 2.x won’t help the OpenGL 3.0 Lean and Mean profile thing. So I would prefer the stripped-down OpenGL 3.0 LM to be a parallel development to the standard profile version.

Why won’t they help? The old functions are not going to be removed from core anyways, and who says implementors can’t layer core features on top of an extension?

It’s only a little difference in how it’s specified, but there is absolutely no functional difference between an extension and a core feature.

On the other hand, if you make it a core feature, you’re stuck with it forever; you can’t remove it from the core. If you make it an extension, you can modify it before promoting it to a core feature, and then all you have is a deprecated extension lying around, but the core spec stays clean…

Yep, but extensions to OpenGL 2.x won’t help the OpenGL 3.0 Lean and Mean profile thing.
The point of releasing them as extensions is to make sure they work before making them into an inflexible core.

The idea, I think, with OpenGL 3.0 Lean and Mean is that all legacy functions get removed. So I hope that the 3.0 core will have far fewer functions (not less functionality) than the current 2.0 core.

Sure, an extension phase does help future core functions mature, but if they are placed in a 2.0 context I think these extensions may have to deal a lot with compatibility issues with the legacy functions. Hence my opinion that the OpenGL 3.0 LM profile should run parallel to OpenGL 2.x.

Sure, an extension phase does help future core functions mature, but if they are placed in a 2.0 context I think these extensions may have to deal a lot with compatibility issues with the legacy functions.
Not necessarily. You can effectively have two APIs. There might be some mechanism to wrap new-style objects in old-style object indices, but that would be about it for cross-talk between them.

Yes, that’s what I figured, too.

You can make a new API that you can use exclusively, or you can create objects with the new API and use them in the old API through “wrapper objects”, but you can’t use old objects in the new API.

So the new API stays clean, there are just a few additions to the old API to allow integration of the new API, but not the other way round.
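A toy model of that one-way integration (hypothetical names, no relation to any real GL mechanism): new-style objects can be wrapped into old-style integer handles, but old objects never enter the new API.

```python
class NewObject:
    """Object created through the hypothetical new, clean API."""
    def __init__(self, kind):
        self.kind = kind

class OldAPI:
    """The legacy API addresses objects by integer name; wrapping is
    the only cross-talk, and it only goes new -> old."""
    def __init__(self):
        self._objects = {}
        self._next_name = 1

    def wrap(self, new_obj):
        """Expose a new-style object under an old-style integer name."""
        name = self._next_name
        self._next_name += 1
        self._objects[name] = new_obj
        return name

    def lookup(self, name):
        return self._objects[name]

old_api = OldAPI()
handle = old_api.wrap(NewObject("buffer"))
print(handle, old_api.lookup(handle).kind)  # 1 buffer
```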

Direct3D itself is adopting OpenGL’s layered approach, to some extent. I think it’s a good thing, too–a step towards some API stability. But I hope it’s not at the expense of future flexibility (though I doubt that it is).

Personally, I don’t mind if even the entire API changes a bit every year or so, if it means keeping things in-line and in-time with the evolution of the hardware. Backwards compatibility is a distant second to keeping pace with (some) grace. Ultimately, as long as I can access the current hardware features, I don’t really care what it looks like. Whatever makes things easier for the driver writers to make it happen. If they can structure things in OpenGL in such a way as to make revisions somewhat less arduous, that might go a long way towards expediting features, even if it means minor changes in the API from time to time (that probably wouldn’t sit well with the purists ;-)). But let’s face it, even a clean split is going to have to change, eventually.

But as DX9 has been around for almost 5 years now, I don’t see a DX11 tech update happening in the not-so-distant future.
Well, my feeling is that the API changes with the hardware, whatever it takes and no matter how ugly it gets; that’s been D3D’s motto from the beginning. I see this slowing down to some extent, but certainly not stopping. If the hardware changes sufficiently to warrant a new API, rest assured that those changes will be made.