Hey all. I apologize for putting so much in one post, but I haven't had a chance to post all week :P
First, what I think is a bug. This has to do with alpha blending on the GeForce 256 vs. the TNT2. I'm blending textured, billboarded polygons with the blend function GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA. The polys are sorted by window z coordinate. The texture environment is GL_MODULATE, the min and mag filters are both GL_LINEAR, and the texture format is GL_ALPHA. The typical way of drawing blended polys is to set the depth mask to GL_FALSE and render them after all opaque objects have been rendered. Unfortunately, this does not work right on the GeForce. Blending still happens, but instead of my texture fading to totally transparent at the edges of the polygons (which it should do), it cuts off abruptly where the alpha reaches some small but non-zero value. I believe this is a bug for two reasons. One is that it works right on my TNT2 at home, and the other is that calling glDepthMask(GL_FALSE) causes this to happen WHETHER THE DEPTH TEST IS ENABLED OR NOT!! Is this a driver bug or am I just really missing something? I believe the newest drivers (6.18) are installed on both machines.
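To make the expected behavior concrete, here is a minimal sketch in plain C of what the GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA blend function computes for one color channel. The function name `blend_channel` is made up for illustration and is not part of OpenGL. The point is only that the result moves smoothly toward the destination color as alpha approaches zero, so a hard edge at a small non-zero alpha is not something the blend equation itself produces.

```c
#include <assert.h>

/* Blend one color channel the way GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA
   is defined: result = src * alpha + dst * (1 - alpha).
   All inputs are assumed to be in [0, 1]. Illustrative helper, not a GL call. */
static float blend_channel(float src, float dst, float alpha)
{
    assert(alpha >= 0.0f && alpha <= 1.0f);
    return src * alpha + dst * (1.0f - alpha);
}
```

For example, a white texel (src = 1.0) over a black background (dst = 0.0) with alpha = 0.05 should come out as a faint 5% gray, not as fully transparent and not as a visible step.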
Second, a question. I find myself rendering a lot of 2-D height fields, i.e. for every x,y value there is an associated z value. Usually all the z values change every frame but none of the x and y values do. Is there some way that I can have the graphics card keep the x and y values and just send the z values every frame? That would cut bandwidth to 1/3 of its current usage, which would be sweet.
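The data layout being asked for can be sketched in plain C as follows. This is only an illustration of the split between static x,y data and per-frame z data and of the resulting 1/3 figure; it makes no claim about what the hardware or driver supports. The names `GridXY`, `init_grid_xy`, `bytes_per_frame_xyz`, and `bytes_per_frame_z_only` are hypothetical, and 32-bit floats are assumed for all coordinates.

```c
#include <assert.h>
#include <stddef.h>

/* Static part of each vertex: filled once, never changes. */
typedef struct { float x, y; } GridXY;

/* Fill the x,y grid for a w-by-h height field with uniform spacing. */
static void init_grid_xy(GridXY *xy, size_t w, size_t h, float spacing)
{
    assert(xy != NULL);
    for (size_t j = 0; j < h; ++j) {
        for (size_t i = 0; i < w; ++i) {
            xy[j * w + i].x = (float)i * spacing;
            xy[j * w + i].y = (float)j * spacing;
        }
    }
}

/* Bytes sent per frame if every vertex is resent as x, y, z. */
static size_t bytes_per_frame_xyz(size_t w, size_t h)
{
    return w * h * 3 * sizeof(float);
}

/* Bytes sent per frame if only the z values are resent. */
static size_t bytes_per_frame_z_only(size_t w, size_t h)
{
    return w * h * sizeof(float);
}
```

With these assumptions, a 256x256 grid is 786432 bytes per frame as full x,y,z triples versus 262144 bytes per frame as z values alone.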
Finally, I remember Matt asking for feature suggestions for Nvidia graphics cards. Here is one that would be at the top of my list:
Call this extension ZENO_LAZY_SORT_EXT
When GL_BLEND is enabled, every time a polygon gets rendered its alpha value is checked. If it is not equal to 1.0, that polygon gets stored on the graphics card in some sorted data structure (binary tree, hash table, etc.) keyed on its window z coordinate. Then, once SwapBuffers() is called, the card can take this cache of polys (which is already sorted), put the depth buffer in read-only mode, and draw the polygons. There are SO many benefits to this sort of hardware acceleration:
1) The user does not have to do the math to transform the polygon's z coordinate to window coordinates - the graphics card does this anyway (with T&L), so this saves duplicate math.
2) The user does not have to call a sort routine on these transformed coordinates, which is usually n log n time at best.
3) The graphics card could do the sorting at the polygon level. Usually when I do this it is at the object level cuz they're in display lists, so I can't have transparent 3d objects and have them look right (except maybe spheres).
4) The user does not have to separate translucent and opaque objects in the scene's data structures.
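A rough CPU-side sketch of the proposed behavior, in plain C, might look like the following. Everything here is hypothetical: `Poly`, `LazyQueue`, `lazy_submit`, and `lazy_flush_order` are made-up names, the queue has a fixed capacity, and window z is assumed to run from 0 (near) to 1 (far). For simplicity it sorts once at flush time with `qsort` rather than keeping a sorted structure on insertion as the proposal describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    int   id;        /* hypothetical polygon handle */
    float window_z;  /* assumed range: 0 = near, 1 = far */
    float alpha;
} Poly;

typedef struct {
    Poly  *items;
    size_t count;
    size_t capacity;
} LazyQueue;

/* Returns 0 if the polygon is opaque (caller draws it right away),
   1 if it was deferred, -1 if the queue is full. */
static int lazy_submit(LazyQueue *q, Poly p)
{
    assert(q != NULL);
    if (p.alpha >= 1.0f) return 0;
    if (q->count == q->capacity) return -1;
    q->items[q->count++] = p;
    return 1;
}

/* Order farthest first so nearer translucent polys blend over farther ones. */
static int cmp_far_to_near(const void *a, const void *b)
{
    float za = ((const Poly *)a)->window_z;
    float zb = ((const Poly *)b)->window_z;
    return (za < zb) - (za > zb);
}

/* Called at swap time: put deferred polys in back-to-front order.
   The caller would then draw them with depth writes disabled. */
static void lazy_flush_order(LazyQueue *q)
{
    assert(q != NULL);
    qsort(q->items, q->count, sizeof(Poly), cmp_far_to_near);
}
```

In use, every polygon would go through `lazy_submit`; opaque ones are drawn immediately, and at swap time the deferred ones are ordered with `lazy_flush_order`, drawn with the depth buffer read-only, and the queue count reset to zero.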
Is there some reason I'm not seeing why this would be an infeasible or bad extension? Has it been proposed, or does it already exist somewhere?
Thanks in advance for your help,