Usefulness of transform feedback

I’ve read about TF and I’ve tried to think of useful applications for it. It can probably be used for hardware skinning when a skinned mesh is rendered multiple times per frame: in an extra first pass (with TF enabled) we apply the local bone transformations and capture the skinned vertex positions in a VBO, then render the mesh from this VBO multiple times during the frame.
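For reference, the capture pass could look roughly like this (a minimal sketch, assuming core GL 3.0 transform feedback; `skinningProg`, `skinnedVbo` and `vertexCount` are placeholder names):

```c
/* Sketch of the TF skinning capture pass (GL 3.0+); context setup omitted.
 * Before linking the program, declare which varying to capture:
 *   const char *vars[] = { "skinnedPos" };
 *   glTransformFeedbackVaryings(skinningProg, 1, vars, GL_INTERLEAVED_ATTRIBS);
 *   glLinkProgram(skinningProg);
 */
glUseProgram(skinningProg);               /* VS applies the bone matrices   */
glEnable(GL_RASTERIZER_DISCARD);          /* we only want the captured data */
glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, skinnedVbo);
glBeginTransformFeedback(GL_POINTS);      /* one output vertex per input    */
glDrawArrays(GL_POINTS, 0, vertexCount);
glEndTransformFeedback();
glDisable(GL_RASTERIZER_DISCARD);

/* Later passes: bind skinnedVbo as the position attribute and draw the
 * mesh normally, as many times as needed this frame. */
```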

Any other useful applications of TF?

Fast, pipelined GPGPU, with (IMHO) possibly much lower overhead than OpenCL or CUDA. Skinning is just one of the many examples of such GPGPU.

Instance culling on the GPU: http://rastergrid.com/blog/2010/02/instance-culling-using-geometry-shaders/

Tessellation/subdivision that is possibly more flexible than the DX11 one.

Since it’s basically “arrays of structures” output with a possibly variable result length, plus the option to recurse and emit N array elements per kernel input, it looks quite useful :slight_smile:
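For example, one such “kernel” pass with a variable-length result might look roughly like this (a sketch assuming GL 3.2 geometry shaders; `outputVbo` and `inputCount` are placeholder names):

```c
/* Sketch: variable-length output via a geometry shader + TF (GL 3.2+).
 * The GS may emit 0..N points per input primitive; a query reports how
 * many were actually written. Context/program setup omitted. */
GLuint query, written = 0;
glGenQueries(1, &query);

glEnable(GL_RASTERIZER_DISCARD);
glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, outputVbo);
glBeginQuery(GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN, query);
glBeginTransformFeedback(GL_POINTS);
glDrawArrays(GL_POINTS, 0, inputCount);
glEndTransformFeedback();
glEndQuery(GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN);
glDisable(GL_RASTERIZER_DISCARD);

glGetQueryObjectuiv(query, GL_QUERY_RESULT, &written);
/* 'written' elements now sit in outputVbo; feed them back in as the next
 * pass's input for the recursion mentioned above. */
```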

You can calculate silhouette edges, which have a number of uses.

Particle systems.

Occlusion culling that is much more efficient than with occlusion queries.

Could you provide some info on this? Sounds very interesting.

> Could you provide some info on this? Sounds very interesting.
Haven’t actually done this, but it seems like it’d be pretty efficient, though probably only useful for point (or small) queries. Render the occluders to a depth texture, bind that as a vertex texture, then send point queries down the pipe. The vertex shader does a depth-texture lookup and compare, and serializes out a “visibility” attribute. Then render the objects into the scene with the new visibility attribute bound. Or, instead of writing a visibility attribute, use the geometry shader to “prune out” the vertex data and serialize out only the desired primitives (à la Instance culling using geometry shaders).
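In shader terms, the point-query pass might be something like this (an untested sketch; `uDepthTex`, `aQueryPos` and `vVisible` are made-up names):

```c
/* Untested sketch of the point-query vertex shader, with vVisible
 * captured via transform feedback; one point per queried object. */
const char *queryVS =
    "#version 150\n"
    "uniform sampler2D uDepthTex;  // occluder depth\n"
    "uniform mat4 uMVP;\n"
    "in vec3 aQueryPos;\n"
    "out float vVisible;           // declared in glTransformFeedbackVaryings\n"
    "void main() {\n"
    "    vec4 clip = uMVP * vec4(aQueryPos, 1.0);\n"
    "    vec3 ndc  = clip.xyz / clip.w;\n"
    "    vec2 uv   = ndc.xy * 0.5 + 0.5;\n"
    "    float sceneZ = texture(uDepthTex, uv).r;\n"
    "    float queryZ = ndc.z * 0.5 + 0.5;\n"
    "    vVisible = (queryZ <= sceneZ) ? 1.0 : 0.0;\n"
    "    gl_Position = clip;\n"
    "}\n";
```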

For a more general solution (arbitrary-sized/shaped occlusion queries), it seems like we really need to be able to serialize normal occlusion query results to a buffer object post-rasterizer, rather than storing them in query objects, to yield a more efficient NV_conditional_render for occlusion queries.

Ah yes, thanks! And if we manually build a min/max RG32F mip chain (R = min Z, G = max Z), a single texture fetch at a given LOD lets us query arbitrarily sized bounding spheres, roughly as sketched below. Tens of thousands of queries can be executed in one draw call this way :D.
There’s no need for the queries to share instance mesh info, so the whole scenegraph can be queried, provided that some rough depth was pre-laid (I do this currently in my code; anti-portals might be the term).
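The per-sphere test could be something along these lines (a sketch only; `uMinMaxZ` and the LOD heuristic are my assumptions, and the conservative test compares the sphere’s nearest depth against the stored max depth):

```c
/* Sketch of the per-sphere Hi-Z test (a GLSL fragment in a C string).
 * uMinMaxZ is the manually built RG32F min/max depth mip chain. */
const char *hizTest =
    "uniform sampler2D uMinMaxZ;  // R = min Z, G = max Z per texel\n"
    "uniform float uScreenSize;   // width of mip level 0 in pixels\n"
    "float sphereVisible(vec3 ndcCenter, float ndcRadius) {\n"
    "    // Pick the mip level where the sphere covers about one texel.\n"
    "    float pixels = ndcRadius * uScreenSize;\n"
    "    float lod    = ceil(log2(max(pixels, 1.0)));\n"
    "    vec2  uv     = ndcCenter.xy * 0.5 + 0.5;\n"
    "    vec2  minMax = textureLod(uMinMaxZ, uv, lod).rg;\n"
    "    float nearZ  = ndcCenter.z - ndcRadius;  // sphere's closest depth\n"
    "    // Occluded only if entirely behind the farthest stored depth.\n"
    "    return (nearZ <= minMax.g) ? 1.0 : 0.0;\n"
    "}\n";
```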

Generally the query results will number fewer than 20k, so a simple GPU->CPU VBO readback won’t be problematic, especially if packed as a bitfield, and GL4’s indirect-draw commands can remove the readback entirely.

Bounding spheres can give too many false positives, but this manual visibility testing can easily be extended to triangle-vs-Z-plane intersection for complex custom bounding volumes, if necessary.

Existing occlusion queries are proving a bit of a bottleneck for me even with perfect pipelining (rasterization of many triangles that cover 100k–1M pixels of the screen), and CHC++ doesn’t look like a solution, so this piqued my interest.
Thanks again :slight_smile:

> For a more general solution (arbitrary-sized/shaped occlusion queries), it seems like we really need to be able to serialize normal occlusion query results to a buffer object post-rasterizer, rather than storing them in query objects, to yield a more efficient NV_conditional_render for occlusion queries.

Why? Is there something inherently inefficient about conditional render?

Granularity = batch. Conditional render can only skip or keep whole draw calls, so the visibility decision is made per batch rather than per primitive.

Though it’s better than CPU-side conditional rendering based on occlusion queries, of course.

Nice. Any examples?

http://developer.download.nvidia.com/SDK/10.5/opengl/samples.html#cg_geometry_program

Never tried it, but was always going to:
- draw normals/tangents/anything for an arbitrary mesh for debugging purposes.

Nice. I’ve got to look into it in more detail, though I haven’t really used Cg, so an example in GLSL would have been better for my brain…

Oh, disregard the normals/tangents drawing. I was somehow thinking about geometry shaders rather than TF…

BTW, my particle systems on TF worked pretty well. I was able to update and draw 1 million particles per frame at 30 fps on a Radeon HD 2400.
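In case it helps anyone, the update step is basically a ping-pong between two VBOs, something like this (a sketch; `updateProg`, `particleVbo` and `particleCount` are placeholder names):

```c
/* Sketch of a TF particle update: ping-pong between two VBOs. The vertex
 * shader of updateProg integrates position/velocity and streams them out;
 * rasterization is disabled, so no fragments are generated. */
glUseProgram(updateProg);
glEnable(GL_RASTERIZER_DISCARD);

glBindBuffer(GL_ARRAY_BUFFER, particleVbo[src]);   /* read current state */
/* ... set up vertex attribute pointers for position/velocity ... */
glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, particleVbo[dst]);

glBeginTransformFeedback(GL_POINTS);
glDrawArrays(GL_POINTS, 0, particleCount);
glEndTransformFeedback();

glDisable(GL_RASTERIZER_DISCARD);
/* Draw pass: render particleVbo[dst] as GL_POINTS, then swap src/dst. */
```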

Transform feedback is essential for the Light Propagation Volumes technique. With it, it’s possible to convert a Reflective Shadow Map into a grid of points that represent a huge number of lights (with position, direction and color) that can be injected directly into the propagation volumes. It’s one beautiful example of how transform feedback can be used in a non-trivial way.
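A rough sketch of that injection step (my assumptions: the RSM is stored as position/normal/flux textures, and all names here are hypothetical) is one GL_POINTS vertex per RSM texel, with TF capturing the resulting VPL attributes:

```c
/* Sketch: turning an RSM into a point cloud of virtual point lights (VPLs)
 * with TF (GL 3.0+). The VS fetches one RSM texel per vertex; the captured
 * buffer is later injected into the propagation volume. */
const char *rsmToVplVS =
    "#version 150\n"
    "uniform sampler2D uRsmPos, uRsmNormal, uRsmFlux;\n"
    "uniform int uRsmWidth;\n"
    "out vec3 vplPos, vplNormal, vplFlux;  // captured via TF\n"
    "void main() {\n"
    "    ivec2 texel = ivec2(gl_VertexID % uRsmWidth, gl_VertexID / uRsmWidth);\n"
    "    vplPos    = texelFetch(uRsmPos,    texel, 0).xyz;\n"
    "    vplNormal = texelFetch(uRsmNormal, texel, 0).xyz;\n"
    "    vplFlux   = texelFetch(uRsmFlux,   texel, 0).rgb;\n"
    "    gl_Position = vec4(0.0);\n"
    "}\n";
/* Draw rsmWidth * rsmHeight GL_POINTS with rasterizer discard enabled and
 * the three varyings captured; the result is the VPL buffer. */
```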

Interface stuff like picking & mouseovers, without having to reimplement and maintain all your shaders’ transforms in software (and where your shaders may only be available at runtime, too); see the sketch at the end of this list.

Print rendering using native shaders.

Any other situation where the application needs post-transform, screen-space information.

With the fixed pipeline gone and vertex code potentially data-driven, something like this becomes essential for what used to be considered mundane operations.
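For picking, a minimal sketch could look like this (assuming GL 3.0 transform feedback; `prog`, `pickBuffer` and `pickedPos` are placeholder names, and the app’s own vertex shader writes `pickedPos` alongside gl_Position):

```c
/* Sketch: capturing post-transform positions for picking (GL 3.0+).
 * Re-run the app's own vertex shader with TF enabled, then read the
 * transformed results back instead of duplicating the math on the CPU. */
const char *vars[] = { "pickedPos" };
glTransformFeedbackVaryings(prog, 1, vars, GL_INTERLEAVED_ATTRIBS);
glLinkProgram(prog);

glUseProgram(prog);
glEnable(GL_RASTERIZER_DISCARD);
glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, pickBuffer);
glBeginTransformFeedback(GL_POINTS);
glDrawArrays(GL_POINTS, 0, vertexCount);
glEndTransformFeedback();
glDisable(GL_RASTERIZER_DISCARD);

/* Map the buffer and hit-test the transformed positions against the
 * mouse position on the CPU. */
float *pos = (float *)glMapBuffer(GL_TRANSFORM_FEEDBACK_BUFFER, GL_READ_ONLY);
/* ... hit test ... */
glUnmapBuffer(GL_TRANSFORM_FEEDBACK_BUFFER);
```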