Transform Cache

I’m not sure if this has changed, but at least some older hardware has so-called post-transform vertex caches.
What steps in OpenGL does one have to take to ensure maximum utilization of these?

There are two main factors in optimising vertex cache use. First, minimise the number of vertex shader outputs, so that more vertices fit in the cache. Second, order primitives so that those sharing a vertex are close together in the element array, which makes it more likely that a shared vertex is still in the cache when it is referenced again.
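To get a feel for how much primitive ordering matters, you can simulate the cache offline and count how often the vertex shader would have to run. The following is only a rough sketch: it assumes a simple FIFO cache with a made-up size of 4 entries, whereas real hardware differs in both size and replacement policy. The function names and the sample index lists are hypothetical and exist only for illustration.

```python
from collections import deque


def count_cache_misses(indices, cache_size=16):
    """Count vertex shader runs under an assumed FIFO post-transform cache."""
    cache = deque(maxlen=cache_size)  # appending to a full deque drops the oldest entry
    misses = 0
    for index in indices:
        if index not in cache:
            misses += 1          # vertex must be transformed again
            cache.append(index)  # FIFO: hits do not refresh an entry
    return misses


def acmr(indices, cache_size=16):
    """Average cache miss ratio: vertex shader runs per triangle."""
    return count_cache_misses(indices, cache_size) / (len(indices) // 3)


# Six triangles forming a strip of three quads, in strip order.
strip_order = [0, 1, 2,  2, 1, 3,  2, 3, 4,  4, 3, 5,  4, 5, 6,  6, 5, 7]
# The same six triangles, submitted in a scattered order.
scattered_order = [0, 1, 2,  4, 5, 6,  2, 1, 3,  6, 5, 7,  2, 3, 4,  4, 3, 5]

print(acmr(strip_order, cache_size=4))
print(acmr(scattered_order, cache_size=4))
```

Under this model the strip order transforms each of the 8 vertices once, while the scattered order transforms 12 vertices for the same 6 triangles. A lower ratio is better, and reordering the element array to reduce it is the goal of the second point above.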

Additionally, the less expensive the vertex shader, the less it matters whether a vertex is cached.