How does the texture cache work?

I want to use vertex textures, and I’ve read that a cache friendly texture access is very important for good performance.
So here’s my question: Is accessing a small subrectangle in a large texture going to cause cache misses? To put it another way, is it going to be slower to read a small subrect from a large texture than to read from a small texture in the first place?

This is hardware dependent, but some hardware caches a 2D texture in blocks.
For 3D textures, some hardware caches 3D blocks and some caches square blocks within each slice.
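
To make the block idea concrete, here is a rough sketch in C. It is only a toy model, not a description of any real GPU: it assumes the texture is stored in square tiles of TILE x TILE texels and that one tile fills exactly one cache line. TILE and tiles_touched are made-up names for illustration. Under that assumption, the number of blocks a read touches depends on the size and alignment of the subrectangle, not on the size of the texture it lives in.

```c
/* Hypothetical tile size: one TILE x TILE block of texels per cache line.
   Real hardware may use a different size or a different layout entirely. */
enum { TILE = 4 };

/* Count the distinct tiles overlapped by the rectangle
   [x0, x0 + w) x [y0, y0 + h), in texel coordinates.
   Assumes x0 >= 0, y0 >= 0, w >= 1 and h >= 1. */
int tiles_touched(int x0, int y0, int w, int h)
{
    int tx_first = x0 / TILE;            /* first tile column touched */
    int tx_last  = (x0 + w - 1) / TILE;  /* last tile column touched  */
    int ty_first = y0 / TILE;            /* first tile row touched    */
    int ty_last  = (y0 + h - 1) / TILE;  /* last tile row touched     */

    /* The texture's overall width and height never enter the calculation. */
    return (tx_last - tx_first + 1) * (ty_last - ty_first + 1);
}
```

In this model, tiles_touched(0, 0, 16, 16) (a whole 16x16 texture) and tiles_touched(512, 512, 16, 16) (a tile-aligned 16x16 subrect of a larger texture) both give 16, while the unaligned tiles_touched(510, 510, 16, 16) gives 25. So if your hardware behaves like this, aligning the subrect to block boundaries matters more than how big the surrounding texture is.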

And since you mentioned vertex textures, I think NVIDIA GPUs use the same cache for the vertex and fragment pipes.
I have also read at http://www.gamedev.net/reference/programming/features/d3d10overview/ that performance is not great for doing fancy things with vertex textures.

Eventually I’ll probably switch to render-to-VB (essentially, just bind the texture to an FBO and copy its contents using glReadPixels into a PBO, which I will later use as a VBO), but for now, vertex textures seem to be much easier, so I’ll do that first.

It probably depends a lot on your content. Since you probably have far fewer vertices than pixels being filled, the main challenge is to pipeline the latency of the texture fetches made by the vertex shader. The additional texture bandwidth may not be the primary concern compared with other, conventional texture operations, although those tend to be inherently coherent.

Given how textures are used in a vertex shader, an attempt to cache the texture fetches may be pointless in many cases anyway.

So, success would be highly application dependent IMHO.