another texture memory management post

I am writing a texture memory manager that attempts to keep track of which texture objects are currently resident. The reason behind it is that LRU is very bad for the system I’m working on. The manager will allow for MRU use of texture objects, ensuring that the texture object used for sub-texturing is resident (I want to sub-texture when I run out of texture memory, rather than swapping out the LRU texture and reusing texture memory). Unfortunately, since clients can add graphical objects in a different order in different frames, I can’t do any kind of texture sorting or the like, since there is no guarantee as to the order in which the data will be encountered.

So my question is, what is more expensive, binding or sub-texturing? Binding can switch from a resident texture to a resident texture, which I think is just a state change. I believe this is the best case scenario. Binding can also switch from a resident to a non-resident texture, meaning that a swap occurs from system memory to texture memory, correct? This is probably bad. And then there is sub-texturing. I’m not too sure on this one, but I believe the supplied data just gets sent down the pipe and replaces the specified area in the currently bound texture object, yes? So assuming the texture object is already bound, the only hit would be the pipeline transfer.

If it turns out that binding is not a big deal, maybe using LRU is ok. But all the tests that we’ve tried (LRU) show a round-robin effect. Bad, bad for animating.
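The round-robin effect is easy to reproduce in a toy model. The sketch below is only an illustration, not a measurement of any driver: `simulate`, the texture ids, and the capacity are made-up names and numbers, and every texture is assumed to be the same size. It draws the same five textures each frame with room for only four.

```python
def simulate(policy, capacity, frame, num_frames):
    """Count texture uploads when `frame` (a list of ids) is drawn repeatedly."""
    resident = []  # ordered from least to most recently used
    uploads = 0
    for _ in range(num_frames):
        for tex in frame:
            if tex in resident:
                resident.remove(tex)  # hit: only the recency order changes
            else:
                uploads += 1          # miss: texture must be loaded
                if len(resident) >= capacity:
                    # "lru" evicts the oldest entry, "mru" the newest one
                    resident.pop(0 if policy == "lru" else -1)
            resident.append(tex)
    return uploads


frame = [0, 1, 2, 3, 4]  # five textures per frame, room for four
print("LRU uploads:", simulate("lru", 4, frame, 10))
print("MRU uploads:", simulate("mru", 4, frame, 10))
```

In this model LRU reloads all five textures every frame, because each texture is evicted just before it is needed again. Replacing the most recently used texture instead costs one or two uploads per frame once the cache is warm.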

I haven’t been able to do any benchmarking, but I was hoping someone somewhere may have tried/done something similar and might have some input. Any help / suggestions good or bad are most welcome.

BTW, this is being done on an SGI InfiniteReality graphics system.

One more thing since you’re here… How does glPixelStore affect texture transfers? From what I’ve read it appears to be affected only by UNPACK-ing. If this is correct, does that mean the texture is only affected on a glTexImage or glTexSubImage call and NOT during under-the-covers swapping?

Thank you.

On iR the bind will be MUCH faster for a resident texture; it’ll be almost instantaneous compared to a subload.

For non-resident textures the situation is more complex. On the face of things, a bind of a non-resident texture would be faster than a subload because the GEs won’t have to swizzle (although with the right external format it may not matter). BUT typically this kind of bind will involve a copy from texture memory to system memory because server side textures are not cached on the host, so it will be much slower, not only because of the extra work to copy, but also because the textures will come back over the RBUS.

In addition, relying on the bind to trigger the paging would mean it happens mid-render, while you’re drawing, and you’d pay an extra performance price for this. You really want to page immediately after the screen clear, or start the next frame’s subloads at the end of this frame, depending on proximity to vertical retrace, for example.

Strided subloads are about half speed, so full image subloads should be used if possible. If you subload from the host, make sure the packed format matches the internal texture format.

This all means you probably want to fill texture memory and bind for that stuff; then, if you need more texture, subload over an LRU image that wasn’t used this frame, or over the last MRU image if none is available.
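One way to read that replacement rule, as a rough sketch: `pick_subload_target` and the `(last_frame, last_tick)` bookkeeping are hypothetical, and "last MRU image" is taken here to mean the image that was used most recently.

```python
def pick_subload_target(residents, current_frame):
    """Choose which resident texture to overwrite with a subload.

    residents maps texture id -> (last_frame, last_tick), where last_tick
    is a counter that increases on every use (hypothetical bookkeeping).
    """
    # Textures not touched this frame are safe to overwrite.
    idle = {t: v for t, v in residents.items() if v[0] != current_frame}
    if idle:
        # Prefer the least recently used of the idle textures.
        return min(idle, key=lambda t: idle[t][1])
    # Everything was used this frame: fall back to the most recent one.
    return max(residents, key=lambda t: residents[t][1])


residents = {"sky": (9, 40), "road": (10, 55), "tree": (8, 31)}
print(pick_subload_target(residents, current_frame=10))  # prints "tree"
```

The actual subload would then go to the chosen texture object, ideally at the frame boundary rather than mid-render.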

The situation w.r.t. a host side TRAM copy may have changed, or it may be an option now; this was knocked around as an idea. You’ll need to ask someone at SGI; Dave Shreiner may be able to help you.

> typically this kind of bind will involve a
> copy from texture memory to system memory
> because server side textures are not
> cached on the host

I would assume that current PC drivers cache the texture images on the “host” side, and do not read back the texture when replacing. I’ve convinced myself that the nVIDIA drivers do that, for example, and I can’t see why ATI or Kyro would be any different, as that’s the way to highest benchmark scores.

Also, I wouldn’t assume that the driver does a simple LRU texture replacement, because that would also not give the highest benchmark scores. Have you measured the specific driver behaviour (what’s your hardware?) and made sure you can’t trust the driver?

Yep, jwatte is right w.r.t. the nvidia drivers. Other drivers should do the same. Also keep in mind that a texture subload will first be copied to system memory and then sent down to vram on nvidia and similar drivers. I know that from an nvidia driver developer. Don’t know about sgi drivers though.