are there problems with lots of textures?

canuckle · April 15, 2007, 5:29pm

I have a program with a memory cache that may contain many (say, 3k different) textures. This could take up a lot of memory (say, 500 MB).

I’m assuming OGL is keeping a copy of my textures in AGP memory in RGBA_32 format and sends it to the video card when necessary. I usually draw using the same (say, 150) textures every frame, so hopefully there’s not a lot of thrashing of video memory.

I’m wondering if this is an acceptable thing to do or if I shouldn’t rely on the OGL drivers to do a good job managing this memory.

Any ideas?

Cyranose · April 15, 2007, 6:59pm

Out of curiosity, are you binding most or all 3k textures each frame and only seeing the usual 150 of them? Or do you have some spatial testing before binds?

[note: In responding, I’m assuming you have less available videomem than total requested texture bytes, or you wouldn’t be asking. Let me know if that’s not the case.]

If you want to benchmark how much those texture “misses” (where the texture is requested but not resident) cost, you can add a toggle to bind only one texture for all objects and see the difference. If it’s significant, then you have your answer as to “cost” and whether you should try to optimize here.
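A minimal sketch of such a toggle, written in Python with the GL call passed in as a callable so the logic runs without a context. `make_bind`, `bind_fn`, and `dummy_id` are hypothetical names, not OpenGL API; in a real program `bind_fn` would presumably wrap `glBindTexture(GL_TEXTURE_2D, id)`.

```python
def make_bind(bind_fn, dummy_id):
    """Return (bind, set_benchmark).

    While benchmark mode is on, every bind request is routed to dummy_id,
    so the frame never touches more than one texture.
    """
    state = {"benchmark": False}

    def bind(texture_id):
        bind_fn(dummy_id if state["benchmark"] else texture_id)

    def set_benchmark(enabled):
        state["benchmark"] = bool(enabled)

    return bind, set_benchmark


# Usage: record the IDs that would reach the driver.
calls = []
bind, set_benchmark = make_bind(calls.append, dummy_id=1)
bind(42)             # normal mode: the requested texture is bound
set_benchmark(True)
bind(42)             # benchmark mode: the dummy is bound instead
bind(43)
```

Comparing frame times with the toggle on and off gives a rough upper bound on what texturing costs, paging included.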

As for relying on the driver, you don’t have much choice. “If the texture don’t fit, you must page it.”

OTOH, one fairly simple method I used for a project a while back (with at least that many textures) was to have all textures have two software states – one where the texture is considered “unloaded” meaning that when bound, GL is handed the ID of a common low-res stand-in (perhaps a 1x1 or 2x2 pixel of a similar color, but not too many of these either); and the second state is the normal “loaded” textured state. This way, your Bind wrapper code doesn’t need to know much except the currently used ID for that texture.
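The two software states might be sketched like this. It is an illustration under assumptions, not the code from that project: `ManagedTexture`, `current_id`, and `bind_fn` are hypothetical names, and `bind_fn` stands in for a wrapper around `glBindTexture`.

```python
class ManagedTexture:
    """Application-level texture handle with 'loaded' and 'unloaded' states."""

    def __init__(self, real_id, standin_id):
        self.real_id = real_id        # GL name of the full texture
        self.standin_id = standin_id  # GL name of a shared low-res stand-in
        self.loaded = False           # software state, not GPU residency

    @property
    def current_id(self):
        # The bind wrapper only ever needs this one value.
        return self.real_id if self.loaded else self.standin_id


def bind(texture, bind_fn):
    bind_fn(texture.current_id)


# Usage: the same bind call yields a different GL name after the flip.
calls = []
tex = ManagedTexture(real_id=7, standin_id=1)
bind(tex, calls.append)   # stand-in is bound
tex.loaded = True
bind(tex, calls.append)   # real texture is bound
```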

A texture manager would then use spatial tests to flip the best set of textures from unloaded to loaded states and back as a participant flew around.

At the time, bandwidth was very low relative to now, so we also had to carefully manage how many we flipped to “loaded” per frame to avoid stutters when the textures were actually fetched from system memory.
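A rough sketch of a manager with a per-frame load budget follows. The distance-based ranking, the `capacity` limit, and all names (`Tex`, `update_loaded_set`, `max_loads_per_frame`) are assumptions made for illustration; the original project's spatial test is not described in detail.

```python
from dataclasses import dataclass


@dataclass
class Tex:
    name: str
    position: tuple      # (x, y) of the object that uses this texture
    loaded: bool = False


def update_loaded_set(textures, viewer, capacity, max_loads_per_frame):
    """Move toward having the `capacity` nearest textures loaded.

    Unloading is treated as free and done immediately; at most
    `max_loads_per_frame` textures are flipped to loaded per call.
    Returns the number of textures flipped to loaded.
    """
    def dist2(t):
        dx = t.position[0] - viewer[0]
        dy = t.position[1] - viewer[1]
        return dx * dx + dy * dy

    wanted = sorted(textures, key=dist2)[:capacity]
    wanted_ids = {id(t) for t in wanted}

    for t in textures:
        if t.loaded and id(t) not in wanted_ids:
            t.loaded = False

    loads = 0
    for t in wanted:                 # nearest first
        if loads >= max_loads_per_frame:
            break
        if not t.loaded:
            t.loaded = True
            loads += 1
    return loads


# Usage: call once per frame with the current viewer position.
scene = [Tex("a", (0, 0)), Tex("b", (1, 0)), Tex("c", (5, 0)), Tex("d", (9, 0))]
update_loaded_set(scene, viewer=(0, 0), capacity=3, max_loads_per_frame=2)
```

With a budget of 2, the third wanted texture only becomes loaded on the following frame, which is the source of the stand-in being visible for a while.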

The downside is that if you don’t get textures into the loaded state in time, you see the stand-in texture, which may or may not be a problem for you. The upside is you can make the uploading very controllable and deterministic (compared to the blind approach).

A similar method would also work, where you could force the MIP level to the lowest resolution for any texture you think you might not need at the moment, thereby capping the number of bytes transferred at any given time.
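To get a feel for how much this caps, here is a small calculation of mip chain sizes, assuming uncompressed 4-byte texels. The function names are hypothetical. One way to restrict sampling to coarser levels in OpenGL 1.2 and later is the `GL_TEXTURE_BASE_LEVEL` texture parameter; whether a driver then avoids transferring the finer levels is implementation-dependent.

```python
def mip_level_bytes(width, height, level, bytes_per_texel=4):
    """Bytes used by one mip level of a width x height texture."""
    w = max(1, width >> level)
    h = max(1, height >> level)
    return w * h * bytes_per_texel


def bytes_from_level(width, height, base_level, bytes_per_texel=4):
    """Total bytes of the mip chain from base_level down to 1x1."""
    total = 0
    level = base_level
    while True:
        total += mip_level_bytes(width, height, level, bytes_per_texel)
        if (width >> level) <= 1 and (height >> level) <= 1:
            break
        level += 1
    return total


full = bytes_from_level(256, 256, 0)     # whole chain of a 256x256 texture
clamped = bytes_from_level(256, 256, 4)  # 16x16 and smaller only
print(full, clamped)
```

For a 256x256 RGBA texture this comes to 349524 bytes for the full chain against 1364 bytes when only the 16x16 level and below are used.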

Other methods try to group textures, which helps with binds but not so much with memory (and mipping is harder). And then there are various “universal texture” methods to try and handle the paging. But that’s also not trivial for a lot of unique textures that may or may not have spatial coherence.

k_szczech · April 16, 2007, 7:48am

If you want to benchmark how much those texture “misses” (where the texture is requested but not resident) cost, you can add a toggle to bind only one texture for all objects and see the difference.
It would be better to randomly choose textures of similar format and size; just try larger and smaller texture sets. This way you make sure texture cache and bind costs don’t get in the way of your benchmark.
Still, you may get very different results on different GPUs.

A texture manager would then use spatial tests to flip the best set of textures from unloaded to loaded states and back as a participant flew around.
Fine for DX, but in OpenGL you can’t tell whether a texture is resident on modern GPUs. Longs Peak will allow that again.

knackered · April 16, 2007, 8:09am

He’s not talking about loaded as in card-resident, he’s talking at the application level.

Cyranose · April 16, 2007, 11:35am

Yes, not “loaded” as in resident, but in quotes to represent a virtual state, in this case, using the GL-ID of a pre-loaded stand-in or one of the lower-res MIP levels instead of the texture you want to temporarily avoid making resident.

I’m not sure what random binding buys you, k_szczech. I’d like to understand. My point was just to see what the total cost of 3k/500MB texture binds is for this app. Texturing could be turned off entirely, but then the fill cost is different, so I suggest using a dummy texture to factor that out.

If the question was just about the number of textures and not the memory management issues, then I’d suggest binding the same number of unique but smaller textures (known to fit in videomem) 1:1 with the original binds to distinguish the binds from the possible memory transfers. But I doubted that was the case, as even a 512MB card will page when handed 500MB worth of textures.

canuckle · April 16, 2007, 12:49pm

Hey, guys, thanks for the help!

To clarify, I bind at most 150 textures per frame. Between any two consecutive frames, most of the bound textures are the same, so the miss rate should not be high. I understand that OGL will have to page textures into the card when necessary.
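As a back-of-the-envelope check on those numbers, assuming all 3k textures are roughly the same size (an assumption; the thread does not give per-texture sizes):

```python
MB = 1024 * 1024

total_bytes = 500 * MB      # whole cache, from the first post
total_textures = 3000
per_frame_textures = 150

avg_bytes = total_bytes / total_textures
working_set = per_frame_textures * avg_bytes

print(f"average texture: {avg_bytes / 1024:.0f} KB")
print(f"per-frame working set: {working_set / MB:.0f} MB")
```

Under that assumption the per-frame working set is about 25 MB, a twentieth of the full cache, so paging should only come into play as the visible set changes.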

My question was whether or not I can trust OGL to store 3k textures without doing something very inefficient. From your responses, it seems that the answer is yes.

Another relevant topic (which you guys touched on) is whether I could provide a pointer to the textures whenever they need to be paged in to the video card (instead of OGL doing that for me). That way OGL would not need to keep a copy of the texture in driver memory and I could also use the image for other purposes in my application.

Thanks for your help!

k_szczech · April 16, 2007, 2:57pm

He’s not talking about loaded as in card-resident
But I believe canuckle is

I’m not sure what random binding buys you, k_szczech
Nothing. That’s the point.
If you want to see how much you lose on swapping textures with system memory then you shouldn’t compare 500 textures vs. 1 texture since that also involves texture binding speed and texture caching differences.
Comparing 500 vs. 100 would be much better.
Actually you don’t need random textures. The best option would be to map these 500 textures to a set of 100 textures (each group of 5 textures would actually use the same GL name).
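That mapping could be sketched as a lookup table from logical texture index to a shared GL name. `build_alias_table` and the example name values are hypothetical; the grouping of consecutive indices is one possible choice.

```python
def build_alias_table(logical_count, shared_names):
    """Map each logical texture index to one of the shared GL names.

    Consecutive logical textures are grouped so that every group shares
    a single name (500 logical textures over 100 names gives groups of 5).
    """
    group = -(-logical_count // len(shared_names))  # ceiling division
    return [shared_names[i // group] for i in range(logical_count)]


# Usage: 500 logical textures backed by 100 real GL names.
shared = list(range(1000, 1100))   # stand-ins for names from glGenTextures
alias = build_alias_table(500, shared)
# In benchmark mode, bind alias[i] instead of texture i's own name.
```

The number of bind calls per frame stays the same as in the 500-texture run, so the difference in frame time mostly reflects the smaller set of distinct textures.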