Instability: large textures?

Has anyone seen computation, or even total system, instability when operating on very large textures?

I’m working on a simple GPGPU framework, using as an example the addition of three arrays in two passes. I vary the size of these arrays right up to around 3400x3400x4 each (two as input, one as output in memory) before getting GL_OUT_OF_MEMORY. All looks good, although 132MB is quite a bit less than the 512MB on the card; but other processes use memory too, so…

But occasionally when working with large data sets - 2900x2900x4 is one example - I’ll get a randomly incorrect result from the addition. I do it in two passes of two additions each, to illustrate an optimisation I’m testing, so either pass might have miscalculated. For example:

0.765007 + 0.0233772 + 0.0704367 = 1.42595 (?)
Then it’ll work fine again and be hard to reproduce. Just a moment ago, using a slight variation of the two-pass algorithm, I managed to freeze my machine completely and had to reboot.
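For reference, each pass is essentially just a full-screen quad rendered into an FBO with a fragment shader along these lines (a simplified illustration, not my exact framework code):

```cpp
// Illustrative GLSL for one addition pass: sample the two bound input
// textures and write their sum to the render target.
const char* addPassFS =
    "uniform sampler2D texA;                          \n"
    "uniform sampler2D texB;                          \n"
    "void main()                                      \n"
    "{                                                \n"
    "    vec4 a = texture2D(texA, gl_TexCoord[0].xy); \n"
    "    vec4 b = texture2D(texB, gl_TexCoord[0].xy); \n"
    "    gl_FragColor = a + b;                        \n"
    "}                                                \n";
```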

Running the same example on my laptop (with half as much system and video memory), the same executable just swaps endlessly before raising an exception. At first I suspected overheating, but the GPU was at 52°C on reboot, so that seems fine. And this only really seems to happen with very large data sets; I’ve never seen a similar result during intensive, repeated computation on small sets.

I think I’m going to stick to 2048x2048x4 textures for now, because I’ve never had problems with those. But it is a little worrying that the result of a GPU computation might be incorrect… :confused:

What’s the maximum texture size your hardware supports? My X1600 supports only 2048^2 textures, though it does support 4096^2 framebuffers.

Jan.

GL_MAX_TEXTURE_SIZE reports 4096 on my GeForce 7900 GTX.
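If anyone else wants to compare, these limits are easy to query; a minimal sketch, assuming a current GL context:

```cpp
#include <GL/gl.h>
#include <cstdio>

// Print the texture and viewport size limits the driver reports.
void printSizeLimits()
{
    GLint maxTex = 0;
    glGetIntegerv(GL_MAX_TEXTURE_SIZE, &maxTex);        // largest supported texture dimension
    printf("GL_MAX_TEXTURE_SIZE: %d\n", maxTex);

    GLint maxViewport[2] = { 0, 0 };
    glGetIntegerv(GL_MAX_VIEWPORT_DIMS, maxViewport);   // largest renderable area
    printf("GL_MAX_VIEWPORT_DIMS: %d x %d\n", maxViewport[0], maxViewport[1]);
}
```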

I experienced similar problems with Forceware drivers in versions < 81.98: when trying to display textures that didn’t fit in video memory, I found that some texels could be corrupted. This occurred only with textures that had been swapped out after being loaded onto the GPU.

However, the problem has been corrected as far as I know. Hope this helps.

We have been working with large textures here, ‘large’ meaning large relative to the amount of graphics card memory.

We have noticed some problems with the drivers. For one thing, we could not rely on the driver’s out-of-memory error: sometimes the texture creation goes through, but the texture only returns white/garbage when it is sampled. We’ve also seen the bottom line disappear on RTT textures that are exactly at the maximum size limit.
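For what it’s worth, the checks we run after creating a large texture look roughly like the sketch below (illustrative names; a real GPGPU app would use a float internal format such as GL_RGBA32F_ARB rather than GL_RGBA8). As noted above, these can still pass while the texture comes back as garbage, which is why we budget memory ourselves:

```cpp
#include <GL/gl.h>
#include <cstdio>

// 'tex' is assumed to have been created with glGenTextures already.
bool tryCreateTexture(GLuint tex, GLsizei w, GLsizei h)
{
    // 1. Ask the proxy target whether this size/format combination is supported at all.
    glTexImage2D(GL_PROXY_TEXTURE_2D, 0, GL_RGBA8, w, h, 0, GL_RGBA, GL_FLOAT, 0);
    GLint proxyW = 0;
    glGetTexLevelParameteriv(GL_PROXY_TEXTURE_2D, 0, GL_TEXTURE_WIDTH, &proxyW);
    if (proxyW == 0) {
        printf("proxy check failed for %dx%d\n", w, h);
        return false;
    }

    // 2. Allocate the real texture and check for an explicit error.
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, w, h, 0, GL_RGBA, GL_FLOAT, 0);
    if (glGetError() != GL_NO_ERROR) {
        printf("glTexImage2D failed for %dx%d\n", w, h);
        return false;
    }

    return true;   // "success" -- but sampling may still return garbage
}
```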

We wrote our own memory manager that used DXDiag to obtain the amount of card memory and made sure we never filled the card past a certain percentage. This also eliminated problems we were having with memory being swapped between the card and RAM by the drivers, which would sometimes cause a large pause (a few seconds) when a window viewing a large texture (e.g. a 128MB float32 texture) was brought to the foreground.

We also had several application crashes and bluescreens from the drivers, which seemed to happen when they couldn’t allocate a system-memory copy of a texture because our application had already maxed out RAM usage. Other times we had skipped lines or other weird artifacts, all of which stopped once we limited GPU memory usage.

We split our work up into smaller texture chunks when possible. Unfortunately, GPU drivers don’t seem to cope very well with the large textures (>64MB) that are common in GPGPU apps.
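The chunking itself is nothing fancy; roughly the following, where processTile is a hypothetical stand-in for whatever per-tile upload/compute/readback the application actually does:

```cpp
#include <algorithm>

const int TILE = 2048;   // tile edge that stays well below the problematic sizes

// Hypothetical per-tile work: upload the sub-rectangle, run the pass, read back.
void processTile(int x, int y, int w, int h)
{
    // application-specific; omitted here
}

// Walk a large 2D data set in TILE x TILE pieces instead of one huge texture.
void processLargeArray(int width, int height)
{
    for (int y = 0; y < height; y += TILE) {
        for (int x = 0; x < width; x += TILE) {
            int tw = std::min(TILE, width  - x);
            int th = std::min(TILE, height - y);
            processTile(x, y, tw, th);
        }
    }
}
```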

Stephen,

What you’re saying sounds all too familiar. I’ve also had problems with swapping in/out of system memory severely degrading performance when working with very large textures.

The lack of a decent memory-management extension in OpenGL makes this really tricky. I think you may be right in using DirectX to monitor memory usage; I’ll give it a shot.

I don’t use DX to monitor memory usage; I use the DXDiag API to get the total amount of physical memory on the card. Then in my application I have a resource manager that keeps track of GPU memory usage, e.g. framebuffer usage + texture usage + large VBOs + pbuffer usage. If this usage exceeds, say, 70% of graphics card memory, the resource manager starts destroying stale/unused GPU objects whenever some piece of code tries to allocate a new one (a rough sketch of this follows the caveats below).

This is of course an inexact estimate because:

  1. there could be other applications running in the background using framebuffer/texture resources, and my resource manager has no way to know how much memory they are using. We allow the user to adjust the GPU cache percentage to account for this.

  2. the drivers are free to do anything they want behind the scenes. They don’t have to store textures/VBOs on the card if they don’t want to.
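A rough sketch of that bookkeeping, with illustrative names (the 70% figure and the DXDiag-reported card total are as described above; the stale-first eviction is just one reasonable policy, not necessarily what anyone else should use):

```cpp
#include <cstddef>
#include <list>

struct GpuObject {
    size_t bytes;        // our own estimate of the object's GPU footprint
    bool   inUse;        // currently bound / needed this frame?
    void destroy() { /* glDeleteTextures / glDeleteBuffers etc. (app-specific) */ }
};

class GpuMemoryBudget {
public:
    GpuMemoryBudget(size_t cardBytes, double fraction = 0.7)
        : budget_(static_cast<size_t>(cardBytes * fraction)), used_(0) {}

    // Call before creating a new texture / VBO / pbuffer of 'bytes' size:
    // evict stale objects (oldest first) until the new object would fit.
    void makeRoomFor(size_t bytes)
    {
        while (used_ + bytes > budget_ && evictOne()) {}
    }

    // Call once the GL object has actually been created.
    void add(GpuObject* obj)
    {
        used_ += obj->bytes;
        lru_.push_back(obj);
    }

private:
    bool evictOne()
    {
        for (std::list<GpuObject*>::iterator it = lru_.begin(); it != lru_.end(); ++it) {
            if (!(*it)->inUse) {            // never evict something still in use
                used_ -= (*it)->bytes;
                (*it)->destroy();
                lru_.erase(it);
                return true;
            }
        }
        return false;                       // nothing evictable; the allocation may fail
    }

    size_t budget_, used_;
    std::list<GpuObject*> lru_;
};
```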

We’ve also found that using PBOs to upload texture resources (especially large ones, which we split across several PBOs) seems to stop the driver from keeping a system-memory copy of the texture, which is a big advantage for us. If we’re working on a 512MB card and want to upload a 192MB texture, it seems like there are three copies of the texture: (1) our application has a copy of the 192MB texture, (2) the driver normally seems to keep a copy of the 192MB texture somewhere in system RAM (?), and (3) there’s another copy of the 192MB texture somewhere on the card.

Copy 2 is the extra one and uses up memory that our application could definitely use. It would be nice if there were some way to tell the driver not to do this. We have a very sophisticated cache-to-disk system to handle our own 192MB copy based on usage, but we can’t do anything about the driver’s copy.
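For reference, the single-PBO version of that upload path looks roughly like this (a sketch assuming an extension loader such as GLEW provides the pixel-buffer-object entry points, and that the texture already has storage allocated with a float internal format; in practice we split very large uploads across several PBOs):

```cpp
#include <GL/glew.h>
#include <cstring>

// Upload w x h RGBA float data into an existing texture via a pixel buffer object.
void uploadViaPBO(GLuint tex, const float* data, GLsizei w, GLsizei h)
{
    const size_t bytes = static_cast<size_t>(w) * h * 4 * sizeof(float);

    GLuint pbo = 0;
    glGenBuffers(1, &pbo);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);

    // Allocate driver-side storage and copy the pixels into it.
    glBufferData(GL_PIXEL_UNPACK_BUFFER, bytes, 0, GL_STREAM_DRAW);
    if (void* dst = glMapBuffer(GL_PIXEL_UNPACK_BUFFER, GL_WRITE_ONLY)) {
        std::memcpy(dst, data, bytes);
        glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER);

        // With a PBO bound, the last argument is an offset into the PBO,
        // not a client-memory pointer.
        glBindTexture(GL_TEXTURE_2D, tex);
        glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, w, h, GL_RGBA, GL_FLOAT, 0);
    }

    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
    glDeleteBuffers(1, &pbo);
}
```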

None of the above is set in stone; it’s just our best guess from having spent a lot of time trying to work around driver problems/limitations. We don’t know what the driver is actually doing, and indeed we’ve noticed behaviour changes between driver revisions.

I’m not saying the drivers are badly written here; it’s just that they’re designed and optimized for a larger number of smaller textures, not huge float32 GPGPU textures that use up 1/4 to 1/2 of the card’s memory. The OpenGL model of abstracting away memory management doesn’t work so well here, imho.