async texture loading



pronvit
07-15-2005, 11:05 AM
I have a main thread that renders some graphics and another thread that loads textures using a separate GL context that is shared with the main one. This works perfectly on Linux but is slow on Windows. Does anybody know what the problem is, and how to load textures without interrupting the main thread?

andras
07-15-2005, 01:26 PM
I don't know why it's slow on Windows, but you don't need a separate context just to load textures asynchronously in the first place! Why not load the image into a local buffer, using your loader thread, then signal the main thread, when data is ready, and then load data using glTexImage from your main thread? A more advanced solution would be to load straight into a PBO, but that's optional..
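
A minimal sketch of the hand-off described above, assuming C++11 threads. The loader thread only decodes into memory and never touches GL; the render loop polls a queue and performs at most one upload per frame. `DecodedImage`, `UploadQueue`, `decode_image` and `render_until_loaded` are hypothetical names for illustration, and the `upload` callback marks where a real program would call glTexImage2D or glTexSubImage2D.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Pixels decoded by the loader thread, waiting for the render thread.
struct DecodedImage {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> rgba;  // tightly packed, 4 bytes per pixel
};

// Hand-off queue: the loader pushes, the render thread polls without blocking.
class UploadQueue {
public:
    void push(DecodedImage img) {
        std::lock_guard<std::mutex> lock(mutex_);
        ready_.push(std::move(img));
    }
    bool try_pop(DecodedImage& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (ready_.empty()) return false;
        out = std::move(ready_.front());
        ready_.pop();
        return true;
    }
private:
    std::mutex mutex_;
    std::queue<DecodedImage> ready_;
};

// Hypothetical stand-in for reading and decoding one image file.
// It touches no GL state, so it is safe on a thread without a context.
DecodedImage decode_image(int index) {
    DecodedImage img;
    img.width = 4;
    img.height = 4;
    img.rgba.assign(4 * 4 * 4, static_cast<std::uint8_t>(index));
    return img;
}

// Simulated render loop. `upload` is where real code would call
// glTexImage2D / glTexSubImage2D on the thread that owns the context.
// Returns the number of frames it took to upload every image.
int render_until_loaded(int image_count,
                        const std::function<void(const DecodedImage&)>& upload) {
    assert(image_count >= 0);
    UploadQueue queue;
    std::thread loader([&queue, image_count] {
        for (int i = 0; i < image_count; ++i) queue.push(decode_image(i));
    });

    int frames = 0;
    int uploaded = 0;
    while (uploaded < image_count) {
        DecodedImage img;
        if (queue.try_pop(img)) {  // at most one upload per frame
            upload(img);
            ++uploaded;
        }
        ++frames;                  // ... draw the scene here ...
        std::this_thread::yield();
    }
    loader.join();
    return frames;
}
```

The render loop never blocks on the loader: if nothing is ready, the frame is drawn with whatever textures are already resident.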

pronvit
07-15-2005, 02:13 PM
the problem is that glTexImage2D itself takes a long time, which is unacceptable in the main thread because the image needs to update smoothly (camera movement)

Joe Montana
07-15-2005, 03:08 PM
I've found the same problem. glTexImage takes a lot of time. glTexSubImage makes no improvement either, even though I thought it would.
I've seen programs like the free WorldWind from NASA that seem to do this seamlessly...

andras
07-15-2005, 03:16 PM
If speed is an issue, then load data into PBO first. glTexImage from a PBO is lightning fast. If it's still not fast enough, then just upload one texture (or even just a subimage) each frame.

andras
07-15-2005, 03:34 PM
As for WorldWind, I believe it uses DXT1 compressed textures. A 2048x2048 RGB texture is just 2MB when compressed with DXT1! Uploading 2MB is a snap! How much data are you trying to upload every frame? I'd also guess that WorldWind doesn't switch all the textures at once; rather, it loads them as required, one at a time. Of course, these are only guesses on my part. The WorldWind project is open source, so you could take a look at how exactly they did it..
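
The 2MB figure can be checked with a little arithmetic: S3TC stores every 4x4 pixel block in a fixed number of bytes (8 for DXT1, 16 for DXT3/DXT5). The helper names below are made up for illustration; the same value is what would be passed as the imageSize argument of glCompressedTexImage2D.

```cpp
#include <cassert>
#include <cstddef>

enum class DxtFormat { DXT1, DXT3, DXT5 };

// Bytes used by one 4x4 pixel block.
std::size_t dxt_block_bytes(DxtFormat fmt) {
    return fmt == DxtFormat::DXT1 ? 8 : 16;
}

// Size in bytes of one compressed mip level; partial blocks round up.
std::size_t dxt_level_size(int width, int height, DxtFormat fmt) {
    assert(width > 0 && height > 0);
    const std::size_t blocks_x = (static_cast<std::size_t>(width) + 3) / 4;
    const std::size_t blocks_y = (static_cast<std::size_t>(height) + 3) / 4;
    return blocks_x * blocks_y * dxt_block_bytes(fmt);
}

// Size in bytes of the same level stored uncompressed.
std::size_t raw_level_size(int width, int height, int bytes_per_pixel) {
    assert(width > 0 && height > 0 && bytes_per_pixel > 0);
    return static_cast<std::size_t>(width) * static_cast<std::size_t>(height) *
           static_cast<std::size_t>(bytes_per_pixel);
}
```

For 2048x2048 this gives 512 x 512 blocks x 8 bytes = 2 MiB for DXT1, against 16 MiB for the same image stored as 32-bit RGBA.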

pronvit
07-15-2005, 10:15 PM
I know about WorldWind but it's DirectX based and not OpenGL, so it won't help me much. Btw, it also updates models and loads textures in a separate thread. And I'm wondering why using a separate thread doesn't work for me.
A very good example is Google Earth - it's like WorldWind but OpenGL based (and not open source :( ). Camera movement is very smooth there and isn't interrupted at all by texture loading, so I'm very interested in how this is done.

pronvit
07-16-2005, 04:43 AM
finally I found the solution!! :)

i turned off S3TC and FXT1 texture compression in the driver options in Windows and now my app works as fast as on Linux :)

so next question (I'm new to OpenGL programming) - how to control these settings from my app?

ZbuffeR
07-16-2005, 05:03 AM
The specs :
http://oss.sgi.com/projects/ogl-sample/registry/EXT/texture_compression_s3tc.txt
http://oss.sgi.com/projects/ogl-sample/registry/ARB/texture_compression.txt
Sample code :
http://www.robthebloke.org/opengl_programming.html (search for 'compression' in the page)

andras
07-16-2005, 08:10 AM
Originally posted by pronvit:
I know about WorldWind but it's DirectX based and not OpenGL, so it won't help me much.

It doesn't really matter if it's OpenGL or not, what I said about loading still holds.

Originally posted by pronvit:
Btw, it also updates models and loads textures in separate thread.

I've never said otherwise. What I said is that you do not need a separate context to do this. :)
And don't forget about the powerful PBOs!!

pronvit
07-16-2005, 09:13 AM
but how can two threads use one context at the same time?

in any case now it works quite well for me (my textures are small and I don't need PBOs and so on, it was slow because of texture compression done by driver).

andras
07-16-2005, 10:32 AM
Originally posted by pronvit:
but how can two threads use one context at the same time?

They can't. You only upload textures from the one thread that has the context. But the other thread can still do all the hard work without needing a context. I've explained it before, please read more carefully. ;)
Anyway, if it works for you now it's fine, I'm just saying that IMHO it's not worth the trouble to have multiple contexts for this. You can do simpler and better.

zed
07-16-2005, 01:33 PM
I haven't seen Google Earth or WorldWind but I can imagine what they're like.
Loading a 1024x1024 DXT3/5 texture takes a LOT less than 0.01 seconds, so you can happily maintain 100fps on a reasonable system by just loading one texture a frame (I've done this myself) with absolutely no pauses when new textures are loaded in.
No threads are needed.

pronvit
07-16-2005, 02:01 PM
but I can see pauses.. and they aren't there when no textures need to be loaded..

ok, thank you, I'll play with texture formats/flags..

andras
07-16-2005, 02:06 PM
Originally posted by pronvit:
but I can see pauses.. and they aren't there when no textures need to be loaded..

Can you tell us how many, and what size/format textures you are trying to upload every frame?

BTW: PBO! PBO! PBO! :D

pronvit
07-16-2005, 02:12 PM
textures are small - 512x512. I'm loading them from .jpg files (in a separate thread) and then uploading them as GL_RGB. Now I can see very little difference (if any) between uploading in the main thread, one per frame, and uploading in a separate thread (and context). Of course maybe there's something wrong in other parts of my code, but I can surely see a difference between situations when new textures need to be uploaded and when not.

any tutorial/doc on using PBO?

andras
07-16-2005, 02:28 PM
Could you try uploading in main thread, one per frame, but do the jpeg decompression in another thread? If it's still slow, I suspect there is something else.

As for PBOs, you won't find too many tutorials, but it's exactly the same as a VBO, for which you'll find tons of materials online. The only difference is that with PBO, you can bind an arbitrary buffer object to GL_PIXEL_UNPACK_BUFFER target, and after this, every glTexImage call will source the data from the bound buffer object. And the contents of the buffer object can be loaded in a separate thread, and it's usually located in VRAM or AGP memory, from where the actual upload with glTexImage will be lightning fast. Hope this helps.
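
A rough sketch of the order of operations described above. Because the real calls need a live context, the GL-side steps are passed in as callbacks here, with the calls I would expect to use named in comments (core OpenGL 2.1 names; the original extension versions carry a suffix on the same tokens and entry points). `PboSteps` and `stream_texture` are hypothetical names, not part of any API. The point is only that map, unmap and upload stay on the context thread, while the copy into the mapped memory can happen elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <thread>
#include <vector>

// GL-side steps as callbacks so the sketch runs without a context.
// In real code all three run on the thread that owns the context.
struct PboSteps {
    // glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    // glBufferData(GL_PIXEL_UNPACK_BUFFER, size, NULL, GL_STREAM_DRAW);
    // return glMapBuffer(GL_PIXEL_UNPACK_BUFFER, GL_WRITE_ONLY);
    std::function<void*(std::size_t)> map;
    // glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER);
    std::function<void()> unmap;
    // glTexSubImage2D(..., format, type, (const void*)0);
    // With a PBO bound, the last argument is a byte offset into the buffer.
    std::function<void()> upload;
};

// Map on the GL thread, fill on a worker thread, then unmap and upload
// on the GL thread again. Returns false if mapping failed.
bool stream_texture(const PboSteps& gl, const std::vector<std::uint8_t>& pixels) {
    assert(!pixels.empty());
    void* dst = gl.map(pixels.size());
    if (dst == nullptr) return false;

    // The worker makes no GL calls; it only writes into mapped memory.
    std::thread worker([dst, &pixels] {
        std::memcpy(dst, pixels.data(), pixels.size());
    });
    worker.join();  // a real renderer would keep drawing and poll instead

    gl.unmap();
    gl.upload();
    return true;
}
```

In a real renderer the join would be replaced by a "buffer filled" flag checked once per frame, so the main thread keeps rendering while the worker copies.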

ScottManDeath
07-16-2005, 03:38 PM
If you use glTexImage and S3TC the driver compresses your textures --> slow

If you use glCompressedTexImage with precompressed S3TC format and proper parameters for the call, the driver just copies the texture to the GPU --> fast

ZbuffeR
07-18-2005, 12:05 AM
(I just played a bit with Google Earth : they seem to use only small textures)

OneSadCookie
07-18-2005, 01:50 AM
Using a four-component format (eg RGBA) will almost certainly be substantially faster to upload than a three-component one (RGB, as you say you're using).

I know libjpeg doesn't like to decompress to four-component colors, but perhaps you could use an alternate image format or decompression library.

Of course, as the others say, uploading precompressed DXT textures is probably as fast as you can get...
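
If the decoder only produces three components, one way to try the four-component suggestion is to expand the pixels in the loader thread before handing them over. A small sketch, assuming tightly packed 8-bit RGB input; `rgb_to_rgba` is a made-up helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Expands tightly packed 8-bit RGB to RGBA with fully opaque alpha.
std::vector<std::uint8_t> rgb_to_rgba(const std::vector<std::uint8_t>& rgb) {
    assert(rgb.size() % 3 == 0);
    const std::size_t pixel_count = rgb.size() / 3;
    std::vector<std::uint8_t> rgba(pixel_count * 4);
    for (std::size_t i = 0; i < pixel_count; ++i) {
        rgba[4 * i + 0] = rgb[3 * i + 0];
        rgba[4 * i + 1] = rgb[3 * i + 1];
        rgba[4 * i + 2] = rgb[3 * i + 2];
        rgba[4 * i + 3] = 255;  // opaque
    }
    return rgba;
}
```

The result would then be uploaded with GL_RGBA / GL_UNSIGNED_BYTE as the format and type. Whether this is actually faster depends on the driver and hardware, so it is worth timing both paths.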

pronvit
07-18-2005, 02:03 AM
ok, thank you for all replies, I'm still playing with my code, hope to get what I need with your help.

so two questions: how do I tell the driver (from my app) not to compress my uncompressed textures when I call glTexImage2D? and do you know of a library to compress textures (to do this in a background thread and then pass the small compressed data for uploading)?

btw, I read in help for glTexImage2D that given data is converted to floats, "multiplied by the signed scale factor GL_c_SCALE, added to the signed bias GL_c_BIAS, and clamped to the range [0,1]" - so this isn't a direct copy in any case?

jide
07-18-2005, 02:36 AM
Originally posted by OneSadCookie:
Using a four-component format (eg RGBA) will almost certainly be substantially faster to upload than a three-component one (RGB, as you say you're using).

Why would it be substantially faster?

kehziah
07-18-2005, 03:42 AM
because of better data alignment

pronvit
07-18-2005, 04:20 AM
Originally posted by jide:

Originally posted by OneSadCookie:
Using a four-component format (eg RGBA) will almost certainly be substantially faster to upload than a three-component one (RGB, as you say you're using).
Why would it be substantially faster?

From help for glTexImage2D:
"It is converted to floating point format and assembled into an RGBA element by attaching 1.0 for alpha."
so alpha will be there in any case, and the function will return faster if alpha is already there

pronvit
07-18-2005, 04:50 AM
the most interesting thing is that under Linux my app works smoothly w/o any advanced methods (such as PBO) and under Windows it doesn't. so I assume maybe there are some different defaults in drivers on different systems...

Relic
07-18-2005, 06:28 AM
These conversion rules in the spec are just to specify what the exact procedure is, but it would be silly to convert to float and back to an integer internal format with the default bias of 0.0 and scale of 1.0 because the data won't change.
I also wouldn't expect RGB to be expanded to RGBA if the hardware can support that format natively. It will save the memory for the alpha component, knowing that the alpha should be 1.0 during usage of such textures.

andras
07-18-2005, 06:37 AM
Originally posted by ZbuffeR:
(I just played a bit with Google Earth : they seem to use only small textures)

That's the default. Go to tools/options to change it..

andras
07-18-2005, 06:40 AM
Originally posted by Relic:
I also wouldn't expect that RGB is expanded to RGBA if a hardware can support that format natively. It will save the memory for the alpha component knowing that the alpha should be 1.0 during usage of such textures.

In practice, nVidia cards do not support 24-bit RGB; they are all stored as 32-bit RGBA (this makes DXT1 sound even greater in comparison, since it's just 4 bits per pixel!!). I don't know about ATI though..

andras
07-18-2005, 06:45 AM
Originally posted by pronvit:
how to tell driver (from my app) not compress my uncompressed textures when I call glTexImage2D?

Simple. If you set the internalformat to be compressed, then glTexImage will compress it on-the-fly, otherwise it won't.

Relic
07-18-2005, 07:27 AM
Originally posted by andras:
In practice, nVidia cards do not support 24bit RGB, they are all stored as 32bit RGBA

Mostly right, but check the last column here: http://developer.nvidia.com/object/nv_ogl_texture_formats.html

andras
07-18-2005, 08:16 AM
Hehe, so the GF6200 TurboCache supports it... Obviously in that card they did everything to save memory.

pronvit
07-18-2005, 08:50 AM
Originally posted by andras:

how to tell driver (from my app) not compress my uncompressed textures when I call glTexImage2D?
Simple. If you set the internalformat to be compressed, then glTexImage will compress it on-the-fly, otherwise it won't.

it's obvious, but how exactly to set internal format to be uncompressed?

Jan
07-18-2005, 10:26 AM
Originally posted by pronvit:

Originally posted by andras:

how to tell driver (from my app) not compress my uncompressed textures when I call glTexImage2D?
Simple. If you set the internalformat to be compressed, then glTexImage will compress it on-the-fly, otherwise it won't.
it's obvious, but how exactly to set internal format to be uncompressed?

RTFM:

e.g. GL_RGBA8 - 32 bit UNCOMPRESSED
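
To make that concrete, a small sketch of choosing the internalformat argument. The constants repeat the token values from the standard GL headers under local names (real code should include the headers rather than redefine them), and `pick_internal_format` is a hypothetical helper. This covers only the application side; a compression override in the driver control panel, like the one mentioned earlier in the thread, is outside what the application sets here.

```cpp
#include <cassert>

// Token values as found in gl.h / glext.h, under local names.
constexpr unsigned int kRgb8 = 0x8051;               // GL_RGB8
constexpr unsigned int kRgba8 = 0x8058;              // GL_RGBA8
constexpr unsigned int kCompressedRgbDxt1 = 0x83F0;  // GL_COMPRESSED_RGB_S3TC_DXT1_EXT
constexpr unsigned int kCompressedRgbaDxt5 = 0x83F3; // GL_COMPRESSED_RGBA_S3TC_DXT5_EXT

// Returns the value to pass as the third argument of
// glTexImage2D(target, level, internalformat, width, height, border,
//              format, type, pixels).
// A compressed token asks the driver to compress during the upload;
// a sized uncompressed token such as GL_RGBA8 asks for plain storage.
unsigned int pick_internal_format(bool has_alpha, bool compress_on_upload) {
    if (compress_on_upload) {
        return has_alpha ? kCompressedRgbaDxt5 : kCompressedRgbDxt1;
    }
    return has_alpha ? kRgba8 : kRgb8;
}
```

For the 512x512 JPEG textures discussed above, `pick_internal_format(false, false)` yields GL_RGB8, which requests uncompressed storage.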