Streaming texture

The page http://www.songho.ca/opengl/gl_pbo.html
describes using two PBOs to update a texture via asynchronous DMA.

The concept is simple enough, but one thing confuses me. In any real application, would you not have at least two textures as well as two PBOs?

The figure labeled “Streaming texture uploads with 2 PBOs”
will not allow rendering with the texture while writing into it, as far as I understand (I’m new to OpenGL).

If it had instead been two PBOs with one texture bound to each, then
you could read data from disk into PBO_0 while rendering with TEX_0 and DMA’ing data from PBO_1 into TEX_1, and then swap,

in this sequence:

1: DISK->PBO_0 and PBO_1->TEX_1 and RENDER(TEX_0)
2: DISK->PBO_1 and PBO_0->TEX_0 and RENDER(TEX_1)
3: DISK->PBO_0 and PBO_1->TEX_1 and RENDER(TEX_0)
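The sequence above can be sketched as a render loop. This is only an illustration of the ping-pong idea, not working code: it assumes a GL context is current, `tex[]` and `pbo[]` are already created and sized, and `readNextFrame()` and `drawScene()` are hypothetical helpers standing in for the disk read and the scene draw.

```cpp
// Ping-pong between two textures and two PBOs, as in the sequence above.
// Assumes: GL context current; tex[2] and pbo[2] created; W, H known.
GLuint tex[2], pbo[2];
const GLsizei W = 1024, H = 1024, FRAME_BYTES = W * H * 4;

int i = 0;                              // role index, flips every frame
for (;;) {
    int j = 1 - i;

    // DISK -> PBO_i: orphan the buffer, then map and fill it
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo[i]);
    glBufferData(GL_PIXEL_UNPACK_BUFFER, FRAME_BYTES, NULL, GL_STREAM_DRAW);
    void* ptr = glMapBuffer(GL_PIXEL_UNPACK_BUFFER, GL_WRITE_ONLY);
    readNextFrame(ptr);                 // hypothetical disk-read helper
    glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER);

    // PBO_j -> TEX_j: start the DMA transfer from the other buffer
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo[j]);
    glBindTexture(GL_TEXTURE_2D, tex[j]);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, W, H,
                    GL_BGRA, GL_UNSIGNED_BYTE, 0);  // 0 = offset into PBO
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);

    // RENDER(TEX_i): draw with the texture that is not being written
    glBindTexture(GL_TEXTURE_2D, tex[i]);
    drawScene();                        // hypothetical draw call

    i = j;                              // swap roles for the next frame
}
```

With i = 0 this performs step 1 of the sequence, with i = 1 step 2, and so on.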

I think that to reap the benefit of PBOs, you need to have the timing worked out in your app for when you are ready to render.

Obviously it’s not going to be like a movie frame so you want to be streaming textures in the background.

In most cases I think you would be more concerned with having the texture you need available than with building up a stack of them fast enough. Having two halves of two textures is not very useful.

If it had been two PBOs with one texture bound to each

First mistake: textures are not bound to PBOs, or vice versa. A PBO is just a buffer object that you use to transfer data to a texture in place of a client-memory pointer.

Second, the reason to use two buffer objects is to get maximum performance. There are two steps in streaming data: getting the data from wherever it comes from (reading from disk, etc.), and the DMA transfer of that data to the texture. You use one buffer object as the destination of a read (step 1) while the second buffer object is being used as the source of a write (step 2). That gives maximum performance, at the cost of memory.
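The two-step overlap described above (this is the scheme on the songho.ca page) can be sketched like this. It assumes a current GL context, a single texture `tex`, two created buffers in `pbo[]`, and a hypothetical `produceData()` helper standing in for the disk read.

```cpp
// Two PBOs, one texture: each frame, one buffer feeds the texture
// upload while the other is mapped and filled with the next frame.
int index = 0;
for (;;) {
    int next = 1 - index;

    // Step 2: start the DMA transfer from pbo[index] into the texture
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo[index]);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, W, H,
                    GL_BGRA, GL_UNSIGNED_BYTE, 0);  // 0 = offset into PBO

    // Step 1: meanwhile, fill pbo[next] with new data
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo[next]);
    glBufferData(GL_PIXEL_UNPACK_BUFFER, FRAME_BYTES, NULL, GL_STREAM_DRAW);
    void* ptr = glMapBuffer(GL_PIXEL_UNPACK_BUFFER, GL_WRITE_ONLY);
    produceData(ptr);                   // hypothetical data source
    glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);

    index = next;                       // alternate the two buffers
}
```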

To do what you’re suggesting would be no different from using one PBO for each texture.

In any real application, would you not have at least two textures as well as two PBOs?

Well, that depends.

The two PBOs are for a specific case of streaming: updating the texture every frame. If you’re only streaming data “infrequently” (say, because your data set is simply larger than can fit into video memory), then you are not guaranteed to be streaming data constantly. You also generally have some textures that aren’t currently being used; these are the destination textures for streaming in a new section of data.

With this kind of streaming, you generally give yourself several seconds of latency. That is, you say “I need the data for block X” long before you actually intend to start using it. In this case, you may use one or two PBOs for multiple textures.

This kind of streaming is far more prevalent than the kind where you need to update a texture every frame. That kind of streaming is typically movie rendering. And you’re generally only rendering one movie at a time, so it’s not a problem.

To do what you’re suggesting would be no different from using one PBO for each texture.

That was what I suggested, actually… the link I gave suggested using two PBOs with one texture.

I am doing movie rendering, but in a 3D scene and with several movies. Every time I render, I need to update the texture.

In that light, would the sequence I suggested be optimal? I cannot write to a texture while rendering with it, so I need two. I cannot read video into a PBO while writing from it to a texture, so I need two of those as well. I assume there are no extra problems requiring me to have more than two of each?

Thanks for the info about the “bound to texture” misconception I was having.

In that light, would the sequence I suggested be optimal?

Actually, come to think of it, you don’t need two PBOs at all. One will do, so long as you create it with the GL_STREAM_DRAW usage hint and orphan it immediately after each upload.

So you read data into the buffer, then use glTexSubImage2D to upload it. Then you call glBufferData with a NULL data pointer (and the same size) to orphan it.

The two-PBO method is more guaranteed to give you fast streaming performance. But most implementations are pretty good about proper orphaning support, so this ought to work fine. In this case, you only need one buffer object for all of your uploading; just orphan the buffer after every glTexSubImage call.
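The orphaning approach described above looks roughly like this per frame. Again this is a sketch: it assumes a current GL context, an existing buffer `pbo` and texture `tex`, and a hypothetical `readNextFrame()` helper for the disk read.

```cpp
// Single-PBO streaming with orphaning, one iteration per frame.
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);

// Fill the buffer with the next frame
void* ptr = glMapBuffer(GL_PIXEL_UNPACK_BUFFER, GL_WRITE_ONLY);
readNextFrame(ptr);                     // hypothetical disk-read helper
glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER);

// Kick off the asynchronous upload from the PBO into the texture
glBindTexture(GL_TEXTURE_2D, tex);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, W, H,
                GL_BGRA, GL_UNSIGNED_BYTE, 0);

// Orphan: give the buffer a fresh data store so the next glMapBuffer
// does not stall waiting for the in-flight transfer to finish
glBufferData(GL_PIXEL_UNPACK_BUFFER, FRAME_BYTES, NULL, GL_STREAM_DRAW);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
```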

Do you mean that as soon as the asynchronous glTexSubImage call returns, while it is still in the process of moving data from the buffer to the texture, I release that buffer from the PBO and attach a new one, which I can fill with video data from disk?

This means that while the old buffer is being moved into the texture, I fill a new buffer with the next frame?

Seems reasonable enough, considering the buffer data will be in different parts of memory, but would it not mean that I am constantly allocating device memory, copying to it, allocating new memory, copying to it, and so on, rather than having two PBOs which each keep the same buffer assigned permanently?

Might I also ask: when calling glMapBuffer, is the returned pointer to memory on the graphics card mapped into normal address space, or is it really a pointer to system memory?
I assume it is mapped card memory, as the name implies, but I would like to be absolutely sure when considering how best to write the import routine.
That would mean that the “slow” part is writing data into the buffer, while the copy to the texture is very fast.

Thanks for the help thus far :)