Pitfalls of rendering to texture.

Does anyone know if there are operations that will be faster if you render to a normal pbuffer and copy versus using the render to texture ext / render to depth texture?

The render_to_texture extension should be faster, as it eliminates the copy entirely. I think there is a bug with it at the moment that makes it slower, but if you’re writing for several months down the line or more, I’d use render_to_texture, as it will be fixed and will be faster than all the others… one day.

Nutty

Or, if you use ATi hardware: it has been faster than all other methods for 9-10 months or so already…

It is in no way guaranteed to be faster. In fact, I can think of many reasons it could be slower.

  • Matt

It tends to be, though. There are never any guarantees; it may all be done in software. But realistically, with proper hardware support and proper drivers, binding a pbuffer directly as a texture should be faster in most cases. Sure, I can think of some situations where doing a copy may be faster, for instance when the pbuffer is rendered to in a format that’s less suitable for texture access, such as a scanline-based format, and is then accessed very often later on. Then it would be better to do a copy and let the driver set it up in a swizzled format that’s better for texture access.
In most realistic cases, though, it should be faster to bind it directly.
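To illustrate why a swizzled layout can be friendlier to texture access than a scanline layout, here is a minimal sketch. It assumes Z-order (Morton) addressing as a stand-in for "swizzled"; real hardware layouts are vendor-specific and not documented in this thread, and the function names and the 256-texel width are made up for the example.

```python
def linear_offset(x, y, width):
    """Texel index in a scanline (row-major) layout."""
    return y * width + x


def morton_offset(x, y, bits=16):
    """Texel index in a Z-order (Morton) layout: interleave the bits of x and y."""
    offset = 0
    for bit in range(bits):
        offset |= ((x >> bit) & 1) << (2 * bit)      # x bits go to even positions
        offset |= ((y >> bit) & 1) << (2 * bit + 1)  # y bits go to odd positions
    return offset


def block_span(offset_fn, x0, y0, size=2):
    """Distance between the lowest and highest index touched by a size x size texel block."""
    offsets = [offset_fn(x0 + dx, y0 + dy)
               for dy in range(size) for dx in range(size)]
    return max(offsets) - min(offsets)


WIDTH = 256  # hypothetical texture width, for illustration only

linear_span = block_span(lambda x, y: linear_offset(x, y, WIDTH), 4, 4)
morton_span = block_span(morton_offset, 4, 4)
print("2x2 footprint span, linear:", linear_span)
print("2x2 footprint span, Morton:", morton_span)
```

A bilinear fetch touches a 2x2 footprint. In the linear layout the two rows of that footprint sit a full scanline apart, while in the Morton layout an aligned 2x2 block is contiguous. That is the usual argument for swizzled textures being better to read from, and it says nothing about how fast they are to render to.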

nVidia is having some serious issues with render to texture in all currently available drivers. In my experience it’s at least 2 to 3 times slower than doing the copy yourself. I still don’t get what the point of render to texture is if it does the copy. Even if it does, I really don’t see how it’s any easier, and to hell with it anyway since it’s so damn slow.

This is something that really really really really really really really really really really really really really really needs to be fixed. PLEASE !!!

Devulon

In D3D you have control over what type of format the driver is using (swizzled or not). Why does the driver let you initialize a render texture if it’s just going to copy? Is the format the only thing that can cause it to be slower? Is there no OpenGL way to specify the format?

Can you please specify the exact driver version and hardware you are using where you see “2 or 3 times” slower performance with RTT? That isn’t in line with what recent drivers should be doing, though earlier drivers did have a fairly significant performance issue that has since been fixed. Also, can you give any other further specific details? There are a few unlucky operations that just completely die when used with RTT.

Realistically, RTT will have slower fill rate but a smaller fixed overhead than CopyTexImage. Which wins out is not obvious. RTT is not a panacea; if you’ve gotten that impression, it’s the wrong impression.
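The trade-off described here (slower fill rate, smaller fixed overhead) can be put into a toy cost model. This is only a sketch: every number below is a hypothetical placeholder, not a measurement of any driver or card, and the copy cost is simply folded into the fixed overhead of the pbuffer path.

```python
def rtt_time(pixels, overhead_rtt, fill_rtt):
    """Time to render `pixels` directly into the texture."""
    return overhead_rtt + pixels / fill_rtt


def copy_time(pixels, overhead_copy, fill_pbuffer):
    """Time to render `pixels` into a pbuffer; the copy is part of the fixed overhead."""
    return overhead_copy + pixels / fill_pbuffer


def break_even_pixels(overhead_rtt, fill_rtt, overhead_copy, fill_pbuffer):
    """Rendered pixel count above which the copy path becomes cheaper than RTT."""
    extra_per_pixel = 1.0 / fill_rtt - 1.0 / fill_pbuffer
    if extra_per_pixel <= 0:
        return float("inf")  # RTT fills at least as fast, so it never loses on fill
    return max(0.0, (overhead_copy - overhead_rtt) / extra_per_pixel)


# Hypothetical figures: microseconds of fixed overhead, pixels per microsecond.
OVERHEAD_RTT, FILL_RTT = 10.0, 500.0
OVERHEAD_COPY, FILL_PBUFFER = 50.0, 1000.0

small = 64 * 64 * 4     # four full passes over a 64x64 target
large = 512 * 512 * 4   # four full passes over a 512x512 target

for name, pixels in (("64x64", small), ("512x512", large)):
    print(name,
          "RTT:", rtt_time(pixels, OVERHEAD_RTT, FILL_RTT),
          "copy:", copy_time(pixels, OVERHEAD_COPY, FILL_PBUFFER))
print("break-even pixels:",
      break_even_pixels(OVERHEAD_RTT, FILL_RTT, OVERHEAD_COPY, FILL_PBUFFER))
```

With these made-up numbers the small target favours RTT and the large one favours the copy, which is the "not obvious" part: the answer moves with how much you draw into the buffer. Plug in measured values before drawing any conclusion.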

If you want to render to a linear format, that’s what RTTR (render to texture rectangle) is for. That, or a standard pbuffer with CopyTexImage.

  • Matt

I might as well add: please provide an application with source code that shows the performance difference.

  • Matt

Is it abnormally slower or just a little? Are there any good performance docs? Isn’t rendering to a swizzled format faster? So if I specify texture rectangle then change all texture coordinates to be 0…width vs 0…1 will that be overall faster? Or are linear textures as slow to render with as swizzled textures are slow to render to?
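On the coordinate change mentioned here: texture rectangles are addressed in texels rather than in the normalized 0…1 range, so existing coordinates need scaling by the texture size. A minimal sketch, with a hypothetical helper name:

```python
def to_rectangle_coords(s, t, width, height):
    """Map normalized (s, t) in 0..1 to texture-rectangle coordinates in 0..width, 0..height."""
    return s * width, t * height


# Corners and centre of a hypothetical 640x480 rectangle texture.
for s, t in ((0.0, 0.0), (0.5, 0.5), (1.0, 1.0)):
    print((s, t), "->", to_rectangle_coords(s, t, 640, 480))
```

Keep in mind that the conversion is the easy part. Rectangle textures also come with restrictions (no mipmaps, clamp-style wrap modes only), so whether the switch is a net win depends on how the texture is used afterwards.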

I guess all of this greatly depends on how much rendering you need to do into the buffer versus the size of the buffer. An annoying problem.

A question:
For something that might only do 4 full-size blending passes on a 64x64 RGB texture, will it most definitely be fastest to use render to texture?

Originally posted by JelloFish:
Isn’t rendering to a swizzled format faster?

The opposite is likely to be true.

  • Matt

Originally posted by mcraighead:
It is in no way guaranteed to be faster. In fact, I can think of many reasons it could be slower.

  • Matt

What’s the point in this extension then?
If pbuffers+glcopysubimage work faster?
Am I missing some other point to rendertotexture?

So then how much slower is rendering into a swizzled texture than texturing from a linear texture?
And if the speed is bad for both solutions, is the speed of a copy from linear memory to swizzled memory, the “best” solution, enough to warrant complete dismissal of RTT as an option?

It seems like RTT’s only advantage is memory savings.

I guess the speed of texturing from swizzled or linear entirely depends on the orientation of the texture, right?

All this talk about memory formats makes me think that all research should go into improving memory bandwidth. I mean, it seems to me that having to keep your color buffer as an interleaved RGB stream is just plain cumbersome. In a couple of years all this talk will be obsolete and we will be kicking ourselves for not putting more effort into speeding memory up sooner (before vertex programmability, among other things). Imagine if you could control pixels the way you control vertices, with r separate from g and b, and no concern over what underlying format everything is using.

More Ranting:

It’s the whole 6-month cycle thing: it focuses on whiz-bang features versus the fundamentally different technology that is needed. It’s a very self-destructive system.

Actually, nVidia’s RTDT demo goes slower drawing a teapot than my game runs using CTDT.

Originally posted by knackered:
What’s the point in this extension then?
If pbuffers+glcopysubimage work faster?

Speed or no speed, try dynamically generating a 512x512 texture for a 640x480 display without pbuffers. Then you’ll know the answer.
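The point can be checked with simple arithmetic: without a pbuffer, the image has to be drawn into the window's own framebuffer before it is copied, so it cannot be larger than the window. A small sketch, assuming power-of-two square textures (the common requirement at the time) and hypothetical helper names:

```python
def fits_in_window(tex_w, tex_h, win_w, win_h):
    """True if a tex_w x tex_h image can be rendered inside the window framebuffer."""
    return tex_w <= win_w and tex_h <= win_h


def largest_pot_square(win_w, win_h):
    """Largest power-of-two square side that fits inside the window."""
    side = 1
    while side * 2 <= min(win_w, win_h):
        side *= 2
    return side


print(fits_in_window(512, 512, 640, 480))
print(largest_pot_square(640, 480))
```

On a 640x480 display the 512-texel height does not fit in 480 rows, so the back-buffer-and-copy approach tops out at a smaller texture unless you render in tiles. A pbuffer has its own dimensions and removes that limit.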

[This message has been edited by LordKronos (edited 05-09-2002).]

JelloFish,

Faster memory throughput is the major push in all the hardware generations, and has been for a long time. Compare the pitiful 130 MB/s you get with a cheap 64 bit SDR card to the 10 GB/s they promise on the 256-bit high-clocked DDR cards.

Further, nVIDIA has previously pushed out new features on a “same speed” card once a year, and done a speed bump once a year, on staggered “6 month” intervals.

TNT2 -> GeForce1 was a feature upgrade; GeForce1 -> GeForce2 was speed bump; GeForce2 -> GeForce3 was a feature upgrade; GeForce3 -> GeForce4 was a speed bump; …

So I don’t see what your beef is. They keep giving us faster memory, and more things we can do with that faster memory, all at consumer prices. As do ATI, and even SiS, Matrox and now 3dlabs! I think the thing to complain about would be “Intel Built-in 3D Graphics”, as it drags the entire bottom level of the market down with it :(

Originally posted by LordKronos:
Speed or no speed, try dynamically generating a 512x512 texture for a 640x480 display without pbuffers. Then you’ll know the answer.

No, I’m saying WITH pbuffers, why is there a need for render-to-texture if it’s actually slower?