Pitfalls of rendering to texture



JelloFish
05-08-2002, 08:13 AM
Does anyone know of operations that are faster if you render to a normal pbuffer and copy, versus using the render-to-texture extension / render to depth texture?

Nutty
05-08-2002, 08:18 AM
Render_to_texture extension should be faster, as it eliminates the copy totally. I think there is a bug with this at the moment, which makes it slower, but if you're writing for at least several months down the line, I'd use render_to_texture, as it will be fixed, and be faster than all the others.... one day.

Nutty

Humus
05-08-2002, 09:45 AM
Or if you use ATi hardware: it has been faster than all other methods for 9-10 months or so already ...

mcraighead
05-08-2002, 12:10 PM
It is in no way guaranteed to be faster. In fact, I can think of many reasons it could be slower.

- Matt

Humus
05-08-2002, 12:52 PM
It tends to be, though. There are never any guarantees; it may all be done in software. But realistically, with proper hardware support and proper drivers, in most cases binding a pbuffer directly as a texture should be faster. Sure, I can think of some situations where doing a copy may be faster, for instance when the pbuffer is rendered to in a format that's less suitable for texture access, such as a scanline-based format, and is then accessed very often later on. Then it would be better to do a copy and let the driver set it up in a swizzled format that's better for texture access.
In most realistic cases though it should be faster to bind it directly.

Devulon
05-08-2002, 12:55 PM
nVidia is having some serious issues with render to texture in all currently available drivers. In my experience it's at least 2 to 3 times slower than doing the copy yourself. I still don't get what the point of render to texture is if it does the copy. Even if it does, I really don't see how it's any easier, and to hell with it anyway since it's so damn slow.

This is something that really really really really really really really really really really really really really really needs to be fixed. PLEASE !!!!!!!!

Devulon

JelloFish
05-08-2002, 02:05 PM
In D3D you have control over what type of format the driver is using (swizzled or not). Why does the driver let you initialize a render texture if it's just going to copy? Is the format the only thing that can cause it to be slower? Is there no OpenGL way to specify the format?

mcraighead
05-08-2002, 06:33 PM
Can you please specify the exact driver version and hardware you are using where you see "2 or 3 times" slower performance with RTT? That isn't in line with what recent drivers should be doing, though earlier drivers did have a fairly significant performance issue that has since been fixed. Also, can you give any other further specific details? There are a few unlucky operations that just completely die when used with RTT.

Realistically, RTT will have slower fill rate but a smaller fixed overhead than CopyTexImage. Which wins out is not obvious. RTT is *not* a panacea; if you've gotten that impression, it's the wrong impression.
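The trade-off described above (slower fill rate but smaller fixed overhead) can be put into a toy cost model. This is only a sketch: the function names and every number in it are invented for illustration and are not measurements from any driver or card.

```python
# Toy cost model for "RTT vs. pbuffer + copy".
# All rates and overheads below are hypothetical, chosen only to show
# how a fixed cost and a per-pixel cost trade off against each other.

def rtt_time_us(pixels_drawn, fill_px_per_us, fixed_us):
    """Time to render directly into the texture (no copy step)."""
    return fixed_us + pixels_drawn / fill_px_per_us


def copy_time_us(pixels_drawn, texels, fill_px_per_us, copy_px_per_us, fixed_us):
    """Time to render into a pbuffer and then copy it into a texture."""
    render = pixels_drawn / fill_px_per_us
    copy = texels / copy_px_per_us
    return fixed_us + render + copy


def compare(size, passes):
    """Return (rtt, copy) estimates in microseconds for a size x size target."""
    texels = size * size
    drawn = texels * passes
    # Assumed: RTT fills at half the pbuffer rate but has a smaller fixed cost.
    rtt = rtt_time_us(drawn, fill_px_per_us=400.0, fixed_us=20.0)
    cpy = copy_time_us(drawn, texels, fill_px_per_us=800.0,
                       copy_px_per_us=1000.0, fixed_us=100.0)
    return rtt, cpy


if __name__ == "__main__":
    for size in (64, 1024):
        rtt, cpy = compare(size, passes=4)
        winner = "RTT" if rtt < cpy else "copy"
        print(f"{size}x{size}: RTT {rtt:.2f} us, copy {cpy:.2f} us -> {winner}")
```

With these made-up inputs the small 64x64 target favors RTT because the fixed overhead dominates, while the 1024x1024 target favors the copy path because fill rate dominates. Different inputs move the break-even point, which is why the winner is not obvious without measuring.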

If you want to render to a linear format, that's what RTTR (render to texture rectangle) is for. That, or a standard pbuffer with CopyTexImage.

- Matt

mcraighead
05-08-2002, 06:36 PM
I might as well add: providing an application with source code that shows the performance difference would help too.

- Matt

JelloFish
05-08-2002, 09:16 PM
Is it abnormally slower or just a little? Are there any good performance docs? Isn't rendering to a swizzled format faster? So if I specify texture rectangle then change all texture coordinates to be 0..width vs 0..1 will that be overall faster? Or are linear textures as slow to render with as swizzled textures are slow to render to?
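The coordinate change asked about here is just a scale: rectangle textures are addressed in texels (0..width, 0..height) rather than in the normalized 0..1 range. A minimal sketch, with a hypothetical helper name:

```python
def to_rectangle_coords(s, t, width, height):
    """Convert normalized (s, t) in 0..1 to texel-space coordinates
    as used by rectangle textures (0..width, 0..height)."""
    return s * width, t * height


if __name__ == "__main__":
    # The center of a 64x64 rectangle texture.
    print(to_rectangle_coords(0.5, 0.5, 64, 64))
```

Whether the switch is faster overall is a separate question that this conversion says nothing about.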

I guess all of this greatly depends on how much rendering you need to do into the buffer versus the size of the buffer. An annoying problem.

A question:
For something that might only do 4 full-size blending passes on a 64x64 RGB texture, will it most definitely be fastest to use render to texture?

mcraighead
05-08-2002, 09:27 PM
Originally posted by JelloFish:
Isn't rendering to a swizzled format faster?

The opposite is likely to be true.

- Matt

knackered
05-08-2002, 11:44 PM
Originally posted by mcraighead:
It is in no way guaranteed to be faster. In fact, I can think of many reasons it could be slower.

- Matt

What's the point in this extension then?
If pbuffers+glcopysubimage work faster?
Am I missing some other point to rendertotexture?

JelloFish
05-09-2002, 12:43 AM
So then how much slower is rendering into a swizzled texture than texturing from a linear texture?
And if the speed is bad for both solutions, is a copy from linear memory to swizzled memory enough of a "best" solution to warrant complete dismissal of RTT as an option?

It seems like RTT's only advantage is memory savings.

JelloFish
05-09-2002, 12:44 AM
I guess the speed of texturing from swizzled or linear depends entirely on the orientation of the texture, right?

JelloFish
05-09-2002, 12:58 AM
All this talk about memory formats makes me think that all research should go into improving memory bandwidth. I mean, it seems to me that having to keep your color buffer as an interleaved RGB stream is just plain cumbersome. In a couple of years all this talk will be obsolete and we will be kicking ourselves for not putting more effort into speeding memory up sooner (before vertex programmability, among other things). Imagine if you could control pixels the way you control vertices, with R separate from G and B, and no concern over what underlying format everything is using.

JelloFish
05-09-2002, 01:02 AM
More Ranting:

It's the whole 6-month cycle thing: it focuses on whiz-bang features versus the fundamentally different technology that is needed. It's a very self-destructive system.

JelloFish
05-09-2002, 10:46 AM
Actually, nVidia's RTDT demo goes slower drawing a teapot than my game runs using CTDT.

LordKronos
05-09-2002, 10:54 AM
Originally posted by knackered:
What's the point in this extension then?
If pbuffers+glcopysubimage work faster?


Speed or no speed, try dynamically generating a 512x512 texture for a 640x480 display without pbuffers. Then you'll know the answer.




jwatte
05-09-2002, 01:22 PM
JelloFish,

Faster memory throughput is the major push in all the hardware generations, and has been for a long time. Compare the pitiful 130 MB/s you get with a cheap 64 bit SDR card to the 10 GB/s they promise on the 256-bit high-clocked DDR cards.

Further, nVIDIA has previously pushed out new features on a "same speed" card once a year, and done a speed bump once a year, on staggered "6 month" intervals.

TNT2 -> GeForce1 was a feature upgrade; GeForce1 -> GeForce2 was speed bump; GeForce2 -> GeForce3 was a feature upgrade; GeForce3 -> GeForce4 was a speed bump; ...

So I don't see what your beef is. They keep giving us faster memory, and more things we can do with that faster memory, all at consumer prices. As is ATI, and even SiS, Matrox and now 3dlabs! I think the thing to complain about would be "Intel Built-in 3D Graphics" as it drags the entire bottom level of the market down with it :-(

knackered
05-09-2002, 01:28 PM
Originally posted by LordKronos:
Speed or no speed, try dynamically generating a 512x512 texture for a 640x480 display without pbuffers. Then you'll know the answer.

No, I'm saying WITH pbuffers, why is there a need for render-to-texture if it's actually slower?

Won
05-09-2002, 03:02 PM
Jellofish - out of curiosity, what is RTDT or CTDT?

Also, the reason why textures are swizzled is so that you have good spatial locality in your texture memory fetches. They are probably organized so they fit in memory words or texture cache lines. The orientation of the texture doesn't really figure into it. Additional memory bandwidth is a good thing overall, but it is pretty orthogonal to the issue of swizzled v. linear.
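To make the locality argument concrete, here is a small sketch that counts how many memory lines a 16-texel walk touches in a linear layout versus a bit-interleaved (Morton, or Z-order) layout. Morton order is only an assumed stand-in; the thread does not say what swizzle pattern any real hardware uses, and the 16-texel line size is likewise hypothetical.

```python
def linear_offset(x, y, width):
    """Texel index in a row-major (scanline) layout."""
    return y * width + x


def morton_offset(x, y):
    """Texel index with the bits of x and y interleaved (Z-order).
    x bits land on even positions, y bits on odd positions."""
    offset = 0
    bit = 0
    while (x >> bit) or (y >> bit):
        offset |= ((x >> bit) & 1) << (2 * bit)
        offset |= ((y >> bit) & 1) << (2 * bit + 1)
        bit += 1
    return offset


def lines_touched(offsets, texels_per_line):
    """Number of distinct memory lines covering the given texel indices."""
    return len({o // texels_per_line for o in offsets})


WIDTH = 64   # hypothetical 64x64 texture
LINE = 16    # hypothetical line size, in texels

horizontal = [(x, 0) for x in range(16)]
vertical = [(0, y) for y in range(16)]

if __name__ == "__main__":
    for name, walk in (("horizontal", horizontal), ("vertical", vertical)):
        lin = lines_touched([linear_offset(x, y, WIDTH) for x, y in walk], LINE)
        mor = lines_touched([morton_offset(x, y) for x, y in walk], LINE)
        print(f"{name}: linear touches {lin} lines, swizzled touches {mor} lines")
```

In this sketch the linear layout is ideal for the horizontal walk and poor for the vertical one, while the interleaved layout costs the same in both directions. That fits the point that orientation does not figure into swizzled access but does matter for a linear format.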

Why would you want RTT v. pbuffer copy? I think it depends on how long it takes to render, how long it takes to copy, how much video memory is available, and how many times you plan on using the texture. The different techniques are better/worse depending on the situation.

-Won

zed
05-09-2002, 08:55 PM
>> Speed or no speed, try dynamically generating a 512x512 texture for a 640x480 display without pbuffers. Then you'll know the answer.
No, I'm saying WITH pbuffers, why is there a need for render-to-texture if it's actually slower? <<

LordKronos meant it's impossible to do a 512x512 texture in a 640x480 window, as any framebuffer contents that aren't on the screen are undefined.
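The constraint can be stated as a one-line check. This is a sketch with a hypothetical helper name; it only tests whether a texture-sized region fits entirely inside the window, which is the precondition for copying it out of the window's framebuffer with defined contents.

```python
def fits_in_window(tex_w, tex_h, win_w, win_h):
    """True if a tex_w x tex_h region lies fully inside a win_w x win_h window."""
    return tex_w <= win_w and tex_h <= win_h


if __name__ == "__main__":
    # 512 wide fits in 640, but 512 tall does not fit in 480.
    print(fits_in_window(512, 512, 640, 480))
    print(fits_in_window(256, 256, 640, 480))
```

A pbuffer sidesteps the check because its size is independent of the window.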

JelloFish
05-10-2002, 12:25 AM
Sorry, I just mean rendering to a depth texture or copying to a depth texture. True, they can exist orthogonally; all I'm saying is that the greatest whiz-bang features are useless without more memory bandwidth, enough that swizzled versus linear just isn't an issue. (I realize this may never be the case.)

It seems to me like unless you are doing the simplest renders available it is always faster to render to a linear pbuffer and swizzle to your texture. RTT seems like a useless extension in this way.

Devulon
05-10-2002, 02:07 AM
Originally posted by JelloFish:
Sorry, I just mean rendering to a depth texture or copying to a depth texture. True, they can exist orthogonally; all I'm saying is that the greatest whiz-bang features are useless without more memory bandwidth, enough that swizzled versus linear just isn't an issue. (I realize this may never be the case.)

It seems to me like unless you are doing the simplest renders available it is always faster to render to a linear pbuffer and swizzle to your texture. RTT seems like a useless extension in this way.

knackered
05-10-2002, 11:35 AM
Originally posted by zed:
>> Speed or no speed, try dynamically generating a 512x512 texture for a 640x480 display without pbuffers. Then you'll know the answer.
No, I'm saying WITH pbuffers, why is there a need for render-to-texture if it's actually slower? <<

LordKronos meant it's impossible to do a 512x512 texture in a 640x480 window, as any framebuffer contents that aren't on the screen are undefined.

Yes, I'm aware of what he meant, but my question was not "what is the point in the pbuffer extension?" - hope I've clarified myself.

Moshe Nissim
05-10-2002, 12:44 PM
Originally posted by mcraighead:

If you want to render to a linear format, that's what RTTR (render to texture rectangle) is for.
- Matt

Is a rectangle texture always stored in linear format? Regardless of render to texture ops?
If I want a fast window-to-texture copy, would then a rectangle texture do a better job? (presumably because both it and the window are linearly ordered). Assuming that texture would later be rendered in the same orientation (not 90 degrees rotated), I assume I will not suffer much performance drop from the fact that it is linearly ordered.

mcraighead
05-10-2002, 05:28 PM
RTTR is likely faster to render to but slower to texture from (unless your texture is oriented right).

It's also more restricted, e.g., no mipmaps or wrapping. It's not generally applicable.

- Matt

Moshe Nissim
05-10-2002, 09:41 PM
Originally posted by mcraighead:
RTTR is likely faster to render to but slower to texture from
- Matt

I don't actually need RTTR in this case. The question is about rectangle textures in general. Are they always stored in linear order? I need to copy into the texture with glCopyTexImage2D from a framebuffer window, so I thought a linear-to-linear copy would be faster. Is this correct?

mcraighead
05-11-2002, 09:02 AM
Copying to linear is probably not any faster than copying to swizzled, but I haven't checked.

- Matt

Moshe Nissim
05-11-2002, 10:30 AM
Originally posted by mcraighead:
Copying to linear is probably not any faster than copying to swizzled, but I haven't checked.
- Matt

But are rectangle textures guaranteed to be linear, regardless of whether they are ever bound for RTTR?

