OT: HL2 FSAA prob, what did they mean?

I’m sure you’ve read this news on other websites. It’s a bit OT, I know, plus it’s a DX engine.
But I was really wondering about what they said:

It’s a problem for any app that packs small textures into larger textures.
(Valve)

Did they mean that they use like one big texture that actually holds multiple textures?
Now why would you want to do that?

Now why would you want to do that?

To reduce the number of render state switches.

It’s not just the cost of texture binding - being able to render more stuff without changing renderstate means fewer, larger batches of rendering primitives, and this is usually touted as the key to good performance these days.
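
Something like this, roughly (just a sketch with made-up names, not anything from Valve’s code): the packer records where each sub-texture ended up in the atlas, the quads’ texcoords get remapped into that rectangle, and then everything goes out with one bind and one batched draw call.

    // Hypothetical sketch (names are mine, not Valve's): remap each packed
    // sub-texture's local UVs into one shared atlas so a pile of quads can be
    // drawn with a single texture bind and a single batched draw call.
    #include <GL/gl.h>
    #include <vector>

    struct SubTexture { float u0, v0, u1, v1; };   // where the image sits in the atlas, in [0,1]
    struct Vertex     { float x, y, u, v; };

    // Convert a local (0..1) texcoord of a sub-texture into atlas space.
    static void toAtlas(const SubTexture& s, float lu, float lv, float& au, float& av)
    {
        au = s.u0 + lu * (s.u1 - s.u0);
        av = s.v0 + lv * (s.v1 - s.v0);
    }

    // Draw one unit quad per sub-texture, all from the same atlas texture.
    void drawQuads(GLuint atlasTex, const std::vector<SubTexture>& subs,
                   const std::vector<float>& quadOrigins)   // x,y pairs, one per quad
    {
        static const float lu[4] = {0, 1, 1, 0}, lv[4] = {0, 0, 1, 1};
        std::vector<Vertex> verts;
        for (size_t i = 0; i < subs.size(); ++i) {
            float x = quadOrigins[2 * i], y = quadOrigins[2 * i + 1];
            for (int c = 0; c < 4; ++c) {
                Vertex v;
                v.x = x + lu[c];                       // unit quad, corner offsets reuse lu/lv
                v.y = y + lv[c];
                toAtlas(subs[i], lu[c], lv[c], v.u, v.v);
                verts.push_back(v);
            }
        }
        glBindTexture(GL_TEXTURE_2D, atlasTex);            // one bind...
        glEnableClientState(GL_VERTEX_ARRAY);
        glEnableClientState(GL_TEXTURE_COORD_ARRAY);
        glVertexPointer(2, GL_FLOAT, sizeof(Vertex), &verts[0].x);
        glTexCoordPointer(2, GL_FLOAT, sizeof(Vertex), &verts[0].u);
        glDrawArrays(GL_QUADS, 0, (GLsizei)verts.size());  // ...one batch
        glDisableClientState(GL_TEXTURE_COORD_ARRAY);
        glDisableClientState(GL_VERTEX_ARRAY);
    }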

And in case you want to know... here’s a bit by ATI’s sireric on the problem and on why it’s likely that only ATI will be able to have a solution:

It’s not a special mode of AA. It has to do with the method used to sample textures. If you render a partial pixel, you need to sample the texture at that pixel to get a color for the fragment. What texture coordinate do you use for a partially covered pixel? You can use the pixel center (which is what is done in current drivers/specs), or, the R3xx allows you to sample at the centroid of the fragment.

Both systems have problems. Center of pixel can lead to sampling outside the texture, and then the application has to create a buffer region outside to compensate (but that buffer region can be unlimited, depending on the pixel/texel ratio). If you use centroid, you avoid that problem, but can create new ones in cases where you only want to sample a texel once (i.e. menus and text). We would need a way to allow the application to specify the kind of sampling wanted on each texture. That’s not in the API now. However, the HW can do it. We might be able to do something in the control panel.
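
For what it’s worth, that per-fragment “centroid” sampling later showed up in GLSL as an interpolation qualifier. A minimal sketch of what it looks like there (illustration only, and obviously newer than the hardware and API sireric is talking about):

    // Illustration only: the "centroid" idea as it later appeared in GLSL.
    // A centroid-qualified varying is interpolated at a point inside the covered
    // part of the pixel rather than at the pixel centre, so a partially covered
    // edge pixel can't fetch texels from outside its sub-texture.
    const char* vsSrc =
        "#version 130\n"
        "in vec2 position;\n"
        "in vec2 texcoord;\n"
        "centroid out vec2 uv;\n"          // centroid-interpolated varying
        "void main() {\n"
        "    uv = texcoord;\n"
        "    gl_Position = vec4(position, 0.0, 1.0);\n"
        "}\n";

    const char* fsSrc =
        "#version 130\n"
        "uniform sampler2D atlas;\n"
        "centroid in vec2 uv;\n"
        "out vec4 fragColor;\n"
        "void main() { fragColor = texture(atlas, uv); }\n";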

Brilliant.

Epic doesn’t understand texture filtering.

Valve doesn’t understand texture filtering.

Who’s next?

From a signal theory pov (which is the only way to go for texture filtering), “centroid sampling” or whatever they like to call it is pure nonsense.

Q: My audio samples exhibit an annoying click at the end. What can I do?
A: Duplicate and repeat the last sample.

shudder

I’m agreeing with Zeckensack here, unless the above post was made by people not directly involved with the game in question. What they’re saying really doesn’t make sense. And it doesn’t make sense on several levels.

What I find really bizarre is that they apparently only just noticed this was a problem. The game has been in development for several years, and 70 days from release they announce they are trying to resolve an issue with FSAA? And if they didn’t just notice it, why couldn’t they have fixed it for the NV30?

I mean, I never use FSAA myself, except that I test my software with it and with AF and multisampling, the reason being that those settings have a habit of causing me problems.

I don’t know much about AA, but I always thought it’s done on the framebuffer and has nothing to do with texturing at all, does it?
And I don’t understand what AA has to do with texture filtering. Could someone explain this to me?

Jan.

I don’t know much about AA, but I always thought it’s done on the framebuffer and has nothing to do with texturing at all, does it?

Well, there are antialiasing techniques that aren’t all done in the framebuffer, but they aren’t discussing those techniques.

And I don’t understand what AA has to do with texture filtering. Could someone explain this to me?

Which is why we’re of the belief that they’re idiots. There really isn’t a connection there, but they believe there is one.

Originally posted by MikeC:
Now why would you want to do that?

To reduce the number of render state switches.

It’s not just the cost of texture binding - being able to render more stuff without changing renderstate means fewer, larger batches of rendering primitives, and this is usually touted as the key to good performance these days.

I don’t think that nowadays it’s the real bottleneck, especially when using tons of shaders. I mean, there are a lot of posts around here where NVIDIA guys have been saying that front-to-back rendering was more important than reducing the number of state switches.
Packing small textures into larger ones sounds to me like it brings more problems than it solves: it limits filtering techniques, causes mipmap trouble, and by extension this FSAA problem.

I think it’s a bad idea not to at least offer the choice to switch off this “texture packing”.

What really bothers me in this situation is that Valve is now blaming NVIDIA, ATI and Microsoft/DirectX for the consequences of their own decisions.

Originally posted by Jan2000:
And I don’t understand what AA has to do with texture filtering. Could someone explain this to me?

Jan.
If your renderer breaks upon changing resolution, it will break upon activating AA. This is a very general statement; different things can go bad depending on the exact AA method.
But it’s about the only thing that’s true about potential AA issues.

(well, there are obvious issues with R2T postfilter effects, but let’s not complicate things further)


NVIDIA guys have been saying that front-to-back rendering was more important than reducing the number of state switches.

They’re both important; it’s impossible to say which matters more, because it depends on whether your app is fill-rate or CPU limited. If your app is fill-rate limited then front-to-back rendering will help. If you’re limited by the CPU, or by the bottlenecks introduced by state switches, then clearly those need to be reduced.

Personally, one of the most obvious optimisations with regard to texturing is to reduce glBindTexture calls. Packing textures into fewer big textures would be near the top of my optimisation list. I don’t think it was a poor decision by Valve to do that; I would have thought most commercial games do the same.

Originally posted by Jan2000:
And I don’t understand what AA has to do with texture filtering. Could someone explain this to me?

Jan.

It has to do with polygon edges: if you multisample (i.e. you do only edge AA, as opposed to supersampling), you may end up sampling outside the polygon, which may cause sampling outside the texture if it’s “tightly” snapped to the poly.

Thus, if you are packing different smaller textures inside a bigger texture, with this kind of AA you may end up sampling texels from a neighbouring smaller texture.

I must admit so far I almost ignored parts of specs regarding multisampling, so excuse me if I say BS.

It has to do with polygon edges: if you multisample (i.e. you do only edge AA, as opposed to supersampling), you may end up sampling outside the polygon, which may cause sampling outside the texture if it’s “tightly” snapped to the poly.

Thus, if you are packing different smaller textures inside a bigger texture, with this kind of AA you may end up sampling texels from a neighbouring smaller texture.
Are you sure about this? Let’s forget about packing small textures for a while. Take a simple black-and-white 2x2 chessboard texture, with the wrap mode set to “repeat” (so that the image is tileable, but discontinuous at the edges). Let’s render a textured quad with texcoords covering the entire texture: {(0,0), (1,0), (1,1), (0,1)}. Now, if what you said were true, such a quad rendered in MSAA mode could have awful errors at the edges? I can’t believe multisampling is that broken… If the HW is supposed to “reject” samples failing depth and stencil tests, then why would it ever be allowed to “accept” samples falling outside the polygon?

Going back to HL2, could somebody explain this:

  1. Would a simple 1-texel-wide border around each packed sub-texture solve the problem? (the border filled with data according to the desired wrap mode of the sub-texture; a rough sketch follows this list)

  2. Is this problem specific to DX? DX’s rules for bilinear sampling are said to differ in the details from those of GL.
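
For what it’s worth, regarding 1: a 1-texel border is easy enough to add when the packer copies each sub-image into the atlas. A rough sketch of what I mean (clamp-style border, names mine, nothing HL2-specific):

    // Sketch: copy a w x h RGBA sub-image into an atlas at (ax, ay), surrounded by
    // a 1-texel border that replicates the sub-image's edge texels (clamp-style).
    // For a repeating sub-texture you would copy from the opposite edge instead.
    typedef unsigned int texel;   // one packed RGBA8 texel

    void blitWithBorder(texel* atlas, int atlasW,
                        const texel* img, int w, int h,
                        int ax, int ay)   // top-left of the interior, border excluded
    {
        for (int y = -1; y <= h; ++y) {
            int sy = y < 0 ? 0 : (y >= h ? h - 1 : y);          // clamp source row
            for (int x = -1; x <= w; ++x) {
                int sx = x < 0 ? 0 : (x >= w ? w - 1 : x);      // clamp source column
                atlas[(ay + y) * atlasW + (ax + x)] = img[sy * w + sx];
            }
        }
    }

The quad’s texcoords would then address only the interior w x h region, so a bilinear tap that strays up to half a texel outside it still lands on the replicated border rather than on a neighbouring sub-texture. Whether that’s still enough once multisampling pushes the sample point further out is exactly the question.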

Here’s some screens of a little test-app:

No FSAA: http://members.shaw.ca/dwkjo/screenshots/check.jpg
4x FSAA: http://members.shaw.ca/dwkjo/screenshots/check2.jpg

Filtering: GL_NEAREST
Texture wrap: GL_REPEAT
Texture size 64x64

Radeon 9700 Pro, Catalyst 3.6

As I was about to post this I tried a 256x256 version of the texture with GL_LINEAR filtering. The result was that the bleeding occurred regardless of FSAA, unless the quad was not rotated at all.
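
The setup is essentially just this (a trimmed-down sketch of the test, not the exact app; the FSAA level itself would typically be forced in the driver control panel):

    // Trimmed-down sketch of the test described above: a slowly rotating quad
    // textured with a repeating black/white checker, GL_NEAREST, 64x64 texture,
    // with a multisampled framebuffer requested.
    #include <GL/glut.h>

    static GLuint tex;
    static float angle = 0.0f;

    static void init(void)
    {
        unsigned char img[64 * 64 * 3];
        for (int y = 0; y < 64; ++y)
            for (int x = 0; x < 64; ++x) {
                unsigned char c = ((x / 32) ^ (y / 32)) ? 255 : 0;  // 2x2 checker over 64x64
                img[(y * 64 + x) * 3 + 0] = c;
                img[(y * 64 + x) * 3 + 1] = c;
                img[(y * 64 + x) * 3 + 2] = c;
            }
        glGenTextures(1, &tex);
        glBindTexture(GL_TEXTURE_2D, tex);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, 64, 64, 0, GL_RGB, GL_UNSIGNED_BYTE, img);
        glEnable(GL_TEXTURE_2D);
    }

    static void display(void)
    {
        glClear(GL_COLOR_BUFFER_BIT);
        glLoadIdentity();
        glRotatef(angle, 0, 0, 1);
        glBegin(GL_QUADS);
            glTexCoord2f(0, 0); glVertex2f(-0.5f, -0.5f);
            glTexCoord2f(1, 0); glVertex2f( 0.5f, -0.5f);
            glTexCoord2f(1, 1); glVertex2f( 0.5f,  0.5f);
            glTexCoord2f(0, 1); glVertex2f(-0.5f,  0.5f);
        glEnd();
        glutSwapBuffers();
    }

    static void idle(void) { angle += 0.1f; glutPostRedisplay(); }

    int main(int argc, char** argv)
    {
        glutInit(&argc, argv);
        glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB | GLUT_MULTISAMPLE);  // ask for FSAA
        glutCreateWindow("checker test");
        init();
        glutDisplayFunc(display);
        glutIdleFunc(idle);
        glutMainLoop();
        return 0;
    }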

[This message has been edited by Ostsol (edited 07-20-2003).]

Thus, if you are packing different smaller textures inside a bigger texture, with this kind of AA you may end up sampling texels from a neighbouring smaller texture.
Why use edge AA when you can’t use mip-mapping? This texture packing technique would cause that problem even if you had a border around the subtextures. AA, which can be an expensive technique, only gives a small image quality improvement. Mip-mapping, which is not that expensive (and can even lead to speed increases when the texture stride is large), gives a noticeable image quality improvement.

Ostsol, that example is designed to show the bleeding at the edges; it’s not a realistic situation. Most tiling textures don’t have abrupt color transitions that would show at the edges the way the checkerboard texture does. And as you pointed out, with this texture the artifact occurs even with GL_LINEAR magnification filtering, which has been standard for years now.

I think the texture packing technique is a fundamentally flawed technique because of the mip-mapping issue, and I don’t think it is/will be widely used. Really, why should it be such a performance hit to bind a new texture?

It’s not about texture binds; it’s about packing state changes so you end up having shaders/materials that cover larger batches of polygons. Similarly, you might use vertex components or textures to change material properties rather than changing state outside DrawPrimitive calls.
Rendering fewer batches is supposed to save you a lot of CPU time.
AFAIK this is a more serious issue with D3D than OGL.

Originally posted by Aaron:
I think the texture packing technique is a fundamentally flawed technique because of the mip-mapping issue, and I don’t think it is/will be widely used. Really, why should it be such a performance hit to bind a new texture?

If you have to switch nearly per triangle, it would hurt because you cannot render batches… thinking of old-style lightmapping now… I dunno why/how it hurts HL2 exactly, though…

I don’t think that nowadays it’s the real bottleneck, especially when using tons of shaders. I mean, there are a lot of posts around here where NVIDIA guys have been saying that front-to-back rendering was more important than reducing the number of state switches.
Packing small textures into larger ones sounds to me like it brings more problems than it solves: it limits filtering techniques, causes mipmap trouble, and by extension this FSAA problem.

If you have, say, a couple of thousand very small non-power-of-2 textures, then it makes very good sense to pack them into large power-of-2 textures, provided you group textures together according to their render method.
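
Just to make the packing part concrete, a naive “shelf” packer would do (my own illustration, nothing HL2-specific; the gap parameter leaves empty texels around each image so filtering can’t pull in a neighbour):

    // Naive "shelf" packer, just an illustration: place many small images row by
    // row in a large power-of-two atlas, leaving 'gap' empty texels around each
    // one so bilinear/trilinear filtering can't pull in a neighbouring image.
    #include <vector>

    struct Size { int w, h; };
    struct Rect { int x, y, w, h; };

    // Returns a placement for every input size, or an empty vector if they don't fit.
    std::vector<Rect> shelfPack(const std::vector<Size>& sizes, int atlasSize, int gap)
    {
        std::vector<Rect> out;
        int penX = gap, penY = gap, shelfH = 0;
        for (size_t i = 0; i < sizes.size(); ++i) {
            const Size& s = sizes[i];
            if (penX + s.w + gap > atlasSize) {      // row is full: start a new shelf
                penX = gap;
                penY += shelfH + gap;
                shelfH = 0;
            }
            if (penY + s.h + gap > atlasSize)        // atlas is full
                return std::vector<Rect>();
            Rect r = { penX, penY, s.w, s.h };
            out.push_back(r);
            penX += s.w + gap;
            if (s.h > shelfH) shelfH = s.h;
        }
        return out;
    }

You’d run this once per render-method bucket, and the resulting rectangles give you the atlas texcoords for each sub-texture.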

To me it sounds like they aren’t leaving pixel gaps between their images in packed textures. Anyone who’s written a texture packer will know you have to do this, just to stop bilinear filtering from sampling adjacent packed images.

Perhaps they might need to increase the texel gap, depending on the amount of sample coverage used in the FSAA implementation.

But I agree, it’s a bit pointless waiting until this point and then blaming the IHVs when your engine design breaks under certain conditions.

[This message has been edited by Nutty (edited 07-21-2003).]

Originally posted by Ostsol:
As I was about to post this I tried a 256x256 version of the texture with GL_LINEAR filtering. The result was that the bleeding occurred regardless of FSAA, unless the quad was not rotated at all.
Technically it shouldn’t. This is most likely a precision limitation inherent to the tex coord interpolators. That’s a different issue.

However, I’d like you to check a few things in your test app:
1) Have you tried with a mipmapped checkerboard texture?
2) What’s the texel/pixel ratio, roughly? Do you magnify or minify?
You see, GL_LINEAR is not a particularly useful minification filter.

Minification will lead to texture aliasing (shimmering) either way with GL_LINEAR, regardless of AA.

Originally posted by Aaron:
Why use edge AA when you can’t use mip-mapping? This texture packing technique would cause that problem even if you had a border around the subtextures. AA, which can be an expensive technique, only gives a small image quality improvement. Mip-mapping, which is not that expensive (and can even lead to speed increases when the texture stride is large), gives a noticeable image quality improvement.

First, normally you don’t mipmap the textures you pack. Look at Quake3, for example: it packs lightmaps, but because a lightmap is almost always rendered magnified, it doesn’t bother mipmapping them. A non-mipmapped texture lookup is also faster for the graphics chip, so from a performance point of view (in this case!) it’s also better not to use mipmaps.

Second, even if you do mipmap those textures, when a texture is minified you are less likely to notice the problem, because it will affect too few pixels and without a visible pattern. One of the reasons is that packed textures are never tiled (or at least you cannot use repeat modes, although the artist could tile them “manually”).

Regarding the binding cost:

  • in OpenGL there shouldn’t be much of a difference; obviously, the less unnecessary work you do, the better, and if your native textures are not power-of-two, you will get better texture cache usage if you pack them into power-of-two textures.
  • in DX, on the other hand, it’s very important to be able to send large batches to the graphics board, because each DrawPrimitive call means a ring 3 to ring 0 context switch. Packing textures allows you to send larger batches (as already mentioned by other posters).