More control for texturing

Maybe this would be a better topic for 2.0 suggestions, but first I should make sure there isn’t already a way that I’m just too blind to see.

Everybody and their dog has written a terrain renderer, so adding a little extra that others didn’t have is a tempting goal. So besides wasting time on the “perfect and most efficient quadtree” and the smallest possible size, I want some flexibility with detail textures and, to make things worse, I want it in one pass.

After my last terrain I decided that 3D textures sounded perfect. You can only blend between two adjacent slices, but that seemed OK. Of course distant terrain looked horrible, so mipmapping was next… and that was the end of the 3D approach. Though I can specify width and height for each mip level, OpenGL now hates me for trying to break its mipmap chain by not reducing the depth of the texture too: a complete chain has to halve every dimension per level, so a 256x256x8 level 0 demands 128x128x4 at level 1, and the slices get averaged away.

To get around that restriction I added two helper textures, one containing the 3rd texture coordinate in each channel, the other containing the blending weight in each channel. It’s hell to work with and still wouldn’t solve the mipmap problem.
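
One layer of it looked roughly like this (the texture unit assignments here are made up, and the .g/.b/.a layers would be repeated and summed the same way):

!!ARBfp1.0
# unit 0: 3D detail stack, unit 1: 3rd coords per channel,
# unit 2: blend weights per channel (all assumed)
TEMP slice, weight, c, det;
TEX slice,  fragment.texcoord[0], texture[1], 2D;   # .r = 3rd coord for layer 1
TEX weight, fragment.texcoord[0], texture[2], 2D;   # .r = blend weight for layer 1
MOV c, fragment.texcoord[0];
MOV c.z, slice.r;                                   # build the 3D lookup coord
TEX det, c, texture[0], 3D;                         # dependent detail lookup
MUL result.color, det, weight.r;                    # weight this layer
END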

Fine, I’ll place them in 2D, all next to each other. A silly idea that already didn’t work the last time around. Texcoord fiddling (adding half a pixel here, subtracting a pixel there) would reduce the bleeding, and just duplicating the edge pixels would probably be a cheap way to make them fit. Tiling though seems impossible: with one “slice” covering 0-.25 there’s no way to make it repeat (none I could see, at least). So after an afternoon of trying to cook up a shader to fix that, I realized it’s impossible to tell whether .3 is supposed to be slice 2 or a tiled slice 1.

Next thought: hey, I’m not using a GF3 anymore, this card shouldn’t mind a dozen textures. In terms of “it works” it doesn’t mind, but judging from the impact of sampling more than 4 textures I would guess the number of pipelines on a Radeon 9800 to be 4.

Multiple lookups into the same texture seem much cheaper, but how do you store a bunch of textures in one and still allow tiling? For a second I wondered if glTexSubImage2D might help, and whether the driver could be clever even if the pieces were treated as separate textures. Unfortunately subimage only updates a region of an existing texture image; it doesn’t give you multiple independently addressable sub-textures (if it does, I can’t see how).

So what I would need is something between a 3D texture and a mipmap: multiple slices in one texture that can be sampled independently. Removing the size restriction on mipmap levels would allow just that, but somehow I feel the driver uses level 0 to decide how much memory the texture will need instead of adding more with each level (which also saves it the trouble of storing an offset for each mip level).

Can it really be that, with all the programmable hardware and extensions and whatnot, it’s not possible to achieve what I would consider a simple goal?

I’m not sure exactly what problem you are trying to solve. (You lose me at your first mention of 3D textures.) If you can clarify what the problem is, I could help better.

Have you tried using a fragment program? I suspect that’s all you need to get it to do what you want.

I guess by stepping so quickly through all the things I tried, I forgot to really explain the problem.

Situation: about 6+ textures that want to be blended in wild ways (i.e. a fragment program is already in use).

Basically there’s one texture containing 4 weights, a base texture to either influence the result or to interpolate with, and a couple of detail textures (4).

Doing, roughly, in ARB_fragment_program terms:

TEMP base, weight, det1, det2;
TEX base,   fragment.texcoord[0], texture[0], 2D;
TEX weight, fragment.texcoord[0], texture[1], 2D;
TEX det1,   fragment.texcoord[1], texture[2], 2D;
TEX det2,   fragment.texcoord[1], texture[3], 2D;

drops to half the speed as soon as more than 4 different textures are sampled, while multiple lookups into a single texture are quite fine.

The problem is: I need mipmapping and GL_REPEAT. The former makes 3D textures pointless (unless I stored a gazillion redundant slices just so the depth can be halved at each level without losing the original slices: keeping 4 slices alive through all 8 mip levels of a 256-wide texture means a depth of 4*256 = 1024 at level 0), and the latter (repeat) makes it hard/impossible to store them all in a 2D texture.

So what I’ve tried by now, with a big texture coordinate (i.e. 0-8) in the fragment program, roughly:

TEMP c, det1, det2;
FRC c, fragment.texcoord[0];      # wrap 0-8 down to 0-1
MUL c.x, c.x, 0.25;               # squeeze into the first slice
TEX det1, c, texture[0], 2D;
ADD c.x, c.x, 0.25;               # shift over to the second slice
TEX det2, c, texture[0], 2D;

In addition, I tried adjusting the coordinate instead of just multiplying by .25:

MAD c.x, c.x, 0.249, 0.001;

to prevent neighbouring textures from causing ugly borders. After a lot of tweaking it always either left a small coloured edge or the edges wouldn’t match up correctly. The real problem though was that FRC jumps back to 0 for a coordinate of 1, and I think that was the reason for the very ugly seams between tiles (independent of other slices bleeding in).
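
The closest I got was clamping instead of scaling, something along these lines (the half-texel constants assume a 1024-wide atlas); it hides most of the bleeding but still can’t make a slice actually wrap:

!!ARBfp1.0
PARAM halfTexel = { 0.00048828125, 0.0, 0.0, 0.0 };  # 0.5/1024
PARAM sliceMax  = { 0.24951171875, 0.0, 0.0, 0.0 };  # 0.25 - halfTexel
TEMP c;
FRC c, fragment.texcoord[0];
MUL c.x, c.x, 0.25;
MAX c.x, c.x, halfTexel.x;        # stay off the left edge of the slice
MIN c.x, c.x, sliceMax.x;         # and off the right edge
TEX result.color, c, texture[0], 2D;
END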

Another try involved a cubemap, but bending the texture coordinates to get “normal” lookups on 4 of the faces was an ugly hack and didn’t really give a decent result.

So the short question would be: how do you sample a bunch of textures without using a separate texture object for each, while still being able to tile them?

I think I understand better.
One thought: if you are trying to use just four texture objects but the number of fetches isn’t a problem, you could essentially get five textures out of them by making them all RGBA and using the A channels of three of the textures to store another R, G, and B.
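
Roughly like this, with the fifth texture reassembled in the fragment program (unit assignments assumed):

!!ARBfp1.0
TEMP t0, t1, t2, fifth;
TEX t0, fragment.texcoord[0], texture[0], 2D;
TEX t1, fragment.texcoord[0], texture[1], 2D;
TEX t2, fragment.texcoord[0], texture[2], 2D;
MOV fifth.r, t0.a;                 # the spare alpha channels
MOV fifth.g, t1.a;                 # carry the fifth texture’s RGB
MOV fifth.b, t2.a;
MOV fifth.a, 1.0;
MOV result.color, fifth;
END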

If you were to build your own mipmaps you could avoid adjacent sub-textures blurring into each other.

As for tiling, it’s a shame there’s no MOD instruction. I suppose you could use a LUT for that, but then you’d have a dependent fetch and you’d use up another texture (although that could be something to put in that last alpha channel).
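
Though for a constant divisor MOD can at least be faked with FRC, via mod(x, n) = n * frc(x * 1/n); a sketch with n = 0.25 to match a four-slice atlas (it still inherits FRC’s wrap discontinuity under filtering):

!!ARBfp1.0
PARAM n    = { 0.25, 0.25, 0.25, 0.25 };
PARAM invN = { 4.0, 4.0, 4.0, 4.0 };
TEMP c;
MUL c, fragment.texcoord[0], invN;
FRC c, c;
MUL c, c, n;                       # c = mod(texcoord, 0.25)
TEX result.color, c, texture[0], 2D;
END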

Originally posted by Jared:
So the short question would be: how do you sample a bunch of textures without using a separate texture object for each, while still being able to tile them?

I wish I had a good answer, since I’ve had the same need. In the past I’ve called this magic thing “2.5D textures”, since what we need here is essentially a packed stack of 2D textures, bound as one and addressed via a 3rd coordinate/dimension. The 3rd dimension doesn’t get MIP-reduced like the first two, but the texture can still be tiled in those two dimensions.

I don’t know the hardware limits well enough to know why this might be hard, but I suspect it’s just not asked for enough to be a priority. I was hoping to emulate this in a shader, but no luck yet.
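
The emulation I keep circling back to looks roughly like this, sketched for 4 slices packed side by side with an integer slice index in the 3rd coordinate; it runs straight into the bleeding and wrap seams described above:

!!ARBfp1.0
TEMP c;
FRC c, fragment.texcoord[0];                    # tile s and t
MUL c.x, c.x, 0.25;                             # squeeze into one slice
MAD c.x, fragment.texcoord[0].z, 0.25, c.x;     # offset by the slice index
TEX result.color, c, texture[0], 2D;
END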

Anyway, if you’re using a regular grid for terrain, as opposed to a TIN, you have the old standby of remapping your textures to each quad drawn to simulate texture repeat, at the cost of extra verts. Even if you’re using a TIN, you can do this to some degree by bloating your texture (doing a manual 2x or 3x repeat of the original) and picking texcoords that stitch things back together.

I don’t think those are great suggestions though. A more practical approach is to do as someone else said and try to make use of the individual channels in the texture for different purposes.

Well, my not-so-great version would store greyscale detail maps in each channel and just do a dot product with the weight texture. Doesn’t look too great though.
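
A minimal sketch of that, with four grey detail maps in one RGBA texture and the four weights in another (a DP4, since all four channels take part):

!!ARBfp1.0
TEMP details, weights;
TEX details, fragment.texcoord[0], texture[0], 2D;
TEX weights, fragment.texcoord[1], texture[1], 2D;
DP4 result.color, details, weights;   # scalar blend, broadcast to rgba
END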

I thought about using bigger textures and doing some tiling there (it would still beat the 256x256x1024 3D texture ;-) )

I’d be curious about the technical limitations too, especially since cube maps would do just what I need, except that they additionally do some wild texcoord conversion.

Guess we need a new texture type in 2.0, or 3D textures and cubemaps etc. combined into a more general type. Loosening the mipmap restrictions would already be a great help.

It would be nice to be able to specify texture filtering settings per texcoord. Really, what I want is a fully programmable texture fetch/filter shader. Sure, you can do lots of that already from fragment programs, but it should probably become its own thing if it becomes programmable.

-Won

Edit:
It would be nice, for example, to programmatically generalize things like wrap modes and the way textures seam together, like in cube maps.


Won, I agree, that would be very cool. Given linear access to texture memory and the usual assembly operations with the addition of MOD, it would be easy to write a program that maps texture coordinates to memory locations. One could then do interpolation as desired.

(I’m sure my naïve understanding of graphics hardware leaves something to be desired, but still.)
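
For what it’s worth, the interpolation half can already be done by hand over a NEAREST-sampled texture; a naive sketch, with size assumed to be an app-supplied constant holding { w, h, 1/w, 1/h } and edge wrap-around ignored for brevity:

!!ARBfp1.0
PARAM size = program.local[0];
TEMP st, f, uv00, uv10, uv01, uv11, t00, t10, t01, t11, top, bot;
MAD st, fragment.texcoord[0], size, -0.5;   # into texel space, centered
FRC f, st;                                  # bilinear weights
FLR st, st;
ADD st, st, 0.5;                            # center of the lower-left texel
MUL uv00, st, size.zwzw;                    # back to [0,1]
ADD uv11, st, 1.0;
MUL uv11, uv11, size.zwzw;
MOV uv10, uv00;
MOV uv10.x, uv11.x;
MOV uv01, uv00;
MOV uv01.y, uv11.y;
TEX t00, uv00, texture[0], 2D;
TEX t10, uv10, texture[0], 2D;
TEX t01, uv01, texture[0], 2D;
TEX t11, uv11, texture[0], 2D;
LRP top, f.x, t10, t00;                     # lerp along s
LRP bot, f.x, t11, t01;
LRP result.color, f.y, bot, top;            # then along t
END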

Reading two wildly disparate pieces of a texture is likely to perform as badly as reading two separate textures. At least if I understand current memory controllers correctly: each block of texture data touched counts as one “fetch”, and it doesn’t really matter where it comes from.

Thus, if you read the same pixel twice, there’s just one fetch; however, if you read two widely separated pixels, that’s two fetches, just like reading two separate textures would be.

If you’re seeing a big step-wise drop in performance, a bunch of things could be causing this:

  1. maybe you have V-sync on, and adding the extra fetch caused you to start missing flips

  2. maybe there’s specific hardware on the card to deal with up to N textures (2 per pipe on GeForces, I’m told). Fetching 4 thus costs you 2 “fetch cycles”. Perhaps those fetch cycles are masked by other throughput issues you have. However, when you do the next fetch, you’re using 3 fetch cycles, and perhaps it’s not masked anymore

  3. some other reason I can’t think of right now

However, if your drop is “from 160 fps to 120 fps” then I wouldn’t worry – regular games are playable at 30 fps, and FPS games are fine at 60 fps. Make sure the end result runs well on current cards, and they’ll run super-well on cards a year from now.

Jared,
if you can, could you please post the “original” fragment shader, i.e. the one with lots of separate textures that you have been trying to optimize away?
Maybe there’s something that can be done there that you’ve missed.

I don’t really believe texture packing is a good solution, partly because of what jwatte said, partly because I just loathe the potential artifacts of packing. Packing is primarily a cheap trick to overcome the limitations of certain state-change-challenged APIs, which OpenGL just isn’t.