In the process of “fixing up” my 3D engine, after a lot of reading (and pestering helpful folks here a bit), I’ve chosen “bindless textures” as the basis of the new implementation of “images” in the engine. In this case, “image” refers to:
- surfacemaps AKA normalmaps AKA bumpmaps
- colormaps AKA texturemaps
- displacementmaps
- conestepmaps
- specularmaps
- heightmaps
- etc…
Also I imagine other built-in features of the engine will be implemented with bindless textures, but not through the generalized mechanism I’m discussing now. For example, to implement rendering the sky background I divided my billion-plus star catalog (a real catalog of real stars, plus lots of information) into 1024x1024x6 “regions” that exactly correspond to the areas covered by the 1024x1024x6 pixels in a 1024x1024 cube-map (with 6 faces). I also imagine I’ll need to implement a typical general-purpose “skybox” (or whatever they’re called these days) to let people put a “far-away background image” into a 1024x1024x6 to 4096x4096x6 cubemap. But those will be handled separately, in a custom way, not through the generalized “image” mechanism.
The first question I have is this. Does any way exist in OpenGL v4.6 core or an ARB extension (thus likely to become core… someday) that lets me pack those u64 bindless texture handles into a UBO without half the space being wasted? It seems clear from what I’ve read so far that a std140 array of u64 values is half empty, with only one u64 texture handle per 16-byte array element.
My natural instinct was to create a structure like the following…
[b]struct u64vec2 {
    GLuint64 x;
    GLuint64 y;
};[/b]
… then create an array of 256 or 1024 (or more) elements of that type in a UBO. That way the desired bindless texture handle (or apparently a “sampler”… I guess) would get accessed a bit strangely, but that’s easy enough.
But, I don’t see where OpenGL (or GLSL) specifies 2-element u64 vectors. Of course, that’s probably because I’m missing some ARB extension or other, right? At least I hope so (if not already core).
I guess GLSL doesn’t necessarily need to recognize that type for my purposes, since each element will look like a sampler to GLSL… I think.
I don’t much care how the u64 bindless texture handles get packed tight… I just want to find some way. Does some way exist?
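For what it’s worth, one direction that looks workable to me (a sketch, not tested): GL_ARB_bindless_texture lets GLSL construct a sampler from a uvec2 holding the two halves of a handle, and a std140 array of uvec4 has a tight 16-byte stride, so two handles fit per array element with zero padding. (Also, GL_ARB_gpu_shader_int64 does define u64vec2, which would make a struct like the one above legal in GLSL, but the uvec4 route below needs only the bindless extension.) The block name, binding point and array size here are made up for illustration:
[b]#version 460
#extension GL_ARB_bindless_texture : require

// std140 rounds the stride of array elements up to 16 bytes, so the naive
//     layout(std140, binding = 0) uniform Naive { sampler2D maps[256]; };
// wastes half the block: one 8-byte handle per 16-byte element.

// Packing two handles per uvec4 gives a tight 16-byte stride instead:
layout(std140, binding = 0) uniform ColormapBlock {
    uvec4 packed_handles[128];   // 256 handles, no padding
};

vec4 sample_colormap(uint index, vec2 uv)
{
    uvec4 pair   = packed_handles[index >> 1];
    uvec2 handle = ((index & 1u) == 0u) ? pair.xy : pair.zw;
    return texture(sampler2D(handle), uv);   // uvec2 -> sampler constructor
}[/b]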
The following is a little “color” to explain my plan… in case there is some fatal flaw in it.
My engine primarily creates “shape” objects AKA “shapes” procedurally. Which means, all cameras, all lights, all [shape] objects that are “drawn” or “rendered” are created by creating simple [shape] objects, then assembling them into larger, more complex, rigid and/or hierarchical [articulating] shapes. The simplest shapes are fairly standard, namely camera, light, points, lines, face3, face4, faces, disk, grid, grix, mesh, peak, cone, tube, pipe, torus, ball, globe, sphere, ring, bridge3, bridge4, bridges, and so forth.
The create functions for all shape objects include arguments to allow each shape to be customized. For example, the create functions for almost all shapes contain the following arguments:
[b]- sides
- side_first
- side_count
- levels
- level_first
- level_count[/b]
… and where it makes sense, also…
[b]- twist
- taper[/b]
- and so forth
… as well as obvious arguments like options (up to 64 option bits), color (default color for all vertices), etc.
The objid (object identifier) of up to four image objects [plus other kinds of objects] can be specified in the shape object create functions.
Each of those four image objects gets turned into a u08 integer, and the four u08 values are delivered to the shader as a single u32 integer vertex-attribute, to be broken into up to four 8-bit fields by the shader (how many fields, 0 to 4, is specified by 4~6 other bits in another u32 vertex-attribute).
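A minimal sketch of that unpacking on the shader side (which byte maps to which UBO is my assumption, not gospel):
[b]// Fragment shader: the packed attribute arrives un-interpolated (flat).
flat in uint image_indices;   // [imap3 | imap2 | imap1 | imap0], 8 bits each

void unpack_image_indices(out uint imap[4])
{
    imap[0] = (image_indices >>  0) & 0xFFu;   // index into UBO #0
    imap[1] = (image_indices >>  8) & 0xFFu;   // index into UBO #1
    imap[2] = (image_indices >> 16) & 0xFFu;   // index into UBO #2
    imap[3] = (image_indices >> 24) & 0xFFu;   // index into UBO #3
}[/b]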
Each of these [nominally] 8-bit image identifiers is not the same as the objid of the image in the engine. Instead, the 8-bit value is an index into one of four UBO blocks which contain samplers backed by u64 bindless texture handles (as I understand this mechanism so far). By default with the current “uber-shaders” (and nominally), each of the four UBO blocks serves a different purpose, typically:
UBO #0 == colormaps AKA texturemaps
UBO #1 == surfacemaps AKA normalmaps AKA bumpmaps
UBO #2 == conestepmaps + heightmaps
UBO #3 == specularmaps or othermaps
At least that’s what happens right now (though monochrome specularmaps can be taken from the A channel of conventional colormaps if the A channel is not needed to specify transparency). Frankly, I’m fighting the battle between flexibility (to accommodate existing textures) and simplicity, but that’s a side issue.
The important point is this. Each set of shaders (each “program object”) can specify what it does with the images in each of the four UBO blocks. That’s just a specification of each “program object” that can be written by anyone who writes shaders for the engine (though they are expected to keep the same associations as the standard default “uber-shader program” for any image that serves one of the standard functions). So, if someone writes shaders that support a “displacement map” (but not a “conestepmap”, obviously), they would put their “displacement map” in UBO #2.
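To make that contract concrete, the shader-side view might look something like this (block names and binding points are illustrative, and the packing follows the uvec4 sketch above):
[b]layout(std140, binding = 0) uniform Imap0 { uvec4 colormap_handles[128];    };  // UBO #0
layout(std140, binding = 1) uniform Imap1 { uvec4 surfacemap_handles[128];  };  // UBO #1
layout(std140, binding = 2) uniform Imap2 { uvec4 conestepmap_handles[128]; };  // UBO #2
layout(std140, binding = 3) uniform Imap3 { uvec4 othermap_handles[128];    };  // UBO #3[/b]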
So, to return to the basics.
When a shape object is created, the objid of up to 4 images is specified.
When each image object is created, its type is specified as an argument. A number of constants like the following are defined to specify image type:
IG_IMAGE_TYPE_IMAP0, IG_IMAGE_TYPE_COLORMAP, IG_IMAGE_TYPE_TEXTUREMAP — 3 intuitive names for one type == UBO #0
IG_IMAGE_TYPE_IMAP1, IG_IMAGE_TYPE_SURFACEMAP, IG_IMAGE_TYPE_NORMALMAP, IG_IMAGE_TYPE_BUMPMAP — 4 intuitive names for one type == UBO #1
IG_IMAGE_TYPE_IMAP2, IG_IMAGE_TYPE_CONESTEPMAP, IG_IMAGE_TYPE_HEIGHTMAP — 3 intuitive names for one type == UBO #2
IG_IMAGE_TYPE_IMAP3, IG_IMAGE_TYPE_SPECULARMAP, IG_IMAGE_TYPE_OTHERMAP — 3 intuitive names for one type == UBO #3
As shown above, the type each image object is given when created determines which UBO that image object is stored into.
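On the C side, those aliases can be nothing more than one enum with duplicate values (a sketch; the actual values are my invention):
[b]enum ig_image_type {
    IG_IMAGE_TYPE_IMAP0       = 0,                        /* UBO #0 */
    IG_IMAGE_TYPE_COLORMAP    = IG_IMAGE_TYPE_IMAP0,
    IG_IMAGE_TYPE_TEXTUREMAP  = IG_IMAGE_TYPE_IMAP0,
    IG_IMAGE_TYPE_IMAP1       = 1,                        /* UBO #1 */
    IG_IMAGE_TYPE_SURFACEMAP  = IG_IMAGE_TYPE_IMAP1,
    IG_IMAGE_TYPE_NORMALMAP   = IG_IMAGE_TYPE_IMAP1,
    IG_IMAGE_TYPE_BUMPMAP     = IG_IMAGE_TYPE_IMAP1,
    IG_IMAGE_TYPE_IMAP2       = 2,                        /* UBO #2 */
    IG_IMAGE_TYPE_CONESTEPMAP = IG_IMAGE_TYPE_IMAP2,
    IG_IMAGE_TYPE_HEIGHTMAP   = IG_IMAGE_TYPE_IMAP2,
    IG_IMAGE_TYPE_IMAP3       = 3,                        /* UBO #3 */
    IG_IMAGE_TYPE_SPECULARMAP = IG_IMAGE_TYPE_IMAP3,
    IG_IMAGE_TYPE_OTHERMAP    = IG_IMAGE_TYPE_IMAP3,
};[/b]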
When each shape object is created, zero or one image object can be specified for each purpose. Another way to say this is: each shape object can specify the image objid of zero or one image object in each of those four image UBO blocks. The engine converts each image objid to the appropriate 8-bit index into the sampler arrays in each UBO in order to access the specified image objects for the desired purposes.
The reason for this scheme (as opposed to a single UBO block) is to increase the number of simultaneously accessible image objects to 1024 rather than 256. Since the engine does all the complex work under the covers, the burden on the application is minimal (just state the purpose of each image object when creating it).
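Under those assumptions, the host-side registration might look roughly like the following. The entry points (glGetTextureHandleARB, glMakeTextureHandleResidentARB, glNamedBufferSubData) are real ARB_bindless_texture / DSA functions; the surrounding bookkeeping is invented for illustration:
[b]#include <GL/glew.h>   /* any loader that exposes GL_ARB_bindless_texture */

GLuint   ubo[4];            /* one UBO per image purpose, bound to points 0..3 */
GLuint64 handles[4][256];   /* CPU shadow copy, tightly packed */

void ig_image_register(int type /* 0..3 */, int index /* 0..255 */, GLuint texture)
{
    /* Fetch the bindless handle and make it resident before any draw uses it. */
    GLuint64 handle = glGetTextureHandleARB(texture);
    glMakeTextureHandleResidentARB(handle);

    /* Tightly packed: handle i lives at byte offset 8*i, i.e. two handles per
       16-byte uvec4 element on the GLSL side (the low 32 bits land in .x on
       little-endian hosts, matching the uvec2 constructor convention). */
    handles[type][index] = handle;
    glNamedBufferSubData(ubo[type], (GLintptr)(index * sizeof(GLuint64)),
                         sizeof(GLuint64), &handles[type][index]);
}[/b]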
The other thing to remember is the following. What happens to the zero to four image objids passed to the shape object create functions? Each objid is converted into the appropriate 8-bit index and put into the appropriate field of that 32-bit vertex-attribute (based on the purpose == type of each image object), so that 32-bit vertex-attribute contains the same value in every vertex of the shape object. Which means the exact same texturemap, normalmap, conemap, specularmap, heightmap or othermap gets processed for every vertex… leading to a uniform, consistent appearance throughout the shape object.
That happens automatically, by default.
However, after shape objects are created, the application running on the engine can call shape object modification functions to change any combination of vertex attributes in any combination of shape object vertices.
And so, even the simple shape objects I mentioned can be radically customized.
A couple background facts. First, the “provoking vertex” is set to the first vertex of each triangle in this engine. Which means 2/3 of those two 32-bit integer vertex-attributes never make it to the fragment shader… only the values from the first vertex of each triangle do. Which is good. Of course these two 32-bit integer attributes have flat qualifiers in the shaders, so they are constant across each triangle.
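For reference, the host-side call that selects this convention (core profile defaults to the last vertex, so it must be set explicitly):
[b]/* Deliver each triangle’s flat attributes from its FIRST vertex. */
glProvokingVertex(GL_FIRST_VERTEX_CONVENTION);[/b]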
Second, as implied above, almost all shape objects are created out of an arbitrary number of levels and sides. What does this mean?
For example, consider a simple “faces” object. As you might expect, a faces object is just a flat, round disk (but not identical to the disk object in an important way, as you’ll learn shortly).
Let’s say an application program calls the ig_shape_create_faces() function and specifies the following values for the following arguments:
[b]- levels = 3
- level_first = 1
- level_count = 2
- sides = 6
- side_first = 1
- side_count = 4[/b]
By default, as with all shape objects, the radius of the surface of the faces object is 1 meter (but it can be scaled, sheared or otherwise modified at any time). Since the create function specified levels == 3 and level_first == 1 (not the default of 0), the surface of this faces object has a 1/3 meter radius hole in its center (like a hole in the center of a round disk).
Since the create function specified this faces object has sides == 6, the nominally round outer edge actually has only 6 sides… a hexagon. But side_first == 1 and side_count == 4, so only 4 of those 6 sides physically exist. Errrr… I mean “graphically exist”, meaning only those graphical surfaces exist (contain vertices).
And so, this faces object is 2 meters in diameter, has a 2/3 meter diameter hole in the center, and only 4 of its 6 sides exist.
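In code, that example might read as follows (the signature is abbreviated and hypothetical; the real create function takes more arguments, per the list earlier):
[b]/* Hexagonal annulus: 3 levels with the innermost ring omitted
   (1/3 meter radius hole), 6 sides with only sides 1..4 present. */
int faces = ig_shape_create_faces(
    /* levels      */ 3,
    /* level_first */ 1,
    /* level_count */ 2,
    /* sides       */ 6,
    /* side_first  */ 1,
    /* side_count  */ 4);[/b]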
One important point of the last several paragraphs is the following. The first (provoking) vertex of every triangle starts on the lowest level that triangle touches (every triangle spans two adjacent levels) and also starts on the lowest side that triangle touches. This is important because it lets the engine provide a way to specify different image objects for vertices on a given range of levels and/or sides. So, for example, after an application creates any kind of object, the application can easily tell the engine to “set 1, 2, 3 or 4 of the image objects on any range of levels to be new image objects” and/or “set 1, 2, 3 or 4 of the image objects on any range of sides to be new image objects”. Okay, I’m no artist, so that probably sounds boring. But to give just one example of what that means, those two function calls could change the texturemap and normalmap of some middle level or levels from the default “cobblestones” to “grass” or “bricks”, and some arbitrary side or sides from the default “cobblestones” to “fancy artistic tiles”. Okay, I really am no artist. But you get the point.
Also, everything above leads to the following point. I mentioned many of the simple shape objects, each of which is radically configurable during the create process, and later on by simple function calls (by 3D scaling, 3D shearing, 3D twisting, 1~4D randomization of attributes, and an open-ended number of additional procedural processes).
But the engine is designed to support vastly complex procedurally generated shapes too. For example, the create functions for simple shapes like “cup” attach, bond or fuse 2 to a few of the simplest shapes together, while create functions for super-complex shapes like “spacecraft” or “planet” or “galaxy” attach, bond or fuse tens, hundreds, thousands or more simple [and complex but simpler] shapes together.
The point is, when objects are attached, bonded or fused together, all the configuration performed on the individual elements is retained. And yet individual aspects can still be changed in simple, coherent, intuitive ways. For example, the image object that contains those “fancy artistic tiles” can easily be replaced by any other texture to display something else.
Which leads to the following. Each create function (as well as the modification functions) for super-complex objects like “house”, “office”, “spacecraft”, “planet” or pretty much anything else can generate an astronomical number and variety of shape objects of that general kind (house, office, spacecraft, etc). The relative dimensions of pretty much any subsection can easily be configured or changed, the appearance of any [natural] portion of any surface can be changed… just about every aspect of that “kind” of shape can be changed. Of course each create function (and the supporting modify functions for that shape) can offer as many or as few opportunities for configuration as desired, during shape creation… and/or later.
Anyway, that’s what I’m trying to achieve, and this scheme I described up near the beginning of this message is my attempt to implement some of these features and capabilities.
I wanted to implement a scheme with array textures, but they just don’t seem flexible enough. Unless all textures [of a given purpose] are the same size and format and work well with the same sampler configuration, I don’t see any plausible way to make those four 8-bit imageid fields specify a rich enough variety of images for all the various purposes.
Maybe some guru out there sees a way to make that work. I don’t.
In contrast, the bindless textures do seem flexible enough, since (as I understand this), every texture can be a different size and configured differently.
But what do I know? Only enough to get the basic texturemap and normalmap stuff working (displaying on object surfaces). But they couldn’t even be selected before now… only the one texturemap and normalmap would display. :disgust:
Anyway, if anyone is a serious image/texture guru… especially with bindless textures, or some other fancy texture-works tricks… I’m all ears. I mean eyes. Post your crazy ideas below. Thanks.
PS: And sorry my message is so long. I thought it might provide sufficient context to spur good ideas, or prevent waste of time to post ideas that just aren’t flexible enough.
PS: I’ve gone to a huge amount of effort to be able to render many shape objects in each call of glDrawElements() or similar. That’s another crucial reason for making so many image objects accessible simultaneously.
PS: As per my usual practice, I “design forward”. Which means, I’m open to any approach that is likely to be supported by most high-end cards from at least AMD and nvidia two years from now. My development rigs are Ryzen and Threadripper with nvidia GTX1080TI GPUs, running 64-bit Linux Mint (later maybe windoze). These can be considered “absolute minimum required hardware”, since 2-ish years from now most folks running sophisticated 3D simulation/physics/game applications will have better. I do prefer to avoid non-ARB extensions, but I’ll consider anything likely to be supported by high-end AMD and nvidia GPUs (must be both, not just one or the other brand).
PS: Almost certainly the engine will be released as open-source. I don’t consider 3D engines a viable commercial market (not for me, anyway). For me, this engine is just one subsystem in a larger project.
Thanks!