Hi,
I am working on virtualization and visualization methods for large image and volume data sets. My current approach uses large texture atlases to store the page data on the GPU. For volume rendering, one problem with large 3D textures is their cache inefficiency under most viewing angles (caused by their GPU-internal memory layout, which is optimized more for 2D textures). So I was stoked by the bindless extension, because it lets me store the pages of the virtual textures in individual smaller textures and access them in the shaders through a uniform block or, better, a larger texture buffer holding the 64-bit resident texture handles. So my experiments began…
First experiment: Simple volume ray caster using a single 3D volume texture. You can find the complete source code of the experiment here.
1. Store the texture handle in a uniform block
uniform volume_uniform_data
{
vec4 volume_extends; // w unused
vec4 scale_obj_to_tex; // w unused
vec4 sampling_distance; // x - os sampling distance, y opacity correction factor, zw unused
vec4 os_camera_position;
vec4 value_range;
mat4 m_matrix;
mat4 m_matrix_inverse;
mat4 m_matrix_inverse_transpose;
mat4 mv_matrix;
mat4 mv_matrix_inverse;
mat4 mv_matrix_inverse_transpose;
mat4 mvp_matrix;
mat4 mvp_matrix_inverse;
sampler3D volume_texture;
sampler1D color_map;
} volume_data;
[...]
// sample volume
float v = texture(volume_data.volume_texture, spos).r;
Results: Works perfectly as expected.
2. Uniform buffer storage gets pretty limited when feeding a large number of page textures into a single shader (64 KiB overall storage at most, shared with other uniform data…), so I tried storing the texture handles in a texture buffer with an RG_32UI format. In the shader the two components are converted back to a uint64_t, which can then be interpreted as a sampler:
layout (binding = 4) uniform usamplerBuffer texture_handles;
[...]
// sample the texture
uvec2 vtex_hndl_enc = texelFetch(texture_handles, 0).xy;
uint64_t vtex_hndl = packUint2x32(vtex_hndl_enc);
sampler3D vtex_smpl = sampler3D(vtex_hndl);
float v = texture(vtex_smpl, spos).r;
Results: Quite unexpectedly, this works perfectly! (The bindless texture spec does not explicitly describe this use case.)
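For reference, the host side of this variant has to write each 64-bit handle into the buffer as two 32-bit words in the order packUint2x32 expects: x holds the 32 least significant bits, y the 32 most significant bits. A minimal C++ sketch of that split follows. The helper names split_handle and join_handle are made up for illustration, and any value passed in is only a stand-in for a real resident texture handle:

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: split a 64-bit texture handle into the two 32-bit
// words stored in one two-component 32-bit unsigned integer texel.
// texel[0] (x) = low 32 bits, texel[1] (y) = high 32 bits,
// which is the component order GLSL packUint2x32() expects.
std::array<std::uint32_t, 2> split_handle(std::uint64_t handle)
{
    return { static_cast<std::uint32_t>(handle & 0xFFFFFFFFull),
             static_cast<std::uint32_t>(handle >> 32) };
}

// Hypothetical helper: CPU-side equivalent of packUint2x32(uvec2),
// useful for checking the buffer contents before uploading them.
std::uint64_t join_handle(const std::array<std::uint32_t, 2>& texel)
{
    return (static_cast<std::uint64_t>(texel[1]) << 32) | texel[0];
}
```

If the two words were swapped on upload, the shader would reconstruct a different 64-bit value, so a round trip through join_handle is a cheap sanity check.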
3. Next I tried to fetch and translate the sampler only once at the beginning of the shader and store it in a global variable:
Problem: The GLSL compiler does not allow assigning values to global sampler variables (the spec implied this should work).
So I just stored the uint64_t handle in a global variable and translated this value to a sampler right before taking a texture sample:
// globals
uint64_t vtex_smpl = 0;
uint64_t ctex_smpl = 0;
[...]
// one time initialization on shader start
uvec2 vtex_hndl_enc = texelFetch(texture_handles, 0).xy;
uvec2 ctex_hndl_enc = texelFetch(texture_handles, 1).xy;
vtex_smpl = packUint2x32(vtex_hndl_enc);
ctex_smpl = packUint2x32(ctex_hndl_enc);
[...]
// sample texture
float v = texture(sampler3D(vtex_smpl), spos).r;
Results: This works, but it is much slower than the previous variant, which fetches the handle for every sample!
These tests used a small volume of 501x401x576 voxels with 8-bit scalar values and a simple transfer function on a 1600x1024 viewport. The plain non-bindless version ran at 2.5 ms per frame and the uniform block bindless version at 2.8 ms per frame. The first texture buffer version ran at 3.2 ms per frame, and the texture buffer version that tries to fetch the sampler just once ran at 3.8 ms per frame.
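To put these frame times in relation: against the 2.5 ms non-bindless baseline, the three bindless variants come out at roughly 12%, 28% and 52% overhead. A small C++ sketch of that arithmetic (the function name relative_overhead is just illustrative):

```cpp
// Relative overhead of a variant compared to a baseline frame time,
// e.g. 0.12 means the variant takes 12% longer per frame.
double relative_overhead(double baseline_ms, double variant_ms)
{
    return (variant_ms - baseline_ms) / baseline_ms;
}

// Frame times quoted in the post, in milliseconds:
//   relative_overhead(2.5, 2.8)  -> uniform block with bindless handles
//   relative_overhead(2.5, 3.2)  -> texture buffer, handle fetched per sample
//   relative_overhead(2.5, 3.8)  -> texture buffer, handle fetched once
```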
Second experiment: Modify my virtualization renderer to use a texture buffer containing individual page texture handles instead of one large volume texture containing the page textures (the source is not openly available at this point in time).
#if SCM_LDATA_VTEX_BINDLESS_TEXTURE == 1
struct page_atlas_3d_info {
usamplerBuffer atlas_textures;
};
#else
struct page_atlas_3d_info {
vec3 size_pages_rec;
sampler3D atlas_texture;
};
#endif
I changed the data structure holding the indirection information into the texture atlas so that it contains a simple index into the texture buffer. I also changed the methods that retrieve the actual texture samples from the atlas. The important part is the octree traverser I use to retrieve the volume brick for the ray traversal: it stores its temporary data in the following struct, which the traversal function fills in when querying the octree for a texture coordinate:
struct ray_cast_trav_info {
sampler3D vpage;
uint64_t vpage_hndl;
uvec2 vpage_index_data;
int vpage_level;
vec3 vpage_coord;
vec3 octree_node_pos;
vec3 octree_nodes_per_level;
}; // struct ray_cast_trav_info
void ray_cast_octree_traverse(in vtexture3D vtex, // virtual texture descriptor struct
in vec3 vtex_coord, // virtual texture coordinate
in float target_lod,
out ray_cast_trav_info trav_info)
{
[...]
}
Now, when sampling the current volume brick, I expected I could use the decoded vpage sampler I got from the texture buffer in the following way:
vec4
texture_page(in ray_cast_trav_info rc_tinfo,
in vec3 page_tc)
{
return texture(rc_tinfo.vpage, page_tc);
}
Problem: Simply put, the temporarily stored sampler does not work; I always get vec4(0.0) back from this lookup.
So I checked where the problems start and found that the uint64 handle was OK, so I additionally stored this handle and used it during the lookup:
vec4
texture_page(in ray_cast_trav_info rc_tinfo,
in vec3 page_tc)
{
return texture(sampler3D(rc_tinfo.vpage_hndl), page_tc);
}
Results: This works, but it is slower than my current texture atlas approach by a huge factor (3x to 6x slower in my experiments).
I used 512 MiB worth of small 64³ volume page textures for my experiments (resulting in 2048 resident 3D textures). I ran all tests under Windows 7 x64 using a GeForce GTX 680 with the 301.32 driver.
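The page count follows directly from the memory budget, assuming one byte per voxel (8-bit scalar values as in the first experiment; that is an assumption for this experiment). A short C++ sketch of the arithmetic, with illustrative function names, which also shows that the handle table itself is small:

```cpp
#include <cstdint>

// Number of cubic page textures that fit into a memory budget.
// bytes_per_voxel = 1 is an assumption (8-bit scalar data).
constexpr std::uint64_t page_count(std::uint64_t budget_bytes,
                                   std::uint64_t page_edge,
                                   std::uint64_t bytes_per_voxel)
{
    return budget_bytes / (page_edge * page_edge * page_edge * bytes_per_voxel);
}

// Size of a table holding one 64-bit handle per page texture.
constexpr std::uint64_t handle_table_bytes(std::uint64_t pages)
{
    return pages * sizeof(std::uint64_t);
}

// 512 MiB of 64x64x64 pages at 1 byte per voxel -> 2048 textures.
static_assert(page_count(512ull * 1024 * 1024, 64, 1) == 2048,
              "512 MiB / 256 KiB per page");
// 2048 handles of 8 bytes each -> 16 KiB.
static_assert(handle_table_bytes(2048) == 16 * 1024,
              "2048 * 8 bytes");
```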
I know that this functionality is pretty new and the drivers surely need to mature. I was surprised how far I got with my experiments, and I hope the performance situation can be improved drastically, because I see this way of handling virtual volume textures overcoming a lot of our problems with larger 3D texture atlases.
Regards
-chris