
new Relief Mapping shader (better Parallax Mapping?!?)



fpo
09-08-2004, 02:51 PM
Hi... I just finished a new Relief Mapping shader that looks excellent!

It is similar to Parallax Mapping but uses a binary search to generate accurate per-fragment offsets, and it also supports self-shadowing and correct Z depth values.

Only a single quad is drawn, in a single pass. It uses just a color texture (RGB) and a relief texture (RGBA: normal map in RGB, depth map in the alpha channel).

The shader is big (around 200 assembler instructions) but runs fast on new video cards such as the GeForce6 (over 100 fps at full screen).
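
For readers skimming the thread, the search described in the paper boils down to something like the following Cg-style sketch (illustrative names and step counts; not the actual demo shader, which also handles self-shadowing and depth correction):

// Minimal sketch: march the view ray through the depth map in (u,v,depth) space,
// then refine the hit with a binary search. Depth is assumed to live in the relief
// texture's alpha channel, 0 at the surface and 1 at the bottom.
float ray_intersect_sketch(sampler2D relieftex, float2 uv, float2 offset_per_unit_depth)
{
    const int linear_steps = 16;
    const int binary_steps = 6;
    float step = 1.0 / linear_steps;
    float depth = 0.0;

    // linear search: advance while the ray is still above the stored surface
    for (int i = 0; i < linear_steps; i++)
    {
        float d = tex2D(relieftex, uv + offset_per_unit_depth * depth).a;
        if (depth < d)
            depth += step;
    }

    // binary search: refine between the last sample outside and the first sample inside
    for (int i = 0; i < binary_steps; i++)
    {
        step *= 0.5;
        float d = tex2D(relieftex, uv + offset_per_unit_depth * depth).a;
        depth += (depth < d) ? step : -step;
    }
    return depth;   // final texture coordinate is uv + offset_per_unit_depth * depth
}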

Check out the demo at:
reliefmap2.zip (http://fabio.policarpo.nom.br/files/reliefmap2.zip)

Also the updated documentation including details on the new corrected Z option at:
ReliefMapping.pdf (http://fabio.policarpo.nom.br/docs/ReliefMapping.pdf)
ReliefMapping_I3D2005.pdf (http://fabio.policarpo.nom.br/docs/ReliefMapping_I3D2005.pdf)

Now I'd like to support curved surfaces, but my first thoughts on the subject did not turn up a simple implementation.

The demo uses Cg and OpenGL and needs the latest NVIDIA drivers (61.77). It will not run on ATI cards, as the fragment shader is too long.

I would also like to compare this technique with the original Parallax Mapping using the same texture maps. Does anyone here have a Parallax Mapping implementation to compare it to?

Thanks for any feedback on the new shader...
FPO

MZ
09-08-2004, 03:48 PM
Doesn't work here; I have a GF 5200 and 61.77 drivers. All I see is a black background and a white wireframe square, which moves as I drag the mouse. Strangely, the Wireframe option is enabled by default, but changing it doesn't help. I tried experimenting with the settings, then loading the relief and color maps manually - still no change. (BTW, it is annoying that the open dialog starts at the file system root rather than the dir where the app started.) The CPU doesn't seem very busy.

dorbie
09-08-2004, 03:59 PM
Very cool. It works very well on a 6800 for me. I remember there was some discussion about iterating the depth offset in the original parallax mapping thread.

FWIW I'd rather you stuck with a similar name like iterative parallax mapping or refined parallax mapping or some other appropriate variation. It seems like there's enough in common with the original algorithm to stick with a similar name.

In addition, the displacement in parallax mapping along the view vector in tangent space is not entirely arbitrary; your paper should state that it is scaled by the tangent of the angle of incidence (first sentence in section "2. How it works"). There is an arbitrary factor in there, but IMHO that is a height scale factor. Obviously you know this, but I think it is worth clarifying when you're publishing it like this.
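
For reference, the offset dorbie is talking about amounts to something like this generic single-step parallax sketch (illustrative names, not code from the paper or the demo):

// Classic parallax offset: the shift is h * tan(angle of incidence), since
// |viewTS.xy| / viewTS.z is exactly that tangent for a normalized tangent-space view vector.
float2 parallax_offset(sampler2D heightmap, float2 uv, float3 viewTS, float scale, float bias)
{
    float h = tex2D(heightmap, uv).a * scale + bias;   // height sample with the usual scale/bias controls
    return uv + h * viewTS.xy / viewTS.z;              // shifted coordinate for the colour/normal fetch
}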

I'm left wondering how well it will work on curved surfaces. I don't think you should restrict yourself to planar solutions; rather, treat pixels on a curved tangent space (i.e. interpolated vectors from different per-vertex coordinate frames) with the planar intersection solution at each pixel and see what is produced. It will be an approximation, but probably a better one than parallax mapping. Have you tried this? How well did it work?

Korval
09-08-2004, 04:20 PM
I'm not sure that the term "mapping" should even apply to this technique. Not that the technique isn't good (though quite expensive), but it's just that it isn't really texturing (though textures are being accessed). Instead, it's, effectively, raytracing a heightmap. It doesn't really seem very similar to the parallax technique, if it is doing what I think it is doing.

dorbie
09-08-2004, 04:36 PM
Parallax mapping samples the height and fetches an offset RGB. This samples the height and uses the result to sample the height again, etc., until it uses the final result to fetch the RGB. It's kind of an iterative parallax indirect fetch. The actual function, named "ray_intersect", is in fact an iterative search through multiple offset-offset fetches (yes, "offset offset" is not a typo).

I still think there's nothing stopping this working well in curved tangent space although silhouettes are still the Achilles' heel.

SirKnight
09-08-2004, 05:28 PM
but runs fast on new video cards such as the GeForce6 (over 100 fps at full screen).
What resolution and quality settings? On my 6800 GT I'm getting 60 fps fullscreen. My settings are: 1024x768, AA at 2xQ, anisotropic at 8x, image settings at Quality, vsync off, and so on.

I really like this technique though.

-SirKnight

Pentagram
09-09-2004, 04:36 AM
In windowed mode it runs at about 106 fps for me.
Full screen it's 145 (dunno what res; it doesn't ask me anything, but it looks like 640x480).
(On a GeForce 6800 GT, that is.)

fpo
09-09-2004, 04:57 AM
It is difficult to name, as it is somewhat like Parallax Mapping but also different, and similar to ray-tracing the depth map, or to Relief Texture Mapping.

Anyway, it looks nice on some objects if the feature details are not too small (smaller than the linear step size). I think it would look nice on walls and panels like the ones made for Doom3 (or like my tile1 sample).

And I also have a GeForce 6800 GT. The fullscreen mode I talked about was 640x480 (use Alt+Enter in my demo, not maximize, for game-like full-screen mode).

SirKnight
09-09-2004, 04:59 AM
I looked at the ini file and the default is 640x480. So when I run at that res instead of 1024 I get the same performance as Pentagram.

-SirKnight

fpo
09-09-2004, 05:06 AM
Great! So... have you checked the other maps included, like relief.rm?

I want to make a nice room now with all the geometry using it, to see if several relief maps together still run fast. Turning mip-mapping on makes it faster when using larger tile sets (use the +/- keys to select the tiling mode).

The F4 key switches to observer mode, where you can fly anywhere in the scene with the arrow keys and S/X.

I have an ATI 9800, but it will not run this shader (too long, too many texture reads).
Does anyone know if the new ATI X800 video card can do larger programs or more texture reads? Or is it just a faster ATI 9800 with the same shader complexity support?

Zeross
09-09-2004, 05:51 AM
Originally posted by fpo:
Does anyone know if the new ATI X800 video card can do larger programs or more texture reads? Or is it just a faster ATI 9800 with the same shader complexity support?
Yes, it can do 512 instructions of any type (texture or ALU), but the limitation of only 4 levels of texture indirection remains.

fpo
09-09-2004, 06:57 AM
OK, I posted a new version now that can switch between the ARBFP1 and NVFP30 profiles. I think someone with an ATI X800 card might be able to run it with the ARBFP1 option now.

If someone here can run this on a new ATI X800 card, please post the results here.

flamz
09-09-2004, 07:33 AM
doesn't work on my FireGL X2.
probably too many texture indirections.

dorbie
09-09-2004, 08:40 AM
fpo, you still haven't said why you limit yourself to planar faces; I think you're selling your technique short. Sending the interpolated tangent-space vectors to the fragment program should just work after some (optional?) normalization.

fpo
09-09-2004, 08:57 AM
Hi dorbie... the problem with tangent space is that on a curved surface, the ray used to intersect the depth map should be curved in order to get the correct outlines at the silhouette.

Using only tangent space (and no surface curvature information), when rendering a fragment we would be treating the surface as flat at that fragment.

And in my shader I use the uv mapping information to project any point in object space into the texture map... but I think this could be solved with some approximation of the mapping function.

I will try it over the weekend to see if I can get somewhere with it. I have another displace map demo that uses tangent space and maybe I can use the same approach there with this new shader.

dorbie
09-09-2004, 09:43 AM
I understand this, but I don't think it is a problem. Having a true curved-ray treatment would cause more problems than it solves when you consider that the curvature would change across facets and a fragment cannot track this; it would therefore lead to discontinuities between facets. The straight-ray approximation after surface contact is still a better solution than previous approaches, and may even be a necessary approximation on a surface of varying curvature.

You still have silhouette issues (offset clamping has been proposed in the past), but other solutions have these issues.

fpo
09-09-2004, 10:27 AM
Thanks, I will try that then... it would be excellent to see those relief maps applied to any geometry.

We had a long holiday (Monday and Tuesday) here in Brazil and I have accumulated some work I need to finish for tomorrow. I will try to implement the tangent space version as soon as possible (maybe over the weekend). If I need any help I will post something here...

Thanks for the support dorbie... I see you are from San Diego... I was in LA a few weeks ago... had a nice time there.

knackered
09-09-2004, 12:39 PM
Wow wow wow. I've seen it all now. One quad. Stunning.

SirKnight
09-09-2004, 01:31 PM
Playing around with this demo a bit more I found some spots where some artifacts pop up. Here is a screenshot: http://sirknighttg.tripod.com/artifacts.jpg

Also I noticed that this technique looks quite good from a distance, but once you get close to the surface things start to look really bad. Not only do you get those black spots shown above, but the lighting also looks streaky, with jaggy black areas, pixelation, the specular highlight warping all around, etc. I'm sure this can all be resolved, but currently it has many problems up close.

EDIT: BTW, for that link to work you have to copy it and paste into the address bar.

-SirKnight

fpo
09-09-2004, 01:56 PM
Yes... there are some artifacts if you move too close and with the view angle almost parallel to the polygon. But if you compare that close view to the normal map version, it will still look better.

Also, there is a double precision option in the demo that will make things look much better in close-ups (Ctrl+P I think), but it doubles the texture reads.

It is not a perfect solution, but in many cases it looks better than the same thing with only normal mapping, even with the artifacts. If you make panels with a reduced depth range, things will look even better (very deep objects are a problem).

You can also make higher resolution maps (included samples are only 512x512).

-NiCo-
09-09-2004, 01:57 PM
Bumpmapping, parallax mapping, etc... all have one thing in common. They only require a simple quad as input to the vertex units. The impression of relief is created by the fragment program.

To get some kind of speed comparison, I made a program that creates a vertex buffer object with associated index buffer from a 512x512 depthmap.

On a GF6800 GT this textured mesh is drawn at ~180 fps (1600x1200). I have not included lighting in the fragment program yet. The vertex program seems to be the bottleneck.

The relief mapping techniques do have the advantage that they require a smaller amount of memory on the graphics card.

Nice demo! But like SirKnight mentioned, there are some artifacts, though I believe most of them are inherent to relief mapping techniques.

Nico

SirKnight
09-09-2004, 02:16 PM
Using double precision did not help for me. I tried all combinations of settings and the same artifacts were still present. But this is a good start none the less.

-SirKnight

SirKnight
09-09-2004, 03:17 PM
I just tried out your newest version and the performance is a bit lower. Even using NVFP30 which did give a boost compared to ARBFP1. Also I can't select NVFP40 mode even though I have an NV40 card.

-SirKnight

fpo
09-09-2004, 03:36 PM
I have disabled the NV40 option in that post as I was only testing the arbfp1/nvfp30 thing there. I would still like to know if it can run on an ATI X800 card...

I have an NV40 version that exits the loop early at the first intersection (which should save several texture reads). But it looks like the overhead of using the true loop instruction is bigger than the saved reads. Also, if one fragment exits early, it still has to wait for all the other fragments running in parallel, so you always pay the time of the slowest fragment in the batch.

I will enable the NV40 option again now... it runs slower than the version using fully unrolled loops and conditional stores from ifs, but it is good for testing PS 3.0 features in Cg.

fpo
09-09-2004, 04:07 PM
OK, the NV40-enabled version is uploaded; same URL for the zip file as before.

But selecting the NV40 option will make things slower. I'm using the following compiler options:

-profile fp40 -DRM_NV40 -ifcvt none -unroll none

dorbie
09-09-2004, 04:23 PM
Extra precision helped a lot on my system with the edge definition of the texture-implied geometry.

lgrosshennig
09-10-2004, 05:55 AM
That's some nice stuff, fpo; I'll enjoy playing with it.

I get 28 fps on a 5950 Ultra using the NV30 path and 14 fps using the ARB path.

Nice work.

EDIT: I just realized I had v-sync enabled; without it, it's 37 fps and 20 fps with default settings.

Sunray
09-10-2004, 07:21 AM
Awesome. It would be nice if you could toggle between Parallax and Relief in the demo.

Sunray
09-10-2004, 02:54 PM
I added Parallax Mapping. Replace the shader with this one (http://sunray.cplusplus.se/dump/cg_relief_map.cg), choose standard bump mapping and enable shadows (enabling shadows will enable parallax mapping :) ).

fpo
09-10-2004, 05:33 PM
Great, Sunray... loved your parallax mapping addition. We can now compare bump, parallax and relief. Excellent work!

I have incorporated your parallax mapping code into my demo and added a new menu option for it. I also made all source code available this time so you can just load the project in VC++ and re-compile the executable (all lib/h files needed for the build should be there now).

Make sure to read the included PDF file that explains how it works and why we have a few artifacts at the edges depending on depth map detail.

Corrail
09-11-2004, 05:41 AM
First of all: Very nice Demo, fpo!

But these artifacts when viewing almost parallel to the surface are really ugly. Why not vary the linear search steps depending on the angle between the view vector and the surface? Parallax mapping doesn't look as good as relief mapping, but it doesn't have artifacts like this.

fpo
09-11-2004, 06:52 AM
Thanks Corrail, having a different number of linear search steps based on the angle with the relief surface is a good idea.

But we cannot have a variable loop count for every fragment (not even with PS 3.0, I think).

With PS 2.0 the number of steps must be known at compile time, so we would need to recompile the shader every time the view angle changes.

With PS 3.0 we can have the number of loop steps passed in as a parameter (constant for all fragments of a polygon). But the best way would be to compute the number of loop steps per fragment depending on the fragment angle (not the polygon angle)... but this is not possible yet.

It is difficult to produce the correct depth with high quality and speed. This is the best I could come up with (I tried other forms of relief maps/displacement maps, but all were much slower and had much worse artifacts).

Corrail
09-11-2004, 08:03 AM
When you use PS3.0, the maximum number of passes of a loop is determined per draw call (thanks to Demirug @ 3dcenter.de for this info). So for your project you can use a for loop with a fixed number of passes. Depending on the angle between the view vector and the surface, you can use break to jump out of the loop earlier.
So this is possible using PS3.0. IMHO you can do that with PS2.0 too.
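
A minimal sketch of what such a loop could look like under PS3.0/fp40 (illustrative names; it assumes the depth lives in the relief texture's alpha channel, as in the demo):

// Fixed compile-time bound, early break: the effective number of iterations is controlled
// by step_size, which can be set per draw call (e.g. from the angle between view and surface).
float linear_search_ps30(sampler2D rmtex, float2 dp, float2 ds, float step_size)
{
    const int MAX_STEPS = 64;          // fixed upper bound known at compile time
    float depth = 0.0;
    for (int i = 0; i < MAX_STEPS; i++)
    {
        depth += step_size;
        if (depth >= tex2D(rmtex, dp + ds * depth).a)
            break;                     // first sample inside the surface found, stop early
    }
    return depth;
}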

WarnK
09-11-2004, 12:30 PM
Running this on my X800 Pro (with Cats 4.8) I get a blank black square (at 990 fps, too) running the relief arbfp1 or nvfp30 path. Normal and parallax work fine.

sk
09-11-2004, 08:57 PM
Hi Fabio. Congratulations on the ShaderTech compo!

I recently experimented in this area having come across the following Siggraph sketch a little while before the conference:

Displacement Mapping with Ray-casting in Hardware (http://www.cs.ualberta.ca/~keith/siggraph_sketch/sketch.pdf)

This describes an improvement upon naive heightfield ray marching. As I couldn't get the accompanying demo to run - in the same directory as the pdf on last inspection - I thought I'd implement it myself and try out some other ideas for extending basic parallax mapping.

One thing to bear in mind with binary searching is that you won't necessarily zone in on the first hit. How critical this is depends, of course, on the height field in question, and it may be possible to work around - I've not had time to look at your implementation to see if you're doing anything clever in that respect. In practice I didn't find it to be much of a problem in my limited testing; in fact the results were aesthetically better than incremental marching (with more steps) for the test map I was using.

With regard to artefacts at shallow view angles, I found that simply scaling the displacement based on V.N was enough. Of course this isn't at all correct physically but it again looked good and was cheap to add -- the sort of 'solution' (read: hack) game devs like :) .
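
As a rough sketch of that hack (illustrative names, assuming a normalized tangent-space view vector; not sk's actual code):

// Shrink the effective depth range as the view grazes the surface. Not physically
// correct, but it suppresses the worst shallow-angle artefacts cheaply.
float scaled_depth_range(float3 viewTS, float depth_range)
{
    float VdotN = saturate(dot(viewTS, float3(0.0, 0.0, 1.0)));   // N is +Z in tangent space
    return depth_range * VdotN;   // near zero at grazing angles, full depth when viewed head-on
}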

This could work well in conjunction with varying the number of samples in the same way. It may also be possible to vary the sampling per-pixel with ps3.0 hardware as that model supports a loop break, although I believe you're then restricted to using manual lod texture lookup. If it's indeed implementable it probably won't be fast and incoherency could also affect the speed.

Finally, concerning curved surfaces, you could look at the following paper if you haven't already, which discusses that situation and a lot more besides:

Generalized Displacement Mapping (http://research.microsoft.com/users/xtong/xtong.html)

fpo
09-12-2004, 04:09 AM
Thanks WarnK for testing it on the X800... I thought it might run. Maybe too many texture reads and too long a shader. Thanks anyway.

Hi SK... thanks for the excellent links and suggestions. I was also at SIGGRAPH this year but did not see that sketch.

Here is another interesting pixel-based displacement mapping approach that I found on the web... a different approach to curved surfaces:
http://www.gris.uni-tuebingen.de/publics/paper/Hirche-2004-Hardware.pdf

The Generalized Displacement Mapping paper looks awesome (I would like to see a running demo of it) but it can only be applied to a small tiled surface. In Parallax and Relief Mapping we can use much larger textures with much more varied detail in them.

A binary search alone is good, but it will be wrong in cases where the ray intersects more than one point in the depth map (see fig 7 in my pdf file). If there are several intersections with the map for a given ray, the binary search can return any of them... so we must first find a point inside the object close to the first intersection, and then search around that region for the closest intersection.

That is why we have the artifacts... if some section of the depth map at an intersection is smaller (thinner) than the linear search step size, we might miss it and end up with the next intersection instead of the first. That gives the cut-off artifact where we see through the first intersection into the back of the object, as shown here by SirKnight.

sk
09-12-2004, 04:53 AM
Originally posted by fpo:
Here is another interesting pixel based displace map that I found on the web... different approach to curved surfaces:
http://www.gris.uni-tuebingen.de/publics/paper/Hirche-2004-Hardware.pdf
Ah yes, I tried to dig up that paper for my previous post but I'd forgotten the conference and title! I haven't actually read it in full yet either, so I probably would have ended up mischaracterising it.


Originally posted by fpo:
The Generalized Displacement Mapping paper looks awesome (I would like to see a running demo of it) but it can only be applied to a small tiled surface. In Parallax and Relief Mapping we can use much larger textures with much more varied detail in them.
Indeed, it would be great to see it in action, although I agree that the limited texture size makes it impractical for a lot of applications. There may be some mileage with texture synthesis in the future, as the authors suggest, and it's an interesting paper nonetheless.


Originally posted by fpo:
A binary search alone is good but will be wrong in cases where the ray intersects more than one point in depth map (see fig 7 in my pdf file). If there are several intersections with map for a given ray, the binary search can return any of them... so we must find a local (closest) point inside object and then search for a intersection around that region (for the closest intersection).
That's exactly the problem I was describing (or trying to) before, although rather tersely. I'll take a look at the pdf.

The idea of performing an initial march followed by a binary search within the intersection region did cross my mind as a combined solution a while ago - it's a logical follow on - but I didn't get the chance to try it out and I was concerned about the potential instruction count anyway.


Originally posted by fpo:
That is why we have the artifacts... if some section of the depth map at an intersection is smaller (thinner) than the linear search step size, we might miss it and end up with the next intersection instead of the first. That gives the cut-off artifact where we see through the first intersection into the back of the object, as shown here by SirKnight.
Good description.

pro_optimizer
09-13-2004, 05:52 AM
Hi I'm new to these forums.

Your demo is awesome, fpo.
I was really speechless when I first read your paper, this is the way to go in the future!
(excuse my enthusiasm)

I experimented a bit with the shader yesterday in order to get rid of the artifacts and came up with the following solution:

The revised shader now walks with a constant step size in the texture plane instead of along the z-axis. This way there are almost no visible artifacts at small viewing angles, but on the other hand there are a lot more texture reads.
But because there are fewer pixels to draw at small angles, the fps stays more or less constant with varying view direction.
Unfortunately the new version is a lot slower (probably because of the high branch penalty on NV40 and the mass of texture lookups): about 50 fps fullscreen on a GeForce 6800 GT.

Here is the important part of the code:

...
// RAY INTERSECT DEPTH MAP WITH BINARY SEARCH
// RETURNS INTERSECTION DEPTH OR 1.0 ON MISS
float ray_intersect_rm(
in sampler2D rmtex,
in float2 dp,
in float2 ds,
in float dotprod)
{
#ifdef RM_NV40

// *** NV 40 path ***
float depth_step= max(dotprod*0.08, 0.005);
const int binary_search_steps=5;

// current size of search window
float size=depth_step;
// current depth position
float depth=0.0;
// best match found (starts with last position 1.0)
float best_depth=1.0;


// search front to back for first point inside object
while (depth <= 1.0)
{
depth+=size;
float4 t = f4tex2D(rmtex,dp+ds*depth);
if (depth>=t.w)
#ifdef RM_DOUBLEDEPTH
if (depth<=t.z)
#endif
break;
}

#else

// *** Non - NV40 path ***
...
#endif



// recurse around first point (depth) for closest match
for( int i=0;i<binary_search_steps;i++ )
{
size*=0.5;
float4 t=f4tex2D(rmtex,dp+ds*depth);
if (depth>=t.w)
#ifdef RM_DOUBLEDEPTH
if (depth<=t.z)
#endif
{
best_depth=depth;
depth-=2*size;
}
depth+=size;
}

return best_depth;
}
...
The 4th parameter of the function is the dot product between the entry direction and the z-axis (dot(axis_z.xyz,v) for the first call in main_frag_rm() and dot(axis_z.xyz,l) for the shadow-tracing call).

The factors in the depth_step calculation are a speed/quality tradeoff and the max() avoids exceeding the pixel shader limits.

If we get this raytracing fast, it could become a real alternative to vertex displacement mapping.

fpo
09-13-2004, 08:15 AM
Good work pro_optimizer!
I see you make the linear search step size variable depending on the dot product of the view angle and the polygon normal.

The only problem is that this only works with the FP40 profile, and it is still much slower than FP30 when using true loops/jumps. Maybe for NV50...

pro_optimizer
09-13-2004, 09:15 AM
Thanks, fpo.
I have an idea for how you can optimize the raymarching through the heightmap.

Currently it is kind of comparable to a blind flight: if you do not want to miss the surface, you must walk with a small step size.
But it can be made more geometry-aware by providing additional information in a second texture. I am thinking of filtering the height map with a variable-size maximization filter and storing the extended height plus the filter kernel size per texel.
This way you can walk with xy_stepsize = filter kernel size (which can be rather large in most cases), as long as you stay above the filtered height. When you are below it, you can continue with a smaller step size, sampling only from the heightmap channel.
Alternatively, one might provide fixed-size filtered heightmaps in RGB (like 50-, 20- and 10-texel radii) and the true height in alpha.
This way one can switch between 4 different walking speeds depending on the current ray height.
This will reduce the number of steps taken considerably while keeping the same detail at the edges, at the cost of some more calculation overhead.

By the way, in the current implementation it is obviously not possible to view the heightmap from the side, and one might think that one cannot define a heightmap for, say, the side polygons of a box surrounding a 3D landscape.
But in fact the heightmap would be the same; only the texture coordinates must include the z-coordinate in the heightfield (effectively expressing the 3D position of the vertex in heightmap space). This way the raytracer would not start at depth=0.0 and walk until depth=1.0, but might instead start at a depth value defined by the current texcoord and walk until it leaves the heightmap or the depth range 0.0 to 1.0.

Unfortunately, I cannot build the program because it is missing the paralelo3d.h file. Is it available on the net?

Edit: Ignore the previous line, I have now downloaded the version with the library.

fpo
09-13-2004, 12:38 PM
The version I posted with all the source code does not have the FP40 option (I removed it as it was getting slower than the FP30 option). But as your new ideas require the FP40 option, I will put it back and repost the demo later tonight.

davepermen
09-14-2004, 04:49 AM
Yeah, use two colour channels for the min and max height, and recalculate that for all LODs (store it in the mipmaps, for example). That way, you can quickly see whether you can get a hit at all over big regions.

It would be like a quadtree then.

pro_optimizer
09-15-2004, 07:07 AM
I have experimented a bit with the NV40 path and got it about 2.5 times as fast as the last version I posted, still without the optimizations I proposed last time. It is now about 210 fps fullscreen on my GeForce 6800 GT compared to 230 fps for the fp30 path. But it does not support RM_DOUBLEDEPTH (code would become a mess) and RM_SHADOWS (had no time yet) at the moment. You should try it anyway just to see how nice it looks :-)

Since I have no webspace, I will post all of the changed code here (hopefully that is not bad post style). I renamed a lot of variables in the main fragment shader for myself so that I could understand it better (sorry!).
The only structural change is that ray_intersect_rm() now takes the 3D entry point and entry vector in heightmap space and returns the intersection point in heightmap space (because this makes a lot of things easier).




/////////////////
// RELIEF MAPPING

frag2screen main_frag_rm(
// interpolated fragment data
vert2frag IN,
// normalmap + heightmap
uniform sampler2D rmtex:TEXUNIT0, // rm texture map
// color map
uniform sampler2D colortex:TEXUNIT1, // color texture map
// these define heightmap space
uniform float4 axis_pos, // base vertex pos (xyz)
uniform float4 axis_x, // base x axis (xyz:normalized, w:length)
uniform float4 axis_y, // base y axis (xyz:normalized, w:length)
uniform float4 axis_z, // base z axis (xyz:normalized, w:length)
// these are in object space (hopefully)
uniform float4 camerapos, // camera position (xyz)
uniform float4 lightpos, // lightposition (xyz)
// material factor
uniform float4 specular, // specular color (xyz:rgb, w:exponent)
// for depth correction
uniform float4 modelviewprojz, // 3rd column from modelview projection matrix
uniform float4 planes) // near and far plane distances (near,far,near*far,1/(far-near))
{
frag2screen OUT;

// entry position (_Obj is object space, _Hgt is heightmap space)
float4 entryPos_Obj;
float3 entryPos_Hgt;
// entry vector (+normalization)
float3 entryVec_Obj, entryVec_Hgt;
// traced position
float4 tracePos_Obj, tracePos_Hgt;
// light vector
float3 lightVec_Obj;
// remove us
float d,dl;


// *** Ray Intersection ***

// calculate entry position
entryPos_Obj = IN.opos;
entryPos_Hgt = project_uvw(entryPos_Obj.xyz - axis_pos.xyz,axis_x,axis_y,axis_z);

// calculate entry vector
entryVec_Obj = normalize(IN.opos.xyz - camerapos.xyz);
entryVec_Hgt = project_uvw(entryVec_Obj,axis_x,axis_y,axis_z);


// perform raytracing
tracePos_Hgt = ray_intersect_rm(rmtex,entryPos_Hgt,entryVec_Hgt);
tracePos_Obj = tracePos_Hgt.x*axis_x + tracePos_Hgt.y*axis_y + tracePos_Hgt.z*axis_z + axis_pos;

// have we hit the texture?
if (tracePos_Hgt.w > 0)
{
// *** Specular Normal Mapping ***

// get rm and color texture points
float4 normal = f4tex2D(rmtex,tracePos_Hgt.xy);
float3 color = IN.color.xyz*f3tex2D(colortex,tracePos_Hgt.xy);

// expand normal from normal map in local polygon space
normal.xy = normal.xy*2.0 - 1.0;
normal.z = sqrt(1.0 - dot(normal.xy,normal.xy));
normal.xyz = normalize(normal.x*axis_x.xyz + normal.y*axis_y.xyz - normal.z*axis_z.xyz);

// compute diffuse and specular terms
lightVec_Obj = normalize(tracePos_Obj.xyz - lightpos.xyz);
float diff = saturate(dot(-lightVec_Obj,normal.xyz));
float spec = saturate(dot(normalize(-lightVec_Obj - entryVec_Obj),normal.xyz));

// compute final color
OUT.color.xyz = color*diff + specular.xyz*pow(spec,specular.w);
OUT.color.w = 1;

#ifdef RM_DEPTHCORRECT
// *** Depth Correction ***
tracePos_Obj.w = 1;
float depth = dot(-tracePos_Obj,modelviewprojz);
OUT.depth = (planes.z/depth + planes.y)*planes.w;
#endif
} else
OUT.color.w = 0;

return OUT;
}




// RAY INTERSECT DEPTH MAP WITH BINARY SEARCH
// RETURNS INTERSECTION POINT OR (0,0,0,0) ON MISS
float4 ray_intersect_rm(
in sampler2D rmtex,
in float3 entryPos_Hgt,
in float3 castVec_Hgt)
{
float4 res = float4(0, 0, 0, 0);

#ifdef RM_NV40

// *** NV 40 path ***
// currently no support for RM_DOUBLEDEPTH

const float linear_cast_radius = 0.005; // in uv plane (e.g. 0.1 = 10 steps to cross the whole texture)
const int binary_search_steps = 5;

float4 t;

// renormalize cast vector (to uv-length = linear_cast_radius, but smaller if vector would cross the ground plane)
castVec_Hgt *= 1.0/castVec_Hgt.z;
castVec_Hgt *= min(1.0,rsqrt(dot(castVec_Hgt.xy, castVec_Hgt.xy))*linear_cast_radius);


// [ perform linear search ]
do
{
entryPos_Hgt += castVec_Hgt;
t = f4tex2D(rmtex, entryPos_Hgt.xy);
} while (entryPos_Hgt.z < t.w);


// [ perform binary search ]
for(int i=0;i<binary_search_steps;i++)
{
castVec_Hgt *= 0.5;
entryPos_Hgt += castVec_Hgt*sign(t.w - entryPos_Hgt.z);
t = f4tex2D(rmtex, entryPos_Hgt.xy);
}


// store output
if (t.w <= 0.996)
{
res.xyz = entryPos_Hgt;
res.w = 1;
}
#else

// left it out because of the changed parameters
res.xyz = entryPos_Hgt;
res.w = 1;

#endif

return res;
}

// PROJECT A 3D POINT INTO HEIGHT MAP
float3 project_uvw(in float3 p,in float4 u,in float4 v, in float4 w)
{
return float3(dot(p,u.xyz)/u.w, dot(p,v.xyz)/v.w, dot(p,w.xyz)/w.w);
}

In order to make it work, you must replace the original functions and change the appropriate function declarations.

pro_optimizer
09-15-2004, 07:34 AM
Oh, and...

The layered speed optimization could look like the following pseudocode. It assumes that the filtered map is in the blue channel and that the filter size is fixed. Unfortunately my Visual Studio is completely messed up atm, so I cannot build anything at all (-> no program for doing the filtering from me).



// *** NV 40 path ***
// currently no support for RM_DOUBLEDEPTH
const float linear_cast_radius = 0.005; // in uv plane
const float linear_filter_radius = 0.1; // e.g. 0.1 for a 512x512 map would be 51 pixels radius, same unit as linear_cast_radius
const int binary_search_steps = 5;

float4 t;

// renormalize cast vector
castVec_Hgt *= 1.0/castVec_Hgt.z;
castVec_Slow = castVec_Hgt * min(1.0,rsqrt(dot(castVec_Hgt.xy, castVec_Hgt.xy))*linear_cast_radius);
castVec_Fast = castVec_Hgt * min(1.0,rsqrt(dot(castVec_Hgt.xy, castVec_Hgt.xy))*linear_filter_radius);

// [ perform layered linear search ]
do
{
// walk fast (over the plateaus)
do
{
entryPos_Hgt += castVec_Fast;
t = f4tex2D(rmtex, entryPos_Hgt.xy);
} while (entryPos_Hgt.z < t.z);

// hit plateau -> recover to the last safe position
entryPos_Hgt += castVec_Hgt*(t.z-entryPos_Hgt.z);

// OptMe: eliminate binary search if (t.z = t.w) here
// (will be very cool for larger flat areas)

// walk slow (through heightmap)
do
{
entryPos_Hgt += castVec_Slow;
t = f4tex2D(rmtex, entryPos_Hgt.xy);
} while (((entryPos_Hgt.z - t.w) * (entryPos_Hgt.z - t.z)) < 0); // while both comps have different results

} while (entryPos_Hgt.z < t.w);

// [ perform binary search ]
...


The OptMe optimization would make it faster (at least 25%, I guess) than fp30 when looking directly onto the surface, while the layering optimization would make it faster when viewing at sharp angles (it could be about the same performance as fp30 then, but without the artifacts).

fpo
09-15-2004, 04:24 PM
Thanks for your updates pro_optimizer... I've been too busy at work this week and did not have any time to check your new ideas yet.
I will try it out over the weekend together with some new ideas for using it with curved surfaces (generic meshes like a teapot for example).
I will post something here as soon as I have some good results...

pro_optimizer
09-19-2004, 05:56 AM
OK, I've got my VS working again and tested all the mentioned stuff, but I must admit that it does NOT cut it. The two additional loops just create too much overhead, so it is even a bit slower than before. I did not expect the conditional statements to be that slow... But having realized this now, I did something totally different which gives up to a 100% speed gain, especially in the dreaded sharp-angle cases. Simply unroll the linear search loop by some factor and use the conditional operator instead:



// [ perform linear search ]
do
{
entryPos_Hgt += castVec_Hgt;
t = f4tex2D(rmtex, entryPos_Hgt.xy);
entryPos_Hgt += (entryPos_Hgt.z >= t.w?0:castVec_Hgt);
t = f4tex2D(rmtex, entryPos_Hgt.xy);
entryPos_Hgt += (entryPos_Hgt.z >= t.w?0:castVec_Hgt);
t = f4tex2D(rmtex, entryPos_Hgt.xy);
entryPos_Hgt += (entryPos_Hgt.z >= t.w?0:castVec_Hgt);
t = f4tex2D(rmtex, entryPos_Hgt.xy);
entryPos_Hgt += (entryPos_Hgt.z >= t.w?0:castVec_Hgt);
t = f4tex2D(rmtex, entryPos_Hgt.xy);
} while (entryPos_Hgt.z < t.w); // keep marching while still above the height field
instead of:



// perform linear search
do
{
entryPos_Hgt += castVec_Hgt;
t = f4tex2D(rmtex, entryPos_Hgt.xy);
} while (entryPos_Hgt.z < t.w);
The same goes for the single if statement after the binary search.

And I already have another idea for making the step size adaptive (with minimal overhead this time); maybe I'll implement it later this evening...

Btw, has anyone tested this shader on a GeForce 6800 Ultra?

fpo
09-19-2004, 10:37 AM
Hi pro_optimizer... your suggestions are great and work well for the FP40 profile. Excellent work on the optimizations!

Much better fps now, and the best thing is that it solves all the artifacts found when looking at sharp angles, thanks to the adaptive precision level in the linear search based on the view angle and the normal.

I have integrated the code into my original demo... please re-download; it now includes an FP40 option in the render menu running that code.

We must implement support for self-shadows in the FP40 version now... the adaptive precision should also help on the shadow edges when the light is at sharp angles.

pro_optimizer
09-19-2004, 11:11 AM
Thanks, this is what I wanted to hear.

The shadowing code should be something like this:
(all you need to do is transform the light's incidence vector into heightmap space (project_uvw()) and basically subtract it from the traced position in heightmap space to determine the light entry position)



frag2screen main_frag_rm(
// interpolated fragment data
vert2frag IN,
// normalmap + heightmap
uniform sampler2D rmtex:TEXUNIT0, // rm texture map
// color map
uniform sampler2D colortex:TEXUNIT1, // color texture map
// these define heightmap space
uniform float4 axis_pos, // base vertex pos (xyz)
uniform float4 axis_x, // base x axis (xyz:normalized, w:length)
uniform float4 axis_y, // base y axis (xyz:normalized, w:length)
uniform float4 axis_z, // base z axis (xyz:normalized, w:length)
// these are in object space (hopefully)
uniform float4 camerapos, // camera position (xyz)
uniform float4 lightpos, // lightposition (xyz)
// material factor
uniform float4 specular, // specular color (xyz:rgb, w:exponent)
// for depth correction
uniform float4 modelviewprojz, // 3rd column from modelview projection matrix
uniform float4 planes) // near and far plane distances (near,far,near*far,1/(far-near))
{
frag2screen OUT;

// entry position
float4 entryPos_Obj;
float3 entryPos_Hgt;
// entry vector (+normalization)
float3 entryVec_Obj, entryVec_Hgt;
// traced position
float4 tracePos_Obj, tracePos_Hgt;
// light vector
float3 lightVec_Obj, lightVec_Hgt;
// light entry position
float3 lightPos_Hgt;

float shadow = 1.0;
const float ambient=0.2;
const float bias=0.03;

// *** Ray Intersection ***

// calculate entry position
entryPos_Obj = IN.opos;
entryPos_Hgt = project_uvw(entryPos_Obj.xyz - axis_pos.xyz,axis_x,axis_y,axis_z);

// calculate entry vector
entryVec_Obj = normalize(entryPos_Obj.xyz - camerapos.xyz);
entryVec_Hgt = project_uvw(entryVec_Obj,axis_x,axis_y,axis_z);

// perform raytracing
tracePos_Hgt = ray_intersect_rm(rmtex,entryPos_Hgt,entryVec_Hgt);
tracePos_Obj = tracePos_Hgt.x*axis_x + tracePos_Hgt.y*axis_y + tracePos_Hgt.z*axis_z + axis_pos;

// have we hit the texture?
if (tracePos_Hgt.w > 0)
{
// *** Specular Normal Mapping ***

// light vector
lightVec_Obj = normalize(tracePos_Obj.xyz - lightpos.xyz);

// compute diffuse color
float3 color = IN.color.xyz*f3tex2D(colortex,tracePos_Hgt.xy);

// load & expand from normal map
float4 normal = f4tex2D(rmtex,tracePos_Hgt.xy);
normal.xy = normal.xy*2.0 - 1.0;
normal.z = sqrt(1.0 - dot(normal.xy,normal.xy));
normal.xyz = normalize(normal.x*axis_x.xyz + normal.y*axis_y.xyz - normal.z*axis_z.xyz);


#ifdef RM_SHADOWS
// calculate light entry vector
lightVec_Hgt = project_uvw(lightVec_Obj, axis_x, axis_y, axis_z);

// calculate light entry pos
lightPos_Hgt = tracePos_Hgt.xyz - lightVec_Hgt*tracePos_Hgt.z/lightVec_Hgt.z;

// perform shadow raytracing
float4 shadowHit = ray_intersect_rm(rmtex,lightPos_Hgt,lightVec_Hgt);
shadow = (shadowHit.z < tracePos_Hgt.z-bias)?0:1;

// compute diffuse and specular influence
float diff = shadow*saturate(dot(-lightVec_Obj,normal.xyz));
float temp = saturate(dot(normalize(-lightVec_Obj - entryVec_Obj),normal.xyz));
float3 spec = specular.xyz*pow(shadow*temp,specular.w);
#else
// compute diffuse and specular influence
float diff = saturate(dot(-lightVec_Obj,normal.xyz));
float temp = saturate(dot(normalize(-lightVec_Obj - entryVec_Obj),normal.xyz));
float3 spec = specular.xyz*pow(temp,specular.w);
#endif

// compute final color
OUT.color.xyz = color*diff + spec;
OUT.color.w = 1;

#ifdef RM_DEPTHCORRECT
// *** Depth Correction ***
tracePos_Obj.w = 1;
float depth = dot(-tracePos_Obj,modelviewprojz);
OUT.depth = (planes.z/depth + planes.y)*planes.w;
#endif
} else
OUT.color.w = 0;

return OUT;
}
It works and it has no jagged edges,
but there is a small flaw in it which I could not fix yet (I have no internet connection at home): a small shadowed hole sometimes appears where displacement = 0.
And the performance varies a lot...
This is because at sharp light incidence angles the raytracer has much more to do, while you still need to draw the same number of fragments (unlike in the view-angle case).
We should try to combine the depth-correct relief mapping with shadow mapping (if the precision is not too lousy), so we can generate soft shadows (with random jittering, you know).

fpo
09-19-2004, 12:50 PM
But you are not actually doing the adaptive linear walk step in that latest code.

Also, the image looks good because you use 0.005 as the linear search step (200 steps maximum). It would be better with a variable size based on the view direction, like you said before.

fpo
09-19-2004, 04:21 PM
OK pro_optimizer... the last FP40 code was not good (wrong lighting, and not that fast, as it was using a step size of 0.005 all the time!).

I posted a new version now with correct lighting and an adaptive number of steps in the linear search based on the ray angle with the normal.

Also, shadows are working OK now in FP40 mode. It looks like the artifacts are gone with FP40 now for both color and shadows. Please re-download the demo from the same URL as in the last post.

Tell me if you get better framerates now in FP40 mode, pro_optimizer.

mbue
09-20-2004, 01:28 PM
Hi FPO,

as far as I can see, your step size adaptation is essentially the same as in pro_optimizer's code. In his version, it was done implicitly in the renormalization section of ray_intersect_rm.

Since the length of castVec_Hgt is held constant only in two dimensions (the uv plane), at a small angle the vector becomes very short (approaching linear_cast_radius) --> up to 200 steps to the ground, while at a greater angle (~90 degrees) the length of the vector approaches 1.0 (thanks to the use of min) --> only one step to the ground.
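
Put another way, the renormalization implicitly sets the linear step count to roughly max(1, |castVec_Hgt.xy| / linear_cast_radius); a small helper makes that explicit (for illustration only, not part of the posted shader):

// How many linear steps the renormalized cast vector implies for a given ray.
// With linear_cast_radius = 0.005, a ray crossing the whole tile takes about 200 steps,
// while a ray looking straight down takes a single step.
float implied_linear_steps(float3 castVec_Hgt, float linear_cast_radius)
{
    castVec_Hgt /= castVec_Hgt.z;                    // scale so one unscaled step spans the full depth range
    float scale = min(1.0, linear_cast_radius / length(castVec_Hgt.xy));
    return 1.0 / scale;                              // = max(1, |xy| / linear_cast_radius)
}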

Very cool technique. Can we see more height maps or a new video?

Greetings, mbue

fpo
09-20-2004, 02:20 PM
Thanks for clearing things up, mbue. And sorry, pro_optimizer, for the unfounded comments; I could not follow your re-normalization code.

Here is a link to a 3dsmax 6 plugin that generates relief maps. This is the plugin used to create the sample files included in the demo.

pNormalMap6.dlu (http://www.paralelo.com.br/arquivos/pNormalMap6.dlu)

Just install it, run max and model your object in the XY plane. Then, from a top view, select all the objects that compose the relief map and click the render relief button in the utility panel. You can select the resolution and the anti-alias factor.

divide
09-21-2004, 02:56 AM
Hehe, I simply do this with a procedural gradient texture, set from black to white as the geometry gets closer to the point of view. Then I set auto-illumination to the max and render the scene.

(here's a screenshot using my own opengl displacement routine: http://www.divideconcept.net/d2k4/render/d2k4logo.jpg )

fpo
09-21-2004, 08:45 AM
Good video, divide! But the parallax mapping should not look that bad, I think. Good displacement on the cube anyway. But is that software rendering?

Please post a demo we could run and check out. I would like to see a full room all using pixel displacement maps like that, with some lights. I will get a friend to model something for me soon in that direction...

divide
09-21-2004, 09:21 AM
Actually parallax looks really bad when you push the depth a bit to enhance the displacement. All the parallax examples you usually see have very little depth (bricks, misc. patterns, etc...). To make a fair visual comparison between parallax mapping and my method, I had to give the same depth to each one. That's why parallax looks really bad in the comparison.

The first stage of my engine was software rendering; I wanted to build a prototype to see if such a thing was possible in realtime. Now I'm moving on to the hardware adaptation and beginning to get some nice results.
Don't worry, I'll post a demo when it is advanced enough... but for now I'm still optimizing the thing :)

pro_optimizer
09-21-2004, 12:43 PM
Thanks, mbue. Sorry if my comments on the renormalization section were a bit brief.
fpo, looking at your code, it seems as if entryVec_Hgt becomes infinitely small when it gets parallel to the texture plane. And castVec_Hgt cannot become greater than 0.1 (due to max(rayangle*0.1, 0.02)), which is unfortunate when you look directly onto the surface. But this is only theoretical, since I could not test it at home yet (I will do this tonight and tell you the results tomorrow).
Anyways, good that you got the shadows right, my code was admittedly a bit lame.

Nice work, divide! Your video is quite impressive. Do you use the same basic algorithm as fpo?
If yes, do you have some optimizations which we haven't found yet? After having tested two very different ways of accelerating it depending on the actual geometry (with preprocessing) with no success, I come to the conclusion that one cannot make it much faster than it is now :-/

Apart from this, there are a few things still to be solved. One is rendering polygons which show the heightmap from the side or from an arbitrary angle; for that we must detect whether the ray leaves the heightmap (or better, the range covered by the top polygon). This becomes hard when texture repeat is on and when we want to render only parts of the heightmap (I can already see this shader running on heightmapped characters ;-).
And second, maybe someone finds a way to do this on curved surfaces.

Last thing: The textures+heightmaps which are used in humus' well-known parallax mapping demo look incredibly real when rendered with relief mapping (especially the floor texture). You should have a look at them!

fpo
09-21-2004, 04:51 PM
Found the problem with the lighting in the FP40 profile. It was getting darker than all the other versions (bump, parallax, fp30).

The problem was in projecting back to object space after the ray intersection. We must multiply by the axis length (the w component), as all the ray tracing is done in normalized [0,1] texture and depth space.

So the correct way to do it is:

...
// perform raytracing
tracePos_Hgt = ray_intersect_rm(rmtex,entryPos_Hgt,entryVec_Hgt,rayangle);
tracePos_Obj = axis_pos +
tracePos_Hgt.x*axis_x*axis_x.w +
tracePos_Hgt.y*axis_y*axis_y.w +
tracePos_Hgt.z*axis_z*axis_z.w;
...
I have already updated the demo zip with this fix... just re-download.

fpo
09-21-2004, 06:51 PM
Yes pro_optimizer, the original parallax mapping rockwall sample looks great here (and it can also tile properly and generate nice shadows).

I had to invert the depth map (as in my shader I consider 0 to be at the top and 1 at the bottom). I pasted the normal and inverted depth maps together and used a depth range of 4% of the width. It looks excellent!

I just posted the demo again, including that new relief map. I hope there is no problem in using those images... enjoy!

divide
09-22-2004, 12:53 AM
Originally posted by pro_optimizer:
Nice work, divide! Your video is quite impressive. Do you use the same basic algorithm as fpo?
If yes, do you have some optimizations which we haven't found yet? After having tested two very different ways of accelerating it depending on the actual geometry (with preprocessing) with no success, I come to the conclusion that one cannot make it much faster than it is now :-/
I use a different approach to the problem, which uses some preprocessing to achieve the displacement. However, I'll tell more about it when it is fast enough to run at 20 fps fullscreen on my FX 5200.


Originally posted by pro_optimizer:

Apart from this there are a few things which are still to be solved. One thing is rendering polygons wich show the heightmap from the side or from an arbitrary angle, therefore we must detect if the ray leaves the heightmap (or better the range covered by the top polygon). This becomes hard when texture repeat is on and when we want to render only parts of the heightmap (I am seeing this shader already running on heightmapped characters ;-).
I also thought of these problems, and I'm going to tackle them after I have fully optimized the first step.
However, IMO rendering only part of the height map using a mask isn't a good answer. If we had to create a full head, I would rather think of displacing a cube, with each side using a different displacement map. There would be no border effect because of the answer to the first problem (stopping the ray after it exits the volume defined for each polygon).


Originally posted by pro_optimizer:

And second, maybe someone finds a way to do this on curved surfaces.
Yes, that's point number one for generalizing the use of displacement mapping...
I think I have an answer to this problem too, which is very close to the problem of stopping the ray outside of its displacement volume.
But no time yet to think about it ;)


Originally posted by pro_optimizer:

Last thing: The textures+heightmaps which are used in humus' well-known parallax mapping demo look incredibly real when rendered with relief mapping (especially the floor texture). You should have a look at them!
Yeah, I know that demo; I was really impressed when I saw it for the first time. Parallax is a nice trick for rendering 3D patterns!

pro_optimizer
09-24-2004, 07:57 AM
It was obviously a very good decision to generalize the raytracer so that it only takes the ray direction and a point to start from: what can you do with a heightmap and a raytracer, if not raytraced reflections?
This is probably the sickest shader you have seen in quite a while, but it is only a slight variation of the original version. It renders the relief map with a reflection depth of 2 (effectively doing 3 raytraces, +3 when you turn shadows on).

Here is the changed fragment shader:



const int bounces = 3;
#ifdef RM_DOUBLEPRECISION
const float view_res = 0.005;
const float shadow_res = 0.01;
#else
const float view_res = 0.0075;
const float shadow_res = 0.015;
#endif


/////////////////
// RELIEF MAPPING

frag2screen main_frag_rm(
// interpolated fragment data
vert2frag IN,
// normalmap + heightmap
uniform sampler2D rmtex:TEXUNIT0, // rm texture map
// color map
uniform sampler2D colortex:TEXUNIT1, // color texture map
// these define heightmap space (we should pass this as matrix plus its inverse transpose to save the costly project_uvw()s)
uniform float4 axis_pos, // base vertex pos (xyz)
uniform float4 axis_x, // base x axis (xyz:normalized, w:length)
uniform float4 axis_y, // base y axis (xyz:normalized, w:length)
uniform float4 axis_z, // base z axis (xyz:normalized, w:length)
// these are in object space (hopefully)
uniform float4 camerapos, // camera position (xyz)
uniform float4 lightpos, // lightposition (xyz)
// material factor
uniform float4 specular, // specular color (xyz:rgb, w:exponent)
// for depth correction
uniform float4 modelviewprojz, // 3rd column from modelview projection matrix
uniform float4 planes) // near and far plane distances (near,far,near*far,1/(far-near))
{
frag2screen OUT;
// entry position (_Obj is object space, _Hgt is heightmap space)
float4 entryPos_Obj;
float3 entryPos_Hgt;
// entry vector (+normalization)
float3 entryVec_Obj, entryVec_Hgt;
// traced position
float4 tracePos_Obj, tracePos_Hgt;
// light vector
float3 lightVec_Obj, lightVec_Hgt;
// light position
float3 lightPos_Hgt;
// surface normal (NEW)
float3 normal_Hgt, normal_Obj;
// summed color for this fragment (NEW)
float4 finalColor = float4(0,0,0,0);

const float shadow_threshold=0.02;
const float shadow_intensity=0.4;
// current ray-trace bounce (NEW)
float bounce = 1;

// *** Ray Intersection ***

// calculate entry position
entryPos_Obj = IN.opos;
entryPos_Hgt = project_uvw(entryPos_Obj.xyz - axis_pos.xyz,axis_x,axis_y,axis_z);

// calculate entry vector
entryVec_Obj = normalize(IN.opos.xyz - camerapos.xyz);
entryVec_Hgt = project_uvw(entryVec_Obj,axis_x,axis_y,axis_z);

// manually unrolling this would be faster unless someone hints "-unroll all" or so to the compiler
for (int i=0;i<bounces;i++)
{
// perform raytracing
tracePos_Hgt = ray_intersect_rm(rmtex,entryPos_Hgt,entryVec_Hgt, view_res);
tracePos_Obj = tracePos_Hgt.x*axis_x*axis_x.w + tracePos_Hgt.y*axis_y*axis_y.w + tracePos_Hgt.z*axis_z*axis_z.w + axis_pos;

// have we hit the texture?
if (tracePos_Hgt.w > 0)
{
// *** Specular Normal Mapping ***

// light vector
lightVec_Obj = normalize(tracePos_Obj.xyz - lightpos.xyz);
// compute diffuse color
float3 color = IN.color.xyz*f3tex2D(colortex,tracePos_Hgt.xy);
// load & expand from normal map
normal_Hgt = f3tex2D(rmtex,tracePos_Hgt.xy)*2 - 1;
normal_Hgt.z = sqrt(1.0 - dot(normal_Hgt.xy,normal_Hgt.xy));
normal_Obj = normal_Hgt.x*axis_x.xyz + normal_Hgt.y*axis_y.xyz - normal_Hgt.z*axis_z.xyz; // this will be wrong for an unnormalized texture matrix

#ifdef RM_SHADOWS
// calculate light entry vector
lightVec_Hgt = project_uvw(lightVec_Obj, axis_x, axis_y, axis_z);
// calculate light entry pos
lightPos_Hgt = tracePos_Hgt.xyz - lightVec_Hgt*tracePos_Hgt.z/lightVec_Hgt.z;
// perform shadow raytracing
float4 shadowhit = ray_intersect_rm(rmtex,lightPos_Hgt,lightVec_Hgt,shadow_res);
shadowhit.w = shadowhit.z<tracePos_Hgt.z-shadow_threshold?shadow_intensity:1.0;
color *= shadowhit.w;
specular *= shadowhit.w>0.998?1.0:0.0;
#endif
// compute diffuse and specular influence
float diff = saturate(dot(-lightVec_Obj,normal_Obj));
float temp = saturate(dot(normalize(-lightVec_Obj - entryVec_Obj),normal_Obj));
float3 spec = specular.xyz*pow(temp,specular.w);

// compute final color (NEW)
finalColor.xyz += (1/bounce)*(color*diff + spec);
finalColor.w = 1;

// prepare for reflection (NEW)
entryPos_Hgt = tracePos_Hgt;
entryVec_Obj = reflect(entryVec_Obj, normal_Obj);
entryVec_Hgt = project_uvw(entryVec_Obj,axis_x,axis_y,axis_z);
};
bounce++;
}

OUT.color = finalColor; // writing to this register is relatively slow

#ifdef RM_DEPTHCORRECT
// *** Depth Correction ***
tracePos_Obj.w = 1;
float depth = dot(-tracePos_Obj,modelviewprojz);
OUT.depth = (planes.z/depth + planes.y)*planes.w;
#endif

return OUT;
}
And the raytracer now takes an additional quality parameter (so we don't need to run it with the highest precision in every case):



// RAY INTERSECT DEPTH MAP WITH BINARY SEARCH
// RETURNS INTERSECTION POINT OR (0,0,0,0) ON MISS
float4 ray_intersect_rm(
in sampler2D rmtex,
in float3 entryPos_Hgt,
in float3 castVec_Hgt, in float linear_cast_radius)
{
float4 t, res = float4(0, 0, 0, 0);

// after this, cast vector reaches from polygon to ground plane
castVec_Hgt *= 1.0/abs(castVec_Hgt.z);
// after this, it has a length of linear_cast_radius IN THE UV-PLANE, which can be quite long in 3d
// but it won't become (due to min(0.2, ...)) longer than 1/5 (linear search unroll factor) of the poly->ground vector
castVec_Hgt *= min(0.2,rsqrt(dot(castVec_Hgt.xy, castVec_Hgt.xy))*linear_cast_radius);

// [ perform linear search ]
do
{
entryPos_Hgt += castVec_Hgt;
t = f4tex2D(rmtex, entryPos_Hgt.xy);
entryPos_Hgt += (entryPos_Hgt.z >= t.w?0:castVec_Hgt);
t = f4tex2D(rmtex, entryPos_Hgt.xy);
entryPos_Hgt += (entryPos_Hgt.z >= t.w?0:castVec_Hgt);
t = f4tex2D(rmtex, entryPos_Hgt.xy);
entryPos_Hgt += (entryPos_Hgt.z >= t.w?0:castVec_Hgt);
t = f4tex2D(rmtex, entryPos_Hgt.xy);
entryPos_Hgt += (entryPos_Hgt.z >= t.w?0:castVec_Hgt);
t = f4tex2D(rmtex, entryPos_Hgt.xy);

// uv-clipping (so no infinitely long surface-parallel rays occur)
//t.w = (clamp(entryPos_Hgt.x,0, 1) != entryPos_Hgt.x?-1:t.w);
//t.w = (clamp(entryPos_Hgt.y,0, 1) != entryPos_Hgt.y?-1:t.w);

// needed for upwards facing cast vector (e.g. after reflection or when using an arbitrary heightmap matrix)
t.w = (entryPos_Hgt.z<0?-1:t.w);
entryPos_Hgt.z = max(entryPos_Hgt.z, -1);
} while (entryPos_Hgt.z < t.w);

if (t.w > -1)
{
// [ perform binary search, manually unrolled this time ]
castVec_Hgt *= 0.5;
entryPos_Hgt += castVec_Hgt*sign(t.w - entryPos_Hgt.z);
t = f4tex2D(rmtex, entryPos_Hgt.xy);
castVec_Hgt *= 0.5;
entryPos_Hgt += castVec_Hgt*sign(t.w - entryPos_Hgt.z);
t = f4tex2D(rmtex, entryPos_Hgt.xy);
castVec_Hgt *= 0.5;
entryPos_Hgt += castVec_Hgt*sign(t.w - entryPos_Hgt.z);
t = f4tex2D(rmtex, entryPos_Hgt.xy);
castVec_Hgt *= 0.5;
entryPos_Hgt += castVec_Hgt*sign(t.w - entryPos_Hgt.z);
t = f4tex2D(rmtex, entryPos_Hgt.xy);
castVec_Hgt *= 0.5;
entryPos_Hgt += castVec_Hgt*sign(t.w - entryPos_Hgt.z);
t = f4tex2D(rmtex, entryPos_Hgt.xy);
} else t.w = 1.0;

// store output
res.w = (t.w <= 0.996?1:0);
res.xyz = entryPos_Hgt;
return res;
}
It has a slightly improved bounds checking in it (for arbitrary rays). And it is also a bit faster because the compiler did not unroll the binary search loop, so I did it manually.

The absolutely best map for this is tile1.rm. When you want to render an open map (like angel.rm) and have texture clamping enabled, you need to uncomment the uv-clipping in the raytracer. Otherwise it will be very slow.

If you have an idea how to make this faster or more beautiful, please let me know.

SKoder
09-25-2004, 08:10 AM
Fpo:
Your demo is super!
I wanted to ask two questions:
First:
Where is the vertex shader?
And the second:
What do you put in
opos,
hpos
and axis_pos?

fpo
09-25-2004, 01:26 PM
Hi pro_optimizer... tried your code and I do get the reflections... interesting. But it is quite slow now because of the many ray intersections required. Also, the reflection quality is not that good... very aliased.

And hi SKoder, my shader does not require a vertex program... it uses only a fragment program.

In my shader, opos is the fragment position in object space (passed in as TEXCOORD1) and hpos is the position in homogeneous space (opos * modelviewprojection).

axis_pos is the origin of the uv mapping space. So for a 1x1 quad tile, axis_pos is the lower-left corner vertex of the polygon, and axis_x and axis_y are the connecting edges with their size in the w component. You can think of it as the uv mapping representation of the 1x1 texture tile in object space. Look at my PDF file (found in the first post here); in Fig. 5 you will see them as P, X, Y and Z.
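For illustration, here is a GLSL-style guess at what a projection into that uvw frame might look like, reusing pro_optimizer's project_uvw name (the vec4 layout, with a normalized direction in .xyz and the tile size in .w, is an assumption based on the description above, not code from the demo):

vec3 project_uvw(vec3 v, vec4 axis_x, vec4 axis_y, vec4 axis_z)
{
    // express an object-space vector in the tile's uvw frame, dividing by each
    // axis size so the 1x1 tile maps onto the unit texture square
    return vec3(dot(v, axis_x.xyz) / axis_x.w,
                dot(v, axis_y.xyz) / axis_y.w,
                dot(v, axis_z.xyz) / axis_z.w);
}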

pro_optimizer
09-26-2004, 07:37 AM
Hi fpo, you are right, the aliasing is a bit distracting. The problem is that the reflection direction is not calculated analytically, but is just taken from the normal map instead. This cannot really be solved properly, but it looks good when the normals do not vary much (like on the pyramid).
It was more of a test to see if it can be done in realtime and how it would look... I just thought you guys might be interested.

There is a bug when you activate shadows, so there can be no specular interreflections in shadowed regions: the specular parameter must not be changed from one bounce to the next.
It is correct (and looks more realistic) with this code snippet:



shadow = 1;
#ifdef RM_SHADOWS
// calculate light entry vector
lightVec_Hgt = project_uvw(lightVec_Obj, axis_x, axis_y, axis_z);
// calculate light entry pos
lightPos_Hgt = tracePos_Hgt.xyz - lightVec_Hgt*tracePos_Hgt.z/lightVec_Hgt.z;
// perform shadow raytracing
float4 shadowhit = ray_intersect_rm(rmtex,lightPos_Hgt,lightVec_Hgt,shadow_res);
shadow = shadowhit.z<tracePos_Hgt.z-shadow_threshold?0:1;
color *= shadow?1:shadow_intensity;
#endif
// compute diffuse and specular influence
float diff = saturate(dot(-lightVec_Obj,normal_Obj));
float temp = saturate(dot(normalize(-lightVec_Obj - entryVec_Obj),normal_Obj));
float3 spec = shadow*specular.xyz*pow(temp,specular.w);
And it gets 2.6 times as fast (13fps -> 34 fps) when you do only 2 bounces and manually unroll the for loop in main_frag_rm().

Btw, I now get as much as 300 fps in the normal fp40 path with:
* unrolled binary search loop
* writing to finalColor instead of OUT.color
* the big if condition in main_frag_rm() replaced by a "finalColor.w = (tracePos_Hgt.w>0?1:0);"

mbue
09-26-2004, 10:49 AM
Hi,

for those who don't own a Geforce 6800 (humus, can you hear me?), some screen shots of pro_optimizer's relief mapping (some of them including reflections):

http://www.inf.tu-dresden.de/~s5806533/reliefm/

Greetings,

mbue

fpo
09-26-2004, 12:14 PM
Ohh no... what have you done with screen09.jpg!!

It's a full landscape including reflections in the river and a pyramid!!! Did you use my normal map max plugin to generate it?

Did you add some other map (or color texture alpha) to modulate specular and reflection factor in your samples?

Also, screen08 and screen5 are very nice. Could you send me these new sample files so I can integrate them into the original demo? Or just post them somewhere people can download and try them?

mbue
09-26-2004, 12:26 PM
Hi,

Everything you can see in the screenshots was done by pro_optimizer. The landscape map was taken from a simple Turbo Pascal voxel demo written by Steven H Don many years ago.

More on the maps has to be posted by pro_optimizer, I guess.

Greetings,

mbue

divide
09-26-2004, 10:18 PM
I'd really like to have the color+height map from screen08 and 09 too !

By the way, do you know of any websites with free color+height maps of real landscapes?

mbue
09-26-2004, 11:02 PM
Color and height maps of the shots should be no problem, but any additional maps (like normal maps) have to come from pro_optimizer (if necessary). I have to get some more sleep, then I will upload the maps.

Ventura
09-26-2004, 11:21 PM
I was just wondering what people's thoughts were on the future of this kind of technique.

pro_optimizer + fpo, it's really great work! :cool:

Is this the sort of thing that should be left to actual geometry, so that you don't have to have multiple shadowing/reflection techniques or eat lots of texture resources? Or are massive fragment shaders that will run well on future hardware, replace loads of geometry and free up bandwidth in some areas, definitely a good thing? :confused:

M.

gaby
09-26-2004, 11:27 PM
Really impressive... :D Does this technique need really high-definition normal maps? What happens when the normal map is heavily sub-sampled?

Gaby

fpo
09-27-2004, 04:52 AM
Hi... just made some modifications to support any object geometry (works in tangent space now).

Check out some screenshots for a teapot and sphere at:
rockwall_teapot.jpg (http://www.paralelo.com.br/img/rockwall_teapot.jpg)
rockwall_sphere.jpg (http://www.paralelo.com.br/img/rockwall_sphere.jpg)
relief_teapot.jpg (http://www.paralelo.com.br/img/relief_teapot.jpg)
relief_sphere.jpg (http://www.paralelo.com.br/img/relief_sphere.jpg)

I will post the updated code soon... but I need to resolve the shadows first. And there are many things to modify in the demo for the new parameters.

mbue
09-27-2004, 05:15 AM
Hi, I've just uploaded three maps. Not sure if pro_optimizer really used them. You can find them at

http://www.inf.tu-dresden.de/~s5806533/reliefm/

As you know, height maps have to be rescaled and biased before they can be used with the shader.
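If preprocessing the maps offline is inconvenient, the same rescale/bias could in principle be applied where the depth is sampled; a GLSL-style sketch only (depthScaleBias is an assumed uniform whose values depend entirely on how the particular height map was authored):

uniform sampler2D rmtex;        // relief map with depth in the alpha channel, as in the demo
uniform vec2 depthScaleBias;

float sampleDepth(vec2 uv)
{
    // scale-and-bias at sampling time instead of as an offline preprocess
    return texture2D(rmtex, uv).w * depthScaleBias.x + depthScaleBias.y;
}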

fpo: Fantastic new results. Is this applicable for games?

Greetings,

mbue

divide
09-27-2004, 06:51 AM
fpo, what is your hardware? (so I can compare with the fps on the screenshots)

SirKnight
09-27-2004, 07:52 AM
Originally posted by fpo:
Hi... just made some modifications to support any object geometry (works in tangent space now).

Check out some screenshots for a teapot and sphere at:
http://www.paralelo.com.br/img/rockwall_teapot.jpg
http://www.paralelo.com.br/img/rockwall_sphere.jpg
http://www.paralelo.com.br/img/relief_teapot.jpg
http://www.paralelo.com.br/img/relief_sphere.jpg

I will post the updated code soon... but I need to resolve the shadows first. And there are many things to modify in the demo for the new parameters.

That is really nice. I look forward to seeing what you did that allows this to be used on more complex geometry besides a quad.

-SirKnight

pro_optimizer
09-28-2004, 06:48 AM
Woow! This looks good, fpo! Very promising...
Looking forward to seeing your code.
Unfortunately I don't have FX Composer (103 megs and no internet connection of my own), so I cannot test it yet.



Really impressive... Does this technique need really high-definition normal maps? What happens when the normal map is heavily sub-sampled?

Gaby
Yes, for the reflections a 16-bit-per-channel floating-point normal map would be best.
You would not believe how inaccurate the normals in a standard 8-bit normal map are! And it gets even worse when you apply a bilinear filter to the normals (they become shorter and change their direction), not to mention what happens to them when you store only two components and recalculate the third with sqrt(1 - x*x - y*y).
What I did to make the reflections look a lot better (than in the screenshots I sent mbue) is store all three components in the relief map and normalize them explicitly. And you can go a step further (if you accept floating-point textures) and store the arcsin of each component, then restore the normal with a vectorized normal = sin(angles) call. This makes them interpolate kinda "bispherical" instead of bilinear.
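For illustration, a GLSL-style sketch of the two decodings described above (normalAnglesTex and the helper names are made up for the example, not from the demo):

uniform sampler2D reliefmap;        // RGB normal + depth in alpha, as in the demo
uniform sampler2D normalAnglesTex;  // assumed extra float texture storing asin() of each normal component

// (a) all three components stored in the relief map RGB, explicitly renormalized after filtering
vec3 decodeNormalRGB(vec2 uv)
{
    return normalize(texture2D(reliefmap, uv).xyz * 2.0 - 1.0);
}

// (b) angles stored in a signed float texture, restored with a vectorized sin();
// filtering then interpolates angles rather than the raw components
vec3 decodeNormalAngles(vec2 uv)
{
    return normalize(sin(texture2D(normalAnglesTex, uv).xyz));
}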



Originally posted by fpo:

Ohh no... what have you done with screen09.jpg!!

It's a full landscape including reflections in the river and a pyramid!!! Did you use my normal map max plugin to generate it?

Did you add some other map (or color texture alpha) to modulate specular and reflection factor in your samples?

Also, screen08 and screen5 are very nice. Could you send me these new sample files so I can integrate them into the original demo? Or just post them somewhere people can download and try them?
No, all I needed to do was to replace the "for(...) {" statement in the main fragment shader with a "while(dot(color, color) == 0) {" followed by a color = float3(1,1,1). :cool:

The maps for Screen5 come from a demo on www.humus.ca (http://www.humus.ca) (it was called "Self shadow bumpmapping").
Btw: have you thought of using a horizon map to quickly decide whether you need to do a shadow raytrace or not? It would require 2 additional texture accesses in the main shader but could save a lot of raytracing power...
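For illustration, a hypothetical single-lookup variant of that early-out (none of this is in the posted demo; the suggestion above uses two lookups for a directional horizon map, whereas horizonTex here is assumed to store a conservative, direction-independent bound precomputed from the depth map):

uniform sampler2D horizonTex;   // assumed: per texel, tan() of the highest horizon angle in any direction

bool needShadowTrace(vec2 uv, vec3 lightVec_Hgt)
{
    float horizonTan = texture2D(horizonTex, uv).r;
    // tangent of the light's elevation above the heightfield plane
    float lightTan = abs(lightVec_Hgt.z) / max(length(lightVec_Hgt.xy), 1e-4);
    // only when the light sits at or below this conservative horizon can anything block it
    return lightTan < horizonTan;
}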



Originally posted by ventura:

I was just wondering what peoples thought were on the future of this kind of technique.

pro_optimizer + fpo its really great work!

Is this the sort of thing that should be left to actual geometry, so that you don't have to have multiple shadowing/reflection techniques or eat lots of texture resources? Or are massive fragment shaders that will run well on future hardware, replace loads of geometry and free up bandwidth in some areas, definitely a good thing?

Thanks a lot!
At least I hope that this becomes the future of computer games in one way or another... At the moment it might still be too slow for games (at least at high resolutions and when you have a lot of texture detail per frame); look at Doom III, which does only normal mapping... (yes, and stencil shadows, I know)
If UE3 really only does parallax mapping (which is what I believe), then it is greatly inferior to this technique. :D Maybe for UE4...

Oh, and the planar fp40 path is now at 350 fps at 640x480 or 43 fps at 1600x1200 (but I cannot remember what I did...)

I will be away for 2 weeks starting from tomorrow (vacation), so I will not be able to post anything during this time.

Second post without code. :)

fpo
09-29-2004, 01:48 PM
I have implemented the new relief mapping shader in tangent space using NVIDIA FX Composer.

I have included normal, parallax and relief (with and without self-shadows). You can now apply the effect to any geometry that includes good texture mapping, normals and tangent space vectors (not only planar quads like before).

Relief Mapping for NVIDIA FX Composer 1.7:
relief_mapping_fx.zip (http://fabio.policarpo.nom.br/files/relief_mapping_fx.zip)

Enjoy,
FPO

SirKnight
09-30-2004, 05:36 AM
Really great stuff. It was fun playing with all the tweaks in fx composer. :)

I'd like to see this technique running in a "game world" environment. So, once I find time between my university work, I'll add it to my "mini engine", as I like to call it, which loads Doom 3 maps, and add this as a rendering option via a console param.

Hmm...maybe I'll add this technique to my ray traced image I'm working on for my graphics professor for our next assignment. I bet he would freak out. :D Basically he just wants us to "impress him" so I think adding this would do the trick. :)

Awesome!

-SirKnight

fpo
09-30-2004, 11:00 AM
Thanks SirKnight... and having an engine that can read doom3 maps is so cool... I must see that!
But how are you going to generate depth maps for all those normal maps from Doom3 in order to use relief maps (I do not think they have depth maps included)?

How did you find out about the Doom3 map specs? I once did a converter from Quake3 to my own 3D engine (Fly3D2), and it would be nice to be able to convert some Doom3 maps now to my newer engine, which uses per-pixel lighting all around (Fly3D3).

SirKnight
09-30-2004, 01:22 PM
What it currently does is load the .proc files, which contain all the compiled geometry from the .map. The .map and .proc files (as well as the aas and cm files) are all text files and are very easy to figure out how to load and render. What I haven't done yet is parse the .map files to extract the lights and their related data. The models are also text files which are easy to figure out how to load and render; they use the md5mesh and md5anim (I think) extensions.

What I'll probably do about the textures to show this technique in a game-like environment is just make my own textures, so I'll be able to make my own depth maps and such. If I wanted to use the Doom 3 textures, I guess I could take the regular diffuse texture, convert it to a greyscale map and just use that as the depth map. It may not be the best way, but it would be something at least. The problem with using Doom 3's textures is the whole issue of copyrighted material, so sticking with my own textures will probably be the way to go. Of course, as soon as I figure out how to add my own textures into DoomED. :)
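For what it's worth, a throwaway GLSL-style sketch of that fallback (texture and function names are illustrative; quality will obviously be rough, and whether bright texels should read as raised is itself an assumption):

uniform sampler2D diffuseTex;

float fakeDepth(vec2 uv)
{
    vec3 c = texture2D(diffuseTex, uv).rgb;
    // Rec. 601 luminance, inverted so bright texels read as raised (small depth)
    return 1.0 - dot(c, vec3(0.299, 0.587, 0.114));
}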

-SirKnight

SirKnight
09-30-2004, 01:50 PM
fpo, I found something you might be interested in. In a search I was just doing on the cm files Doom 3 uses, I found a page where someone made a simple Doom 3 .proc loader that renders everything with the portal vis technique, just like Doom 3 does. He also figured out how to use the cm files and do collisions. Pretty cool. I think I'll be looking into his code to see how the collisions are done using the cm files. :D

This link is here: http://developer.infi.nl/index.php?ID=5

Have fun.

-SirKnight

dorbie
09-30-2004, 09:31 PM
Nice job fpo, I thought it would 'just work'.

fpo
10-01-2004, 08:06 AM
Thanks for the link SirKnight... looks interesting... I will check it out sometime over the weekend. Also, was that post on the last page (moderated by Dorbie) from you? Or was it fake?!?

And thanks Dorbie... since your post about the possibility of doing it in tangent space I have been thinking about how to do it. And now I have done it! The problem was with the depth factor: in tangent space we do not know the size of the mapping (it can have more or fewer tiles), so depth was undefined for a generic object with a generic mapping.

dorbie
10-01-2004, 11:03 AM
fpo, that post is definitely forged; apart from the ridiculous contents, the account is a different one, and SirKnight contacted me about the abuse. I thought I'd label the post fake instead of deleting it, so people are aware that this is happening. No question, it's a forged post, and you don't have to be Sherlock Holmes to figure out who did it.

It's a shame it happened in your excellent thread. Just ignore it.

[edit: forged post now deleted, the forger edited it back & I wasn't going to get into an edit war with some anonymous malicious moron.]

unreal
10-01-2004, 11:31 AM
Very nice results! Is there any "translation" of this shader to GLSL? (Don't ask why. Let's say I have problems with Cg... nothing more.)

SirKnight
10-01-2004, 12:44 PM
I actually started on the GLSL version. I have the vertex program done; I'll probably do the fragment stuff later today. It's pretty much a no-brainer to convert over as long as you know the syntax and variable types for both languages.

-SirKnight

fpo
10-01-2004, 01:57 PM
Good, SirKnight... please send me the GLSL version of the shader when you have it done. Are you using the Render Monkey tool for that? The good thing about that tool is that it also supports GLSL.

SirKnight
10-01-2004, 03:58 PM
Ok, here they are. I did not test them in a demo, but they have been run through 3Dlabs' GLSL validator. They should work, because what I did was take the original Cg/HLSL code, paste it into a file and go through it line by line, converting it to GLSL syntax.

[VERTEX SHADER]

varying vec2 texCoord;
varying vec3 eyeSpaceVert;
varying vec3 eyeSpaceTangent;
varying vec3 eyeSpaceBinormal;
varying vec3 eyeSpaceNormal;
varying vec3 eyeSpaceLight;
varying vec3 lightColor;

attribute vec4 vertPos;
attribute vec4 vertColor;
attribute vec3 vertNormal;
attribute vec2 vertTexCoord;
attribute vec3 vertTangent;
attribute vec3 vertBinormal;

uniform vec3 objectSpaceLight;
uniform vec3 inLightColor;

void main()
{
// Vertex position in object space
vec4 objectSpaceVert = vec4( vertPos.x, vertPos.y, vertPos.z, 1.0 );

// Modelview rotation only part
mat3 modelViewRot;
modelViewRot[0] = gl_ModelViewMatrix[0].xyz;
modelViewRot[1] = gl_ModelViewMatrix[1].xyz;
modelViewRot[2] = gl_ModelViewMatrix[2].xyz;

eyeSpaceVert = (gl_ModelViewMatrix * vertPos).xyz;

eyeSpaceLight = (gl_ModelViewMatrix *
vec4( objectSpaceLight.x, objectSpaceLight.y, objectSpaceLight.z, 1.0 )).xyz;


eyeSpaceTangent = modelViewRot * vertTangent;
eyeSpaceBinormal = modelViewRot * vertBinormal;
eyeSpaceNormal = modelViewRot * vertNormal;

gl_FrontColor = vertColor;

texCoord = vertTexCoord;
lightColor = inLightColor;

gl_Position = ftransform();
}

[FRAGMENT SHADER]

varying vec2 texCoord;
varying vec3 eyeSpaceVert;
varying vec3 eyeSpaceTangent;
varying vec3 eyeSpaceBinormal;
varying vec3 eyeSpaceNormal;
varying vec3 eyeSpaceLight;
varying vec3 lightColor;

uniform sampler2D texmap;
uniform sampler2D reliefmap;

uniform float tile;
uniform float depthFact;
uniform float shine;

uniform vec3 diffuse;
uniform vec3 specular;

float RayIntersectRm( vec2 dp, vec2 ds );

void main()
{
vec4 t;
vec3 p,v,l,s,c;
vec2 dp,ds,uv;

float d;

vec3 tempSpecular = specular;

// Ray intersect in view direction
p = eyeSpaceVert;
v = normalize( p );
s = normalize( vec3( dot( v, eyeSpaceTangent ), dot( v, eyeSpaceBinormal ),
dot( v, eyeSpaceNormal ) ) );

s *= depthFact * 0.2 / dot( eyeSpaceNormal, -v );
dp = texCoord * tile;
ds = s.xy;
d = RayIntersectRm( dp, ds );

// get rm and color texture points
uv = dp + ds * d;
t = texture2D( reliefmap, uv );
c = texture2D( texmap, uv ).rgb;

// expand normal from normal map in local polygon space
t.xy = t.xy * 2.0 - 1.0;
t.z = sqrt( 1.0 - dot( t.xy, t.xy ) );
t.xyz = normalize( t.x * eyeSpaceTangent - t.y * eyeSpaceBinormal + t.z * eyeSpaceNormal );

// compute light direction
p += s * d;
l = normalize( p - eyeSpaceLight.xyz );

// ray intersect in light direction
dp += ds * d;
s = normalize( vec3( dot( l, eyeSpaceTangent ), dot( l, eyeSpaceBinormal ),
dot( l, eyeSpaceNormal ) ) );

s *= depthFact * 0.2 / dot( eyeSpaceNormal, -l );
dp -= d * s.xy;
ds = s.xy;

float dl = RayIntersectRm( dp, s.xy );

if( dl < (d - 0.05) ) // if pixel in shadow
{
c *= 0.4;
tempSpecular = vec3( 0.0, 0.0, 0.0 );
}

// compute diffuse and specular terms
float att = max( 0.0, dot( -l, eyeSpaceNormal.xyz ) );
float diff = max( 0.0, dot( -l, t.xyz ) );
float spec = max( 0.0, dot( normalize( -l - v ), t.xyz ) );

// compute final color
vec4 finalColor;

finalColor.xyz = att * ( c * diffuse * diff + tempSpecular.xyz * pow( spec, shine ) );
finalColor.w = 1.0;

gl_FragColor = finalColor;
}

float RayIntersectRm( vec2 dp, vec2 ds )
{
const int linearSearchSteps = 16;
const int binarySearchSteps = 6;

float depthStep = 1.0 / float( linearSearchSteps );

// current size of search window
float size = depthStep;

// current depth position
float depth = 0.0;

// best match found (starts with last position 1.0)
float bestDepth = 1.0;

// search front to back for first point inside object
for( int i = 0; i < linearSearchSteps - 1; ++i )
{
depth += size;

vec4 t = texture2D( reliefmap, dp + ds * depth );

if( bestDepth > 0.996 ) // if no depth found yet
if( depth >= t.w )
bestDepth = depth; // store best depth
}

depth = bestDepth;

// recurse around first point (depth) for closest match
for( int i = 0; i < binarySearchSteps; ++i )
{
size *= 0.5;

vec4 t = texture2D( reliefmap, dp + ds * depth );

if( depth >= t.w )
{
bestDepth = depth;
depth -= 2.0 * size;
}

depth += size;
}

return bestDepth;
}

Hope ya like 'em. :D

-SirKnight

fpo
10-02-2004, 09:01 AM
Thanks SirKnight... I have integrated the GLSL Relief Mapping into the ATI Render Monkey tool now.

Relief Mapping for ATI Render Monkey 1.5 (HLSL and GLSL):
relief_mapping.rendermonkey.zip (http://www.paralelo.com.br/arquivos/relief_mapping.rendermonkey.zip)

The weird thing is that in Render Monkey's OpenGL mode my scene looks all inverted, and I had to use front-face culling instead and invert the normal map Y direction. I think Render Monkey passes the same matrices (view, modelview, projection, etc...) to OpenGL as for DirectX, but they use different coordinate systems. The OpenGL projection matrix must be Z-inverted in relation to DirectX, as I remember.

The shader works ok and looks fine in GLSL, even with self-shadows... the only thing is that it is mirrored and you need to use front-face culling instead of the normal back-face culling.

SirKnight
10-02-2004, 11:02 AM
I just realized I have the old render monkey version 1. Guess I need to upgrade. :)

That is weird, what Render Monkey is doing to the render. Something is definitely not right about that.

-SirKnight

SirKnight
10-02-2004, 12:25 PM
I opened the RenderMonkey file in a text editor, found the GLSL shader, and noticed that the changes you made do not comply strictly with the spec; the 3Dlabs GLSL validator gave me about 12 errors. I'd be careful about doing that if I were you. The biggest things it choked on were the use of float3 and float4 instead of vec3 and vec4, and the use of saturate, which is not a standard GLSL function (hence the reason I used max). There were a few other small issues the validator barked at.

Also, I don't want to start a war about it or anything, but I noticed all the spacing I put in the shaders for better readability is gone. Having everything smashed together like you have it now makes the code much more difficult to read. When I look at code all smashed together like that, I spend 2x as much time or more figuring out what is going on, because I see nothing but a bunch of text all rammed together. This isn't as big a deal as the non-conformant syntax, but it is an issue I tried to solve. :)

-SirKnight

fpo
10-02-2004, 01:01 PM
Sorry for the float3/float4/saturate calls... Render Monkey did not complain about them. I have modified the shader and re-uploaded the zip, and I hope this version is ok now with 3dfx. Can you check if there is something else that should be changed to conform to the 3dfx specs?

Also, it's a pain to copy code from the forum, as all formatting is lost and all the code ends up on a single line after pasting. So I just made the GLSL version myself, looking over yours.

mbue
10-02-2004, 01:28 PM
Interesting. Seems to be a problem with Internet Explorer. However, copying from Mozilla Firefox preserves line breaks.

SirKnight
10-02-2004, 02:14 PM
Originally posted by fpo:
Sorry for the float3/float4/saturate calls... Render Monkey did not complain about them. I have modified the shader and re-uploaded the zip, and I hope this version is ok now with 3dfx. Can you check if there is something else that should be changed to conform to the 3dfx specs?

Also, it's a pain to copy code from the forum, as all formatting is lost and all the code ends up on a single line after pasting. So I just made the GLSL version myself, looking over yours.

You mean 3Dlabs, not 3dfx. hehe. :D

Ok, so the formatting from copying and pasting messed everything up and you retyped it all. I see. I thought that wasn't the case and that you had just gone through the trouble of taking all the spaces away to make the code a little harder to read. ;) Either way, it's no big deal; I just prefer more spacing as it's more helpful to me, and I thought I'd just throw that in for the heck of it.

You can download the GLSL validator program from the 3Dlabs site; it is a pretty helpful utility for checking whether your shaders have any errors and whether you did anything non-standard.

-SirKnight

dorbie
10-05-2004, 11:32 AM
For future reference, if you want to copy & paste code from here, just hit the edit post icon first, then you can copy & paste from the edit box. No need to actually edit the post of course.

fpo
10-06-2004, 05:07 PM
Thanks dorbie... but only moderators can edit posts from other people. It works just fine using the quote option instead.

And SirKnight, I checked all the shaders now for 3Dlabs compatibility... they pass the validation tool now. But I do not think they have a video card that can run this shader yet.

Also, could someone (WarnK?!) please try the new HLSL version on an ATI X800... maybe in HLSL it works there. Please try the FX Composer or Render Monkey version with it and post the results here.

I have updated all the shader demos now. The previous tangent-space version did not have correct self-shadows. I have fixed that now and the shadows should be ok. I have also implemented the tangent-space version in the OpenGL demo with the FP30 and FP40 options.

Ffelagund
10-08-2004, 06:18 AM
Hello,
I have created a project for Shader Designer for this relief mapping.
This shader looks really great!!

This is the url:
http://www.typhoonlabs.com/relief.rar (http://www.typhoonlabs.com/relief.rar)

I've uploaded some videos too.
http://www.typhoonlabs.com/relief.avi
http://www.typhoonlabs.com/relief2.avi

On behalf of the whole OpenGL community, thanks for this great shader :)

M/\dm/\n
10-13-2004, 01:41 AM
Finally I've found time to read about this technique, and it's AWESOME.

Right now I'm running on a 6800GT, and fps from previous builds is above 45 in all cases, hitting 100+ at 1024x768 in some configs. I have problems running any of the recent demos, though, as they just show a black screen @ 40-85 fps.

As far as I understand, shadow mapping will hurt more than it can solve, as there is a need to render the scene from the light's (or lights') point of view, killing any performance gains from the spared instructions. And the artifacts of smaller steps will multiply with the usual shadow map depth precision issues. Multiple light sources are out of the question because of the performance impact too.

So one thing I'm currently thinking about is how a Z-first pass would work with this technique in complex scenes. As far as I know, modifying depth will turn early Z off, but without Z correction it might give a nice speedup. And if shadow mapping is too expensive and there is no intersecting geometry, a Z pass might be a valid and good solution.

Does anybody remember the URL of the article on conditional loop optimization on NV40? I remember seeing one with (exact?) overhead numbers for every kind of loop/branch, as well as suggested do's and don'ts.

If I remember correctly, loops with bounds passed as parameters to the fragment program are ~2x+ faster than ones that use parameters computed inside the fragment program (wasn't it 3 cycles vs. 6?).
So it might be a good idea to precalculate the needed step count for the linear/binary search in the vertex program and pass it as a parameter to the fragment program, or in the extreme case on the CPU, as in the sketch below. All we need is one inverse dot of the object plane normal and the view vector, multiplied by the step count. And we have all the vectors there anyway...
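For illustration, a rough GLSL sketch of that idea, mirroring the RayIntersectRm posted above but with the loop bounds taken from uniforms (the uniform names are made up, and whether the compiler keeps this as a true dynamic loop depends on the hardware and driver):

uniform sampler2D reliefmap;
uniform int linearSteps;   // e.g. chosen per object on the CPU or in the vertex program
uniform int binarySteps;

float rayIntersectDynamic(vec2 dp, vec2 ds)
{
    float size = 1.0 / float(linearSteps);
    float depth = 0.0;
    float bestDepth = 1.0;

    // linear search, bound taken from a program parameter instead of a compile-time constant
    for (int i = 0; i < linearSteps; ++i)
    {
        depth += size;
        if (bestDepth > 0.996 && depth >= texture2D(reliefmap, dp + ds * depth).w)
            bestDepth = depth;
    }
    depth = bestDepth;

    // binary refinement around the first hit
    for (int i = 0; i < binarySteps; ++i)
    {
        size *= 0.5;
        if (depth >= texture2D(reliefmap, dp + ds * depth).w)
        {
            bestDepth = depth;
            depth -= 2.0 * size;
        }
        depth += size;
    }
    return bestDepth;
}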

I think I read that the light count in FarCry is passed as a value to the fragment program on the SM3.0 path, reducing loop overhead to a minimum.

Maybe some folks from Nvidia might throw some info on smart use of loops/branches for NV40.

BTW, there is a new NVemu tool on developer.nvidia.com for those who still fly on GF4.

SKoder
10-15-2004, 04:14 AM
Is there a way to make this shader run on a Radeon 9xxx?
Maybe multipass?
What do you think?

SKoder
11-14-2004, 10:06 AM
Done it!

My Relief Mapping shader renders a scene with Relief Mapping in 7 passes.

It runs on my Radeon 9800 Pro, and I want to know where else it runs.

You can get it here (RenderMonkey): Relief Mapping (http://evolution.times.lv/Relief.rar)

jpeter
11-15-2004, 08:48 AM
It also works on my ATI 9700, but not perfectly.

EarthQuake
03-17-2005, 10:24 PM
I had this running on an Athlon XP 2500 and GeForce 5200 a few months back and it ran at 5-10 fps but looked great, so I wanted to try it out with my new system... I downloaded it and now all I get is a blank screen showing something like 60 fps, and in the tangent-space one I just get a black screen with no text at all. I'm running an Athlon 64 3000+ with a Gigabyte 6600 GT now. Is there anything special I need to install to view this that maybe I had installed on the old system, or is anyone else having this problem? I have the Cg toolkit installed that you need for some other demos I've found, too, btw.

M/\dm/\n
03-18-2005, 03:10 AM
The newest uploaded version fails for me the same way too. AXP, 6800GT. One of the previous versions ran fine though.

I guess you have to go through the source code to track the problem down and fix it, if you wish to see the latest version running...

fpo
03-19-2005, 12:45 PM
Here are the demos updated for Cg 1.3. They should work fine on any GeForce 6800 or similar video card.

Download from:

Relief Mapping in Tangent Space Demo (http://fabio.policarpo.nom.br/files/reliefmap2.zip)

Relief Mapping in Tangent Space Paper (http://fabio.policarpo.nom.br/docs/ReliefMapping_I3D2005.pdf)

Mikkel Gjoel
03-21-2005, 03:43 AM
I'm unable to switch to nv40-rendering in the latest demo - worked fine previously.
(6600GT, 71.84)

So, when can we expect a paper describing the silhouette-corrected version of relief mapping? :)

\\hornet

minpu
09-01-2011, 09:37 PM
fpo, I am sorry, but the link you posted does not work for me. Could you upload it to another host, or send it to my email, please? (pnt1614@gmail.com)

And when I implemented the original relief mapping idea, I got slicing artifacts, but I really do not know the reason.

Thanks

Alfonse Reinheart
09-01-2011, 11:09 PM
This was posted six years ago. The person you're addressing has not posted on this forum for five years.