PDA

View Full Version : OT: fastest ray-triangle line -triangle intersection algo

V-man
08-20-2002, 07:50 PM
I tried out the Magic software intersection codes. I made some minor changes to make them run faster.

I benchmarked at 93,000 tests per second in some cicumstances. In some cases, my own code runs faster (6X -> 550,000) and sometimes much slower.

93000 is a pretty low number isnt it? Anyone know of a good source that can reach the millions and even hundreds of millions?
I will take it and write an SSE version if I have too.

Thanks a plenty,
V-man

Kilam Malik
08-21-2002, 12:57 AM
In my ray triangle intersection I first reduce the problem to 2D.

1) Calculate Intersection of ray with the plane in which the triangle lies.
2) Take the highest coordinate of the normal vector of the plane. This one is dropped in the triangle points and the intersection point. Now you have a 2D problem.
3) You have P1, P2, P3 and X now. Calculate the area of A1 = (p1,p2,p3) and the area of A2 = (p1,p2,x)+(p2,p3,x)+(p3,p1,x). If a2 > a1 then x is outside the triangle.

I don't know how fast it is compared to other algo's. Which algorithms do you use?

Kilam.

Maj
08-21-2002, 02:50 AM
Could try the references in this paper: http://graphics.cs.uni-sb.de/Publications/2001/InteractiveRenderingWithCoherentRa yTracing.pdf (http://graphics.cs.uni-sb.de/Publications/2001/InteractiveRenderingWithCoherentRayTracing.pdf)

Dodgy c&p:

Bary. Pluecker Bary. speedup
C code SSE SSE 4-1
min 78 77 22 3.5
max 148 123 41 3.7

Table 1: Cost (in CPU cycles) for the different intersection algorithms. 41 cycles correspond to roughly 20 million in-tersections per second on a 800 MHz Pentium-III. Measured by using the internal Pentium-III CPU counters. In our implementation, the SSE code for intersecting four rays with a single triangle requires 86-163 CPU cycles. Amortizing this cost over the four rays results in only 22 to 41 cycles per intersection, which corresponds to roughly 20 to 36 million ray-triangle intersection test per second.

EDIT: And here's (http://graphics.stanford.edu/courses/cs348b-98/gg/intersect.html) the Barycentric algorithm they're using.

[This message has been edited by Maj (edited 08-21-2002).]

V-man
08-21-2002, 04:22 PM
Originally posted by Kilam Malik:
In my ray triangle intersection I first reduce the problem to 2D.

1) Calculate Intersection of ray with the plane in which the triangle lies.
2) Take the highest coordinate of the normal vector of the plane. This one is dropped in the triangle points and the intersection point. Now you have a 2D problem.
3) You have P1, P2, P3 and X now. Calculate the area of A1 = (p1,p2,p3) and the area of A2 = (p1,p2,x)+(p2,p3,x)+(p3,p1,x). If a2 > a1 then x is outside the triangle.

I don't know how fast it is compared to other algo's. Which algorithms do you use?

Kilam.

My idea's have revolved around tricks like this. Calculating the areas of each of those triangles will make you pay some cycles. I used the dot product approach. Too bad Intel doesn't have a dot product instruction to make life simpler.

Maj, sounds like nice pages there. I will have to read them in detail and see if they give me ideas.

PS: My own algorithm is reaching 558000 now, but not good enough since in some cases, it goes way lower.

V-man

WhatEver
08-21-2002, 04:58 PM
There was a disscusion like this at gamedev.net in the math forum. I tried all their methods and I found one that was the fastest. I don't know if it's THE fastest, but it's fast for me http://www.opengl.org/discussion_boards/ubb/smile.gif

inline unsigned int s3dVecInTriangle3f2(S3Dvecfv p, S3Dvecfv v0, S3Dvecfv v1, S3Dvecfv v2)
{
S3Dvec3f e10,e20;
S3Dvec3f vp;
float a, b, c, d, e, x, y, ac_bb;

s3dVecSubtract3f(e10, v1, v0);
s3dVecSubtract3f(e20, v2, v0);

a = s3dVecDot3f(e10,e10);
b = s3dVecDot3f(e10,e20);
c = s3dVecDot3f(e20,e20);

ac_bb = a*c - b*b;

s3dVecSubtract3f(vp, p, v0);

d = s3dVecDot3f(vp,e10);
e = s3dVecDot3f(vp,e20);

//x = d*c - e*b;
//e = e*a + d*b;

x = d*c - e*b;

if ( x < 0 )
{
return false;
}

y = e*a - d*b;

if ( y < 0 )
{
return false;
}

return (x+y) <= ac_bb;
}

zed
08-21-2002, 07:49 PM
a famous example http://www.acm.org/jgt/papers/MollerTrumbore97/
beware though this code (+ most others out there is flawed, ie it will fail sometimes when intersection falls on the triangless border (yes i know triangles strictly dont have a border as its infinitly thin)

__
|./|
|/.|

but its a pain when u have a quad made up of 2 tris and a ray passes through both (no intersection) cause its on the diagonal

btw some links to pt in polygon routines (i believe only one handles the point on border cases though http://www.opengl.org/discussion_boards/ubb/frown.gif) http://astronomy.swin.edu.au/pbourke/geometry/insidepoly/ http://geometryalgorithms.com/Archive/algorithm_0103/algorithm_0103.htm http://www.dfanning.com/tips/point_in_polygon.html http://vbdesign.hypermart.net/cadpages/point_in_polygon.htm http://home.earthlink.net/~bobstein/inpoly/ http://www.whisqu.se/per/docs/math26.htm http://www.cg.tuwien.ac.at/courses/AlgoCompCart/Lecture05/plumbline.html http://www.acm.org/tog/resources/RTNews/html/rtnv5n3.html http://condor.informatik.uni-oldenburg.de/~stueker/graphic/index.html http://www.acm.org/tog/editors/erich/ptinpoly/

davepermen
08-22-2002, 06:51 AM
if you only want to intersect triangles with rays, say you want for example fast raytracing, i suggest that you convert all your rays to plueckerspace (needs another 3 coordinates to the existing ray, or simply a 6d vector (plueckerspace is homogene 5d..)

your triangles as well get some additional data, namely they get the 3 rays surrounding it, the edges. those, as well, in plueckerspace. the final intersectiontest is simply 3 6d dotproducts, and the comparison, if all are bigger than zero (or below, depending on your data setup..).

its one of the fastest, but uses more precalculated data. it depends, but for a raytracer its great (and you can trace all sort of convex polygons, so quads as well for example..)
http://www.flipcode.com/pluecker oder http://www.flipcode.com/tutorials/pluecker i dont remember.. good luck http://www.opengl.org/discussion_boards/ubb/biggrin.gif

jwatte
08-22-2002, 05:34 PM
You probably know more about your triangles than just that they're 500,000 arbitrary triangles.

If, for example, they don't move between tests (but the ray does) then you can put them in a spatial index, such as a kd-tree. That will let you cull away a lot of non-colliding triangles very fast.

If they move, but they move in groups -- such as meshes in a game, for example -- you can treat each group with a bounding volume (such as a sphere) and quickly cull away large groups, only to test those whose bounding volume intersect the ray.

If you generate these in software constantly, so you really don't know anything about them a priori, but you know the ray as you generate them, you can run the test as part of triangle generation. That's likely to cache much better.

V-man
08-22-2002, 08:48 PM
OK, to begin with, I tried Whatever's code (and it resembles Magic Softwares code but is super short!)

Just over 1 million tests/ sec and Maj's link gives an even better one -> 3.7 million I think?
I have come up with an even better one, but unfortunatly I'm having some trouble. Can anyone tell me if this makes sense :

You compute the point of intersection as usual.
Then you do

vector1= vertex1 - POI
vector2= vertex2 - POI
vector3= vertex3 - POI

If (point inside triangle, then none of the 3 components are all positive or all negative)
if there is
return 0
else
return 1

It has the potential of being blinking fast but.... I'm not sure but could my problem be precision error. I have no idea why it isn't working. Works on paper!

V-man

V-man
08-23-2002, 01:59 PM
Ah! mistake in my logic. That test is not enough.

V-man

WhatEver
08-23-2002, 08:06 PM
I'm sure it's the same code you had V-man. When I obtained my point in triangle code from gamedev it did some extra calculations like obtaining the uv position of the point...it was a raytracing algo so that's understandable. I just removed all that stuff from the code because we don't need that for simple collision detection http://www.opengl.org/discussion_boards/ubb/smile.gif.

Would you mind posting your new code that you got from Majs' link? That seudo code makes no sense to me http://www.opengl.org/discussion_boards/ubb/frown.gif. I want to try it out and see the performance difference.

V-man
08-25-2002, 05:58 AM
I don't know pseudo code either, but my first conversion to C++ worked http://www.opengl.org/discussion_boards/ubb/smile.gif

float u0, v0, u1, v1, u2, v2, beta, alpha;
long i1, i2;
float floaterror=1e-5;

if(lineVector[0]*normal3D[0]+lineVector[1]*normal3D[1]+lineVector[2]*normal3D[2]<=floaterror)
{ //Don't bother checking if they intersect (to simplify the algorithm)
return 0;
}
//(x, y, z)=(x, y, z)+t*(x, y, z)
t[0]=(normal3D[0]*(triangle3D[0]-startingPointOfLine[0])+normal3D[1]*(triangle3D[1]-startingPointOfLine[1])+normal3D[2]*(triangle3D[2]-startingPointOfLine[2]))/(normal3D[0]*lineVec tor[0]+normal3D[1]*lineVector[1]+normal3D[2]*lineVector[2]);
pointOfIntersection[0]=startingPointOfLine[0]+t[0]*lineVector[0];
pointOfIntersection[1]=startingPointOfLine[1]+t[0]*lineVector[1];
pointOfIntersection[2]=startingPointOfLine[2]+t[0]*lineVector[2];

if((normal3D[1]>normal3D[0])&&(normal3D[1]>normal3D[2]))
{
i1=0;
i2=2;
}
else
{
i1=0;
i2=1;
}

u0=pointOfIntersection[i1]-triangle3D[i1];
v0=pointOfIntersection[i2]-triangle3D[i2];
u1=triangle3D[3+i1]-triangle3D[i1];
v1=triangle3D[3+i2]-triangle3D[i2];
u2=triangle3D[6+i1]-triangle3D[i1];
v2=triangle3D[6+i2]-triangle3D[i2];
if(u1==0.0)
{
beta=u0/u2;
if((0.0<=beta)&&(beta<=1.0))
{
alpha=(v0-beta*v2)/v1;
}
}
else
{
beta=(v0*u1-u0*v1)/(v2*u1-u2*v1);
if((0.0<=beta)&&(beta<=1.0))
alpha=(u0-beta*u2)/u1;
}

return (alpha>=0.0)&&(beta>=0.0)&&(alpha+beta<=1.0);

/*edit*/ minor fix

[This message has been edited by V-man (edited 08-25-2002).]

WhatEver
08-25-2002, 08:03 AM
Thank you, it worked!

I tested it on 2304 triangles and I went from 188 fps to 198 fps(it was rendering a scene with 4765 tris with a few effects enabled).

[This message has been edited by WhatEver (edited 08-25-2002).]

davepermen
08-25-2002, 08:30 AM
that code calculates twice the dotproduct between normal3D and lineVector. you should get it even faster.. http://www.opengl.org/discussion_boards/ubb/biggrin.gif

davepermen
08-25-2002, 08:52 AM
struct Triangle {
v3 a,b,c;
Triangle(){}
Triangle(v3 a,v3 b,v3 c) : a(a),b(b),c(c) {
}
bool Hit(f32 start[3],f32 end[3]) {
f32 _ray[6];
_ray[0] = end[0] - start[0];
_ray[1] = end[1] - start[1];
_ray[2] = end[2] - start[2];
_ray[3] = start[1]*end[2] - end[1]*start[2];
_ray[4] = start[2]*end[0] - end[2]*start[0];
_ray[5] = start[0]*end[1] - end[0]*start[1];
f32 _e01[6];
f32 _e12[6];
f32 _e20[6];
_e01[0] = b.x - a.x;
_e01[1] = b.y - a.y;
_e01[2] = b.z - a.z;
_e01[3] = a.y*b.z - b.y*a.z;
_e01[4] = a.z*b.x - b.z*a.x;
_e01[5] = a.x*b.y - b.x*a.y;
f32 pd0 = _e01[0]*_ray[3] + _e01[1]*_ray[4] + _e01[2]*_ray[5] + _e01[3]*_ray[0] + _e01[4]*_ray[1] + _e01[5]*_ray[2];
_e12[0] = c.x - b.x;
_e12[1] = c.y - b.y;
_e12[2] = c.z - b.z;
_e12[3] = b.y*c.z - c.y*b.z;
_e12[4] = b.z*c.x - c.z*b.x;
_e12[5] = b.x*c.y - c.x*b.y;
f32 pd1 = _e12[0]*_ray[3] + _e12[1]*_ray[4] + _e12[2]*_ray[5] + _e12[3]*_ray[0] + _e12[4]*_ray[1] + _e12[5]*_ray[2];
_e20[0] = a.x - c.x;
_e20[1] = a.y - c.y;
_e20[2] = a.z - c.z;
_e20[3] = c.y*a.z - a.y*c.z;
_e20[4] = c.z*a.x - a.z*c.x;
_e20[5] = c.x*a.y - a.x*c.y;
f32 pd2 = _e20[0]*_ray[3] + _e20[1]*_ray[4] + _e20[2]*_ray[5] + _e20[3]*_ray[0] + _e20[4]*_ray[1] + _e20[5]*_ray[2];
return pd0 < 0 &amp;&amp; pd1 < 0 &amp;&amp; pd2 < 0;
}
void Draw() {
glBegin(GL_TRIANGLES); {
glVertex3fv(&amp;a.x);
glVertex3fv(&amp;b.x);
glVertex3fv(&amp;c.x);
} glEnd();
}
};

works. based on plueckerspace, but without any precalculation of data. and this can be the fastest one of all, if you would optimize it for example with sse to calculate all 3 edges at the same time.. _IFF_ you would do it for a raytracer you would precalc most of this (the 3 pluecker edges and the pluecker space ray) and you could gain even much more..

oh, and the 3 ifs in the end are not needed with sse as well http://www.opengl.org/discussion_boards/ubb/biggrin.gif

V-man
08-25-2002, 07:58 PM
oK, gone have to spend some time learning plueker space technic then.

Are you working on a raytracer dave?
I made one this summer (finally) but its dog slow. Smarter technics are very much necessary like partitioning space, bounding box, better representation of data, ...

I had to make an AVI movie to see things as it got rotated. 2:30 hours of computer time!

V-man

davepermen
08-25-2002, 09:37 PM
yep, i've done some tiny raytracers yet, but nowhere as fast as naturesuxx. what have you traced? with some spheres and planes, you get (incl shadows and reflections) normally some fps, if you don't mess something really much up.. currently a friend of me works at a speed optimized math library for the aegis engine, wich provides stuff like squareroots in 3 cycles on a p3.. can't wait to get that stuff done, then it should get much faster.. http://www.opengl.org/discussion_boards/ubb/biggrin.gif

and, as i said (i think..), if you percalc the plueckerspace coords for the triangles, and for the rays, you get the intersection much faster. (only the dotproducts in the end..) its about the fastest check. no divisions, no squareroots, just some mads.. would love to get hw dotproduct and hw crossproduct, oh well http://www.opengl.org/discussion_boards/ubb/biggrin.gif

V-man
08-26-2002, 06:35 AM
I saw the naturesux page and they had a pretty advanced software. I think they are using direct3d to do some of the rendering. I even saw another page that started out talking about how raytrace would eventually be a faster solution to scanline (in the case of extremely high LOD) but then slowly went to suggest using opengl to aid rendering.

The problem with my raytracer is that it's polygonal. I've used my opengl engines design to design this thing, so spheres are made with polygons.
But anyway, in the real world, there aren't much objects made with sphere cylinders and cubes.

V-man

davepermen
08-26-2002, 10:56 AM
there aren't much made with polygons eighter.. http://www.opengl.org/discussion_boards/ubb/biggrin.gif nor spines nor anything.. they are a) volumetric (think about it, 3d geometry is in fact flat.. proven by the new siggraph paper wich demonstrates unwrapping a mesh onto a 2d texture http://www.opengl.org/discussion_boards/ubb/biggrin.gif)
and b) they are made of particles..

LETS DO VOXELIZE! http://www.opengl.org/discussion_boards/ubb/biggrin.gif

i just love raytracers.. well.. and the spheres are useful anyways.. speedup by 300% if you have boundingspheres around each cuple of triangles.. http://www.opengl.org/discussion_boards/ubb/biggrin.gif

i've rewritten my intersection test a bit, now its even more visible how good sse would fit into it.. very low amount of multiplications and additions.. have to set up some code and test it, possibly at the weekend, we'll see..

and yeah, a "normally good" raytracer scales about logarithmically with complexity.. its basically the same as in the unreal engine (the software version). it doesn't mather how big and how complex and what ever the level is. only what is visible mathers, and even that is not _that_ important. in the end, more or less unreal only cares about the amount of pixels on screen (+the characters..). portals with beamtree are about the way raytracers work..its cool http://www.opengl.org/discussion_boards/ubb/biggrin.gif

and yes, the first hit, the one with the camera - rays can be done with rastericers, thats really a simple but effective optimization..

and yes, opengl can help raytracing as well, can't wait for my r300.. http://www.opengl.org/discussion_boards/ubb/biggrin.gif first test done directly with rastericers, then generate the rays there per pixel and rerender all the triangles, but fullscreen (i will render on 320x240..), and instead of a triangle simply the components get sent into the pixelshader, and you render a fullscreenquad. that way you do all the ray-triangle intersection in the pixelshader..

thats about the basic idea, i think shadows could be done quite good, and possibly first reflections as well.. we'll see how much can be done yet.. naturesuxx roxx.. its awesome (and it is quite general useable, not like heaven7 wich fakes about everything it can http://www.opengl.org/discussion_boards/ubb/biggrin.gif (but its awesome anyways))

they even plan to code a game with naturesuxx, can't wait for that one.. http://www.opengl.org/discussion_boards/ubb/biggrin.gif (oh, and a new cpu http://www.opengl.org/discussion_boards/ubb/biggrin.gif)

V-man
08-27-2002, 02:35 PM
OK, went over the plucker coordinates. Nice to see an idea from physics in there.
The problem is a copy has to be kept of these coordinates and everything has to be pre transformed before generating the plucker.

Did you use these and benchmarked? I may try them another time. Put them on the web if you have.

let's do volumize? You do it! A wise man once told me "keep it simple you stupid idiot!"

This thing takes needs way too much cpu power. Using tricks in opengl for games is the only option and will be for a very long time, sadly.
V-man

davepermen
08-27-2002, 07:24 PM
actually the power of today is big enough to start dropping hacks..

about pluecker.. if you do it right, you can get a ray-mesh check down to about one dotproduct6 per triangle, and some comparisons (3).. i hope at the weekend i have the time to implement that.. would boost my raytracer by a factor 3..

zed
08-27-2002, 07:35 PM
>>let's do volumize? You do it! A wise man once told me "keep it simple you stupid idiot!"<<

is one of my my 5 programming laws.
to tell the truth up until about 3/4 months ago i honestly thought doom3 used ray/beam tracing.
shadows can be done very quick.

davepermen
08-27-2002, 09:10 PM
Originally posted by V-man:
let's do volumize? You do it! A wise man once told me "keep it simple you stupid idiot!"

i said let's do VOXELIZE .. voxels are very simple.. just using a little much memory http://www.opengl.org/discussion_boards/ubb/biggrin.gif

and its called keep it simple, stupid.. no need for more words than needed. you know, keep it simple http://www.opengl.org/discussion_boards/ubb/biggrin.gif

can't wait for the weekend.. i want to implement my raytracing of triangles faster than raytracing spheres http://www.opengl.org/discussion_boards/ubb/biggrin.gif hope it works.. http://www.opengl.org/discussion_boards/ubb/biggrin.gif

Keermalec
08-28-2002, 01:18 AM
Puecker space seems great but way above my level in maths. Staying with 3d math I run 5% faster than with V-man's 3d to 2d conversion code. Any ideas to optimise this?

// Intersection between a line (ptA-ptB) and a plane (plA)
MVEVECTOR mveRayPlaneIntersect (MVEVECTOR ptA, MVEVECTOR ptB, MVEPLANE plA)
{
plA.n.NormaliseIt();
MVEVECTOR AB = ptB - ptA;
float K = -(plA.n.x*ptA.x + plA.n.y*ptA.y + plA.n.z*ptA.z + plA.d())
/(plA.n.x*AB.x + plA.n.y*AB.y + plA.n.z*AB.z);
MVEVECTOR ptC( ptA.x + K*AB.x,
ptA.y + K*AB.y,
ptA.z + K*AB.z );
return ptC;
}

// Tests if point ptP is within triangle DEF
bool mveIsWithinTriangle(MVEVECTOR ptP, MVEVECTOR D, MVEVECTOR E, MVEVECTOR F)
{
MVEVECTOR ptDE = mveProjectOnLine(D, E, F);
MVEVECTOR vcDE(ptDE, F);
MVEPLANE plDE(ptDE, vcDE);

MVEVECTOR ptEF = mveProjectOnLine(E, F, D);
MVEVECTOR vcEF(ptEF, D);
MVEPLANE plEF(ptEF, vcEF);

MVEVECTOR ptFD = mveProjectOnLine(F, D, E);
MVEVECTOR vcFD(ptFD, E);
MVEPLANE plFD(ptFD, vcFD);

if ((mveIsAbovePlane(ptP, plDE))&amp;&amp;(mveIsAbovePlane(ptP, plEF))&amp;&amp;(mveIsAbovePlane(ptP, plFD))) return TRUE;
else return FALSE;
}

MVEVECTOR pointOnPlane = mveRayPlaneIntersect( source, target, plane );
bool isInTriangle = mveIsWithinTriangle( pointOnPlane, triangle[0], triangle[1], triangle[2] );

[This message has been edited by Keermalec (edited 08-28-2002).]