thebigMuh 122 Report post Posted September 30, 2002 Hello! I'm currently in the process of writing a simple raytracer I require for a MAX plugin. Sofar, I'm using a BSP tree for accelerating the raytracing. I'm getting about 25000 raytraces/second in an 80.000 poly scene. Is this good? I'm currently using FP math for calculation. However, since on a P3 processor int math is about 9 times faster than fp, I'm thinking about rewriting the raytracer to operate entirely with 32bit integers. Since currently my intersectTriangle() function uses barycentric coordinates to determine if a found ray/plane intersection lies within the boundaries of a triangle (and those coordinates are in the range 0.0 - 1.0), I was wondering if there is a fast and efficient way to calculate a ray/triangle intersection entirely with integer math. Also, I'm not sure if a BSP tree is an optimal approach of acceleration for the type of raytracing I require. Does anybody here have any links to different acceleration schemes (Octree, or maybe something entirely different)? And I don't mean something general like "build a tree of 8 cubes/level, subdiv until subdiv limit or face limit is reached, then raytest the cubes", but something a bit more specific. One of the beautiful things about my current BSP builder is that it's freakingly fast, as vivisecting a scene consisting of about 3 million polygons takes just about 2 seconds (including precomputation of the plane equation and the area for each triangle). I'm also thankful for every other links to raytracing information you could provide. For eyecandy, this is a picture rendered using my plugin. The raytracer I'm using is responsible for placing the "dirt" you can see accumulating at all the ridges and corners. Using a 15 level bsp tree, this image rendered in about 3mins 20sec. Thanks alot Ciao, ¡muh! [edited by - thebigMuh on September 30, 2002 4:49:41 AM] 0 Share this post Link to post Share on other sites

davepermen 1047 Report post Posted September 30, 2002 if you have a p3, don''t even think about integermath. a) we''ve proven that the speed is not much bigger (and you get huge troubles in big scenes with fixedpoint math.. don''t try it..) and b) for floatingpoint math on p3, use sse. you can boost up your system very much.. (much faster than integermath)int is not 9 times faster as float..the pic looks sweet, just want to state that..for informations on fast raytracing shemes, checkwww.openRT.de it has some further links, some papers with test on different settings and trees, etc..one other way is instead of a tree, using a grid, a simple 3dgrid wich you then rasterice through in fact, drawing 3d lines.. as this can get optimized till the max, it can get quite fast (espencially with equally dense scenes /me thinks)good luck for further improvements, it looks cool..but yes, you can get much more ray/triangle intersections per second (assuming thats what the 25''000 refer to)..but its a good start anyways. hope to see more "take a look around" - limp bizkitwww.google.com 0 Share this post Link to post Share on other sites

thebigMuh 122 Report post Posted September 30, 2002 >if you have a p3, don''t even think about integermath. a) we''ve >proven that the speed is not much bigger (and you get huge >troubles in big scenes with fixedpoint math.. don''t try it..) >and b) for floatingpoint math on p3, use sse. you can boost up >your system very much.. (much faster than integermath)Thanksalot for that advice! I would''ve had to rewrite the better part of my raytracer for that change. I don''t have a P3, but an Athlon by the way, so the FPU part is quite strong with this CPU. I''ll look into implementing a separate SSE/3dNow! path into the tracer, maybe that will speed stuff up significantly.> int is not 9 times faster as float..I got that from an internet page that compared P3 int and fpu performance to UltraSparc and PA-Risc machines. Don''t have the link anymore, sorry...> the pic looks sweet, just want to state that..Thanks! ;-)> for informations on fast raytracing shemes, check> www.openRT.de it has some further links, some papers with test > on different settings and trees, etc..I''ll do that, however it seems that the page is down right now.> one other way is instead of a tree, using a grid, a simple > 3dgrid wich you then rasterice through in fact, drawing 3d > lines.. as this can get optimized till the max, it can get > quite fast (espencially with equally dense scenes /me thinks)I built the BSP tree to be adaptive, that means that you have a balance setting that controls if you get a completely balanced tree, or an unbalanced tree, but with the planes always exactly halving the parent node''s space.> but yes, you can get much more ray/triangle intersections per > second (assuming thats what the 25''000 refer to)..No, that''s not ray/triangle intersections, but real raytracing performance. That means that in the scene you can see above, about 4''n''something million rays were cast. If you take that and devide it by the time that was spent purely with raytracing in my plugin, you get about 25.000 ray/scene intersections per second.Ciao, ¡muh! 0 Share this post Link to post Share on other sites

JuNC 236 Report post Posted September 30, 2002 I'd like to second the sweet pic bit Would you mind elaborating on the added dirt? I quite like the look it produces.EDIT: Just a thought, have you tried using deferred calculation of the BSP? Or is it built according to the view anyway? I'm not sure how well it would work with raytracing but could be worth a thought.[edited by - JuNC on September 30, 2002 12:15:36 PM] 0 Share this post Link to post Share on other sites

davepermen 1047 Report post Posted September 30, 2002 how about this? your camera shoots planes of rays out. horizontal and vertical planes..first approach:shoot one horizontal plane at a time, check the triangles (even in a tree, octree could be cool for this, but dunno) against the plane. trace only against the resulting list of triangles..plane-triangle:hit = sign(plane.normal dot triangle.a + plane.D) == sign(plane.normal dot triangle.b + plane.D) == sign(plane.normal dot triangle.c + plane.D)that way you could, espencially on scenes with small triangles, efficently cull tons of triangles away.second approach:the same, but generating lists for horicontal and vertical. then using a boolean intersection on the two lists, and check against those triangles...possibly its not that useful for triangles, but for objects that are quite a bit more difficult to hit with rays than with planes, thats possibly a good solution..dunno, but it sounds fun, and moves around in my head since monts and years, so i had to state it one time officially:D "take a look around" - limp bizkitwww.google.com 0 Share this post Link to post Share on other sites

thebigMuh 122 Report post Posted September 30, 2002 So, to cut things short, here's the source code for the intersectTri function. #define FLOAT2INTCAST(f) (*((int *)(&f)))#define FLOAT2UINTCAST(f) (*((unsigned int *)(&f)))static inline float det3x3(float a1, float a2, float a3, float b1, float b2, float b3, float c1, float c2, float c3){ return a1 * ( b2 * c3 - b3 * c2) - b1 * ( a2 * c3 - a3 * c2) + c1 * ( a2 * b3 - a3 * b2);}static inline BOOL intersectTri(Point3 &R0, Point3 &Rd, float &dist, float maxLen, dFace &tri/*, Point3 &baryCoord, Point3 &hitPos*/){ float baryCoord; float Vd = tri.p.n.x * Rd.x + tri.p.n.y * Rd.y + tri.p.n.z * Rd.z; if (FLOAT2UINTCAST(Vd) <= 0x80000000U) // [Vd >= 0.0f] this checks for backfacing return FALSE; float V0 = -(tri.p.n.x * R0.x + tri.p.n.y * R0.y + tri.p.n.z * R0.z + tri.p.d); float t = V0 / Vd; // Solve the plane equation, get the distance to the intersection point if (FLOAT2UINTCAST(t) > 0x80000000U || // [t < 0.0f] t >= maxLen) // [t >= maxLen] and this checks if the tri is behind the ray (negative) return FALSE; // or bigger than an already recorded hit // Intersection with plane. Point3 Pi = Point3(R0.x+Rd.x*t, R0.y+Rd.y*t, R0.z+Rd.z*t); // Get the barycentric coordinates of the hitPoint. // If any of the components are > 1.0 the hit is outside the triangle // I pulled the function into this one here, since we only need to determine // further barycentric coordinates if the previous ones are already valid baryCoord = det3x3(Pi.x, Pi.y, Pi.z, tri.b.x, tri.b.y, tri.b.z, tri.c.x, tri.c.y, tri.c.z) * tri.area; if (FLOAT2UINTCAST(baryCoord) <= 0x80000000U && FLOAT2INTCAST(baryCoord) <= 0x3F800000) { baryCoord = det3x3(tri.a.x, tri.a.y, tri.a.z, Pi.x, Pi.y, Pi.z, tri.c.x, tri.c.y, tri.c.z) * tri.area; if (FLOAT2UINTCAST(baryCoord) <= 0x80000000U && FLOAT2INTCAST(baryCoord) <= 0x3F800000) { baryCoord = det3x3(tri.a.x, tri.a.y, tri.a.z, tri.b.x, tri.b.y, tri.b.z, Pi.x, Pi.y, Pi.z) * tri.area; if (FLOAT2UINTCAST(baryCoord) <= 0x80000000U && FLOAT2INTCAST(baryCoord) <= 0x3F800000) { dist = t; return TRUE; } } } return FALSE;} And here's some explanation:The two defines on the top are used for comparing floating point values. Don't ask, it works, and is a lot faster.The det3x3(a1, a2, a3, b1, b2, b3, c1, c2, c3) function calculates the determinant of a 3x3 matrix in the form | a1 b1 c1 || a2 b2 c2 || a3 b3 c3 | The intersectTri function takes a heapload of arguments, which are: Point3 &R0 // The origin of the rayPoint3 &Rd // The direction unit vector of the rayfloat &dist // This is actually for output, contains the distance to the intersectionfloat maxLen // The maximum already recorded hit distancedFace &tri // A triangle The triangle class contains three vertices (a, b, c), a plane equation as a struct (with n being the vector over(a, b, c) and d), and the precalculated det3x3 in area.Since for my use only the distance to the intersection is of interest, I discard the rest of the information.I hope I didn't forget to mention anything essential.Any ideas for improvement?Ciao, ¡muh!PS: Since this is a dirtmap, I don't have a camera per se. I shout out (min_samples to max_samples) rays for every rendered pixel. This is one of the reasons this raytracer has to be fast. If you have a 1024x768 rendering and shoot out only 16 rays per pixel, you already get about 12.5 million rays.I also benchmarked the pure performance of the intersectTri function above, and I get about 4.000.000 ray/triangle intersections per second.[edited by - thebigMuh on September 30, 2002 6:02:44 PM][edited by - thebigMuh on September 30, 2002 6:04:00 PM] 0 Share this post Link to post Share on other sites

thebigMuh 122 Report post Posted September 30, 2002 hmmseems like he doesn''t like my #definessorryCiao, ¡muh! 0 Share this post Link to post Share on other sites

davepermen 1047 Report post Posted October 2, 2002 after the define add another \n so there are two.. one gets cut, the other stays.. its a little bug those forums do have bool intersect_triangle(ray ray,v3 a,v3 b,v3 c) { v3 e0 = b - a; v3 e1 = c - a; v3 p = ray.dir.cross(e1); f32 det = e0.dp3(p); v3 t = ray.orig - a; f32 u = t.dp3(p); v3 q = t.cross(e0); f32 v = ray.dir.dp3(q);#ifdef CHECK_ONLY return det>0&&v>0.f&&u+v<det&&u>0.f&&u<det;#else u /= det; v /= det; return det>0&&v>0.f&&1.f-u-v>0&&u>0.f&&1.f-u>0.f;#endif}15-;21*;7+ thats about the fastest i know, except you want to set up plueckerspace for it everywhere before (if well done, intersection gets down there to about 6 mults and 6 adds.. hehe:D) but don't try it:D uses lots of memory that approach.. "take a look around" - limp bizkitwww.google.com[edit] looks like "#sometext someothertext" messes it up:D[edited by - davepermen on October 2, 2002 6:16:33 AM] 0 Share this post Link to post Share on other sites

thebigMuh 122 Report post Posted October 2, 2002 Wow, thanksalot! At the moment, I''m precomputing the plane equation for each triangle as well as the determinant (that''s about 20 bytes per tri).Just so that I''m getting this correctly:e0 -> vector from a to be1 -> vector from a to cp -> cross product of ray direction (normalized, I guess) and e1det -> dot/scalar product of e0 and pt -> vector from ray origin to au -> dotproduct of t and pq -> dotproduct of t and e0v -> dotproduct of ray direction and qSo, if I''m not mistaken, this will give me the u and v coordinates of the intersection on the triangle.That''s a nice way of calculating it, especially since it doesn''t require a (precomputed) plane equation. However, to finally get the distance from the ray origin to the intersection, I would still have to calculate the real intersection point from u and v, and then the distance from the ray origin to the intersection point (which requires a square root). Since I require only the distance, but not the intersection point itself, this might actually be slower than my algo. I will certainly test this, however, maybe it is still faster.I will also try getting a kind of hybrid solution (using the plane equation for determining the distance of the hit, and the U/V approach for determining if a hit with the plane is a hit inside the triangle).However, what do you mean with prestoring the Plücker coordinates? I don''t see where I could prestore anything different than e0 and e1 for this calculation.Again, thanksalot! That gave me a few new ideas.Ciao, ¡muh! 0 Share this post Link to post Share on other sites

davepermen 1047 Report post Posted October 2, 2002 pluecker coordinates are 6dimensional points wich represent an edge in 3d. you store the three edges of the triangle in pluecker space, as well as the ray is stored in plueckerspace (pluecker coords represent rays/edges/lines, what ever). then you dotproduct (6d and its swizzled..) against the 3 edges to determine on wich side of the edge you pass by. this for the 3 edges and you know if you passed inside or not.now because you share all edges of the triangle, you can calculate a dotproduct for reach edge one time against the ray, then for each triangle store the signs of the 3 edgedotproducts wich define if it goes through or not. then your calculation is: dotproduct of each edge with the ray, store the sign for each edge.for each triangle, compare the 3 egde-signs it has against the 3 signs of the egdes. if they are the same, you got a hit.then you can get the barycentric coords with ease..that is the fastest method mathwise, most data is precalculated then.. but on the other hand, each edge is 6 floats, and you need a full edge array, and per triangle 3 signs, as well as 3 indicees for the egdes.. quite a bit more data, wich can make is actually slower than doing your or my way.. btw, my way gives t quite simple, i just forgot to add it, sorry:Djust calculate t/=det, and return that.. its the t you want.. "take a look around" - limp bizkitwww.google.com 0 Share this post Link to post Share on other sites

thebigMuh 122 Report post Posted October 2, 2002 Cheers!That''s nice, since it allows for an early out for me:if (t >= maximumAlreadyRecordedHitDistance) return false;Which will spare me a few more cross and dot products.Thanksalot, can''t wait to implement it!Ciao, ¡muh! 0 Share this post Link to post Share on other sites

davepermen 1047 Report post Posted October 2, 2002 bool intersect_triangle(ray ray,v3 a,v3 b,v3 c,f32 old_t,f32& new_t,f32& new_u,f32& new_v) { v3 e1 = c - a; v3 p = ray.dir.cross(e1); v3 t = ray.orig - a; f32 det = e0.dp3(p); if(det<=0.f) return false; t /= det; if(t<=0.f || t>=old_t) return false; f32 u = t.dp3(p); if(u<=0.f || u>=det) return false; v3 e0 = b - a; v3 q = t.cross(e0); f32 v = ray.dir.dp3(q); if(v<=0.f || u+v>=det) return false; u /= det; v /= det; new_t = t; new_u = u; new_v = v; return true;} so thats what you like, right? "take a look around" - limp bizkitwww.google.com[edit] i forgot return true:D[edited by - davepermen on October 2, 2002 9:12:20 AM] 0 Share this post Link to post Share on other sites

thebigMuh 122 Report post Posted October 2, 2002 Yup ;-)However, I can completely ommit the new u and v values, since I don''t need them.Ciao, ¡muh! 0 Share this post Link to post Share on other sites

davepermen 1047 Report post Posted October 2, 2002 why ever.. don''t you need them for uhm.. texturing and lighting? well i do:D "take a look around" - limp bizkitwww.google.com 0 Share this post Link to post Share on other sites

thebigMuh 122 Report post Posted October 2, 2002 That''s only a dirtmap I''m building. It doesn''t do texturing and lighting, it only tells the renderer the amount of dirt for a certain pixel. It does that by sending out rays in a hemisphere and averaging the distance each ray needed to score the closest hit. However, since QMC sampling needs a quite high raycount to achieve more or less noise free images, and I cannot do subsampling, the raytracer has to be as fast as possible.Thanks again for the routines!Ciao, ¡muh! 0 Share this post Link to post Share on other sites

thebigMuh 122 Report post Posted October 2, 2002 BTW: I just noticed that your t magically changes from a vector to a float and back again...Ciao, ¡muh! 0 Share this post Link to post Share on other sites

IainC 127 Report post Posted October 4, 2002 Just wanted to add to the "SWEET PIC!!!!!!" thing My raytracer''s nothing on yours; for anyone following this thread and thinking "I have GOT to write one of those" though, it might be useful - small, OO, commented source.Keep up the good work, the dirt particularly is excellant. www.coldcity.comcode, pics, life 0 Share this post Link to post Share on other sites