
Bounding slabs performance bottleneck

This topic is 4106 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


In writing an intersection test for rays and axis-aligned boxes, I seem to have come across a performance bottleneck. I am using a bounding slabs approach, with the boxes arranged hierarchically into an octree. The code works as expected, but when I run a profiler (AMD CodeAnalyst) on my program, it consistently highlights the intersection code as taking up to 50% of CPU time. Delving deeper, it reports a very large number of cache misses and pipeline stalls.

I do expect the code to take up a lot of CPU time, since it is part of an octree and the program is a raytracer, but CodeAnalyst gives it statistics which indicate that it is poorly optimised. I couldn't say what makes optimised assembly, but I can post the high-level code here so hopefully someone can tell me if there is anything particularly silly.

The brunt of the intersection technique works by computing the distances along the ray at which it crosses each pair of planes making up the box, with the intersection dismissed if an 'exiting' crossing occurs before an 'entering' one. Given that the planes are axis-aligned, the intersection tests with the ray can be simplified.

The following is an extract from the member function of the Box class that determines ray intersection. Each axis-aligned plane is represented by a single coordinate giving its location along the dimension it resides in, plus the index of that dimension. planePoints is an array of 6 floats.

    float a, b;
    //go through the pairs of planes, does matter what order the pairs come in array
    for (int t = 0; t < 6; t += 2)
    {
        int axis = this->planes[t].axis;
        float rayOrigin = ray.origin.values[axis];
        float rayDirection = ray.direction.values[axis];
        a = (this->planePoints[t] - rayOrigin) / rayDirection;
        b = (this->planePoints[t + 1] - rayOrigin) / rayDirection;
        //continues...
    }

I'm not really sure what other information to provide. I am using an Athlon XP 2500+. I think it might be a matter of memory being unable to keep the FP unit fed with data, since simply adjusting the timings of my RAM to be more aggressive gave a large increase in program performance.
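For reference, here is a minimal, self-contained sketch of the complete slab test described above. It is not the code from the program: the Ray and Aabb structs and the makeRay helper are hypothetical names invented for illustration. It assumes the box stores its bounds in two contiguous float arrays indexed directly by axis (so no per-plane axis lookup is needed) and that the reciprocal of the ray direction is computed once per ray, so the per-box work uses multiplications instead of divisions. Whether either change helps with the cache misses reported by the profiler would have to be measured.

```cpp
#include <algorithm>
#include <limits>

// Hypothetical types for illustration only; not the Box/Ray classes from the post.
struct Ray {
    float origin[3];
    float invDirection[3]; // 1 / direction, computed once per ray
};

struct Aabb {
    float min[3]; // lower bound on each axis
    float max[3]; // upper bound on each axis
};

// Builds a ray and caches the reciprocal direction.
// Assumes IEEE 754 floats, so a zero component yields +/-infinity.
inline Ray makeRay(const float origin[3], const float direction[3]) {
    Ray r;
    for (int i = 0; i < 3; ++i) {
        r.origin[i] = origin[i];
        r.invDirection[i] = 1.0f / direction[i];
    }
    return r;
}

// Slab test: the ray hits the box if the latest 'entering' distance
// does not exceed the earliest 'exiting' distance.
// Edge case not handled robustly: a zero direction component combined with
// an origin lying exactly on one of that axis's planes produces a NaN.
inline bool intersects(const Ray& ray, const Aabb& box) {
    float tEnter = 0.0f; // only accept hits in front of the origin
    float tExit = std::numeric_limits<float>::infinity();
    for (int axis = 0; axis < 3; ++axis) {
        float a = (box.min[axis] - ray.origin[axis]) * ray.invDirection[axis];
        float b = (box.max[axis] - ray.origin[axis]) * ray.invDirection[axis];
        tEnter = std::max(tEnter, std::min(a, b)); // nearer plane of the pair
        tExit = std::min(tExit, std::max(a, b));   // farther plane of the pair
    }
    return tEnter <= tExit;
}
```

In an octree traversal the same Ray is tested against many boxes, so makeRay would be called once per ray and intersects once per visited node.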
