
Bounding slabs performance bottleneck


In writing an intersection test for rays and axis-aligned boxes, I seem to have come across a performance bottleneck. I am using a bounding slabs approach, with the boxes arranged hierarchically in an octree. The code works as expected, but when I run a profiler (AMD CodeAnalyst) on my program, it consistently highlights the intersection code as taking up to 50% of CPU time. Delving deeper, it reports a very large number of cache misses and pipeline stalls.

I do expect the code to take up a lot of CPU time, since it is part of an octree and the program is a raytracer, but the statistics CodeAnalyst gives for it indicate that it is poorly optimised. I couldn't say what makes for optimised assembly, but I can post the high-level code here so hopefully someone can tell me if there is anything particularly silly.

The brunt of the intersection technique works by finding the distance along the ray at which each pair of planes making up the box intersects it, with the intersection dismissed if an 'exiting' intersection occurs before an 'entering' one. Given that the planes are axis-aligned, the intersection tests with the ray can be simplified.

The following is an extract from the member function of the Box class that determines ray intersection. Axis-aligned planes are represented by a single coordinate giving their location in the dimension they reside in, and the index of that dimension. planePoints is an array of 6 floats.

    float a, b;

    //go through the pairs of planes, does matter what order the pairs come in array
    for (int t = 0; t < 6; t += 2)
    {
        int axis = this->planes[t].axis;
        float rayOrigin = ray.origin.values[axis];
        float rayDirection = ray.direction.values[axis];

        a = (this->planePoints[t] - rayOrigin) / rayDirection;
        b = (this->planePoints[t + 1] - rayOrigin) / rayDirection;

        //continues...
    }

Not really sure what other information to provide. I am using an Athlon XP 2500+.
I think it might be a matter of memory being unable to keep the FP unit fed with data, since simply making my RAM timings more aggressive gave a large increase in program performance.
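For reference, here is a minimal, self-contained sketch of the complete slab test described above, including the part elided by "//continues...". It is not the actual Box class: the Vec3, Ray and Box types, the lo/hi arrays, makeRay and intersects are all hypothetical names invented for illustration. It also swaps in one common variation, storing the reciprocal of the ray direction once per ray so that each plane test is a multiply rather than a divide. Whether that helps with the cache misses the profiler reports is a separate question that this sketch does not answer.

```cpp
#include <algorithm>
#include <limits>

// Hypothetical minimal types for illustration; not the poster's classes.
struct Vec3 { float values[3]; };

struct Ray {
    Vec3 origin;
    Vec3 direction;
    Vec3 invDirection; // 1 / direction per axis, computed once per ray
};

struct Box {
    float lo[3]; // lower plane coordinate on each axis
    float hi[3]; // upper plane coordinate on each axis
};

// Builds a ray and caches the per-axis reciprocal of its direction.
// Assumption: a zero direction component relies on IEEE-754 division
// producing an infinity; that case is not exercised here.
inline Ray makeRay(const Vec3& origin, const Vec3& direction) {
    Ray r{origin, direction, {}};
    for (int axis = 0; axis < 3; ++axis)
        r.invDirection.values[axis] = 1.0f / direction.values[axis];
    return r;
}

// Slab test: track the latest 'entering' distance and the earliest
// 'exiting' distance; reject as soon as the exit precedes the entry.
inline bool intersects(const Box& box, const Ray& ray) {
    float tEnter = 0.0f; // only hits in front of the ray origin count
    float tExit = std::numeric_limits<float>::infinity();
    for (int axis = 0; axis < 3; ++axis) {
        const float o = ray.origin.values[axis];
        const float inv = ray.invDirection.values[axis];
        float a = (box.lo[axis] - o) * inv;
        float b = (box.hi[axis] - o) * inv;
        if (a > b) std::swap(a, b); // a is the entering plane, b the exiting one
        tEnter = std::max(tEnter, a);
        tExit = std::min(tExit, b);
        if (tExit < tEnter) return false;
    }
    return true;
}
```

Keeping the six plane coordinates in two small contiguous arrays, as in this sketch, also means the loop no longer needs a separate lookup of the axis index per plane pair. That is a design choice of the sketch, not something the original code is known to do or need.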
