Raytracing algorithm optimization

5 comments, last by phresnel 14 years, 1 month ago
I am interested in raytracing, and I'm currently trying to do some raytracing at a low resolution. I know that most realtime raytracers use subsampling and possibly other tricks to speed things up. I also read an article in Hugi #23, by lycium, that says the raytracing algorithm can be sped up with:

- First hit optimization
- Shadow check optimization
- Pre-calculated per-frame constants

It also says that you must not use a general quadratic intersector to intersect a sphere. Can somebody explain these concepts to me? I don't understand them, and there is no source code attached.
For those interested, the article can be found on the hugi issue on this site:
http://www.hugi.scene.org/main.php?page=hugi23

I included the relevant text here:

"Algorithmic optimisation

Many ray tracers use spatial subdivision techniques (e.g. octrees, BSPs, KD-trees, etc.) to speed up ray tracing, the idea being to reduce the number of intersection tests performed. While this is an absolute must for large (more than ~500 objects) scenes, with small scenes such as the ones we'll be using, it's just overhead. This is a hotly debated subject; this is just my logical reasoning. I haven't performed any official tests to verify this, but I'm fairly confident that this is the case. If anyone has other results from tests, please enlighten me :)

So while spatial subdivision is good for large scenes (to the point where in really large scenes with high resolution output, scene complexity is an almost negligible performance factor, making it the future for rendering [this is my hypothesis, don't flame me please!] in years to come), it's really useless for the real time ray tracing we'll be doing for the next two years or so. Who knows after then?

So the next thing we can do falls into two sub-categories:

1.) Improve the efficiency of all intersection tests.
2.) Exploit temporal and spatial coherence.

The solution to the first problem is very straightforward: just go through all your algorithms and equations, and make sure they're absolutely optimal. This normally involves special-casing equations for certain objects (e.g. not using a general quadric intersector to intersect a sphere) and factorising equations to minimise the number of operations required to perform the intersection. Also, sometimes you only need to know whether an intersection occurs, and don't really care about where exactly it occurs; this can be really well optimised.
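To make the point above concrete, here is a minimal sketch (assuming a simple `Vec3` aggregate and a normalized ray direction; the function names are hypothetical, not from the article) of a specialized ray/sphere test in geometric form, plus a boolean variant that exits before the square root when you only need to know whether a hit exists:

```cpp
#include <cmath>

// Minimal 3D vector for the sketch.
struct Vec3 {
    double x, y, z;
};
static double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}
static Vec3 sub(const Vec3& a, const Vec3& b) {
    return {a.x - b.x, a.y - b.y, a.z - b.z};
}

// Specialized ray/sphere test (geometric form), instead of running a
// general quadric intersector. Assumes `dir` is normalized, so the
// quadratic's 'a' coefficient is 1 and drops out.
// Returns the nearest positive hit distance, or -1 if no hit.
double intersectSphere(const Vec3& orig, const Vec3& dir,
                       const Vec3& center, double radius) {
    Vec3 oc = sub(center, orig);
    double tca = dot(oc, dir);            // projection of center onto ray
    if (tca < 0.0) return -1.0;           // sphere is behind the ray
    double d2 = dot(oc, oc) - tca * tca;  // squared distance ray<->center
    double r2 = radius * radius;
    if (d2 > r2) return -1.0;             // ray misses the sphere
    double thc = std::sqrt(r2 - d2);
    double t = tca - thc;                 // nearest intersection
    return (t > 0.0) ? t : tca + thc;     // handle origin inside sphere
}

// "Sometimes you only need to know IF an intersection occurs": same math,
// but we can exit before ever taking the square root.
bool hitsSphere(const Vec3& orig, const Vec3& dir,
                const Vec3& center, double radius) {
    Vec3 oc = sub(center, orig);
    double tca = dot(oc, dir);
    if (tca < 0.0) return false;
    return dot(oc, oc) - tca * tca <= radius * radius;
}
```

The boolean variant avoids the `sqrt` and the two root computations entirely, which is exactly the kind of special-casing the article is describing.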

Now let me list some examples of exploiting spatial and temporal coherence.

Speedup tricks
Aside from improving algorithms and using the current machine more effectively via specialised instruction sets, you can also improve speed greatly by just using your head a little. If you see a part of your code that's taking a particularly long time to execute, ask yourself if maybe there isn't a way that this could either be removed, precalculated, simulated or otherwise avoided.

Some notable speedup tricks include:

First hit optimisation
If any of the values you calculate during the course of rendering involve the origin of the ray and some other value(s) that remain constant for each frame (sphere radii, object positions etc), you can precalculate those values and store them as private data members of the object's data structure, and just reuse them whenever you need to calculate stuff from primary rays (the rays spawned from the camera or eye). This only works for primary rays, because the reflected or refracted rays obviously do not propagate from the camera's position, which means you'll have to write a specialised intersection routine for primary rays (or, as you'll see later, rays with any fixed origin).

This is a real must for any real time ray tracer, as primary rays constitute a large percentage (relatively, compared to "normal" ray tracing) of the total rays fired, since the scene complexity and difficulty (erm, for want of a better adjective describing the number of reflective and refractive surfaces :) are much lower for this type of scene.
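A minimal sketch of how such per-frame caching could look, assuming a camera origin that is fixed for the whole frame. The `Sphere` layout and the cached member names (`oc`, `ocDotOc`, `r2`) are hypothetical, not from the article; the point is just that everything depending only on the camera origin and the object is computed once per frame instead of once per ray:

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };
static double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Sphere with per-frame cached members, as the article suggests:
// values involving the (fixed) ray origin are stored as private data
// of the object and reused by every primary ray.
struct Sphere {
    Vec3 center;
    double radius;

    // --- cached once per frame ---
    Vec3 oc;        // center - camera origin
    double ocDotOc; // |oc|^2, reused by every primary ray
    double r2;      // radius^2

    void precalcForFrame(const Vec3& camOrigin) {
        oc = {center.x - camOrigin.x,
              center.y - camOrigin.y,
              center.z - camOrigin.z};
        ocDotOc = dot(oc, oc);
        r2 = radius * radius;
    }

    // Specialized primary-ray intersection: only one dot product and a
    // few multiplies left per ray. Assumes `dir` is normalized; returns
    // the nearest positive t, or -1 on a miss.
    double intersectPrimary(const Vec3& dir) const {
        double tca = dot(oc, dir);
        if (tca < 0.0) return -1.0;
        double d2 = ocDotOc - tca * tca;
        if (d2 > r2) return -1.0;
        double thc = std::sqrt(r2 - d2);
        double t = tca - thc;
        return (t > 0.0) ? t : tca + thc;
    }
};
```

The same routine works for any ray with a fixed origin, which is why the article says the trick generalizes beyond primary rays.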

Shadow check optimisation
Shadow checks are very expensive when you have lots of objects in the scene (n objects = n shadow checks PER INTERSECTION). But since all these shadow rays propagate from a fixed origin (the point of intersection), you can precalculate the same values as mentioned above for the shadow rays' origin. This should save you an enormous amount of time.

Again, this really is a must, because it will greatly reduce the speed impact that additional objects incur when being added to the scene.
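A sketch of the shadow-check structure being described, under the same assumptions as before (simple `Vec3`, normalized direction; the names are mine, not the article's). All shadow rays from one hit point share that origin, so the per-origin values could be cached exactly as in the first-hit case; here the code mainly shows the early-out, where any single occluder ends the search:

```cpp
#include <vector>

struct Vec3 { double x, y, z; };
static double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

struct SphereData { Vec3 center; double r2; }; // r2 = radius^2

// One shadow query from a fixed hit point toward a light. `toLight`
// must be normalized; `lightDist` is the distance to the light, so
// occluders behind the ray or beyond the light are ignored. Returns as
// soon as ANY occluder is found; we never need the exact hit point.
bool inShadow(const Vec3& hitPoint, const Vec3& toLight, double lightDist,
              const std::vector<SphereData>& spheres) {
    for (const SphereData& s : spheres) {
        Vec3 oc = {s.center.x - hitPoint.x,
                   s.center.y - hitPoint.y,
                   s.center.z - hitPoint.z};
        double tca = dot(oc, toLight);
        if (tca < 0.0 || tca > lightDist) continue; // behind, or past light
        double d2 = dot(oc, oc) - tca * tca;
        if (d2 <= s.r2) return true; // any occluder is enough: exit early
    }
    return false;
}
```

Note there is no square root anywhere in the loop, which is the "only need to know if an intersection occurs" optimisation applied to the n-objects-per-intersection shadow cost.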

Render at 1/2 or 1/4 resolution and interpolate
This can be seen in Rubicon 1 and 2. Though it isn't really a trick, it's hardly a kosher way of improving rendering performance either :) You needn't interpolate in normal squares, I've heard of many triangular interpolation schemes (such as the one used in Spinning Kids' "I feel like I could" demo) which look considerably better than standard square-based interpolation. I'm not really sure if there's a speed hit or not when interpolating along triangles rather than squares, but I'm sure there must be (it's just simpler to interpolate along squares). Which raises the question of whether or not those cycles should be used to improve the interpolation, or reduce the amount of interpolation and use squares. Tricky, but I'm sure it wouldn't go that far... some concrete performance tests would be great, but there aren't exactly vast pools of real time ray tracing info on the net :)
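For the square-based scheme the article mentions, a half-resolution pass might look like the sketch below: trace every second pixel in each direction, then fill the gaps by averaging neighbours. The `trace` callback and grayscale `float` framebuffer are simplifying assumptions, not anything from Rubicon:

```cpp
#include <vector>
#include <cstddef>

// Trace only every 2nd pixel in each direction, then interpolate the
// rest (square-based interpolation; the triangular schemes mentioned
// above are fancier). `trace` stands in for the real per-pixel tracer.
std::vector<float> renderHalfRes(std::size_t w, std::size_t h,
                                 float (*trace)(std::size_t, std::size_t)) {
    std::vector<float> img(w * h, 0.0f);
    // 1) trace the coarse grid (even x, even y)
    for (std::size_t y = 0; y < h; y += 2)
        for (std::size_t x = 0; x < w; x += 2)
            img[y * w + x] = trace(x, y);
    // 2) fill odd columns on even rows from their traced neighbours
    for (std::size_t y = 0; y < h; y += 2)
        for (std::size_t x = 1; x < w; x += 2) {
            float left  = img[y * w + x - 1];
            float right = (x + 1 < w) ? img[y * w + x + 1] : left;
            img[y * w + x] = 0.5f * (left + right);
        }
    // 3) fill odd rows from the now-complete even rows above and below
    for (std::size_t y = 1; y < h; y += 2)
        for (std::size_t x = 0; x < w; ++x) {
            float up   = img[(y - 1) * w + x];
            float down = (y + 1 < h) ? img[(y + 1) * w + x] : up;
            img[y * w + x] = 0.5f * (up + down);
        }
    return img;
}
```

Only a quarter of the pixels are actually traced; the interpolation passes are pure memory traffic and cheap arithmetic, which is where the speedup comes from.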

Pre-calculate per-frame constants
As mentioned in the introduction, some values related to intersections are constant for the entire frame (that is, are independent of ray orientation), and can thus be pre-calculated. Vector dot products are really good pre-calculation fodder: the time taken to load the floating point values over the slow system bus (I'm using a Celeron, so I try to avoid using my 83.3 MHz system bus and use the high clock speed instead :), multiply them, sum them, then store the result in a floating point register... this is no walk in the park. But loading a single float from RAM is hardly straining, and it also saves you tons of CPU work."
The first thing is, ray tracing lends itself well to many CPUs. You can also assume that most future growth in computing power will come from adding more cores and CPUs, so plan for that from the start, possibly even using 16 or 32 threads each handling their own little box onscreen, to keep future expansion in mind. You can't do a ton with raytracing easily today, but if you had 1024 cores at the same power it would be pretty easy.
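The tile-per-thread idea above can be sketched like this (a minimal illustration, assuming a grayscale `float` framebuffer and a stand-in `shade` callback; a real renderer would trace full rays per pixel). Horizontal bands are used instead of boxes to keep the indexing trivial, and since no two threads write the same rows, no locking is needed:

```cpp
#include <thread>
#include <vector>
#include <cstddef>

// Split the screen into horizontal bands, one thread per band. Each
// thread writes only its own rows, so the threads are fully independent.
void renderThreaded(std::size_t w, std::size_t h, std::size_t nThreads,
                    std::vector<float>& img,
                    float (*shade)(std::size_t, std::size_t)) {
    std::vector<std::thread> pool;
    for (std::size_t t = 0; t < nThreads; ++t) {
        std::size_t y0 = t * h / nThreads;       // band start (inclusive)
        std::size_t y1 = (t + 1) * h / nThreads; // band end (exclusive)
        pool.emplace_back([=, &img] {
            for (std::size_t y = y0; y < y1; ++y)
                for (std::size_t x = 0; x < w; ++x)
                    img[y * w + x] = shade(x, y);
        });
    }
    for (std::thread& th : pool) th.join(); // wait for the whole frame
}
```

Because the bands share nothing, this scales with core count for the reason given above: each thread is an independent little ray tracer over its own region.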

Most of ray tracing is just collision detection. It's not exactly easy to make that happen quickly, or even correctly. Even when you understand it all thoroughly, it is hard to make good implementations of things like BSP trees, octrees, kd-trees, and hierarchical AABB trees that work right, don't take up too much space, don't fragment memory, and don't choke the data or instruction caches. Especially when, for raytracing, you will have a very large number of these structures that can have lots of data in them.

Christer Ericson's book Real-Time Collision Detection, though geared more toward physics, would probably help a lot. Jacco Bikker's articles could help as well.

This is my thread. There are many threads like it, but this one is mine.

Thanks for your response, but actually I'm looking for small optimizations at the algorithm level, because I want to raytrace in realtime. I won't use complex scenes; just 3 spheres with reflection and Phong lighting or something would be OK.

But I was surprised to see so many optimizations listed in the article I posted that I had never heard of and wasn't completely able to understand, so I thought that maybe someone here could explain a few of those to me.
Thanks nonetheless!
Quote:
First hit optimization

This means that in line of sight cases where you don't need to find an exact intersection point, just that there is an intersection (like in the case of shadow determination, where you just need to know if "something" is between the surface and the light), you can exit your intersection test early.

Quote:Shadow check optimization

Not sure without reading more of his article, but it might be same as above.

Quote:Pre-calculate per-frame constants

In other words, if something is constant across the frame, you can store the calculation instead of recomputing it for every ray. E.g. if you have a calculation based on the camera position and a sphere's radius, you don't need to recalculate it for every ray since it doesn't change. You need to balance the cost of calculating a value vs. the cost of reading it from memory, though. Sometimes small calculations can be cheaper than bloating your data.

Quote:And it says that you must not use a general quadratic intersector to intersect a sphere.

He was just using this as an example of making sure you use the optimal solution for the problem. In other words, there are faster ray/sphere intersection methods than the quadratic solution.

Hope that helps some.
Quote:Original post by FBMachine
Quote:
First hit optimization

This means that in line of sight cases where you don't need to find an exact intersection point, just that there is an intersection (like in the case of shadow determination, where you just need to know if "something" is between the surface and the light), you can exit your intersection test early.

I think that this is the shadow ray optimization. I did not read the article, but I've already heard "first hit optimization" used when talking about rasterization/raytracing hybrids: you don't trace eye rays, you just rasterize the scene. For each pixel you automatically have the primitive, and thus the intersection point and the normal (in addition to surface properties). As eye rays are usually very coherent, you can exploit the rasterization benefits.
Another optimization you can employ (but results might not be very good): you only trace rays for pixels with even coordinates ((0,0), (2,2), (4,4)...), or you can use any other strategy you think could fit. The missing intersection points are evaluated by interpolating the surrounding pixels' intersection coordinates, if the primitives they belong to are the same...

In general, eye rays are very coherent. That cannot be said for most of the others, but it can for primary rays, and you can try to exploit that. For simple scenes and for Whitted-style RT, eye rays are most of the work that must be done anyway, so decreasing the time required for that step can heavily speed up the whole render.
Heavenly: ompf.org/forum. Lycium and Jacco Bikker (who has written a well-known introduction to performant ray tracing) are there, too.

In comparison, lyc wrote about a demo-ish and fake-y ray tracer (which is highly cool), and Jacco wrote about unfaked, near-real-time ray tracing as is modern these days (which is highly cool, too).

This topic is closed to new replies.
