Realtime raytracing

This topic is 4299 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

I'm following a course called "Computer Graphics". We have to write a raytracer, worth 7 of the 20 points of the exam. Actually, we can choose between two projects: a rasterizer (where speed is important) or a raytracer (where speed isn't important but the effects are). So the raytracer doesn't have to be realtime, but I want to amaze the professor by making it realtime.

Currently, I can raytrace 4 flat-shaded, untextured spheres at 20fps at 400x300 pixels. I still have to add Phong shading and "transformations of objects", and later we will get a second part of the assignment, which will probably add things like reflections and translucency.

I'm wondering whether I'll still be able to keep it somewhat realtime after all those effects are added. For example, "transformations of objects" means that for every pixel and every object I have to add a multiplication with a 4x4 matrix, which is maybe more work than everything that's done per pixel right now! Is there a way to do this fast too?

Oh yeah, I used to compile my code with "g++ *.cpp -lSDL", and it ran at 30fps with 1 sphere. Then I compiled it with "g++ *.cpp -lSDL -O3" instead, and it ran at 65fps with 1 sphere! Are there other such optimizations that g++ can do that could make it even faster?

Some time ago I saw a real-time raytracer that raytraced 5 reflecting spheres, so I think there's some room left.

What you should really try is using the vector units of your processor, i.e. SSE/SSE2. GCC has a compiler switch for auto-vectorization, but you'll probably want to hand-optimize it anyway. Take a look at http://www.openrt.de/; they have some publications on how they optimized their renderer, which might give you some helpful hints.

And, if you are allowed to, use GPU shaders for extra computational power. With a good video card, this more than doubles the available processing power.

Check out the forums on this site.
They have produced some VERY fast raytracers, and there's some code floating around there to help you.
Also look up Havran's work; he has done a lot of research on fast raytracing (almost reaching realtime global illumination).

Maybe you want to take a look at this library:
http://www.libsh.org/

Basically, it lets you use your GPU as a general vector processor, e.g. to add two arrays of numbers element-wise. You can use it to calculate intersections of rays and spheres in parallel, for instance.

One obvious thing is that you don't have to transform every ray. Just transform the objects at the beginning of each frame: keep the matrices and the original coordinates, but do the transformation only once per object per frame. This is especially true for spheres, which are invariant under rotation (unless you start adding surface shaders).
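
To make the per-frame idea concrete, here is a minimal sketch. The `Vec3`, `Mat4` and `Sphere` types and the function names are invented for illustration (they are not from the original poster's code), and it assumes a rigid transform (rotation plus translation), so a sphere stays a sphere and only its centre needs transforming.

```cpp
#include <array>
#include <vector>

// Hypothetical minimal types for illustration only.
struct Vec3 { float x, y, z; };

// Row-major 4x4 matrix; the last row is assumed to be (0, 0, 0, 1).
using Mat4 = std::array<float, 16>;

struct Sphere {
    Vec3  localCenter;  // centre in object space, never modified
    float radius;
    Mat4  toWorld;      // assumed rigid: rotation + translation only
    Vec3  worldCenter;  // cached result, refreshed once per frame
};

// Apply an affine transform to a point (implicit w = 1).
inline Vec3 transformPoint(const Mat4& m, const Vec3& p) {
    return {
        m[0] * p.x + m[1] * p.y + m[2]  * p.z + m[3],
        m[4] * p.x + m[5] * p.y + m[6]  * p.z + m[7],
        m[8] * p.x + m[9] * p.y + m[10] * p.z + m[11],
    };
}

// Call once at the start of each frame: one matrix multiply per object
// instead of one per pixel per object.
inline void updateWorldCenters(std::vector<Sphere>& spheres) {
    for (Sphere& s : spheres) {
        s.worldCenter = transformPoint(s.toWorld, s.localCenter);
    }
}
```

The per-pixel intersection code would then read only `worldCenter` and `radius`, so the cost of the matrix work no longer depends on the image resolution.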

Heaven Seven is an old (2000) example of real time ray tracing (in 64k no less).

You'll want some form of space partitioning to help speed up intersection tests (it probably won't help with 4 objects, but it becomes very useful very quickly). There were some tricks that use interpolation instead of complete samples, but I don't remember them.
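
Most space-partitioning schemes (grids, kd-trees, bounding volume hierarchies) lean on a cheap ray-versus-box test to skip whole groups of objects. Below is a sketch of the common "slab" test for an axis-aligned box. The type and function names are assumptions made for this example, not code from any of the projects mentioned in this thread.

```cpp
#include <algorithm>
#include <utility>

// Hypothetical minimal types for illustration only.
struct Vec3 { float x, y, z; };
struct AABB { Vec3 min, max; };

struct Ray {
    Vec3 origin;
    Vec3 invDir;  // 1 / direction per component, computed once per ray
};

inline Ray makeRay(const Vec3& origin, const Vec3& dir) {
    return { origin, { 1.0f / dir.x, 1.0f / dir.y, 1.0f / dir.z } };
}

// Slab test: true if the ray overlaps the box for some t in [0, tMax].
// The ray is clipped against the three pairs of axis-aligned planes in turn.
inline bool hitsBox(const Ray& r, const AABB& b, float tMax) {
    const float o[3]  = { r.origin.x, r.origin.y, r.origin.z };
    const float id[3] = { r.invDir.x, r.invDir.y, r.invDir.z };
    const float lo[3] = { b.min.x, b.min.y, b.min.z };
    const float hi[3] = { b.max.x, b.max.y, b.max.z };
    float t0 = 0.0f;
    float t1 = tMax;
    for (int i = 0; i < 3; ++i) {
        float tNear = (lo[i] - o[i]) * id[i];
        float tFar  = (hi[i] - o[i]) * id[i];
        if (tNear > tFar) std::swap(tNear, tFar);
        t0 = std::max(t0, tNear);
        t1 = std::min(t1, tFar);
        if (t0 > t1) return false;  // the slab intervals no longer overlap
    }
    return true;
}
```

If the box is missed, none of the objects inside it need to be tested, which is where the speed-up for larger scenes comes from.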

Quote:
Original post by Cocalus
Heaven Seven is an old (2000) example of real time ray tracing (in 64k no less).

You'll want some form of space partitioning to help speed up intersection tests (it probably won't help with 4 objects, but it becomes very useful very quickly). There were some tricks that use interpolation instead of complete samples, but I don't remember them.


One of the (trivial) tricks I know consists of sampling only every other pixel and then interpolating the rest. Alternatively, you could store the intersection point and the normal for each pixel in a buffer: compute the intersection for every other pixel, interpolate both the normal and the coordinates for the remaining pixels, and then shade each pixel.
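
A rough sketch of the first variant (trace every other pixel, interpolate the rest) for a single row might look like this. The `trace` callback is a stand-in for whatever per-pixel ray tracing function you already have, and the grey-value `float` output is an assumption to keep the example short.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Shade every other pixel of one row with the (expensive) tracer and fill the
// gaps by averaging the two horizontal neighbours.
// `trace(x, y)` is a hypothetical callback returning a grey value for a pixel.
inline std::vector<float> renderRowSubsampled(
        int width, int y, const std::function<float(int, int)>& trace) {
    std::vector<float> row(static_cast<std::size_t>(width), 0.0f);

    // Pass 1: trace the even columns.
    for (int x = 0; x < width; x += 2) {
        row[x] = trace(x, y);
    }
    // If the last column is odd it has no right-hand neighbour, so trace it too.
    if (width > 1 && (width - 1) % 2 == 1) {
        row[width - 1] = trace(width - 1, y);
    }
    // Pass 2: interpolate the remaining odd columns from their traced neighbours.
    for (int x = 1; x + 1 < width; x += 2) {
        row[x] = 0.5f * (row[x - 1] + row[x + 1]);
    }
    return row;
}
```

This roughly halves the number of traced rays per row. Plain averaging will blur object silhouettes, so a real implementation would probably want to re-trace a pixel when its two neighbours differ strongly; that refinement is left out here.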

I have never written one, but I would say the big problem will be a good data structure, like a kd-tree. And if the scene is dynamic, it is not so easy.

Realtime raytracing is nontrivial.

GPU acceleration is not going to help you much, since in most of the tests from the theses I have seen, GPU ray tracing still falls behind CPU ray tracing.

Suggestions for implementing a real-time raytracer:
1. Familiarize yourself with SIMD (SSE instructions for vectorization).
2. Implement your math library using SSE.
3. Use kd-trees for space partitioning (use the SAH, the surface area heuristic, for building the kd-tree).
4. Do packet tracing on the kd-tree (see Ingo Wald's PhD thesis); this will allow you to trace 4 rays at once.
5. Multi-thread your ray tracer, with the number of threads equal to the number of logical processors, and distribute the work equally.

Realtime raytracing is easily possible, provided you are willing to sacrifice at least one of the following:

- Image resolution
- Rendering accuracy
- Special effects
- Nontrivial geometry


Don't forget that raytracing has a very nasty cost curve. A few compiler tricks and some clever optimizations might get you a huge speed boost in very trivial scenes, but those advantages will basically disappear as soon as your scene actually has something interesting in it.

If you're really interested in being impressive, write a photon mapper or a path tracer instead. The visual results will be much more interesting. Any realtime effects you're likely to be able to accomplish in the time you have aren't going to look very impressive, because of the concessions you have to make to gain speed. If you'd been doing raytracing and realtime demos for ten years maybe you'd have a solid shot at it, but IMHO you could choose a much more realistic and effective goal.


Put it this way: building a global illumination solution is a well-solved problem. Monte Carlo path tracing and photon mapping are your best bets here; photon mapping is probably the easiest to test in a short time because renders are relatively fast (compared to MCPT). These problems have been solved many times before, and the available documentation and research on them is vast. By contrast, nobody has solved the high-quality realtime raytracing problem yet, and documentation is sparse.

I don't want to sound like I'm trying to discourage you from your goal, but pragmatically speaking, I personally think you could hedge your bets a bit and line up a potentially much more successful project.


Just my grumpy tuppence [smile]

Guest Anonymous Poster
If you go by the suggestion of ApochPiQ, and instead focus on Global Illumination, you will learn a lot more theory too!
If you go for realtime raytracing with simple effects you will learn about optimization techniques, tricks to skip calculations etc.

I took that CG course last year.

I believe trying to make it realtime is shooting yourself in the foot.
They'll give you a higher grade if you add more features than they request but I doubt they'll give you points for dropping features so that you could make it run realtime.
Also, keep in mind that IIRC you'll have to use 3D meshes consisting of thousands of triangles, so trying to show off with 4 bouncing spheres while they were expecting dismembered naked ladies may not work as well as you'd like. (such a model should be renderable in a few seconds, but that's still at least 10x slower than needed for realtime)

Dutré likes global illumination (CG2, which I took this year), so I would suggest, as others already did, that you go for that. It doesn't have to run fast, so you can easily let the PCs in building B render for a few hours (or days) using Monte Carlo ray tracing or some smarter technique like photon mapping.

Lastly, if you're "just" going for a high grade:
I got a 19 (which is not exceptional for that course) with CSG, 3D textures, and Perlin noise. (Or maybe it was my super-duper scripting language, which I claimed during the presentation supported variables, conditionals, loops, classes, inheritance and polymorphism, while in reality I was just using Java itself to describe scenes and having the raytracer load the scene class dynamically. Yes, I wrote it in Java.) The presentation consists mostly of a bored assistant (Ares or Pieter, presumably) asking to see a few of your pictures so that they can check whether certain features are implemented. Having a dozen ready, rendered at 1024x768 with nice AA, will do the trick.

I took that course too, and I'm currently doing my master's thesis on frameless rendering, which can be described as a collective name for techniques that speed up raytracing in animations.

To be brief, real-time ray tracing is not really something extravagant. It all depends on the resolution you want it to run at and, even more, on the features you want to support.

Consider the following:
- A sphere is about the easiest primitive to render (besides a plane). Compare it to a 500k-polygon model, which could well be 100-500x slower, depending on your sublinear intersection structures (bounding boxes, BSP, ...); without those it would be 500,000x slower.
- Rendering soft shadows could be 10x slower than using no shadows if you want some quality.
- Recursive ray tracing (reflections, refractions) slows down the process by a factor equal to your recursion depth; 10 is not uncommon.
- And if you want all of the above to look nice, you need AA, which again makes it a few times slower.

All in all, a simple scene like the one you described probably renders 5000x faster than a scene loaded with only simple raytracing features (excluding any global illumination), so your 20fps might drop to 0.004fps. Still, that's only 250 seconds per picture. I remember rendering 1000x1000, fully featured pictures in maybe an hour or so, which was pretty decent.

As said above, it's better to drop the idea of making it real time and go for features. In the end, if you go for raytracing, the only thing you need to show is results ("nice pictures", if you like), so focus on image quality and don't waste your time on pure speed, because you most probably won't be able to show the rendering speed anyway.

If you just want to render your image faster, your best bet is simply to distribute the program across many computers: break the image up into square tiles and hand a tile off to each computer. Other than that, your first step is to implement a good acceleration structure, such as a k-d tree or adaptive grids; there are tons of papers on these. The guy talking about SIMD also brings up a good point, but it only works for bundles of rays that are close together. So it doesn't help much for secondary reflections, because only the rays of the first pass (visible surface determination) are that coherent. Specular lobes are coherent too, but tracing 4 close rays can lead to bias in a statistical simulation, which might take some thinking to get right. The first order of business is a nice k-d tree, and it's really important to take your time reading some papers on how to traverse it quickly and, more importantly, on which heuristics to use when building it. The basic idea is to maximize the size of regions with empty space, but there's no point in going into that here.
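
The tile splitting itself is only a few lines. Here is a sketch; the `Tile` struct and `makeTiles` are names made up for this example, and how the tiles actually reach the other machines (or threads) is left open.

```cpp
#include <algorithm>
#include <vector>

// Half-open pixel rectangle: [x0, x1) x [y0, y1). Hypothetical type.
struct Tile { int x0, y0, x1, y1; };

// Cut a width x height image into square tiles of tileSize pixels.
// Tiles on the right and bottom edges are clipped to the image bounds.
// tileSize is assumed to be positive.
inline std::vector<Tile> makeTiles(int width, int height, int tileSize) {
    std::vector<Tile> tiles;
    for (int y = 0; y < height; y += tileSize) {
        for (int x = 0; x < width; x += tileSize) {
            tiles.push_back({ x, y,
                              std::min(x + tileSize, width),
                              std::min(y + tileSize, height) });
        }
    }
    return tiles;
}
```

Each worker then renders only the pixels inside the tiles it was given, for example tile `i` going to worker `i % workerCount`. Tiles are independent of each other, which is why this kind of distribution needs no communication while rendering.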

One simple and obvious speed-up is to look only for the first intersection when tracing shadow rays, instead of trying to find the closest one. Trace square regions of the image instead of going scanline by scanline, to maximize cache coherence. Instead of testing a ray against one object at a time, use SIMD to test it against several at once. Note that this is different from tracing bundles of rays in parallel and is much easier to implement. Good luck.
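
The shadow-ray shortcut could be sketched like this for a scene of spheres. The types and function names are assumptions for the example, the ray direction is assumed to be normalized, and the small epsilon that avoids self-shadowing is an arbitrary value.

```cpp
#include <cmath>
#include <vector>

// Hypothetical minimal types for illustration only.
struct Vec3 { float x, y, z; };
struct Sphere { Vec3 center; float radius; };

inline Vec3  sub(const Vec3& a, const Vec3& b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
inline float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// True if origin + t * dir hits the sphere for some t in (tMin, tMax).
// `dir` must be normalized, so the quadratic is t^2 + 2bt + c = 0.
inline bool hitsSphere(const Sphere& s, const Vec3& origin, const Vec3& dir,
                       float tMin, float tMax) {
    const Vec3  oc   = sub(origin, s.center);
    const float b    = dot(oc, dir);
    const float c    = dot(oc, oc) - s.radius * s.radius;
    const float disc = b * b - c;
    if (disc < 0.0f) return false;
    const float root = std::sqrt(disc);
    const float t0 = -b - root;
    if (t0 > tMin && t0 < tMax) return true;
    const float t1 = -b + root;
    return t1 > tMin && t1 < tMax;
}

// Shadow query: any occluder between the point and the light is enough,
// so return at the first hit instead of searching for the closest one.
inline bool inShadow(const std::vector<Sphere>& scene, const Vec3& point,
                     const Vec3& toLight, float lightDistance) {
    const float eps = 1e-3f;  // offset to avoid hitting the surface we start on
    for (const Sphere& s : scene) {
        if (hitsSphere(s, point, toLight, eps, lightDistance)) return true;
    }
    return false;
}
```

Compared with a closest-hit query, this skips both the remaining objects after the first occluder and any bookkeeping about which hit is nearest.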

Quote:
Original post by Eelco
the whole SIMD route is the last one you should take. hardly trivial, and for what? a 2x speed increase if you're lucky.


If you do it correctly, your average SIMD solution operates on four-element vectors, so you can get a 4x speed increase.

And I disagree that it's the last thing you should do. It's difficult to vectorize code if you didn't plan for it from the start by doing things like keeping your data in Structure-of-Arrays form instead of Array-of-Structures.
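
To show what the layout difference means in practice, here is a small sketch in plain C++ without intrinsics. The struct and function names are invented for the example. The loop computes the discriminant of the ray/sphere quadratic for every sphere, which is the part of the intersection test that maps naturally onto 4-wide SIMD.

```cpp
#include <cstddef>
#include <vector>

// Array-of-Structures: the four floats of one sphere sit next to each other.
struct SphereAoS { float cx, cy, cz, radius; };

// Structure-of-Arrays: all x values are contiguous, then all y values, etc.,
// so four consecutive spheres can be loaded into one SIMD register per field.
struct SpheresSoA {
    std::vector<float> cx, cy, cz, radius;
};

inline SpheresSoA toSoA(const std::vector<SphereAoS>& in) {
    SpheresSoA out;
    for (const SphereAoS& s : in) {
        out.cx.push_back(s.cx);
        out.cy.push_back(s.cy);
        out.cz.push_back(s.cz);
        out.radius.push_back(s.radius);
    }
    return out;
}

// One ray (origin o, normalized direction d) against all spheres.
// out[i] < 0 means sphere i is missed. The loop body is branch-free and walks
// contiguous arrays, which is the shape auto-vectorizers tend to handle well.
inline void discriminants(const SpheresSoA& s,
                          float ox, float oy, float oz,
                          float dx, float dy, float dz,
                          std::vector<float>& out) {
    const std::size_t n = s.cx.size();
    out.resize(n);
    for (std::size_t i = 0; i < n; ++i) {
        const float ocx = ox - s.cx[i];
        const float ocy = oy - s.cy[i];
        const float ocz = oz - s.cz[i];
        const float b = ocx * dx + ocy * dy + ocz * dz;
        const float c = ocx * ocx + ocy * ocy + ocz * ocz - s.radius[i] * s.radius[i];
        out[i] = b * b - c;
    }
}
```

With the AoS layout the same loop would have to gather `cx` from every fourth float, which is the kind of shuffling that eats into the theoretical 4x gain. Whether the compiler actually vectorizes this loop depends on the compiler and flags, so treat it as a starting point rather than a guarantee.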

Quote:
Original post by superpig
Quote:
Original post by Eelco
the whole SIMD route is the last one you should take. hardly trivial, and for what? a 2x speed increase if you're lucky.


If you do it correctly, your average SIMD solution operates on four-element vectors, so you can get a 4x speed increase.

And I disagree that it's the last thing you should do. It's difficult to vectorize code if you didn't plan for it from the start by doing things like keeping your data in Structure-of-Arrays form instead of Array-of-Structures.


When Eelco wrote 2x, I'm sure he was speaking from practical experience.
SIMD instructions can work on 4 times as much data at once as the traditional FPU.
That doesn't mean it's 4 times as fast; in fact, there's much more room for instruction scheduling when using the FPU.
Furthermore, it's only 4 times as fast when all the memory being worked on is in the L1 cache, which in real life is almost impossible to achieve (for any real problem/data).
If you just want the closest hit among a few spheres, you might be able to get close to a 4x speed increase.
I typically see a 1.5x to 3x speed increase when using SSE (in tight loops).
For ray tracing, I think the guys at the forum I mentioned earlier got an average speedup of 2.4x (very dependent on the number of recursions, shadow rays, etc.) in their attempts at realtime ray tracing.
I just wanted to clarify a few things about the 4x performance boost.
Edit:
I don't want people to think that ANY algorithm could become 4x faster by using SSE.
For example, the best performance increase I got for a ray-triangle intersection was 1.4x (yes, it's probably easy to get a better speedup if you're testing 4 rays against one triangle, 1 ray against 4 triangles, or, even better, 4 rays against 4 triangles).

[Edited by - eq on March 10, 2006 9:30:54 AM]

I'm sure your professor would be impressed if you did anything approaching a functional raytracer on the GPU. It would be fun (in fact, I may be doing just that for a project in my class).

I was thinking of packing vertices, polygon indices, kd-tree nodes, and a table associating polygons with kd-tree nodes into textures, then rendering the usual single quad over the near clip plane to fill the view, and doing everything in a fragment shader.

I have done a very simple project before that only handled spheres with one level of recursion (for refraction/reflection), but it ran in realtime. Once again, though, that wasn't a *real* raytracer; more of a contrived situation, really.

A simple scene with a single ray per pixel (not useful for what you want) would involve width*height rays; compound that a couple of times and start employing Monte Carlo techniques and it gets very big, so even on the GPU it isn't likely to be realtime until we get better hardware. But the GPU can definitely speed things up.

Quote:
Original post by Ysaneya
I just want to confirm from practical experience that implementing SIMD instructions for a real-time raytracer brings a performance increase of x2 on average.


That depends mostly on how much coherence there is in the ray set you're tracing (are we talking primaries? shadow rays? secondaries? etc.) and on how large your ray bundles are. While you can expect a x2.5 speedup for primaries/shadows with 2x2 bundles, you can exploit coherence even more with larger bundles (see longest common traversal sequence, MLRTA and so on).

I won't spam this forum with yet another link to our forum, eh, but that's what we're discussing out there ;)

I got this so far, now only at 10fps (but it's on a computer dating from 2001):

[Image: screenshot of the raytracer's current output]

Now I'm going to implement the other features, and it can't be realtime anymore, but I'll just make a simple scene that can be done realtime to show during the defense (I can control the camera with the keyboard to fly through it and look in all directions), and then a more complex one that will have to be rendered during the defense and includes all required features.
