Ray Tracing

Okay, I think I have the basics of a ray tracer implemented, but I'm unsure how to get from a pixel coordinate to a world coordinate for all my points. I'm using OpenGL for this, so there might be a function which does this, but I'm not sure how well it would work with movement. I have a Camera class which has its position, direction, and spherical polar coordinates for its latitude and longitude, if that helps. I don't really have much idea about this, or whether my maths is correct, so I would appreciate any help you could give me.

Neutrinohunter
I know basically two methods:
- you generate the ray in the object space of the camera, then transform it into world space with the camera's world matrix.
- the other way is to transform the four corners of the view plane into world space, and then interpolate across that surface.

I use the second method, which in pseudocode would be:

Point A, B, C, D, Pos;
A   = (top-left corner in camera space)
B   = (top-right corner in camera space)
C   = (bottom-left corner in camera space)
D   = (bottom-right corner in camera space)
Pos = (position of the camera in camera space, usually (0,0,0))

wA   = CamToWorld * A;
wB   = CamToWorld * B;
wC   = CamToWorld * C;
wD   = CamToWorld * D;
wPos = CamToWorld * Pos;

for(int y = 0; y < yres; ++y)
{
    for(int x = 0; x < xres; ++x)
    {
        Point p = Interpolate(wA, wB, wC, wD, x, y);  // interpolate the world-space corners
        Ray r;
        r.origin    = wPos;
        r.direction = (p - wPos).Normalize();
    }
}

This is more or less what I use to generate rays (I could send you real code if you need it). I can't say that this method is better than transforming camera space rays with the World matrix of the camera, but this is the one I use.
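For completeness, the first method would look roughly like this. This is only a sketch, not code from my tracer: Vec3, Ray, TransformPoint and TransformDirection are made-up helpers, and I'm assuming a view plane 1.5 units in front of the camera, 2 units wide and 1 unit high.

// Sketch of the first method: build each ray in camera space, then move it
// into world space with the camera's CamToWorld matrix.
for (int y = 0; y < yres; ++y)
{
    for (int x = 0; x < xres; ++x)
    {
        // Point on the view plane in camera space:
        // x runs from -1 to +1, y from +0.5 to -0.5, plane at z = 1.5
        float u = -1.0f + 2.0f * (x + 0.5f) / xres;
        float v =  0.5f - 1.0f * (y + 0.5f) / yres;
        Vec3 pCam(u, v, 1.5f);

        Ray r;
        r.origin    = TransformPoint(CamToWorld, Vec3(0.0f, 0.0f, 0.0f)); // camera position in world space
        // In camera space the direction is pCam - (0,0,0) = pCam;
        // directions are transformed with the rotation part only.
        r.direction = Normalize(TransformDirection(CamToWorld, pCam));
    }
}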


I think it's all these coordinate systems which are throwing me off. What is the CamToWorld transformation, and what is camera space? If there's OpenGL code, I would be most grateful.


I have code which is pretty much exactly the same, i.e.

for each pixel in surface {
    Ray r;
    r.origin    = someOrigin;
    r.direction = someOrigin - pixelPoint;
    r.SelfNormalize();
}

It's just the transformation side; I tried something along the lines of a couple of matrices and a scaling factor, but to be honest my maths isn't the best.

neutrinohunter
Well, I'm not a maths guru myself, so I will try to explain in simple terms (hoping that they are also correct):
Every object in the virtual scene is defined in a local space. For example, in its local space the camera is centred at (0,0,0), but when placed in the scene it can end up anywhere, e.g. (10, -2, 22). This transformation usually changes both position and direction (the viewing vector, in the camera's case; the orientation, for other objects), but it can also include other transformations: scale or projections, for example.
This transformation is performed through matrices, and it is what I call CamToWorld. This is done in software as well as in hardware renderers.
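As a rough illustration only (not my real code): if you know the camera's world-space position and its right/up/forward vectors, the CamToWorld matrix is just those stacked into a 4x4. Mat4 and Vec3 are placeholder types here, stored so that points are column vectors (p' = M * p).

// Sketch: build a CamToWorld matrix from the camera's position and its
// orthonormal basis (right, up, forward), all expressed in world space.
Mat4 MakeCamToWorld(const Vec3 &pos, const Vec3 &right, const Vec3 &up, const Vec3 &forward)
{
    Mat4 m;
    // Rotation part: the camera axes go into the first three columns,
    // the camera position goes into the last column.
    m(0,0) = right.x;  m(0,1) = up.x;  m(0,2) = forward.x;  m(0,3) = pos.x;
    m(1,0) = right.y;  m(1,1) = up.y;  m(1,2) = forward.y;  m(1,3) = pos.y;
    m(2,0) = right.z;  m(2,1) = up.z;  m(2,2) = forward.z;  m(2,3) = pos.z;
    m(3,0) = 0;        m(3,1) = 0;     m(3,2) = 0;          m(3,3) = 1;
    return m;
}

Multiplying that matrix by (0, 0, 0, 1) then gives you the camera's position in the scene, which is the wPos from the pseudocode in my earlier post.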

What I do to generate rays is build up this matrix (you can find the matrices for the most common transformations everywhere, Wikipedia included, IIRC). Then you multiply the matrix by the four corners of the 'screen'. Their coordinates depend on how you decide to build your camera; an example is:

A = (-1, 0.5, 1.5)
B = (1, 0.5, 1.5)
C = (-1, -0.5, 1.5)
D = (1, -0.5, 1.5)

This defines a rectangle parallel to the xy plane and perpendicular to the z axis. It is 2 units wide and 1 unit high, and 1.5 units from the origin. These are local coordinates, and they assume that the camera's point of view is at (0,0,0).

What you do is take your matrix and multiply it by the four corner vertices. Then interpolate across the plane to get xres * yres points over the surface, and for each point (pixel) you build a ray in the following way:

r.origin = camera_pos_in_the_scene;
r.direction = (current_point_on_the_surface - r.origin).Normalize();
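The Interpolate step is just a bilinear blend between the four transformed corners. A sketch (assuming a Point type with +, - and scalar *, and passing the resolution in explicitly):

// Sketch: bilinear interpolation between the world-space corners
// wA (top-left), wB (top-right), wC (bottom-left), wD (bottom-right).
Point Interpolate(const Point &wA, const Point &wB,
                  const Point &wC, const Point &wD,
                  int x, int y, int xres, int yres)
{
    float u = (x + 0.5f) / xres;           // 0..1 across the width
    float v = (y + 0.5f) / yres;           // 0..1 down the height
    Point top    = wA + (wB - wA) * u;     // along the top edge
    Point bottom = wC + (wD - wC) * u;     // along the bottom edge
    return top + (bottom - top) * v;       // between the two edges
}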

I could send you both my geometry library and the code I used to build the camera rays, but be warned that:
- everything is a work in progress
- the two parts don't work together, since I recently rewrote the vector library but not yet the camera class.

Honestly, I suggest trying to implement this yourself. It can be hard at first, as it was for me, but then at least you will know better what you're doing.
OK, I've had a stab, but I don't think I've got it right. I think I've tested my intersection code enough to know that the image I'm getting is wrong.

I was using the idea that I could transform from pixel to world coordinates using the inverse of the modelview matrix inside OpenGL.

Something like:

Point = [Transformation along -n for the eye position][Inverse of View][Pixel]

Unfortunately I'm getting nothing except a black screen (the background colour), so I'm thinking that my ray function isn't producing rays that correspond to the actual rays going out (i.e. a sphere should be drawn if it is in view of the camera).
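Concretely, what I'm attempting per pixel is roughly the following (just a sketch of my approach; Mat4, Vec4, Vec3, Inverse and Normalize are from my own maths code, and cameraPosition is the eye position from my Camera class):

// Read back OpenGL's matrices and viewport (note: OpenGL gives these column-major)
GLdouble mv[16], proj[16];
GLint viewport[4];
glGetDoublev(GL_MODELVIEW_MATRIX, mv);
glGetDoublev(GL_PROJECTION_MATRIX, proj);
glGetIntegerv(GL_VIEWPORT, viewport);

// Inverse of the combined world -> clip transform
Mat4 invVP = Inverse(Mat4(proj) * Mat4(mv));

// Pixel (px, py) -> normalized device coordinates on the near plane (z = -1);
// pixel (0,0) is the top-left corner, so y is flipped
double nx = 2.0 * (px + 0.5) / viewport[2] - 1.0;
double ny = 1.0 - 2.0 * (py + 0.5) / viewport[3];
Vec4 nearNDC(nx, ny, -1.0, 1.0);

// Back into world space, then the perspective divide
Vec4 pWorld = invVP * nearNDC;
pWorld /= pWorld.w;

Ray r;
r.origin    = cameraPosition;
r.direction = Normalize(Vec3(pWorld.x, pWorld.y, pWorld.z) - cameraPosition);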

neutrinohunter

I believe this does what you're asking

/* Converts screen pixel coordinates with a given depth into a global 3D position.
   (0,0) is the upper left corner of the screen. */
template<typename T>
void screenPosToGlobal(int xpos, int ypos, T depth, Point3<T> &gPos)
{
    T planeheight = depth*tan(FOV*0.5);             // half height of the frustum plane at this depth
    T leftsf  = 1.0 - 2.0*xpos/(T)SCREEN_WIDTH;     // (-1,1) scale factor for left offset
    T rightsf = 1.0 - 2.0*ypos/(T)SCREEN_HEIGHT;    // (-1,1)
    gPos = eye + dir_forward*depth
               + dir_left*leftsf*planeheight*SCREEN_RATIO
               + dir_up*rightsf*planeheight;
}

/* Converts screen pixel coordinates to a global 3D position with a given -Z coordinate.
   (0,0) is the upper left corner of the screen. */
template<typename T>
void screenClickToGlobal(int xpos, int ypos, T global_z, Point3<T> &gPos)
{
    Point3<T> rayStart, rayEnd;
    gluUnProject(xpos, SCREEN_HEIGHT-ypos, NEAR_CLIP, mvmatrix, projmatrix, view, &rayStart.x, &rayStart.y, &rayStart.z);
    gluUnProject(xpos, SCREEN_HEIGHT-ypos, FAR_CLIP,  mvmatrix, projmatrix, view, &rayEnd.x, &rayEnd.y, &rayEnd.z);
    /*
    screenPosToGlobal(xpos, ypos, (T)NEAR_CLIP, rayStart);
    screenPosToGlobal(xpos, ypos, (T)FAR_CLIP, rayEnd);
    */
    Vector3<T> ray = rayEnd - rayStart;
    T dist = -(rayStart.z + global_z)/ray.z;
    gPos = rayStart + dist*ray;
}
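If what you're after is the primary ray itself rather than a point at a given depth, you can unproject the same pixel at window depths 0.0 (near plane) and 1.0 (far plane) and take the difference. A sketch along those lines, assuming the same globals as above and T = double so the addresses match gluUnProject's GLdouble* parameters:

/* Sketch: build a primary ray for pixel (xpos, ypos) by unprojecting the
   near-plane (winZ = 0.0) and far-plane (winZ = 1.0) points of that pixel. */
template<typename T>
void screenPosToRay(int xpos, int ypos, Point3<T> &origin, Vector3<T> &direction)
{
    Point3<T> rayStart, rayEnd;
    gluUnProject(xpos, SCREEN_HEIGHT-ypos, 0.0, mvmatrix, projmatrix, view,
                 &rayStart.x, &rayStart.y, &rayStart.z);
    gluUnProject(xpos, SCREEN_HEIGHT-ypos, 1.0, mvmatrix, projmatrix, view,
                 &rayEnd.x, &rayEnd.y, &rayEnd.z);

    origin    = rayStart;                            // point on the near plane
    direction = (rayEnd - rayStart).Normalize();     // towards the far plane
}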


Thanks, I'm currently just trying to compile it.

One question, though: what do the last couple of lines in the screenClickToGlobal() function do? I understand the bit using matrices to get the idea of a view plane, but I'm baffled as to what the last bits do exactly.

It looks like the stuff for intersections: position + (time * directionRay), but I'm wondering why the divide by z and what the distance bit is for?

neutrinohunter

