The Basics

What Raytracing Isn't

Modern GPUs use a "forward rendering" architecture: they go through the polygons one by one and draw each polygon to a buffer.  This is how all 3D-accelerated games work, and it clearly works well.  But raytracing is almost the complete opposite.  Raytracing works through "inverse rendering" ... it goes to each pixel in the buffer and figures out the color of that pixel directly.  As a result:

(a) It tends to be slower because you need to load/unload multiple objects from memory and do operations on them per pixel.
(b) There are certain effects that are easier to do because you have access to every object at the same time.

Because a forward renderer only deals with one polygon at a time, it's often hard to do operations that involve multiple objects (such as shadows, reflections, and transparency).  Inverse renderers are very good at handling effects that require multiple objects to figure out.

How It Works

We place our camera at position p, and then create a parametric ray through a pixel using a direction vector d.  The parametric ray is described like this: Ray(t) = p + d*t  ... t = 0 is the camera, t > 0 is in front of the camera, and t < 0 is behind the camera.
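As a minimal sketch of the parametric ray above, assuming points and directions are stored as plain 3-component tuples (the helper name ray_point is made up for this illustration):

```python
def ray_point(p, d, t):
    """Return the point Ray(t) = p + d*t for 3-component tuples p and d."""
    return tuple(pi + di * t for pi, di in zip(p, d))


p = (0.0, 0.0, 0.0)   # camera position
d = (0.0, 0.0, 1.0)   # direction through one pixel (assumed to point down +z)

at_camera = ray_point(p, d, 0.0)    # t = 0 is the camera itself
in_front = ray_point(p, d, 2.5)     # t > 0 is in front of the camera
behind = ray_point(p, d, -1.0)      # t < 0 is behind the camera
```

Every pixel gets its own d; the camera position p stays the same for all of them.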

What we want to do is take that Ray(t) and figure out where it intersects with our objects (in this case, a gray sphere, a tan sphere, and 2 lights):

The ray below hits the gray sphere.  We then use that contact point (cp) to trace 2 more rays, one for each light (l1 and l2).  The ray to the 2nd light hits both spheres before it gets to the light, so we don't use light number 2 in calculating the color of the sphere.

The ray to the first light doesn't hit anything before the light which tells us that the light falls on the sphere.  We now need to calculate the light's effect on the sphere.
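The intersection test and the shadow rays described above could be sketched roughly like this.  It assumes each sphere is given as a (center, radius) pair and solves the usual quadratic for a ray against a sphere; the function names and the small epsilon are choices made for this illustration, not taken from the original raytracer.

```python
import math

EPSILON = 1e-6  # assumed tolerance so a ray does not re-hit its own starting surface


def hit_sphere(p, d, center, radius):
    """Return the smallest t > EPSILON where Ray(t) = p + d*t meets the sphere, or None."""
    oc = tuple(pi - ci for pi, ci in zip(p, center))
    a = sum(di * di for di in d)
    b = 2.0 * sum(oi * di for oi, di in zip(oc, d))
    c = sum(oi * oi for oi in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None  # the ray misses the sphere entirely
    root = math.sqrt(disc)
    for t in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)):
        if t > EPSILON:
            return t
    return None


def light_visible(cp, light, spheres):
    """True if nothing blocks the segment from the contact point cp to the light."""
    # The direction is left unnormalized, so t = 1 is exactly the light's position.
    d = tuple(li - ci for li, ci in zip(light, cp))
    for center, radius in spheres:
        t = hit_sphere(cp, d, center, radius)
        if t is not None and t < 1.0:
            return False  # something sits between cp and the light
    return True
```

A light that fails this test is simply skipped when the pixel's color is added up, which is what happens to light number 2 in the example above.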

We know the following data for computing the color at that pixel:

  • The color of the sphere (gray = 0.5, 0.5, 0.5)
  • The color of the light (a really bright white = 2, 2, 2)
  • The direction to the light (let's call that dl1 where dl1 = Normalize(l1-cp))
  • The "normal" of the sphere at the contact point (n) ... the normal tells us which way the surface faces.

The light's contribution is computed by multiplying the color of the sphere by the color of the light and by the cosine of the angle between the surface normal and the direction to the light.  As long as dl1 and n are both unit-length vectors, that cosine can be found through a simple dot product.

col = (0.5, 0.5, 0.5) * (2, 2, 2) * DotProduct(dl1, n)
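The formula above translates almost directly into code.  This sketch assumes colors and vectors are 3-component tuples; the clamp to zero for surfaces facing away from the light is an addition for this example and is not stated in the text.

```python
import math


def normalize(v):
    """Scale v to unit length."""
    length = math.sqrt(sum(x * x for x in v))
    return tuple(x / length for x in v)


def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def shade(sphere_color, light_color, dl1, n):
    """col = sphere_color * light_color * DotProduct(dl1, n), per color channel."""
    # Assumption: a negative cosine (light behind the surface) contributes nothing.
    cos_angle = max(0.0, dot_product(dl1, n))
    return tuple(s * l * cos_angle for s, l in zip(sphere_color, light_color))


cp = (0.0, 1.0, 0.0)              # contact point on top of a unit sphere at the origin
l1 = (0.0, 6.0, 0.0)              # light directly above the contact point
n = (0.0, 1.0, 0.0)               # surface normal at cp
dl1 = normalize(tuple(a - b for a, b in zip(l1, cp)))
col = shade((0.5, 0.5, 0.5), (2.0, 2.0, 2.0), dl1, n)   # -> (1.0, 1.0, 1.0)
```

With the light straight overhead the cosine is 1, so the gray sphere under the "really bright" light comes out as full white at that point.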

Light can also be contributed by reflections.  This is done by calculating the direction in which a ray would "bounce" off the object, then recursively calling the trace function from the surface along the bounce.  The following image shows a ray bouncing between two reflective spheres before shooting off into space:

In practice, such a scene would look like this:

One ray is traced per pixel.  If the ray doesn't hit anything, the pixel gets colored black.  If it hits a sphere, the lighting is calculated, and the reflections are traced recursively.  This is just like rendering any regular pixel on the screen, but instead of using Ray(t) = p + d*t we use Ray(t) = cp + dr*t, where dr is the direction the ray bounces off the sphere.  This image helps dispel one of the many misconceptions about raytracing: it does not automatically make things look good.
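Here is a rough sketch of that recursion.  Several things in it are assumptions for illustration only: each sphere is a (center, radius, color, reflectivity) tuple, the sphere's flat color stands in for the full lighting calculation, the bounce uses the standard mirror formula dr = d - 2*(d.n)*n, and MAX_DEPTH is an arbitrary cap so two mirrors facing each other cannot recurse forever.

```python
import math

MAX_DEPTH = 4    # assumed recursion limit
EPSILON = 1e-6   # assumed tolerance to avoid re-hitting the starting surface


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def reflect(d, n):
    """Mirror direction d about the unit normal n."""
    k = 2.0 * dot(d, n)
    return tuple(di - k * ni for di, ni in zip(d, n))


def hit_sphere(p, d, center, radius):
    """Smallest t > EPSILON where p + d*t meets the sphere, or None."""
    oc = tuple(pi - ci for pi, ci in zip(p, center))
    a, b = dot(d, d), 2.0 * dot(oc, d)
    c = dot(oc, oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    for t in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)):
        if t > EPSILON:
            return t
    return None


def trace(p, d, spheres, depth=0):
    """Return an RGB tuple for the ray Ray(t) = p + d*t."""
    nearest = None
    for sphere in spheres:
        t = hit_sphere(p, d, sphere[0], sphere[1])
        if t is not None and (nearest is None or t < nearest[0]):
            nearest = (t, sphere)
    if nearest is None:
        return (0.0, 0.0, 0.0)  # hit nothing: black
    t, (center, radius, color, reflectivity) = nearest
    if depth >= MAX_DEPTH or reflectivity == 0.0:
        return color  # placeholder for the real lighting calculation
    cp = tuple(pi + di * t for pi, di in zip(p, d))
    n = tuple((ci - ce) / radius for ci, ce in zip(cp, center))
    dr = reflect(d, n)
    bounced = trace(cp, dr, spheres, depth + 1)  # Ray(t) = cp + dr*t
    return tuple(c + reflectivity * b for c, b in zip(color, bounced))
```

Calling trace once per pixel with that pixel's direction d gives the kind of image described above; the only difference between a camera ray and a bounce is where the ray starts and which way it points.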

This more inventive camera angle helps the scene a lot:

Notice that the spheres are still perfectly smooth.  Because the raytracer is tracing the sphere, not a collection of polygons, it will always look like a perfect curve.  But it still doesn't look good.  Let's add a floor, some textures on the spheres, and a normal map for the floor.

Now we're starting to get something that's not possible on a GPU.  The marble texture is procedurally generated from a Perlin noise function and has infinite resolution (because it's just a function defined in 3D space).  The normal map on the floor is a ripple that's also procedurally generated (and therefore has theoretically infinite resolution).
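To show what "just a function" means here, this is one way a procedural ripple normal could be written.  The actual ripple function used for the floor isn't given, so everything below is an assumption: the floor lies in the x/z plane, the ripple is a sine wave of the distance from the origin, and the amplitude and frequency values are made up.

```python
import math


def ripple_normal(x, z, amplitude=0.05, frequency=8.0):
    """Unit normal of the height field y = amplitude * sin(frequency * r),
    where r is the distance from the origin in the floor plane."""
    r = math.hypot(x, z)
    if r == 0.0:
        return (0.0, 1.0, 0.0)  # dead center: treat the floor as flat
    slope = amplitude * frequency * math.cos(frequency * r)  # dy/dr
    # Tilt the up-vector against the slope, along the radial direction.
    nx, ny, nz = -slope * x / r, 1.0, -slope * z / r
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    return (nx / length, ny / length, nz / length)
```

Because the normal is computed from the exact hit position instead of being looked up in a bitmap, zooming in never reveals any texels.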

But the scene still looks artificial.  The colors are a little flat and there are plenty of jagged edges.  This is where post-process effects come in.

This is the final rendered image.  There's a light bloom filter to simulate dust on the lens, a vignette filter to simulate a lens hood, a 4x recursive adaptive anti-alias function (anywhere from 1 to 256 rays per pixel) and gamma correction to simulate film response.  (I should add an exposure curve function but I haven't had the time.  The gamma correction works okay for our purposes.)
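Two of these filters are simple enough to sketch per pixel.  The exact curves used for the final image aren't given, so the gamma value of 2.2, the quadratic vignette falloff, and the strength parameter below are all assumptions made for illustration.

```python
import math


def gamma_correct(color, gamma=2.2):
    """Clamp each channel to [0, 1], then apply out = in ** (1 / gamma)."""
    return tuple(min(1.0, max(0.0, c)) ** (1.0 / gamma) for c in color)


def vignette(color, x, y, width, height, strength=0.5):
    """Darken a pixel according to its distance from the image center.

    Assumes width and height are both greater than 1.
    """
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    # 0 at the center of the image, 1 at the corners.
    dist = math.hypot(x - cx, y - cy) / math.hypot(cx, cy)
    falloff = 1.0 - strength * dist * dist
    return tuple(c * falloff for c in color)


# Darken a corner pixel of a 101x101 image, then gamma-correct it.
pixel = vignette((1.0, 1.0, 1.0), 0, 0, 101, 101)   # -> (0.5, 0.5, 0.5)
final = gamma_correct(pixel)
```

The order matters: the vignette is applied to the linear floating-point color, and gamma correction comes last, just before the value is written out.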

The reason the final image looks good is that I have all the time in the world to do the highest quality texture generation and post-process filtering.  All of the colors are handled as double-precision floats the whole way through.

This sort of precision isn't possible with realtime rendering, but these post-process effects could be done on a forward-rendered image as well.  As GPUs are now supporting floating point values for colors, we're starting to see these post-process effects appear in games.