# Faster Ray Traced Terrain Shadow Maps

## Contents

# Introduction

In this tutorial we present a method for generating ray traced terrain shadows at extremely higher speeds than conventional methods. There are few resources out on the web describing how to compute terrain shadows. Many prove extremely **slow**, especially with heightmaps larger than 512x512. These conventional methods can take anywhere from 5-15 minutes to generate a terrain lightmap. One such resource can be found here: Fast Computation Of Terrain Shadow Maps.

There are essentially two things that need to be computed when generating the terrain shadows. First shadows must be computed based on the slope of the terrain. Second shadows that are cast by the terrain itself must be computed. The former will not be discussed in detail here since it is somewhat trivial but source code is given at the end of this tutorial demonstrating the technique. The latter, ray traced terrain shadows, will be discussed in great detail.

## Comparison

To give an example of how much faster the algorithm presented in this tutorial is compared to the above algorithm, the following heightmap was used to generate terrain shadow maps.

Using the above heightmap, the code presented in the tutorial above took over 10 minutes to complete on a Pentium M 2.0Ghz. The algorithm presented in this tutorial took **9.56 seconds !!** That's quite an improvement. Below is the lightmap containing the ray traced terrain shadows generated using the algorithm described in this tutorial.

## Algorithm

The basic algorithm for computing ray traced terrain shadows is given in the link mentioned above and so basics of the technique will not be discussed. What will be discussed is how to dramatically improve its speed.

The first thing one should notice is that for every point on the terrain, a ray must be traced from that location to the light source. Along the way, as described in the tutorial above, the height at the current point along the ray must be checked against the height value of the terrain at that same location. If the height is below the terrain's height then the starting point is checked as being in a shadow. This simple method, while easy, can become extremely computationally expensive for large terrains, as stated above. So a more efficient method must be come up with to improve its speed while still maintaining the same generated lightmap.

### Store Shadow Information

The first thing we need to realize is that we don't have to trace a ray from every point on the terrain all the way across until we intersect a higher point. Instead why not simply allow each point that's already been checked to store information about when it collided with another point? For example, take a look at the terrain given in the image below, with points A, B and C marked in blue.

For the purposes of this tutorial the light source must be a directional one, as stated in the tutorial given above. Suppose the light's direction is coming from the right side of the terrain, which is shown as white lines. Now suppose the current point we are at is point A. Then a ray is traced in the direction of the light source. As we travel along the ray, we check our ray height against the terrain height directly below it. If the terrain height is greater than the ray height, the current point is shadowed. In the image below, the ray traced from point A intersects the terrain below point C. So point X is the point of collision.

The next step is realizing that once we get to point B on the terrain, we should not have to trace a ray across the entire terrain again until reaching some point at or below point C. Instead, we should just be able to use information from point A to check for a collision with the terrain. We can do this by subtracting the height at point X from point C. Call this value h.

So we store this value h in point A's collision information.

Hence as we travel along the ray from point B in the direction of the light source, normally we would have to travel until we hit some point at or below point C. But now point A stores information about where a ray from point B would intersect the terrain at point C. The value h we have just stored indicates how much height was 'left over' when A intersected point C. So what we can do is take this value and add it to the height of A and see if a ray from point B intersects this new height at point A. The image below shows a ray from point B intersecting a point on the terrain below point C.

But we do not want to do this again for point B and especially for every other point. So as we travel along the ray, we check to see if the point below the ray from B has previously stored collision information. So for instance, since point A has collision information, we check to see if our current height on the ray, from point B, is less than the height at point A **+** the its **h** value. If it is less, then we do not need to travel along the ray anymore and so we shadow point B. This saves CPU time.

### Summary

This is essentially the algorithm for every point on the terrain. Here is a summary.

For every point on the terrain:

1) Move along a ray in the direction of the light source.

2) Check the point direction under the ray ( point A ), if it has collision data goto step 3, else goto step 5.

3) If ray height is less than ( point A height ) + ( point A's h value ) goto step 5. Else if point A is unshadowed and has been visited goto step 4.

4) Set current point unshadowed. Goto step 8.

5) Set current point to shadowed and store the difference between ray height and ( point A height ) + ( point A's h value ). Goto step 8.

6) If ray height is less than point A's height then goto step 7.

7) Set current point to shadowed and store the difference between ray height and point A's height. Goto step 8.

8) Move to next pixel.

## A Few More Important Details

### Where To Start On The Terrain

Before we can just code the above algorithm, we must take a few things into consideration. First in order for this algorithm to even work correctly, we must start at the point on the terrain closest to the light source. But we have a directional light source, so which point is the closest? It is the point that has the least distance to travel before hitting the edge of the terrain.

So we see if the light direction is orientated as in the picture above to the left, then our starting point would be point P, as indicated in the picture on the right.

The second thing we must take into consideration is whether to travel across a row or down a column from each current point. In general this should not matter, but the code given below decides this based on the direction of the light source. Let's take the the rows of the terrain to be aligned along the X axis, and the columns to be aligned along the Z axis. Then if the light source's X coordinate is greater than its Z coordinate, we travel down an entire row first, then move to the next row and so forth. If the Z coordinate is greater than the X coordinate, we move down an entire column first, and so forth.

This makes perfect sense, since if the light source was pointing directly down the X axis, then as we move down an entire row, every point we get to we create a ray at that point. As we take a single step away from that point in the direction of the ray we collide with the point directly in front of us. Well the point directly in front of us has already been processed. Hence if the point in front of our current point's height + its H value is greater than out current point's height, we can stop. All it took was a single step!

### Use Of Interpolation

There is a minor problem with the above algorithm. As we move along the ray towards the light source, we do not end up exactly on any given point. Instead, we end up between points as shown below.

Hence it seems that as we travel along the ray, we never 'actually' intersect any of the points on the terrain until we reach point C. We could simply compensate for this and simply round off to the nearest integer and take that to be our current point above the terrain. But this leads to precision error and gives less than desireable results. A simple square shaped mountain on the terrain would not cast a square shadow on the ground, instead it would look more like a trapezoid.

So to alleviate this problem, as we move along the terrain, our ray position will be a floating point value. So whatever that value is we interpolate the heights of the surrounding 4 points around our current point. This will give us an exact height value of the terrain at our given ray position. In this tutorial we will be using bilinear interpolation. A tutorial for this can be found at gamedev.net.Bilinear Interpolation Of Texture Maps

## Source Code

So, let's declare our function header:

void IntersectMap(float *heightmap, unsigned char *lightmap, unsigned char shadowColor[3], int size, float lightDir[3])

The parameters are as follows:

**heightmap**- the actual heightmap for the terrain containing all height values**lightmap**- the lightmap image data we will be outputting data to**shadowColor**- the shadow color we wish to use, in the form of (Red,Green,Blue) where each component is between [0-255]**size**- the width or height of the terrain ( we're assuming the terrain is a regular square )**lightDir**- the direction of the light source (X, Y, Z)

Next we to create a 2-dimensional array storing data as to whether or not points on the terrain have already been checked, and if they have, store the **H** values discussed above. If a point has not been checked, the array will store a value of 0. If the point never intersects the terrain we will store a value of -1. If it does intersect the terrain, we store a value the **H** value.

// create flag buffer to indicate where we've been float *flagMap = new float[size*size]; for(int i = 0; i < size*size; i++) flagMap[i] = 0;

This next part may seem to be the most confusing part of all the code, but its quite simple to think about. We discussed above that we need to choose which position to start on the heightmap and which direction to travel. We would like our main loops for generating the lightmap to be as simple as possible and in the form:

for(int i = 0; i < size; i++) { for(int j = 0; j < size; j++) { // process shadows } }

But how do we determine if the outer loop is our Y position and the inner loop is our X position or vice versa? For instance, depending on the light direction, it may be more efficient to have the outer loop be our X position and the inner loop be our Y position. We also need to know, once we have our starting position, what direction we travel once we're done at that point. So we declare the following variables:

int *X, *Y; int iX, iY; int dirX, dirY;

The variables are defined as follows:

**X**- a pointer, holds the current X position as we travel along the heightmap**Y**- a pointer, holds the current Y position as we travel along the heightmap**iX**- represents the inner loop variable of our loops discussed above**iY**- represents the outer loop variable of our loops discussed above**dirX**- if less than 0 then we travel to the left after we process the current point, else travel right**dirY**- if less than 0 then we travel up after we process the current point, else travel down

Here is the code to determine how to set the above variables correctly:

// calculate absolute values for light direction float lightDirXMagnitude = lightDir[0]; float lightDirZMagnitude = lightDir[2]; if(lightDirXMagnitude < 0) lightDirXMagnitude *= -1; if(lightDirZMagnitude < 0) lightDirZMagnitude *= -1; // decide which loop will come first, the y loop or x loop // based on direction of light, makes calculations faster if(lightDirXMagnitude > lightDirZMagnitude) { Y = &iX; X = &iY; if(lightDir[0] < 0) { iY = size-1; dirY = -1; } else { iY = 0; dirY = 1; } if(lightDir[2] < 0) { iX = size-1; dirX = -1; } else { iX = 0; dirX = 1; } } else { Y = &iY; X = &iX; if(lightDir[0] < 0) { iX = size-1; dirX = -1; } else { iX = 0; dirX = 1; } if(lightDir[2] < 0) { iY = size-1; dirY = -1; } else { iY = 0; dirY = 1; } }

We must now declare some additional variables:

float px, py, height, distance; int index;

**px and py**- the position as we travel along the ray**height**- the height of the ray at the current point**index**- index into heightmap for our current point's height

Now for the rest of the function

// outer loop while(1) { // inner loop while(1) { // travel along the terrain until we: // (1) intersect another point // (2) find another point with previous collision data // (3) or reach the edge of the map px = *X; py = *Y; index = (*Y) * size + (*X); // travel along ray while(1) { px -= lightDir[0]; py -= lightDir[2]; // check if we've reached the boundary if(px < 0 || px >= size || py < 0 || py >= size) { flagMap[index] = -1; break; } // calculate interpolated values static int x0, x1, y0, y1; static float du, dv; static float interpolatedHeight, interpolatedFlagMap; static float heights[4]; static float pixels[4]; static float invdu, invdv; static float w0, w1, w2, w3; x0 = floor(px); x1 = ceil(px); y0 = floor(py); y1 = ceil(py); du = px - x0; dv = py - y0; invdu = 1.0 - du; invdv = 1.0 - dv; w0 = invdu * invdv; w1 = invdu * dv; w2 = du * invdv; w3 = du * dv; // compute interpolated height value from the heightmap direction below ray heights[0] = heightmap[y0*size+x0]; heights[1] = heightmap[y1*size+x0]; heights[2] = heightmap[y0*size+x1]; heights[3] = heightmap[y1*size+x1]; interpolatedHeight = w0*heights[0] + w1*heights[1] + w2*heights[2] + w3*heights[3]; // compute interpolated flagmap value from point directly below ray pixels[0] = flagMap[y0*size+x0]; pixels[1] = flagMap[y1*size+x0]; pixels[2] = flagMap[y0*size+x1]; pixels[3] = flagMap[y1*size+x1]; interpolatedFlagMap = w0*pixels[0] + w1*pixels[1] + w2*pixels[2] + w3*pixels[3]; // get distance from original point to current point distance = sqrt( (px-*X)*(px-*X) + (py-*Y)*(py-*Y) ); // get height at current point while traveling along light ray height = heightmap[index] + lightDir[1]*distance; // check intersection with either terrain or flagMap // if interpolatedHeight is less than interpolatedFlagMap that means we need to use the flagMap value instead // else use the height value static float val; val = interpolatedHeight; if(interpolatedHeight < interpolatedFlagMap) val = interpolatedFlagMap; if(height < val) { flagMap[index] = val - height; lightmap[index*3+0] = shadowColor[0]; lightmap[index*3+1] = shadowColor[1]; lightmap[index*3+2] = shadowColor[2]; break; } // check if pixel we've moved to is unshadowed // since the flagMap value we're using is interpolated, we will be in between shadowed and unshadowed areas // to compensate for this, simply define some epsilon value and use this as an offset from -1 to decide // if current point under the ray is unshadowed static float epsilon = 0.5f; if(interpolatedFlagMap < -1.0f+epsilon && interpolatedFlagMap > -1.0f-epsilon) { flagMap[index] = -1.0f; break; } } // update inner loop variable if(dirY < 0) { iY--; if(iY < 0) break; } else { iY++; if(iY >= size) break; } } // reset inner loop starting point if(dirY < 0) iY = size - 1; else size = 0; // update outer loop variable if(dirX < 0) { iX--; if(iX < 0) break; } else { iX++; if(iX >= size) break; } } delete [] flagMap;

## Complete Source Code

void IntersectMap(float *heightmap, unsigned char *lightmap, unsigned char shadowColor[3], int size, float lightDir[3]) { // create flag buffer to indicate where we've been float *flagMap = new float[size*size]; for(int i = 0; i < size*size; i++) flagMap[i] = 0; int *X, *Y; int iX, iY; int dirX, dirY; // calculate absolute values for light direction float lightDirXMagnitude = lightDir[0]; float lightDirZMagnitude = lightDir[2]; if(lightDirXMagnitude < 0) lightDirXMagnitude *= -1; if(lightDirZMagnitude < 0) lightDirZMagnitude *= -1; // decide which loop will come first, the y loop or x loop // based on direction of light, makes calculations faster if(lightDirXMagnitude > lightDirZMagnitude) { Y = &iX; X = &iY; if(lightDir[0] < 0) { iY = size-1; dirY = -1; } else { iY = 0; dirY = 1; } if(lightDir[2] < 0) { iX = size-1; dirX = -1; } else { iX = 0; dirX = 1; } } else { Y = &iY; X = &iX; if(lightDir[0] < 0) { iX = size-1; dirX = -1; } else { iX = 0; dirX = 1; } if(lightDir[2] < 0) { iY = size-1; dirY = -1; } else { iY = 0; dirY = 1; } } // outer loop while(1) { // inner loop while(1) { // travel along the terrain until we: // (1) intersect another point // (2) find another point with previous collision data // (3) or reach the edge of the map px = *X; py = *Y; index = (*Y) * size + (*X); // travel along ray while(1) { px -= lightDir[0]; py -= lightDir[2]; // check if we've reached the boundary if(px < 0 || px >= size || py < 0 || py >= size) { flagMap[index] = -1; break; } // calculate interpolated values static int x0, x1, y0, y1; static float du, dv; static float interpolatedHeight, interpolatedFlagMap; static float heights[4]; static float pixels[4]; static float invdu, invdv; static float w0, w1, w2, w3; x0 = floor(px); x1 = ceil(px); y0 = floor(py); y1 = ceil(py); du = px - x0; dv = py - y0; invdu = 1.0 - du; invdv = 1.0 - dv; w0 = invdu * invdv; w1 = invdu * dv; w2 = du * invdv; w3 = du * dv; // compute interpolated height value from the heightmap direction below ray heights[0] = heightmap[y0*size+x0]; heights[1] = heightmap[y1*size+x0]; heights[2] = heightmap[y0*size+x1]; heights[3] = heightmap[y1*size+x1]; interpolatedHeight = w0*heights[0] + w1*heights[1] + w2*heights[2] + w3*heights[3]; // compute interpolated flagmap value from point directly below ray pixels[0] = flagMap[y0*size+x0]; pixels[1] = flagMap[y1*size+x0]; pixels[2] = flagMap[y0*size+x1]; pixels[3] = flagMap[y1*size+x1]; interpolatedFlagMap = w0*pixels[0] + w1*pixels[1] + w2*pixels[2] + w3*pixels[3]; // get distance from original point to current point distance = sqrt( (px-*X)*(px-*X) + (py-*Y)*(py-*Y) ); // get height at current point while traveling along light ray height = heightmap[index] + lightDir[1]*distance; // check intersection with either terrain or flagMap // if interpolatedHeight is less than interpolatedFlagMap that means we need to use the flagMap value instead // else use the height value static float val; val = interpolatedHeight; if(interpolatedHeight < interpolatedFlagMap) val = interpolatedFlagMap; if(height < val) { flagMap[index] = val - height; lightmap[index*3+0] = shadowColor[0]; lightmap[index*3+1] = shadowColor[1]; lightmap[index*3+2] = shadowColor[2]; break; } // check if pixel we've moved to is unshadowed // since the flagMap value we're using is interpolated, we will be in between shadowed and unshadowed areas // to compensate for this, simply define some epsilon value and use this as an offset from -1 to decide // if current point under the ray is unshadowed static float epsilon = 0.5f; if(interpolatedFlagMap < -1.0f+epsilon && interpolatedFlagMap > -1.0f-epsilon) { flagMap[index] = -1.0f; break; } } // update inner loop variable if(dirY < 0) { iY--; if(iY < 0) break; } else { iY++; if(iY >= size) break; } } // reset inner loop starting point if(dirY < 0) iY = size - 1; else iY = 0; // update outer loop variable if(dirX < 0) { iX--; if(iX < 0) break; } else { iX++; if(iX >= size) break; } } delete [] flagMap; }

For a working example of the algorithm given above you can check out the free demo of the terrain editing software Freeworld3D. It uses the exact same code given above for generating lightmaps at extremely fast speeds.

## Optimized Source Code

This is optimized for more speed (about 100x faster in some cases, multithreaded, viable for real-time use)

// Modified by Seniltai // Went from 0.2-0.5 fps to 35-50 fps with the following algorithm // Make sure you enable OpenMP in gcc/g++ or visual studio project settings! #include <omp.h> void IntersectMap(float *heightmap, unsigned char *lightmap, int size, float lightDir[3]) { // create flag buffer to indicate where we've been float *flagMap = new float[size*size]; for(int i = 0; i < size*size; i++) flagMap[i] = 0; // calculate absolute values for light direction float lightDirXMagnitude = lightDir[0]; float lightDirZMagnitude = lightDir[2]; if(lightDirXMagnitude < 0) lightDirXMagnitude *= -1; if(lightDirZMagnitude < 0) lightDirZMagnitude *= -1; float distanceStep = sqrtf(lightDir[0] * lightDir[0] + lightDir[1] * lightDir[1]); // decide which loop will come first, the y loop or x loop // based on direction of light, makes calculations faster // outer loop // #pragma omp parallel for firstprivate(distanceStep, lightDirXMagnitude, \ lightDirZMagnitude, heightmap, lightmap, flagMap, size, lightDir) schedule(static,4) for(int y=0; y<size; y++) { int *X, *Y; int iX, iY; int dirX, dirY; // this might seem like a waste, why calculate it here? you can calculate it before... // that's because threading is really picky about sharing variables. the less you share, // the faster it goes. if(lightDirXMagnitude > lightDirZMagnitude) { Y = &iX; X = &iY; if(lightDir[0] < 0) dirY = -1; else dirY = 1; if(lightDir[2] < 0) dirX = -1; else dirX = 1; } else { Y = &iY; X = &iX; if(lightDir[0] < 0) dirX = -1; else dirX = 1; if(lightDir[2] < 0) dirY = -1; else dirY = 1; } // if you decide to just do it single-threaded, // just copy the previous block back just above the for loop if(dirY < 0) iY = size - y - 1; else iY = y; // inner loop for(int x=0; x<size; x++) { if(dirX < 0) iX = size - x - 1; else iX = x; float px, py, height, distance, origX, origY; int index; // travel along the terrain until we: // (1) intersect another point // (2) find another point with previous collision data // (3) or reach the edge of the map px = *X; py = *Y; origX = px; origY = py; index = (*Y) * size + (*X); distance = 0.0f; // travel along ray while(1) { px -= lightDir[0]; py -= lightDir[2]; // check if we've reached the boundary if(px < 0 || px >= size || py < 0 || py >= size) { flagMap[index] = -1; break; } // calculate interpolated values int x0, x1, y0, y1; float du, dv; float interpolatedHeight, interpolatedFlagMap; //float heights[4]; //float pixels[4]; float invdu, invdv; float w0, w1, w2, w3; x0 = int(px); //x1 = ceilf(px); y0 = int(py); //y1 = ceilf(py); x1 = x0 + 1; y1 = y0 + 1; du = px - x0; dv = py - y0; invdu = 1.0 - du; invdv = 1.0 - dv; w0 = invdu * invdv; w1 = invdu * dv; w2 = du * invdv; w3 = du * dv; // compute interpolated height value from the heightmap direction below ray interpolatedHeight = w0*heightmap[y0*size+x0] + w1*heightmap[y1*size+x0] + w2*heightmap[y0*size+x1] + w3*heightmap[y1*size+x1]; // compute interpolated flagmap value from point directly below ray interpolatedFlagMap = w0*flagMap[y0*size+x0] + w1*flagMap[y1*size+x0] + w2*flagMap[y0*size+x1] + w3*flagMap[y1*size+x1]; // get distance from original point to current point //distance = sqrtf( (px-origX)*(px-origX) + (py-origY)*(py-origY) ); distance += distanceStep; // get height at current point while traveling along light ray height = heightmap[index] + lightDir[1]*distance; // check intersection with either terrain or flagMap // if interpolatedHeight is less than interpolatedFlagMap that means // we need to use the flagMap value instead // else use the height value float val; if(interpolatedHeight < interpolatedFlagMap) val = interpolatedFlagMap; else val = interpolatedHeight; if(height < val) { flagMap[index] = val - height; lightmap[index] = 0; break; } // check if pixel we've moved to is unshadowed // since the flagMap value we're using is interpolated, we will be in // between shadowed and unshadowed areas // to compensate for this, simply define some epsilon value and use // this as an offset from -1 to decide // if current point under the ray is unshadowed static float epsilon = 0.5f; if(interpolatedFlagMap < -1.0f+epsilon && interpolatedFlagMap > -1.0f-epsilon) { flagMap[index] = -1.0f; break; } } } } delete [] flagMap; }

This version is a variant on the original source code. First, a very costly square root operation in the most inner loop has been removed and brought completely outside. float -> int casts been optimized, ceilf removed, made multi-threaded safe, and uses less variables.