
High-Dynamic Range Rendering

In the real world, the difference between the brightest point in a scene and the darkest point can be much higher than in a scene shown on a computer display or on a photo. The ratio between highest and lowest luminance is called dynamic range.

The range of light we experience in the real world is vast. It spans up to 10^12:1 for viewable light in everyday life, from starlit scenes to sun-lit snow, and up to 10^4:1 from shadows to highlights in a single scene. However, the range of light we can reproduce on print and screen display devices spans at best a dynamic range of about 300:1 [Seetzen]. The human visual system is capable of a dynamic range of around 10^3:1 at a particular exposure. In order to see light spread across a high dynamic range, the human visual system automatically adjusts its exposure to light, selecting a narrow range of luminances based on the intensity of the incoming light. This light adaptation process is delayed, as you will notice when entering a dark building on a sunny day or vice versa, and is known as exposure adjustment.

Another property of the human visual system becomes visible when extremely bright areas are viewed. The optic nerve can overload in such a situation and produce an effect that is called glare.

The human eye contains two main types of photoreceptors, rods and cones. As the luminance of the area being viewed decreases, the cones shut down and almost all vision is done through the rods (a small number of cones remains active even at very low luminance). When the rods become the dominant photoreceptors there is a slight shift of perceived colors towards a more bluish range. This is because there is only one type of rod, which is most sensitive to bluish light, while cones come in three types (red, green, blue). This shift is known as the blue shift. In addition, since fewer photons enter the eye, there is more noise and a general loss of detail.

Because of the restricted dynamic range of display systems, lighting calculations are clamped to the narrow range of the device, so that images look washed out, oversaturated, or much too dark on the monitor. Additionally, a display is not able to reproduce phenomena such as glare, retinal burn and the loss of perception associated with rapid changes in brightness across the temporal and spatial domains. Rendering systems that attempt to sidestep these deficiencies are called high-dynamic-range enabled systems or, for short, HDR systems.

An HDR rendering system keeps a high range of luminance values throughout the whole rendering pipeline and compresses this high range into the small range of luminance values that can be displayed, with a process called tone mapping. The tone mapping operator compresses HDR data into the low dynamic range the monitor is capable of displaying.

A common feature list of an HDR rendering system is:

  • Capability to handle data with a higher dynamic range than 0..1
  • A tone mapping operator to compress high-dynamic-range values to the low dynamic range the monitor expects
  • Light adaptation behavior similar to the human visual system
  • Glare under intense lighting to imitate very bright areas that overload the optic nerve of the eye
  • Blue shift and night view to account for the behaviour of the human visual system under low lighting conditions

The following text therefore discusses ways to store and keep data with a high dynamic range, a global tone mapping operator, temporal adaptation, glare generation and the blue shift.

High-Dynamic-Range Data

To keep a high range of values, the scene needs to be rendered with high-dynamic-range data. The post-processing pipeline then has to preserve the value range and precision of this data and compress it to the lower dynamic range the monitor is capable of displaying with the tone mapping operator.

Additionally, a renderer and a post-processing pipeline should run in gamma 1.0 to make the accumulation of several light sources, all filtering, alpha blending, lighting and shadowing operations linear with respect to brightness. With gamma 1.0 the color values require a higher range of values compared to gamma 2.2 compressed RGB values.

A high dynamic range of values is usually understood as a higher range of values than what the 8:8:8:8 render target/texture format offers with its 8-bit precision, or 256 steps, in each of its channels. In textures, values are often stored with even less precision than this due to memory constraints, while graphics hardware is capable of handling higher precision.

Therefore the challenge of preserving a high range of color and luminance values is twofold. We have to store HDR data in textures so that scene rendering happens with HDR texture data, and additionally keep this value range and precision while running the data through the renderer and the post-processing pipeline in render targets.

Storing High-Dynamic-Range Data in Textures

Recent video game consoles and DirectX 9 and DirectX 10 conformant graphics cards allow the usage of DXT compressed textures. These formats compress to 4-bit or 8-bit per pixel and offer a high compression ratio for some loss of quality. DXT5, for example, quantizes the color channels of each 4x4 block to a palette of four 16-bit (5:6:5) colors. In other words, the 16 points in 3D space representing the sixteen original colors in the block are replaced with four colors from the palette. Two of these colors are stored in the block and the other two are derived as a weighted sum of the former two.

This is the reason why DXT compressed normal maps look so bad - each 4x4 block is represented by only four different vectors.

The alpha channel of DXT5 is quantized to eight separate values for each 4x4 block. These eight alpha values are distributed uniformly over a sub-range of the 0..1 range. The precision of the alpha channel is therefore higher than that of the color channels.

One of the tricks of the trade for storing HDR data in textures utilizes the underlying idea of a texture format called RGBE (known as the Radiance RGBE format). It stores a common exponent for each pixel in the alpha channel of the texture. With this format it is possible to store a higher value range than 0..1 for the RGB channels by packing the values into a single 32-bit texture (D3DFMT_A8R8G8B8 / DXGI_FORMAT_R8G8B8A8_UINT), where the RGB channels keep the mantissas of the original floating-point values and the alpha channel keeps the common exponent. It was first described by Greg Ward [WardG] and was implemented in HLSL and applied to cube maps by Arkadiusz Waliszewski [Waliszewski]. This format encodes floating-point values to RGBE by dividing each channel by two raised to the exponent of the largest value and then multiplying the result, which lies in the value range [0.5..1.0] for the largest channel, by 255 to get it into a value range of [0..255]. The common exponent is then stored in the alpha channel (this is where the E in RGBE comes from).
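As a hedged illustration of this idea, the following HLSL sketch encodes a linear HDR color into RGBE and decodes it again when the texture is sampled. The exponent bias of 128, the small epsilon and the helper names are illustrative assumptions, not part of the original format description.

// Hedged sketch of RGBE encoding/decoding in HLSL; the exponent bias of 128
// and the function names are illustrative assumptions.
float4 EncodeRGBE(float3 hdrColor)
{
    // exponent of the largest channel, rounded up so its mantissa falls into [0.5, 1.0]
    float maxChannel = max(hdrColor.r, max(hdrColor.g, hdrColor.b));
    float exponent   = ceil(log2(max(maxChannel, 1e-6f)));

    float4 rgbe;
    rgbe.rgb = hdrColor / exp2(exponent);     // mantissas, scaled to 0..255 by the 8-bit target
    rgbe.a   = (exponent + 128.0f) / 255.0f;  // biased common exponent stored in alpha
    return rgbe;
}

float3 DecodeRGBE(float4 rgbe)
{
    float exponent = rgbe.a * 255.0f - 128.0f;
    return rgbe.rgb * exp2(exponent);
}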

If one channel holds values significantly smaller than the channel with the greatest exponent, that channel loses precision. Furthermore, an RGBE texture cannot be block-compressed, because compressing the exponent would lead to visible errors.

Following the RGBE texture format, DirectX 10 now offers a DXGI_FORMAT_R9G9B9E5_SHAREDEXP format that has a better resolution than RGBE as it is typically stored in 8:8:8:8 textures. Like RGBE, the new format cannot be block-compressed.

Compressing HDR Textures

Based on RGBE, most recent HDR texture formats keep the exponent value separate from the values in the RGB channels. This way the values in the RGB channels can be compressed, for example into a DXT1/DXGI_FORMAT_BC1_UNORM texture, and the exponent can be stored uncompressed in a quarter-size L16 texture. This also allows filtering the values in the compressed texture in hardware. The reduction in size of the exponent L16 texture can be justified by the fact that the DXT1/DXGI_FORMAT_BC1_UNORM compressed texture represents the original sixteen uncompressed color values by only four color vectors. Therefore storing exponents only for a quarter of the colors of the original uncompressed texture should not lead to a visible quality decrease. A DXT1/DXGI_FORMAT_BC1_UNORM + L16 HDR format ends up at about 8-bit per pixel instead of the 32-bit of the original RGBE texture, similar to a DXT5/DXGI_FORMAT_BC3_UNORM compressed texture.
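A hedged sketch of how such a two-texture format could be decoded in a pixel shader follows; the sampler names and the scale/bias constants used to map the stored L16 value back to an exponent are assumptions that would have to match the texture tool's encoding.

// Hedged sketch: decoding a DXT1/BC1 + L16 HDR texture pair.
sampler ColorSampler;     // DXT1/BC1 texture holding the RGB mantissas (filtered in hardware)
sampler ExponentSampler;  // quarter-size L16 texture holding the shared exponent
float   ExponentScale;    // assumed to come from the encoding tool
float   ExponentBias;

float3 SampleHDR(float2 texCoord)
{
    float3 mantissa = tex2D(ColorSampler, texCoord).rgb;
    float  exponent = tex2D(ExponentSampler, texCoord).r * ExponentScale + ExponentBias;
    return mantissa * exp2(exponent);
}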

Higher compression of HDR data can be achieved by storing a common scale value and a bias value for all the RGB channels of a texture, utilizing unused space in the DXT1/DXGI_FORMAT_BC1_UNORM header. This compresses the data to 4-bit per pixel (read more about HDR textures in [Persson]).

Gamma correcting HDR Textures

All modern hardware can remove the gamma correction (convert from gamma 2.2 to gamma 1.0) from textures while they are fetched. In Direct3D 10, sRGB encoding is handled by using one of the UNORM_SRGB formats for the shader resource or render target view. When such formats are used, conversions into linear space (on read) or from linear space (on write) are handled automatically. Unlike previous versions of Direct3D, Direct3D 10 requires that the hardware perform sRGB conversions in the correct order relative to filtering and alpha-blending operations.

In case an HDR texture format is used, the common scale and bias values stored in the header of the file, or the exponent values in a separate texture, need to be stored assuming a gamma 1.0 space. This means that a texture tool needs to convert the texture to gamma 1.0 first, then split up the values into a mantissa and an exponent, and then convert the mantissa part -that is, the RGB part- back to gamma 2.2. This way the exponent or the bias and scale values stay in gamma 1.0.
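From the shader's point of view, the decode order implied by this split might look like the following hedged sketch; on hardware with sRGB texture reads, the pow() would be done automatically on fetch. The constant names are assumptions.

// Hedged sketch: decoding a gamma-corrected HDR texture whose scale and bias
// are stored in linear (gamma 1.0) space.
sampler HDRColorSampler;  // RGB mantissa part, stored in gamma 2.2
float   TextureScale;     // assumed to come from the file header, in gamma 1.0
float   TextureBias;

float3 SampleGammaAwareHDR(float2 texCoord)
{
    float3 rgb = tex2D(HDRColorSampler, texCoord).rgb;

    // remove the gamma 2.2 encoding from the mantissa part
    rgb = pow(rgb, 2.2f);

    // scale and bias are applied in linear space
    return rgb * TextureScale + TextureBias;
}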

According to the roadmap for upcoming features in DirectX, Microsoft will provide an HDR texture format that takes 8-bit per pixel and offers gamma control.

Keeping High-Dynamic-Range Data in Render Targets

To preserve the precision of the data generated by the previous render operations, the post-processing pipeline needs to use render targets with enough precision. The most important render target here is the very first render target that is used to render the scene into. This render target is usually expected to support alpha blending, multi-sampling and filtering and it determines how much precision is available in the following stages.

On DirectX 9 compatible graphics cards, there is a 16-bit per channel floating-point render target format and a 10:10:10:2 render target format available.

Graphics cards that support the 10:10:10:2 render target format support alpha blending and multi-sampling of this format (ATI R5 series). Some DirectX 9 graphics cards that support the 16:16:16:16 format support alpha blending and filtering (NVIDIA G7x), others alpha blending and multi-sampling but not filtering (ATI R5 series).

Only the latest DirectX 10 compatible graphics cards (NVIDIA G8x) support alpha blending, filtering and multi-sampling on a 16:16:16:16 render target.

The new DXGI_FORMAT_R11G11B10_FLOAT render target format available on DirectX 10 compatible graphics cards offers the best of both DirectX 9 render target formats, because it offers the memory bandwidth of the 10:10:10:2 format with more precision plus source alpha blending, filtering and multi-sampling support. If more alpha blending functionality is required, a 16-bit per channel primary render target would be necessary.

The Xbox 360 supports a render target format called 7e3, which is a 10:10:10:2 floating-point EDRAM render target format (the corresponding resolve render target on this platform is a 16:16:16:16 floating-point format). This render target format is re-configurable, so that the user can select the supported value range and with it the precision [Tchou]. A common value range here is 0..4.

The PS3 supports a 16:16:16:16 floating-point render target format and an 8:8:8:8 render target format. In case the bandwidth consumption of a 16:16:16:16 floating-point render target is too high, an 8:8:8:8 render target format needs to be utilized to store HDR data.

8:8:8:8 Render Targets

If the target hardware platform only supports 8-bit per channel render targets with multi-sampling, filtering and alpha blending or a specific situation requires this render target format, using an alternative color space is a good option to preserve precision.

A requirement list for an alternative color space is as follows:

  1. Needs to encode and decode fast enough
  2. Needs to be bilinear filterable without encoding / decoding
  3. Needs to work with a blur filter without encoding / decoding

If requirements 2 and 3 are met, encoding / decoding only needs to be done once for the whole post-processing pipeline: the conversion from RGB in gamma 1.0 to the alternative color space is done once while rendering into the main render targets, and the conversion back to RGB happens once at the end of the post-processing pipeline.


Color Space                          | # of cycles to encode | Bilinear filtering | Blur filter | Alpha blending
RGB                                  | -                     | Yes                | Yes         | Yes
HSV                                  | 34                    | No                 | No          | No
CIE Yxy                              | 11                    | Yes                | Yes         | No
L16uv (based on Greg Ward's LogLuv)  | 19                    | Yes                | Yes         | No
RGBE                                 | 13                    | No                 | No          | No

Table 1 - Overview of alternative color spaces

The number of cycles was measured in ATI's RenderMonkey and can only provide a rough idea of how fast the encoding algorithm works. The blur filter was a simple four-tap filter with extended offsets.

HSV would be a good choice in a game engine that is built around it. The performance of encoding into HSV might be improved by using a look-up table. This color space does not support bilinear filtering or a blur filter, although bilinear filtering might still look good enough.

CIE Yxy is quite often used in a post-processing pipeline to decode/encode luminance for the tone mapping operators. It encodes reasonably fast. CIE Yxy supports bilinear filtering and it also supports applying a blur filter.

The same is true for L16uv. This color space is based on Greg Ward's LogLuv color space, which is used to compress HDR textures. Instead of storing a logarithmic luminance value, the luminance is distributed in the L16uv color space over two 8-bit channels of an 8:8:8:8 render target.

The last color space is RGBE, which is often used as an intermediate format for HDR textures. It does not work well with bilinear filtering or blurring and would therefore require a decoding and encoding effort every time it is blurred.

Based on this comparison, the CIE Yxy or the L16uv color space would be the most efficient way to encode and decode HDR data in all four channels of an 8:8:8:8 render target. Both are very similar in performance.

None of the alternative color spaces above supports alpha blending. Therefore all blending operations still have to happen in RGB space.

Implementation

An implementation of a high-dynamic-range renderer that renders into 8:8:8:8 render targets might distinguish between opaque and transparent objects. The opaque objects are stored in a buffer that uses the CIE Yxy color model or the L16uv color model to distribute precision over all four channels of the render target. Transparent objects, which require alpha blending, are stored in another 8:8:8:8 render target in RGB space. Therefore only opaque objects receive the better color precision.

To give transparent and opaque objects the same color precision, a Multiple-Render-Target setup consisting of two 8:8:8:8 render targets might be used. For each color channel, bits 1 - 8 would be stored in the first render target and bits 5 - 12 would be stored in the second render target (RGB12AA render target format). This way there is a 4-bit overlap that should be good enough for alpha blending.

Tone Mapping Operator

Most post-processing pipelines use a tone mapping operator that was developed by Erik Reinhard [Reinhard]. The underlying idea of the tone mapping operator is based on Ansel Adams' zone system, which he used for his black and white photographs. Adams defined 11 zones of brightness / luminance and printed them on a card for reference when he was out in the wilderness taking photographs. While shooting a photo, he made notes on the picture he was about to take: he scribbled the picture into his notebook with roman numerals for the zones he could see in certain areas.

Back in his darkroom, he then tried to re-create the luminance values he had seen by using his card with the zones and the sketch from his notebook.

In the spirit of Ansel Adams' approach, we create a luminance copy of the picture and use this copy to measure the average luminance of the current frame. We do not use a card with 11 different luminance zones, but we do use an average luminance value that maps by default to zone 5 of the zone system. This is called the middle grey value and it is the subjective middle brightness value that corresponds to ~18% reflectance on a print. Then we adjust the luminance of the resulting color image with the average luminance value we measured and with the middle grey value.

While Ansel Adams tried to make up for the difference between the range of possible brightness values in the real world and what could be stored on the photo paper of his time, we try to make up for the difference between the range of brightness and color values generated by the renderer and what can be displayed on an average monitor.

This process is called tone mapping. The tone mapping operator compresses data with a high dynamic range to low-dynamic-range data. Tone mapping is one of the last stages in a post-processing pipeline, to ensure that all preceding operations can be done on the full range of values.

Luminance Transform

Background

To measure luminance, two steps are involved. First the luminance of the whole scene is calculated for each pixel of the scene, and then the luminance values of all pixels are averaged and the result is stored in a 1x1 texture over several passes. How a luminance value is retrieved from a color value is covered below.

Erik Reinhard et al. [Reinhard] developed a way to average scene luminance. They use the antilogarithm of the average of the log values of all sampled pixels.

$\bar{L}_w = \exp\left(\frac{1}{N}\sum_{x,y}\log\left(\delta + Lum(x,y)\right)\right)$
Equation 6

where Lum(x, y) represents the luminance of the pixel at location (x, y), N is the total number of pixels in the image and δ is a small value to avoid the singularity that occurs if black pixels are present in the image. The logarithmic average is used instead of the arithmetic average to account for the eye's nonlinear response to a linear increase in luminance.

The resulting one-pixel luminance value that represents the average luminance of the whole screen is later compared to each pixel's luminance. To be able to convert the color value of a pixel to luminance and back to color, each color value is first converted into CIE XYZ and then into CIE Yxy, where the Y component holds the luminance, as shown in equation 7.

$\begin{pmatrix}X\\ Y\\ Z\end{pmatrix} = M_{RGB \rightarrow XYZ}\begin{pmatrix}R\\ G\\ B\end{pmatrix}, \qquad x = \frac{X}{X+Y+Z}, \quad y = \frac{Y}{X+Y+Z}$
Equation 7

This luminance value can then be adjusted, and the adjusted values can be converted from CIE Yxy back to RGB following equation 8.

$X = \frac{Y\,x}{y}, \qquad Z = \frac{Y\,(1-x-y)}{y}, \qquad \begin{pmatrix}R\\ G\\ B\end{pmatrix} = M_{XYZ \rightarrow RGB}\begin{pmatrix}X\\ Y\\ Z\end{pmatrix}$
Equation 8

Implementation

To transform the color image into one luminance value that represents the luminance of the whole screen, the color values are first converted using the part of equation 6 inside the brackets. To convert the color values to luminance values, the same weights are used as for the color saturation effect:

const float3 LUMINANCE = float3(0.2125f, 0.7154f, 0.0721f);
 
const float DELTA = 0.0001f;
 
float fLogLumSum = 0.0f;
 
// sum up the log luminance of a 3x3 block of texels
fLogLumSum += log(dot(tex2D(ToneMapSampler, TexCoord.xy + float2(-1.0f * TexelSize, -1.0f * TexelSize)).rgb, LUMINANCE) + DELTA);
fLogLumSum += log(dot(tex2D(ToneMapSampler, TexCoord.xy + float2(-1.0f * TexelSize, 0.0f)).rgb, LUMINANCE) + DELTA);
fLogLumSum += log(dot(tex2D(ToneMapSampler, TexCoord.xy + float2(-1.0f * TexelSize, 1.0f * TexelSize)).rgb, LUMINANCE) + DELTA);
// ... six more fetches cover the remaining offsets of the 3x3 pattern:
// (0,-1), (0,0), (0,1), (1,-1), (1,0), (1,1)
 
// write out the average of the nine log values
return fLogLumSum / 9.0f;

This code fetches the color render target nine times. Depending on the difference in size between the render target this code reads from and the render target it writes into, more or fewer texels have to be fetched. The task of the following stages is to downsample the result of this pass by halving the render target sizes. If this pass renders into a 128x128 render target, the following stages will downsample it to 64x64, 16x16, 4x4 and 1x1, always fetching 16 texels from the render target they read from. The very last pass then applies the exponential function of equation 6 like this:

float fSum = 0.0f;
 
fSum += tex2D(ToneMapSampler, TexCoord.xy + float2(1.5f * TexelSize, -1.5f * TexelSize)).x;
fSum += tex2D(ToneMapSampler, TexCoord.xy + float2(1.5f * TexelSize, -0.5f * TexelSize)).x;
fSum += tex2D(ToneMapSampler, TexCoord.xy + float2(1.5f * TexelSize, 0.5f * TexelSize)).x;
...
return exp(fSum / 16.0f);
...

With a 16-tap shader, each pixel in the target render target represents 16 pixels in the source render target.

The drawback of converting a color value into a luminance value by taking the dot product with the LUMINANCE constant described above is that it is not possible to convert it back to color afterwards. To be able to compare the average luminance value gained through equation 6 with the luminance of each pixel, and to adjust the luminance of each pixel with this value, we need to convert each pixel to luminance as shown in equation 7. Then we are able to adjust its luminance value and later convert it back to color following equation 8. Converting a pixel color into luminance can be done with the following source code:

// RGB -> XYZ conversion 
// http://www.w3.org/Graphics/Color/sRGB 
// The official sRGB to XYZ conversion matrix is (following ITU-R BT.709)
// 0.4125 0.3576 0.1805
// 0.2126 0.7152 0.0722 
// 0.0193 0.1192 0.9505 
 
const float3x3 RGB2XYZ = {0.5141364, 0.3238786, 0.16036376, 0.265068, 0.67023428, 0.06409157, 0.0241188, 0.1228178, 0.84442666};
 
float3 XYZ = mul(RGB2XYZ, FullScreenImage.rgb); 
 
// XYZ -> Yxy conversion 
 
float3 Yxy; 
 
Yxy.r = XYZ.g; 
 
// x = X / (X + Y + Z) 
// y = Y / (X + Y + Z) 
 
float temp = dot(float3(1.0f, 1.0f, 1.0f), XYZ.rgb); 
 
Yxy.gb = XYZ.rg / temp;

To convert the resulting luminance value back to color, the following code can be used:

// Yxy -> XYZ conversion 
// X = Y * x / y 
XYZ.r = Yxy.r * Yxy.g / Yxy.b; 
 
// copy luminance Y 
XYZ.g = Yxy.r;
 
// Z = Y * (1 - x - y) / y 
XYZ.b = Yxy.r * (1 - Yxy.g - Yxy.b) / Yxy.b;
 
// XYZ -> RGB conversion 
// The official XYZ to sRGB conversion matrix is (following ITU-R BT.709) 
// 3.2410 -1.5374 -0.4986
// -0.9692 1.8760 0.0416 
// 0.0556 -0.2040 1.0570 
 
const float3x3 XYZ2RGB = { 2.5651, -1.1665, -0.3986, -1.0217, 1.9777, 0.0439, 0.0753, -0.2543, 1.1892};
 
float3 RGB = mul(XYZ2RGB, XYZ);

The RGB->Yxy and the Yxy->RGB conversions are used in the bright-pass filter and the tone-mapping operator. The following section describes how a tone mapper adjusts the luminance values of single pixels with the help of the average luminance value. The luminance of an individual pixel, converted with the code above, is stored in the variable Yxy.r, and the average luminance, computed with the chain of downsampling render targets and the constant LUMINANCE vector, is stored in the variable AdaptedLum.

Range Mapping

Background

With the average luminance in hand, the luminance of the current pixel can be scaled with the middle grey value like this:

$L_{scaled}(x,y) = \frac{MiddleGrey}{\bar{L}_w}\,Lum(x,y)$
Equation 9

Applying this equation to all pixels results in an image of relative intensity. Assuming the average luminance of a scene is 0.36 and the middle grey value follows Ansel Adams' subjective middle brightness value of 0.18, the luminance value range of the final image would be half the value range of the original image.

Depending on the time of day and the weather conditions, the middle grey value needs to be chosen differently. This might happen via keyframed values that depend on time of day and weather, or it might happen dynamically. [Krawczyk] uses the following equation to generate a middle grey value dynamically for videos.

$MiddleGrey = 1.03 - \frac{2}{2 + \log_{10}(\bar{L}_w + 1)}$
Equation 10
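As a hedged sketch, equation 10 translates directly into a few lines of HLSL; the function name is illustrative and AverageLum is assumed to hold the measured average scene luminance of the current frame:

// Hedged sketch of equation 10: derive the middle grey value from the
// measured average luminance of the current frame.
float ComputeMiddleGray(float AverageLum)
{
    return 1.03f - 2.0f / (2.0f + log10(AverageLum + 1.0f));
}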

This function simply blends between a set of key values that have been empirically associated with different luminance levels. The following figure 5 visualizes equation 9 with an average luminance value range of 0..1 and a middle grey value of 0.18.

ScaledLuminance.jpg
Figure 5 - scaled luminance range

To output a pixel with a high luminance value, the average luminance value needs to be very small; otherwise the values can easily exceed the 0..1 value range a monitor can display. If, for example, the average luminance value is 0.1, per-pixel luminance values above roughly 0.55 already leave the display range of 0..1.

To map/compress relative luminance values to the zero-to-one range of the display device, the scaled luminance value is divided by one plus itself.

$L_d(x,y) = \frac{L_{scaled}(x,y)}{1 + L_{scaled}(x,y)}$
Equation 11

This function scales small values almost linearly, whereas higher luminances are compressed by larger amounts. Even with a high average luminance value in a scene, this function still outputs a reasonable value, so the resulting luminance varies smoothly. The function has an asymptote at 1, which means that all positive values will be mapped to a display range between 0 and 1, as shown in figure 6.

SimpleToneMappingOperator.jpg
Figure 6 - Reinhard's simple tone mapping operator

However, in practice the input image does not contain infinitely large luminance values, and therefore the largest display luminances do not always reach 1. In addition, it might be artistically desirable to let bright areas burn out in a controlled fashion. Reinhard achieves this effect by blending the previous transfer function with a linear mapping, yielding the following tone mapping operator:

$L_d(x,y) = \frac{L_{scaled}(x,y)\left(1 + \frac{L_{scaled}(x,y)}{White^2}\right)}{1 + L_{scaled}(x,y)}$
Equation 12

Equation 12 introduces a new parameter, White, which denotes the smallest luminance value that will be mapped to white. By default, this parameter is set to the maximum world luminance. For low-dynamic-range images, setting White to smaller values yields a subtle contrast enhancement.

In other words, White minimizes the loss of contrast when tone mapping a low-dynamic-range image. This is also called whitepoint clamping. Figures 7 and 8 show equation 12 with a middle grey value of 0.18 and a value range of 0..4 for the luminance of individual pixels in the scene.

AdvancedToneMappingOperatorWhite4.jpg
Figure 7 - Advanced Tone Mapping operator with a White value of 2
AdvancedToneMappingOperator.jpg
Figure 8 - Advanced Tone Mapping Operator with a White value of 6

Comparing figure 8 with figure 7 shows that the equation becomes more similar to the simple tone mapping operator of equation 11 as the White value is increased, while decreasing White moves it closer to the simple scaling function of equation 9.

Implementation

The source code for the advanced tone mapping operator described in equation 12 and visualized in figures 7 and 8 is very straightforward:

// Map average luminance to the middlegrey zone by scaling pixel luminance 
// raise the value range a bit ... 
float LumScaled = Yxy.r * MiddleGray / (AdaptedLum.x + 0.001f); 
 
// Scale all luminance within a displayable range of 0 to 1 
Yxy.r = (LumScaled * (1.0f + LumScaled / White))/(1.0f + LumScaled);

The first line follows equation 9, expecting the RGB-to-Yxy converted luminance value in the variable Yxy.r, as shown in the paragraph named "Luminance Transform" above. The small delta value prevents a division by zero. In the second line, the value in White has already been multiplied with itself before it is sent down to the pixel shader.
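Putting the pieces together, a complete tone mapping pixel shader might look like the following hedged sketch; the sampler and constant names are assumptions, and White is assumed to be passed in already squared, as noted above.

// Hedged sketch of a complete tone mapping pass combining the RGB->Yxy
// conversion, equations 9 and 12, and the Yxy->RGB conversion.
sampler SceneSampler;       // HDR scene color
sampler AdaptedLumSampler;  // 1x1 texture holding the adapted average luminance
float MiddleGray;
float White;                // already squared on the CPU

float4 PS_ToneMap(float2 TexCoord : TEXCOORD0) : COLOR
{
    float3 FullScreenImage = tex2D(SceneSampler, TexCoord).rgb;
    float  AdaptedLum      = tex2D(AdaptedLumSampler, float2(0.5f, 0.5f)).x;

    // RGB -> Yxy (see the conversion code above)
    const float3x3 RGB2XYZ = {0.5141364, 0.3238786, 0.16036376,
                              0.265068,  0.67023428, 0.06409157,
                              0.0241188, 0.1228178,  0.84442666};
    float3 XYZ = mul(RGB2XYZ, FullScreenImage);
    float3 Yxy;
    Yxy.r  = XYZ.g;
    Yxy.gb = XYZ.rg / dot(float3(1.0f, 1.0f, 1.0f), XYZ);

    // scale and compress the luminance (equations 9 and 12)
    float LumScaled = Yxy.r * MiddleGray / (AdaptedLum + 0.001f);
    Yxy.r = (LumScaled * (1.0f + LumScaled / White)) / (1.0f + LumScaled);

    // Yxy -> RGB (see the conversion code above)
    XYZ.r = Yxy.r * Yxy.g / Yxy.b;
    XYZ.g = Yxy.r;
    XYZ.b = Yxy.r * (1.0f - Yxy.g - Yxy.b) / Yxy.b;
    const float3x3 XYZ2RGB = { 2.5651, -1.1665, -0.3986,
                              -1.0217,  1.9777,  0.0439,
                               0.0753, -0.2543,  1.1892};
    return float4(mul(XYZ2RGB, XYZ), 1.0f);
}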

Light Adaptation

Because we measure the luminance every frame, we can re-use the data to mimic the light adaptation process of the eye. Light adaptation happens whenever a bright-to-dark or dark-to-bright change occurs, for example when we leave a dark room and enter a bright outdoor area or vice versa.

The retina contains two types of photoreceptors: about 6 to 7 million cones and some 120 million rods. The rods are more numerous and more sensitive to light than the cones, but they are not sensitive to color.

Daylight vision (cone vision) adapts much more rapidly to changing light levels, adjusting to a change such as coming indoors out of sunlight within a few seconds. Like all neurons, the cones fire to produce an electrical impulse on the nerve fiber and then must reset to fire again. Light adaptation is thought to occur by adjusting this reset time.

The rods are responsible for our dark-adapted, or scotopic, vision. They are more than one thousand times as sensitive as the cones and can be triggered by individual photons under optimal conditions. Optimal dark-adapted vision is obtained only after a considerable period of darkness, say 30 minutes or longer, because the rod adaptation process is much slower than that of the cones.

In a game scenario it would not make sense to add a 30-minute light adaptation sequence when the main character walks from a very bright outdoor environment into a very dark indoor environment, or a 3-second adaptation time to a racing game (in real life the driver would, for example, try to avoid being blinded by the sun in the first place). Therefore any approximation needs to be very rough for gameplay reasons, and most games will end up with a subtle light adaptation.

To simulate the eye's reaction to temporal changes in lighting conditions, an adapted luminance value replaces the average luminance value in the previous equations.

As the average luminance changes, the adapted luminance should drift in pursuit. Under stable lighting conditions the adapted and average luminance should become equal, representing fully adjusted eyes. This behavior can be achieved with an exponential decay function [Pattanaik]:

$L_{adapted}^{new} = L_{adapted} + \left(\bar{L}_w - L_{adapted}\right)\left(1 - e^{-\,dt/\tau}\right)$
Equation 13

Where dt is the time step between frames and τ is the adaptation rate. Human eyes adjust to light or dark conditions at different rates. When moving from a dark building into a sunny courtyard, your eyes adapt at a noticeably rapid rate. On the other hand, when moving from a sunny courtyard into a dark building it can take between ten and thirty minutes to fully adapt. This lack of symmetry is due to the rods (responsible for scotopic or dark-adapted vision) or cones of the retina dominating perception in dark or light conditions respectively. With this in mind, τ can be found by interpolating between the adaptation rates of the rods and cones.

$\tau = \sigma\,\tau_{rod} + (1 - \sigma)\,\tau_{cone}$

Equation 14

Assigning a value of 0.2 to tau for the rods and 0.4 to tau for the cones gives pleasing results in a typical scenario.
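A hedged HLSL sketch of this adaptation pass follows. The sampler and constant names are assumptions, and the interpolation weight between rods and cones is an illustrative choice that simply lets the rods dominate at low average luminance; it is not prescribed by the text above.

// Hedged sketch of the temporal adaptation pass from equations 13 and 14.
sampler AdaptedLumSampler;  // 1x1 texture: adapted luminance of the previous frame
sampler AverageLumSampler;  // 1x1 texture: measured average luminance of this frame
float ElapsedTime;          // frame time in seconds

float4 PS_CalculateAdaptedLum(float2 TexCoord : TEXCOORD0) : COLOR
{
    float AdaptedLum = tex2D(AdaptedLumSampler, float2(0.5f, 0.5f)).x;
    float AverageLum = tex2D(AverageLumSampler, float2(0.5f, 0.5f)).x;

    // interpolation weight between rod and cone adaptation rates (equation 14);
    // this particular weighting is an assumption for illustration
    float Sigma = saturate(0.04f / (0.04f + AverageLum));
    float Tau   = Sigma * 0.2f + (1.0f - Sigma) * 0.4f;

    // exponential decay towards the measured average luminance (equation 13)
    float NewAdaptation = AdaptedLum + (AverageLum - AdaptedLum) * (1.0f - exp(-ElapsedTime / Tau));

    return float4(NewAdaptation, NewAdaptation, NewAdaptation, 1.0f);
}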

Luminance History Function

Background

To prevent very short luminance changes from influencing the light adaptation, the average luminance of the scene can be tracked over, for example, the last 16 frames, and light adaptation is only allowed if all average luminance values increased or decreased over those 16 frames. Equation 15 compares the average luminance values of the last 16 frames with the current adapted luminance. If all values of the last 16 frames are greater than or equal to the current adapted luminance value, light adaptation will happen. If all those values are smaller, light adaptation will also happen. If the 16 values point in different directions, no light adaptation will happen (see also [Tchou]).

$L_{adapted} = \begin{cases}L_{adapted}^{new} & \text{if } L_i \ge L_{adapted}^{prev} \text{ for all } i = 1..16 \text{, or } L_i \le L_{adapted}^{prev} \text{ for all } i = 1..16\\ L_{adapted}^{prev} & \text{otherwise}\end{cases}$

where $L_1 .. L_{16}$ are the average luminance values of the last 16 frames.
Equation 15

Equation 15 looks quite complex at first sight, but an implementation only needs to run once per frame.

Implementation

The luminance history of the last 16 frames can be stored in a 4x4 render target. This render target is then read with a shader that implements equation 15. Depending on the history of the last 16 frames, either the adapted luminance value of the current frame or the adapted luminance of the previous frame is returned.

float4 LumVector; // holds four luminance values 
float4 zGreater; // holds the result of one comparison
float4 CompResult; // holds the result of all four comparisons
 
LumVector.x = tex2D(ToneMapSampler, TexCoord.xy + float2(1.5f  * TexelSize, -1.5f * TexelSize)).x;
LumVector.y = tex2D(ToneMapSampler, TexCoord.xy + float2(1.5f  * TexelSize, -0.5f * TexelSize)).x;
LumVector.z = tex2D(ToneMapSampler, TexCoord.xy + float2(1.5f  * TexelSize,  0.5f * TexelSize)).x;
LumVector.w = tex2D(ToneMapSampler, TexCoord.xy + float2(1.5f  * TexelSize, 1.5f * TexelSize)).x;
 
zGreater = (AdaptedLum <= LumVector);
 
CompResult.x = dot(zGreater, 1.0f);
 
...
// this is repeated four times for all 16 pixels
// if the result here is 16, all the values in the 4x4 render target are bigger than the current average luminance
// if the result here is 0, all the values are smaller than the current average luminance
 
CompResult.x = dot(CompResult, 1.0f);
 
if(CompResult.x >= 15.9f || CompResult.x <= 0.1f)
{
    // run light adaptation here
    ...
}
else
{
    // use luminance value from previous frame
    ...
}

Glare

Glare generation is accomplished by a bright pass filter followed by an area convolution to mimic the overloading of the optic nerve in the eye under very bright lighting conditions.

The bright pass filter compresses the dark parts of a scene so that the bright areas are emphasized. The remaining bright areas are then blurred with a Gaussian convolution filter. By blending the result of these two stages with the already tone-mapped final image, an effect is achieved similar to the dodging and burning used by Ansel Adams to brighten or darken parts of black and white photographs; just with color images.

Bright pass filter

To be consistent with the rest of the tone mapping stage, the bright pass filter uses a tone mapping operator to compress darker areas, so that the remaining image consists of the relatively bright areas. The basis for this operator is again Reinhard's tone mapping operator from equation 12, extended with a threshold and an offset value as shown in equation 16.

$L_{bright} = \frac{L_{scaled}\left(1 + \frac{L_{scaled}}{White^2}\right) - Threshold}{Offset + L_{scaled}}$
Equation 16

Figure 9 shows the resulting curve of the bright pass filter.

AdvancedBrightPassFilter.jpg
Figure 9 - Bright pass filter with a threshold of 0.5 and an offset value of 1.0.

The parameter T -Threshold- in the bright pass tone mapping operator moves the whole curve in the -y direction, while the parameter O -Offset- changes the slope of the curve. Increasing the value of O makes the curve steeper, which means it is more sensitive to light changes; decreasing it makes the curve shallower, so that it is less sensitive to light changes. The bright pass filter has its own white value, which is set to 4 in figure 9.
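A hedged sketch of such a bright pass in HLSL, following the form of equation 16 as reconstructed above, might look like this; the constant names are assumptions, and the max() clamp is an addition to keep the compressed dark areas at zero.

// Hedged sketch of the bright pass operator applied to the scaled luminance.
float Threshold;        // T in equation 16
float Offset;           // O in equation 16
float BrightPassWhite;  // the bright pass' own white value, already squared

float BrightPassLuminance(float LumScaled)
{
    float Lum = LumScaled * (1.0f + LumScaled / BrightPassWhite) - Threshold;
    Lum = max(Lum, 0.0f);              // clamp the compressed dark areas to zero
    return Lum / (Offset + LumScaled);
}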

Blur

A blur filter based on the Gaussian function is used to blur the result of the bright pass, as shown in figure 10:

GaussFilter.jpg
Figure 10 - Gaussian blur filter

This Gauss filter blurs a 2D image by sampling a circular neighborhood of pixels from the input image and computing their weighted average.

$G(x,y) = \frac{1}{2\pi\sigma^2}\,e^{-\frac{x^2 + y^2}{2\sigma^2}}$
Equation 17

This filter can be rearranged and made separable as shown in equation 18:

$G(x,y) = \left(\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{x^2}{2\sigma^2}}\right)\left(\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{y^2}{2\sigma^2}}\right)$
Equation 18

Where σ is the standard deviation of the Gauss filter and x and y are the coordinates of the image samples relative to the center of the filter. Factorizing the equation this way lets us compute the Gauss filter as a series of 1D filtering operations. Combining, for example, a weighted sum of a column of 13 pixels centered around the current pixel with a weighted sum of a row of 13 pixels centered around the same pixel mimics a full 13x13 filter kernel with only 25 texture fetches.
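Assuming the 1D Gaussian weights and texel offsets are precomputed on the CPU, normalized and uploaded as shader constants, one horizontal pass of the separable filter might look like the following hedged sketch; a second pass with vertical offsets completes the 2D blur.

// Hedged sketch of one 1D pass of the separable Gauss filter from equation 18.
sampler BrightPassSampler;
static const int NUM_TAPS = 13;
float2 TapOffsets[NUM_TAPS];  // horizontal offsets in texture space (assumed precomputed)
float  TapWeights[NUM_TAPS];  // normalized 1D Gaussian weights (assumed precomputed)

float4 PS_GaussBlurH(float2 TexCoord : TEXCOORD0) : COLOR
{
    float4 color = float4(0.0f, 0.0f, 0.0f, 0.0f);
    for (int i = 0; i < NUM_TAPS; i++)
    {
        color += TapWeights[i] * tex2D(BrightPassSampler, TexCoord + TapOffsets[i]);
    }
    return color;
}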

The bright-pass filter and the Gauss filter can be applied to very small render targets to save performance. Typically quarter-size or sixteenth-size render targets are used here. These filters do not necessarily have to be applied in the order presented here. Blurring a small image with a Gauss filter first and applying the bright-pass filter later is also possible and sometimes favourable, if the blurred image can be re-used for other effects like depth of field.

Night Tonemapping / Blue Shift

The tone mapping operators discussed so far all assume that the image represents a scene under photopic viewing conditions, i.e. as seen at normal light levels. For scotopic scenes, i.e. very dark scenes, the human visual system exhibits distinctly different behavior. In particular, perceived contrast is lower, visual acuity is lower, and everything has a slightly blue appearance.

A night tone mapping operator reduces brightness and contrast, desaturates the image, adds a blue shift and reduces visual acuity (read more in [Shirley]).

A typical approach starts by converting the image from RGB to XYZ. Then, scotopic luminance V may be computed for each pixel:

$V = Y\left[1.33\left(1 + \frac{Y+Z}{X}\right) - 1.68\right]$
Equation 19

To tint the scotopic luminance V with a blue shift, the single channel image is multiplied with the blue shift color.

$RGB_{night} = V \cdot RGB_{blueshift}$
Equation 20

The night view can then be achieved by a simple linear blend between the actual scene and this night color.
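A hedged HLSL sketch of this night view follows. The blue shift color used here and the NightBlend factor are illustrative assumptions; the scotopic luminance itself follows equation 19.

// Hedged sketch of the night view from equations 19 and 20.
float NightBlend;  // 0 = normal (photopic) view, 1 = full night view

float3 ApplyBlueShift(float3 toneMappedColor, float3 XYZ)
{
    // scotopic luminance V for this pixel (equation 19); guard against X == 0
    float V = XYZ.y * (1.33f * (1.0f + (XYZ.y + XYZ.z) / max(XYZ.x, 0.0001f)) - 1.68f);

    // tint the single-channel scotopic image with a bluish color (equation 20);
    // the constant is an assumed, commonly used blue shift color
    float3 NightColor = V * float3(1.05f, 0.97f, 1.27f);

    // linear blend between the day view and the night view
    return lerp(toneMappedColor, NightColor, NightBlend);
}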