High Dynamic Range Environment Mapping On Mainstream Graphics Hardware


IV. Implementation

4.1 Demo

Included with this article is an SSE2-optimized implementation of file loading and image key calculation, built on the HDRFormats demo from the Microsoft DirectX SDK [MSSDK04]. Our implementation measured a 21.6% speedup over a C implementation when reading a 1024x768 HDR image. We also gained about 31% on image key calculations for 640x480 images compared to a C-based implementation. The demo allows interactive adjustment of the midzone_luminance (referred to as MIDDLE_GREY in the demo) from Equation 5, so the reader can see how adjusting the midzone_luminance affects the resulting image. Additionally, we noticed that a pure implementation of the tone mapping mathematics for each image could produce images whose tone changed too dramatically from frame to frame. We therefore limit how much the image_key can vary from frame to frame, which lets the image 'settle' to the correct value after a few iterations. The result is much more aesthetically pleasing and is a more accurate depiction of what happens when the light varies dramatically.
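
The frame-to-frame limit on the image_key can be sketched in C as a simple clamp on the per-frame change. This is only an illustration of the idea, not the demo's code: the function name limit_image_key and the max_step parameter are hypothetical, and a suitable step size would have to be tuned by eye.

```c
#include <assert.h>

/* Hypothetical helper: move the image key used for tone mapping toward the
   newly measured key, but by no more than max_step in a single frame. */
float limit_image_key(float previous_key, float measured_key, float max_step)
{
    float delta = measured_key - previous_key;

    assert(max_step >= 0.0f);

    if (delta > max_step) {
        delta = max_step;      /* scene got brighter: adapt gradually */
    } else if (delta < -max_step) {
        delta = -max_step;     /* scene got darker: adapt gradually */
    }
    return previous_key + delta;
}
```

Calling this once per frame with the previous frame's result lets the key converge on the measured value over several frames instead of jumping to it.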

4.2 RGBE format

The RGBE format is suitable for storing high dynamic range imagery for real-time graphics and was used for our implementation. RGBE was originally created by Greg Ward for his Radiance software package [Radiance04]. The format consists of an 8-bit mantissa for each of the red, green, and blue channels plus an 8-bit exponent, for 32 bits per pixel, as seen in Figure 8. Because the three channels share one exponent, the storage required is significantly smaller than for a 32-bit-per-channel floating point format (32 bits per float * 3 = 96 bits per pixel vs 32 bits per pixel). The downside is a loss of precision in the darker channels, since all of the color channels share the exponent of the brightest one.

Figure 8. The 8 bits per channel for red, green, and blue, plus the shared exponent, that are used to represent the HDR images in our examples. The shared exponent is stored in the channel typically reserved for alpha.

Encoding and decoding using the RGBE format is easy. To encode a pixel using the RGBE format the following HLSL pixel shader 2.0 code can be used:

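float4 EncodeRGBE8( in float3 rgb )
{
  float4 vEncoded;
  // The brightest channel determines the shared exponent.
  float maxComponent = max( max(rgb.r, rgb.g), rgb.b );
  // Clamp away from zero so log2() stays finite for black pixels.
  float fExp = ceil( log2( max(maxComponent, 1e-6) ) );
  // Scale the color into [0, 1] and store the biased exponent in alpha.
  vEncoded.rgb = rgb / exp2(fExp);
  vEncoded.a = (fExp + 128) / 255;
  return vEncoded;
}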

To decode a pixel using the RGBE format the following HLSL pixel shader 2.0 code can be used:

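float3 DecodeRGBE8( in float4 rgbe )
{
  float3 vDecoded;
  // Recover the unbiased exponent from the alpha channel.
  float fExp = rgbe.a * 255 - 128;
  // Scale the stored color back up to its HDR value.
  vDecoded = rgbe.rgb * exp2(fExp);
  return vDecoded;
}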
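
On the CPU side, for example when loading or writing image files, the same conversion can be sketched in C. This is a minimal sketch in the style of the commonly used Radiance conversion, not the demo's reader. It scales the mantissa bytes by 1/256 rather than the 1/255 implied by the shader functions above, so the two conventions differ slightly. The function names are hypothetical, and inputs are assumed to be finite and non-negative.

```c
#include <assert.h>
#include <math.h>

/* Pack three float channels into four bytes with a shared exponent. */
void float_to_rgbe(float r, float g, float b, unsigned char rgbe[4])
{
    float max_component = r;

    assert(r >= 0.0f && g >= 0.0f && b >= 0.0f);
    if (g > max_component) max_component = g;
    if (b > max_component) max_component = b;

    if (max_component < 1e-32f) {
        /* Too dark to represent: store an all-zero pixel. */
        rgbe[0] = rgbe[1] = rgbe[2] = rgbe[3] = 0;
    } else {
        int exponent;
        /* frexpf() yields a mantissa in [0.5, 1) and the matching exponent,
           so scale equals 256 / 2^exponent. */
        float scale = frexpf(max_component, &exponent) * 256.0f / max_component;
        rgbe[0] = (unsigned char)(r * scale);
        rgbe[1] = (unsigned char)(g * scale);
        rgbe[2] = (unsigned char)(b * scale);
        rgbe[3] = (unsigned char)(exponent + 128);
    }
}

/* Unpack four RGBE bytes back into three float channels. */
void rgbe_to_float(const unsigned char rgbe[4], float *r, float *g, float *b)
{
    if (rgbe[3] != 0) {
        /* The extra 8 divides the 8-bit mantissas by 256. */
        float scale = ldexpf(1.0f, (int)rgbe[3] - (128 + 8));
        *r = rgbe[0] * scale;
        *g = rgbe[1] * scale;
        *b = rgbe[2] * scale;
    } else {
        *r = *g = *b = 0.0f;
    }
}
```

A pixel of (1.0, 0.5, 0.25) encodes to the bytes (128, 64, 32, 129) and decodes back to the same values, because each channel is an exact multiple of the shared scale.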

4.3 SSE2 Optimized High Dynamic Range Image Reading

For our implementation we created an SSE2-optimized HDR reader for RGBE images that shows a speedup of 21.6% when reading 1024x768 images. The C version is based on Greg Ward's implementation, originally written and posted by Bruce Walter at http://www.graphics.cornell.edu/~bjw/rgbe.html. It was altered to read HDRShop headers by Alex at www.FusionIndustries.com.

Using this SSE2 routine can speed up load times for the images used as high dynamic range environment maps in your engine. If your game has only one high dynamic range image, the difference may be negligible, but with 5, 10, or 20 images per level the benefit adds up quickly. The demo includes this code.
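
To give a feel for what such a routine might look like, here is a minimal sketch of converting RGBE bytes to floats with SSE2 intrinsics. It is not the demo's reader. The function names and the four-floats-per-pixel output layout are assumptions made for illustration. It assumes the Radiance-style convention where a byte b with exponent byte e decodes to b * 2^(e - 136), and it falls back to scalar code when SSE2 is not available.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

/* Build the float 2^(e - 136) directly from its IEEE-754 bit pattern.
   Exponent bytes below 10 (a zero pixel, or a value too small for a
   normalized float) are treated as zero. */
float rgbe_scale(unsigned char e)
{
    unsigned int bits;
    float scale;

    if (e < 10) {
        return 0.0f;
    }
    bits = (unsigned int)(e - 9) << 23;  /* biased exponent: (e - 136) + 127 */
    memcpy(&scale, &bits, sizeof scale);
    return scale;
}

/* Decode count RGBE pixels (4 bytes each) into 4 floats per pixel:
   red, green, blue, and a zero padding lane. */
void decode_rgbe_row(const unsigned char *src, float *dst, size_t count)
{
    size_t i;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < count; ++i) {
        const unsigned char *p = src + 4 * i;
        float scale = rgbe_scale(p[3]);
#if HAVE_SSE2
        int packed;
        __m128i zero = _mm_setzero_si128();
        __m128i lanes;
        memcpy(&packed, p, 4);
        /* Widen the four bytes to four 32-bit integers. */
        lanes = _mm_unpacklo_epi8(_mm_cvtsi32_si128(packed), zero);
        lanes = _mm_unpacklo_epi16(lanes, zero);
        /* Clear the exponent lane so the padding float becomes zero. */
        lanes = _mm_and_si128(lanes, _mm_set_epi32(0, -1, -1, -1));
        /* Convert to float and apply the shared scale to all lanes at once. */
        _mm_storeu_ps(dst + 4 * i,
                      _mm_mul_ps(_mm_cvtepi32_ps(lanes), _mm_set1_ps(scale)));
#else
        dst[4 * i + 0] = p[0] * scale;
        dst[4 * i + 1] = p[1] * scale;
        dst[4 * i + 2] = p[2] * scale;
        dst[4 * i + 3] = 0.0f;
#endif
    }
}
```

Building the power of two from its bit pattern avoids a call to ldexp() per pixel, and the three color channels are converted and scaled with one multiply instead of three.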

4.4 SSE2 Optimized High Dynamic Range Image Key

To avoid having to transfer the image over the bus to compute an image key, we calculate it on the CPU using the SSE2-optimized image_key computation included in the example. Whether to calculate the image key on the CPU or the GPU depends on the application, graphics card, and graphics bus. Experiment with both approaches to see which is best for your application. Our SSE2-optimized HDR image key computation was shown to be 33% faster than the equivalent C code on a 640x480 render target.

4.5 Pixel Shader for Integrated Graphics

We have also written a pixel shader in HLSL that supports using HDR images for environment mapping on Intel 915G Graphics. The 915G is optimized for DirectX 9 support and uses DirectX's Intel-architecture-optimized PSGP (Platform Specific Graphics Processing) Vertex Shader 3.0 and Pixel Shader 2.0. Since there is no support for floating point textures in this hardware, we perform the tone mapping described earlier in the pixel shader using RGBE images. The complete source code for our pixel shaders is given in the effects file in the demo.
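
The per-pixel work can be illustrated with a C reference version. This is a sketch under stated assumptions and not the shader from the effects file: it assumes the tone mapping scales the decoded color by midzone_luminance / image_key and then compresses it with x / (1 + x), applied per channel for simplicity. The function names are hypothetical. The decode step follows the same convention as DecodeRGBE8 above.

```c
#include <assert.h>
#include <math.h>

/* Decode one RGBE8 texel the way DecodeRGBE8 does: color bytes are
   normalized by 255 and the exponent is the alpha byte minus 128. */
void decode_rgbe8(const unsigned char texel[4], float rgb[3])
{
    float scale = ldexpf(1.0f, (int)texel[3] - 128);
    rgb[0] = (texel[0] / 255.0f) * scale;
    rgb[1] = (texel[1] / 255.0f) * scale;
    rgb[2] = (texel[2] / 255.0f) * scale;
}

/* Assumed tone mapping: expose by midzone_luminance / image_key, then
   compress each channel into [0, 1) with x / (1 + x). */
void tone_map_texel(const unsigned char texel[4], float midzone_luminance,
                    float image_key, float out_rgb[3])
{
    float hdr[3];
    float exposure;
    int c;

    assert(image_key > 0.0f);
    decode_rgbe8(texel, hdr);
    exposure = midzone_luminance / image_key;
    for (c = 0; c < 3; ++c) {
        float scaled = hdr[c] * exposure;
        out_rgb[c] = scaled / (1.0f + scaled);
    }
}
```

Because the decode and the compression both happen per pixel, the result fits in an ordinary 8-bit render target and no floating point texture is needed at any stage.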

4.6 HDR samples in the Microsoft DirectX 9.0 SDK

Microsoft provides samples in the DirectX SDK that demonstrate the above techniques without the optimizations made in this article [MSSDK04]. The samples show HDR in several different scenarios. HDRCubemap demonstrates cubic environment mapping and HDR lighting, using floating-point cube textures to store values where the total amount of light illuminating a surface is greater than 1.0. HDRFormats shows a technique much like the one used in this article for displaying HDR images on hardware that cannot use floating point textures, and it was the original inspiration for our work. The most notable differences are that our demo is not tied to the DDS file format, so any HDR image encoded by HDRShop can be used, and that we use a SIMD-accelerated routine to determine powers of two for the shared exponent. HDRLighting demonstrates blue shift under low light and bloom under intense lighting conditions, as well as under- and overexposing the camera.




