# FPS meter, Moving buffers to the GPU, and Using the stencil part of the depth-stencil

Update: source now available at https://github.com/lyost/d3d12_framework

While trying to build a couch and dealing with a broken pipe below the concrete floor of the basement, I've also continued playing with Direct3D 12.  Since the last blog entry, I have implemented an FPS meter that uses a basic texture atlas for its display, added classes that let vertex and index buffers reside in GPU memory without direct CPU access, and added a depth-fail shadow volume test case that brings the stencil part of the depth-stencil into the framework.

# FPS Meter

So far in the framework, the Game base class passed the value of the fixed timestep to the update and draw functions as the elapsed time.  Computing the actual number of frames per second requires the actual elapsed time between frames instead, so both values are now provided as arguments to the update and draw functions.  Each game can then choose which value to use, or use both.  This of course required a minor update to all the existing test programs to add the additional argument, even though they still use the fixed timestep value.
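As a rough illustration of handing both values to a game, here is a minimal sketch.  FrameClock and FrameTimes are hypothetical names made up for this example and are not the framework's Game class; the sketch only assumes that the actual elapsed time is measured with a monotonic clock.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper, not part of the framework: reports both the fixed
// timestep and the measured time since the previous frame.
struct FrameTimes
{
    double fixed_step_ms;     // configured timestep, the same every frame
    double actual_elapsed_ms; // measured time since the previous tick
};

class FrameClock
{
public:
    using Clock = std::chrono::steady_clock;

    FrameClock(double fixed_step_ms, Clock::time_point start)
        : m_fixed_step_ms(fixed_step_ms), m_last(start)
    {
        assert(fixed_step_ms > 0.0);
    }

    // Call once per frame.  'now' is passed in so the caller decides where
    // the timestamp comes from (normally Clock::now()).
    FrameTimes Tick(Clock::time_point now)
    {
        const std::chrono::duration<double, std::milli> elapsed = now - m_last;
        m_last = now;
        return { m_fixed_step_ms, elapsed.count() };
    }

private:
    double            m_fixed_step_ms;
    Clock::time_point m_last;
};
```

A loop would call Tick(FrameClock::Clock::now()) once per frame and forward both members to its update and draw calls; an FPS meter would read actual_elapsed_ms while the existing test programs keep using fixed_step_ms.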

The FPS meter itself is a library in the project named "fps_monitor" so it can be easily re-used for projects as needed.  The library consists of the FPSMonitor class and the shaders needed to render it.  The FPSMonitor calculates and displays the minimum, maximum, and average FPS over a configurable number of frames.  It has its own graphics pipeline for rendering.  So that it doesn't get bloated with code for loading different image formats or texture atlas data formats, the already loaded data is taken as arguments to the constructor.
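To make the min/max/average bookkeeping concrete, here is a minimal sketch of one way to track those values over a fixed number of samples.  FpsStats is a hypothetical stand-in written for this example; the real FPSMonitor may store and combine its samples differently (for instance, it might average frame times rather than per-frame FPS values).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the statistics side of an FPS meter.
class FpsStats
{
public:
    explicit FpsStats(std::size_t max_samples)
        : m_samples(max_samples, 0.0f)
    {
        assert(max_samples > 0);
    }

    // Record one frame from its actual elapsed time in milliseconds.
    // Once every slot is filled, the oldest sample is overwritten.
    void AddFrame(float elapsed_ms)
    {
        assert(elapsed_ms > 0.0f);
        m_samples[m_next] = 1000.0f / elapsed_ms;
        m_next  = (m_next + 1) % m_samples.size();
        m_count = std::min(m_count + 1, m_samples.size());
    }

    float Min() const { return *std::min_element(Begin(), End()); }
    float Max() const { return *std::max_element(Begin(), End()); }
    float Average() const
    {
        return std::accumulate(Begin(), End(), 0.0f) / static_cast<float>(m_count);
    }

private:
    using Iter = std::vector<float>::const_iterator;

    // Slots fill from index 0, so the valid samples are always the first m_count.
    Iter Begin() const { assert(m_count > 0); return m_samples.begin(); }
    Iter End() const { return m_samples.begin() + static_cast<std::ptrdiff_t>(m_count); }

    std::vector<float> m_samples;
    std::size_t        m_next  = 0;
    std::size_t        m_count = 0;
};
```

With this layout a single slow frame changes the minimum immediately, while the average only settles once enough new samples have replaced the old ones.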

The vertices sent to the vertex shader use projection space x and y coordinates that maintain the width and height of the character as provided to the FPSMonitor constructor (which means this works best with monospaced fonts), uv coordinates for the texture going from 0-1 in both dimensions, and the key into the texture atlas lookup table (initialized to 0, but the Update function fills in the desired value for that frame).

m_vertex_buffer_data[i * VERTS_PER_CHAR    ] = { XMFLOAT2(-1 + x,                y),                 XMFLOAT2(0.0f, 0.0f), 0 };
m_vertex_buffer_data[i * VERTS_PER_CHAR + 1] = { XMFLOAT2(-1 + x,                y - m_char_height), XMFLOAT2(0.0f, 1.0f), 0 };
m_vertex_buffer_data[i * VERTS_PER_CHAR + 2] = { XMFLOAT2(-1 + x + m_char_width, y - m_char_height), XMFLOAT2(1.0f, 1.0f), 0 };
m_vertex_buffer_data[i * VERTS_PER_CHAR + 3] = { XMFLOAT2(-1 + x + m_char_width, y),                 XMFLOAT2(1.0f, 0.0f), 0 };

The texture atlas lookup table is provided to the vertex shader through a constant buffer holding an array in which each entry gives the uv coordinates of the rectangle covering one character in the atlas.

struct LookupTableEntry
{
    float left;
    float right;
    float top;
    float bottom;
};

cbuffer LOOKUP_TABLE : register(b0)
{
    LookupTableEntry lookup_table[24];
}

The combination of the 0-1 uv coordinates on each vertex and the lookup table index allows the vertex shader to easily compute the uv coordinates for the particular character in the texture atlas.

output.uv.x = (1 - input.uv.x) * lookup_table[input.lookup_index].left + input.uv.x * lookup_table[input.lookup_index].right;
output.uv.y = (1 - input.uv.y) * lookup_table[input.lookup_index].top  + input.uv.y * lookup_table[input.lookup_index].bottom;

An alternative approach would be to skip the index field in the vertex data and update the uv coordinates on the host so that the vertex shader becomes more of a pass through.
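A minimal sketch of that host-side alternative, assuming the same interpolation as the shader lines above.  Float2 is a stand-in for XMFLOAT2 so the example compiles without DirectXMath, and Lerp and FillQuadUVs are hypothetical helper names, not functions from the framework.

```cpp
#include <cassert>
#include <cstddef>

// Same layout as the shader's lookup table entry.
struct LookupTableEntry
{
    float left;
    float right;
    float top;
    float bottom;
};

// Stand-in for XMFLOAT2.
struct Float2
{
    float x;
    float y;
};

// Same interpolation the vertex shader performs.
float Lerp(float a, float b, float t)
{
    return (1.0f - t) * a + t * b;
}

// Computes the final atlas uvs for the four vertices of one character quad,
// in the same corner order as the vertex setup: top-left, bottom-left,
// bottom-right, top-right.
void FillQuadUVs(const LookupTableEntry* table, std::size_t table_size,
                 std::size_t lookup_index, Float2 (&out)[4])
{
    assert(lookup_index < table_size);
    const LookupTableEntry& e = table[lookup_index];
    const Float2 corners[4] = { {0.0f, 0.0f}, {0.0f, 1.0f}, {1.0f, 1.0f}, {1.0f, 0.0f} };
    for (std::size_t i = 0; i < 4; ++i)
    {
        out[i].x = Lerp(e.left, e.right, corners[i].x);
        out[i].y = Lerp(e.top,  e.bottom, corners[i].y);
    }
}
```

The trade-off is that the uv values in the vertex buffer have to be rewritten whenever a displayed character changes, instead of updating a single index per vertex.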

In order to test that the FPS values are being computed correctly, the test program needs the frame rate to vary.  Conceptually there are 2 ways to accomplish this within a program.  One is to switch between two sets of content: one that doesn't stress the system's rendering capabilities and one that does.  Another way, and the way taken in the test program, is to change the fixed timestep duration.  By pressing and releasing numpad 1, 2, or 3 the test program will switch to 60, 30, or 24 FPS respectively.  While changing the frame rate up or down instantly changes the min or max FPS, the average FPS takes a little while, depending on the number of samples, to reach a steady value.  Assuming a system can handle the requested frame rate, once enough samples at the new frame rate have occurred to fill all of the sample slots in the FPSMonitor class, all 3 values should be the same.

# Stencil Part of the Depth-Stencil Buffer

Up until now, the depth-stencil buffer has been used for just depth data.  Exercising the stencil portion of this buffer required three things: framework updates to create a depth-stencil with an appropriate format (previously the depth-stencils were all DXGI_FORMAT_D32_FLOAT), the ability to configure the stencil when creating a pipeline, and an algorithm to use for a test case.

For the format, the DepthStencil class has an optional argument of "bool with_stencil" that if true will create the depth stencil with a format of DXGI_FORMAT_D32_FLOAT_S8X24_UINT.  If it is false (the default), the format will be DXGI_FORMAT_D32_FLOAT.

For configuring the stencil, the static CreateD3D12 functions in the Pipeline class had their "DepthFuncs depth_func" argument changed to "const DepthStencilConfig* depth_stencil_config".  If that argument is NULL, both the depth and stencil tests are disabled.  If it points to an instance of the DepthStencilConfig struct, then the depth and stencil tests can be enabled or disabled individually, along with specifying the other configuration data.

/// <summary>
/// Enum of the various stencil operations
/// </summary>
/// <remarks>
/// Values must match D3D12_STENCIL_OP
/// </remarks>
enum StencilOp
{
    SOP_KEEP = 1,
    SOP_ZERO,
    SOP_REPLACE,
    SOP_INCREMENT_CLAMP,
    SOP_DECREMENT_CLAMP,
    SOP_INVERT,
    SOP_INCREMENT_ROLLOVER,
    SOP_DECREMENT_ROLLOVER
};

/// <summary>
/// Configuration for processing pixels
/// </summary>
struct StencilOpConfig
{
    /// <summary>
    /// Stencil operation to perform when stencil testing fails
    /// </summary>
    StencilOp stencil_fail;

    /// <summary>
    /// Stencil operation to perform when stencil testing passes, but depth testing fails
    /// </summary>
    StencilOp depth_fail;

    /// <summary>
    /// Stencil operation to perform when both stencil and depth testing pass
    /// </summary>
    StencilOp pass;

    /// <summary>
    /// Comparison function to use to compare stencil data against existing stencil data
    /// </summary>
    CompareFuncs comparison;
};

/// <summary>
/// Configuration for the depth stencil
/// </summary>
struct DepthStencilConfig
{
    /// <summary>
    /// true if depth testing is enabled.  false otherwise
    /// </summary>
    bool depth_enable;

    /// <summary>
    /// true if stencil testing is enabled.  false otherwise
    /// </summary>
    bool stencil_enable;

    /// <summary>
    /// Format of the depth stencil view.  Must be correctly set if either depth_enable or stencil_enable is set to true.
    /// </summary>
    GraphicsDataFormat dsv_format;

    /// <summary>
    /// true if writing to the depth portion of the depth stencil is allowed.  false otherwise.
    /// </summary>
    bool depth_write_enabled;

    /// <summary>
    /// Comparison function to use to compare depth data against existing depth data
    /// </summary>
    CompareFuncs depth_comparison;

    /// <summary>
    /// Bitmask for identifying which portion of the depth stencil should be used for reading stencil data
    /// </summary>

    /// <summary>
    /// Bitmask for identifying which portion of the depth stencil should be used for writing stencil data
    /// </summary>

    /// <summary>
    /// Configuration for processing pixels with a surface normal towards the camera
    /// </summary>
    StencilOpConfig stencil_front_face;

    /// <summary>
    /// Configuration for processing pixels with a surface normal away from the camera
    /// </summary>
    StencilOpConfig stencil_back_face;
};
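For readers less familiar with the stencil operations, the following is a small software model of what each enum value does to an 8-bit stencil value, and of how the three StencilOpConfig fields are selected from the test results.  It is an illustrative sketch only: ApplyStencilOp and SelectOp are hypothetical helpers, the comparison member is left out because CompareFuncs is not shown in this entry, and the stencil write mask is ignored.

```cpp
#include <cassert>
#include <cstdint>

// Same values as the enum above (which mirrors D3D12_STENCIL_OP).
enum StencilOp
{
    SOP_KEEP = 1,
    SOP_ZERO,
    SOP_REPLACE,
    SOP_INCREMENT_CLAMP,
    SOP_DECREMENT_CLAMP,
    SOP_INVERT,
    SOP_INCREMENT_ROLLOVER,
    SOP_DECREMENT_ROLLOVER
};

// Trimmed copy of StencilOpConfig for this example (comparison omitted).
struct StencilOpConfig
{
    StencilOp stencil_fail;
    StencilOp depth_fail;
    StencilOp pass;
};

// Picks which operation applies to a pixel from the two test results.
StencilOp SelectOp(const StencilOpConfig& cfg, bool stencil_passed, bool depth_passed)
{
    if (!stencil_passed) return cfg.stencil_fail;
    if (!depth_passed)   return cfg.depth_fail;
    return cfg.pass;
}

// Applies one operation to an 8-bit stencil value.  'ref' is the stencil
// reference value, which only SOP_REPLACE uses.
std::uint8_t ApplyStencilOp(StencilOp op, std::uint8_t current, std::uint8_t ref)
{
    switch (op)
    {
    case SOP_KEEP:               return current;
    case SOP_ZERO:               return 0;
    case SOP_REPLACE:            return ref;
    case SOP_INCREMENT_CLAMP:    return current == 255 ? current : static_cast<std::uint8_t>(current + 1);
    case SOP_DECREMENT_CLAMP:    return current == 0 ? current : static_cast<std::uint8_t>(current - 1);
    case SOP_INVERT:             return static_cast<std::uint8_t>(~current);
    case SOP_INCREMENT_ROLLOVER: return static_cast<std::uint8_t>(current + 1); // 255 wraps to 0
    case SOP_DECREMENT_ROLLOVER: return static_cast<std::uint8_t>(current - 1); // 0 wraps to 255
    }
    assert(false && "unknown stencil op");
    return current;
}
```

The difference between the clamp and rollover variants matters for counting algorithms, since a rollover decrement from 0 can later be undone by a matching increment while a clamped one cannot.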

After those changes it was on to an algorithm to use as a test case.  While over the years I've read up on different algorithms that use the stencil, I hadn't implemented one before.  I ended up picking depth-fail shadow volumes, using both the Wikipedia article and http://joshbeam.com/articles/stenciled_shadow_volumes_in_opengl/ for reference (I don't plan for this entry to be a tutorial on depth-fail, so I'd recommend those links if you want to read up on the algorithm).  The scene is a simple one consisting of an omnidirectional light source at (8, 0, 0), an occluder at (1, 0, 0), and a textured cube, initially positioned at (-7, 0, 0), that can be moved in y and z with the arrow keys.  The textured cube starts in shadow, so the up, down, and left arrows allow it to be moved so that it is partially or completely out of shadow, or back into shadow.  For the right arrow key, there was an issue where the framework was always assuming D3D12_CULL_MODE_BACK, which prevented the stencil buffer from being correct.  Since the stencil configuration in D3D12 allows different stencil operations for front faces and back faces, only 1 pass is needed for setting the stencil buffer when the cull mode is set to none.  With that change, the model was correctly lit when moving out of the shadow volume with the right arrow key as well.
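To show why one pass with culling off is enough, here is a CPU-side sketch of depth-fail counting for a single pixel.  It assumes the common convention (increment with rollover when a back face fails the depth test, decrement with rollover when a front face fails, keep otherwise) and a "less than" depth comparison; the entry does not list the exact operations the test program uses, so treat those choices, and the names VolumeFragment, DepthFailStencil, and InShadow, as assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One shadow volume fragment that covers the pixel being examined.
struct VolumeFragment
{
    float depth;        // 0 = near plane, 1 = far plane
    bool  front_facing; // true if the face's normal points toward the camera
};

// Runs every shadow volume fragment for one pixel through the assumed
// depth-fail stencil rules and returns the resulting stencil value.
// Because front and back faces get different operations, both kinds of
// fragment can be handled in the same pass.
std::uint8_t DepthFailStencil(float scene_depth, const std::vector<VolumeFragment>& fragments)
{
    assert(scene_depth >= 0.0f && scene_depth <= 1.0f);
    std::uint8_t stencil = 0;
    for (const VolumeFragment& f : fragments)
    {
        const bool depth_passed = f.depth < scene_depth;
        if (depth_passed)
        {
            continue; // pass operation: keep the stencil value
        }
        if (f.front_facing)
        {
            stencil = static_cast<std::uint8_t>(stencil - 1); // decrement, rollover
        }
        else
        {
            stencil = static_cast<std::uint8_t>(stencil + 1); // increment, rollover
        }
    }
    return stencil;
}

// A non-zero count means the visible surface at this pixel is inside a shadow volume.
bool InShadow(float scene_depth, const std::vector<VolumeFragment>& fragments)
{
    return DepthFailStencil(scene_depth, fragments) != 0;
}
```

With a volume whose front face is at depth 0.3 and back face at 0.7, a surface at depth 0.5 ends with a count of 1 (in shadow), while surfaces at 0.2 or 0.9 end at 0 (lit).  If the front faces are culled or the back faces are culled, one of the two counts is lost, which is consistent with the incorrect stencil described above.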
