Now that the floodgates have been opened, I can answer.
The source code provided by the post above pretty much covers it. A couple of notes:
- You still need to fill in every field of the D3D12_GRAPHICS_PIPELINE_STATE_DESC, not just the blob pointer. The D3D12 runtime most likely cannot interpret the blob (only the driver can), so it still needs the full description from you.
- The cached blob is specific to a driver version and GPU model.
- Because of the above, you have to check whether creating the PSO from the cached blob succeeded. It can fail if the user has updated their drivers or changed their GPU. You can see that the source code provided by MS does exactly this.
- Most of the time is spent compiling HLSL shaders to D3D bytecode, which is an intermediate representation. You should cache that bytecode (it is GPU/driver agnostic), which has been possible since DX11. PSO caching goes one step further by letting you cache the D3D bytecode -> ISA translation as well, which is usually fast, but still a nice bonus.
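
To make the "check whether it succeeded" point concrete, here is a minimal, portable sketch of the fallback logic. It deliberately does not include the D3D12 headers: `PipelineDesc`, `CachedBlob`, `CreateResult`, `Outcome` and `createPipelineWithCache` are hypothetical stand-ins I made up for illustration. In real code the callback would wrap `ID3D12Device::CreateGraphicsPipelineState`, the `cached` field corresponds to the `CachedPSO` member of the desc, and the failure values correspond to the `D3D12_ERROR_DRIVER_VERSION_MISMATCH`, `D3D12_ERROR_ADAPTER_NOT_FOUND` and `E_INVALIDARG` return codes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for the CachedPSO member of the real desc.
struct CachedBlob {
    const void* data;
    std::size_t size;
};

// Hypothetical stand-in for D3D12_GRAPHICS_PIPELINE_STATE_DESC.
// Shaders, blend state, rasterizer state, etc. would live here and must
// always be filled in, whether or not a cached blob is supplied.
struct PipelineDesc {
    CachedBlob cached;
};

// Hypothetical result codes mirroring the HRESULTs mentioned above.
enum class CreateResult { Ok, DriverVersionMismatch, AdapterNotFound, InvalidArgument };

// The callback stands in for the actual device call.
using CreateFn = std::function<CreateResult(const PipelineDesc&)>;

struct Outcome {
    CreateResult result;  // result of the last creation attempt
    bool usedCache;       // true if the cached blob was accepted
};

// Try the cached blob first; on any failure, retry without it so the driver
// compiles from scratch. The caller should then save a fresh blob
// (GetCachedBlob on the new PSO in real code) when usedCache is false.
Outcome createPipelineWithCache(PipelineDesc desc,
                                const std::vector<std::uint8_t>& blob,
                                const CreateFn& create) {
    assert(create && "a creation callback is required");
    if (!blob.empty()) {
        desc.cached = CachedBlob{blob.data(), blob.size()};
        if (create(desc) == CreateResult::Ok) {
            return Outcome{CreateResult::Ok, true};
        }
    }
    desc.cached = CachedBlob{nullptr, 0};  // drop the stale or missing blob
    return Outcome{create(desc), false};
}
```

The sketch retries on any failure rather than only on the two mismatch codes. That is a simplification on my part: if the second attempt also fails, the problem is in the desc itself and not in the cache.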