[D3D12] PSO Libraries

9 comments, last by galop1n 7 years, 6 months ago

This video mentions a new D3D12 feature called PSO libraries. Has this feature been released yet? Is there documentation for it?

-potential energy is easily made kinetic-


ID3D12Device1::CreatePipelineLibrary
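In case it helps, here is a minimal sketch of creating one, assuming the cached blob and the helper name are placeholders (error handling trimmed):

```cpp
#include <d3d12.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

// Create a pipeline library, optionally seeded from a blob cached on disk.
// Passing nullptr/0 creates an empty library. 'device1' is an ID3D12Device1
// obtained by QueryInterface from the D3D12 device.
ComPtr<ID3D12PipelineLibrary> CreateLibrary(ID3D12Device1* device1,
                                            const void* cachedBlob,
                                            SIZE_T cachedBlobSize)
{
    ComPtr<ID3D12PipelineLibrary> library;
    HRESULT hr = device1->CreatePipelineLibrary(
        cachedBlob, cachedBlobSize, IID_PPV_ARGS(&library));
    // A stale cache (driver or adapter changed) fails with
    // D3D12_ERROR_DRIVER_VERSION_MISMATCH / D3D12_ERROR_ADAPTER_NOT_FOUND;
    // fall back to an empty library and rebuild.
    if (FAILED(hr))
        device1->CreatePipelineLibrary(nullptr, 0, IID_PPV_ARGS(&library));
    return library;
}
```

Note that the blob memory you pass in must remain valid for the lifetime of the library object.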

Thanks... also, is there a high-level overview of them?

-potential energy is easily made kinetic-

Quick question: do PSO libraries have an effect on the speed of switching PSOs? For example, if you have a PSO library with similar PSOs varying only by blend state, does it get treated like D3D11, where the driver patches different code onto the end of the pixel shader, as opposed to monolithic PSOs?

-potential energy is easily made kinetic-

The answer is no. A PSO library is only an attempt to save memory and storage by de-duplicating redundant parts (without it, PSO data can now account for hundreds of megabytes); it has no effect on the shaders of individual PSOs.

> The answer is no. A PSO library is only an attempt to save memory and storage by de-duplicating redundant parts (without it, PSO data can now account for hundreds of megabytes); it has no effect on the shaders of individual PSOs.

Are you sure about that? How does it save memory by reusing common parts, which implies pipelines might not be as monolithic as before? Driver patching would at least partially explain the savings.

-potential energy is easily made kinetic-

So the reality is that pipelines aren't actually monolithic - but they also can't be chunked in a way that's common across the various architectures. Any attempt to parameterize the pipeline (like how D3D11 did) results in overhead, because the mismatch from API to hardware needs to be resolved dynamically.

So we've bundled everything together to give the driver the opportunity to see everything it needs up front, but pipelines still compile into separate pieces that are applied individually.

If you use the original serialization APIs, you end up with tons of duplicate data, because multiple pipelines may have common pieces between them, but when you serialize each of them, each has to include those pieces in its serialization. Using a library allows the driver to write only one copy of each piece, which gets de-duplicated as more pipelines are added to the library.

Loading has similar issues - using the original serialization APIs, there's a temptation for the driver to essentially just memcpy the contents of the serialized pipeline, but if they take that naive approach, you end up with tons of duplication in memory afterwards. So instead, the drivers have to de-duplicate everything on load, which just adds overhead. But using a library, they end up with everything de-duplicated already, so there's no overhead or wasted memory.
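To make the intended flow concrete, here is a sketch of the load-or-create pattern described above; the helper names ('LoadOrCreate', 'SerializeLibrary') and 'psoDesc' are placeholders:

```cpp
#include <d3d12.h>
#include <wrl/client.h>
#include <vector>

using Microsoft::WRL::ComPtr;

// Try to fetch a PSO from the library by name; on a miss, compile it the
// normal way and store it so later runs (and pipelines sharing its pieces)
// benefit from the de-duplication.
ComPtr<ID3D12PipelineState> LoadOrCreate(
    ID3D12Device* device,
    ID3D12PipelineLibrary* library,
    const wchar_t* name,
    const D3D12_GRAPHICS_PIPELINE_STATE_DESC& psoDesc)
{
    ComPtr<ID3D12PipelineState> pso;
    if (FAILED(library->LoadGraphicsPipeline(name, &psoDesc,
                                             IID_PPV_ARGS(&pso))))
    {
        device->CreateGraphicsPipelineState(&psoDesc, IID_PPV_ARGS(&pso));
        library->StorePipeline(name, pso.Get());
    }
    return pso;
}

// At shutdown, serialize the whole library (common pieces written once)
// into a single blob that can be cached on disk for the next run.
std::vector<char> SerializeLibrary(ID3D12PipelineLibrary* library)
{
    std::vector<char> blob(library->GetSerializedSize());
    library->Serialize(blob.data(), blob.size());
    return blob;
}
```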

> So the reality is that pipelines aren't actually monolithic - but they also can't be chunked in a way that's common across the various architectures. Any attempt to parameterize the pipeline (like how D3D11 did) results in overhead, because the mismatch from API to hardware needs to be resolved dynamically. So we've bundled everything together to give the driver the opportunity to see everything it needs up front, but pipelines still compile into separate pieces that are applied individually.

I understood this, as is evidenced by the diagrams in early DX12 presentations, but without PSO libraries each chunk is only associated with one mate per interface, so there is no glue logic present... hence the term monolithic.

> If you use the original serialization APIs, you end up with tons of duplicate data, because multiple pipelines may have common pieces between them, but when you serialize each of them, each has to include those pieces in its serialization. Using a library allows the driver to write only one copy of each piece, which gets de-duplicated as more pipelines are added to the library.

This is where I think the aforementioned glue logic comes into play. Let's say you're on an architecture where blending is implemented as a program appended to the end of the pixel shader. A PSO library would then have a single "pixel shader chunk" with glue logic to the different blend states included in the library. I'm asking because I care about the cost of switching between similar PSOs. Does using a library reduce the overhead of switching PSOs within the same library?

-potential energy is easily made kinetic-

As far as I'm aware, all current D3D12 driver implementations do de-duplicate based on state combinations across the entire device; pipelines don't need to be associated with a library for that to happen. The library association is all about minimizing the cost of saving the compilation result for a new device.

So with this de-duplication, switching from one PSO to another only requires changing the pieces that changed. I know that, for example, on Xbox, they've put this diffing logic directly into the command processor on the GPU so you even get benefits from one command list to the next. I think all current PC drivers do this on the CPU though.

Sorry for the late reply. Do you know how this de-duplication takes place...? To me it seems it would add to shader compilation time. Also, is the input layout considered a state combination, or is it part of the vertex shader chunk?

Finally, the above video also mentions programmable blending... do you know when that will be released?

-potential energy is easily made kinetic-

This topic is closed to new replies.
