After further investigation, I have decided to resurrect this thread with new information. I suspect that this incredibly contrived use case might be exposing a bug.
I brought this to the attention of a DX team member, and he directed me here. I had already seen something similar here, integrating Direct2D and WPF. I dove into the SurfaceQueue code and found that it contains two synchronization mechanisms. One is for multithreaded enqueue/dequeue of surfaces, which does nothing in my use case, as I am simply producing a surface that is to be immediately consumed. The second is the render synchronization, which is intended to force the frame to finish rendering synchronously with the enqueue of the surface.
The magic is in the enqueue method of the SurfaceQueue:
[source]
// Copy a small portion of the surface onto the staging surface
hr = m_pProducer->GetDevice()->CopySurface(pStagingResource, pSurface, width, height);
...
//
// Force rendering to complete by locking the staging resource.
//
if (FAILED(hr = m_pProducer->GetDevice()->LockSurface(pStagingResource, Flags)))
{
    goto end;
}
if (FAILED(hr = m_pProducer->GetDevice()->UnlockSurface(pStagingResource)))
{
    goto end;
}
ASSERT(QueueEntry.pStagingResource == NULL);
//
// The call to lock the surface completed succesfully meaning the surface if flushed
// and ready for dequeue. Mark the surface as such and add it to the fifo queue.
//
[/source]
In a nutshell, the SurfaceQueue copies a portion of the surface to be enqueued to a staging resource and then (in the case of Direct3D 10 and 11) maps and unmaps that resource into CPU space. In theory, this should force the rendering to the original surface to complete, so that the staging resource will be up to date when it is (potentially) read in CPU space. This makes sense, so armed with this information, I added the following code to my previous example's render method:
[source lang="csharp"]
D3DDevice.ImmediateContext.CopyResource(SharedTexture, StagingTexture);
var data = D3DDevice.ImmediateContext.MapSubresource(StagingTexture, 0, StagingTexture.Description.Width * StagingTexture.Description.Height * sizeof(float), MapMode.Read, MapFlags.None);
D3DDevice.ImmediateContext.UnmapSubresource(StagingTexture, 0);
[/source]
and initialization:
[source lang="csharp"]
Texture2DDescription stagingdesc = new Texture2DDescription();
stagingdesc.BindFlags = BindFlags.None;
stagingdesc.Format = DXGI.Format.B8G8R8A8_UNorm;
stagingdesc.Width = WindowWidth;
stagingdesc.Height = WindowHeight;
stagingdesc.MipLevels = 1;
stagingdesc.SampleDescription = new DXGI.SampleDescription(1, 0);
stagingdesc.Usage = ResourceUsage.Staging;
stagingdesc.OptionFlags = ResourceOptionFlags.None;
stagingdesc.CpuAccessFlags = CpuAccessFlags.Read;
stagingdesc.ArraySize = 1;
StagingTexture = new Texture2D(D3DDevice, stagingdesc);
[/source]
Running the sample results in the same blinking behavior as before. In addition, I was playing around with the original Direct3D10 sample and added Direct2D code to clear the render target using the SharedTexture surface. That resulted in the behavior I had previously noted, where any 3D content drawn before the Acquire call to the KeyedMutex would complete 100% of the time, but anything after that would blink. Except in this case, any Direct2D content drawn after the 3D triangle would blink, while the triangle was always present. So, I remade the current sample to add an additional triangle. Code follows:
Initialization:
[source lang="csharp"]
SampleStream1 = new DataStream(3 * 32, true, true);
SampleStream1.WriteRange(new[] {
    new Vector4(0.25f, 0.5f, 0.5f, 1.0f), new Vector4(1.0f, 0.0f, 0.0f, 1.0f),
    new Vector4(0.75f, -0.5f, 0.5f, 1.0f), new Vector4(0.0f, 1.0f, 0.0f, 1.0f),
    new Vector4(-0.25f, -0.5f, 0.5f, 1.0f), new Vector4(0.0f, 0.0f, 1.0f, 1.0f)
});
SampleStream1.Position = 0;
SampleVertices1 = new Buffer(D3DDevice, SampleStream1, new BufferDescription()
{
    BindFlags = BindFlags.VertexBuffer,
    CpuAccessFlags = CpuAccessFlags.None,
    OptionFlags = ResourceOptionFlags.None,
    SizeInBytes = 3 * 32,
    Usage = ResourceUsage.Default
});
[/source]
And the new completed Render:
[source lang="csharp"]
D3DDevice.ImmediateContext.ClearDepthStencilView(SampleDepthView, DepthStencilClearFlags.Depth | DepthStencilClearFlags.Stencil, 1.0f, 0);
float c = ((float)(arg % 1000)) / 999.0f;
D3DDevice.ImmediateContext.ClearRenderTargetView(SampleRenderView, new SlimDX.Color4(1.0f, c, c, c));
D3DDevice.ImmediateContext.InputAssembler.InputLayout = SampleLayout;
D3DDevice.ImmediateContext.InputAssembler.PrimitiveTopology = PrimitiveTopology.TriangleList;
D3DDevice.ImmediateContext.InputAssembler.SetVertexBuffers(0, new VertexBufferBinding(SampleVertices, 32, 0));
EffectTechnique technique = SampleEffect.GetTechniqueByIndex(0);
EffectPass pass = technique.GetPassByIndex(0);
for (int i = 0; i < technique.Description.PassCount; ++i)
{
    pass.Apply(D3DDevice.ImmediateContext);
    D3DDevice.ImmediateContext.Draw(3, 0);
}
Mutex.Acquire(0, int.MaxValue);
Mutex.Release(0);
D3DDevice.ImmediateContext.InputAssembler.SetVertexBuffers(0, new VertexBufferBinding(SampleVertices1, 32, 0));
for (int i = 0; i < technique.Description.PassCount; ++i)
{
    pass.Apply(D3DDevice.ImmediateContext);
    D3DDevice.ImmediateContext.Draw(3, 0);
}
D3DDevice.ImmediateContext.CopyResource(SharedTexture, StagingTexture);
var data = D3DDevice.ImmediateContext.MapSubresource(StagingTexture, 0, StagingTexture.Description.Width * StagingTexture.Description.Height * sizeof(float), MapMode.Read, MapFlags.None);
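// Byte offset of linear pixel index 3377 (4 bytes per pixel); which pixel this hits depends on the row pitch.
data.Data.Position = 3377 * 4;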
var read1 = data.Data.ReadByte();
var read2 = data.Data.ReadByte();
var read3 = data.Data.ReadByte();
var read4 = data.Data.ReadByte();
System.Diagnostics.Debug.WriteLine("Actual1: " + read1 + " " + read2 + " " + read3 + " " + read4);
data.Data.Position = 3390 * 4;
read1 = data.Data.ReadByte();
read2 = data.Data.ReadByte();
read3 = data.Data.ReadByte();
read4 = data.Data.ReadByte();
System.Diagnostics.Debug.WriteLine("Actual2: " + read1 + " " + read2 + " " + read3 + " " + read4);
System.Diagnostics.Debug.WriteLine("");
D3DDevice.ImmediateContext.UnmapSubresource(StagingTexture, 0);
[/source]
First of all, I know this is stupid, but how can I find the pitch so that I can request the correct data size?
Aside from that, there are a few things to note here (it only takes a few minutes to set the sample up and it is interesting to say the least):
1) The second triangle will flicker, while the original triangle will remain solid. This is expected though not desired.
2) Those data readings represent the near tip of each triangle (although I am not sure if this is uniform across all hardware due to pitch). Every frame, my debug output reads thusly:
Actual1: 6 1 247 255
Actual2: 4 4 247 255
This means that every frame, by the end of the execution of the Render method, the staging surface clearly has the data showing that both triangles are drawn (I have also created additional tests sampling additional pixels to confirm this is absolutely true). I must assume that the original shared surface has the triangles drawn as well, because the staging surface is a copy. Immediately following the render method is the synchronous lock of the D3DImage version of the surface and its supposed copy to an image source (the bowels of the D3DImage). Yet this does not seem possible, because visually some frames clearly do not have the second triangle, although the tests show that every frame it is being produced before the D3DImage locks and copies it.
So if the map/unmap is forcing synchronization, as the SurfaceQueue code suggests and my CPU read tests appear to confirm, I am at a loss as to why this does not work. If this were a synchronization issue with DX9Ex, why is the first triangle ALWAYS visible? Am I missing something?