[D3D11] Map() from two different devices causing system hangup / freeze

9 comments, last by chai_ 13 years, 7 months ago
EDIT: after clarifying the problem, changed the thread title. Scroll down a couple posts to see where it's currently at.

My program (built off of the n-body sample in the latest SDK) should do this:

1) set up a buffer of particles with initial data
2) run a compute shader on a deferred context to determine new positions/velocities of each particle
3) copy the buffer from the GPU to the CPU

I'm having serious issues doing this because of incompatible access flags.


My buffer needs the following bind flags for the compute shader:
D3D11_BIND_UNORDERED_ACCESS | D3D11_BIND_SHADER_RESOURCE;

I need to copy it to the CPU from a deferred context using Map(), and according to http://msdn.microsoft.com/en-us/library/ff476457(v=VS.85).aspx , the only way is to use the map flags D3D11_MAP_WRITE_DISCARD and/or D3D11_MAP_WRITE_NO_OVERWRITE .

To use D3D11_MAP_WRITE_DISCARD and/or D3D11_MAP_WRITE_NO_OVERWRITE, I must also set the CPU access flag D3D11_CPU_ACCESS_WRITE on the buffer.

To use D3D11_CPU_ACCESS_WRITE, I must use the usage flag D3D11_USAGE_DYNAMIC or D3D11_USAGE_STAGING, because the other usages do not allow CPU write access.

HOWEVER, DYNAMIC and STAGING flags are incompatible with UNORDERED_ACCESS and SHADER_RESOURCE.


Am I going about this the wrong way? Is there something I'm missing with the flags? Or do I need to copy the resources using something other than Map()? How can I get these deferred context resources onto the CPU??

Thanks in advance,

Alfredo

[Edited by - chai_ on August 25, 2010 5:50:28 PM]
I highlighted one problem above - you can't read resource data back on a deferred context, only on an immediate context. The reason that you must use either D3D11_MAP_WRITE_DISCARD or D3D11_MAP_WRITE_NO_OVERWRITE is that they don't return data to the CPU with the Map call - they only allow writing to the resource.

Have you tried doing this from an immediate context? Also, are you getting debug output when you try to create your resource? I am fairly certain that you want to use D3D11_USAGE_STAGING, but I haven't ever tried reading back GPU resources to the CPU...
Quote:Original post by Jason Z
I highlighted one problem above - you can't read resource data back on a deferred context, only on an immediate context. The reason that you must use either D3D11_MAP_WRITE_DISCARD or D3D11_MAP_WRITE_NO_OVERWRITE is that they don't return data to the CPU with the Map call - they only allow writing to the resource.

Have you tried doing this from an immediate context? Also, are you getting debug output when you try to create your resource? I am fairly certain that you want to use D3D11_USAGE_STAGING, but I haven't ever tried reading back GPU resources to the CPU...


Thanks for the quick reply.

I separated the SRV buffer from the UAV buffer, so now I have a buffer that has D3D11_BIND_UNORDERED_ACCESS only. However, I get errors if I use any usage flags other than D3D11_USAGE_DEFAULT. Seems I need this in order to create my UAV buffer for the compute shader. Other than those errors, no debug output.

I have now created two devices, and therefore two immediate contexts (none deferred; I had misunderstood deferred contexts, and they weren't essential to the program). Now I have some new issues.

From Device1, I can map the resource and read it back on the CPU with no problem. When I do the exact same thing with Device2, my whole system hangs and I have to do a hard reset. It happens as soon as I call the Map() function on Device2. Is there any reason why Map() should crash my system on a different device? What might cause this?

-Alfredo
That crash sounds like a driver bug - a pretty serious one if it crashed the whole OS.

Could you describe better what it is you're trying to do and how deferred contexts are expected to help?

The primary GPU resources should be D3D11_USAGE_DEFAULT so that they can be bound as both SRVs and UAVs.

You'll want a second resource using D3D11_USAGE_STAGING with CPU read access enabled (D3D11_CPU_ACCESS_READ).
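To make the flag combinations from this thread easier to check at a glance, here is a rough sketch of the rules as a small C++ predicate. The enums and the function name IsPlausibleBufferDesc are hypothetical stand-ins so that it compiles without d3d11.h, and the model is deliberately simplified to the cases discussed here. The debug layer output from CreateBuffer() remains the authority.

```cpp
#include <cstdint>

// Hypothetical stand-ins for D3D11_USAGE_*, D3D11_BIND_* and D3D11_CPU_ACCESS_*.
enum class Usage { Default, Dynamic, Staging };

enum Bind : std::uint32_t {
    BindNone            = 0,
    BindShaderResource  = 1u << 0,
    BindUnorderedAccess = 1u << 1,
};

enum CpuAccess : std::uint32_t {
    CpuNone  = 0,
    CpuWrite = 1u << 0,
    CpuRead  = 1u << 1,
};

// Simplified model of the buffer-creation rules discussed in this thread.
// It is an assumption-laden sketch, not a replacement for the runtime's validation.
bool IsPlausibleBufferDesc(Usage usage, std::uint32_t bind, std::uint32_t cpu)
{
    switch (usage) {
    case Usage::Default:
        // GPU read/write; can be bound as SRV and UAV, but has no CPU access.
        return cpu == CpuNone;
    case Usage::Dynamic:
        // CPU write only; must be bound somewhere, but never as a UAV.
        return cpu == CpuWrite && bind != BindNone &&
               (bind & BindUnorderedAccess) == 0;
    case Usage::Staging:
        // Cannot be bound to the pipeline at all; exists only for CPU access.
        return bind == BindNone && cpu != CpuNone;
    }
    return false;
}
```

Under this model the compute buffer is (Default, SRV | UAV, no CPU access) and the readback buffer is (Staging, no bind flags, CPU read), which is why two separate resources are needed.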

Your deferred context can then do all the work it needs on the first resource and then do a CopyResource() (the cost of this won't matter) to the staging resource. Then the call to Map() (this will be the big cost) on the staging resource will do a data transfer from the GPU to the CPU. If you are worried about being blocked, then use the flag D3D11_MAP_FLAG_DO_NOT_WAIT, which makes Map() return DXGI_ERROR_WAS_STILL_DRAWING instead of blocking while the GPU is still busy. This would allow you to poll Map() over several frames until the data transfer is complete. This would also imply that you'd use a ring buffer of resources to manage the data over several frames.
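The polling idea above could be organized roughly like the sketch below. It only tracks which staging slot is oldest and whether it is ready; the actual D3D11 calls are left to a caller-supplied callback, so the class ReadbackRing, the MapStatus enum, and the method names are all hypothetical. In a real program the callback would wrap Map() with D3D11_MAP_READ and D3D11_MAP_FLAG_DO_NOT_WAIT and report StillDrawing when it gets DXGI_ERROR_WAS_STILL_DRAWING.

```cpp
#include <cstddef>
#include <functional>
#include <utility>

// Result of one non-blocking map attempt on a staging slot.
enum class MapStatus { Ready, StillDrawing };

// Hypothetical ring of readback slots; each slot stands for one staging buffer.
class ReadbackRing {
public:
    using TryMapFn = std::function<MapStatus(std::size_t)>;

    ReadbackRing(std::size_t slotCount, TryMapFn tryMap)
        : slotCount_(slotCount), tryMap_(std::move(tryMap)) {}

    // Reserve the next free slot. The caller then issues CopyResource() into
    // that slot's staging buffer. Returns false if every slot is still pending.
    bool BeginCopy(std::size_t& slot) {
        if (pending_ == slotCount_) return false;
        slot = (oldest_ + pending_) % slotCount_;
        ++pending_;
        return true;
    }

    // Poll the oldest pending slot once without blocking. On success the
    // caller reads the mapped data, calls Unmap(), and the slot becomes free.
    bool PollOldest(std::size_t& slot) {
        if (pending_ == 0) return false;
        if (tryMap_(oldest_) != MapStatus::Ready) return false;
        slot = oldest_;
        oldest_ = (oldest_ + 1) % slotCount_;
        --pending_;
        return true;
    }

private:
    std::size_t slotCount_;
    TryMapFn tryMap_;
    std::size_t oldest_ = 0;   // index of the oldest pending slot
    std::size_t pending_ = 0;  // number of slots waiting to be read
};
```

Each frame you would call BeginCopy() once after the compute pass and PollOldest() once before it, so the CPU consumes results a few frames late but does not stall on the GPU.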
What I have done successfully is:

1) Create Device1 and Device2
2) Run a compute shader for 1000 particles (using nbody sample shader) on Device1
3) copy buffer from Device1 to Device2
4) render all 1000 particles using Device2

What I want to do now is:

1) create Device1, Device2, and Device3
2) run compute shader for 500 particles on Device1
3) simultaneously run compute shader for 500 particles on Device2
4) copy Device1 buffer and Device2 buffer to Device3
5) render all 1000 particles using Device3

I wanted to use deferred contexts in order to run steps 2) and 3) simultaneously, but that's not necessary, because the Dispatch command does not block execution unless you specify explicitly.

You are right that I have to copy the buffer to a staging resource; that's what I'm doing. I copy it to a staging buffer (call it TempBuffer), and then run Map() on TempBuffer to get the values. When I do this for Device1, it works and I can get the values. When I do it for Device2, the Map() call hangs the entire system.

EDIT: I should specify that I can render them all without copying to the CPU, but I want to be able to copy, so that in the next step I can distribute particles amongst many Devices and still consolidate them afterwards.
Is this a multi-GPU setup? Or are you creating several devices on a single GPU?
Multi-GPU setup. If I put several devices on one GPU, copying resources isn't a problem, but to copy between GPUs I first need to copy them to the CPU to consolidate all the particles.

I'm going to try on a different multi-GPU system and see if the problem persists; if so, it's definitely a driver issue. Will update soon! Thanks for all the help so far.
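For reference, the CPU-side consolidation step I have in mind is nothing more than concatenating the per-device readbacks into one array before uploading it to the rendering device. A minimal sketch, assuming each particle is a position and a velocity of four floats each (my guess at the layout; the struct Particle and the function Consolidate are made-up names):

```cpp
#include <cstddef>
#include <vector>

// Assumed particle layout: four floats of position, four of velocity.
struct Particle {
    float pos[4];
    float vel[4];
};

// Concatenate the particles read back from each device, in device order.
std::vector<Particle> Consolidate(const std::vector<std::vector<Particle>>& perDevice)
{
    std::size_t total = 0;
    for (const auto& part : perDevice) total += part.size();

    std::vector<Particle> all;
    all.reserve(total);  // one allocation for the combined buffer
    for (const auto& part : perDevice)
        all.insert(all.end(), part.begin(), part.end());
    return all;
}
```

With two devices simulating 500 particles each, this gives one 1000-element array whose first 500 entries come from the first device.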
OK, it's definitely not a driver issue. I tested on a different system with different GPUs, and I get the same result. I searched Google to see if anyone else had hit Map() causing a system hangup, but it looks like this is not a common occurrence.

Does anyone have info on what might cause a system hangup like this? I got a similar issue when I passed in the wrong InputLayout in a DX program, so can I assume it's an issue of memory leaks/corruption? Does anyone know of a possible way to debug, maybe dump the buffer information as it attempts to map?

Thanks,
Alfredo
Have you changed the TDR period?

If not, then this is sounding a lot like an OS bug. Could you post your system's configurations using dxdiag? And if possible it would be great to get a working debug build of the code with symbols so that we might have a look at what's going on in the OS. PM me if you like.
Haven't changed the TDR period, and I'm not getting an error or a crash, just a complete hangup. The dxdiag complete output is here:

http://pastebin.com/Cj2Mbw9h

I'd have to get approval to share the debug build, though I personally have no problem doing so. I'll let you know; until then I'm going to clean up the code and make sure there isn't anything silly that I'm missing...
