Copy texture data in default heap to readback heap

Posted by Mr_Fox

Hi Guys,

Is there any straightforward way to copy Texture2D data into a CPU-readable array? IIRC, we can't create a Texture2D on a readback heap, right? Copying an upload heap buffer into a Texture2D requires a lot of bookkeeping, so I'd guess the reverse task is also tricky.

Right now I use a compute shader to copy the Texture2D into a structured buffer in a default heap, then use CopyBufferRegion to copy that into a buffer in a readback heap, and finally Map it to read the data back on the CPU.

My approach feels overly complicated; there must be a more convenient way to copy a Texture2D into a CPU-readable array (otherwise, how would we capture the screen to a file?).

Thanks in advance.

CopyTextureRegion() can describe either a source or dest buffer as if it were a texture, and facilitate the copy. So what you'd do is pass the texture and subresource info as the source, and the buffer with the texture footprint (consider using GetCopyableFootprints() on the texture to generate the footprint) as the dest.
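A minimal sketch of that path, assuming a device, command list and DEFAULT-heap texture already exist (the names device, cmdList, texture and readbackBuffer are illustrative, not from the original post):

// Sketch: copy subresource 0 of a DEFAULT-heap texture into a READBACK buffer.
// Assumes the texture has been transitioned to the COPY_SOURCE state.
D3D12_RESOURCE_DESC texDesc = texture->GetDesc();

D3D12_PLACED_SUBRESOURCE_FOOTPRINT footprint = {};
UINT numRows = 0;
UINT64 rowSizeInBytes = 0, totalBytes = 0;
device->GetCopyableFootprints(&texDesc, 0, 1, 0, &footprint, &numRows, &rowSizeInBytes, &totalBytes);

// Create a READBACK buffer big enough for the padded (RowPitch-aligned) rows.
D3D12_HEAP_PROPERTIES readbackProps = {};
readbackProps.Type = D3D12_HEAP_TYPE_READBACK;

D3D12_RESOURCE_DESC bufDesc = {};
bufDesc.Dimension        = D3D12_RESOURCE_DIMENSION_BUFFER;
bufDesc.Width            = totalBytes;
bufDesc.Height           = 1;
bufDesc.DepthOrArraySize = 1;
bufDesc.MipLevels        = 1;
bufDesc.Format           = DXGI_FORMAT_UNKNOWN;
bufDesc.SampleDesc.Count = 1;
bufDesc.Layout           = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;

ID3D12Resource* readbackBuffer = nullptr;
device->CreateCommittedResource(&readbackProps, D3D12_HEAP_FLAG_NONE, &bufDesc,
    D3D12_RESOURCE_STATE_COPY_DEST, nullptr, IID_PPV_ARGS(&readbackBuffer));

// Describe the buffer as if it were a texture via its placed footprint, then copy.
D3D12_TEXTURE_COPY_LOCATION src = {};
src.pResource        = texture;
src.Type             = D3D12_TEXTURE_COPY_TYPE_SUBRESOURCE_INDEX;
src.SubresourceIndex = 0;

D3D12_TEXTURE_COPY_LOCATION dst = {};
dst.pResource       = readbackBuffer;
dst.Type            = D3D12_TEXTURE_COPY_TYPE_PLACED_FOOTPRINT;
dst.PlacedFootprint = footprint;

cmdList->CopyTextureRegion(&dst, 0, 0, 0, &src, nullptr);

// After executing the command list and waiting on a fence, Map(readbackBuffer) and read
// numRows rows of rowSizeInBytes bytes each, stepping by footprint.Footprint.RowPitch.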


Jesse can correct me if I'm wrong, but the restriction regarding Textures in a Readback heap is not one shared by the 'Custom' heap.

You should be able to create a CUSTOM heap in L0 (System Memory) with a CPU_PAGE_PROPERTY of WRITE_BACK. With that heap type you shouldn't be prevented from creating a texture that can be persistently mapped by the CPU.
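A rough sketch of the heap properties that implies (names are illustrative, not from the original post):

// CUSTOM heap with the same properties a READBACK heap would use, minus the texture restriction.
D3D12_HEAP_PROPERTIES customProps = {};
customProps.Type                 = D3D12_HEAP_TYPE_CUSTOM;
customProps.CPUPageProperty      = D3D12_CPU_PAGE_PROPERTY_WRITE_BACK;
customProps.MemoryPoolPreference = D3D12_MEMORY_POOL_L0;   // L0 = system memory

// Pass customProps to CreateCommittedResource along with a texture D3D12_RESOURCE_DESC;
// the resulting texture can then be persistently mapped by the CPU.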


Thanks Jesse, I will try that later.

You should be able to create a CUSTOM heap in L0 (System Memory) with a CPU_PAGE_PROPERTY of WRITE_BACK.

Thanks Adam. Just curious: if a texture is mapped, with its data laid out in a swizzled way, what is the advantage of a mapped Texture2D over a mapped structured buffer? (I feel we'd have to spend extra CPU cycles to 'decode' the data for further CPU processing.) And what is the potential use case for mapping a Texture2D on the CPU?

Thanks in advance.


I haven't ever tried, but I'd be surprised if D3D12 let you map a texture with an UNKNOWN layout. Even if it did, you wouldn't know where to find your texels anyway. You can always make the WRITE_BACK texture in the ROW_MAJOR layout and write to that directly in system memory.

Does the 2D texture data have a lifetime on the GPU beyond the point where you generate it? i.e. Is the data only generated for the benefit of being read back on the CPU or is it read again by the GPU later in the frame? It feels to me like you'd be better off writing it directly back to system memory at the point that you generate the data rather than writing it to GPU local memory and then copying it in a separate step back to system memory.


I'd be surprised if D3D12 let you map a texture with an UNKNOWN layout.

Believe it or not, you can. If you pass a null pointer when you request to map the resource, we'll internally just store the memory location; after that, you can use the ReadFromSubresource API to copy (some of) the texels into a ROW_MAJOR layout in standard CPU memory.
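A rough sketch of that path, assuming a CPU-accessible texture of known width and height in a 4-byte-per-texel format (the names texture, width and height are illustrative):

// Map with no pointer requested, then detile into a row-major CPU-side array.
texture->Map(0, nullptr, nullptr);                       // ppData == nullptr: no CPU pointer returned

std::vector<uint8_t> pixels(size_t(width) * height * 4); // 4 bytes per texel assumed
D3D12_BOX box = { 0, 0, 0, width, height, 1 };           // left, top, front, right, bottom, back
texture->ReadFromSubresource(pixels.data(), width * 4, width * height * 4, 0, &box);

texture->Unmap(0, nullptr);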


Believe it or not, you can.

You live and learn!

So it sounds like you can either pay the cost of detiling the system memory resource on the CPU using the ReadFromSubresource API Jesse mentioned, or create the system memory resource in a format you can access directly (ROW_MAJOR today, Standard Swizzle in the future).

There's a good chance the GPU would much prefer the resource to be natively tiled as it writes to system memory, but if there are other bottlenecks on that Draw/Dispatch it may make no difference what its layout is.


It feels to me like you'd be better off writing it directly back to system memory at the point that you generate the data rather than writing it to GPU local memory and then copying it in a separate step back to system memory.

Correct me if I'm wrong, but I don't think a compute shader can directly write to a structured/typed buffer in a readback heap?


Not in a Readback heap, no. However, as per my first post in the thread, this restriction shouldn't apply to a "Custom" heap that shares all the flags/properties of a Readback heap.

Thanks Adam, but if such a Custom heap can share all the flags/properties of a Readback heap without the aforementioned restriction, why does the Readback heap have that restriction in the first place?


That's a question I've asked myself many times before! In fact it wasn't the case in pre-release "early access" versions of D3D12. My speculation would be that it was simply to dissuade people from doing it / prevent people from doing it accidentally, while still giving them the ability to do it if they were feeling brave. Jesse may be able to shed more light on why it is the way it is.


That's a pretty good summary of the reason. An UPLOAD heap encapsulates the capabilities necessary for the majority of CPU -> GPU operations, and a READBACK heap does the same for GPU -> CPU operations. At least, the majority of what was available in prior APIs. Attempting to get fancier with your resource usage involves breaking away from the abstractions and delving down into the actual properties they imply.

In D3D11 (at least), the only things that came back from the GPU were copies to STAGING or queries, both of which can be simply implemented in 12 as copies to a READBACK buffer. So that's all that we decided READBACK needed to be able to do.


Thanks Jesse and Adam. I think I will just create the texture on such a Custom heap so it can be directly mapped for CPU reads. But are there any caveats when mapping a texture resource? (The texture is ROW_MAJOR.) Mapping structured/typed buffers was straightforward when I tried it, but I bet a texture (even row-major) is different.


Not in a Readback heap, no. However, as per my first post in the thread, this restriction shouldn't apply to a "Custom" heap that shares all the flags/properties of a Readback heap.

Well, I tried a Custom heap with a row-major texture, but got this error:

The D3D12 device being used only supports copies to/from row major textures.  Neither D3D12_RESOURCE_FLAG_ALLOW_RENDER_TARGET nor D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS may be set. [ STATE_CREATION ERROR #599: CREATERESOURCE_INVALIDMISCFLAGS]

So it seems we can't set the allow-UAV flag, which means that at least for a row-major texture (UNKNOWN layout still works), a compute shader can't directly write to a CPU-mappable resource.


"The D3D12 device being used only supports copies to/from row major textures."

Does that mean there are certain DX12 devices that could support other ways to access row-major textures?

BTW, my GPU is GeForce GTX TITAN X


It looks like you're right; my GCN card reports the same thing.

The no-RTV/UAV-on-Row-Major restriction seems to be a separate limitation, unrelated to what type of heap the texture lives in. The only way you could do it directly is to calculate an index per pixel (y * width + x) and write to a UAV buffer in system memory, flattening the image to 1D on write.

Your original idea was a two-step process of copying from Texture to Buffer in DEFAULT, then copying the buffer from DEFAULT to READBACK.

But you could do it with one less copy if you just copied the texture from the DEFAULT heap to the CUSTOM heap directly. The row-major one in System Memory wouldn't need an RTV or UAV flag on it as it's just a copy destination, and the GPU should deswizzle on copy. It may also be possible to do it on the Copy Queue in parallel with the GPU continuing its work. The bit I'm not 100% sure about is whether the Copy queue has the ability to change Texture Layout on copy; I know of GPUs that are capable of it, but I'm not sure what we exposed on Windows.
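A sketch of that single-copy idea, assuming defaultTexture is the swizzled GPU-local texture and cpuTexture is a ROW_MAJOR texture created on the CUSTOM system-memory heap without RTV/UAV flags (names are illustrative):

// Copy the DEFAULT-heap texture straight into the CPU-mappable CUSTOM-heap texture.
D3D12_TEXTURE_COPY_LOCATION src = {};
src.pResource        = defaultTexture;
src.Type             = D3D12_TEXTURE_COPY_TYPE_SUBRESOURCE_INDEX;
src.SubresourceIndex = 0;

D3D12_TEXTURE_COPY_LOCATION dst = {};
dst.pResource        = cpuTexture;
dst.Type             = D3D12_TEXTURE_COPY_TYPE_SUBRESOURCE_INDEX;
dst.SubresourceIndex = 0;

cmdList->CopyTextureRegion(&dst, 0, 0, 0, &src, nullptr);

// Once the copy has completed on the GPU, Map(cpuTexture) and read the rows directly.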
