"Writing to Mips in Compute Shader"
Is this possible? For now I'm rendering to a texture and then using CopySubresourceRegion() to generate mips in realtime.
The whole process goes from 2048x1024 through 1024x512 and so on, down to 2x1 resolution. (It takes under 0.3ms, but I might have to do it around 3-6 times per frame.)
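For reference, the chain described here can be enumerated with a few lines of CPU code. This is only a sketch: `mipChain` and its `smallestMax` parameter are made-up names, and it assumes the usual halve-and-round-down rule for mip sizes.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Returns the (width, height) of every mip level, starting at level 0.
// Each level halves both dimensions (rounding down, clamped to 1).
// The chain stops once the larger dimension is <= smallestMax:
// 2 reproduces a chain that ends at 2x1, 1 gives the full chain to 1x1.
std::vector<std::pair<unsigned, unsigned>> mipChain(unsigned width, unsigned height,
                                                    unsigned smallestMax = 1) {
    assert(width > 0 && height > 0 && smallestMax > 0);
    std::vector<std::pair<unsigned, unsigned>> levels;
    while (true) {
        levels.emplace_back(width, height);
        if (std::max(width, height) <= smallestMax) break;
        width = std::max(1u, width / 2);
        height = std::max(1u, height / 2);
    }
    return levels;
}
```

For a 2048x1024 source this gives 11 levels when stopping at 2x1, so every "generate mips" pass is 10 downsample steps.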
My version has a very similar speed compared to GenerateMips().
I really think a compute shader version would be considerably faster, especially because I could write 4+ mip levels at once.
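To make the "several levels at once" idea concrete, here is a CPU sketch of what one thread group could do: read an 8x8 tile of mip 0 once and produce the matching blocks of mips 1, 2 and 3 from it. It is not HLSL and not a measured implementation. The `Image` type and both function names are hypothetical, a 2x2 box filter is assumed, and the local array only stands in for groupshared memory. A 16x16 tile would give a fourth level the same way.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Single-channel float image, row-major. Hypothetical helper type.
struct Image {
    unsigned width, height;
    std::vector<float> texels;
    Image(unsigned w, unsigned h)
        : width(w), height(h), texels(static_cast<std::size_t>(w) * h, 0.0f) {}
    float& at(unsigned x, unsigned y) {
        return texels[static_cast<std::size_t>(y) * width + x];
    }
    float at(unsigned x, unsigned y) const {
        return texels[static_cast<std::size_t>(y) * width + x];
    }
};

// Emulates one 8x8 "thread group": loads its tile of mip 0 once, then
// reduces it 8x8 -> 4x4 -> 2x2 -> 1x1, writing each result to its mip.
void downsampleTile(const Image& mip0, unsigned tileX, unsigned tileY,
                    Image* const outputs[3]) {
    float cur[8][8];  // plays the role of groupshared memory
    for (unsigned y = 0; y < 8; ++y)
        for (unsigned x = 0; x < 8; ++x)
            cur[y][x] = mip0.at(tileX * 8 + x, tileY * 8 + y);

    unsigned size = 8;
    for (int level = 0; level < 3; ++level) {
        size /= 2;
        float next[4][4];
        for (unsigned y = 0; y < size; ++y)
            for (unsigned x = 0; x < size; ++x) {
                // 2x2 box filter over the previous level.
                next[y][x] = 0.25f * (cur[2 * y][2 * x] + cur[2 * y][2 * x + 1] +
                                      cur[2 * y + 1][2 * x] + cur[2 * y + 1][2 * x + 1]);
                outputs[level]->at(tileX * size + x, tileY * size + y) = next[y][x];
            }
        for (unsigned y = 0; y < size; ++y)
            for (unsigned x = 0; x < size; ++x)
                cur[y][x] = next[y][x];
    }
}

// Runs every tile. mip0 dimensions must be multiples of 8.
void downsampleThreeLevels(const Image& mip0, Image& mip1, Image& mip2, Image& mip3) {
    assert(mip0.width % 8 == 0 && mip0.height % 8 == 0);
    assert(mip1.width == mip0.width / 2 && mip1.height == mip0.height / 2);
    assert(mip2.width == mip0.width / 4 && mip2.height == mip0.height / 4);
    assert(mip3.width == mip0.width / 8 && mip3.height == mip0.height / 8);
    Image* const outputs[3] = {&mip1, &mip2, &mip3};
    for (unsigned tileY = 0; tileY < mip0.height / 8; ++tileY)
        for (unsigned tileX = 0; tileX < mip0.width / 8; ++tileX)
            downsampleTile(mip0, tileX, tileY, outputs);
}
```

The point of the structure is that mip 0 is read exactly once per tile, instead of once per generated level.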
Microsoft's documentation says you have to index RWTextures with an int2. But what if I use an out-of-bounds int2? Will it let me access the mip levels? They must be somewhere around in memory :)
Can I cast an RWTexture into a raw buffer and access all of it (including mip levels)?
[Edited by - OctavianTheFirst on March 2, 2010 5:29:14 PM]
You can't access the mips from an RWTexture2D object. You can verify this from the fact that the Texture2D object has the .mips operator but RWTexture2D does not. So a UAV only ever sees one mip level (the mip slice it was created for), never the whole chain.
I think you may have to bind 8 UAVs to write your mip data to, and then do a CopySubresourceRegion to get the UAV data into the mipped texture. This could still be faster than doing several separate renders. Heck, you could bind 1 UAV, write out all of the data yourself in your own mip format, and then copy the data into the real texture.
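As an illustration of the "one UAV, your own mip format" option, here is one possible packing, sketched on the CPU. The layout is an assumption, not something D3D defines: mip 0 sits at the origin and the smaller mips are stacked in a column to its right. `MipRect`, `Extent`, `packedMipLayout` and `atlasExtent` are made-up names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Where one mip level lives inside the single packed surface.
struct MipRect { unsigned x, y, width, height; };
struct Extent { unsigned width, height; };

// Assumed layout: mip 0 at (0, 0); mips 1..N stacked top to bottom
// in a column that starts at x = width of mip 0.
std::vector<MipRect> packedMipLayout(unsigned width, unsigned height) {
    assert(width > 0 && height > 0);
    std::vector<MipRect> rects;
    rects.push_back({0, 0, width, height});
    unsigned w = width, h = height, y = 0;
    while (w > 1 || h > 1) {
        w = std::max(1u, w / 2);
        h = std::max(1u, h / 2);
        rects.push_back({width, y, w, h});
        y += h;
    }
    return rects;
}

// Size of the surface needed to hold every rectangle.
Extent atlasExtent(const std::vector<MipRect>& rects) {
    Extent e{0, 0};
    for (const MipRect& r : rects) {
        e.width = std::max(e.width, r.x + r.width);
        e.height = std::max(e.height, r.y + r.height);
    }
    return e;
}
```

For a 2048x1024 source the whole chain fits in a 3072x1024 surface, and each rectangle doubles as the source box for a later copy into the real mipped texture.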
Reading a UAV or an SRV out of bounds will always return zero, and an out-of-bounds write will always be discarded.
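A small CPU model of that rule may help show why out-of-bounds indexing can't be used to reach neighbouring memory. `BoundsCheckedTexture` is a hypothetical class that only mimics the described behaviour; it is not how the hardware is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU stand-in for a single-mip UAV with the boundary rule described
// above: out-of-bounds loads return zero, out-of-bounds stores are dropped.
class BoundsCheckedTexture {
public:
    BoundsCheckedTexture(int width, int height)
        : width_(width), height_(height),
          texels_(static_cast<std::size_t>(width) * height, 0.0f) {
        assert(width > 0 && height > 0);
    }

    float load(int x, int y) const {
        return inBounds(x, y) ? texels_[index(x, y)] : 0.0f;
    }

    void store(int x, int y, float value) {
        if (inBounds(x, y)) texels_[index(x, y)] = value;  // otherwise discarded
    }

private:
    bool inBounds(int x, int y) const {
        return x >= 0 && y >= 0 && x < width_ && y < height_;
    }
    std::size_t index(int x, int y) const {
        return static_cast<std::size_t>(y) * width_ + x;
    }

    int width_, height_;
    std::vector<float> texels_;
};
```

Note that a store to x == width does not spill into the next row, which is what a naive flat-memory model would predict.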
Texture objects can be cast between formats using an SRV, but cannot be cast to a different resource type. The memory layouts of these objects are very different.
The boundary behavior is built into the hardware and part of the D3D spec. You won't be able to get around this.
Type checking is verified in several places; hooking this up to try these casts would just result in D3D catching the errors and not running your shader.
Developers aren't given the ability to modify shader assembly in D3D10 and higher versions.
The texture layouts and mip layouts are inaccessible because they can vary per driver and per hardware implementation. In general that is good, since it lets people take advantage of future driver/hardware optimizations. It also makes sure that developers don't write code that depends on these layouts remaining unchanged. But of course making the data inaccessible can also mean that it's a bit harder for developers to make optimizations of their own, as in your case. Functions like Copy* on the GPU have the ability to encode/decode the formats and transfer data between resources. You're pretty much going to be stuck working within those bounds.
How are you profiling the speed of these Mip generation functions?
But of course those mips are not used for texture interpolation. I like using mips because they're an easy way to store trees on the GPU. They help me compress sparse textures and then send them from one GPU to another. (Sending HD images over a 4GBps link is not that fast, especially when you need FP RGBA...)
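To illustrate the general "mips as a tree" idea (this is not the actual compression scheme, which isn't described here), the sketch below treats a mip chain as a quadtree: texel (x, y) at level k+1 is the parent of the 2x2 block at (2x, 2y) on level k. It assumes a square power-of-two, single-channel texture, and all names are hypothetical. Each parent stores whether any child is non-zero, so empty regions can be skipped from the top down.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Level = std::vector<unsigned char>;     // 1 = something non-zero below
using Texel = std::pair<unsigned, unsigned>;  // (x, y) at full resolution

// levels[0] is n x n, levels[k] is (n >> k) x (n >> k), the last is 1x1.
std::vector<Level> buildOccupancyPyramid(const std::vector<float>& texels, unsigned n) {
    assert(n > 0 && (n & (n - 1)) == 0);  // power of two
    assert(texels.size() == static_cast<std::size_t>(n) * n);
    std::vector<Level> levels;
    Level base(texels.size());
    for (std::size_t i = 0; i < texels.size(); ++i)
        base[i] = texels[i] != 0.0f ? 1 : 0;
    levels.push_back(std::move(base));
    for (unsigned size = n / 2; size >= 1; size /= 2) {
        const Level& child = levels.back();
        const unsigned cs = size * 2;  // child level width
        Level parent(static_cast<std::size_t>(size) * size);
        for (unsigned y = 0; y < size; ++y)
            for (unsigned x = 0; x < size; ++x)
                parent[y * size + x] =
                    child[(2 * y) * cs + 2 * x] | child[(2 * y) * cs + 2 * x + 1] |
                    child[(2 * y + 1) * cs + 2 * x] | child[(2 * y + 1) * cs + 2 * x + 1];
        levels.push_back(std::move(parent));
    }
    return levels;
}

// Depth-first walk that never descends into an empty subtree.
void collect(const std::vector<Level>& levels, unsigned level,
             unsigned x, unsigned y, std::vector<Texel>& out) {
    const unsigned size = 1u << (levels.size() - 1 - level);
    if (!levels[level][y * size + x]) return;
    if (level == 0) { out.emplace_back(x, y); return; }
    for (unsigned dy = 0; dy < 2; ++dy)
        for (unsigned dx = 0; dx < 2; ++dx)
            collect(levels, level - 1, 2 * x + dx, 2 * y + dy, out);
}

std::vector<Texel> nonEmptyTexels(const std::vector<float>& texels, unsigned n) {
    const std::vector<Level> levels = buildOccupancyPyramid(texels, n);
    std::vector<Texel> out;
    collect(levels, static_cast<unsigned>(levels.size() - 1), 0, 0, out);
    return out;
}
```

Only the non-empty texels (plus the small upper levels) would need to be transferred for a sparse image.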
I've built a "variable bitrate" realtime compression algorithm that compresses FP textures in a quasi-lossless way by up to 4x. The compression time depends mostly on the mipmap creation time. But I rearranged my functions (grouped the CopySubresourceRegion() calls in a separate loop) and it's not a bottleneck anymore.
I might go ahead and write them in my own mipmap format in a single texture, completely avoiding the use of DX11 mips... That might speed up the process some more, but right now I'm working on the renderer, making it take advantage of the accumulation process as much as possible...
What I really like about DX11 is that compute shaders can access textures and resources directly. This makes them so easy to integrate with shaders, and opens the door to any imaginable highly parallel algorithm.