How to use the same texture as input and output?

Started by kovacsp
7 comments, last by circlesoft 18 years, 4 months ago
Hi,

I have some single-channel data to process with a pixel shader, and the output is single-channel too. As my card doesn't have any format like D3DFMT_L8 or the like, I have to use e.g. D3DFMT_A8R8G8B8 for both the source and the destination texture and use only a single channel of each, wasting bandwidth and video memory.

The idea of using two channels of the same texture as source and destination comes naturally, but I've read this in ShaderX2, in the article "Simulating Blending Operations on Floating-point Render Targets" by Francesco Carucci, p. 174:

"... The idea cannot be implemented in a straightforward manner as described because having the same render target texture both as input and as output of the same pixel shader is not officially supported by current hardware and might lead to undefined and unpredictable results, due to the presence of texture and frame-buffer caches that usually do not talk to each other. ..."

Phew. So do you have any idea, some clever trick, to use the minimal amount of bandwidth and video memory when there are two single-channel 8-bit textures to work with?

Thanks for your help,
kp
------------------------------------------------------------
Neo, the Matrix should be 16-byte aligned for better performance!
IIRC, there is just no way to get around it. You just can't have the same texture as both the active render target and the input texture. Just think about it - even if you could do that, you wouldn't want to, because it would really screw your algorithms up (because your input data would change mid-frame, since you are outputting to it).

Anyways, the only way around it that I've seen is to use two different rendertargets and swap which one is input and output for each frame.
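Something along these lines, roughly (just a sketch, not tested code: device, texA, texB and numPasses are made-up names, texture creation is only hinted at in the comments, and error checking is left out):

    // Two A8R8G8B8 render-target textures created up front, e.g. with
    //   device->CreateTexture(width, height, 1, D3DUSAGE_RENDERTARGET,
    //                         D3DFMT_A8R8G8B8, D3DPOOL_DEFAULT, &texA, NULL);
    // and the same again for texB.

    IDirect3DTexture9* texSrc = texA;   // read from this one
    IDirect3DTexture9* texDst = texB;   // render into this one

    for (int pass = 0; pass < numPasses; ++pass)
    {
        IDirect3DSurface9* dstSurf = NULL;
        texDst->GetSurfaceLevel(0, &dstSurf);

        device->SetRenderTarget(0, dstSurf);   // output of this pass
        device->SetTexture(0, texSrc);         // input of this pass

        // ... draw the full-screen quad with your pixel shader here ...

        dstSurf->Release();

        std::swap(texSrc, texDst);   // swap roles for the next pass (<utility>)
    }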
Dustin Franklin ( circlesoft :: KBase :: Mystic GD :: ApolloNL )
As circlesoft says. The reason comes down to the way the graphics cards handle everything internally!
----------------------------

http://djoubert.co.uk
Quote: Original post by circlesoft
Just think about it - even if you could do that, you wouldn't want to, because it would really screw your algorithms up (because your input data would change mid-frame, since you are outputting to it).

Thanks for your reply! But let me make it more precise. If (for example) I use the R channel of a texture for input, and the G for output, and set COLORWRITEENABLE according to this, then input and output data do not interfere with each other. This is what I was thinking about.
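Roughly, I was imagining something like this (just a sketch of the idea, not something I have running; device and tex are placeholder names):

    // Sample the R channel in the pixel shader, and mask the writes so that
    // only the G channel of the render target gets touched.
    device->SetRenderState(D3DRS_COLORWRITEENABLE, D3DCOLORWRITEENABLE_GREEN);
    device->SetTexture(0, tex);   // the shader reads .r from this
    // ...and the render target would be a surface of that very same texture,
    // which is exactly the part the ShaderX2 quote says is unsupported.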

Quote: Original post by circlesoft
Anyways, the only way around it that I've seen is to use two different rendertargets and swap which one is input and output for each frame.

Yeah, this is what I'm currently doing, wasting the G, B and A channels of two huge textures...

kp
------------------------------------------------------------
Neo, the Matrix should be 16-byte aligned for better performance!
Quote: Original post by kovacsp
If (for example) I use the R channel of a texture for input, and the G for output, and set COLORWRITEENABLE according to this, then input and output data do not interfere with each other. This is what I was thinking about.


This will not work either. The graphics card handles all the components of a pixel in parallel (and usually several pixels at once), and the components have to stay together for the entire length of the rasterization pipe.

Niko Suni

Quote: Original post by kovacsp
Thanks for your reply! But let me make it more precise. If (for example) I use the R channel of a texture for input, and the G for output, and set COLORWRITEENABLE according to this, then input and output data do not interfere with each other. This is what I was thinking about.

This would still require specialized hardware to recognize situations like this, and graphics hardware just isn't built to do that.
Quote: Original post by circlesoft
Anyways, the only way around it that I've seen is to use two different rendertargets and swap which one is input and output for each frame.

Yeah, this is what I'm currently doing, wasting the G, B and A channels of two huge textures...

There's not much you can do about this. If your card doesn't support single-channel render targets, then you have no other options.

By the way, the limitation of not being able to set a texture as a source and a destination at the same time is not likely to go away any time soon. I used to work at one of the IHVs on the software side, and whenever I mentioned this to the hardware guys, they looked at me very angrily :P

neneboricua
I've got an idea!
If I pack my single-channel texture into one that encodes four pixels of the original (either horizontally or vertically, maybe both), then I use all the channels, and I can also make use of all four channels when processing! (if I'm lucky [smile])
This would need a preprocessing step, but maybe it would be worth it!

kp
------------------------------------------------------------
Neo, the Matrix should be 16-byte aligned for better performance!
Quote: Original post by kovacsp
I've got an idea!
If I pack my single-channel texture into one that encodes four pixels of the original (either horizontally or vertically, maybe both), then I use all the channels, and I can also make use of all four channels when processing! (if I'm lucky [smile])
This would need a preprocessing step, but maybe it would be worth it!

kp


Actually, it is pretty easy to pack 4 bytes into a single RGBA pixel, with practically no pre-processing other than a type cast in your application [smile] This will additionally accelerate your original algorithm, potentially to 400% of its current performance - well thought out!

EDIT: The optimal solution is to pack the pixels along the scanlines; this way, you don't stress the texel cache any more than usual, and you don't need to write any custom scanline-skipping logic in the shader.
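For example, the upload could look roughly like this (a sketch only: srcData, packedTex, width and height are made-up names, error handling is skipped, and it assumes the packed texture is lockable, i.e. not a render target in D3DPOOL_DEFAULT). Four consecutive bytes of a source scanline become one 32-bit texel, so the packed texture is only width/4 texels wide:

    // srcData:   the original single-channel 8-bit image, width x height bytes,
    //            tightly packed (width assumed to be a multiple of 4).
    // packedTex: an A8R8G8B8 texture of size (width / 4) x height.
    D3DLOCKED_RECT lr;
    if (SUCCEEDED(packedTex->LockRect(0, &lr, NULL, 0)))
    {
        for (UINT y = 0; y < height; ++y)
        {
            // Each destination row holds (width / 4) texels * 4 bytes = width bytes,
            // so a whole source row is copied straight across - this is the
            // "type cast": no per-pixel work at all.
            memcpy((BYTE*)lr.pBits + y * lr.Pitch, srcData + y * width, width);
        }
        packedTex->UnlockRect(0);
    }

Note that A8R8G8B8 is laid out as B, G, R, A in memory, so source bytes 0..3 of each group land in the blue, green, red and alpha channels respectively; in the shader you then process all four values of a fetched texel at once.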

Niko Suni

Quote: Original post by neneboricua19
By the way, the limitation of not being able to set a texture as a source at the same time as a destination is not likely to go away any time soon. I used to work at one of the IHV's on the software side and whenever I mentioned this to the hardware guys, they looked at me very angrily :P

Yea, just think about it from a software standpoint - it just doesn't make sense. You don't write to a buffer that you are reading from at the same time, otherwise you are going to be overwriting data that you may have to read from later on in the algorithm. The bottom line is that you have to write to a clean buffer, which is just what we are doing now.
Dustin Franklin ( circlesoft :: KBase :: Mystic GD :: ApolloNL )
