[solved] how to save 4-channel to 2-channel?

5 comments, last by Matias Goldberg 16 years, 6 months ago
Thanks to all! :) In summary, I'm trying to save both graphics memory and copy-back time, and for now that is hard to do.
----------------------------------
Hi, [DX10, VS2005, GeForce 8800]

I'm doing blending with render-to-texture. I only need 2 channels for blending: one is a color channel (any of R/G/B, with blendOp = Min), and the other is alpha (blendOp = Add; effectively an increment, since I set ps.outColor = (x,x,x,1)). But the hardware only supports full 4-channel textures for this (the 2-channel formats don't have alpha), so copy-back time is wasted on unused channels. How can I reduce the 4 channels to 2?

Thanks!

[Edited by - yk_cadcg on October 12, 2007 11:36:59 PM]
Are you sure this actually poses a performance issue? Have you compared blending with 2 channels compared to 4?
Sirob Yes.» - status: Work-O-Rama.
Thanks Sirob,
1. This is not a critical performance issue. Readback is very fast: for a 1M 4-channel texture it costs less than 100 ms. But it is a waste of resources, since only 2 channels are used.
2. I can't test with a 2-channel texture, because there is no 2-channel format that contains an alpha channel (something like R32A32). I have to use one color channel and one alpha channel because I need 2 different blend ops, and those can only be assigned separately to color and to alpha. Two color channels, as in R32G32, can only share one BlendOp.
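To make the requirement concrete, here is a minimal CPU-side sketch of the blend described above. It is not D3D code: the `Rgba` struct and both function names are made up for illustration, and the Add op is assumed to use blend factors of one on both source and destination. It only shows why two different ops end up split across the color and alpha parts of one render target.

```cpp
#include <algorithm>
#include <cassert>

// One texel of a hypothetical 4-channel float render target.
struct Rgba { float r, g, b, a; };

// CPU reference for the blend in the post: color uses Min, alpha uses Add.
// Assumption: the Add op uses factors of one for source and destination.
Rgba blendMinColorAddAlpha(const Rgba& src, const Rgba& dst) {
    return Rgba{
        std::min(src.r, dst.r),
        std::min(src.g, dst.g),
        std::min(src.b, dst.b),
        src.a + dst.a
    };
}

// Blends `count` fragments into one texel. Each fragment outputs
// (x, x, x, 1), so alpha ends up counting the fragments.
Rgba accumulate(Rgba dst, const float* values, int count) {
    assert(count >= 0);
    for (int i = 0; i < count; ++i) {
        const Rgba src{values[i], values[i], values[i], 1.0f};
        dst = blendMinColorAddAlpha(src, dst);
    }
    return dst;
}
```

Starting from a texel of (1, 1, 1, 0) and blending fragments with x = 0.7, 0.2 and 0.5, the color channels end at the minimum and alpha at the fragment count. G and B carry the same value as R, which is the redundancy being asked about.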

You might be able to get implicit conversion through casting using one of the copy-resource functions. I'm not on my D3D10 dev machine now so I don't have the specs/docs to confirm this.

Regardless of whether the API can do it I'd imagine you'll be best off implementing some sort of staging resource and a GPU-based working-to-staging conversion. For example, do a simple render-to-texture from 4-channel to 2-channel and convert accordingly in PS.
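As a rough illustration of that conversion step, the sketch below does the same repacking on the CPU: it keeps R and A from each 4-channel texel and writes them into a 2-channel buffer. On the GPU this would be the body of the conversion pixel shader; the function names, the interleaved float layout, and the 32-bit-per-channel size are assumptions for the example, not something confirmed in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Repacks interleaved RGBA float texels into 2-channel texels,
// keeping only R (first channel) and A (fourth channel).
std::vector<float> packRedAlpha(const std::vector<float>& rgba) {
    assert(rgba.size() % 4 == 0);
    const std::size_t texels = rgba.size() / 4;
    std::vector<float> rg(texels * 2);
    for (std::size_t i = 0; i < texels; ++i) {
        rg[2 * i + 0] = rgba[4 * i + 0]; // R: the Min-blended value
        rg[2 * i + 1] = rgba[4 * i + 3]; // A: the accumulated count
    }
    return rg;
}

// Bytes needed for `texels` texels with `channels` float channels each.
// Assumption: one float per channel, as in an R32G32-style format.
std::size_t bufferBytes(std::size_t texels, std::size_t channels) {
    return texels * channels * sizeof(float);
}
```

If "1M" is read as roughly one million texels with 4-byte floats, `bufferBytes` gives about 16 MiB for the 4-channel buffer and half that for the packed one, which is the saving the extra pass would have to pay for.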

However, I suspect that the extra cost of performing this conversion is likely to offset the gain from halving the GPU<->CPU bandwidth. You may end up making your software a lot more complex for a relatively minor performance improvement.

I was reading about NVPerfHUD 5 yesterday. Drill into your app with this sort of tool before you start over-complicating your code based on theoretical assumptions.

hth
Jack


If you want to use alpha blending you're stuck with four channels, so there's not much you can do in terms of saving bandwidth. Disabling color writes on the channels you don't need should speed up rendering though.

If it's that much of an issue, try triple buffering your algorithm and doing the alpha blending manually. You'll only need two-channel textures then, at the cost of an extra texture and the manual blending ops.
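A minimal CPU-side sketch of the manual-blending idea, under some simplifying assumptions: only two buffers are ping-ponged (the incoming values stand in for the third texture), every pass touches every texel, and the `Rg` struct and function names are hypothetical. Each pass reads the previous result and writes the next one, which is what a pixel shader sampling the previous 2-channel texture would do.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// 2-channel texel: x holds the running minimum, y the running count.
struct Rg { float x, y; };

// One manual blending pass: read `previous`, combine with the incoming
// values, and write the result to `next` (never to the buffer being read).
void blendPass(const std::vector<Rg>& previous,
               const std::vector<float>& incoming,
               std::vector<Rg>& next) {
    assert(previous.size() == incoming.size());
    assert(previous.size() == next.size());
    for (std::size_t i = 0; i < previous.size(); ++i) {
        next[i].x = std::min(previous[i].x, incoming[i]); // replaces blendOp = Min
        next[i].y = previous[i].y + 1.0f;                 // replaces the alpha Add
    }
}

// Runs all passes, swapping the read and write buffers after each one.
std::vector<Rg> runPasses(std::vector<Rg> initial,
                          const std::vector<std::vector<float>>& passes) {
    std::vector<Rg> front = std::move(initial);
    std::vector<Rg> back(front.size());
    for (const auto& incoming : passes) {
        blendPass(front, incoming, back);
        std::swap(front, back);
    }
    return front;
}
```

The trade-off is the one stated above: both ops now live in shader code, so 2-channel storage is enough, but an extra buffer and an extra read per pass are needed.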

BTW, there is a duplicate of this thread further below. I originally posted this reply there, however the duplicate should probably be deleted.
I'm no shader expert, but my guess is that it will only "save" bandwidth, since shader units use parallelized instructions. I believe computing 4 floats takes the same time as computing 2, because with 2 the unit simply wouldn't be operating at full capacity.
As for bandwidth, your R component would later need to be copied to the G and B components for the final result, since that's how video cards work (OK, unless you're using YUV front buffers or something similar), so you'll lose the performance gain somewhere.


I hope I'm Right :P
Dark Sylinc

PS: What I'm trying to say, is that I don't think it's worth optimizing it.

This topic is closed to new replies.
