reading texture performance question

2 comments, last by 21st Century Moose 11 years, 1 month ago

In the pixel shader I can use one of these two configurations:

-- reading from an R32G32B32A32 texture once, using a single Load() call

-- reading from an R32 texture four times, using four Load() calls

The first must be faster. Let's say the pixel shader runs around 1,000,000 times: is there a big difference between the two?


The key is, what else does the pixel shader do?

And no, it is not obvious or given that the first one is faster.

Niko Suni

I've seen cases where either one or the other is faster. Profile it in your particular scenario and see.

Sometimes, if you can interleave the four Loads with some ALU ops, you'll get those instructions for free because they run while the loads are in flight. Sometimes the four 32-bit textures will fit in GPU caches more easily. I'd suggest that rather than asserting that one way "must be faster" you instead write both codepaths and profile them against each other. There is no general answer to this because everybody's shader code is going to be different, so make a decision based on actual facts that are relevant to your use case and your code instead of suppositions.
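
For reference, the two codepaths could look roughly like the sketch below. It assumes the second configuration means four separate R32 textures read at the same texel; the resource names, register slots and entry point names are hypothetical and not from the original post. Treat it as a starting point for profiling, not as a statement about which variant wins.

```hlsl
// Hypothetical resource names and register slots, chosen for illustration only.
Texture2D<float4> PackedTex : register(t0); // one R32G32B32A32_FLOAT texture
Texture2D<float>  ChanR     : register(t1); // four R32_FLOAT textures
Texture2D<float>  ChanG     : register(t2);
Texture2D<float>  ChanB     : register(t3);
Texture2D<float>  ChanA     : register(t4);

// Variant A: a single 128-bit Load per pixel.
float4 PSPacked(float4 pos : SV_Position) : SV_Target
{
    int3 coord = int3(pos.xy, 0); // texel x, texel y, mip level 0
    return PackedTex.Load(coord);
}

// Variant B: four 32-bit Loads per pixel, one per channel texture.
float4 PSSplit(float4 pos : SV_Position) : SV_Target
{
    int3 coord = int3(pos.xy, 0); // same texel in every channel texture
    return float4(ChanR.Load(coord),
                  ChanG.Load(coord),
                  ChanB.Load(coord),
                  ChanA.Load(coord));
}
```

Both variants fetch the same 16 bytes per pixel, so any difference comes from how the hardware schedules and caches the fetches alongside the rest of the shader. Swap each entry point into the real shader, with its actual ALU work, and time both on the target GPU.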


