reading texture performance question

lomateron

In the pixel shader I can use one of these two configurations:


--reading from an R32G32B32A32 texture once, using a single Load()


--reading from R32 textures four times, using four Load() calls



The first must be faster. Let's say the pixel shader runs around 1,000,000 times: is there a big difference between the two?

mhagain

Sometimes if you can interleave the four Loads with some ALU ops, the load latency is hidden behind that work and you effectively get those instructions for free.  Sometimes the four 32-bit textures will fit in GPU caches more easily.  I'd suggest that rather than asserting that one way "must be faster" you instead write both codepaths and profile them against each other.  There is no general answer to this because everybody's shader code is going to be different, so make a decision based on actual facts that are relevant to your use case and your code instead of suppositions.
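
As a starting point for that comparison, the two codepaths could look roughly like the sketch below. It assumes Shader Model 4+ HLSL, and it reads "R32 texture four times" as four separate single-channel textures; the resource names, register slots and entry point names (PackedTex, ChannelR, PSPacked, PSSplit and so on) are made up for illustration. Both versions fetch the same 16 bytes per pixel, so any difference comes from how the hardware schedules and caches the fetches, which is what the profiler has to tell you.

```hlsl
// Configuration 1: one R32G32B32A32_FLOAT texture, one Load per pixel.
Texture2D<float4> PackedTex : register(t0);   // hypothetical name and slot

// Configuration 2: four R32_FLOAT textures, one Load each.
Texture2D<float> ChannelR : register(t1);     // hypothetical names and slots
Texture2D<float> ChannelG : register(t2);
Texture2D<float> ChannelB : register(t3);
Texture2D<float> ChannelA : register(t4);

float4 PSPacked(float4 pos : SV_Position) : SV_Target
{
    // Load takes integer texel coordinates in xy and the mip level in z.
    int3 texel = int3(int2(pos.xy), 0);
    return PackedTex.Load(texel);
}

float4 PSSplit(float4 pos : SV_Position) : SV_Target
{
    int3 texel = int3(int2(pos.xy), 0);

    // Four independent fetches. Any ALU work that does not depend on
    // these values can be scheduled while they are in flight.
    float r = ChannelR.Load(texel);
    float g = ChannelG.Load(texel);
    float b = ChannelB.Load(texel);
    float a = ChannelA.Load(texel);

    return float4(r, g, b, a);
}
```

Render the same target with each entry point and compare GPU timings in your own scene, with your real ALU work in the shader, rather than with these bare pass-through versions.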

