
# TLM : Why the dot() ?


So, reading through my comments, I came across this one from jjd:

Quote:
 Original post by jjd
 Ok, I'm a little confused here about the dot function thing. I know that the two formulations are equivalent, but why create an object and use a function call to evaluate expressions that you already had in a clear, efficient form?

So, I figured I'd take a few minutes to explain (partly because I'm in full-on procrastination mode right now...). This might come across as a little GPU 101, so I apologise if I pitch the answer a little low [smile]

The long and the short of it is this: while the expression might be clear to a human and fine on a CPU (assuming non-SIMD code), it's not remotely efficient on a GPU. GPUs like vector operations. Working on individual scalar floats isn't great, and working on scalars that are all sourced from the same register is major bad karma indeed (thus my brain twitching).

Now, I don't know for certain how the compiler dealt with the code I gave it; however, given its constraints, it probably went something like this:

float SinkWest = 0.5 * (east.x + east.y - east.z + east.w);

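PARM half = 0.5
temp SinkWest
mul SinkWest.x east.x half.x
mad SinkWest.x east.y half.x SinkWest.x
mad SinkWest.x -east.z half.x SinkWest.x
mad SinkWest.x east.w half.x SinkWest.x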

However, a dot product is a common operation, so it is reasonable to assume that it will be fast to execute. Written as a dot product, the above code would look like this:

float SinkWest = dot(east,vec4( 0.5, 0.5,-0.5, 0.5));

PARM coef = {0.5, 0.5, -0.5, 0.5}
temp SinkWest
dp4 SinkWest.x east, coef

Job done.
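
For anyone who wants to check that the two forms really are the same thing, here is the dot product expanded term by term. Nothing new here, just the definition of a 4-component dot product applied to the `east` vector and the coefficient vector from the code above:

```latex
\begin{aligned}
\mathrm{dot}\bigl(\mathit{east},\ (0.5,\ 0.5,\ -0.5,\ 0.5)\bigr)
  &= 0.5\,\mathit{east}.x + 0.5\,\mathit{east}.y - 0.5\,\mathit{east}.z + 0.5\,\mathit{east}.w \\
  &= 0.5\,(\mathit{east}.x + \mathit{east}.y - \mathit{east}.z + \mathit{east}.w)
\end{aligned}
```

The sign of each coefficient carries the add or subtract from the original expression, and the shared 0.5 is simply folded into every component.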

It might well cost 2 constants; however, we can probably execute 4 dot products in the time it takes to do the first method, and the GPU will probably much prefer it in general.

I hope that clears up my reasoning behind it [smile]

## 1 Comment

Cool, thanks for clearing that up :)
