# Why is math transformation taxing to most CPUs?


## Recommended Posts

I'm reading the 3-D graphics section of the "Video" chapter in my IT book, and it states:

1) The computer must track all of the vertices of all of the objects in the 3-D world, including the ones you cannot currently see.

Question 1: When the book says "the computer", does it mean the program itself, the CPU that processes the "addresses", or the RAM where the addresses are stored?

2) This calculation process, called transformation, is extremely taxing to most CPUs.

Question 2: Is it because the CPU cannot process the address of the data quickly enough each frame? Does that lead to a slow frame rate in the game?

If anybody knows a lot about computer architecture and 3D game programming, please share your experiences with me. =] I've only done 2D game programming, where I worked with X and Y coordinates; I've never explored X, Y, and Z coordinates.

---

How old is that book?

In the '90s the first bottleneck was rasterizing a triangle. Once GPUs became better at it, the next most expensive operation was transform and lighting, which at that time was done on the CPU and sent to the GPU every frame.

That's why HW TnL (Hardware Transform and Lighting) was invented: it kept the vertices on the GPU, and the math was done entirely on the GPU. Later this would evolve into what we now know as vertex shaders.

I have a hunch that book could be really, really old.

This book is about information technology and only 2 years old. Its focus is troubleshooting. The author decided to shed some light on 3D graphics just for fun.

---

2) This calculation process, called transformation, is extremely taxing to most CPUs.

Question 2: Is it because the CPU cannot process the address of the data quickly enough each frame? Does that lead to a slow frame rate in the game?

This one is actually still true.

Matrix-matrix multiply and matrix-vector multiply are a big cost. A few really smart math geeks have greatly reduced the costs, and some really smart hardware geeks have moved a portion of the cost over to the graphics card rather than the CPU.  However, the operations are not free and they are the most common basic functions used in graphics and physics and other systems.

Done poorly, a game can still overload the CPU with badly written math operations. It is a known concern. Faster processors and good libraries help reduce and mitigate it, but it is still something you will see quite visibly in profiling numbers.

A simple naive matrix multiplication, a 4x4 multiplied by another 4x4, is rather costly. Multiply each row of one matrix by each column of the other (a dot product) to compute each of the 16 necessary results. That is 64 floating-point multiplications and 48 floating-point additions. While an individual matrix multiply isn't overly taxing, doing many of them quickly reaches an unacceptable cost.
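For concreteness, here's a minimal sketch of that naive multiply (my own illustration, not from any particular library), written so each output element costs exactly 4 multiplies and 3 adds:

```cpp
// Naive row-major 4x4 multiply: out = a * b.
// Each of the 16 outputs is the dot product of a row of `a` with a
// column of `b`: 4 multiplications and 3 additions apiece, for a
// total of 64 multiplications and 48 additions.
void mat4_mul_naive(const float a[16], const float b[16], float out[16]) {
    for (int row = 0; row < 4; ++row) {
        for (int col = 0; col < 4; ++col) {
            float sum = a[row * 4 + 0] * b[0 * 4 + col];  // 1st multiply
            for (int k = 1; k < 4; ++k)                   // 3 more mults, 3 adds
                sum += a[row * 4 + k] * b[k * 4 + col];
            out[row * 4 + col] = sum;
        }
    }
}
```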

With a little bit of math magic and some SIMD instructions you can reduce it to the oft-cited code snippet of 16 multiplications, 12 additions, and 16 "shuffles" that let you reuse some of the intermediate results. It ends up about 5x to 6x faster than the naive version, depending on implementation details. I'm not sure where it came from or what its proper name is, but it has been floating around the web for about a decade now. There are many similar specialized fast algorithms for various vector-matrix operations, for both column-based and row-based vectors.
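A SIMD version looks roughly like the sketch below (an illustration assuming an x86 CPU with SSE and row-major matrices, not the canonical snippet itself). Each result row is built as a linear combination of the four rows of `b`, which is where the 16 shuffles, 16 vector multiplies, and 12 vector adds come from:

```cpp
#include <xmmintrin.h>  // SSE intrinsics

// SSE 4x4 multiply, row-major: out = a * b.
// Per result row: 4 shuffles (to broadcast each scalar of a's row),
// 4 vector multiplies, and 3 vector adds -- 16 shuffles, 16
// multiplies, 12 adds for the whole matrix.
void mat4_mul_sse(const float a[16], const float b[16], float out[16]) {
    __m128 b0 = _mm_loadu_ps(b + 0);
    __m128 b1 = _mm_loadu_ps(b + 4);
    __m128 b2 = _mm_loadu_ps(b + 8);
    __m128 b3 = _mm_loadu_ps(b + 12);
    for (int i = 0; i < 4; ++i) {
        __m128 ai = _mm_loadu_ps(a + 4 * i);
        // _mm_shuffle_ps(ai, ai, 0x00) broadcasts element 0 of the row,
        // 0x55 element 1, 0xAA element 2, 0xFF element 3.
        __m128 r = _mm_mul_ps(_mm_shuffle_ps(ai, ai, 0x00), b0);
        r = _mm_add_ps(r, _mm_mul_ps(_mm_shuffle_ps(ai, ai, 0x55), b1));
        r = _mm_add_ps(r, _mm_mul_ps(_mm_shuffle_ps(ai, ai, 0xAA), b2));
        r = _mm_add_ps(r, _mm_mul_ps(_mm_shuffle_ps(ai, ai, 0xFF), b3));
        _mm_storeu_ps(out + 4 * i, r);
    }
}
```

Because each `_mm_mul_ps`/`_mm_add_ps` works on four floats at once, the 64 scalar multiplications still happen, they just happen four at a time.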

While that is a reduction in the number of steps, a matrix multiply is still one of the more costly low-level operations you can do. Graphics operations rely heavily on it. Every time you move or position something in 3D space you need to run a series of matrix multiplies through that portion of your scene. You've got the Model (or World), the View, and the Projection matrices, which ultimately need to be applied to every vertex that gets rendered. You'll need to do quite a few of those matrix multiplies on the CPU, but fortunately you can pass the pre-multiplied values out to the GPU and let the card, with its specialized hardware, do the rendering and heavy lifting.
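A hedged sketch of that last point (hypothetical helper names, row-major matrices): compose Model, View, and Projection once per object on the CPU, and the single combined matrix is what you hand to the GPU, rather than transforming every vertex three separate times.

```cpp
// Row-major 4x4 helpers (illustrative names, not from any particular API).
void mat4_mul(const float a[16], const float b[16], float out[16]) {
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c) {
            float s = 0.0f;
            for (int k = 0; k < 4; ++k) s += a[r * 4 + k] * b[k * 4 + c];
            out[r * 4 + c] = s;
        }
}

// Apply a 4x4 matrix to a homogeneous (x, y, z, w) vector.
void mat4_transform(const float m[16], const float v[4], float out[4]) {
    for (int r = 0; r < 4; ++r)
        out[r] = m[r*4]*v[0] + m[r*4+1]*v[1] + m[r*4+2]*v[2] + m[r*4+3]*v[3];
}

// Compose once per object on the CPU.  The combined matrix transforms a
// vertex in one step instead of three: (P * V * M) * v == P * (V * (M * v)).
void build_mvp(const float proj[16], const float view[16],
               const float model[16], float mvp[16]) {
    float vm[16];
    mat4_mul(view, model, vm);   // view * model
    mat4_mul(proj, vm, mvp);     // proj * (view * model)
}
```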

Physics relies heavily on it too: every time you move a physics object you rely on this same math. Much like the graphics APIs, there are physics libraries (e.g. PhysX) that take advantage of hardware to do the more costly parts. Most physics simulations work on bigger primitives rather than point clouds, so they often require fewer total matrix operations, but it can still require a hefty portion of the CPU budget.

---

I think in the context of point 1), point 2) reveals a certain lack of understanding of the role of GPUs on the part of the author.

Matrix transforms for tracking object positions on the CPU for, for example, culling or physics would not run into the millions per frame unless you had a world with a very large number of objects in it.

---

Done poorly, a game can still overload the CPU with badly written math operations.

Badly written math operations as in "poorly optimized math code"? Could you write an example showing a badly written math operation versus a well-written one? I'd definitely like to learn more.

---

What frob's referring to is naive 'C style' math vs SSE math.

If you search for matrix-matrix multiplication or matrix-vector multiplication and time them against their SSE equivalents, you'll see what he means. The SSE version runs roughly 3.5x faster.
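To make that concrete, here's a hedged sketch of the two styles for a matrix-vector multiply (column-major storage, x86 with SSE assumed; the function names are my own). Timing each in a tight loop over a few million vectors, e.g. with `std::chrono`, is where the speedup shows up; the exact factor depends on compiler flags and CPU:

```cpp
#include <xmmintrin.h>  // SSE intrinsics

// Column-major storage: m[c * 4 + r] is row r of column c.

// Naive 'C style': 16 scalar multiplies and 12 scalar adds.
void mat4_vec_naive(const float m[16], const float v[4], float out[4]) {
    for (int r = 0; r < 4; ++r)
        out[r] = m[0 + r] * v[0] + m[4 + r] * v[1] +
                 m[8 + r] * v[2] + m[12 + r] * v[3];
}

// SSE: broadcast each component of v and scale a whole column at once.
// 4 vector multiplies and 3 vector adds cover the same 16/12 scalar ops.
void mat4_vec_sse(const float m[16], const float v[4], float out[4]) {
    __m128 r = _mm_mul_ps(_mm_set1_ps(v[0]), _mm_loadu_ps(m + 0));
    r = _mm_add_ps(r, _mm_mul_ps(_mm_set1_ps(v[1]), _mm_loadu_ps(m + 4)));
    r = _mm_add_ps(r, _mm_mul_ps(_mm_set1_ps(v[2]), _mm_loadu_ps(m + 8)));
    r = _mm_add_ps(r, _mm_mul_ps(_mm_set1_ps(v[3]), _mm_loadu_ps(m + 12)));
    _mm_storeu_ps(out, r);
}
```

Both accumulate in the same order, so they produce identical results; the SSE version simply does each step on all four rows at once.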
