# Code Optimization in Visual Studio

This topic is 3454 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Recommended Posts

Given the following code:

```cpp
// Function one: every square and product is written out where it is used.
void Quat_To_Matrix1(Quaternion& quat, Matrix& matrix)
{
    // Copy the quaternion components into locals.
    const float x = quat[0];
    const float y = quat[1];
    const float z = quat[2];
    const float w = quat[3];

    matrix[0][0] = w*w + x*x - y*y - z*z;
    matrix[0][1] = 2*x*y + 2*w*z;
    matrix[0][2] = 2*x*z - 2*w*y;
    matrix[0][3] = 0.0f;

    matrix[1][0] = 2*x*y - 2*w*z;
    matrix[1][1] = w*w - x*x + y*y - z*z;
    matrix[1][2] = 2*y*z + 2*w*x;
    matrix[1][3] = 0.0f;

    matrix[2][0] = 2*x*z + 2*w*y;
    matrix[2][1] = 2*y*z - 2*w*x;
    matrix[2][2] = w*w - x*x - y*y + z*z;
    matrix[2][3] = 0.0f;

    matrix[3][0] = 0.0f;
    matrix[3][1] = 0.0f;
    matrix[3][2] = 0.0f;
    matrix[3][3] = w*w + x*x + y*y + z*z;
}
```

```cpp
// Function two: the four squares are computed once and reused.
void Quat_To_Matrix2(Quaternion& quat, Matrix& matrix)
{
    // Copy the quaternion components into locals.
    const float x = quat[0];
    const float y = quat[1];
    const float z = quat[2];
    const float w = quat[3];

    // Squares hoisted by hand.
    float _w = w*w;
    float _x = x*x;
    float _y = y*y;
    float _z = z*z;

    matrix[0][0] = _w + _x - _y - _z;
    matrix[0][1] = 2*x*y + 2*w*z;
    matrix[0][2] = 2*x*z - 2*w*y;
    matrix[0][3] = 0.0f;

    matrix[1][0] = 2*x*y - 2*w*z;
    matrix[1][1] = _w - _x + _y - _z;
    matrix[1][2] = 2*y*z + 2*w*x;
    matrix[1][3] = 0.0f;

    matrix[2][0] = 2*x*z + 2*w*y;
    matrix[2][1] = 2*y*z - 2*w*x;
    matrix[2][2] = _w - _x - _y + _z;
    matrix[2][3] = 0.0f;

    matrix[3][0] = 0.0f;
    matrix[3][1] = 0.0f;
    matrix[3][2] = 0.0f;
    matrix[3][3] = _w + _x + _y + _z;
}
```

Suppose I am using Visual Studio 2008. Is there any need for the optimization in the second function? In other words, does Visual Studio produce optimized assembly for function one without the manual optimization made in the second function? [Edited by - ApochPiQ on June 3, 2009 8:56:02 AM]

##### Share on other sites
Generally it's better to ask the compiler if it does an optimization rather than a human being. You can use the /FA family of switches to get MSVC to output the assembly it generates for a given C++ file.
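As a rough sketch of how that might look in practice, a tiny file like the one below can be compiled with a listing switch and the two functions compared in the output. The file name `quat_cse.cpp`, the function names, and the plain `float` arrays are made up for illustration; they stand in for the `Quaternion` and `Matrix` types from the original post. With MSVC, `cl /c /O2 /FAs quat_cse.cpp` should write a `quat_cse.asm` listing with the source interleaved.

```cpp
// quat_cse.cpp (hypothetical file name, for illustration only)
// Build with MSVC and emit an assembly listing:
//   cl /c /O2 /FAs quat_cse.cpp
// Then compare the code generated for the two functions in quat_cse.asm.

// Squares written out at every use; the compiler is left to merge them.
void diag_repeated(const float q[4], float out[2])
{
    const float x = q[0], y = q[1], z = q[2], w = q[3];
    out[0] = w*w + x*x - y*y - z*z;
    out[1] = w*w - x*x + y*y - z*z;
}

// Squares hoisted by hand, as in the second function of the question.
void diag_hoisted(const float q[4], float out[2])
{
    const float x = q[0], y = q[1], z = q[2], w = q[3];
    const float xx = x*x, yy = y*y, zz = z*z, ww = w*w;
    out[0] = ww + xx - yy - zz;
    out[1] = ww - xx + yy - zz;
}
```

If the multiply counts in the two listings match, the manual hoisting bought nothing for that compiler and those flags; if they differ, the listing shows exactly where.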

##### Share on other sites
Measure it with a profiler. If your program does not spend the majority of its time in this function, the answer to "do I need to optimise it" is unequivocally no.
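A real profiler is the better tool, but for a quick first look a crude timing loop can be enough. The sketch below is only that: `average_ns` is a hypothetical helper name, and it relies on C++11 `<chrono>`, so it would need a newer compiler than Visual Studio 2008.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: calls `fn` `iterations` times and returns the
// average wall-clock time per call in nanoseconds.
template <typename Fn>
double average_ns(Fn fn, int iterations)
{
    assert(iterations > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        fn();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / iterations;
}
```

The callable should write its result somewhere the optimizer cannot discard (a `volatile` variable, for example); otherwise the loop being timed may be removed entirely. Numbers from a loop like this say nothing about whether the function matters in the whole program, which is the point of the post above.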

##### Share on other sites
Quote:
Original post by SiCrane
Generally it's better to ask the compiler if it does an optimization rather than a human being. You can use the /FA family of switches to get MSVC to output the assembly it generates for a given C++ file.

I am asking this question to learn how much optimization we have to do manually and which issues modern C++ compilers take care of.

The code generator of Maple generates function two, while I have seen function one in very large free open-source projects that are really reliable.

##### Share on other sites
Quote:
Original post by eGamer
please show some respect in your answers, otherwise it is useless in this forum.

...

On the other hand, let me give you a very helpful link anyway: click. As it turns out, all the links I wanted to recommend appear there.

edit: Out of curiosity, where exactly was someone disrespectful? I fail to find that.

##### Share on other sites
Quote:
Original post by eGamer
please show some respect in your answers, otherwise it is useless in this forum.

I have no idea how it is that you sense any disrespect here.

##### Share on other sites
Quote:
Original post by eGamer
i mean do visual studio reaches an assembly in function one that is optimized without making the manual optimization in the second function ?

Both functions are completely unoptimized.

You first need to rewrite the code as SIMD (SSE through SSE4, depending on the target platform) in assembly. Then you need to run a decent profiler to determine whether instruction scheduling works as expected or whether there are any other issues. Of course, profile against the non-SIMD version too; it might be faster.

In addition, the data needs to be properly aligned and prefetched in cache, and the loop calling it must be either cache oblivious or take into consideration cache line sizes.
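To make the alignment part concrete, here is a minimal sketch. The names `Float4` and `is_aligned_16` are invented for the example, and 16 bytes is assumed because that is the width of one SSE register. `alignas` is C++11, so Visual Studio 2008 would need its own `__declspec(align(16))` spelling instead.

```cpp
#include <cstdint>

// Hypothetical storage for four floats, forced onto a 16-byte boundary
// (the size of one SSE register).
struct alignas(16) Float4
{
    float v[4];
};

// Returns true when `p` sits on a 16-byte boundary.
inline bool is_aligned_16(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```

A check like `is_aligned_16` is cheap enough to leave in debug builds in front of any aligned SIMD load.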

Lastly, you need to profile the whole application to determine whether bottlenecks occur elsewhere, whether inlining the function is possible or beneficial, and whether the rest of the application introduces unexpected bottlenecks.

##### Share on other sites
Your manual optimizations are (usually) unnecessary; a decent compiler will perform them, or at least consider them. It's called common sub-expression elimination (CSE): expressions like "x*x" and "2*x*y" can be calculated once and reused in other expressions.

The opposite is rematerialization; sometimes you don't want to do too much CSE if you don't have enough registers to store all the CSEs and don't want to spill to memory, so you recalculate expressions because it's cheaper than keeping them around.

##### Share on other sites
Quote:
Original post by outRider
Your manual optimizations are (usually) unnecessary, a decent compiler will perform them, or at least consider them. It's called common sub-expression elimination; expressions like "x*x" and "2*x*y" can be calculated once and used in other expressions.
Be careful with such assumptions.
C compilers often refuse to do any kind of algebraic transformation on floating-point expressions, so I don't expect it to work for things that aren't exactly identical. That is, it may well fail to expose certain possible common sub-expressions within the sums (e.g. things like x - y and 2 + x - y).
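The reason for that caution is that floating-point addition is not associative, so regrouping a sum can change the result. A small sketch (the function names are made up, and it assumes default strict floating-point settings rather than a fast-math mode):

```cpp
// Sums left to right: (a + b) + c.
float sum_left(float a, float b, float c)
{
    return (a + b) + c;
}

// Same three values, regrouped: a + (b + c).
float sum_right(float a, float b, float c)
{
    return a + (b + c);
}
```

With `a = 1e20f`, `b = -1e20f`, `c = 1.0f`, the first grouping gives `1.0f` because the two large values cancel before `c` is added, while the second gives `0.0f` because `c` is lost when it is added to `-1e20f`. A compiler that must preserve results cannot treat the two groupings as interchangeable.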

The usual aliasing issues apply here too of course. Caching the quaternion members (as the OP is doing here) is a good idea for reasons other than saving on typing.
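A minimal illustration of the aliasing point, with hypothetical function names: when the output may overlap the input, the compiler has to re-read the input after every store, and the two versions below can legitimately give different answers.

```cpp
// Reads `in` at each use. If `out` points at the same float as `in`,
// the first store changes what the second read sees.
void twice_reload(const float& in, float* out)
{
    out[0] = in * 2.0f;
    out[1] = in * 2.0f;
}

// Copies `in` into a local first, so later stores through `out`
// cannot affect the value being used.
void twice_cached(const float& in, float* out)
{
    const float v = in;
    out[0] = v * 2.0f;
    out[1] = v * 2.0f;
}
```

Called as `twice_reload(a[0], a)` with `a = {1, 0}`, the array ends up as `{2, 4}`; `twice_cached` on the same input gives `{2, 2}`. Copying into locals tells the compiler (and the reader) that the values cannot change underneath the computation.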

##### Share on other sites
Quote:
Original post by implicit
Quote:
Original post by outRider
Your manual optimizations are (usually) unnecessary, a decent compiler will perform them, or at least consider them. It's called common sub-expression elimination; expressions like "x*x" and "2*x*y" can be calculated once and used in other expressions.
Be careful with such assumptions.
C compilers often refuse to do any kind of algebraic transformation on floating point expressions, so I don't expect it to work for things that aren't exactly identical. That is, it may well fail to expose certain possible common sub-expressions within the sums (e.g. things like x - y and 2 + x - y).

Hence the "usually" and "at least consider them" qualifiers. I'm aware of the FP complications. Most compilers I've used have a switch for fast versus precise floating-point behavior (in MSVC, `/fp:fast` versus `/fp:precise`), and some use the fast mode at high optimization levels.

Quote:
Original post by implicit
The usual aliasing issues apply here too of course. Caching the quaternion members (as the OP is doing here) is a good idea for reasons other than saving on typing.

Aliasing is irrelevant to the OP's question. The only difference between the two snippets is the explicit calculation of the squared sub-expressions; all the variables are pinned. But yes, in general aliasing is a factor in correctly identifying CSEs and invariants.
