geekalert

Yes I'm an optimization freak

This topic is 4701 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


I just wanted to know: is the operation 3 / 2 or the operation 3 * .5 faster on an Intel-based processor? Thanks for any suggestions! (Oh, and please don't post anything about "it doesn't matter" unless it really doesn't matter.) Edit: To expand on that, is the operation 2 * (4 / 5) faster than 2 * .8? Edit: Assume all values are stored as floats.

Well, with today's technology you could presume that either one takes a negligible fraction of the processor's time (something like 1/1207833498789, maybe smaller, lol). But under the circumstances, 2 * .8 would be faster, because it only has to do ONE mathematical operation, whereas 2 * (4/5) has to do two.
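
To make the "one operation versus two" point concrete, here is a minimal C++ sketch. It is not from the original thread, and the function names are made up for illustration. Both functions scale an array by num / den; the second does the division once, before the loop, so each element costs a single multiplication. An optimizing compiler may well do this hoisting on its own.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: divides inside the loop, so every element
// costs one division plus one multiplication.
std::vector<float> scale_naive(const std::vector<float>& in, float num, float den) {
    std::vector<float> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = in[i] * (num / den);
    }
    return out;
}

// Hypothetical helper: the quotient is computed once up front, so the
// loop body is a single multiplication per element.
std::vector<float> scale_hoisted(const std::vector<float>& in, float num, float den) {
    const float factor = num / den;
    std::vector<float> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = in[i] * factor;
    }
    return out;
}
```

Whether the hand-hoisted version is measurably faster depends on the compiler and flags, so it is worth profiling before relying on it.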

Not sure, but I think this is faster than both:

3 >> 1;

Of course that won't work with floats (it is an integer shift, so it gives 1 rather than 1.5), but if you're that focused on optimization there have to be sacrifices.
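
As a small illustration of what the shift actually computes, here is an integer-only C++ sketch (the function names are hypothetical, not from the thread). It shows that both the shift and integer division drop the fractional part, which is why neither is a substitute for the float calculation in the question.

```cpp
// Integer-only sketch: halving by shift versus halving by division.
constexpr int half_by_shift(int x) { return x >> 1; }
constexpr int half_by_division(int x) { return x / 2; }

// 3 / 2 is 1.5 in float arithmetic, but both integer forms give 1.
static_assert(half_by_shift(3) == 1, "the shift loses the fractional part");
static_assert(half_by_division(3) == 1, "integer division truncates too");

// Integer division truncates toward zero. A right shift of a negative
// value is implementation-defined before C++20, so it is not checked here.
static_assert(half_by_division(-3) == -1, "division truncates toward zero");
```

Compilers commonly turn an integer division by a power of two into shift instructions themselves, so writing the shift by hand rarely buys anything.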

If you have something like this:

3 * (4/5)

And everything is a constant, not a variable, then the compiler "SHOULD" figure it out at compile time.

At run time, doing 4/5 would take longer than using .8 because of the extra instruction, I think.
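
A short C++ sketch of the compile-time case, assuming a typical IEEE 754 platform (the variable names are made up for illustration). It also shows a pitfall with the expression as literally written: with integer literals, 4 / 5 is integer division.

```cpp
// With integer literals, 4 / 5 is integer division and yields 0,
// so the whole expression is 0.
constexpr int int_result = 3 * (4 / 5);
static_assert(int_result == 0, "integer division discards the fraction");

// With float literals this is a constant expression, so the compiler
// can evaluate it at compile time and no division remains at run time.
constexpr float folded = 3.0f * (4.0f / 5.0f);
constexpr float direct = 3.0f * 0.8f;

// 4.0f / 5.0f rounds to the same float as the literal 0.8f, so the
// two spellings produce the same value.
static_assert(folded == direct, "both spellings give the same float");
```

The static_assert lines only compile if the values are known at compile time, which is the point being made above.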

It doesn't matter. Every single change you suggest can be trivially performed by the compiler if needed. So, it doesn't matter. Write code that is easy to understand, and let people with more information than you worry about the details.

CM
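
One nuance worth sketching, as an aside that is not from the original post: for floats, a compiler can only swap a division for a multiplication by the reciprocal when the result is bit-for-bit the same, unless relaxed floating-point flags are enabled. Dividing by 2 qualifies because 0.5 is exactly representable; dividing by 5 does not, because 0.2f is only the nearest float to 1/5. The helper names below are hypothetical, and the sketch assumes IEEE 754 single precision.

```cpp
// x / 2.0f and x * 0.5f have the same exact value before rounding,
// so they always agree.
bool halving_matches(float x) { return x / 2.0f == x * 0.5f; }

// 0.2f is slightly larger than 1/5, so x * 0.2f can round to a
// different float than x / 5.0f.
bool fifth_matches(float x) { return x / 5.0f == x * 0.2f; }
```

For example, fifth_matches(9.0f) is false: 9.0f / 5.0f rounds just below 1.8, while 9.0f * 0.2f rounds just above it. That is why this particular rewrite is left to the programmer or to an explicit compiler option.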

Quote:
Original post by geekalert
(Oh, and please don't post anything about "it doesn't matter" unless it really doesn't matter.)


It really doesn't matter.

Yes, as stated above, the compiler will do LOW-LEVEL optimization; what you as a programmer must be concerned with is high-level optimization. For example:

while(I < 300)
    I++; ///<-- the compiler won't optimize this, obvious as the inefficiency is

as opposed to

I = 300;

This is a really simple example, but hopefully you get what I mean.

Quote:
Original post by raptorstrike
Yes, as stated above, the compiler will do LOW-LEVEL optimization; what you as a programmer must be concerned with is high-level optimization. For example:

while(I < 300)
    I++; ///<-- the compiler won't optimize this, obvious as the inefficiency is

as opposed to

I = 300;

This is a really simple example, but hopefully you get what I mean.

This is actually a bad example: the compiler will reduce that to I = 300
(tested with the VC Toolkit 2003).

Cheers,
Pat.

PS: Micro-optimisation is pointless unless profiler results suggest it is needed.
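
For reference, here is a minimal C++ sketch of the two forms being compared, wrapped in functions with made-up names. Note that the loop only equals "I = 300" when I starts at or below 300; the general closed form is the larger of I and 300, which is roughly what an optimizer has to prove before it can drop the loop.

```cpp
#include <algorithm>

// The loop from the quoted post, wrapped in a function.
int count_up_loop(int i) {
    while (i < 300) {
        ++i;
    }
    return i;
}

// Closed form: the loop only changes i when it starts below 300.
int count_up_direct(int i) {
    return std::max(i, 300);
}
```

Comparing the generated assembly for both functions at your usual optimization level is an easy way to check what your own compiler does.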

Back to the topic. If you really have float variables and not constants, there is a HUGE difference in performance! Obviously the calculation with 3 numbers is the slowest, and multiplication should be faster than division.

I wrote a little test app doing each of your calculations 1 billion times. Here is the result (times are in microseconds for the whole loop, so roughly 3.7 ns per division and 1.4 ns per multiplication):

3/2=1,5 took 3662791us
3*0,5=1,5 took 1371114us
2*(4/5)=1,6 took 7788465us
2*0,8=1,6 took 1366573us


And the source code (C#, but test it in C++ or assembler if you want to ...):

// Project: TestAddMultPerformance, File: Program.cs
// Namespace: TestAddMultPerformance, Class: Program
// Path: C:\code\TestAddMultPerformance, Author: Abi
// Code lines: 107, Size of file: 2,96 KB
// Creation date: 30.12.2005 07:16
// Last modified: 30.12.2005 07:25
// Generated with Commenter by abi.exDream.com

#region Using directives
using System;
using System.Collections.Generic;
using System.Text;
using System.Runtime.InteropServices;
#endregion

namespace TestAddMultPerformance
{
    /// <summary>
    /// Program
    /// </summary>
    class Program
    {
        #region Performance counters and getting elapsed time
        /// <summary>
        /// Query performance (high resolution) timer frequency
        /// </summary>
        /// <param name="lpFrequency">current frequency (ticks per second)</param>
        [System.Security.SuppressUnmanagedCodeSecurity]
        [DllImport("Kernel32.dll")]
        [return: MarshalAs(UnmanagedType.Bool)]
        internal static extern bool QueryPerformanceFrequency(
            out long lpFrequency);

        /// <summary>
        /// Query performance (high resolution) timer counter
        /// </summary>
        /// <param name="lpCounter">current counter value</param>
        [System.Security.SuppressUnmanagedCodeSecurity]
        [DllImport("Kernel32.dll")]
        [return: MarshalAs(UnmanagedType.Bool)]
        internal static extern bool QueryPerformanceCounter(
            out long lpCounter);

        /// <summary>
        /// Get current performance timer frequency
        /// (using QueryPerformanceFrequency)
        /// </summary>
        public static long GetPerformanceFrequency()
        {
            long l;
            QueryPerformanceFrequency(out l);
            return l;
        } // GetPerformanceFrequency()

        /// <summary>
        /// Get current performance timer counter value
        /// (using QueryPerformanceCounter)
        /// </summary>
        public static long GetPerformanceCounter()
        {
            long l;
            QueryPerformanceCounter(out l);
            return l;
        } // GetPerformanceCounter()

        /// <summary>
        /// Remember the frequency
        /// </summary>
        public static long performanceFrequency = GetPerformanceFrequency();

        /// <summary>
        /// Convert performance counter value to microseconds.
        /// (ticks * 1000000 / ticksPerSecond is microseconds, not ns.)
        /// </summary>
        /// <param name="perfCounter">Counter difference from 2 values</param>
        static public long ConvertToUs(long perfCounter)
        {
            return perfCounter * 1000000 / performanceFrequency;
        } // ConvertToUs(perfCounter)

        /// <summary>
        /// Convert performance counter value difference
        /// (perfCounter2-perfCounter1) to microseconds.
        /// </summary>
        static public long ConvertToUs(long perfCounter1, long perfCounter2)
        {
            return (perfCounter2 - perfCounter1) *
                1000000 / performanceFrequency;
        } // ConvertToUs(perfCounter1, perfCounter2)
        #endregion

        static void Main(string[] args)
        {
            // Declare everything here in case this eats up cycles
            float ret = 0.0f;
            float value1 = 3;
            float value2 = 2;
            float value3 = 1 / value2;
            float value4 = 4;
            float value5 = 5;
            float value6 = value4 / value5;
            long perfCounterBefore, perfCounterAfter;

            // Do 1 billion iterations.
            int numberOfIterations = 1000 * 1000 * 1000;

            // Run the first test once as a warm-up so we don't count any JIT time.
            perfCounterBefore = GetPerformanceCounter();
            for (int i = 0; i < numberOfIterations; i++)
            {
                ret = value1 / value2;
            } // for
            perfCounterAfter = GetPerformanceCounter();

            Console.WriteLine("Dummy test to init JIT: " +
                value1 + "/" + value2 + "=" + ret + " took " +
                ConvertToUs(perfCounterBefore, perfCounterAfter) + "us");

            // Test1: division
            perfCounterBefore = GetPerformanceCounter();
            for (int i = 0; i < numberOfIterations; i++)
            {
                ret = value1 / value2;
            } // for
            perfCounterAfter = GetPerformanceCounter();

            Console.WriteLine(value1 + "/" + value2 + "=" + ret + " took " +
                ConvertToUs(perfCounterBefore, perfCounterAfter) + "us");

            // Test2: multiplication by the precomputed reciprocal
            perfCounterBefore = GetPerformanceCounter();
            for (int i = 0; i < numberOfIterations; i++)
            {
                ret = value1 * value3;
            } // for
            perfCounterAfter = GetPerformanceCounter();

            Console.WriteLine(value1 + "*" + value3 + "=" + ret + " took " +
                ConvertToUs(perfCounterBefore, perfCounterAfter) + "us");

            // Test3: division plus multiplication
            perfCounterBefore = GetPerformanceCounter();
            for (int i = 0; i < numberOfIterations; i++)
            {
                ret = value2 * (value4 / value5);
            } // for
            perfCounterAfter = GetPerformanceCounter();

            Console.WriteLine(value2 + "*(" + value4 + "/" + value5 + ")=" + ret + " took " +
                ConvertToUs(perfCounterBefore, perfCounterAfter) + "us");

            // Test4: multiplication by the precomputed quotient
            perfCounterBefore = GetPerformanceCounter();
            for (int i = 0; i < numberOfIterations; i++)
            {
                ret = value2 * value6;
            } // for
            perfCounterAfter = GetPerformanceCounter();

            Console.WriteLine(value2 + "*" + value6 + "=" + ret + " took " +
                ConvertToUs(perfCounterBefore, perfCounterAfter) + "us");

            Console.ReadLine();
        } // Main(args)
    } // class Program
} // namespace TestAddMultPerformance


And obviously this is nothing to consider when coding normal algorithms, but it is never bad to know these things :)
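
For anyone who wants to repeat the measurement in C++, here is a rough sketch using std::chrono. It is not a port of the program above: the struct and function names are invented for this example, and the volatile variables are an assumption about how to stop the optimizer from folding or deleting the loop. Treat the numbers it produces as indicative only.

```cpp
#include <chrono>

// Hypothetical result type for this sketch.
struct Timing {
    float result;           // last value computed, so the work is observable
    long long nanoseconds;  // elapsed time for the whole loop
};

// Times `iterations` evaluations of op(). Writing to a volatile sink
// keeps the compiler from deleting the loop as dead code.
template <typename Op>
Timing time_op(Op op, long long iterations) {
    volatile float sink = 0.0f;
    const auto start = std::chrono::steady_clock::now();
    for (long long i = 0; i < iterations; ++i) {
        sink = op();
    }
    const auto stop = std::chrono::steady_clock::now();
    const auto elapsed =
        std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
    return Timing{sink, static_cast<long long>(elapsed.count())};
}

// Volatile operands force a fresh read each iteration, so the
// division cannot be computed once and reused.
Timing time_divide(float a, float b, long long iterations) {
    volatile float va = a, vb = b;
    return time_op([&] { return va / vb; }, iterations);
}

Timing time_multiply(float a, float b, long long iterations) {
    volatile float va = a, vb = b;
    return time_op([&] { return va * vb; }, iterations);
}
```

A call such as time_divide(3.0f, 2.0f, 1000000000LL) against time_multiply(3.0f, 0.5f, 1000000000LL) mirrors the first two tests. The volatile accesses add memory traffic of their own, so the gap between the two may look smaller than in the C# run.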

Quote:
Original post by raptorstrike
Yes, as stated above, the compiler will do LOW-LEVEL optimization; what you as a programmer must be concerned with is high-level optimization. For example:

while(I < 300)
    I++; ///<-- the compiler won't optimize this, obvious as the inefficiency is

as opposed to

I = 300;

This is a really simple example, but hopefully you get what I mean.

While your point is both valid and important, I'll bet that if you check, a good compiler does indeed optimize that.

CM
