Yes, I'm an optimization freak.
I just wanted to know: is the operation
3 / 2
or the operation
3 * .5
faster on an Intel-based processor?
Thanks for any suggestions!
(Oh, and please don't post anything about "it doesn't matter" unless it really doesn't matter.)
Edit:
To expand on that, is the operation
2 * (4 / 5)
faster than
2 * .8?
Edit:
Assume all values are stored as floats.
Well, with today's technology you could presume that it takes something like 1/1207833498789 of the processor's time, maybe less, lol. But under the circumstances, 2 * .8 would be faster because it only has to do ONE math operation, whereas 2 * (4/5) has to do two.
Not sure, but I think this is faster than both:
3 >> 1;
Of course that won't work with floats, but if you're that focused on optimization, there have to be sacrifices.
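To make the trade-off concrete, here is a minimal C++ sketch (the helper names are made up for illustration, not from the thread). It shows that the shift is an integer-only operation that drops the fractional part, so it answers a different question than the float division in the original post. Optimizing compilers will typically pick a shift on their own for integer division by a power of two, so writing it by hand rarely buys anything.

```cpp
// Hypothetical helpers, named only for this illustration.
constexpr int half_by_shift(int x) { return x >> 1; }           // integers only
constexpr int half_by_int_division(int x) { return x / 2; }     // same result for x >= 0
constexpr float half_by_float_division(float x) { return x / 2.0f; }

// For non-negative integers, the shift and the integer division agree.
static_assert(half_by_shift(3) == 1, "3 >> 1 drops the fractional part");
static_assert(half_by_int_division(3) == 1, "3 / 2 truncates to 1 for ints");

// For negative values the two can differ: -3 / 2 truncates toward zero,
// while -3 >> 1 rounds toward negative infinity on typical compilers.
```

Calling half_by_float_division(3.0f) gives 1.5f, which is what the original question was actually after.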
If you have something like this:
3 * (4/5)
and everything is a constant, not a variable, then the compiler "SHOULD" figure it out at compile time.
At run time, doing 4/5 would take longer than .8 because of the extra instruction, I think.
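As a rough illustration of the compile-time point, here is a small C++ sketch (all names are invented for this example). `constexpr` forces the expression to be evaluated by the compiler, and `static_assert` checks the folded value without running anything. It also shows a trap in the expression as literally written: with integer literals, 4/5 is integer division and yields 0, so the float suffixes matter.

```cpp
// Illustrative names only; nothing here comes from the original posts.
constexpr float folded = 3.0f * (4.0f / 5.0f);  // evaluated at compile time
constexpr int int_trap = 3 * (4 / 5);           // integer literals: 4 / 5 == 0

static_assert(folded > 2.39f && folded < 2.41f, "folded to roughly 2.4");
static_assert(int_trap == 0, "integer division truncates 4 / 5 to 0");

// At run time this just returns the precomputed constant.
float scaled() { return folded; }
```

With plain (non-constexpr) constants, an optimizing build will normally fold the expression the same way, but that is up to the compiler rather than guaranteed by the language.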
It doesn't matter. Every single change you suggest can be trivially performed by the compiler if needed. So, it doesn't matter. Write code that is easy to understand, and let people with more information than you worry about the details.
CM
Quote:Original post by geekalert
(Oh, and please don't post anything about "it doesn't matter" unless it really doesn't matter.)
It really doesn't matter.
Yes, as stated above, the compiler will do LOW-LEVEL optimization; the thing that you as a programmer must be concerned with is high-level optimization. For example:
while(I < 300)
I++; ///<-- the compiler won't optimize this, obvious as the inefficiency is
as opposed to
I = 300;
This is a really simple example, but hopefully you get what I mean.
Quote:Original post by raptorstrike
Yes, as stated above, the compiler will do LOW-LEVEL optimization; the thing that you as a programmer must be concerned with is high-level optimization. For example:
while(I < 300)
I++; ///<-- the compiler won't optimize this, obvious as the inefficiency is
as opposed to
I = 300;
This is a really simple example, but hopefully you get what I mean.
This is even a bad example: the compiler will reduce that to i = 300.
(Tested under VC Toolkit 2k3.)
Cheers,
Pat.
PS: Micro-optimisation is pointless unless the profiler output suggests doing it.
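For anyone who wants to check this on their own compiler, here is a minimal C++ sketch of the two forms (the function names are hypothetical). One detail worth spelling out: the loop only equals a plain assignment of 300 when the counter starts below 300, so the closed form needs a condition. Comparing the generated assembly of both functions in an optimized build is the way to confirm what a given compiler actually does.

```cpp
// Hypothetical functions for illustration only.

// The loop from the example: counts up until the value reaches 300.
int count_up(int i) {
    while (i < 300)
        i++;
    return i;
}

// The closed form an optimizer can reduce the loop to.
// Values that already are 300 or more pass through unchanged.
int direct(int i) {
    return i < 300 ? 300 : i;
}
```

Both functions return the same value for every input; only the amount of work written in the source differs.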
Back to the topic. If you really have float variables and not constants, there is a HUGE difference in performance! As you would expect, the calculation with three numbers is the slowest, and multiplication should be faster than division.
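A common way to exploit that difference, sketched here in C++ with made-up helper names, is to compute the reciprocal once and multiply inside the loop. The caveat is that the two versions are only guaranteed to give bit-identical results when the reciprocal is exactly representable (for example 1/2 = 0.5). For a divisor like 5 the results can differ in the last bit, which is why compilers generally do not make this substitution themselves unless relaxed floating-point options are enabled.

```cpp
#include <vector>

// Hypothetical helpers: both divide every element by the same divisor.

// One division per element.
std::vector<float> scale_by_division(std::vector<float> values, float divisor) {
    for (float& x : values)
        x = x / divisor;
    return values;
}

// One division in total, then one multiplication per element.
std::vector<float> scale_by_reciprocal(std::vector<float> values, float divisor) {
    const float reciprocal = 1.0f / divisor;  // hoisted out of the loop
    for (float& x : values)
        x = x * reciprocal;
    return values;
}
```

Whether the second version is worth it depends on the hardware and on how hot the loop is, so it is best treated as something to measure rather than assume.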
I wrote a little test app doing each of your calculations 1 billion times, here is the result:
3/2=1,5 took 3662791ns
3*0,5=1,5 took 1371114ns
2*(4/5)=1,6 took 7788465ns
2*0,8=1,6 took 1366573ns
And the source code (C#, but test it in C++ or assembler if you want to ...):
// Project: TestAddMultPerformance, File: Program.cs
// Namespace: TestAddMultPerformance, Class: Program
// Path: C:\code\TestAddMultPerformance, Author: Abi
// Code lines: 107, Size of file: 2,96 KB
// Creation date: 30.12.2005 07:16
// Last modified: 30.12.2005 07:25
// Generated with Commenter by abi.exDream.com

#region Using directives
using System;
using System.Collections.Generic;
using System.Text;
using System.Runtime.InteropServices;
#endregion

namespace TestAddMultPerformance
{
    /// <summary>
    /// Program
    /// </summary>
    class Program
    {
        #region Performance counters and getting ns time
        /// <summary>
        /// Query performance (high resolution) timer frequency
        /// </summary>
        /// <param name="lpFrequency">current frequency</param>
        [System.Security.SuppressUnmanagedCodeSecurity]
        [DllImport("Kernel32.dll")]
        [return: MarshalAs(UnmanagedType.Bool)]
        internal static extern bool QueryPerformanceFrequency(
            out long lpFrequency);

        /// <summary>
        /// Query performance (high resolution) timer counter
        /// </summary>
        /// <param name="lpCounter">current counter value</param>
        [System.Security.SuppressUnmanagedCodeSecurity]
        [DllImport("Kernel32.dll")]
        [return: MarshalAs(UnmanagedType.Bool)]
        internal static extern bool QueryPerformanceCounter(
            out long lpCounter);

        /// <summary>
        /// Get current performance timer frequency
        /// (using QueryPerformanceFrequency)
        /// </summary>
        public static long GetPerformanceFrequency()
        {
            long l;
            QueryPerformanceFrequency(out l);
            return l;
        } // GetPerformanceFrequency()

        /// <summary>
        /// Get current performance timer counter value
        /// (using QueryPerformanceCounter)
        /// </summary>
        public static long GetPerformanceCounter()
        {
            long l;
            QueryPerformanceCounter(out l);
            return l;
        } // GetPerformanceCounter()

        /// <summary>
        /// Remember the frequency
        /// </summary>
        public static long performanceFrequency = GetPerformanceFrequency();

        /// <summary>
        /// Convert performance counter value to ns.
        /// Note: the frequency is in ticks per second, so the factor
        /// 1000000 actually yields microseconds, despite the "ns" naming.
        /// </summary>
        /// <param name="perfCounter">Counter difference from 2 values</param>
        static public int ConvertToNs(long perfCounter)
        {
            return (int)(perfCounter * 1000000 / performanceFrequency);
        } // ConvertToNs(perfCounter)

        /// <summary>
        /// Convert performance counter value difference
        /// (perfCounter2-perfCounter1) to ns.
        /// Same unit caveat as above: the result is in microseconds.
        /// </summary>
        static public int ConvertToNs(long perfCounter1, long perfCounter2)
        {
            return (int)((perfCounter2 - perfCounter1) * 1000000 /
                performanceFrequency);
        } // ConvertToNs(perfCounter1, perfCounter2)
        #endregion

        static void Main(string[] args)
        {
            // Declare everything here in case this eats up cycles
            float ret = 0.0f;
            float value1 = 3;
            float value2 = 2;
            float value3 = 1 / value2;
            float value4 = 4;
            float value5 = 5;
            float value6 = value4 / value5;
            long perfCounterBefore, perfCounterAfter;

            // Do 1 billion iterations.
            int numberOfIterations = 1000 * 1000 * 1000;

            // Just call every method once to make sure we don't count any JIT time.
            perfCounterBefore = GetPerformanceCounter();
            for (int i = 0; i < numberOfIterations; i++)
            {
                ret = value1 / value2;
            } // for
            perfCounterAfter = GetPerformanceCounter();
            Console.WriteLine("Dummy test to init JIT: " +
                value1 + "/" + value2 + "=" + ret + " took " +
                ConvertToNs(perfCounterBefore, perfCounterAfter) + "ns");

            // Test1
            perfCounterBefore = GetPerformanceCounter();
            for (int i = 0; i < numberOfIterations; i++)
            {
                ret = value1 / value2;
            } // for
            perfCounterAfter = GetPerformanceCounter();
            Console.WriteLine(value1 + "/" + value2 + "=" + ret + " took " +
                ConvertToNs(perfCounterBefore, perfCounterAfter) + "ns");

            // Test2
            perfCounterBefore = GetPerformanceCounter();
            for (int i = 0; i < numberOfIterations; i++)
            {
                ret = value1 * value3;
            } // for
            perfCounterAfter = GetPerformanceCounter();
            Console.WriteLine(value1 + "*" + value3 + "=" + ret + " took " +
                ConvertToNs(perfCounterBefore, perfCounterAfter) + "ns");

            // Test3
            perfCounterBefore = GetPerformanceCounter();
            for (int i = 0; i < numberOfIterations; i++)
            {
                ret = value2 * (value4 / value5);
            } // for
            perfCounterAfter = GetPerformanceCounter();
            Console.WriteLine(value2 + "*(" + value4 + "/" + value5 + ")=" +
                ret + " took " +
                ConvertToNs(perfCounterBefore, perfCounterAfter) + "ns");

            // Test4
            perfCounterBefore = GetPerformanceCounter();
            for (int i = 0; i < numberOfIterations; i++)
            {
                ret = value2 * value6;
            } // for
            perfCounterAfter = GetPerformanceCounter();
            Console.WriteLine(value2 + "*" + value6 + "=" + ret + " took " +
                ConvertToNs(perfCounterBefore, perfCounterAfter) + "ns");

            Console.ReadLine();
        } // Main(args)
    } // class Program
} // namespace TestAddMultPerformance
And obviously this is nothing to consider when coding normal algorithms, but it is never bad to know these things :)
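For anyone who wants to try the same measurement in C++, here is a rough sketch under a few assumptions. The `Timing` struct and the function names are invented for this example, `std::chrono::steady_clock` stands in for the Windows performance counter, and `volatile` is used so the optimizer cannot fold the constants or delete the loop. The volatile loads and stores add memory traffic of their own, so the numbers include more than just the arithmetic and should be read as a comparison, not as exact instruction costs.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical result type: elapsed time plus the last computed value.
struct Timing {
    std::int64_t nanoseconds;
    float result;
};

// Runs op() the requested number of times and measures the elapsed time.
template <typename Op>
Timing time_loop(std::int64_t iterations, Op op) {
    volatile float sink = 0.0f;  // volatile store keeps the loop alive
    const auto start = std::chrono::steady_clock::now();
    for (std::int64_t i = 0; i < iterations; ++i) {
        sink = op();
    }
    const auto stop = std::chrono::steady_clock::now();
    const auto elapsed =
        std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
    return Timing{static_cast<std::int64_t>(elapsed.count()), sink};
}

// 3 / 2 with operands the compiler must re-read on every iteration.
Timing time_division(std::int64_t iterations) {
    volatile float a = 3.0f, b = 2.0f;
    return time_loop(iterations, [&] { return a / b; });
}

// 3 * 0.5 under the same conditions.
Timing time_multiplication(std::int64_t iterations) {
    volatile float a = 3.0f, b = 0.5f;
    return time_loop(iterations, [&] { return a * b; });
}
```

Calling time_division(1000000000) and time_multiplication(1000000000) and printing the two nanoseconds fields would mirror Test1 and Test2 above. Running each a few times and ignoring the first run helps to smooth out warm-up effects.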
Quote:Original post by raptorstrike
Yes, as stated above, the compiler will do LOW-LEVEL optimization; the thing that you as a programmer must be concerned with is high-level optimization. For example:
while(I < 300)
I++; ///<-- the compiler won't optimize this, obvious as the inefficiency is
as opposed to
I = 300;
This is a really simple example, but hopefully you get what I mean.
While your point is both valid and important, I'll bet that if you check, a good compiler does indeed optimize that.
CM