Yes I'm an optimization freak

Started by
33 comments, last by intransigent-seal 18 years, 3 months ago
I just wanted to know: is the operation 3 / 2 or the operation 3 * .5 faster on an Intel-based processor? Thanks for any suggestions! (Oh, and please don't post anything about "it doesn't matter" unless it really doesn't matter.) Edit: To expand on that, is the operation 2 * (4 / 5) faster than 2 * .8? Edit: Assume all values are stored as floats.
A JPEG is worth a thousand and twenty four DWORD's. ;)
Well, with today's technology you could presume that either one takes 1/1207833498789 of the processor, maybe less, lol... But given the circumstances, 2 * .8 would be faster because it only has to do ONE mathematical operation, whereas 2 * (4/5) has to do two.
Not sure, but I think this is faster than both:

3 >> 1;

Of course that won't work with floats, but if you're that focused on optimization, there have to be sacrifices.
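
To make the trade-off concrete, here is a minimal C++ sketch of what the shift actually computes. The function names are made up for illustration. The shift is integer halving, so 3 >> 1 gives 1 rather than 1.5, while the float version keeps the fraction.

```cpp
#include <cassert>

// Halve an integer with a shift. The fractional part is discarded, so this
// matches integer division by 2 for non-negative n.
int half_by_shift(int n) {
    assert(n >= 0);  // negative values are deliberately left out of this sketch
    return n >> 1;
}

// Halve a float with a multiply. 0.5f is exactly representable, so this
// gives the same value as n / 2.0f.
float half_by_multiply(float n) {
    return n * 0.5f;
}
```

So half_by_shift(3) answers a different question than 3 / 2 on floats. Whether the shift is any faster than an integer divide by a constant is something only a measurement on the target machine can settle, since compilers usually emit a shift for that case on their own.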
____________________________"This just in, 9 out of 10 americans agree that 1 out of 10 americans will disagree with the other 9"- Colin Mochrie
If you have something like this:

3 * (4/5)

And everything is a constant, not a variable, then the compiler "SHOULD" figure it out at compile time.

At run time, doing 4/5 would take longer than .8 because of the extra instruction, I think.
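
A small C++ sketch of the compile-time case, assuming a C++11 compiler. The names kFloatResult, kIntResult and scale are invented for the example. constexpr forces the evaluation to happen at compile time, and the integer variant shows a pitfall: with integer literals, 4 / 5 is 0.

```cpp
#include <cassert>

// Float literals: the whole expression is evaluated by the compiler.
constexpr float kFloatResult = 3.0f * (4.0f / 5.0f);

// Integer literals: 4 / 5 is integer division and yields 0.
constexpr int kIntResult = 3 * (4 / 5);

static_assert(kIntResult == 0, "integer 4 / 5 truncates to 0");
static_assert(kFloatResult > 2.39f && kFloatResult < 2.41f,
              "float version is roughly 2.4");

// With a variable operand, the constant subexpression (4.0f / 5.0f) can
// still be folded, leaving a single multiply at run time.
float scale(float x) {
    return x * (4.0f / 5.0f);
}
```

Whether a given compiler folds the subexpression in scale without constexpr depends on the compiler and its settings, so check the generated assembly if it matters.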
It doesn't matter. Every single change you suggest can be trivially performed by the compiler if needed. So, it doesn't matter. Write code that is easy to understand, and let people with more information than you worry about the details.

CM
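
One caveat worth checking for yourself: for floats, x / d and x * (1 / d) are not always bit-identical, which is why compilers tend to make that swap on their own only when the reciprocal is exact (such as 0.5) or when flags like GCC's -ffast-math permit it. The sketch below is a hypothetical helper, not anything from this thread, that counts how often the two forms disagree.

```cpp
#include <cassert>

// Counts the integers x in [1, limit] for which x / d and x * (1 / d)
// produce different float results.
int count_mismatches(float d, int limit) {
    const float reciprocal = 1.0f / d;
    int mismatches = 0;
    for (int i = 1; i <= limit; ++i) {
        const float x = static_cast<float>(i);
        const float divided = x / d;
        const float multiplied = x * reciprocal;
        if (divided != multiplied) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

count_mismatches(2.0f, 1000) returns 0 because 0.5f is exact. For divisors such as 3.0f or 5.0f the count can be non-zero, so the two forms are close but not interchangeable when exact results matter.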
Quote:Original post by geekalert
(Oh, and please don't post anything about "it doesn't matter" unless it really doesn't matter.)


It really doesn't matter.
Yes, as stated above, the compiler will do LOW LEVEL optimization; the thing that you as a programmer must be concerned with is high-level optimization. For example:

while(I < 300)
I++; ///<-- the compiler won't optimize this, obvious as the inefficiency is

as opposed to

I = 300;

This is a really simple example, but hopefully you get what I mean.
Quote:Original post by raptorstrike
Yes, as stated above, the compiler will do LOW LEVEL optimization; the thing that you as a programmer must be concerned with is high-level optimization. For example:

while(I < 300)
I++; ///<-- the compiler won't optimize this, obvious as the inefficiency is

as opposed to

I = 300;

This is a really simple example, but hopefully you get what I mean.

This is actually a bad example: the compiler will reduce that to I = 300.
(Tested with the VC toolkit 2k3.)

Cheers,
Pat.

PS: Micro-optimisation is pointless unless profiler output suggests doing it.
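
For anyone who wants to try the loop example themselves, here is a minimal C++ sketch with hypothetical function names. It also shows a detail that is easy to miss: the loop is only the same as I = 300 when I starts at or below 300. In general it computes "at least 300".

```cpp
#include <cassert>

// The loop from the example: counts up until the value reaches 300.
int count_up(int i) {
    while (i < 300) {
        ++i;
    }
    return i;
}

// The closed form an optimizer can reduce the loop to.
int closed_form(int i) {
    return i < 300 ? 300 : i;
}
```

Comparing the assembly for the two functions at a normal optimization level is a quick way to see whether your own compiler collapses the loop.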

Back to the topic. If you really have float variables and not constants, there is a HUGE difference in performance! Obviously the calculation with 3 numbers is the slowest, and multiplication should be faster than division.

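I wrote a little test app doing each of your calculations 1 billion times. Here is the result (the output is labelled "ns", but the conversion in the code multiplies by 1,000,000, so the figures are really microseconds):
3/2=1,5     took 3662791ns
3*0,5=1,5   took 1371114ns
2*(4/5)=1,6 took 7788465ns
2*0,8=1,6   took 1366573ns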


And the source code (C#, but test it in C++ or assembler if you want to ...):
// Project: TestAddMultPerformance, File: Program.cs
// Namespace: TestAddMultPerformance, Class: Program
// Path: C:\code\TestAddMultPerformance, Author: Abi
// Code lines: 107, Size of file: 2,96 KB
// Creation date: 30.12.2005 07:16
// Last modified: 30.12.2005 07:25
// Generated with Commenter by abi.exDream.com

#region Using directives
using System;
using System.Collections.Generic;
using System.Text;
using System.Runtime.InteropServices;
#endregion

namespace TestAddMultPerformance
{
	/// <summary>
	/// Program
	/// </summary>
	class Program
	{
		#region Performance counters and getting ns time
		/// <summary>
		/// Query performance (high resolution) timer frequency
		/// </summary>
		/// <param name="lpFrequency">current frequency</param>
		[System.Security.SuppressUnmanagedCodeSecurity]
		[DllImport("Kernel32.dll")]
		[return: MarshalAs(UnmanagedType.Bool)]
		internal static extern bool QueryPerformanceFrequency(
			out long lpFrequency);

		/// <summary>
		/// Query performance (high resolution) timer counter
		/// </summary>
		/// <param name="lpCounter">current counter value</param>
		[System.Security.SuppressUnmanagedCodeSecurity]
		[DllImport("Kernel32.dll")]
		[return: MarshalAs(UnmanagedType.Bool)]
		internal static extern bool QueryPerformanceCounter(
			out long lpCounter);

		/// <summary>
		/// Get current performance timer frequency
		/// (using QueryPerformanceFrequency)
		/// </summary>
		public static long GetPerformanceFrequency()
		{
			long l;
			QueryPerformanceFrequency(out l);
			return l;
		} // GetPerformanceFrequency()

		/// <summary>
		/// Get current performance timer counter value
		/// (using QueryPerformanceCounter)
		/// </summary>
		public static long GetPerformanceCounter()
		{
			long l;
			QueryPerformanceCounter(out l);
			return l;
		} // GetPerformanceCounter()

		/// <summary>
		/// Remember the frequency
		/// </summary>
		public static long performanceFrequency = GetPerformanceFrequency();

		/// <summary>
		/// Convert performance counter value to ns.
		/// Note: ticks * 1000000 / frequency is actually microseconds,
		/// despite the "ns" naming used here and in the output.
		/// </summary>
		/// <param name="perfCounter">Counter difference from 2 values</param>
		static public int ConvertToNs(long perfCounter)
		{
			return (int)(perfCounter * 1000000 / performanceFrequency);
		} // ConvertToNs(perfCounter)

		/// <summary>
		/// Convert performance counter value difference
		/// (perfCounter2-perfCounter1) to ns (see the note above).
		/// </summary>
		static public int ConvertToNs(long perfCounter1, long perfCounter2)
		{
			return (int)((perfCounter2 - perfCounter1) *
				1000000 / performanceFrequency);
		} // ConvertToNs(perfCounter1, perfCounter2)
		#endregion

		static void Main(string[] args)
		{
			// Declare everything here in case this eats up cycles
			float ret = 0.0f;
			float value1 = 3;
			float value2 = 2;
			float value3 = 1 / value2;
			float value4 = 4;
			float value5 = 5;
			float value6 = value4 / value5;
			long perfCounterBefore, perfCounterAfter;
			// Do 1 billion iterations.
			int numberOfIterations = 1000 * 1000 * 1000;

			// Just call every method once to make sure we don't count any JIT time.
			perfCounterBefore = GetPerformanceCounter();
			for (int i = 0; i < numberOfIterations; i++)
			{
				ret = value1 / value2;
			} // for
			perfCounterAfter = GetPerformanceCounter();
			Console.WriteLine("Dummy test to init JIT: "+
				value1 + "/" + value2 + "=" + ret + " took " +
				ConvertToNs(perfCounterBefore, perfCounterAfter) + "ns");

			// Test1
			perfCounterBefore = GetPerformanceCounter();
			for (int i = 0; i < numberOfIterations; i++)
			{
				ret = value1 / value2;
			} // for
			perfCounterAfter = GetPerformanceCounter();
			Console.WriteLine(value1 + "/" + value2 + "=" + ret + " took " +
				ConvertToNs(perfCounterBefore, perfCounterAfter) + "ns");

			// Test2
			perfCounterBefore = GetPerformanceCounter();
			for (int i = 0; i < numberOfIterations; i++)
			{
				ret = value1 * value3;
			} // for
			perfCounterAfter = GetPerformanceCounter();
			Console.WriteLine(value1 + "*" + value3 + "=" + ret + " took " +
				ConvertToNs(perfCounterBefore, perfCounterAfter) + "ns");

			// Test3
			perfCounterBefore = GetPerformanceCounter();
			for (int i = 0; i < numberOfIterations; i++)
			{
				ret = value2 * (value4 / value5);
			} // for
			perfCounterAfter = GetPerformanceCounter();
			Console.WriteLine(value2 + "*(" + value4 + "/" + value5 + ")=" + ret + " took " +
				ConvertToNs(perfCounterBefore, perfCounterAfter) + "ns");

			// Test4
			perfCounterBefore = GetPerformanceCounter();
			for (int i = 0; i < numberOfIterations; i++)
			{
				ret = value2 * value6;
			} // for
			perfCounterAfter = GetPerformanceCounter();
			Console.WriteLine(value2 + "*" + value6 + "=" + ret + " took " +
				ConvertToNs(perfCounterBefore, perfCounterAfter) + "ns");

			Console.ReadLine();
		} // Main(args)
	} // class Program
} // namespace TestAddMultPerformance


And obviously this is nothing to consider when coding normal algorithms, but it is never bad to know these things :)
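
For those who would rather test in C++, here is a rough sketch of the same idea using std::chrono. It is not a port of the C# program above. Timing, time_loop, time_divide and time_multiply are names invented for this example, and the volatile variables are an assumption about how to stop the optimizer from folding the arithmetic or deleting the loop.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Result of one timed run: the last computed value and the elapsed time.
struct Timing {
    float result;
    std::int64_t nanoseconds;
};

// Runs op() the given number of times and measures the wall-clock time.
// The volatile sink forces every result to be stored, so the loop body
// cannot be optimized away.
template <typename Op>
Timing time_loop(Op op, int iterations) {
    volatile float sink = 0.0f;
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        sink = op();
    }
    const auto stop = std::chrono::steady_clock::now();
    const auto elapsed =
        std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
    Timing timing;
    timing.result = sink;
    timing.nanoseconds = static_cast<std::int64_t>(elapsed.count());
    return timing;
}

// 3 / 2 with operands the compiler must reload on every iteration.
Timing time_divide(int iterations) {
    volatile float a = 3.0f;
    volatile float b = 2.0f;
    return time_loop([&] { return a / b; }, iterations);
}

// 3 * 0.5 with operands the compiler must reload on every iteration.
Timing time_multiply(int iterations) {
    volatile float a = 3.0f;
    volatile float b = 0.5f;
    return time_loop([&] { return a * b; }, iterations);
}
```

Call time_divide and time_multiply with the same iteration count and compare the nanoseconds fields. The volatile loads and stores add overhead of their own, so treat the numbers as a relative comparison at best, and expect them to vary by CPU, compiler and flags.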
Microsoft DirectX MVP. My Blog: abi.exdream.com
Quote:Original post by raptorstrike
Yes, as stated above, the compiler will do LOW LEVEL optimization; the thing that you as a programmer must be concerned with is high-level optimization. For example:

while(I < 300)
I++; ///<-- the compiler won't optimize this, obvious as the inefficiency is

as opposed to

I = 300;

This is a really simple example, but hopefully you get what I mean.

While your point is both valid and important, I'll bet that if you check, a good compiler does indeed optimize that.

CM

