assembly is slower ?! (source included)

39 comments, last by nosajghoul 20 years ago
OK, I'm new to assembly and just probing the territory. I made two functions that do exactly the same thing (add two numbers together), but one uses assembly and the other doesn't. (I'm coding in MSVC++ 6.0 on a P3 450 this evening, so your numbers will probably be better.) I record the start time and finish time around each loop, and each loop calls its function ten million times (literally). I then output the total times, which should (shouldn't it?) be a benchmark of sorts for which is faster. I thought assembly would beat straight C, but nope. My results were: assembly function = 1702, normal function = 1562. Normal was about .14 seconds faster. Why? And now for the code:

#include <iostream.h>
#include <windows.h> //for GetTickCount()

int x = 0xFFFF;
int y = 0xFFFF;
int r = 0; 

void __fastcall AssemAdd()
{
	_asm
	{
		mov eax, x
		mov ebx, y
		add eax, ebx
		mov r, eax
	}
}

void NormAdd()
{	
	r=x+y;
}

const int loops = 10000000; //10 million

void main()
{
	DWORD StartAssem = 0.0f;
	DWORD FinishAssem = 0.0f;
	DWORD TotalAssem = 0.0f;

	DWORD StartNorm = 0.0f;
	DWORD FinishNorm = 0.0f;
	DWORD TotalNorm = 0.0f;

	int x = 0; 
	
	//log starttime
	StartAssem = GetTickCount();
	
	for(x=0; x < loops; x++) 
		AssemAdd();
	
	//log finishtime
	FinishAssem = GetTickCount();
	//calculate totaltime
	TotalAssem = FinishAssem - StartAssem;

	//now for not assembly
	
	//log starttime
	StartNorm = GetTickCount();

	for(x=0; x < loops; x++)
		NormAdd();

	//log finishtime
	FinishNorm = GetTickCount();
	//calculate totaltime
	TotalNorm = FinishNorm - StartNorm;

	//output the times
	cout << endl << "Assem = " << TotalAssem;
	cout << endl << "Norm = " << TotalNorm;
	cout << endl;

}
 
Should work fine as is, in MSVC++ 6.0 at least. It's a console app, and when run (on my computer anyway) it seems to hang for a second (it's just doing 20 million things, that's all) and then outputs the time in milliseconds each loop took. Oh, and could someone tell me what __fastcall is? What's the technical term for those double-underscored commands so I can look it up? -Jason
normal_toes@hotmail.com
Inline assembly defeats the compiler's optimizer. You're saying "I know better" when you type asm, and it does everything exactly how you enter it. With C & C++ code, it just has to guarantee the result.

(PS iostream.h is deprecated, use iostream, no .h)
- The trade-off between price and quality does not exist in Japan. Rather, the idea that high quality brings on cost reduction is widely accepted.-- Tajima & Matsubara
Plus, that's not even the most efficient you can make that code. More efficient would be:

_asm
{
mov eax, x
add eax, y
mov r, eax
}

If I remember my basic x86 asm right.

Are you using compiler optimizations?
Anonymous :

Compiler optimizations? Looking into that now. As I said, I'm new at this as of today. I get it so far, haven't hit the wall yet.

I'm trying your suggestion (add eax, y / mov r, eax ...). Makes sense. I'll post the times in a few minutes.

-Jason
normal_toes@hotmail.com
quote:Original post by nosajghoul
Oh and could someone tell me what __fastcall is? Whats the technical term for those double underscored commands so I can look it up?


__fastcall, __cdecl, __pascal, etc. are all calling conventions. IIRC in __fastcall, the first two DWORD or smaller sized arguments are passed in ecx and edx, and the rest are pushed on the stack right to left. Since you have no parameters, that's really not giving you a performance boost.
lol, I did what anonymous said, plus I switched from debug to release. (doh!) Huge difference. Normal function = 200, assembly = 201.

Closer, much faster, but I was really looking for a simple way to show off the speed of assembly to myself. Whilst I learnt, I have opened more doors than I've shut.

There's a way (I hear) to compile and see C++ code alongside the assembly code in MSVC++, right? I looked in settings, don't think it's there... help?

-Jason
normal_toes@hotmail.com
Try looking at the listing files section of the output tab in your project settings. Or just look up the /FAs compiler switch.
go to:

view->debug->disassembly

As for __ (double underscore) keywords, they are generally compiler-specific language extensions.
DWORD StartAssem = 0.0f;
The ".0" and "f" are for floating point numbers; use "L" instead:
DWORD StartAssem = 0L;

It doesn't make a difference, I think, but it looks weird assigning floating point values to a long.

__fastcall is a Microsoft extension to C++.
It simply passes first two args in registers, and stacks the rest from right to left.
Since you are passing no args, it is pointless to use __fastcall (other than for name decoration reasons)

To make the asm code faster, simply put it in the loop, not in a separate function. You MIGHT get a better time if you change "void AssemAdd(void)" to
int __fastcall AssemAdd(const int x, const int y)
{
// this function returns the answer in eax (x arrives in ecx, y in edx)
_asm
{
xor eax, eax
add eax, edx
add eax, ecx
}
}
-OR- (same function, shorter body)
_asm
{
mov eax, ecx
add eax, edx
}

-OR-

void __fastcall AssemAdd(const int x, const int y)
{
// this will place the result in the variable "r"
_asm
{
mov r, ecx
add r, edx
}
}

There are A LOT of ways to do things in assembly.
Like Magmai Kai Holmlor said, when using inline assembly, you are saying you know more than the compiler, so you must be right. Assembly itself isn't faster than C++. It's what you do with it that makes it faster.

Intro Engine
The problem with learning asm for that purpose is that compilers keep getting better and better at translating C++ to assembly. Example: I used every optimization trick I knew for this one block of code on a program I worked on. Result? It worked faster in the debug mode (which doesn't use optimizations), but in the release build my "clever" optimizations short-circuited the compiler's optimizations, which were superior.

Nevertheless, you can almost always still find ways to squeeze more speed out of a block of code by converting to asm because no compiler is perfect, and you know exactly what you're trying to do... it's just a lot harder than if you're working on an obscure platform with a poor compiler.

This topic is closed to new replies.