Visual C++ 2008 producing very strange code

I was looking at some assembler output and noticed the following. This line of code (a and b are floats)
return a < b ? a : b;
creates the following output

movss	xmm0, DWORD PTR _a$[esp+8]
movss	xmm1, DWORD PTR _b$[esp+8]
cvtps2pd xmm0, xmm0
cvtps2pd xmm1, xmm1
comisd	xmm1, xmm0
lea	eax, DWORD PTR _a$[esp+8]
ja	SHORT $LN6@main
lea	eax, DWORD PTR _b$[esp+8]
$LN6@main:

I find it very strange that two single-precision floats are first expanded into double precision floats before being compared. Why isn't comiss used instead of comisd? This would save two instructions and would generally seem to be far more efficient.
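For reference, the listing above comes from compiling something along these lines (the function name float_min and the switches /O2 /arch:SSE2 are my assumptions, not from the post; the body is exactly the line quoted above):

float float_min(float a, float b)
{
	return a < b ? a : b;
}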
Compilers cannot understand a programmer's context and will rarely produce the best code for every situation.
Obviously, but in this case the compiler knows a and b are floats. Why does it expand them to doubles??

If it understood intent it should produce something along the lines of:
movss xmm0, a
movss xmm1, b
minss xmm0, xmm1
movss retval, xmm0
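For what it's worth, the single-precision path can be forced by hand with the SSE intrinsics. A minimal sketch (my own addition, not from the thread; assumes <xmmintrin.h> and an SSE-capable target), which compilers generally turn into the minss shown above:

#include <xmmintrin.h>

// min via minss: like a < b ? a : b, the second operand is returned
// when the comparison is false.
inline float sse_min(float a, float b)
{
	float result;
	_mm_store_ss(&result, _mm_min_ss(_mm_set_ss(a), _mm_set_ss(b)));
	return result;
}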
Perhaps it is willing to make the tradeoff in favor of execution performance versus space. In other words, the conversion to double may yield better performance in hardware than using float.
Why would it? The instruction set has the exact same comparison in a single-precision variant, and using it would take two fewer instructions (the conversions to double precision).
Most likely the compiler recognizes the pattern and just provides the one solution it knows. The developers might have reasoned that the expansion is practically "free".

No no no no! :)
Well, it isn't:

	float a;
	float b;
	float res;
	std::cin >> a;
	std::cin >> b;
	sf::Clock clock;
	clock.Reset();
	for(unsigned int i = 0; i < 10000000; ++i)
	{
		__asm
		{
			movss xmm0, a;
			movss xmm1, b;
			movss xmm2, xmm0;
			movss xmm3, xmm1;
			cvtps2pd xmm0, xmm0;
			cvtps2pd xmm1, xmm1;
			comisd xmm1, xmm0;
			movss res, xmm2;
			ja min_is_a;
			movss res, xmm3;
min_is_a:
		}
	}
	float time1 = clock.GetElapsedTime();
	clock.Reset();
	for(unsigned int i = 0; i < 10000000; ++i)
	{
		__asm
		{
			movss xmm0, a;
			movss xmm1, b;
			movss xmm2, xmm0;
			movss xmm3, xmm1;
			comiss xmm1, xmm0;
			movss res, xmm2;
			ja min_is_a2;
			movss res, xmm3;
min_is_a2:
		}
	}
	float time2 = clock.GetElapsedTime();
	std::cout << res << "\n";
	std::cout << time1 << "\n";
	std::cout << time2 << "\n";


produces

0.0515853
0.0381895

on my PC, so the expansion is anything but free (0.0516 s vs. 0.0382 s, roughly 35% slower).
It probably doesn't matter because it's only a few cycles lost, and that code won't normally run a billion times per second, but have you tried std::min or the intrinsic min() function?
I'm not using Visual C++, so I have no idea about that one, but under gcc such functions usually get an optimal implementation, about as good as (and sometimes better than) hand-written assembly.

Having said that, I gave up going anywhere near assembler quite a while ago, because it just isn't worth it any more. For anything longer than 3-4 isolated instructions, compiler output with full optimization is rarely even a few cycles slower than what you could code in assembler, and is usually as fast or faster. Also, writing the C++ takes only about 5% of the time, and the code is a lot easier to manage and debug.
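To make the std::min suggestion concrete, here is the trivial wrapper one could drop into the benchmark above (my sketch, not from the thread; whether it ends up as comiss, minss, or the double-promoted sequence depends on the compiler and its floating-point switches, e.g. /arch:SSE2 and /fp:fast on MSVC):

#include <algorithm>

// std::min(a, b) returns the smaller of the two values
// (the first argument when they compare equal).
float std_min(float a, float b)
{
	return std::min(a, b);
}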
Run that benchmark 10k times and tell us the averages. I trust that you told your compiler to generate optimized code.
Quote:Original post by thedustbustr
Run that benchmark 10k times and tell us the averages. I trust that you told your compiler to generate optimized code.


Right... and yes.

I ran it 5 times and the differences were in the 5th decimal place. Good enough for me.
