Shouldn't inline functions be faster?

Started by
15 comments, last by alh420 11 years ago

So there is an exercise that says I should write a program to measure which function is faster: a normal one or an inline one.

Here's the code:



#include <iostream>
#include <string>
#include <assert.h>

#include <ctime>
using namespace std;

clock_t t;  // holds the start time before each call, then the elapsed ticks

inline void f1() {
    t = clock() - t;
    if (t != 0)
        cout << "f1 " << t << endl;
}

void f2() {
    t = clock() - t;
    if (t != 0)
        cout << "f2 " << t << endl;
}

int main() {
    for (int i = 0; i < 10000; i++) {
        t = clock();
        f1();
    }
    for (int i = 0; i < 10000; i++) {
        t = clock();
        f2();
    }
}

The problem is that f1 is printed 4 times and f2 only 2 times. As you can see, I set the functions up so that they print only if the time difference is greater than 0.

Why does the inline function get executed slower?! Shouldn't it be faster?

Using the inline keyword doesn't guarantee that the compiler will inline a function, and not using it doesn't mean that the compiler won't inline it. The inline keyword is a hint that you give the compiler, and most modern compilers will ignore that hint and do their own thing anyway. With a modern compiler, about the only difference the keyword makes is that it allows you to put the definition of an inline function in a header.

It should be faster, not by much but a little; if you call them many times you should see a slight increase in performance. However, the compiler is free to choose whether or not to inline, regardless of whether you declare the function inline. For example, with optimisations on, your compiler may inline functions that you didn't declare inline, and it may also choose not to inline ones that you did.

That said, your clock probably doesn't have the resolution needed for this test. Try using C++11's std::chrono::high_resolution_clock instead (<chrono>). Or, if you're not using C++11, try the Win32 QueryPerformanceCounter functions.


That said, your clock probably doesn't have the resolution needed for this test. Try using C++11's std::chrono::high_resolution_clock instead (<chrono>). Or, if you're not using C++11, try the Win32 QueryPerformanceCounter functions.

If you're using Visual Studio 12, then high_resolution_clock won't work unless they've fixed it in one of the updates. It shipped with a placeholder version that used a low-resolution clock. If you're on Windows/Visual Studio 12, you should use QueryPerformanceCounter instead.
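The timing advice above can be sketched roughly as follows. This is only an illustration, assuming a standard library whose std::chrono::steady_clock is backed by a high-resolution timer; the names add_inline, add_plain and time_batch_ns are made up for the example and are not the original f1/f2. The key change from the original test is that the whole loop is timed as one batch instead of timing each call separately, so the clock's tick size matters far less.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical workloads for illustration only (not the f1/f2 from the question).
inline std::uint64_t add_inline(std::uint64_t acc, std::uint64_t i) {
    return acc + (i ^ (acc >> 3));
}

std::uint64_t add_plain(std::uint64_t acc, std::uint64_t i) {
    return acc + (i ^ (acc >> 3));
}

// Runs fn `iterations` times as ONE timed batch and returns elapsed nanoseconds.
// Assumption: steady_clock has enough resolution on the target platform.
template <typename Fn>
long long time_batch_ns(Fn fn, std::uint64_t iterations, std::uint64_t& result) {
    const auto start = std::chrono::steady_clock::now();
    std::uint64_t acc = 0;
    for (std::uint64_t i = 0; i < iterations; ++i) {
        acc = fn(acc, i);
    }
    const auto stop = std::chrono::steady_clock::now();
    result = acc;  // hand the value back so the loop is less likely to be optimised away
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Calling time_batch_ns(add_inline, 10000000, r1) and time_batch_ns(add_plain, 10000000, r2) several times and comparing the results gives a more meaningful number than per-call clock() differences. With optimisations enabled, the two timings may well be indistinguishable, since the compiler is free to treat both functions the same way.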

Somewhat related are the contents of this thread; in particular, my last post and the one preceding it touch a little on why inlining and other micro-optimizations aren't so simple to reason about. From there you can infer why it's a good thing that the inline keyword is only a hint to the compiler rather than a direct command.


I did my own test on inlining functions a while back: I compiled a similar test exe and disassembled it to look at the actual output from the compiler, only to find that the code for both functions, and for calling them, was identical. In other words, there was no discernible difference between using inline and not using it at the machine code level. I assume all of this is the result of what everybody is saying here.

So, I just #define function-like macros to produce virtually the same result as inlining functions. At least, from what I've read, it's virtually the same as inlining functions. Either way, it works how I want it to at the machine level.

If you are using Microsoft Visual Studio, you can use the '__forceinline' keyword to force inlining, whereas 'inline' just gives the compiler a "hint". Inlining a function removes the call/ret overhead generated by the compiler, but creates a larger executable image. The only time I use inlining is in small, tight loops where the overhead would cost too many CPU cycles; otherwise I rely on the compiler to make the right decision.

If you are using Microsoft Visual Studio, you can use the '__forceinline' keyword to force inlining

Actually you can't. Even __forceinline is just a strong suggestion. From the MSDN documentation on __forceinline:

The compiler treats the inline expansion options and keywords as suggestions. There is no guarantee that functions will be inlined. You cannot force the compiler to inline a particular function, even with the __forceinline keyword.
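For anyone who still wants to make the strongest request each compiler offers, a common pattern looks something like the sketch below. FORCE_INLINE and clamp_to_byte are hypothetical names chosen for this example. As the documentation quoted above says for __forceinline, this remains a request and not a guarantee.

```cpp
// Hypothetical wrapper macro: maps to the strongest inlining request
// that each compiler family provides, falling back to plain 'inline'.
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#else
#  define FORCE_INLINE inline
#endif

// Small function of the kind where call overhead could dominate the work done.
FORCE_INLINE int clamp_to_byte(int v) {
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}
```

The function behaves exactly the same whether or not the compiler honours the request, so the only way to see what actually happened is to inspect the generated assembly.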

Function inlining can cost you performance because it increases the code size and thus the strain on the I-cache.

However if used correctly it can significantly increase the performance for two reasons:

1. The compiler can optimize across the boundaries of the function. In your case you won't be seeing this, because there is virtually nothing outside the function except for the loop.

2. You don't need the instructions for calling the function, passing parameters, creating a stack frame, etc. You only see a benefit from this if that overhead is actually a large percentage of what the function does. This is typically the case for getters and setters, which would boil down to about 1 instruction if not for that overhead. However, in your case you are doing a syscall in that function worth a couple of thousand instructions, so the overhead of not inlining is insignificant.

Bottom line, in the scenario you chose, inlining should not give you any visible performance advantage.

So, I just #define function-like macros to produce virtually the same result as inlining functions. At least, from what I've read, it's virtually the same as inlining functions. Either way, it works how I want it to at the machine level.

I would recommend that no one do this, unless they have hard performance data to back up the change.

When your compiler chooses to ignore the 'inline' suggestion, it generally has a good reason why...
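Apart from performance, one concrete reason to be wary of function-like macros is that they paste their argument text wherever it appears, so an argument with side effects can be evaluated more than once. A minimal sketch, with all names (SQUARE_MACRO, square_fn, next_value, and so on) made up for illustration:

```cpp
// Hypothetical function-like macro: the argument text is pasted in twice.
#define SQUARE_MACRO(x) ((x) * (x))

// Equivalent inline function: the argument is evaluated exactly once.
inline int square_fn(int x) { return x * x; }

static int g_calls = 0;  // counts how many times next_value() has run

int next_value() { return ++g_calls; }

// Expands to ((next_value()) * (next_value())): two calls, yielding 1 and 2.
int macro_result() {
    g_calls = 0;
    return SQUARE_MACRO(next_value());
}

// next_value() runs once, and its result (1) is squared.
int inline_result() {
    g_calls = 0;
    return square_fn(next_value());
}
```

Here macro_result() returns 2 and leaves g_calls at 2, while inline_result() returns 1 and leaves g_calls at 1. The inline function also gets type checking and normal scoping, which the macro does not.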

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

You shouldn't treat inline as a performance directive. Much like the old auto and register keywords, its initial use is pretty much deprecated. Rather, think of it as a linkage directive, like extern or static. The inline keyword allows you to define a function in a header, and it is quite useful in that regard. As far as performance is concerned, modern compilers with global optimizations do not need a hint to know when or when not to consider inlining.
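The linkage use described above can be illustrated with a small header. This is only a sketch; the file name math_utils.h and the function twice are hypothetical.

```cpp
// math_utils.h (hypothetical header name)
#ifndef MATH_UTILS_H
#define MATH_UTILS_H

// 'inline' permits this definition to appear in every .cpp file that
// includes the header. Without it, including the header from two
// translation units would typically fail at link time with a
// "multiple definition" error for twice().
inline int twice(int x) { return 2 * x; }

#endif  // MATH_UTILS_H
```

Whether any given call to twice() is actually expanded in place is still decided by the optimiser; the keyword's guaranteed effect is only that repeating the definition across translation units is allowed.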

This topic is closed to new replies.
