
Function pointers slow??

General and Gameplay Programming

Started by MrRage April 17, 2006 09:55 PM

5 comments, last by iMalc 18 years ago

MrRage

127

Author

April 17, 2006 09:55 PM

I recently ran a test on my machine and was surprised to see how badly function pointers did compared to a static function. The function pointer took nearly 17x longer than a direct function call, even though on a modern machine 17 clock cycles is not noticeable. class->func() was called through a virtual base class; class.func() was not inherited; f() was declared as void (*f)();

    // Try Function()
    int times = 1000;
    QueryPerformanceCounter((LARGE_INTEGER *)&startTime);
    for (i = 0; i < times; i++)
    {
        myBClass.func(); // void func(){int a = 5+4;}
    }
    QueryPerformanceCounter((LARGE_INTEGER *)&endTime);
    fps1 = (long)(endTime - startTime);

Output:

    C:\dev\projects\ptrTest\Release>ptrTest.exe
    ==========================
    class.func() ran in 1043 ticks
    f() ran in 17241 ticks
    class->func() ran in 17227 ticks
    f() was 16.53x slower than function()
    ==========================
    CPU Frequency: 1501707296

ApochPiQ

23,138

April 17, 2006 10:03 PM

This may not be a valid comparison. For instance, it is very likely that your compiler is smart enough to discard the direct calls to func(), or maybe just call it once. The only really reliable defense against that is to turn off compiler optimizations and double-check.

In general, yes, dynamic dispatch will always be slower than a direct function call. That's because dynamic dispatch (virtual methods, function pointers) cannot be optimized at compile-time in many ways that can be done on static calls. There's more to it, but suffice it to say for now that in general dynamic is slower than static for function calling.

However, all of this is totally meaningless. Unless you've done profiling and proven that dynamic-dispatch function calls are causing performance problems for your current real project, there's no point panicking and going to remove all dynamic-dispatch calls from your code.

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]

blaze02

100

April 17, 2006 10:12 PM

It seems like it is just bad optimization on your compiler's part. When I think about optimizations, I convert everything to assembly. If you compare calling a function to calling a function pointer:

call <function name> //<function name> gets converted to a memory address before compilation, a constant.

vs.

call [variable] //[variable] dereferences the variable during run-time and performs the same operation as above. Dereferencing the variable takes no time as it is done in the CPU pipeline.

I think the reason your benchmarks are different has to do with caching. If the CPU knows which function (location in memory) is going to get executed, it can intelligently cache the function's code into L1/L2 cache.

-------
Harmotion - Free 1v1 top-down shooter!
Double Jump Studios Blog

iMalc

2,466

April 18, 2006 04:44 AM

ApochPiQ has a point: the test is not valid because the compiler may optimise away anything it possibly can while still producing the same result. Sometimes you have to get really clever to stop the optimiser from removing things.
Turning off optimisations is not fair either; the calls should be tested with optimisations on.

"In order to understand recursion, you must first understand recursion."
My website dedicated to sorting algorithms

Emmanuel Deloget

1,382

April 18, 2006 06:34 AM

Quote:Original post by blaze02
It seems like it is just bad optimization on your compiler's part. When I think about optimizations, I convert everything to assembly. If you compare calling a function to calling a function pointer:

call <function name> //<function name> gets converted to a memory address before compilation, a constant.

vs.

call [variable] //[variable] dereferences the variable during run-time and performs the same operation as above. Dereferencing the variable takes no time as it is done in the CPU pipeline.

I think the reason your benchmarks are different has to do with caching. If the CPU knows which function (location in memory) is going to get executed, it can intelligently cache the function's code into L1/L2 cache.

In his case, the call to func() is probably inlined, so he doesn't call the function at all (which, of course, tends to be faster).

The valid comparison is between the call to the function ptr and the call to the virtual function (there is one "call" asm instruction issued).

Regards,

-- Emmanuel D. [blog, in French] [blog, very bad googlized translation]

Spoonbender

1,258

April 18, 2006 07:35 AM

Quote:Original post by blaze02
Dereferencing the variable takes no time as it is done in the CPU pipeline.

Err?

Dereferencing requires a memory load (you need to get the address stored in the function pointer), which may be cached, may be retrieved in a couple of cycles, or may take over 100 cycles, depending on the location in memory.
After that, you have to jump to the address you retrieved, which is another memory load.

So yeah, it depends on a lot of factors. Sometimes it might be only one or two cycles slower than a "normal" function call. Sometimes it might be one or two hundred cycles slower. (Although in the average case, I think it'll be closer to the former)

So what does this mean?
Nothing at all. You'll have to use function pointers *a lot* in your code to get any noticeable difference. And the second time you call it, the pointer will probably be cached, so the difference goes way down. And when you use a function pointer, it's usually because you need the functionality, and then static function calls just won't cut it, and this comparison becomes meaningless. It's like saying that addition is faster than square root. Yes, it is, but sometimes you need a square root. [wink]

iMalc

2,466

April 19, 2006 02:46 AM

Exactly, Spoonbender.
Bending over backwards to not use function pointers is only going to mean you end up using more complicated, harder to understand, and probably far slower things instead. Not to mention it would all be based on an invalid benchmark.

This is why people should not run tests like this. Just write the code in the simplest way possible, and later on if it proves to be too slow, only then do you optimise it.

"In order to understand recursion, you must first understand recursion."
My website dedicated to sorting algorithms
