
Archived

This topic is now archived and is closed to further replies.

Optimization question

This topic is 5510 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


Recommended Posts

I recently had a guy help me with a program. He took a look at it and said I was doing massive data copying that wasn't needed (calling by value; I assumed the functions were inlined... oops!). Anyway, he changed a few functions that didn't need copies to pass by reference instead, and the thing went twice as fast (!).

So I figured, if removing the unneeded data copying was an improvement, I'd remove ALL of the data copying. Since most compilers won't recursively inline functions, I just made macros out of the functions that were copying a lot of data onto the stack. To my surprise, the program actually ran slower than the original slow program after I did this, even though I had eliminated all the data copying to functions (well, at least the unneeded copying). He said that I traded one bad thing (extraneous data copying) for an even worse thing, which he called a "cache spill", and that since I made everything a macro, I gave the compiler no choice, which is a bad thing.

So, what is a cache spill, and why did it make my program go slower? And where can I learn about this kind of stuff? This guy writes time-critical code for a living (SQL parsers and ODBL something or others...), and I was amazed at how much stuff he found in my program that was causing such a slowdown after only a quick look. Things I never even considered, like the data copying from passing by value (well, I know it's faster to pass by reference or pointer for large objects, but I thought it was inlined).

I don't care so much for super low-level ugly hacks to make the code go faster; I'm just interested in good, efficient programming. Currently, I would look at a program and probably think, "looks good to me," but this guy looked at my program and in under an hour's time had doubled its speed. It seems like things just jump out at him. So, how do I learn this kind of stuff?
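A minimal sketch of the kind of change described above; the `Mesh` type and function names are made up for illustration, not taken from the original program. Passing a large object by value copies it on every call, even when the compiler inlines the function, whereas a `const` reference reads the caller's object directly:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical "large object": a vector can hold megabytes of data.
struct Mesh {
    std::vector<float> vertices;
};

// Pass by value: the entire vector is copied for every call,
// inlined or not -- by-value semantics require a copy.
std::size_t countByValue(Mesh m) {
    return m.vertices.size();
}

// Pass by const reference: no copy; the function reads the
// caller's object in place.
std::size_t countByRef(const Mesh& m) {
    return m.vertices.size();
}
```

For cheap-to-copy types (ints, small structs) by-value is fine and often preferable; the copy only becomes a cost worth removing when the parameter is large.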

Guest Anonymous Poster
quote:
Original post by Russell
I recently had a guy help me with a program. He took a look at it and said I was doing massive data copying that wasn't needed (calling by value, I assumed the functions were inlined...oops!).



Parameters are still copied even if the compiler inlines the function.

quote:

So, what is a cache spill, and why did it make my program go slower?



Never heard of it. He could be talking about a cache miss.

quote:

So, how do I learn this kind of stuff?


Read books, study CPU architecture and memory, take classes, learn from others, and know what your compiler is actually doing with the code you write (your inlining assumption is a good example of where a better understanding of C++ would have benefited you).

I think that small functions load into the cache more nicely than inline-expanded functions, since each piece isn't repeated in the former...

I'm definitely not the one to ask about cache optimization, though... Our teachers just explained random old stuff about the cache that isn't useful for optimization.

There is memory called cache memory, which is close to the processor and fast, and there is normal memory, which is far slower by comparison. The cache keeps data that is being used a lot, and therefore speeds up performance. However, there is (obviously) only so much room for the frequently used data, and if you exceed it there will be a significant performance drop as the variables have to be reloaded from the (relatively) slow main memory.

This leads to some surprising situations. For example, in a loop like for (i = 0; i < 1000; ++i), if more variables are used than will fit in the cache, you may find that splitting the operations into two separate loops, each using fewer variables, will actually increase performance, because there will be fewer cache 'spills'. This is counterintuitive if you only look at the language as an expression of what you are trying to do, or think of efficiency as 'do it all at the same time', and forget that the program has a physical existence on a machine.
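The loop split described above is often called loop fission. A sketch, with invented array and function names: the fused loop keeps four arrays live in the cache at once, while each of the split loops touches only two.

```cpp
#include <cstddef>
#include <vector>

// One loop touching four large arrays: the working set is a, b, c, d
// all at once, which may exceed the cache.
void fused(std::vector<int>& a, const std::vector<int>& b,
           std::vector<int>& c, const std::vector<int>& d) {
    for (std::size_t i = 0; i < a.size(); ++i) {
        a[i] += b[i];
        c[i] += d[i];
    }
}

// The same work split into two loops: each loop's working set is
// only two arrays, which can stay resident in the cache.
void split(std::vector<int>& a, const std::vector<int>& b,
           std::vector<int>& c, const std::vector<int>& d) {
    for (std::size_t i = 0; i < a.size(); ++i) a[i] += b[i];
    for (std::size_t i = 0; i < c.size(); ++i) c[i] += d[i];
}
```

Both versions compute the same result; whether the split is actually faster depends on the array sizes and the cache on the machine at hand, which is exactly why you measure.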

Always profile your code before making changes that are supposed to enhance performance.
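A minimal sketch of that advice, using std::chrono (the helper name is made up; a real profiler gives per-function detail, but even a crude before/after measurement beats guessing):

```cpp
#include <chrono>

// Run the given piece of work and return how long it took in
// milliseconds. Time the same workload before and after a change
// to see whether the "optimization" actually helped.
template <typename F>
long long elapsedMillis(F work) {
    auto start = std::chrono::steady_clock::now();
    work();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(
               stop - start).count();
}
```

steady_clock is used rather than system_clock because it is monotonic and unaffected by clock adjustments.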


