Is squeezing performance out of a console the same as premature optimization?


How does a programmer squeeze the performance out of a console and is it a collaborative effort?

Is squeezing performance out of a console the same as premature optimization?

Is squeezing performance out of a console done early on when the game is being developed for the console, or when the game is halfway done, or when the game is close to the deadline?

Are there any similarities between squeezing performance out of a console and optimizing an application?

This thought dawned on me when I realized I could optimize the collision method in my 2D game by adding an extra condition that forces the logic to run only at a certain point rather than executing unnecessarily.

Does the premature optimization rule still hold today? The term was coined back in the 1970s, and a lot of time has passed since then. But I figure optimizing something like the collision logic in my game will help performance. When I ran the game with and without the optimization I didn't see any noticeable difference, but the time I spent was probably two minutes, so no harm was done, and I feel the optimization will help in the long run once the game has more enemies.
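To illustrate the kind of extra condition I mean, here is a rough sketch rather than my exact code (Entity and the distance check are simplified stand-ins):

// Only run the detailed (expensive) collision logic when two entities are
// even roughly near each other; otherwise bail out immediately.
struct Entity { float x, y, radius; };

bool Collides(const Entity &a, const Entity &b) {
    const float dx = a.x - b.x;
    const float dy = a.y - b.y;
    const float reach = a.radius + b.radius;
    if (dx * dx + dy * dy > reach * reach)
        return false; // early out: too far apart to possibly collide
    // ... detailed collision logic would go here ...
    return true;
}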

I hope asking a lot of questions is allowed. I figured I would post them all here, since all of my later questions relate to the first one.

Squeezing performance and premature optimization are not the same. There is no single definition of premature optimization, really. You want to think about performance early on and make correct architectural decisions that allow you to optimize and run your code efficiently. Good architecture that was built with the platform's performance characteristics in mind is vital.

When I think of premature optimization I typically think of shortcuts that were taken before the problem space was fully realized. Sometimes you can "optimize" something in a way that doesn't benefit you at all (in terms of design OR performance) later on because you applied the optimization before you really knew what kinds of problems your code was going to be asked to solve.

EDIT: Another similar optimization pitfall is performing a bunch of "optimizations" without really knowing why you're performing them. Profile! Make optimizations where necessary. Time is limited, and you don't want to spend ages rewriting code just for a 0.01ms speedup. It's tempting to try to fix things you "know" (read: feel) are slow. Don't. Fix things you KNOW (read: have evidence for) are slow. That means profiling, experimenting, and seeing what works and what doesn't in terms of performance.
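Even something as crude as timing a suspect function before and after a change gives you that evidence. A minimal sketch (UpdateEnemies is just a hypothetical stand-in for whatever you think is slow); a real profiler will give you far more detail:

#include <chrono>
#include <cstdio>

void UpdateEnemies(); // hypothetical suspect function, defined elsewhere

void ProfileUpdateEnemies() {
    using Clock = std::chrono::steady_clock;
    const auto start = Clock::now();
    UpdateEnemies();
    const auto stop = Clock::now();
    const double ms = std::chrono::duration<double, std::milli>(stop - start).count();
    std::printf("UpdateEnemies: %.3f ms\n", ms); // compare this number before and after a change
}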

Heh, "squeezing".

"I AM ZE EMPRAH OPENGL 3.3 THE CORE, I DEMAND FROM THEE ZE SHADERZ AND MATRIXEZ"

My journals: dustArtemis ECS framework and Making a Terrain Generator

Squeezing performance and premature optimization are not the same. [...]

Ah okay. Thanks. Great answer.

How does a programmer squeeze the performance out of a console

The same way you would any other platform—profile and fix.

and is it a collaborative effort?

No matter what the company or situation is, it is always best if everyone can write competent code with no obvious pitfalls.
But that never actually happens, and even if it did, there should still be at least one person who is really good at performance and focuses mainly on it, at least on big projects.

Is squeezing performance out of a console the same as premature optimization?

No, and depending on who you ask, “premature optimization” is a misnomer in the first place.

Is squeezing performance out of a console done early on when the game is being developed for the console, or when the game is halfway done, or when the game is close to the deadline?

How many studios do this, and at what point in a project they start doing it, varies.
At tri-Ace it is my main job, and I do it during roughly the second half of each project.

Are there any similarities between squeezing performance out of a console and optimizing an application?

They are the same thing, unless you are talking about the actual techniques involved.
For example:


for ( int i = iTotal; --i >= 0; ) {}

Works very well in Java, so you always want to use that in Java Android apps.
Works pretty well in x86/x64, so you generally want to use it on desktops.
I hardly notice a change on iOS devices and Nintendo 3DS.
Often fails on PlayStation Vita (even though it is also ARM)—it seems only to be faster when i is not used as an index inside the loop.
If you are going down to the level of ASM, they are very different.
Different intrinsics, different sets of assembly instructions, and in the case of Nintendo 3DS vs. conventional ARM VFP the syntax is different (it more closely matches how the assembly looks in the final product, which I prefer, except that it makes it impossible to use for a lot of things because you have to know the register of a variable, and you don’t).

Does the premature optimization rule still hold today?

That “rule” has done more damage than good for the same reason the “don’t write engines, write games” “rule” has.
In fact, you are a perfect example of it:

But I figure optimizing something like the collision logic in my game will help performance. When I ran the game with and without the optimization I didn't see any noticeable difference, but the time I spent was probably two minutes, so no harm was done

Some people are paranoid that what they are writing could be a game engine because they read an article telling them not to write game engines.
You’ve been raised to fear this thing called “premature optimization”, so now you are sitting here worried that you may have done it, but justifying it by saying it was just 2 minutes’ worth of your time.


Many people go beyond that and seem to almost intentionally write slow code, or at least specifically avoid obvious little things they could do, just because they are afraid of it being labeled “premature”. For example, as long as the order of your loop iterations doesn't matter, in Java you should almost always use the loop I showed above.


But the real harm from “avoid premature optimizations” comes later when you decide to finally start optimizing.
Sometimes there will be clear algorithmic bottlenecks, but once you get those out of the way you are left looking at a sea of functions/methods in which there does not seem to be an obvious one to tackle next.
Perhaps you see a bunch that are taking 5% of the CPU and you feel content, oblivious to the fact that they should all be taking 4% or 3%.
Or there is one that is just a bit high, but you aren't bothered by it because it only barely peaks over everything else. Little do you realize it would be towering over the rest if proper attention had been paid to performance from the start.


A “premature” optimization is a destructive optimization.
It’s an optimization that either makes things slower or causes bugs.
It’s also subjective and it depends on your skill level and experience.
If it is your first time writing a new algorithm you should probably not try to heavily optimize every little part along the way. Make it run, then make it fast.
But if you have written it before, or even many times before, you may very well be justified in having an extra focus on optimizations from the very start.


And there is more.
Don’t be apologetic for spending 2 minutes even though you didn’t see a result.
You will never write fast code unless you try and fail or try and succeed (in other words, unless you try).
When you fail you gain a sense of what not to do in the future.
When you succeed you start to feel better about certain assumptions, such as that the for-loop I posted above is a better default loop than “for ( int i = 0; i < iTotal; ++i ) {}”.
That doesn't mean it is always the faster loop, but if you have to pick one or the other as a default you will win more often than not with the counting-down loop.
Something you learn in time by trying.



Just code.
You don’t always need to worry about making it fast, but you should never worry about not making it fast (not to be confused with making it slow; they are not the same thing).


L. Spiro

I restore Nintendo 64 video-game OST’s into HD! https://www.youtube.com/channel/UCCtX_wedtZ5BoyQBXEhnVZw/playlists?view=1&sort=lad&flow=grid

+1 for L. Spiro, although premature optimization really is the root of all evil: good programmers are lazy, and wasting your time optimising things that don't matter is a bad hangover I still see from the ASM programmers of the past (I started just after that, around PS1 time). Use a profiler and optimise what takes the most time to run.

Picking a better algorithm normally beats low level optimisations anyway.

"Most people think, great God will come from the sky, take away everything, and make everybody feel high" - Bob Marley
The premature optimization quote is from Donald Knuth in the 1970s. And yes, it is still relevant:

"Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%." -- Donald Knuth

On most projects I'm one of the lucky ones who gets to spend time with profiling and improving code. I agree with Knuth's numbers as approximations; most things that people THINK are slow really are not slow. The only way to know for certain is to actually measure, to profile the code, and to measure both before and after changes are made.


There are some non-squeezing areas. Some graphics routines get hand-coded for the device just because that is the only way they will work. In those cases it is not about optimization, it is about sane implementation and it represents some of those "critical 3%".


When it comes to squeezing out performance there are generally a few phases.

The first is to look for the obvious low-hanging fruit based on profiling results. These are functions that take microseconds but should be taking nanoseconds. Usually they are doing stupid things like calling expensive functions within a loop rather than calling them once and storing the results, using multiply-nested loops that accidentally generate O(n^4) or worse behavior, calling strlen() every time through a loop, calling expensive constructors rather than cheap default constructors, and so on. Nothing specific to the hardware at this point; mostly it is bug fixes. Often this results in big benefits. Usually this is done early on and continuously; on several games I've worked on, the first few passes of this have more than doubled the frame rate and improved stability.
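The strlen() case is the classic example; a contrived before/after sketch (not from any particular codebase):

#include <cctype>
#include <cstring>

// Before: strlen() is re-evaluated on every iteration even though the string
// never changes, turning a linear pass into quadratic work.
void ToUpperSlow(char *s) {
    for (std::size_t i = 0; i < std::strlen(s); ++i)
        s[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[i])));
}

// After: hoist the loop-invariant call out of the loop.
void ToUpperFast(char *s) {
    const std::size_t len = std::strlen(s);
    for (std::size_t i = 0; i < len; ++i)
        s[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[i])));
}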

The next thing is to look for the obvious algorithmic changes based on profiling results. If a single call to the pathfinder requires 2 milliseconds then obviously something major is wrong. Usually this means swapping out slow algorithms, replacing computation with lookup and cache, using partitioning methods to reduce required work, and so on. Again, generally nothing specific to hardware happens at this point either. Usually this also gives big benefits. Again this is something done many times during development, and some of the passes will dramatically improve performance.
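A minimal sketch of the "replace computation with lookup and cache" idea, using the pathfinder example (the FindPath interface here is hypothetical, and a real game would also invalidate the cache when the world changes):

#include <unordered_map>
#include <vector>

struct Node { int x, y; };

// Stand-in for a real (expensive) search such as A*.
std::vector<Node> FindPath(Node from, Node to) { return { from, to }; }

// Cache results keyed by start/goal cells; recompute only on a cache miss.
// Assumes cell coordinates are non-negative and fit in 16 bits.
std::vector<Node> FindPathCached(Node from, Node to) {
    static std::unordered_map<unsigned long long, std::vector<Node>> cache;
    const unsigned long long key =
        ((unsigned long long)(unsigned short)from.x << 48) |
        ((unsigned long long)(unsigned short)from.y << 32) |
        ((unsigned long long)(unsigned short)to.x   << 16) |
         (unsigned long long)(unsigned short)to.y;
    auto it = cache.find(key);
    if (it != cache.end())
        return it->second;                 // hit: skip the expensive search
    std::vector<Node> path = FindPath(from, to);
    cache.emplace(key, path);
    return path;
}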

Next up is to look for small global gains. These changes are generally tiny improvements to extremely high-frequency functions. One of the common ones is to convert virtual functions called tens of thousands of times per frame into non-virtual inline functions; a virtual function has a tiny cost of about 10ns for the virtual part, and another 10-15ns for the function call itself, but when a small number of virtual functions are called a million or so times every second, that is 20 milliseconds per second that you can cut. This type of optimization is usually done later in the project. The performance benefit is usually just a few percent, so it isn't something to rely on.
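As a sketch of what that change looks like (GetWorldX is a made-up accessor, not from any real engine):

// Before: a tiny accessor that is virtual and called at very high frequency.
struct EntityVirtual {
    virtual ~EntityVirtual() = default;
    virtual float GetWorldX() const { return m_x; } // vtable lookup on every call
    float m_x = 0.0f;
};

// After: non-virtual and defined in the header, so the compiler can inline it
// and the per-call overhead disappears. Only worth doing where profiling shows
// the call frequency is genuinely enormous.
struct EntityInline {
    float GetWorldX() const { return m_x; }
    float m_x = 0.0f;
};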

Only after those are exhausted do you get into the hardware-specific low-level optimizations. It is uncommon to see big benefits here except in the most frequently called compute-intensive loops.


Most code simply isn't performance critical. The code that is performance critical is generally either a bug or a bad algorithm choice.


A “premature” optimization is... an optimization that either makes things slower or causes bugs.

It's also pretty much any optimisation that influences architectural decisions early in your project.

The cost of large refactors increases dramatically as a project progresses, which makes it imperative that your architecture be sound from the get-go. If you compromise early on architecture because of beliefs about performance, you risk paying the refactor tax many times over. Performance hotspots tend to change over the course of development, whereas architectural challenges don't.

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

For me, a premature optimization is an optimization that adds complexity to the code, and possibly makes it harder to read, made before it has been determined that such an optimization is even needed. There's nothing wrong with writing fast code and optimizing early. I always try to do this, but after examining my project's needs, sometimes I choose the easier albeit slower implementation/algorithm over the faster and more complicated (and potentially buggier) one. If I later find that it's a bottleneck, I'll change it then.

When working on hobby projects as the lone coder, this is important. Take this example: a programmer is working on a simple 3D game and uses a 2D array to keep track of his objects' locations in 3D space. It's simple and it works well, but he hears that 2D arrays are slow and quad-trees are faster, so he changes his code. He's new to quad-trees and introduces a bunch of bugs. He's not sure if the bugs are from the quad-tree or from other parts of the code. On top of that, he didn't even have that many objects, so the 2D array was actually more than fast enough for his project.

Learn all about my current projects and watch some of the game development videos that I've made.

Squared Programming Home

New Personal Journal

The "premature optimization" advice is one of the most misquoted rules of thumb in programming. It's still good advice, but we interpret it completely differently to Knuths original meaning.
He wrote than in a 1960's article about proper use of the goto statement -- a move towards function/procedure based programming that we use today!! He was arguing that using goto isn't bad as long as you're following certain structures (like calling a function, then returning to the same place, as we now do with the 'call stack').
To him, a premature optimization was one that made the code unreadable or harder to reason about (e.g. spaghetti code) for the sake of saving one clock cycle...

Modern commentators will use Knuth's phrase, but with entirely new, modern advice implied.

(1)
It's also pretty much any optimisation that influences architectural decisions early in your project.

(2)
The cost of large refactors increases dramatically as a project progresses, which makes it imperative that your architecture be sound from the get-go. If you compromise early on architecture ..., you risk paying the refactor tax many times over.

I would disagree with (1), because (2)! ;-)

If early architecture choices do lead to performance problems at the end of a project, then you're screwed.
Wide-reaching architectural choices, which underpin the rest of the code-base, are one time where you have to use a lot of care.

This topic is closed to new replies.
