Clean Design vs Efficiency

I have developed a 2D game engine that currently represents the scene as a series of Tile objects; each tile has an associated texture.

At present there is a texture Palette: a collection that stores a single copy of each texture. Each Tile then holds a texture index indicating which texture from the palette should be drawn.

Now, at present, rendering a tile takes the following steps:

The texture index is retrieved from the Tile,
the Texture is fetched from the palette using that index,
and the Texture is drawn.
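
In code, the current path looks roughly like this (a minimal sketch; Texture, Tile and Renderer here are placeholders standing in for my actual classes):

#include <cstddef>
#include <vector>

// Palette owns the single copy of each texture.
class Palette
{
public:
    const Texture& Get( std::size_t index ) const { return m_textures[index]; }
private:
    std::vector<Texture> m_textures;
};

// Rendering a tile: get the index, look up the texture, draw it.
void RenderTile( const Tile& tile, const Palette& palette, Renderer& renderer )
{
    std::size_t index = tile.GetTextureIndex();      // step 1: index from the Tile
    const Texture& texture = palette.Get( index );   // step 2: texture from the palette
    renderer.Draw( texture, tile.GetPosition() );    // step 3: draw it
}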

I have been wondering if this is the best setup. It has loose coupling and high cohesion, and allows me to change how textures are stored in the engine quite easily: only the palette and the rendering engine need updating, and the game logic classes are unaffected. A Tile does not need to know how the texture is stored; it just needs to provide the information needed to draw the correct texture.

What I was thinking is that this process is long-winded and has many execution steps. Instead, I considered making the Tile objects extend my texture class, so that the tiles themselves could simply be passed to the rendering engine and drawn. A single copy of each texture would still exist in the palette as before, but instead of holding a texture index, each tile would directly reference the texture in the palette.
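
Sketched out, the alternative would look something like this (again with placeholder names; the key change is that the Tile refers straight at the palette's texture rather than holding an index):

class Tile
{
public:
    const Texture* texture;    // points straight at the palette's copy
    Position GetPosition() const;
    // ... other game-logic data ...
};

void RenderTile( const Tile& tile, Renderer& renderer )
{
    renderer.Draw( *tile.texture, tile.GetPosition() );   // single step: draw directly
}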

This setup has tighter coupling, though, and less cohesion: if I wanted to change the way the texture data is stored, I would have to change all the classes noted before, as well as the Tile class.

So I am left with a decision: possibly increase efficiency by reducing the execution steps, at the cost of tighter coupling and code that is harder to change, or keep the current system.

It is worth noting that the current system has not shown any problems with speed or frame rate, but I have not tested the engine on lower-spec machines. The main reason for the current design was to keep the game logic and the rendering engine as separate as possible.

What is everyone's advice or opinion on this matter? Stick with what I have or change it around?
Always a balance. Neither is more important.

Tile rendering is a simple thing and doesn't take a lot of code, so if you're concerned about speed, write a separate version for each platform you want the game to work on. A class for every tile sounds excessive to me. Look for the shortest route from the pixel data to the screen. Don't over-engineer things trying to do them "properly" when a more procedural C style, operating on raw data and API calls, would save CPU time and memory, and be more readable anyway.

Conversely, when you're doing more complicated and widespread game logic, object-oriented styles tend to be more readable and get more work done in fewer lines of code. Often speed and memory use wind up better in the end anyway, because you can look for high-level optimizations and refactor things easily as needs change.

The old rule that "90% of the time is spent in 10% of the code" helps when deciding what to optimize each part for: speed, memory use, readability, refactorability, reusability, portability, flexibility. There are many properties to consider, and each is more important in different situations.

Or you can do as I do nowadays and assume that it's pretty rare to come across a computer slow enough to have trouble rendering a tiled 2D game entirely in software, and just write your own functions for everything except shoving your completed frame to the screen. That is optimizing for portability and flexibility, since it's not at all hardware dependent, and you can do anything you want in software rendering. Definitely not optimized for speed, when I could instead use 3D hardware to render tiles way faster than I would ever need... but it's fast enough.
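
The inner loop of that approach is nothing more than copying tile pixels into a frame buffer. A minimal sketch, assuming 32-bit pixels and that the tile lies fully on screen (BlitTile and the buffer layout are my own illustration, not anyone's actual engine):

#include <cstdint>
#include <cstring>

// Copy one square tile's pixels into the frame buffer at (destX, destY).
// Both buffers are assumed to be 32-bit ARGB, rows stored contiguously.
void BlitTile( uint32_t* frame, int frameWidth,
               const uint32_t* tile, int tileSize,
               int destX, int destY )
{
    for( int row = 0; row < tileSize; ++row )
    {
        std::memcpy( frame + (destY + row) * frameWidth + destX,
                     tile + row * tileSize,
                     tileSize * sizeof(uint32_t) );
    }
}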
Clean design should be efficient by default. It should be the logical best way to do something, and thus should not be inefficient.

Low-level optimisations can then be applied as needed later, but only IF the design is good.
If that's not the help you're after then you're going to have to explain the problem better than what you have. - joanusdmentia

My Page davepermen.net | My Music on Bandcamp and on Soundcloud

Two additions.

Don't optimise prematurely.

Quote from C++ Coding Standards: "We define premature optimization as making designs or code more complex, and so less readable, in the name of performance when the effort is not justified by a proven performance need (such as actual measurement and comparison against goals) and thus by definition adds no proven value to your program. All too often, unneeded and unmeasured optimization efforts don't even make the program any faster."

The other rule that should be considered, though:

Don't pessimize prematurely.

Again quoting from C++ Coding Standards: "All other things being equal, notably code complexity and readability, certain efficient design patterns and coding idioms should just flow naturally from your fingertips and are no harder to write than the pessimized alternatives. This is not premature optimization; it is avoiding gratuitous pessimization.

Avoiding premature optimization does not imply gratuitously hurting efficiency. By premature pessimization we mean writing such gratuitous potential inefficiencies as:

* Defining pass-by-value parameters when pass-by-reference is appropriate.
* Using postfix ++ when the prefix version is just as good.
* Using assignment inside constructors instead of the initializer list.

It is not a premature optimization to reduce spurious temporary copies of objects, especially in inner loops, when doing so doesn't impact code complexity. Item 18 encourages variables that are declared as locally as possible, but includes the exception that it can sometimes be beneficial to hoist a variable out of a loop. Most of the time that won't obfuscate the code's intent at all, and it can actually help clarify what work is done inside the loop and what calculations are loop-invariant. And of course, prefer to use algorithms instead of explicit loops."
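
To make those idioms concrete, here's a small illustrative sketch (my own example, not taken from the book):

#include <string>
#include <vector>

// Pessimized: pass-by-value copies the string, and assignment in the
// constructor body default-constructs the member first, then assigns.
struct BadWidget
{
    std::string name;
    BadWidget( std::string n ) { name = n; }
};

// Preferred: pass-by-const-reference avoids the copy, and the
// initializer list constructs the member directly.
struct GoodWidget
{
    std::string name;
    GoodWidget( const std::string& n ) : name( n ) {}
};

void Process( const std::vector<int>& values )   // by reference, not by value
{
    std::vector<int>::const_iterator end = values.end();   // hoisted loop-invariant
    for( std::vector<int>::const_iterator it = values.begin(); it != end; ++it )   // prefix ++
    {
        // ... use *it ...
    }
}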



It's worth noting that the biggest optimisation wins are the macroscopic, algorithmic changes rather than microscopic changes such as looking up by index versus pointer. Part of what makes a design 'good' is how amenable it is to those macroscopic changes in the future.

This is the case here too, in fact! Your first design affords you an additional processing step into which you can plonk different algorithms - from a simple array lookup by texture ID to a more sophisticated state-minimising texture pre-sorter that uses the IDs to group the rendering of tiles sharing the same texture, cutting down on needless and expensive texture switches.
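
For instance, a pre-sort on texture ID might look like this (a rough sketch; BindTexture and DrawTile are hypothetical stand-ins for whatever your renderer actually calls):

#include <algorithm>
#include <cstddef>
#include <vector>

struct TileRef
{
    std::size_t textureIndex;
    // ... whatever the renderer needs: position, etc. ...
};

bool ByTexture( const TileRef& a, const TileRef& b )
{
    return a.textureIndex < b.textureIndex;
}

void RenderSorted( std::vector<TileRef>& tiles )
{
    // Group tiles by texture so each texture is bound only once per frame.
    std::sort( tiles.begin(), tiles.end(), ByTexture );
    std::size_t bound = std::size_t(-1);   // sentinel: nothing bound yet
    for( std::size_t i = 0; i < tiles.size(); ++i )
    {
        if( tiles[i].textureIndex != bound )
        {
            bound = tiles[i].textureIndex;
            // BindTexture( bound );   // hypothetical: switch only on change
        }
        // DrawTile( tiles[i] );       // hypothetical draw call
    }
}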
Definitely clean design until profiling tells you otherwise. Finish the game first, then make it fast.

However, you should keep performance in mind while you design your software architecture. For example, when you design feature X, choose data structures which are cache friendly and algorithms which are fast. Then write your game as cleanly as possible and then profile and go back and optimize the hotspots. This will probably make parts of the code less clean, but overall the codebase should remain fairly clean. Basically, don't optimize prematurely.

It will be easier to optimize clean code than prematurely optimized code anyway.
Design and code efficiency are mostly orthogonal, although there is a good possibility that a clean design will help to create efficient code. Anyway, you don't have to choose between these two, as you can get both if you do things carefully.

Regarding your personal experience, I don't see anything wrong. Whatever you do, you will end up fetching the correct texture before you can draw the tile (and since you want only one copy in memory - that's the idea behind tiles - you have to put some indirection somewhere). Keeping a pointer versus retrieving it through a fast look-up is not that dramatic, and will not gain you much in terms of performance. How many times is this look-up performed per frame? My guess is "not that many", because there is no reason to do it more than N times per tile (where N is probably 1; perhaps 2 or 3 if you use multiple texture types for special effects).

Quote: "Design and code efficiency are mostly orthogonal, although there is a good possibility that a clean design will help to create efficient code. Anyway, you don't have to choose between these two, as you can get both if you do things carefully."
Efficient code these days often means being friendly to the cache (see Pitfalls of OOP). Different designs can make this easy or impossible.

Given the following problem:
* There are two types of "WorldObjects" - Foo and Bar.
* Each frame:
** Every WorldObject needs to be updated.
** Every Foo object needs to be processed via the DoFoo routine, with the parameter 'x'.
** Every Bar object needs to be processed via the DoBar routine, with the parameter 'y'.

A "typical" C++/Java design might look like:class WorldObject
{
public:
virtual void Update();
};
class Foo : public WorldObject
{
public:
void Update();
void DoFoo();
};
class Bar : public WorldObject
{
public:
void Update();
void DoBar();
};


typedef vector< shared_ptr<WorldObject> > ObjectVec;
ObjectVec objects;
int x = ... , y = ...;
...

for( ObjectVec::iterator i=objects.begin(), e=objects.end(); i != e; ++i )
{
(*i)->Update();
Foo* pFoo = dynamic_cast<Foo*>( &**i );
Bar* pBar = dynamic_cast<Bar*>( &**i );
if( pFoo )
pFoo->DoFoo(x);
if( pBar )
pBar ->DoBar(y);
}
...Which has completely unpredictable memory access patterns, is oblivious to the cache, and can't easily be compiled for NUMA CPUs like an SPE.

An alternate design (which can also be implemented in clean, modern C++) gives far superior performance due to cache coherency, and more easily ports to parallel or NUMA architectures:

struct Foo {};
struct Bar {};

typedef vector<Foo> FooVec;
typedef vector<Bar> BarVec;

void Update( FooVec& );
void Update( BarVec& );
void DoFoo( FooVec&, int x );
void DoBar( BarVec&, int y );

FooVec foo;
BarVec bar;
...

int x = ... , y = ...;

// Each pass streams linearly through one contiguous, homogeneous array.
Update( foo );
Update( bar );
DoFoo( foo, x );
DoBar( bar, y );

If you're designing anything to do with algorithms or data, then you're making performance decisions.
Quote: "Efficient code these days often means being friendly to the cache (see Pitfalls of OOP). Different designs can make this easy or impossible."
Agreed; however, I still voted for design. I believe there's a point that is often not stressed enough: not all objects are created equal. Some are heavy enough that this won't make any difference, while others are so small that even a single CALL will hammer them. I think a good design - a well-executed design - has to take this into consideration, so the two are closely related to a certain degree.

Previously "Krohm"

I get the feeling that the people who voted for "efficiency" have not worked in a corporate setting for long, or at all. If a design is clean, as has been said, it will be easy to change the large-scale factors that make it efficient later. Poorly designed code adds a lot of time when something large-scale needs to change, and in the end may doom a project. Develop a clean design and you will, by nature, have developed an efficient one as well.
/ Visual Studios 2010 / Codeblocks 10.05 / Windows 7 / Ubuntu 10.10 / - I might be wrong
