stack vs heap

Started by
30 comments, last by King Mir 11 years, 2 months ago

Generally the whole point of the heap is to allow you to have precise control over when you allocate and deallocate memory. Obviously you can't just exit function calls constantly to free memory, especially if something runs in a tight loop and parts of the reserved memory need to stay in use and others need to be freed as you go. For games this is an obvious one since you get things like "permanent" entities in the world that need to be removed when they are no longer used.

I've noticed most people here recommend using smart pointers simply because it adds things like reference counting which is minute overhead for the benefit of the memory freeing itself when no longer needed, but, honestly, it all comes down to your needs. There's no strict reason you HAVE to use a smart pointer, in fact you can free allocate and free things and null pointers whenever you like, but its on your head if something gets missed.

Advertisement

if your writing a real time game, I would recommend an entirely different approach:

code should be C, not C++. No OO, no virtual methods, no mallocs (new, etc), no polymorphism. none of that slow stuff!

vars should be passed by reference with the fastcall convention and no stack frame.

the size limit for the stack segment supposedly went away years ago (at least in windows OS), so you should be able to declare huge arrays locally in a procedure without exceeding stack segment size. If its still an issue, simply do a malloc at procedure start and a free at procedure end.

whenever possible, declare data structures as global statics. instead of new-ing and disposing a bunch of objects for NPCs all the time, have a global array[MAXNPCS] with an active field in the struct. always use arrays vs linked lists. in general, always do whats fast, not whats considered "correct" by non-game programming standards. if its the "correct" way to do things in the real programming world, odds are its totally wrong for high performance real time games.

do you have any idea exactly what happens (at the instruction set and clock cycle count level) when you create an object? talk about SLOW code!

remember OO was invented for people who couldn't write ADT (abstract data type) style non-spaghetti code, and as a slicker way to encapsulate variant records and procedural variables. It has nothing to do with writing games that run fast. May make it easier (for some) to write code, but slower code on average, unfortunately.

like most things in game development, there's the easy way and the fast way to do it. if you code each line with clock cycles in mind form the get go, you'll have very little optimization to do at the end.

Norm Barrows

Rockland Software Productions

"Building PC games since 1989"

rocklandsoftware.net

PLAY CAVEMAN NOW!

http://rocklandsoftware.net/beta.php

if you code each line with clock cycles in mind form the get go, you'll have very little optimization to do at the end.

This is absolutely true, because you'll probably never finish your game. You probably won't even get 5% of it done by the time you (hopefully) realized you've seriously screwed up and are prioritizing horribly wrong. And by then, your code will be so defunct you might as well start over from scratch. So yes, very little optimization to do at the end indeed, because you won't reach the end.

FWIW, optimizations should be focused on algorithmic optimizations. Optimizing a single line of code (or a bunch of lines individually) will likely have very little effect on your program's performance (the rare exceptions are incredibly tight loops that are known to be significant bottle necks (known after profiling)).

And Norman, you're insanea complete troll if you seriously think you shouldn't use any dynamic memory allocation. Yeah, definitely a troll. Not even gonna try to respond to your post anymore.

[size=2][ I was ninja'd 71 times before I stopped counting a long time ago ] [ f.k.a. MikeTacular ] [ My Blog ] [ SWFer: Gaplessly looped MP3s in your Flash games ]
Once somebody who coded every line with clock cycles in mind went through one of my functions and optimised it. He did an amazing job. Of course, it was completely unreadable.

Then I realised that the function didn't need to be called and deleted it.
You should optimize your code for clarity first and correctness second. Regarding performance, think about what algorithms and data structures you use, avoid writing code that you know will be slow, and then only optimize something for performance if your program is too slow and you have a profiler run indicating this is where optimization is needed.

You'll get much further with this advice than with the optimize-every-line nonsense from a few posts above this one.
The funny thing is with the speed difference between CPUs (and the fact they are out of order monsters) and memory access you are probably better off worrying about your memory layout before worrying about every CPU cycles.

Sure, don't do more work than is required but often data access patterns can make a bigger difference than your instruction count.

In my opinion you should always focus on the clarity of the code first, as someone else stated. Refinement is a natural process and code is often slowed down 95% in small areas and 5% around the whole rest of the program, it's all about those time critical areas and those are often easier to refine than trying to refine EVERYTHING.

Of course thats not to say you shouldn't keep performance in mind but don't go overboard, keep in mind basic optimizations like reducing redundant statements or doing large copies of data constantly, things like that cause slowdown more than anything else.

To me it's kind of beginner methodology along with the "I have to write everything myself" idea, that every single piece of code needs to be optimized to hell and back or it is embarassing, for most programmers if code -works- it is already pretty good.

As I see it now, I should use stack vars if and only if it's life span does NOT need to exceed the scope of it's creation {}.
Also, a "little" advantage of heap vars is that I can free them in the "middle" of the scope, thus the memory usage is a little smaller.
Ok! I will use stack vars where I can !

Keep in mind that the examples you've looked at so far, these "stack allocated" objects have always been local variables, which means that they do exist on "the stack", and their scope is the scope of the function. However, functions aren't the only thing that have a scope.
For example:


struct Foo
{
  int a, b, c;
};
struct Bar
{
  Foo foo
};
void Test()
{
  Bar barStack;
  Bar* barHeap = new Bar;
  delete barHeap;
}

The barHeap object is created on the heap, which mean that it's members are also created on the heap.
However, the foo member of Bar isn't a pointer or reference - it's a value member of Bar - the kind of variable we've been calling "stack variables". It's scope ({}'s) is the Bar object itself, which means that when the Bar object is created, the Foo is also created, and when the Bar object is deleted, the Foo is also deleted. They have the same lifetime. So instead of saying that foo has a "stack lifetime", it's better to say that it has an "automatic lifetime" (which means it's lifetime is the same as it's scope/{})

About shared_ptr. I read now that that Bjarne consider this as bad idea and this is considered as bad c++ design pattern. Anyone can put some light on this.
About unique_ptr. I do NOT want to write managed code. Is unique_ptr managed code ???

No, unique_ptr is not managed code, it's a part of the standard C++ library (it's full name is std::unique_ptr).
It's bad to use shared_ptr when you don't need such complexity. shared_ptr is for when you need reference counting.
If you don't need reference counting, and the object has only a single owner, then unique_ptr should be used.

And Norman, you're insanea complete troll if you seriously think you shouldn't use any dynamic memory allocation. Yeah, definitely a troll. Not even gonna try to respond to your post anymore.

The easiest way is to call malloc once to get a really large buffer, and then re-implement your own version of malloc that operates inside this buffer. Then you can still perform dynamic allocation (with similar costs to malloc) while saying that you never actually use malloc.
Yes that is a (mostly) sarcastic suggestion. wink.png

Seriously though, look at any game from the 90's or early 00's and it will be very rare to find any new/malloc usage within the game loop (i.e. outside of the loading/caching phase). It's definitely possible.
My engine is all C++ (and not the "C with classes" style of C++), but new is banned (a custom macro that wraps around placement-new is used in it's place), and malloc is used very rarely - usually to create large allocations which act as pools of specific objects (e.g. a pool of 1000 monsters that you can dynamically allocate/free from/to), or to create large allocations for scope/stack allocators. Not only is scope/stack allocation must simpler (faster) than new/malloc, but I personally find it to be less error prone and simpler to think about too (e.g. no leaks without even the hassle of using smart pointers).
Again, technically it's dynamic memory allocation, but it's not using new/malloc wink.png

Also, this is just for the engine. My game code is Lua, which mallocs excessively and is much slower than C++, but isn't performance critical happy.png

have a global array[MAXNPCS] with an active field in the struct

That's decent advice, but there's no reason it has to be a global. If you create it with malloc, then the cost of that allocation is amortized across every element, so it ends up being very cheap per each NPC, plus you only do it once so it's not a big deal even if it is expensive.
Also, the 'active' field probably shouldn't go into the struct. It will potentially increase the size of the structure by 4 bytes, when you only need one bit. It'd be much more compact to store an array of 'active' bits alongside the array itself. Also, the 'active' bits are probably not useful to the "NPC" (etc) class itself, so it would be best not to accidentally drag them into the cache whenever you operate on an NPC.
Alternatively, you can use a free-list instead of 'active' bits -- alongside the actual data array, you have an array of integers that act as a linked-list (implemented as an array, of course wink.png) of indices of all the elements that aren't active. When allocating an element, you can use the free-list to instantly find an inactive one, as longs as when freeing you return it's index back to this list.
My pool class uses this technique, and it turned out to be faster to allocate from than even my scope/stack allocators (which are ridiculously simple)!

remember OO was invented for people who couldn't write ADT (abstract data type) style non-spaghetti code, and as a slicker way to encapsulate variant records and procedural variables. It has nothing to do with writing games that run fast. May make it easier (for some) to write code, but slower code on average, unfortunately.
code should be C, not C++. No OO, no virtual methods, no mallocs (new, etc), no polymorphism. none of that slow stuff!

C's ADT's are equivalent to interfaces in OOP languages... If you were writing object oriented code in C, you'd likely use ADTs to do so... so the above doesn't really make sense.
OO code does not have to use new excessively, and it doesn't have to use virtual or polymorphism...

Comparing OOP C++ code that's slow because of unnecessary use of dynamic-dispatch and dynamic-allocation, against procedural C code that's well written is not a valid comparison. I can just as easly write slow C code that uses unnecessary dynamic-dispatch (home-made implementation of virtual) and excessive malloc calls, and then compare it against some well-written fast C++ OOP code... but there's no point in such mud slinging.
You're really talking about coding styles, not languages, here.

like most things in game development, there's the easy way and the fast way to do it. if you code each line with clock cycles in mind form the get go, you'll have very little optimization to do at the end.

All the current-gen console games that I've shipped have been mainly written in a slow "scripting" language like Lua. Yeah, thinking about performance is still important, but with Lua, that comes in the form of optimal algorithms and not creating garbage. All that matters is that the game runs within it's budget, which for us was 16ms of time in the Lua VM. That left 17ms of time on the main CPU core (and 33ms on the GPU and other CPU cores) for the engine to use. The engine is C++, and only sacrifices readability/maintainability for performance where necessary -- i.e. unless something is used in a tight loop, it's actually more valuable to the company in the long run for it to be easy to understand and modify over time than that it's as fast as possible, so that multiple games can get out the door in the smallest time-frames.

remember OO was invented for people who couldn't write ADT (abstract data type) style non-spaghetti code, and as a slicker way to encapsulate variant records and procedural variables. It has nothing to do with writing games that run fast. May make it easier (for some) to write code, but slower code on average, unfortunately.

OO doesn't only make it much easier to represent coherent and understandable complex systems in an programming environment but also makes code maintenance a trivial task. What if you're months into development and decide you need to change something you did at the very beginning? How are you gonna find that portion of code, how will you make sure it doesn't affect the rest of the code.

Also, OO is essential to any kind of collaborative programming due to how it modularizes code, allowing a group of programmers to focus on developing his part of the project without having to worry about the rest of the code. There can't be a better example of how OO is important than game programming as it's a many people project, unless you're making Pong.

Tiago.MWeb Developer - Aspiring CG Programmer

Yes, you should use automatic (stack) lifetimes where possible, and heap lifetimes where you have to.
n.b. with your example of using new, it returns a pointer to an object (in Java, every "object"-type variable is implicitly a pointer, but in C++ you need the "*")
{
Object* obj1 = new Object();//obj1 is a local variable, that is a pointer to a heap allocated object
Object& obj2 = *(new Object()); //a reference is pretty much the same thing as a pointer, but looks like it isn't ;)
Object obj3;//local variable (stack allocation)
delete obj1;
delete &obj2;
delete &obj3;//this is an error! It wasn't made with 'new' so you can't use 'delete'
}//obj3 deleted here automatically due to the stack unwinding

I am also confused by memory managment in c++. When stack object is freed is there a memory "hall" in stack.

You can't manually "free" an object from the stack. Stack frames are pushed and popped for each "block" of code (basically every set of "{ ... }"). Whenever a "}" is reached, all the stack variables that were created in that block are destroyed. So there are never any holes.

For an example of where stack allocation isn't applicable, imagine a function that makes a new monster appear:

Monster* SpawnMonster()
{
  Monster m;
  return &m;//error! m is created with the same scope as this function. When the function returns, m is "deleted".
}
 
Monster* SpawnMonster()
{
  Monster* m = new Monster();
  return m;//correct, but someone has to delete this at some point, otherwise you've got a memory leak
}
shared_ptr<Monster> SpawnMonster()
{
  return shared_ptr<Monster>(new Monster());//also correct. This acts more like what you're used to in Java. When there are no more shared_ptr's that reference the Monster, it will automatically be deleted.
}

I'm a little confused ;)

Can you please tell me, in the next example, mainCanon is in the heap or stack?

class Tank{

public:

Canon mainCanon;

...

}

class Army{

public:

Tank tanks[20];

...

}

void main(){

Army* army = new Army();

}

Thank You!

This topic is closed to new replies.

Advertisement