Sign in to follow this  
Aqua Costa

Storing and sorting objects on my engine

Recommended Posts

I want to know which is the best way to store and store the objects in my engine.

Ive got a scene manager class, and I dont know how many objects there will be in my game to I create a std::list of objects and keep adding the object I need.
To sort the objects I calculate the distance between the object and and eye position.

Is it ok to use an std::list or should I store the objects differently?

Share this post


Link to post
Share on other sites
No, it's really not ok to use a list for your scene manager. Too much overhead and way too slow because of memory fragmentation. Use a std::vector of pointers to your game objects. Reserve some space up front (vector.reserve()). A vector will be enormously faster thanks to contiguous memory allocation. Now, for sorting, you can use std::sort on the vector. The STL algorithms work well! Use them. You'll need to make a function that takes two pointers to your game object class and returns the Boolean value of a depth-comparison between them. You'll need to pass that function to the sort function so that the algorithm knows how to compare depths.

Hope that gives you an idea of how to approach it!

Share this post


Link to post
Share on other sites
Quote:
Original post by KulSeran
Quote:

...and way too slow because of memory fragmentation. Use a std::vector of pointers to your game objects.

A vector of pointers is bad for fragmentation too.

At least use something like boost::pool for allocating those pointers to help reduce fragmentation.


Of course, if the pointers are allocated carelessly. I should have been more clear - use a memory pool!

Still, lists have more overheard regardless.

Share this post


Link to post
Share on other sites
so I will use a std::vector, my question now is:
I created a vector of elements of this structure:

struct Object
{
ID3DX10Mesh* mesh;
D3DXVECTOR3 position;
float distanceToCameraPosition;
}



And I want to sort the elements of the vector by distanceToCameraPosition;
So how do I use the sort() function in a std::vector<Object> and sort it the distanceToCameraPosition?

Share this post


Link to post
Share on other sites
std::sort. It takes a range to sort on, and optionally, a comparison operator.
If it makes sense in context, you can overload operator < and std::sort will just work. If that doesn't make sense, or you need more information at the time of comparison, wrap everything into a comparison function object.

class CompareObjects
{
CompareObjects() {}
bool operator ()( const Object &a, const Object &b )
{
return a.distanceToCameraPosition < b.distanceToCameraPosition;
}
};
//...
std::vector<Object> vec;
//...
std::sort( vec.begin(), vec.end(), CompareObjects() );

Share this post


Link to post
Share on other sites
Hi,
i want to hear more arguments as to why not using list (of pointers) for this "thing".
i was reading from this page (among others) and have concluded that list have all
"attributes" that i need vs. vector witch i have found to be difficult for me to
use/implement into my test projects (i am beginner so i might overlooked some things).

Share this post


Link to post
Share on other sites
What are all of the use cases that you intend to use the collection for? Different containers are good for different things. In most cases, sorting a vector will be faster than sorting a list just because of memory continuity which helps leverage the cache better. It is however possible to also get similar performance from a list if you privide a memory pool allocator to the list. Ultimately even in the best case the list would still being doing just an ounce more work during the sort since it has to assign next/previous pointers asside from just the data. It's pretty easy to throw together a sort test with roughly the same size data and same number of objects as you expect and see how the two options compare. The results might be interesting.

You might have other goals for the collection that you didn't list here which could ultimately result in different recomendations. Problems in isolation often have different solutions than what results in real world scenarios.

Share this post


Link to post
Share on other sites
After some research/testing i have realized that you are right.
Found some performance measure code on this forum and used it
in my test project:


#define _CRT_RAND_S

#include <vector>
#include <list>
#include <iostream>
#include <algorithm>
#include <functional>
#include <windows.h>
using namespace std;

class Object
{
private:
float DistToCam;
public:
Object()
{
DistToCam = 0.0f;
}
~Object()
{
}
explicit Object(float d)
{
Set(d);
}
Object(const Object& other)
{
*this = other;
}
Object& operator = (const Object& other)
{
if(this != &other)
{
Set(other.DistToCam);
}
return *this;
}
void Set(float d)
{
DistToCam = d;
}

bool operator > (const Object& other) const
{
return DistToCam > other.DistToCam;
}

bool operator < (const Object& other) const
{
return DistToCam < other.DistToCam;
}

struct BackToFront : public binary_function<Object*,Object*,bool>
{
bool operator () (Object* left, Object* right)
{
return (*left > *right);
}
};

struct FrontToBack : public binary_function<Object*,Object*,bool>
{
bool operator () (Object* left, Object* right)
{
return (*left < *right);
}
};
};

float Random(float maxf)
{
float numOut(1.0f);
UINT num(1);
errno_t err = rand_s(&num);
if(0 == err)
{
numOut = (float)num / (float)UINT_MAX * maxf;
}
return numOut;
}

int main()
{
const UINT NumObjects = 5000;
LARGE_INTEGER before, after, frequency;

list<Object*> lObj;
list<Object*>::iterator iLObj;
for(UINT i = 0; i < NumObjects; i++)
{
Object* pObj = new Object(Random((float)NumObjects * 2.0f));
lObj.push_back(pObj);
}

QueryPerformanceCounter(&before);
lObj.sort(Object::BackToFront());
QueryPerformanceCounter(&after);
after.QuadPart -= before.QuadPart;
QueryPerformanceFrequency(&frequency);
cout << "list sort time: " << (double)after.QuadPart / (double)frequency.QuadPart << "s" << endl;

for(iLObj = lObj.begin(); iLObj != lObj.end(); iLObj++)
{
Object* pObj = *iLObj;
delete (pObj);
pObj = 0;
}

vector<Object*> vObj;
vector<Object*>::iterator iObj;
for(UINT i = 0; i < NumObjects; i++)
{
Object* pObj = new Object(Random((float)NumObjects * 2.0f));
vObj.push_back(pObj);
}

QueryPerformanceCounter(&before);
sort(vObj.begin(), vObj.end(), Object::BackToFront());
QueryPerformanceCounter(&after);
after.QuadPart -= before.QuadPart;
QueryPerformanceFrequency(&frequency);
cout << "vector sort time: " << (double)after.QuadPart / (double)frequency.QuadPart << "s" << endl;

for(iObj = vObj.begin(); iObj != vObj.end(); iObj++)
{
Object* pObj = *iObj;
delete (pObj);
pObj = 0;
}

cin.get();
return 0;
}




output from console:

list sort time: 0.300784s
vector sort time: 0.126097s


btw. is this "valid" way to do performance analisis?

EDIT: this results are from Debug build, with Release (Full Optimization /Ox
and Favore Fast code /Ot) i got similar result where vector is almost 2.5x faster.

[Edited by - brega on October 13, 2010 5:13:32 AM]

Share this post


Link to post
Share on other sites
The main thing about performance analysis is telling what it's worst case complexity is using an induction proof. O(n) might be slower than O(n lg n) for default sizes but O(n^2) as a worst case is dangerous even for small collections. I use in place heap sort with O(n lg n) complexity.

Linked lists are one of the worst datastructures for sorting because most sorting algorithms (Quick sort for example) require random access in constant time.

I use a padded dynamic array of pointers for my graphics engine and it can do insertion and deletion in constant time. There is no fragmentation from deleting because pointers have the same size and can swap places to fill any gaps. I only reallocate new memory when I run out of memory, and that only occur when loading a larger level.

Insert:
By increasing the counter, allocating if needed and filling the last space.

Delete:
By swaping place between the removed pointer and the last pointer, free the last element and decrease the counter.

Delete all:
By freeing all pointers and set the counter to 0.

Share this post


Link to post
Share on other sites
If I was writing an engine or game in "native" C/C++, I would write my own memory manager. It can be very difficult and time-consuming to do, much less, do right...but it can offer some significant advantages. The default CRT memory manager isn't really made for games, mind you. Games often make tons of allocations and deallocations in rapid succession; many of them very "lean" objects. The default memory manager isn't really designed for that sort of thing. Fragmentation can muck up your memory playground and it can be darned slow. If your game isn't really memory intensive, then this may not be worth the time. But if your game is big, bad and needs every tick of performance and needs to be memory efficient to the byte, then you should consider rolling your own memory manager.

The basic idea is you allocate a large, but appropriately sized, block of contiguous memory right at the start. Through using "smart pointers" or your own self-invented reference system, your memory manager can even provide garbage collection and heap compaction (though compacting the heap and moving objects can be problematic; because what if you have to expand and/or move everything the next frame after compacting?). You can take advantage of "placement-new" in C++, or provide your own "malloc/calloc-like" functions to place objects into your managed heap. There are lots of different ways to go with it, and the only "right" way would be the way that works the most efficiently for your needs.

I'm not sure what sort of performance/memory demands your project makes, so you'll be the one to make the evaluation. But if I wasn't using C#, which has a built-in memory manager and garbage collector, I'd implement my own in my engine. I wrote a simple one in C++ a while back for a different type of project, and it was successful within the scope of my needs.

P.S. - What is up with the forums posting "C" instead of the symbolic C-plussign-plusign for the language called "C-PlusPlus"? Everywhere I typed it it put just "C" instead of C with two plus signs's.

Share this post


Link to post
Share on other sites
Quote:
Original post by brega

list sort time: 0.300784s
vector sort time: 0.126097s

EDIT: this results are from Debug build, with Release (Full Optimization /Ox
and Favore Fast code /Ot) i got similar result where vector is almost 2.5x faster.
Sorry brega, I had a look at the code but there's a thing I have difficulties in getting. Is this the time to sort NumObjects = 5000; elements? When you say retail build, you mean that the time becomes list(.14s) and vector (.6s) (2.5 ratio or 2.5 reduction over debug)?

Share this post


Link to post
Share on other sites
@Krohm

Quote:

Is this the time to sort NumObjects = 5000; elements?

This is the number of elements.

Quote:

When you say retail build, you mean that the time becomes list(.14s) and vector (.6s) (2.5 ratio or 2.5 reduction over debug)?


I meant that sorting vector is about 2.5 times faster then sorting list
in my test be it release or debug .

Release build:

list sort time: 0.00170832s
vector sort time: 0.000696737s


Debug build:

list sort time: 0.292196s
vector sort time: 0.124949s

Share this post


Link to post
Share on other sites
You might consider profiling the sort just to see if anything funky is going on. I never tracked it down, but I did a sort on a vector once and found that it was doing a massive number of allocations during the sort, which was hugely unexpected. I switch algorithms implementations and got a handy speed up.

Share this post


Link to post
Share on other sites
If you're doing a sqrt when calculating the distances, and if you're not using the distances for anything else, you can save some cycles by not doing the sqrt. The order will work out the same anyway.

Quote:
Original post by keinmann
If I was writing an engine or game in "native" C/C++, I would write my own memory manager. It can be very difficult and time-consuming to do, much less, do right...but it can offer some significant advantages. The default CRT memory manager isn't really made for games, mind you. Games often make tons of allocations and deallocations in rapid succession; many of them very "lean" objects. The default memory manager isn't really designed for that sort of thing. Fragmentation can muck up your memory playground and it can be darned slow. If your game isn't really memory intensive, then this may not be worth the time. But if your game is big, bad and needs every tick of performance and needs to be memory efficient to the byte, then you should consider rolling your own memory manager.

The basic idea is you allocate a large, but appropriately sized, block of contiguous memory right at the start. Through using "smart pointers" or your own self-invented reference system, your memory manager can even provide garbage collection and heap compaction (though compacting the heap and moving objects can be problematic; because what if you have to expand and/or move everything the next frame after compacting?). You can take advantage of "placement-new" in C++, or provide your own "malloc/calloc-like" functions to place objects into your managed heap. There are lots of different ways to go with it, and the only "right" way would be the way that works the most efficiently for your needs.

Yes, this. In Windows you can just use VirtualAlloc to reserve about half a gig or a gig, and pull down from it as required (committing 64 k chunks at a time). No fragmentation, everything contiguous, a one-time only initial overhead, almost immune to "lots of small allocations" syndrome, and easy as pie to wipe all allocations without having to worry about leaks. You miss a lot of the nicer STL functionality, but much of it is replacable by CRT functions (qsort instead of std::sort, for example). It's not a silver bullet and you still need to do some management yourself, but for the type of allocation it's suited to, it's hard to beat.

I'm sure that there are Linux/OSX equivalents but I don't know them.

Share this post


Link to post
Share on other sites
I have another question.

Lets imagine that total number of object in scene is 5000 (like i used in my previous post). I don't need to sort all of them but just ones that are "visible" by "player" camera. Then (i imagine) i need two vectors, one witch holds all objects in the scenes/level and second that holds objects that should be rendered. So i need to fill second vector with that objects and then sort it. How do you people manage this problem?

EDIT: "fill" is the wrong word. i need second vector to store just pointers that point to objects in first vector that should be rendered.

[Edited by - brega on October 16, 2010 5:47:55 PM]

Share this post


Link to post
Share on other sites
You really can get the best of both worlds by writing your own allocator and then using it in conjunction with the STL containers. It's not even that difficult. It also allows you to customize smaller memory pools for different needs so that any one collection can remain contiguous for all elements.

Share this post


Link to post
Share on other sites
To be honest, if sorting is your bottleneck then you're either doing something very badly wrong in your sort routine or doing something incredibly miraculously right everywhere else.

You might find that just sorting them all anyway and skipping over the ones you don't want to render has less overhead than using a second list.

Share this post


Link to post
Share on other sites
@DieterVW
Quote:

...writing your own allocator...


Witch i don't know how. What is requirement for vector allocator?
What operators/functions do i need to implement in my allocator that
vector are using?

@mhagain
Quote:

You might find that just sorting them all anyway and skipping over the ones you don't want to render has less overhead than using a second list

That might be my last option if i don't find the other way around.

Share this post


Link to post
Share on other sites
For fixed size allocation, reuse chunks of memory by inserting garbage memory into a linked list of pointers. The pointer to the next chunk will have a special place in each allocation. Each time you need a new allocation of that size, take out the first pointer from the list and reuse it. If the list is empty, make a new allocation. You may also use padding to the power of two so that you round up from your requirement and for example use 128 bytes for an 80 byte struct. Your initial allocation can be one big chunk divided into smaller parts by calculating new pointer addresses.

Share this post


Link to post
Share on other sites
Dawoodoz,
you make my brain melt down.
Since i am a beginner i need to chew this one step a time so please try
and explain it to me.

1. Are you talking about some custom linked list container that you've made?

2.
Quote:

For fixed size allocation...

known number of elements in front?
something like vector::reserve but for list?

Quote:

reuse chunks of memory by inserting garbage memory into a linked list of pointers.

what is chunk? single element in a list?
what is garbage memory? list filled with (for example) same dummy element?

3.
Quote:

The pointer to the next chunk will have a special place in each allocation...

i don't get this, what this means?

And rest of text also is confusing for me.

Share this post


Link to post
Share on other sites
Collections of pointers can be dynamic arrays with padding that you usually don't need to reallocate and reallocating an array of pointers is very fast when the memory runs out.
The pointer collections can refer to fixed size chunks that efficiently can be reused by your garbage list.

Each type of struct that you use with the system will have a void* pointer as it's first element that we call "next".
The list is a pointer to NULL if empty or the first struct otherwise.
Whenever you want to release an allocation, the "next" pointer will point to the old head of the list and the new head of the list will be a pointer to your "released" memory allocation. Therefor you don't need another allocation for the link list because it borrows memory from the items that you throw away.

A chunk is what I call a part of the memory without any datatype assigned to it.
When you allocate memory, you get a pointer to the beginning of the interval in the linear memory address space.

Here is good links with the things that you need to learn:
http://www.openismus.com/documents/cplusplus/cpointers.shtml
http://www.macs.hw.ac.uk/~rjp/Coursewww/Cwww/linklist.html

Share this post


Link to post
Share on other sites
no need for the first link, i have already got that basic knowledge about pointers.
i have never tried to make my own linked list container but instead i use std::list one so i have no idea how this "system" manipulates elements. To me its just a "black box".
i will try to make my own tomorrow based on your second link (and ill try to find some others for reference) and your "ideas".
if i stuck i hope you don't mind continue helping me?
thanks greatly.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this