• Create Account

FREE SOFTWARE GIVEAWAY

We have 4 x Pro Licences (valued at \$59 each) for 2d modular animation software Spriter to give away in this Thursday's GDNet Direct email newsletter.

Read more in this forum topic or make sure you're signed up (from the right-hand sidebar on the homepage) and read Thursday's newsletter to get in the running!

# DOD and memory layout

Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

32 replies to this topic

### #21Antheus  Members   -  Reputation: 2397

Like
1Likes
Like

Posted 24 January 2012 - 01:01 PM

This is a very basic example. Is this data oriented? It is already faster than a traditionnal oo approach but I have to call the processing function for each set of entities.

It's the basic idea. But instead of clumping everything, just use what you need. Position is useful for many things, but it might not be needed for rendering. So be prepared to change what belongs together.

Orcs allOrcs = Orcs();

Think beyond. There is no need to have just orcs. Anything that is subject to movement or physics goes into here. Or perhaps everything that has a position in the world. Only thing all these have in common is position. Or vice versa - if it has position, it's in that array.

There might be more if it's convenient, but position of an orc is no different than that of an elf or a cannon ball. Even when combined with speed, physics are the same for all. What it then comes down to is speed changes, perhaps the acceleration values. Then above might be extended as something like this, in simplest form:
Vector2D * positions;
Vector2D * speeds;
Vector2D * acceleration;
float * mass;
To make orcs move slower, they might have their mass value higher than something like an elf.

And later, things can be clumped together:
struct PhysicsProperties {
float mass, width, strength;
}
...
PhysicsProperties * props;
Here, instead of having individual properties per item, one links to a shared template.

class Orc : .... {
Vector2D position;
float mass, weight, strength;
Vector2D acceleration;
Vector2D ....
...
}
More familiar, but it forces redundant data to be passed all over - renderer does not need physics properties, for example.

### #22Yuukan  Members   -  Reputation: 100

Like
0Likes
Like

Posted 25 January 2012 - 01:54 AM

Like those bitsquid slides say, the method behind DOD is "isolate tasks, find data objects". In this case, "update position" is the task and position and speed are the data that the task needs. Hence they should be together - they are one data object.

Thanks for the answers. So I must do something like this :
struct Position {
int posx;
int posy;
};
struct Orcs {
Position pos[MAX_ORCS];
int speed[MAX_ORCS];
};

Or maybe this way:
template <unsigned int COUNT>
struct Positions {
int posx[COUNT];
int posy[COUNT];
};
struct Orcs {
Positions<MAX_ORCS> pos;
...
};

?

void ComputePosition(Positions* pos, int* speed, unsigned int count);


And then try to merge all properties needed by the same tasks? The Entity will be just an unsigned int (index in the array for speed, positions and so on) or am I wrong?

### #23medv4380  Members   -  Reputation: 98

Like
0Likes
Like

Posted 25 January 2012 - 11:08 AM

Yuukan, I think you have it close.
Your method matches a valid example http://stackoverflow.com/questions/1641580/what-is-data-oriented-design

How you structure the data will depend on what you're doing, and turch does have a point since X and Y are apart of the Position it would be good to keep them together.

You may also want to consider an additional layer of abstraction because if you approach it with what you have you'd end up with a struct for orcs, elves and anything else that would be nearly identical.

### #24turch  Members   -  Reputation: 590

Like
0Likes
Like

Posted 25 January 2012 - 01:10 PM

How you structure the data will depend on what you're doing

Exactly. For the orc example, if all that orcs have to do is update their position with by their speed, the data structure that would make most sense is
struct orc
{
int posx;
int posy;
int speed;
};

struct all_orcs
{
orc[MAX_ORCS]
};


when compiled, the memory acces pattern for the update will look something like this
for (int* ptr; ptr < MAX_ORCS * sizeof(orc); ptr += sizeof(orc)
{
&ptr = &ptr + &(ptr + 2 * sizeof(int));   // first int (posx) = first int + 3rd int (speed)
&(ptr + sizeof(int)) = &(ptr + sizeof(int)) + &(ptr + 2 * sizeof(int));   // second int (posy) = second int + 3rd int
}


which is accessing memory in a very straightforward manner.

However, it is very unlikely that that is all that you want the orcs to be able to do, so you'll have to take into account other operations you want to do on the data.

And as medv4380 said, you probably want to abstract that away behind some sort of component / entity model.

### #25Yuukan  Members   -  Reputation: 100

Like
0Likes
Like

Posted 25 January 2012 - 02:22 PM

So I was right when I said that my "ComputePosition" should be called for every sets of entities?

About the abstraction layer, a PositionProperties struct might be a good idea and all set of entities that need a position will have an array of that struct. Like this:

I don't know how I can merge all Positions into one array and if it's really a great idea, may be the picture above is sufficient.

### #26turch  Members   -  Reputation: 590

Like
0Likes
Like

Posted 25 January 2012 - 03:01 PM

So I was right when I said that my "ComputePosition" should be called for every sets of entities?

Yes, the whole point of DOD is that if you have an operation that you have to perform on all of a certain set of data, you should make the data cache friendly.

I don't know how I can merge all Positions into one array and if it's really a great idea, may be the picture above is sufficient.

That goes back to what I was saying on the first page:

Think of "Position" as a component. An entity (orc, elf, etc) is composed of components (but each one may need different components). For each type of component, you have a <component>Manager. At init, each manager allocates a block of memory big enough for, say 2000 components, as well as some sort of hashed data structure with an ID as the key and the pointer to the component as the value. When you need to create a new entity, go to the manager of each component that the entity needs, and call AllocateNewComponent(unsigned int entityID).

The manager creates a new component in the next free slot and adds the id to the hash table. If there isnt a free block available, the manager allocates a new large block. When something needs a component from a certain entity, it calls <component>Manager.getComponent(ID), which looks it up in the hash table and returns the pointer to the component.

Again, the point of this is making uniform access cache friendly and thus really fast, at the cost of a level of indirection when getting data for any single entity (by having to go through a hash table).

### #27Yuukan  Members   -  Reputation: 100

Like
0Likes
Like

Posted 26 January 2012 - 03:35 AM

Ok so then, how would you write the ComputePosition function? What kind of loop?

### #28turch  Members   -  Reputation: 590

Like
0Likes
Like

Posted 26 January 2012 - 08:19 AM

I'd look at it as a velocity / position update.

struct PhysicsComponent
{
Vector2 position;
Vector2 velocity
};

class PhysicsComponentManager
{
PhysicsComponent* createComponent(unsigned int id);
PhysicsComponent* getComponent(unsigned int id);

void updateComponents(float deltaTime)
{
for each componentBlock in allBlocks
{
for each component in componentBlock
{
component.position += deltaTime * component.velocity;
}
}

List<componentBlock> allBlocks;
};


### #29Yuukan  Members   -  Reputation: 100

Like
0Likes
Like

Posted 26 January 2012 - 11:01 AM

What do you mean by componentBlock?

### #30turch  Members   -  Reputation: 590

Like
0Likes
Like

Posted 26 January 2012 - 11:18 AM

Sorry, that goes back to what I was saying on the first page.

A componentBlock is a chunk of preallocated memory that holds components. When a componentBlock runs out of room for components, a new one is allocated. So a component manager will have one or more componentBlocks, each of which will have many components.

### #31achild  Crossbones+   -  Reputation: 1944

Like
0Likes
Like

Posted 26 January 2012 - 11:34 AM

OP has shown using SoA, and from two other posters I've seen the answer is to better use AoS. Now, granted that the idea is to clump data together, don't these other non-desktop-pc devices still typically have more than 1 cache line? As well as some sort of SIMD instructions?

Why is the advice for Data-Oriented-Design to use AoS then, assuming (incorrectly maybe?) that the above is true?

### #32turch  Members   -  Reputation: 590

Like
0Likes
Like

Posted 26 January 2012 - 11:46 AM

I'm not sure about non-pc devices, but pc's map cache lines to memory in such a way that multiple areas of memory are mapped to the same cache line. If you have an SoA with say 2000 elements in each array, the memory containing positions[ id ] and speeds[ id ] could map to the same cache line, meaning each operation could potentially cause 2 cache misses each time.

Contiguous areas of memory, however, are going to be mapped to different cache lines, so AoS provides a much smoother memory usage pattern, especially when the processor preloads memory into cache. While working on cache line 1, cache line 2 will be loading and so on.

### #33Yuukan  Members   -  Reputation: 100

Like
0Likes
Like

Posted 26 January 2012 - 01:04 PM

I'd look at it as a velocity / position update.

What would you do if you need the position data in another component with this system?

Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

PARTNERS