# need help with data oriented design

This topic is 2056 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Recommended Posts

so im still waiting for that eureka moment where i can fully implement what ive learnt about DOD. my problem is although i can understand how to change the collection class into a class containing a collection, i just cant figure out how different collections can interact with each other. say i have a class of balls and a class of bricks and a bat class, in what sort of relationship can their classes have whilst being cache friendly?

please assume ive googled and have been reading/programming for 3 days straight on this topic. Edited by cool mr croc

##### Share on other sites
Hmm, I've read a little about the topic myself. I don't think I can answer you flat-out, but maybe my take will give a more concrete basis for discussion.

Assuming you have many bricks, many balls and only one bat, you might have something like the following:
 class Balls { vector<Point> pos; vector<Color> color; vector<double> radius; }; class Bricks { vector<Point> pos; vector<Color> color; vector<double> width; }; class Bat { Point pos; double width; }; 

To check for collisions with the bat, you could loop through the pos vector for all Balls and compare with the Bat pos, thereby keeping the cache full of Ball positions. Checking for collisions between the Balls and the Bricks... seems trickier to do efficiently unless you assume the number of Balls is much less than the number of Bricks. You could loop through the Balls pos vector and compare to all Bricks pos vectors, thereby keeping the cache filled with Brick positions. The Balls may lose out cache-wise, but perhaps not a big deal.

From what I gather you have to really work hard at getting your data structure right, because changing rules and/or refactoring is hell.

##### Share on other sites
just seeing it laid out like that is a big help, thanks. cant radius and width be removed from the top 2 classes though, as its the same for all entities? and if so could it be held in an entitiesmanager? something like

 enum entityname { e_brick = 0; e_ ball, } class entitismanager { vector<dimension> the dimensions; void somefunction() { //access vector like thedimensions[e_brick]... } } 

##### Share on other sites
i suppose it could be held like

 class Balls { vector<Point> pos; vector<Color> color; vector<double> radius; }; 

another question, if the balls struct ends up with the same datatypes as bricks, is it ok to call it entities and have 2 instances in the manager class? i assume it is

##### Share on other sites
Could you maybe define the term "relationship" some more in this context? What kind of interactions should your data have with eachother?
I've always approached Data-oriented design with some care, since it's really easy to work yourself into some nasty problems when trying to convert existing OO code to DO code. Same goes for approaching DO with an OO mindset, and these days a large amount of programmers have the idea of objects nested deeply in their minds.

I've always thought it easier to first of all take a piece of paper and lay out the steps of how to solve a problem in an OO design. After that you can start switching around steps or altering parts of your implementation on paper until you have a system which centers around data instead of code.

You should also consider whether it will actually give you any real benefits to abandon your trusted paradigm in favor of DOD? Are you seeing such a problem with your cache usage that there's no other option than switching over to DOD? From what I've read you'll gain the most from DOD when developing for consoles where you'll meet more hardware limitations

##### Share on other sites
i have no problem with my current projects, im just trying to learn for when im next on a console and im asked to use DOD.

the relationships between the entities are pretty much just collision.

##### Share on other sites
Data oriented design does not prohibit you from using any form of spacial partitioning, so the approach for doing collision would be the same as you would do in any other paradigm.

I had my "eureka-moment" with DOD when it was described to me as being like how you would design a particle system. You have a pool of data which is run through the appropriate chain of systems to get a desired output, just like how you would emit and update particles.

##### Share on other sites
yes i saw an example on here of that, with code.

http://www.gamedev.net/topic/575076-what-is-data-oriented-programming/page__st__20

but it assumes all particles are the same. i cant seem to get my head around how to handle it if the particles are different. perhaps just other vectors holding the differing values, velocity, colour etc?

##### Share on other sites
This is an issue of proper design and layout of data. In a well-designed particle system you shouldn't try to pool together a bunch of different types of particles each with a differing data structure, you should try to design a particle in such a way that it can be used to build many different outcomes while retaining the same base data structure.

##### Share on other sites
The point of DOD is to sort your data so that it's as consistent for as long as it can. If you need to switch states between entities all the time while looping, you're not doing it right. If the bricks and balls differ enough to require an "if ball else if brick" statement, you should definitely split them up and run separately. Think of them as two sets of unrelated particles.

For collisions, you take the required data (width and maybe position) of the objects, rather than the full objects, put them into a lean vector or two and rip through it in a data consistent manner. This opposed to loading a vector with the full fat objects draining cache and having to jump over the data in them that you don't touch anyway.

Where the data differ, you make separate passes. For example, all ball - ball collisions makes one set, all ball - brick in one set, and then run specialized functions for each. Edited by Zoomulator

##### Share on other sites

This is an issue of proper design and layout of data. In a well-designed particle system you shouldn't try to pool together a bunch of different types of particles each with a differing data structure, you should try to design a particle in such a way that it can be used to build many different outcomes while retaining the same base data structure.
i think what i dont get is how can you encapsulate particles within a particle manager without pooling together a bunch of different types.

The point of DOD is to sort your data so that it's as consistent for as long as it can. If you need to switch states between entities all the time while looping, you're not doing it right. If the bricks and balls differ enough to require an "if ball else if brick" statement, you should definitely split them up and run separately. Think of them as two sets of unrelated particles.

For collisions, you take the required data (width and maybe position) of the objects, rather than the full objects, put them into a lean vector or two and rip through it in a data consistent manner. This opposed to loading a vector with the full fat objects draining cache and having to jump over the data in them that you don't touch anyway.

Where the data differ, you make separate passes. For example, all ball - ball collisions makes one set, all ball - brick in one set, and then run specialized functions for each.
i think this is what i do get, maybe i just need to practise a lot.

##### Share on other sites

i think what i dont get is how can you encapsulate particles within a particle manager without pooling together a bunch of different types.

The type shouldn't matter, it's the data they contain. You group the same kind of data of the particles together, no matter what the particles actual type is, excluding what ever data they don't have in common for the task at hand. A bicycle and an ant aren't the same type, but they both can have data relating to position and speed, so you sort only the data relating to that into the array to be handled. Sorry if I might be repeating myself..

Getting into the mindset of sorting the data like this is really hard, and only pays off if you have huge sets of data that actually can be sorted in a continuous way. If you can't find an obvious way to sort something, then don't waste your time doing it. As time goes, you'll see more things that fits with DOD.

A suggestion to get past the OO mindset could be to skip C++ for a while and use strict C instead. This will force you to think in sets of data rather than data objects and strip away the distractions of classes. C++ has the great benefit that it can swap between OO and procedural, which a lot of C++ programmers tend to forget. Edited by Zoomulator

##### Share on other sites
Hmm, I did think about combining the ball and brick types together for movement, e.g. each has a position and velocity (0 for bricks), then loop through all and move and check for collisions. You could keep the vector sorted by x co-ordinate for easy collision detection. However I don't think the payoff would be enough because it would be hard to match a brick/ball up with it's other attributes, e.g. color etc. It's easy when pos[1] matches with color[1].

##### Share on other sites

[quote name='cool mr croc' timestamp='1338895200' post='4946392']
i think what i dont get is how can you encapsulate particles within a particle manager without pooling together a bunch of different types.

The type shouldn't matter, it's the data they contain. You group the same kind of data of the particles together, no matter what the particles actual type is, excluding what ever data they don't have in common for the task at hand. A bicycle and an ant aren't the same type, but they both can have data relating to position and speed, so you sort only the data relating to that into the array to be handled. Sorry if I might be repeating myself..
[/quote] with your example of bike and ant, would there be a struct containing a vector of speed and one of position? and what would you do with the data they have thats different. say bike had a seat object and ant has a head object. are they just left in the manager class or are they encapsulated in some way as well?

##### Share on other sites

with your example of bike and ant, would there be a struct containing a vector of speed and one of position? and what would you do with the data they have thats different. say bike had a seat object and ant has a head object. are they just left in the manager class or are they encapsulated in some way as well?

The best way may be to keep all objects' values in global vectors. So that both the ant and the bike stores its position in a long position vector. A position is a position, no matter the type, no matter if it has a seat or head.

Now, just having a position might not give you much to relate to so it's not a great example. What you might want though, is if you've got a position and a velocity for your objects, they all go in a respective "global" vector. The function that calculates the new position takes these two vectors, runs through them in one fast linear loop and outputs a new vector with the new positions.

That is sorting data. Instead of going through each of your objects update routine which will change animation, do some health calculation and then calculate its new position, you rip out the calculation of the new position and put it where all those calculations can be called and them alone in a single pass.

You could do this with your OO objects, but that would include a lot more indirection than running through a vector that has only the relevant data lined up, resulting in cache misses all over the place. Edited by Zoomulator

##### Share on other sites
something like this?

 struct velAndPos { Vector2D velocity; Vector2D position; }; vector<velAndPos>bikePositions; vector<velAndPos>antPositions; vector<bool> bikeCollisions; vector<bool> antCollisions; class someKindOfBikeyAntGame { ... void updatePositions(velAndPos* theVelandPos, int size) { for(int i=0;i<size;i++) theVelandPos.velocity+=theVelandPos.velocity; } void collisionDetect(velAndPos* theVelandPos1, int size, velAndPos* theVelandPos2, int size bool* storecollision1, bool* storecollision2) { //loop through, chech collisions, store in respective bool vectors } void update() { updatePositions(&bikePositions[0], bikePositions.size()); updatePositions(&antPositions[0], antPositions.size()); collisionDetect(&bikePositions[0], bikePositions.size(), &antPositions[0], antPositions.size() &bikeCollisions[0], &antCollisions[0]); } }  Edited by cool mr croc

##### Share on other sites
Have you ever studied the first three normal forms of relational database design? They will help you to clearly see the one-to-one, one-to-many, and many-to-many relationships that exist in all kinds of systems. I recommend it, even if you never plan on writing a single line of SQL in your life, because -- as zen and vague as it may sound -- everything is connected.

http://w3schools.in/...e-Normalization Edited by taby

##### Share on other sites
Not quite, but closer.. my point is that bike and ant positions are just positions in a very general sense, so they can be in the same vector.

 // These functions doesn't have to be members of a class, since they only operate on the input given. void updatePositions( vector<Vector2D>& position /*is altered*/, const vector<Vector2D>& velocity ) { assert( position.size() == velocity.size() ); // If they're not the same size at this point, something is probably not right. size_t size = position.size(); for( size_t i = 0; i < size; ++i ) position += velocity; } // Since position and velocity exist in different vectors, no unnecessary data is passed to this collision function. void collisionDetect( const vector<Vector2D>& position, vector<bool> collisionResult /*is altered*/ ) { assert( position.size() == collisionResult.size() ); // A bit of brute force collision detection. size_t size = position.size(); for( size_t i = 0; i < size; ++i ) { for( size_t j = i+1; j < size; ++j ) // Check all following positions. collision = distance( position, position[j] ) < 5; // Collides if distance is less than '5' } } class AntsAndBicycleGame { void update() { updatePositions( positions, velocities ); collisionDetect( positions, collisions ); } vector<Vector2D> positions; // Positions for all instances of all types vector<Vector2D> velocities; vector<bool> collisions; }; 

Unless you actually want to handle the bicycles' positions in a different way than the ants' positions, then you should put them in separate vectors since that's sorting it for different purposes. Edited by Zoomulator

##### Share on other sites
i understand wat you mean now. i hadnt done anything like that as i wanted an explicit return on the collision bools, but i suppose i can just keep track of the order after the function returns.

##### Share on other sites
Excellent!
There's a bit more manual tracking, that's true. One of the downsides of DOD is the extra foresight required when designing the code for it.

Glad I could help ( and be sure to use the rep button <-- ;) )

##### Share on other sites
thanks, its helped alot. how about encapsulating further entity data? i assume its ok to create a little struct that contains all of that?

##### Share on other sites
I tend to think of DOD in a Component/Entity manner, where the Components are just pieces of Data, and an Entity is a collection of Components, and there are systems, which act on the components. So, the Brick would have a Location Component (Position and Velocity), a Color Component, a RectDimensions, and a Strength Component (how hard it is to smash), and it would be processed via the Brick System (for how to handle collisions). The ball entity would have the same components, except a Circle Component instead of Rectangle, and it would be processed by the Ball System, and the Paddle would have all those components, except it would be processed by the Paddle System. The paddle could have extra components as well, such a bullets, if it is able to shoot bullets (like Arkanoid did).

However, all these components are stored in like vectors, and are handled at the same time, but run through the different systems, based on which entity owns that component.

kind of like this (pseudo code):
 class System { virutal void HandleLocation(Location &locationComponent) = 0; // fill with rest }; class BrickSystem : public System { void HandleLocation(Location &locationComponent); }; typedef uint32_t TEntity; class Component { TEntity Owner; // who owns us, unique value System *ComponentSystem; // System to handle this component }; class Location : public Component { Vector2 Location; Vector2 Velocity; }; std::vector<Location> LocationList; BrickSystem *BrickSys; void main() { // Initialize the systems BrickSys = new BrickSystem(); // Load the bricks for (each brick to load) { TEntity BrickEntity = GetUniqueId(); Location BrickLocation(BrickEntity, CurrentBrickX, CurrentBrickY, 0.0f, 0.0f); BrickLocation.SetSystem(BrickSys); LocationList.push_back(Location); // Load other components into brick } } void Update() { for (each it = item in LocationList) { // Will call brick System handleLocaiton for brick's, BallSystem Handle Location for ball, etc it->GetSystem()->HandleLocation(it); } } 

Or something like that. Edited by BeerNutts

##### Share on other sites
thanks mate thats looks very interesting. ill have a look once i finish this program, ill probably have a que or two.

##### Share on other sites
one more question and ill be donw, consider the following code

 class application { vector<pos> ballPos> vector<pos> batPos> vector<pos> brickPos> vector<bool> ballCol> vector<bool> batCol> vector<bool> brickCol> vector<vel> ballVel> vector<vel> batVel> vector<vel> brickVel> int somethingOnlybatHas1; int somethingOnlybatHas2; int somethingOnlybatHas2; int somethingOnlyBricksSystemHas; void update() { //loop through collections and call check collisions and resize } void checkCollision(//pos vec1, pos vec2, col result1, col result2) { //check collision, store result } void resize() { } }; 

is there a better way to encapsulate this so the variables unique to each system is held within its class? i was planning on having system classes each containing their containers. and perhaps with their own update function. does that sound good?