Shyr

OOP and DOD


Recently, I have been studying various design patterns. While they all seem very interesting and useful in different situations, I think I've gotten a little confused about structuring my code. My code never feels clean enough, so I spend a lot of time refactoring and thinking of better ways to simplify it so it's easier to read later.

 

I would like to hear your opinions of Object-Oriented Programming and Data-Oriented Design. I'm not really asking whether I should avoid one or the other, but rather when and how I should use OOP and DOD efficiently.

 

OOP as I mean it here refers to classes with an interface, attributes, and methods.

DOD as I mean it here refers to structures with external functions that use them.
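To make that distinction concrete, here's a minimal sketch of what I mean (the names are made up purely for illustration):

#include <vector>

// "OOP" in the sense above: data and behaviour bundled behind an interface.
class PlayerObject {
public:
    void move(float dx, float dy) { x_ += dx; y_ += dy; }
private:
    float x_ = 0.0f, y_ = 0.0f;
};

// "DOD" in the sense above: plain data, operated on by external functions,
// usually a whole batch at a time.
struct PlayerData {
    float x = 0.0f, y = 0.0f;
};

void movePlayers(std::vector<PlayerData>& players, float dx, float dy) {
    for (auto& p : players) {
        p.x += dx;
        p.y += dy;
    }
}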


OOP and DOD are two very useful paradigms among hundreds, possibly thousands, that developers can pull out as appropriate.

 

Although religious wars have been fought over nuance, object oriented effectively means that you have objects and the objects have a series of functions/methods to perform a cluster of tasks.  

 

Similarly subject to religious wars, data oriented effectively means that you are organizing data in a way that is friendly to the cache or other underlying hardware.

 

As the programmer you can use one, the other, or both, as you see fit in your code. You can also use flow based paradigms, event driven paradigms, parallel paradigms, and more, in whatever ways you see fit in your code.


 

 

I would like to hear your opinions of Object-Oriented Programming and Data-Oriented Design. I'm not really asking whether I should avoid one or the other, but rather when and how I should use OOP and DOD efficiently.

 

 

Ok I'm trying to think of a way to answer this without getting into several paragraphs, starting a religious war, or pointlessly confusing you. 

 

The thing to understand is that OOD/DOD are about organizing data within your game/app and how that data is transformed from one state to another.  At the end of the day both will get you where you're going: getting your game level from one frame to the next.  If you're trying to figure out when to use OOD versus DOD, I'd consider these three main things:

 

1) Is your problem mainly about optimization, about running a simulation as fast as possible?  If so, DOD might be the best way for you to go.

 

2) Do you have smaller numbers of objects that interact in complex ways with each other during an update?  OOP is probably better for you.  But,

Do you have large numbers of objects that don't interact with each other?  DOD is probably better and faster.

 

and

 

3) Is multi-threading a consideration?  In this case DOD might be better for you.

In my opinion, and I'm assuming we're talking about high performance software development and C++ (since you've tagged the thread with this language), use DOD whenever possible and OOP when forced to. Even though I'm not sure if DOD has been formally and completely defined, what comes to mind technically is that it helps us tackle a couple of problems with OOP:

1. Inheritance abuse (including the CPU costs of virtual function calls, although generally that is an optimization).

2. Cache wastage through composition abuse and inheritance.

3. Destructors, constructors, member functions, member operator overloading, etc., leading to writing more functional code instead of OOP.

Technically, as has been stated before, the main result that you get from this is more POD and fewer objects, sometimes automagically achieving better memory usage. Ultimately, you want to balance these things so that your only reason to use the (few) advantages of OOP is convenience.
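A rough sketch of what points 1 and 2 look like in practice (hypothetical types, not from any real codebase): the OOP version pays an indirect call per heap-allocated object, while the POD version is one plain loop over a contiguous array.

#include <cstddef>
#include <memory>
#include <vector>

// OOP style: per-object virtual dispatch, with each object potentially in
// its own heap allocation scattered around memory.
struct Entity {
    virtual ~Entity() = default;
    virtual void update(float dt) = 0;
};

struct Monster : Entity {
    float x = 0.0f, vx = 1.0f;
    void update(float dt) override { x += vx * dt; }
};

void updateAll(std::vector<std::unique_ptr<Entity>>& entities, float dt) {
    for (auto& e : entities)
        e->update(dt);                      // one virtual call per object
}

// DOD style: plain data in one contiguous array, one non-virtual function
// that walks it sequentially.
struct MonsterData {
    float x = 0.0f, vx = 1.0f;
};

void updateMonsters(MonsterData* m, std::size_t count, float dt) {
    for (std::size_t i = 0; i < count; ++i)
        m[i].x += m[i].vx * dt;
}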

 

In my opinion, and I'm assuming we're talking about high performance software development and C++ (since you've tagged the thread with this language), use DOD whenever possible ...

 

Let me expand a bit on this -- DOD is really the art of wringing utmost performance from a set of hardware that has specific, real-world characteristics -- a machine has a word size, a cache-line size, and multiple levels of cache, all with different characteristics and sizes; it has main memory, disk drives, and network interfaces, all of which have specific bandwidths and latencies measurable in real, wall-clock time. Furthermore, it has an MMU and DMA engines, and it has peripheral devices that require or prefer that the memory objects used to communicate with them appear in a certain format (e.g. compressed textures, encoded audio). Because of the already large -- and still growing -- disparity between memory access speed and CPU instruction throughput, it has been a lesser-known truth for some time that memory-access patterns, not CPU throughput or algorithmic complexity, are the first-order consideration for writing performant programs. No fast CPU or clever algorithm can make up for poor memory access patterns on today's machines (this was not the case earlier in computing history, when memory access speeds and CPU throughput were not so mismatched; I would estimate it has been the case since around the time of the original Pentium CPU, but it hadn't become visible to more mainstream programmers until probably 10 years ago, or less).
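If you want to see this effect for yourself, here's a toy sketch (nothing rigorous, just an illustration): it sums the same 2D array twice, once in row-major order and once column-major. The arithmetic is identical; only the memory access pattern differs, and on most machines the strided version is noticeably slower.

#include <chrono>
#include <cstddef>
#include <cstdio>
#include <vector>

int main() {
    const std::size_t N = 4096;
    std::vector<float> grid(N * N, 1.0f);      // row-major: grid[row * N + col]

    auto time_it = [](const char* label, auto&& fn) {
        auto t0 = std::chrono::steady_clock::now();
        float sum = fn();
        auto t1 = std::chrono::steady_clock::now();
        auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count();
        std::printf("%s: sum=%.0f, %lld ms\n", label, sum, (long long)ms);
    };

    // Sequential walk: touches memory in address order, so the cache and
    // prefetcher do their job.
    time_it("row-major   ", [&] {
        float s = 0.0f;
        for (std::size_t r = 0; r < N; ++r)
            for (std::size_t c = 0; c < N; ++c)
                s += grid[r * N + c];
        return s;
    });

    // Strided walk: same arithmetic, but every access jumps N floats ahead,
    // so most of them miss the cache.
    time_it("column-major", [&] {
        float s = 0.0f;
        for (std::size_t c = 0; c < N; ++c)
            for (std::size_t r = 0; r < N; ++r)
                s += grid[r * N + c];
        return s;
    });
}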

 

If performance is critical, DOD is the only reasonable starting point today. Period. End of Story.

 

But one must have a reasonable grasp of where performance is critical -- it would be unwise to program every part of your program at every level as if DOD is necessary or desirable, in the same way that writing the entirety of your program in Assembly language would be -- in theory, you might end up with the most efficient program possible, but in practice you'll have put an order of magnitude more effort into a lot of code that never needed that level of attention to do an adequate job, and you'll have obfuscated solutions to problems where other methods lend themselves naturally. For instance, UI components would gain nothing by adopting DOD, yet a DOD solution would likely give up OOP approaches that fit the problem so naturally that UI widgets are canonical example fodder when teaching OOP.

 

 

 

... and OOP when forced to. Even though I'm not sure if DOD has been formally and completely defined, what comes to mind technically is that it helps us tackle a couple of problems with OOP:

1. Inheritance abuse (including the CPU costs of virtual function calls, although generally that is an optimization).

2. Cache wastage through composition abuse and inheritance.

3. Destructors, constructors, member functions, member operator overloading, etc., leading to writing more functional code instead of OOP.

Technically, as has been stated before, the main result that you get from this is more POD and fewer objects, sometimes automagically achieving better memory usage. Ultimately, you want to balance these things so that your only reason to use the (few) advantages of OOP is convenience.

 

Yet, it's important to maintain awareness that OOP and DOD are not necessarily at odds. You can't, for example, answer the question "what's DOD?" with "Not OOP." Whatever programming paradigm(s) you choose to adopt, it's prudent to select and leverage what features it can offer in service of DOD, for the parts of your program that adopt DOD. It might not be possible to write a DOD solution that looks exactly like a typical OOP solution, but it's very possible to write a DOD solution that looks *more like* a typical OOP solution than like a typical Procedural solution. Again, DOD is (and must be) prime where you have deemed performance to be critical, but there are no language features or programming paradigms that it forbids; like all things in engineering, there must always be a considered balance of competing needs.


 

But one must have a reasonable grasp of where performance is critical -- it would be unwise to program every part of your program at every level as if DOD is necessary or desirable, in the same way that writing the entirety of your program in Assembly language would be -- in theory, you might end up with the most efficient program possible, but in practice you'll have put an order of magnitude more effort into a lot of code that never needed that level of attention to do an adequate job, and you'll have obfuscated solutions to problems where other methods lend themselves naturally. For instance, UI components would gain nothing by adopting DOD, yet a DOD solution would likely give up OOP approaches that fit the problem so naturally that UI widgets are canonical example fodder when teaching OOP.

 

QFT. One of the reasons that DOD is still relatively unknown outside of certain areas (game programming, high-performance computing) is because it solves a very specific problem: memory access bound performance.

 

But for the vast majority (IMHO) of software written today, CPU/memory bound performance is an order of magnitude less important than the much lower bandwidth issues (data access, network access, etc).

 

Most "typical" business software spends its time waiting for database queries or REST APIs to complete. If DOD improves your algorithm even by 1000%, that's not going to help much if the total time spent waiting for process x to complete is 90% dependent on a high latency process.  

 

I'm not arguing against DOD; it's very good for its intended purpose. I'm simply attempting to explain why it's not more widely known.


Thanks so much everyone. From what I gathered it seems that DOD, or functional programming, is faster than OOP but OOP is better for more complex behaviors. I often hear that using inheritance, abstract classes, polymorphism and such is slower due to v-table searches, but does that cause significant reduction of performance (to the point where say a player would notice)?

 

I like the idea of using abstract data and objects in C++ because it seems like a good way to organize code. On the other hand, I don't want to make a code base that is terribly inefficient.


Thanks so much everyone. From what I gathered it seems that DOD, or functional programming,


Not quite. Again, you have your terms mixed up. "Functional programming" is something else entirely.

I would just call it DoD. As far as I know, it doesn't have any synonym. Its more vocal proponents might just call it "engineering," but not everyone would agree with that claim. Edited by Oberon_Command


Thanks so much everyone. From what I gathered it seems that DOD, or functional programming, is faster than OOP but OOP is better for more complex behaviors. I often hear that using inheritance, abstract classes, polymorphism and such is slower due to v-table searches, but does that cause significant reduction of performance (to the point where say a player would notice)?

 

I like the idea of using abstract data and objects in C++ because it seems like a good way to organize code. On the other hand, I don't want to make a code base that is terribly inefficient.

 

We can't tell you which would be better or faster because we don't know what your code will do or what the data will look like.  My recommendation is to arrange your code in whatever way makes the most sense to you.  While you're doing that you will probably learn a lot, and as a result when you're done you will probably think "if I were to do it again I'd do it all differently"... which is a normal thing.  But you can't really jump to that without first learning what you learned the first time around.

 

Of course that's not to say that you can't learn more about how the hardware works before you start coding.  DOD is all about arranging data so that your code will operate on it optimally given the specific ways that the hardware works.  So, I'd recommend learning more about how the memory caches and the prefetcher work.  But then again, I kind of think that you're worrying too much about optimizations.  Just because your code follows DOD instead of OOD doesn't mean it will be any faster, any easier to write or maintain, or any more robust.  In fact, if you blindly try to go that route you might end up with just the opposite.


One of the reasons that DOD is still relatively unknown outside of certain areas (game programming, high-performance computing) is because it solves a very specific problem: memory access bound performance.

 

But for the vast majority (IMHO) of software written today, CPU/memory bound performance is an order of magnitude less important than the much lower bandwidth issues (data access, network access, etc).

 

Most "typical" business software spends its time waiting for database queries or REST APIs to complete. If DOD improves your algorithm even by 1000%, that's not going to help much if the total time spent waiting for process x to complete is 90% dependent on a high latency process.  

 

I'm not arguing against DOD; it's very good for its intended purpose. I'm simply attempting to explain why it's not more widely known.

 

But IO-bound software struggling between external systems and memory requires the same optimization techniques and principles as software struggling between memory and CPU: read data sequentially or at least in large pages, read data only once (if possible, never), organize data so you don't waste bandwidth reading unneeded data because it's interspersed with data you need, don't waste time and/or space storing redundant information that can be recomputed inexpensively from a small working set of data, and so on.

Seemingly different techniques, like normalizing the structure of a relational database and organizing data structures in memory as "structures of arrays", actually serve the same purpose in the same way.

Edited by LorenzoGatti


1) Is your problem mainly about optimization, about running a simulation as fast as possible?  If so, DOD might be the best way for you to go.

 

 

 

2) Do you have smaller numbers of objects that interact in complex ways with each other during an update?  OOP is probably better for you.  But,

Do you have large numbers of objects that don't interact with each other?  DOD is probably better and faster.

 

 

...a couple of problems with OOP:

1. Inheritance abuse (including the CPU costs of virtual function calls, although generally that is an optimization).

2. Cache wastage through composition abuse and inheritance.

3. Destructors, constructors, member functions, member operator overloading, etc., leading to writing more functional code instead of OOP.

 

... Let me expand a bit on this -- DOD is really the art of wringing utmost performance from a set of hardware that has specific, real-world characteristics -- a machine has a word size, a cache-line size, and multiple levels of cache, all with different characteristics and sizes; it has main memory, disk drives, and network interfaces, all of which have specific bandwidths and latencies measurable in real, wall-clock time. ...

If performance is critical, DOD is the only reasonable starting point today. Period. End of Story ...

 

 

 

 From what I gathered it seems that DOD, or functional programming, is faster than OOP but OOP is better for more complex behaviors.

 

I'm not sure how you gathered that ....

 

Data Oriented programming and Object Oriented programming are not at odds and are not necessarily faster or slower. ...

 

OOP is clusters of behaviors around a blob of data. ...

 

DOD means designing around smooth flow of data. General guidance is to have long, continuous strands of data that flow through the cache and processing around a predictable stride.

 

 

I think I'm even more confused now than I was before. People, and even you, are saying that DOD is more efficient, but at the same time you're saying that it's not? It is but it isn't? :(

 

In my studies, it was said that computationally speaking OOP has a larger footprint because data is spread throughout memory rather than being stored in one place. It would be more resource intensive to perform searches on that data than a DOD approach would lend itself to. Was that wrong? You mention that DOD is an approach that prefers continuous strands of data, which would be easier and quicker to handle than v-table searches. I don't understand. Is it like using a std::vector (contiguous memory, faster) versus a std::list (not contiguous, slower)?

 

If it won't cause a noticeable slowdown in a 2D game, I'd like to use OOP. At the same time, I want to develop for mobile, where performance and resource usage appear to be quite important. :) It might be better to share examples of specific subsystems in a typical game that may work better with OOP or DOD. For example, I heard that physics simulation might work better with DOD.


I think I'm even more confused now than I was before. People, and even you, are saying that DOD is more efficient, but at the same time you're saying that it's not? It is but it isn't? :(


It seems like you're looking for one of us to state that one of these is always faster. That is not the case - we would be spouting dogma if we said that. It really depends on what you're trying to do, what your data is (same thing), and where your latencies are. DoD can be and usually is more efficient, but it doesn't always result in more efficient code, since not all algorithms can be arranged to operate on streams of contiguous data.

You should also stop thinking of DoD and OOD as opposites. Sometimes the data layout in a DoD solution is equivalent to the one in the OOP solution, meaning that the DoD and OOP solutions are the same thing, for one thing. Remember, OOP is fundamentally a modelling tool that focuses on maintaining state invariants. You can use OOP to implement a data-oriented design, if not on the usual level of granularity of typical OOD.
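For instance, a hypothetical sketch like this keeps an OOP-style interface while the layout underneath is data-oriented (one contiguous array per field, with methods that operate on whole batches):

#include <cstddef>
#include <vector>

class ParticleSystem {
public:
    void spawn(float x, float y, float vx, float vy) {
        xs_.push_back(x);   ys_.push_back(y);
        vxs_.push_back(vx); vys_.push_back(vy);
    }

    void update(float dt) {                  // one call updates every particle
        const std::size_t n = xs_.size();
        for (std::size_t i = 0; i < n; ++i) {
            xs_[i] += vxs_[i] * dt;
            ys_[i] += vys_[i] * dt;
        }
    }

    std::size_t size() const { return xs_.size(); }

private:
    // Structure-of-arrays storage: each field is contiguous in memory.
    std::vector<float> xs_, ys_, vxs_, vys_;
};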
 

Is it like using a std::vector (contiguous memory, faster) versus a std::list (not contiguous, slower)?


That's a good example, yes. But I am reluctant to claim that every data structure can be represented efficiently with a vector (or several vectors) or something like it. Many or even most can be, but that doesn't mean there aren't cases where you need to do something different. Same thing applies to DoD. Edited by Oberon_Command
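If you want to see the vector/list difference for yourself, a quick (and unscientific) sketch like this is enough; the work is identical, and only the memory layout differs:

#include <chrono>
#include <cstdio>
#include <list>
#include <numeric>
#include <vector>

int main() {
    const int n = 5000000;
    std::vector<int> v(n, 1);    // one contiguous block
    std::list<int>   l(n, 1);    // one heap node per element, pointer-chased

    auto bench = [](const char* name, const auto& container) {
        auto t0 = std::chrono::steady_clock::now();
        long long sum = std::accumulate(container.begin(), container.end(), 0LL);
        auto t1 = std::chrono::steady_clock::now();
        auto us = std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
        std::printf("%s: sum=%lld, %lld us\n", name, sum, (long long)us);
    };

    bench("vector", v);
    bench("list  ", l);
}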

I think I'm even more confused now than I was before. People, and even you, are saying that DOD is more efficient, but at the same time you're saying that it's not? It is but it isn't? :(

 

Efficiency in code is a strange thing.  Typically the inefficiencies are not what you expect.  Generally the first time a program is analyzed for performance all kinds of weird stuff shows up. You might discover billions of calls to string comparison, or billions of iterations through some loop deep-down in the system.  

 

It is rare for performance problems to be exactly what you expect them to be, unless you've been doing performance analysis for years, in which case occasionally you can guess right.

 

 

Data oriented design is more efficient because you wrote and tested for that specific thing. You spent time and effort ensuring that the system will efficiently travel across the data.  Perhaps you have half a million particles and you build a particle system; your data oriented design will ensure that the CPU streams quickly and efficiently over those half million particles, taking advantage of cache effects and whatever prefetch commands are available on the system.

 

It takes a great deal of testing to ensure something that claims to be data oriented really is.

 

 

 

In my studies, it was said that computationally speaking OOP has a larger footprint because data is spread throughout memory rather than being stored in one place. It would be more resource intensive to perform searches on that data than a DOD approach would lend itself to. Was that wrong?

 

Yes, that is a poor generalization.  

 

Object oriented programming is based around clusters of objects and operations. There is typically no significant difference in memory footprint. Data can potentially be scattered across memory if that is how the programmer wrote it.  This is not a problem by itself if the operations are naturally scattered.  However, if operations are sequential, and also if memory cannot be linearly traversed, then your code might not take advantage of certain benefits from locality of data. Note there are several conditions involved there.

 

Data oriented development means actively seeking out those conditions and intentionally taking action to ensure when you perform sequential operations the memory is traversed in a hardware-friendly way.  

 

Programmers who follow object oriented rules can completely violate the rules of data oriented development.  They can also closely follow the rules of data oriented development.  The two are unrelated.

 

 

 

You mention that DOD is an approach that prefers continuous strands of data, which would be easier and quicker to handle than v-table searches. I don't understand.

 

It has nothing to do with vtables.  

 

Going back to a particle system example:

 

A naive programmer might make a particle object. They'll give the particle a position, mass, velocity, and several other variables. Then they will create an array of 500,000 particle objects.  When they run the code, they will iterate over each object, calling 500,000 functions. Each function will update the object, then return.  They'll provide many different functions to manipulate the particles, and whenever they work with the system, call each function 500,000 times.

 

A more savvy programmer might make a class of particles. They will create an array of positions, an array of masses, an array of velocities, and a way to set the number of particles, taking care to ensure the processing works on an interval that steps one CPU cache line at a time.  Generally, working within the cache you only pay the cost for the initial load, so if you can fit 4 in the cache at once you pay for one and the other three are effectively free. The programmer can then set the number to 500,000 particles. Then they will make a single function call which will traverse all 500,000 particles at once. They'll provide exactly the same functionality to manipulate the particles as above, but each operation will be completed with a single call that processes all particles, rather than an enormous number of calls that process each particle.
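In code, the two approaches might look roughly like this (a simplified sketch with made-up names, leaving out the cache-line-interval detail):

#include <cstddef>
#include <vector>

// The naive version: one object per particle, one call per particle.
struct Particle {
    float x = 0, y = 0, vx = 0, vy = 0, mass = 1;
    void update(float dt) { x += vx * dt; y += vy * dt; }
};

void updateNaive(std::vector<Particle>& particles, float dt) {
    for (auto& p : particles)
        p.update(dt);                 // 500,000 calls for 500,000 particles
}

// The batched version: one array per field, one call per update, walking
// each array in order so the cache and prefetcher can do their job.
struct Particles {
    std::vector<float> x, y, vx, vy, mass;
};

void updateBatched(Particles& p, float dt) {
    const std::size_t n = p.x.size();
    for (std::size_t i = 0; i < n; ++i) {
        p.x[i] += p.vx[i] * dt;
        p.y[i] += p.vy[i] * dt;
    }
}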

 

An even more savvy programmer might take advantage of SIMD calls on the system to process 4, 8, or more particles at a time rather than processing them individually in a tight loop. Then instead of just having four in the cache and paying for a single cache load, they'll also only pay for a single operation rather than four, giving even faster results.
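That SIMD step might look something like this sketch with SSE intrinsics, assuming the count is a multiple of 4 (a real implementation would also handle the remainder and think about alignment):

#include <cstddef>
#include <immintrin.h>   // SSE intrinsics

// Position update for a structure-of-arrays layout, four floats at a time,
// e.g. updatePositionsSSE(p.x.data(), p.vx.data(), p.x.size(), dt).
void updatePositionsSSE(float* x, const float* vx, std::size_t count, float dt) {
    const __m128 vdt = _mm_set1_ps(dt);
    for (std::size_t i = 0; i < count; i += 4) {
        __m128 px = _mm_loadu_ps(x + i);            // load 4 positions
        __m128 pv = _mm_loadu_ps(vx + i);           // load 4 velocities
        px = _mm_add_ps(px, _mm_mul_ps(pv, vdt));   // pos += vel * dt, x4
        _mm_storeu_ps(x + i, px);                   // store 4 positions
    }
}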

 

 

All three of them have the same effect of updating 500,000 particles. Two of them just took better advantage of their hardware's capabilities.

 

 

Also note that this type of design only works in limited situations.  There needs to be LOTS of things that can be accessed sequentially.  Lots of particles being shifted. Lots of pixels in an image filter.  If there are a small number of items, perhaps only a few hundred or a few thousand, the benefit is probably too small to be worth it.  Also, if the items cannot be arranged sequentially, the design change does not work.

 

 

 

 I want to develop for mobile, where performance and resource usage appear to be quite important.

 

Your concern is admirable but misplaced.  This is something you should not worry about roughly 97% of the time. In the rare 3% or so of cases it will be very obvious that you need to do something about it; you will not miss it by accident.

 

The rules for optimization are:

 

1. Don't worry about it.

2. Advanced:  Don't worry about it yet.

3. (Experts Only): Don't worry about it until after you've carefully measured, then only worry about those specific pieces indicated by measurement.


I think I'm even more confused now than I was before. People, and even you, are saying that DOD is more efficient, but at the same time you're saying that it's not? It is but it isn't? :(

 

You're confused partly because this is a complex topic.  If you have little knowledge about a complex topic and you ask general questions, confusion is usually the result.

 

If you give us more specific examples of what problems you're trying to solve, then we can be more specific about whether a DOD or OOD approach would work better for you.  

 

Other than that I can only repeat what I said above.  You should arrange your code in the way that makes the most sense to you.  Most people typically think of scenegraph objects in an OOD way, and that's a simple and general way to break things up.

Edited by 0r0d


One way to help you grok OOP, functional, and procedural programming is to think of them as spoken languages, say English, German, and Japanese. Each is a method of communication, but each has different words, sentence structures, and intonations.

 

Within a language you have many dialects, which on the surface are similar, but dig deep enough and there are many subtle differences; these are like the various computer languages.

 

Traditionally the languages fall in the paradigms as follows:

 

OO (Objects) - C#, C++, Java

Procedural (Data and functions) - C, Fortran

Functional (Everything is data, including functions) - SQL, ML-based, Haskell, Lisp-based.

 

You can do DOD in all of these paradigms.

 

In recent years the lines between paradigms have blurred as many languages move to a multi-paradigm state, such as Scala, F#, and even C#. Most are a mix between the OO and functional worlds, but their roots and core use remain true to their origins.


You might try thinking of DoD as designing data structures based on the hardware, and OOP as a way to organize code.  As you can see, those two things are not mutually exclusive.  DoD is an optimization method; OOP is a coding style.  And the performance hit of OO vs procedural code is negligible; new() and dispose() are what kill, so you don't new or dispose during runtime (at least not a lot). Instead you allocate everything, run, then release everything. Mobile development limitations (from what I hear) appear to be more related to lack of RAM and high battery consumption rates due to graphic scene complexity.
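A rough sketch of that "allocate everything up front, don't new or delete at runtime" idea (a hypothetical fixed-capacity bullet pool):

#include <cstddef>
#include <vector>

struct Bullet {
    float x = 0, y = 0, vx = 0, vy = 0;
    bool  alive = false;
};

// Every Bullet is allocated once at startup; "spawning" and "despawning"
// just flip a flag, so nothing is new'd or deleted during gameplay.
class BulletPool {
public:
    explicit BulletPool(std::size_t capacity) : bullets_(capacity) {}

    Bullet* spawn() {                        // returns nullptr if the pool is full
        for (auto& b : bullets_)
            if (!b.alive) { b.alive = true; return &b; }
        return nullptr;
    }

    void despawn(Bullet& b) { b.alive = false; }

    void update(float dt) {
        for (auto& b : bullets_)
            if (b.alive) { b.x += b.vx * dt; b.y += b.vy * dt; }
    }

private:
    std::vector<Bullet> bullets_;            // allocated once, reused forever
};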

Yes, it is confusing for a beginning programmer like me. But this is how I see it, based on my observations and the articles I've read: OOP is a way to handle growing complexity as a software codebase grows larger. It is a way to split software engineering work across many programmers and to tackle large-scale software engineering. It is taught in software engineering education, so that is where beginners first come into contact with it. As I did not have that education, OOP is as vague to me as DOD, but I am into C++. As for games, they say start small, like a Pong clone or Space Invaders, in a language that is easier to start in and fully supports OOP, and make a little game that runs everywhere, even in a browser -- like Java or C#.


They say DOD is about optimization, but it is more than that. It is also about hardware knowledge and building a solution that fits the hardware as well as possible, for a software problem that really pushes performance and technology. That last part is the crucial bit that makes DOD relevant or not.

The best example of such a case is where you have a strong need to push the hardware, the hardware is very sensitive to OOP misbehaviour, and your problem involves a lot of things to compute.

So first you need to know whether you need DOD:
1) A platform exclusive, which means you can focus on optimizing for and learning the platform and its architecture in deep detail.
2) A platform that is much more sensitive to OOP's wasteful, bad memory access patterns compared to commonly used CPUs.
3) A software solution that needs to get the most out of it.

For example:
1) A console exclusive!
2) An in-order CPU lacking big caches but with predictable cache latency -- an exotic beast like the PS3's Cell CPU.
3) A triple-A PS3 exclusive.


But it is not only exclusives; there are exceptions, such as cross-platform titles where getting the PS3 up to the level of the other platforms meant investing in a PS3 port team with the same need to delve deep into its architecture. Studios with a dedicated game engine team can do this too.

That is why some of the pro-DOD articles come from developers who worked on PS3 exclusives or advanced ports -- very hardware-demanding cross-platform titles that utilised the PS3 to its fullest, like DICE.

Thus DOD is about high-performance computing and the hardware knowledge to take full advantage of it. It is where advanced software engineering meets hardware architecture knowledge, a step beyond being an advanced and experienced software engineer, who also needs to be an expert in the architecture.

This means that for console exclusives, this platform expertise grows with each production. On the developer's next big title they have more knowledge and get even more out of the hardware, so a sequel on the same platform looks much better. This keeps growing, to where the fourth big title gets even more out of the platform because of the accumulated architecture knowledge, and the games show this leap in understanding. Then the next generation comes, and these DOD gurus need to learn a new platform and its architecture all over again.

For small titles running on whatever platform, it is not doable for a small team to figure out every platform in that depth, or they do not have the resources, or the game is not that demanding. And on PC you have wildly varying configurations.

From game-related articles, DOD seems to fit ECS very well. The reason DOD ECS examples can perform worse than OOP is that they are tested in small tech demos where there is often no deep knowledge of the platform being tested on. Often these programmers have high OOP skill but DOD is new to them. Platforms can often handle small-scale OOP misbehaviour because CPUs are designed to handle the most common software, with large caches, out-of-order cores, and advanced branch prediction units. And the DOD solution probably uses the wrong data structure, or the array size is too small, or it mixes OOP with DOD in a way that works as an anti-DOD pattern.

Also, those who successfully utilised DOD on the PS3's Cell also used the SPE units -- all seven, minus the reserved ones. So comparing a single-threaded DOD solution to a single-threaded OOP solution does not show the real practical gain; just as the hardware architecture is crucial for DOD, so is multicore CPU use. Likewise, an unused iGPU on an APU is wasted compute power that could be used with C++ AMP, OpenCL, or DirectX compute.

Most DOD examples are ECS-based and often use an OOP mix. So first you need to have a game idea that is demanding enough that you need those 8 cores, and you need to know the hardware as deeply as possible, which is a bit limited on the PC platform. C++ is also about managing memory, so to me Java for DOD is a step back.

It is best to delve deeply into the theory first. Programmers who just start using DOD and draw hard conclusions often fall back on OOP very fast, mixing DOD with OOP, and then DOD turns out badly, so they have a bad experience with it.

A good start is Richard Fabian's online DOD book.

It hints at DOD solutions for OOP solutions, which is important if you know the OOP way but are running into a wall with DOD.


My guess is that this is a big, advanced topic and that the online book explains it in more depth than anyone else does, to my knowledge. So what do the experts here think about that online DOD book?


@Norman Barrows @SuperG

 

Ok. That makes sense. I especially like the PS3 example. DOD is just a way to optimize for specific hardware. Because of that, what may be optimal and efficient for one system might not be for another. So you would have different DOD solutions for GBC than you would for 3DS because the hardware is different (eg. 3DS RAM > GBC RAM). That's why most consoles can be backward compatible but not the other way around. Because an optimization for PS4 hardware is still beyond the specs of a PS1 and therefore inefficient. But if they allowed it, a PS4 has more than enough resources to run something that would work on a PS1. Code to the platform?


That's somewhat untrue -- in general, the most broadly applicable DOD-type data transformations will benefit other platforms even if they are not absolutely optimal. In part this is because details such as cache-line size, the number of cache levels, the associativity of those caches, and the relative latency of each cache level through to main memory don't, in practice, have a lot of variance. Cache lines on application processors are 16 words everywhere I'm aware of. L1 data cache is 16 or 32 KB everywhere, with a latency of about 3 cycles, usually 4-way set-associative. L2 caches are 256-512 KB per core, latency around 10-12 cycles, 4- or 8-way set-associative. L3 caches are 2-4 MB shared among 2-4 simpler/slower cores (like the PS4/XBONE) or 6-8 MB shared among 4 fast, wide superscalar cores (e.g. Intel i3/i5/i7), 8-way or sometimes fully associative, with about 36 cycles of latency; main memory latency is about 90 cycles if it's in the page table, more if not. The prefetcher acts like an infinite L4 cache if your access patterns are well predicted (linear forwards/backwards is best, consistent non-contiguous strides next best), with latency not much worse than L3. Real L4 caches, where you find them, are typically a victim cache. So on and so forth.

 

But even if there were greater variance, the transformations you make to make good use of any kind of cache are similarly beneficial to any other kind of cache, simply because caches and memory hierarchies are universally more similar than they are different, whatever the fine details may be.

 

The PS3 is notable in particular for the SPUs in its Cell processor, which provided essentially all of the PS3's computational power -- these were streaming processors, like DSPs, with no real "caches" to speak of (each SPU's local store had similar access properties to a cache, but it was all the memory an SPU could see; DMA was the only way to speak to main memory, other SPUs, or the rest of the system), and as such they essentially required DOD practices to achieve reasonable computational throughput. But developers also found that these transformations benefited scalar/AltiVec code on the PPU, and in cross-platform titles they even benefited the Xbox 360 and PC targets. The changes that were necessary and crucial to make the PS3 work as well as it was designed to were good for other platforms as well, even when those platforms weren't strictly reliant on such transformations in the way the PS3's SPUs were.


@Norman Barrows @SuperG

 

Ok. That makes sense. I especially like the PS3 example. DOD is just a way to optimize for specific hardware. Because of that, what may be optimal and efficient for one system might not be for another. So you would have different DOD solutions for GBC than you would for 3DS because the hardware is different (eg. 3DS RAM > GBC RAM). That's why most consoles can be backward compatible but not the other way around. Because an optimization for PS4 hardware is still beyond the specs of a PS1 and therefore inefficient. But if they allowed it, a PS4 has more than enough resources to run something that would work on a PS1. Code to the platform?

 

When you get into older systems there was a lot more variance than there is today -- not only did the GBC have a very small amount of RAM, it also had no caches and could execute code and read graphics directly from the cartridge. The GBA didn't have caches either, but it had 32 KB of on-chip "fast" RAM that was low-latency and 32 bits wide, plus 128 KB of "work" RAM that was higher-latency and only 16 bits wide -- in essence, "fast" RAM was intended, and was used, as a kind of programmer-controlled cache, where programmers intentionally put all their "hot" (frequently accessed) data, and you could still access the carts directly. The DS and 3DS have modern, if low-end, ARM cores with basic, smallish caches and relatively lots of system memory (tens or hundreds of megabytes!), but they can't execute code directly from the cartridge; code and graphics have to be loaded into system memory. DoD on all of these systems would be wildly different, but their CPUs were so slow (read: equally matched to RAM speed) that the DoD we speak of on modern consoles only applies in a recognizable way to the most recent portables -- PS Vita, PSP, 3DS (and NDS to a lesser degree).

 

On home consoles, the PlayStation line has been a hodgepodge of wildly different architectures right up through the PS4. Sega systems were wildly different as well, through the Dreamcast. Nintendo's as well, up through the GameCube. Microsoft's systems have not been so wildly different from one another; despite starting with x86, jumping to PPC, and going back to x64 again, in all other regards the evolution of Microsoft's systems has been a relatively straight line.

 

But if you look at recent systems, they're all pretty homogeneous and not much different from a typical PC (the PS4 and XBONE are almost exactly PCs in most regards). This is even true of the Xbox 360 and original Xbox, the Wii U, Wii, and GameCube -- even the Dreamcast. They all have traditional, non-exotic CPUs and GPUs based on commercial-standard architectures (more or less), with caches and flat memory models (and most of them unified memory architectures). Consoles are a lot more similar to one another (and to PCs) in the past couple of generations than in the generations before. With these "modern" machines, those broadly applicable DOD-type transformations pay off basically everywhere; different platforms might still have specific tricks, but those are 10-20% of the optimization potential, not 80-90%. In the days before modern consoles, when each machine was wildly different, those specific tricks were ~80% of the optimization potential, so you really did have to approach each machine as a different beast.

Edited by Ravyne

