  • 12/09/18 10:16 PM

    The Faster You Unlearn OOP, The Better For You And Your Software

    General and Gameplay Programming

    GameDev.net

    Object-oriented programming is an exceptionally bad idea which could only have originated in California.
      - Edsger W. Dijkstra

     

     

    Maybe it's just my experience, but Object-Oriented Programming seems to be the default, most common paradigm of software engineering: the one typically taught to students, featured in online material and, for some reason, spontaneously applied even by people who didn't intend to.

    I know how seductive it is, and how great an idea it seems on the surface. It took me years to break its spell and to understand clearly how horrible it is and why. Because of this perspective, I strongly believe that it's important for people to understand what is wrong with OOP, and what they should do instead.

    Many people have discussed the problems with OOP before, and I will provide a list of my favorite articles and videos at the end of this post. Before that, I'd like to give my own take.

     

    Data is more important than code

    At its core, all software is about manipulating data to achieve a certain goal. The goal determines how the data should be structured, and the structure of the data determines what code is necessary.

    This part is very important, so I will repeat.


    goal -> data architecture -> code

    One must never change the order here! When designing a piece of software, always start by figuring out what you want to achieve, then think at least roughly about the data architecture: the data structures and infrastructure you need to achieve it efficiently. Only then write your code to work within that architecture. If the goal changes over time, alter the architecture first, then change your code.

    In my experience, the biggest problem with OOP is that it encourages ignoring the data model architecture and applying a mindless pattern of storing everything in objects, promising some vague benefits. If it looks like a candidate for a class, it goes into a class. Do I have a Customer? It goes into class Customer. Do I have a rendering context? It goes into class RenderingContext.

    Instead of building a good data architecture, the developer's attention is moved toward inventing “good” classes, relations between them, taxonomies, inheritance hierarchies and so on. Not only is this a useless effort, it's actually deeply harmful.

     

    Encouraging complexity

    When explicitly designing a data architecture, the result is typically a minimum viable set of data structures that support the goal of our software. When thinking in terms of abstract classes and objects, there is no upper bound to how grandiose and complex our abstractions can be. Just look at FizzBuzz Enterprise Edition: the reason such a simple problem can be implemented in so many lines of code is that in OOP there's always room for more abstractions.

    OOP apologists will respond that it's a matter of developer skill, to keep abstractions in check. Maybe. But in practice, OOP programs tend to only grow and never shrink because OOP encourages it.

     

    Graphs everywhere

    Because OOP requires scattering everything across many, many tiny encapsulated objects, the number of references to these objects explodes as well. OOP requires passing long lists of arguments everywhere or holding references to related objects directly to shortcut it.

    Your class Customer has a reference to class Order and vice versa. class OrderManager holds references to all Orders, and thus indirectly to Customers. Everything tends to point to everything else because, as time passes, there are more and more places in the code that need to refer to a related object.


    Instead of a well-designed data store, OOP projects tend to look like a huge spaghetti graph of objects pointing at each other and methods taking long argument lists. When you start to design Context objects just to cut down on the number of arguments passed around, you know you're writing real OOP Enterprise-level software.

     

    Cross-cutting concerns

    The vast majority of essential code does not operate on just one object; it actually implements cross-cutting concerns. Example: when class Player hits() a class Monster, where exactly do we modify data? Monster's hp has to decrease by Player's attackPower, and Player's xp has to increase by Monster's level if Monster got killed. Does it happen in Player.hits(Monster m) or Monster.isHitBy(Player p)? What if there's a class Weapon involved? Do we pass it as an argument to isHitBy, or does Player have a currentWeapon() getter?

    This oversimplified example with just 3 interacting classes is already becoming a typical OOP nightmare. A simple data transformation becomes a bunch of awkward, intertwined methods that call each other for no reason other than the OOP dogma of encapsulation. Adding a bit of inheritance to the mix gets us a nice example of what stereotypical “Enterprise” software is about.
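
    For contrast, here is a minimal sketch of the same interaction written as one plain function over plain data records. The names resolve_hit and bonus_damage are hypothetical, and the rule that a weapon simply adds to the damage is an assumption made for illustration; the hp and xp rules follow the example above.

```python
from dataclasses import dataclass
from typing import Optional


# Plain data records: no behavior attached, just fields.
@dataclass
class Player:
    attack_power: int
    xp: int = 0


@dataclass
class Monster:
    hp: int
    level: int


@dataclass
class Weapon:
    bonus_damage: int  # hypothetical field, assumed to add to the damage


def resolve_hit(player: Player, monster: Monster, weapon: Optional[Weapon] = None) -> bool:
    """Apply one hit to the monster; return True if the monster was killed."""
    damage = player.attack_power + (weapon.bonus_damage if weapon else 0)
    monster.hp = max(0, monster.hp - damage)
    killed = monster.hp == 0
    if killed:
        # XP is awarded only on a kill, based on the monster's level.
        player.xp += monster.level
    return killed


player = Player(attack_power=7)
monster = Monster(hp=10, level=3)
sword = Weapon(bonus_damage=5)
killed = resolve_hit(player, monster, sword)
```

    The whole transformation lives in one place, so the question of which class "owns" the hit never comes up.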

     

    Object encapsulation is schizophrenic

    Let's look at the definition of Encapsulation:


    Encapsulation is an object-oriented programming concept that binds together the data and functions that manipulate the data, and that keeps both safe from outside interference and misuse. Data encapsulation led to the important OOP concept of data hiding.

    The sentiment is good, but in practice, encapsulation at the granularity of an object or a class often leads to code trying to separate everything from everything else (and from itself). It generates tons of boilerplate: getters, setters, multiple constructors, odd methods, all trying to protect against mistakes that are unlikely to happen, on a scale too small to matter. The metaphor I use is putting a padlock on your left pocket to make sure your right hand can't take anything from it.

    Don't get me wrong – enforcing constraints, especially on ADTs is usually a great idea. But in OOP with all the inter-referencing of objects, encapsulation often doesn't achieve anything useful, and it's hard to address the constraints spanning across many classes.

    In my opinion, classes and objects are just too granular, and the right place to focus on isolation, APIs etc. is at the boundaries of “modules”/“components”/“libraries”. And in my experience, OOP (Java/Scala) codebases are usually the ones in which no modules/libraries are employed. Developers focus on putting boundaries around each class, without much thought about which groups of classes together form a standalone, reusable, consistent logical unit.

     

    There are multiple ways to look at the same data

    OOP requires an inflexible data organization: splitting the data into many logical objects, which defines a data architecture, namely a graph of objects with associated behavior (methods). However, it's often useful to have multiple ways of logically expressing data manipulations.

    If program data is stored e.g. in a tabular, data-oriented form, it's possible to have two or more modules each operating on the same data structure, but in a different way. If the data is split into objects with methods, this is no longer possible.
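
    As a small illustration, assume a hypothetical table of orders kept as plain rows. Two unrelated pieces of logic (called "billing" and "logistics" here purely for the example) can then read the same rows in their own way, without either having to live on an Order class:

```python
# Hypothetical order table: one plain record per row, no methods attached.
orders = [
    {"id": 1, "customer_id": 10, "total": 25.0, "shipped": True},
    {"id": 2, "customer_id": 11, "total": 40.0, "shipped": False},
    {"id": 3, "customer_id": 10, "total": 15.0, "shipped": False},
]


def revenue_by_customer(rows):
    """Billing view: sum of order totals per customer ID."""
    totals = {}
    for row in rows:
        customer_id = row["customer_id"]
        totals[customer_id] = totals.get(customer_id, 0.0) + row["total"]
    return totals


def pending_shipments(rows):
    """Logistics view: IDs of the orders that have not been shipped yet."""
    return [row["id"] for row in rows if not row["shipped"]]


billing = revenue_by_customer(orders)
logistics = pending_shipments(orders)
```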

    That's also the main reason for the object-relational impedance mismatch. While a relational data architecture might not always be the best one, it is typically flexible enough to allow operating on the data in many different ways, using different paradigms. However, the rigidity of OOP data organization causes incompatibility with any other data architecture.

     

    Bad performance

    The combination of data scattered between many small objects, heavy use of indirection and pointers, and the lack of a proper data architecture in the first place leads to poor runtime performance. Nuff said.

     

    What to do instead?

    I don't think there's a silver bullet, so I'm going to just describe how it tends to work in my code nowadays.

    First, data considerations come first. I analyze what the inputs and outputs are going to be, their format and volume; how the data should be stored at runtime and how it should be persisted; what operations will have to be supported and how fast (throughput, latencies), etc.

    Typically the design is something close to a database for the data that has any significant volume. That is: there will be some object like a DataStore with an API exposing all the necessary operations for querying and storing the data. The data itself will be in the form of ADT/POD structures, and any references between the data records will take the form of an ID (number, UUID, or a deterministic hash). Under the hood, it typically closely resembles or actually is backed by a relational database: Vectors or HashMaps storing the bulk of the data by index or ID, some other ones for “indices” that are required for fast lookup, and so on. Other data structures like LRU caches etc. are also placed there.
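
    A minimal sketch of what such a store could look like, under stated assumptions: the Customer and Order records, the method names and the single secondary index are all hypothetical, and a dict plays the role of the HashMap.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set


# Plain, immutable records. Orders refer to customers by ID, not by object.
@dataclass(frozen=True)
class Customer:
    id: int
    name: str


@dataclass(frozen=True)
class Order:
    id: int
    customer_id: int
    total: float


class DataStore:
    def __init__(self) -> None:
        # Primary storage, keyed by ID.
        self._customers: Dict[int, Customer] = {}
        self._orders: Dict[int, Order] = {}
        # Secondary index for fast "orders of a customer" lookups.
        self._orders_by_customer: Dict[int, Set[int]] = defaultdict(set)

    def add_customer(self, customer: Customer) -> None:
        self._customers[customer.id] = customer

    def add_order(self, order: Order) -> None:
        # The store, not each record, enforces the cross-record constraint.
        if order.customer_id not in self._customers:
            raise KeyError(f"unknown customer {order.customer_id}")
        self._orders[order.id] = order
        self._orders_by_customer[order.customer_id].add(order.id)

    def orders_of(self, customer_id: int) -> List[Order]:
        ids = sorted(self._orders_by_customer.get(customer_id, ()))
        return [self._orders[i] for i in ids]


store = DataStore()
store.add_customer(Customer(id=1, name="Ada"))
store.add_order(Order(id=100, customer_id=1, total=9.5))
store.add_order(Order(id=101, customer_id=1, total=20.0))
```

    Because records only carry IDs, there is no object graph to keep consistent; constraints that span several kinds of records are checked in one place, inside the store.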

    The bulk of the actual program logic takes a reference to such DataStores and performs the necessary operations on them. For concurrency and multi-threading, I typically glue different logical components together via message passing, actor-style. Examples of actors: stdin reader, input data processor, trust manager, game state, etc. Such “actors” can be implemented as thread pools, elements of pipelines etc. When required, they can have their own DataStore or share one with other “actors”.
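
    A rough sketch of the actor idea using only the standard library: one thread that owns an inbox queue and communicates with the rest of the program only through messages. The doubling logic and the STOP sentinel are placeholders invented for this example.

```python
import queue
import threading

STOP = object()  # sentinel message that tells the actor to shut down


def doubler_actor(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Hypothetical 'input data processor': doubles every number it receives."""
    while True:
        message = inbox.get()
        if message is STOP:
            outbox.put(STOP)  # propagate shutdown downstream
            return
        outbox.put(message * 2)


inbox: queue.Queue = queue.Queue()
outbox: queue.Queue = queue.Queue()
worker = threading.Thread(target=doubler_actor, args=(inbox, outbox))
worker.start()

# Drive the actor purely through messages.
for value in (1, 2, 3):
    inbox.put(value)
inbox.put(STOP)
worker.join()

# Collect everything the actor produced before it stopped.
results = []
while True:
    item = outbox.get()
    if item is STOP:
        break
    results.append(item)
```

    The same function can be driven by a test that feeds it a fixed sequence of messages, with no other part of the program running.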

    Such an architecture gives me nice testing points: DataStores can have multiple implementations via polymorphism, and actors communicating via messages can be instantiated separately and driven through a test sequence of messages.
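
    To make the first testing point concrete, here is a small sketch in which the program logic depends only on a store interface, so a dict-backed implementation can stand in during tests. ScoreStore, InMemoryScoreStore and award_points are hypothetical names chosen for the example.

```python
from typing import Dict, Optional, Protocol


class ScoreStore(Protocol):
    """Hypothetical store interface that the program logic depends on."""

    def get_score(self, player_id: int) -> Optional[int]: ...

    def set_score(self, player_id: int, score: int) -> None: ...


class InMemoryScoreStore:
    """Test implementation backed by a dict; another could wrap a database."""

    def __init__(self) -> None:
        self._scores: Dict[int, int] = {}

    def get_score(self, player_id: int) -> Optional[int]:
        return self._scores.get(player_id)

    def set_score(self, player_id: int, score: int) -> None:
        self._scores[player_id] = score


def award_points(store: ScoreStore, player_id: int, points: int) -> int:
    """Program logic: works against any object that satisfies ScoreStore."""
    new_score = (store.get_score(player_id) or 0) + points
    store.set_score(player_id, new_score)
    return new_score


store = InMemoryScoreStore()
first = award_points(store, player_id=7, points=10)
second = award_points(store, player_id=7, points=5)
```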

    The main point is: just because my software operates in a domain with concepts of e.g. Customers and Orders, it doesn't mean there is any Customer class with methods associated with it. Quite the opposite: the Customer concept is just a bunch of data in tabular form in one or more DataStores, and the “business logic” code manipulates the data directly.

     

    Follow-up read

    As with many things in software engineering, the critique of OOP is not a simple matter. I might have failed at clearly articulating my views and/or convincing you. If you're still interested, here are some links for you:

     

    Feedback

    I've been receiving comments and more links, so I'm putting them here:

     

    Note: This article was originally published on the author's blog, and is republished here with kind permission.





    User Feedback




    I've been working as a developer in a Java shop for quite some time now, and yes, I do feel the author's pain. I must admit that there have been times when I dreamt of hiring a Terminator to kill James Gosling before he could invent Java, or to smash up all the windows of the Oracle head office.

    Maintaining a huge pile of Java (and we do Java EE, which adds an entire dimension of madness) can be a nightmare. I often feel like Miss Marple trying to chase a variable through layers of functions that just forward it to another function in another class, then hitting an interface and having to figure out which of the several implementations is used in this situation, and then seeing the variable disappear completely, only to discover that it has been magically injected elsewhere through a deployment descriptor (a silently read config file). Often, full-text search through everything has been my best friend.

     

    So yes, it is easy to create an object-oriented jungle. Especially with the style that Java very much stimulates: dividing everything into tiny objects that all jealously hide their data, lots of layering, hiding complexity in implicitly read config files, dragging in frameworks for everything (and not maintaining them, because that is not a user story). But at least I am old enough to know that creating a C nightmare is at least as easy, especially if people try to be a bit too clever. Often the thing that keeps C programmers in line is that the whole thing will likely not compile, or will core dump quickly, if they go too far. I much prefer C++, if only because it doesn't try to bully you into some particular coding style; however, in a corporate environment that advantage will probably disappear fast. I still think that OO has many very valuable aspects and should not be written off so easily. If done properly, it is a life-saver. Too bad it is so often used improperly.

     

    Having said that, my current project uses an ECS approach and I have to say it works very nicely. It does have a specific use case, however. I think ECS works best with RPG or Civilization-style games or a certain kind of physical simulation, because those have entities with loads of data and behaviour, and most of the behaviour touches only a relatively small part of the data. In a pure OO-like approach, those classes easily grow out of control, or they degenerate into a bag of pointers to sub-objects. In a way, that converges to some kind of ECS, where all the sub-objects take the role of components. My experience is that as long as you keep the Systems small and simple, the concept is very modular and easy to maintain and expand. Often adding more features is no more than adding a component or two plus a system. It's easy to start small and scale up. In a pure OO approach this requires much more planning. But I wouldn't like to do ECS without OO. ECS in pure C sounds like a real challenge to me.



