
Speculation on semantically-aware version control



Working on a massive code base comes with some interesting challenges.

Consider this scenario, loosely based on a real-world problem we recently encountered at work:

  • Create a multi-million line code base
  • Divide this into over two dozen branches for various independent development efforts
  • On some given branch, find some popularly-edited code that needs reorganization
  • Move some functions around, create a few new files, delete some old crufty files

    Given these changes, how do you cleanly merge the reorganization into the rest of the code tree? Between the time of the reorg and the merge, other people have made edits to the very functions that are being moved around. This means that you can't just do a branch integration and call it good; someone has to manually recreate the changes in one way or another.

    Either the programmer who did the reorg is burdened with manually integrating all the smaller changes made from other branches into his code prior to shipping his branch upstream, or every programmer across every branch must recreate the organizational changes with their own changes intact.

    Obviously part of this is just due to the underlying issue that one file is popular for edits across many branches. But the clear solutions to the problem are all annoying:

    • Option One: Force one person to manage many changes. This is gross because it overburdens an individual.
    • Option Two: Force all people to manage one change. This is even worse because it requires everyone to fiddle with code they may not even care about.
    • Option Three: Never write code which becomes popular for edit. This is obviously kind of dumb.
    • Option Four: Never reorganize code. Even more dumb.

      At the root of the problem is the way version control on source code currently works. We track textual differences on a line-by-line (or token-by-token) basis. The tools for diffing, merging, history tracking, and integration are all fundamentally ignorant of what the code means.
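To make that concrete, here is a small sketch using Python's standard `difflib` module. The two "files" are made up for illustration; they contain the same two functions in a different order. A line-based diff still reports removals and additions, because it has no idea that nothing about the functions themselves changed.

```python
import difflib

# Hypothetical file contents: two functions in one order...
before = """def load():
    return 1

def save():
    return 2
""".splitlines()

# ...and the same two functions, only swapped.
after = """def save():
    return 2

def load():
    return 1
""".splitlines()

diff = list(difflib.unified_diff(before, after, lineterm=""))

# Keep only real added/removed lines, skipping the "---"/"+++" file headers.
changed = [
    line for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]

print("\n".join(diff))
print(f"{len(changed)} changed lines reported for a pure reordering")
```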

      What if this were not true?

      Suppose we built some language that did not store code in flat text files. Instead, it divides code into a strict hierarchy:

      Project -> Module -> Function -> Code

      You could insert classes or other units of organization between "modules" and "functions" if the language is so designed, of course.

      Now, suppose that instead of putting each module in a folder, with the module's functions spread across text files in that folder, we just treat a module as a single data unit on disk.

      Within this unit, we have arbitrary freedom to stash code however we want. The IDE/editor/other tools would understand how to open this blob of data and break it up into classes, functions, and maybe even individual statements or expressions.

      So here comes the interesting bit. Strap on your helmet, kiddies, we're going to go fast.

      • Assign each atomic unit of code (say, a function, or maybe even a statement if you want to get crazy detailed, but that's probably a bad idea) a GUID.
      • Store the data as a GUID followed by the textual code associated with that GUID.
      • Store alongside the data a presentation metadata model which describes how to show these units of code to the programmer. This should be fully configurable and rearrangeable via the editor UI.
      • Each of the smallest level-of-detail objects gets stored in a separate file, identified by its GUID.
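The list above could be sketched roughly as follows. This is only an illustration, not Epoch's actual format: the `.code` extension, the `presentation.json` file name, and the helper names are all assumptions made up for the example.

```python
import json
import tempfile
import uuid
from pathlib import Path


def store_unit(module_dir: Path, code: str) -> str:
    """Write one unit of code to its own file, named by a fresh GUID."""
    guid = str(uuid.uuid4())
    (module_dir / f"{guid}.code").write_text(code)
    return guid


def save_presentation(module_dir: Path, order: list[str]) -> None:
    """Record the order in which an editor should show the units."""
    metadata = {"order": order}
    (module_dir / "presentation.json").write_text(json.dumps(metadata, indent=2))


def load_module(module_dir: Path) -> list[str]:
    """Return the code units in presentation order."""
    order = json.loads((module_dir / "presentation.json").read_text())["order"]
    return [(module_dir / f"{guid}.code").read_text() for guid in order]


# A throwaway directory stands in for one module's "data unit" on disk.
module_dir = Path(tempfile.mkdtemp())
g1 = store_unit(module_dir, "def load(): ...")
g2 = store_unit(module_dir, "def save(): ...")
save_presentation(module_dir, [g1, g2])

units = load_module(module_dir)
```

Reordering the module for display would then mean rewriting `presentation.json` only; none of the GUID-named files would change.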

        Given a set of code attached to a GUID, we no longer show revisions in the version control system as edits to a file. Instead, we show them more granularly, following the organizational hierarchy defined by the language: project, module, function, code. The project as a whole is grouped as a tree, allowing easy visualization of the hierarchy. Open a single node and you can see all changes relevant to that node and its children, on any level of granularity you like.

        This sidesteps our original problem in two interesting ways. First, if we just want to reorganize code in a module without changing its functionality, we can do so by modifying the presentation metadata alone. This allows even an old dumb text-based merge utility to preserve our changes across arbitrary branches.

        Second, and more fascinating, what if we want to reorganize code across modules? All we have to do is record that a list of GUIDs moved from one module to another. If we don't store modules as folders, but instead as another layer of metadata, the move is just another simple textual change: the metadata lists those GUIDs under one module GUID in revision A, and under another module GUID in revision B.
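A minimal sketch of that idea, assuming module membership is kept as a plain mapping from a module identifier to a list of unit GUIDs. The identifiers (`module-foo`, `fn-render`, and so on) and the `move_units` helper are hypothetical, and readable strings stand in for real GUIDs.

```python
import json


def move_units(membership, guids, source, target):
    """Return new membership metadata with `guids` moved from source to target.

    The input mapping is left untouched so both revisions can be compared.
    """
    moved = set(guids)
    updated = {module: list(members) for module, members in membership.items()}
    updated[source] = [g for g in membership[source] if g not in moved]
    updated[target] = membership[target] + [g for g in membership[source] if g in moved]
    return updated


# Revision A: fn-render lives in module-foo.
revision_a = {
    "module-foo": ["fn-parse", "fn-validate", "fn-render"],
    "module-bar": ["fn-draw"],
}

# Revision B: the same unit, now listed under module-bar. No code was touched.
revision_b = move_units(revision_a, ["fn-render"], "module-foo", "module-bar")

print(json.dumps(revision_b, indent=2))
```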

        Why GUIDs? Easy: it allows us to rename any atomic unit of code, or any larger chunk of code units, arbitrarily without breaking any of the system. Delete a function? No problem! Just delete its GUID file in a new revision, like you would have deleted a file in the old approach. Add a function? Also no problem; it just becomes a new file in source control. Move a function around to another file or module or even project? Who cares?! It just changes metadata, not the code itself.

        So let's bring it all together. What if we want to have the exact original scenario? Programmer A makes several changes to organization of some module Foo. Meanwhile, programmers B through Z are making changes to the implementation of Foo, on different code branches.

        Merging all of this becomes trivial even under existing integration tool paradigms, because of how we decided to store our code on disk. Even cooler, anyone can engage in reorganizations and/or functionality changes without stomping on anyone else when it comes time to merge a branch upstream.
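Here is a rough sketch of why the scenario merges cleanly. It assumes each snapshot is a mapping from file name to contents, with one file per unit plus a hypothetical `order.txt` holding the presentation order, and it merges at whole-file granularity, which is cruder than a real integration tool. Two branches editing the same function in different ways would still conflict and need a normal line-level merge.

```python
def merge_snapshots(base, ours, theirs):
    """Three-way merge at whole-file granularity.

    Returns (merged, conflicts). A file conflicts only when both sides
    changed it relative to base, and in different ways.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:        # both sides agree (or neither changed it)
            result = o
        elif o == b:      # only "theirs" changed it
            result = t
        elif t == b:      # only "ours" changed it
            result = o
        else:             # both changed it differently
            conflicts.append(path)
            result = o
        if result is not None:  # None means the file was deleted
            merged[path] = result
    return merged, conflicts


base = {
    "order.txt": "guid-load\nguid-save",
    "guid-load": "def load(): return 1",
    "guid-save": "def save(): return 2",
}

# Programmer A only reorganizes: the presentation metadata changes.
branch_a = dict(base, **{"order.txt": "guid-save\nguid-load"})

# Programmer B only changes an implementation: one unit file changes.
branch_b = dict(base, **{"guid-load": "def load(): return 42"})

merged, conflicts = merge_snapshots(base, branch_a, branch_b)
```

In this example the reorganization and the implementation change land in different files, so both survive the merge and `conflicts` stays empty.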

        All this requires is a little lateral thinking and some tool support for the actual code editor/IDE. If you want to support things like browsing the code repo from the web, all you need to do is add a tool that flattens the current project -> module -> function -> code hierarchy into an arbitrary set of text files - again, trivial to do with a little metadata.
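A flattening tool along those lines might look like the sketch below. The layout is an assumption for illustration: module membership is a mapping from module name to an ordered list of unit GUIDs, unit text is looked up by GUID, and each module is emitted as one `.txt` file.

```python
def flatten(project, units):
    """Flatten {module name: [GUIDs]} into {file name: source text}."""
    return {
        f"{module}.txt": "\n\n".join(units[guid] for guid in guids)
        for module, guids in project.items()
    }


# Hypothetical unit store and module metadata.
units = {
    "guid-load": "def load(): return 1",
    "guid-save": "def save(): return 2",
    "guid-draw": "def draw(): return 3",
}
project = {
    "storage": ["guid-load", "guid-save"],
    "render": ["guid-draw"],
}

files = flatten(project, units)
for name, text in files.items():
    print(f"--- {name} ---\n{text}\n")
```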

        The more I think about this approach to version control, the more convinced I am that Epoch is going to try it out. Flat text files are dumb and outmoded; it's time we used computers to do our work for us, the way it was always meant to be.



This is essentially exactly what we did with the content streams system, and it's one of the primary reasons we did it.


We also discussed how this could be extended to code; essentially, you pretend the code is "content", in the form of the AST. Thus, you can do merges at the semantic level instead of the dumb textual level that version control systems typically support.



This has got to be one of the most common conversations I have at work. The strange thing is that with everyone talking about it, no one has actually done it yet...


I think it is worth mentioning that your described storage format is pretty much the structure of a git repository (file -> blob, metadata -> tree, etc). Might as well avoid reinventing the wheel, and just use git as your native storage format.



I wonder if very descriptive but concise commenting can make all the difference in version control. If the source control software is made to look for keywords (flags, if you will) in the comments, this could direct the software to highlight the things that matter most to the developer.

