The Triumph of the Unix Philosophy

Posted by ApochPiQ, 08 October 2011 · 1,089 views

Software development, like pretty much any field I suppose, is rife with camps, schools, doctrines, philosophies, and other assorted ways of thinking about things. And, as with most things, these can be lumped into two basic groups: the monolithic and the decentralized. This is of course an over-generalization, so there will be holes in the analogy; but despite this, I think the general breakdown holds in most areas, and can yield some particularly valuable insights into software creation.

More specifically, I want to talk about the processes of creating software themselves. Not "waterfall" and "agile" and those kinds of processes - the actual nitty-gritty, day-in-day-out stuff. What do your tool chains look like? When you get an idea for a software product, how do you go from that idea to an implementation? In the world of game development, how do your designers, artists, audio engineers, composers, and writers collaborate to create a finished product?

I contend that there are two basic ways to build a tool chain. It is worth noting that most real tool chains will be hybrids; they will fall somewhere on a continuum from one end of the spectrum to the other. Hardly anyone does anything purely monolithic or purely decentralized. Chances are, though, that every tool set leans heavily towards one side or the other.



Monolithic Tool Systems: Go Big or Go Home
If you can count the number of tools in your pipeline on the fingers of one hand (using basic grade-school counting, no cheating with alternative number bases!) then you're probably in the monolithic camp. There's a tool that does content creation - text editing, music integration, level design, you name it. There's a build tool that might also handle publication. And there's a sprinkling of little guys (format converters, validators, etc.) that are basically just bit players in the grand architecture.

Most places I've worked are monolithic. They favor a simple philosophy: one giant tool to rule them all. Of course, again, in reality it's probably somewhere between two and five tools, but there aren't many. The programs launched in the process of building a master of the game are few and complex, likely glued together with some kind of scripting language, and generally the flow is short but easy to understand.

The advantage of monolithic processes is precisely in that ease of understanding. You take some assets, pump them through the build process, and out comes a game. Simple, clean, and effective.

Sadly, the advantages end there - but before we delve too far into that rabbit hole, let's look at the competing philosophy.


Decentralized Tool Systems: No Single Point of Failure
In a decentralized world, you don't have large programs. You have dozens, maybe even a couple of hundred small programs. Each one does something very specific, very focused, and very simple. A build process, for example, is not a handful of tools chained together by a script: it's a simple program that shells out to a potentially vast number of helper programs, each making one small change to the data, until it arrives in the final master format.
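To make that concrete, here is a minimal sketch in Python of a driver that does nothing but shell out to helpers. The three helpers are hypothetical stand-ins (one-line Python programs launched as separate processes); in a real pipeline each would be its own small executable, and the names `HELPERS` and `run_build` are made up for illustration.

```python
import subprocess
import sys

# Hypothetical stand-ins for small, single-purpose helper programs.
# Each one reads stdin, writes stdout, and knows nothing about the others.
HELPERS = [
    [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().strip())"],
    [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"],
    [sys.executable, "-c", "import sys; sys.stdout.write('MASTER:' + sys.stdin.read())"],
]

def run_build(data, helpers=HELPERS):
    """Feed data through each helper process in order, failing fast on errors."""
    for command in helpers:
        result = subprocess.run(
            command, input=data, capture_output=True, text=True, check=True
        )
        data = result.stdout  # one helper's output is the next helper's input
    return data

if __name__ == "__main__":
    print(run_build("  raw asset data \n"))
```

The driver itself stays trivial: adding, removing, or reordering a step means editing a list, not a program.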

This environment can be tricky. As the number of tools proliferates, it may seem like the complexity is overwhelming, even prohibitive; who's going to keep track of all these little utilities? Who owns the process? What if something goes wrong?

On paper, decentralized tool systems might seem like a net loss. So why would anyone ever want to build one?


Unix
Interestingly, this dichotomy is almost as ancient as computing. In fact, the first massively successful embodiment of the decentralized philosophy is none other than the Unix operating system family. In the Unix world, you accomplish things by chaining together a number of simple but flexible tools. The number of ways in which a few basic programs can be combined is astronomical, and this is what makes the Unix-style command line so immensely powerful. Anyone who has seen a wizard cast sed/awk/grep spells can attest to this.
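For anyone who hasn't watched such a wizard at work, the idea can be roughly sketched in Python. The stages below are loose imitations of grep, cut, and sort/uniq, not the real tools, and `pipe` is a hypothetical helper that plays the role of the shell's `|`.

```python
from collections import Counter
from functools import reduce

def grep(needle):
    """Keep only lines containing needle."""
    return lambda lines: (line for line in lines if needle in line)

def cut(field, sep=" "):
    """Keep one sep-delimited field from each line (zero-based)."""
    return lambda lines: (line.split(sep)[field] for line in lines)

def count_unique():
    """Count occurrences of each line, most common first."""
    return lambda lines: (f"{n} {value}" for value, n in Counter(lines).most_common())

def pipe(*stages):
    """Compose stages left to right, like `a | b | c` in a shell."""
    return lambda lines: reduce(lambda data, stage: stage(data), stages, lines)

log = [
    "ERROR texture missing",
    "INFO build started",
    "ERROR shader failed",
    "ERROR texture corrupt",
]

# Three tiny tools, one new capability: a tally of errors by subject.
errors_by_subject = pipe(grep("ERROR"), cut(1), count_unique())
print(list(errors_by_subject(log)))
```

None of the three stages knows about log files or errors; the useful behavior only exists in the combination.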

So why is the decentralized mode of operation in Unix appealing for game tools? The simple answer is combinatorial flexibility. The same thing that makes a typical Unix shell more powerful than MS-DOS's command prompt can teach us tremendous amounts about how to build general tool pipelines.


PowerShell
My biggest complaint with the Unix philosophy is that everything is piped between commands in textual format. This means that every program you want to combine must parse text and output results in a way that is compatible with other programs in the chain. If you've ever tried piping data from one program into an incompatible one, and spent the next few hours writing shell scripts or something to try and glue them together, you know how painful this can get.

PowerShell takes the Unix philosophy of combining small programs and extends it in a brilliant way. Instead of piping text between programs, you pass around entire .NET objects. This is a bit of a head-trip to get used to at first, but once you grok it, the power is almost uncanny. When you get used to writing your shell scripts in terms of objects and reflection, you can do incredible things in very short bursts of code. My humble opinion is that PowerShell is the future of command-line tool chains, and that its core tweak to the Unix philosophy is a gold mine of educational material for ways to construct software pipelines in general.
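The same idea can be imitated outside PowerShell. Below is a rough Python sketch, not PowerShell itself: the `Asset` type and the `where`, `sort_by`, and `select` stages are all hypothetical, loosely in the spirit of PowerShell's Where-Object, Sort-Object, and Select-Object. The point is that no stage ever parses text; each one reads attributes by name.

```python
from dataclasses import dataclass
from operator import attrgetter

@dataclass
class Asset:
    """Hypothetical record type standing in for a rich pipeline object."""
    name: str
    kind: str
    size_bytes: int

def where(predicate):
    """Stage that keeps the objects matching predicate."""
    return lambda items: [item for item in items if predicate(item)]

def sort_by(field, reverse=False):
    """Stage that orders objects by a named attribute."""
    return lambda items: sorted(items, key=attrgetter(field), reverse=reverse)

def select(*names):
    """Stage that projects each object down to the named attributes."""
    return lambda items: [tuple(getattr(item, n) for n in names) for item in items]

assets = [
    Asset("rock.png", "texture", 2_048),
    Asset("theme.ogg", "audio", 900_000),
    Asset("sky.png", "texture", 65_536),
]

stages = [
    where(lambda a: a.kind == "texture"),
    sort_by("size_bytes", reverse=True),
    select("name", "size_bytes"),
]

result = assets
for stage in stages:
    result = stage(result)  # objects flow between stages, never strings
print(result)
```

Compare that with the text-piping equivalent, where every stage would have to agree on column order and delimiters before anything could be filtered or sorted.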


Answering the Question Why
So why decentralized? It's difficult to quantify this in a way that makes sense, and indeed a lot of my own (nascent) leaning towards decentralized pipelines remains more intuitive than objective; but if I could sum it up briefly, it'd be like this:

Decentralized systems offer immeasurable gains in flexibility, and flexibility is key during much of a software pipeline's development life-cycle.

There is of course a point where your pipeline settles down, things harden, and you can just crank stuff through it; but there's a much longer span in the production of a typical game where the pipeline must remain adaptable to the ever-changing demands of the project itself. As time goes on, the odds of drastic change becoming necessary approach one. Eventually, you will reach a point where something significant must be adjusted, and that's the moment where the difference between monolithic and decentralized systems will really jump out.

In a traditional monolithic approach, major changes to the ordering or sequence of events can be a nightmare. You have to hope that someone remembers why step 47 is dependent on step 36 (because god knows there's not going to be any documentation about it!). Even worse, if suddenly step 47 must precede step 36, you have to cross your fingers and pray that the dependency is safe to invert. The larger the monolith, the harder this becomes - and, ironically, my gut feeling is that such changes become inevitable as tools become larger and more sophisticated.


An Alternative Vision
Suppose your tool pipeline were structured like PowerShell: everything is done by composing pipes of rich objects. Instead of "read file, convert file, write file" being the order of the day, you pass everything through a special channel that speaks in terms of rich self-describing data. Using a combination of reflection and careful design, you could arrange an entire build process or content mastering pipe as a sequence of simple, almost-trivial steps.
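Here is one way "self-describing data plus reflection" might look, as a small Python sketch. The `Texture` and `Sound` types and the work done on them are invented placeholders; `describe` inspects an object's fields at run time, and `process` picks a step based on the object's type using the standard library's `functools.singledispatch`.

```python
from dataclasses import dataclass, fields
from functools import singledispatch

@dataclass
class Texture:
    name: str
    width: int
    height: int

@dataclass
class Sound:
    name: str
    seconds: float

def describe(item):
    """Report an object's type and fields without knowing the type in advance."""
    parts = ", ".join(f"{f.name}={getattr(item, f.name)!r}" for f in fields(item))
    return f"{type(item).__name__}({parts})"

@singledispatch
def process(item):
    """Default step: pass unrecognized objects through untouched."""
    return item

@process.register
def _(item: Texture):
    # Placeholder work: halve the texture dimensions.
    return Texture(item.name, item.width // 2, item.height // 2)

@process.register
def _(item: Sound):
    # Placeholder work: round the duration to one decimal place.
    return Sound(item.name, round(item.seconds, 1))

if __name__ == "__main__":
    for asset in [Texture("rock", 256, 128), Sound("theme", 12.3456)]:
        print(describe(process(asset)))
```

Each step stays almost trivial because the data tells the pipeline what it is, rather than the pipeline guessing from a file extension.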

More importantly, you could then recompose those steps arbitrarily. Need to convert a bunch of texture data into a new file format? Just run the fragment of the pipeline responsible for file conversion, using a new output phase. Need to build a binary for a new platform? Just add a new output handler to the code-processing pipe. Want a way to test build system changes in isolation? Now you don't have to bother building a parallel build system - which can be a nightmare all its own in a monolithic environment - because you can just run the fragment of the pipeline that you're changing and validate the results independently.
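A toy sketch of that recomposition, again in Python and again with made-up steps (`load`, `convert`, and the two writers are placeholders that only shuffle dictionaries around, not real file I/O):

```python
def load(paths):
    """Stand-in loader: pretend each path holds three pixel values."""
    return [{"path": p, "pixels": [0, 128, 255]} for p in paths]

def convert(assets):
    """Stand-in conversion: normalize pixel values to the 0..1 range."""
    return [{**a, "pixels": [v / 255 for v in a["pixels"]]} for a in assets]

def write_legacy(assets):
    """Stand-in output phase for the existing format."""
    return [a["path"] + ".tex" for a in assets]

def write_new_format(assets):
    """Stand-in output phase for a hypothetical new format."""
    return [a["path"] + ".tex2" for a in assets]

def run(steps, data):
    """Push data through a list of steps in order."""
    for step in steps:
        data = step(data)
    return data

full_build = [load, convert, write_legacy]

# New file format: reuse everything except the output phase.
new_format_build = full_build[:-1] + [write_new_format]

# Testing a change to conversion: run just that fragment on known input.
conversion_only = [convert]

if __name__ == "__main__":
    print(run(full_build, ["rock"]))
    print(run(new_format_build, ["rock"]))
    print(run(conversion_only, [{"path": "x", "pixels": [255, 0]}]))
```

Because every step takes and returns the same kind of data, a fragment of the pipeline is itself a valid pipeline, which is what makes the isolated test possible.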

This kind of decentralization is, I feel, priceless - but not cheap. It's a perspective shift that affects everything, and it's incredibly difficult to transition from a monolith to a decentralized system in mid-stream. I suspect that the best time to do the change is between projects, when nobody is reliant on the pipeline to be working at maximum capacity; but certainly there is at least potential for making gentle transitions while people are still trying to get work done.


Conclusion
In my book, the Unix philosophy is a big winner here. More accurately, the PowerShell philosophy is the real winner, because piping rich objects is far more powerful and easier to use than piping raw text. Maybe there's a connection between the fact that most games shops run Windows these days and the trend towards monolithic pipelines; maybe it's just observational bias from the places I've been. I don't know.

But I would wager a nice beer that there's a lot of potential in embracing a different way of structuring tool pipelines, and I hope to discover first-hand whether or not that is the case in the near future.


Happy hacking!




Conclusion
In my book


So you are writing a book? ;) Hehe
