would like advice on working on a large piece of software...

Started by
13 comments, last by Feidias 19 years, 7 months ago
hi, i've been working on a 2d RPG for around the past 5 months. including the editor, this project is over 10k lines of code and over 65 files. it's starting to get huge.

i didnt work on my editor for a few months, and then came back to it. wow, i couldnt remember how half of the stuff worked. it didn't help that the level editor is absolutely the biggest pile of crap code ive ever written, either. really sloppy and messy. that's because i started out coding it all sloppily, making everything public, thinking "oh its only a small editor, i will finish it in a week or 2"... well, that "small" editor is pretty damn complex now, with a whole load of nice features.

ive been working on it for the past 2 or 3 weeks, and as im about to finish up working on it, im starting to get nervous that im going to come back to my game code, and think the same things i thought with my editor. the main difference though, is that my game is pretty damn slick looking. everything is nice and encapsulated, so its not that big of a deal.

but theres still a point to this post. when the code starts to get really big, i start to get nervous with every new feature i add to the game. i start thinking "damn, this code looks so nice right now, i dont want to mess with it", but the game is probably only half finished, so obviously i HAVE to "mess with it". i just get nervous that a new piece of code will break encapsulation or repeat data or cause some subtle bug that will make me go insane for the next 2 weeks. what kind of things can i do to avoid that from happening?

basically, im just looking for general advice on working on a large project. things to do, things to not do, etc. i think there are some sort of systems out there which help you manage changes you make to your code, called CVS or something? could someone explain maybe how these things work, and if i should get one set up? thanks a lot for any advice.
FTA, my 2D futuristic action MMORPG
I've been using Subversion for a couple of months now and it's helped a lot when adding new stuff in, mainly because I always have a restore point to a place where my code was nice.

In an ideal world we'd have everything planned out way in advance, so the features are already there (perhaps stubbed out or ready to plug in) but in the real world of hobby game programming this is seldom the case.

I guess the best way is to think ahead at least a few features, planning on how to add them before you come to add them, making assessments in advance about the impact to the wider code base on both macro and micro levels.

I also find it helps to make plenty of notes and comment the code fully, even if it seems pointless. Making notes at the time you're doing something really helps if you revisit the section and think "oh man, what was I thinking when I did this?".

I know we should all plan it way in advance as then everything would be smooth to implement, a matter of coding to our plan but I can never quite get into that mentality. It is to my detriment, I acknowledge that and am slowly turning myself around to the idea of preplanning.

Other than that, I've also found writing highly modular code to be beneficial. You have a chunk of code that's fully tested and has a nice interface with predictable inputs and outputs, you just know that sucker's gonna hold up when you throw other things into the mix... at least in theory [wink]
CVS and other versioning systems just keep changes to the code so that you can "back up" if something you add causes a horrible bug, and you need to undo it. Similarly, you can create "old" versions of the code to see where things went wrong. It also generally keeps track of who makes what changes so the perpetrator of particularly bad code can be suitably beaten.

But back to the original idea:

I try to make sure each and every function has comments right after the function declaration. Generally a nice full word representation of the function name [Render Object Delete Tree for void ro_deltree()], a short english description of its function, and a code example of common usage. A description of parameters and returns is also generally added when applicable.

For me anyways, the code example is what has helped the most looking back at large code. Generally it gives me a good idea about what I was thinking when I wrote it, as I know how the function was meant to be used.
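As a rough sketch of that commenting style, a commented function might look something like the following. Only the ro_deltree name comes from the post above; the RenderObject struct, the parameter, and the buildScene() helper mentioned in the usage comment are all made up for illustration.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical render-object node, assumed here only for illustration.
struct RenderObject {
    std::vector<std::unique_ptr<RenderObject>> children;
};

// Render Object Delete Tree
//
// Deletes the given render object together with all of its children,
// grandchildren, and so on, and leaves the caller's pointer empty.
//
// Parameters:
//   root - owning pointer to the top of the tree; may already be empty.
// Returns:
//   nothing.
//
// Common usage:
//   std::unique_ptr<RenderObject> scene = buildScene();  // hypothetical helper
//   ro_deltree(scene);   // scene is now empty, the whole tree is freed
void ro_deltree(std::unique_ptr<RenderObject>& root) {
    // Destroying the root destroys its children vector, which in turn
    // destroys every child node recursively.
    root.reset();
    assert(!root);  // postcondition: the caller's pointer is empty
}
```

The "common usage" lines are the part that tends to pay off months later, since they show how the function was meant to be called rather than just what it does.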
Extensively test each component you write by itself before you add it.
Using CVS or other versioning systems is a great habit to get into, especially if you end up working with other people (CVS can automatically merge changes for you). Since no one has given any sort of explanation as to how these systems work, I guess it's up to me. Basically you set up the server to keep a "repository" of your code. You "check in" or "commit" your changes into the repository and the program keeps track of the version of every file you keep in the repository. You can then revert to old versions, compare versions, and do much more advanced versioning stuff (branching, etc.). If you want to try using CVS in Windows, you will need the CVSNT version of the server. There are many client programs available (commandline programs, jcvs, etc.) and a few IDEs (Eclipse and I believe Dev-C++) that support it. It is well worth learning.

Also, I have found that using doxygen or similar documentation systems can save you a long time figuring out what that DoStuff() function you wrote 2 years ago was supposed to do or what the global integer "a" is supposed to do. I always had trouble with inconsistent documentation, and programs like doxygen let you have well structured comments that you can convert to HTML or PDF. Of course, it requires you to spend that 15 seconds to write the comments. But believe me, it can save you hours of headscratching.
I've used CVS for a while now and it is extremely easy to use, although the newer Subversion software is meant to improve upon CVS in a few ways.

You can run both CVS and Subversion servers on your own computer without any need to expose your repository to the outside world. I've only ever used CVSNT on my own computer, which is quite easy to set up and use with a client like TortoiseCVS.

Here's the link to the CVSNT page and here's a guide to getting it set up.
Ideally, you'd start with a complete design of everything you need, and then keep the design docs up to date as you make inevitable changes to the design. Also, you should write up nice documentation for each class/method you write. Of course few people do either, especially for solo or hobby projects. What I've found more practical is to come up with some of your own design patterns, and write those down. For example, you might come up with a standard way for handling UI elements (buttons, etc). You might come up with a standard way for handling autonomous game elements (such as the player, and monsters). It's far more realistic to come up with a few design templates/patterns for your game, than it is to design each individual component one at a time. This approach is also more tolerant of mid-development changes - you frequently change individual components, but rarely change core concepts.
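To make the "standard way for handling autonomous game elements" idea a bit more concrete, here is a minimal sketch of one such pattern. The Entity interface and the Player and Fireball classes are invented for this example, not taken from anyone's actual game; the point is only that the pattern gets written down once and every new element follows it.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

// The written-down pattern: every autonomous game element implements this.
class Entity {
public:
    virtual ~Entity() = default;
    virtual void update(float dtSeconds) = 0;  // advance by one time step
    virtual bool isAlive() const = 0;          // false means "remove me"
};

// The player never expires in this sketch.
class Player : public Entity {
public:
    void update(float dtSeconds) override { x_ += speed_ * dtSeconds; }
    bool isAlive() const override { return true; }
    float x() const { return x_; }
private:
    float x_ = 0.0f;
    float speed_ = 2.0f;
};

// A projectile that disappears once its lifetime runs out.
class Fireball : public Entity {
public:
    explicit Fireball(float lifetimeSeconds) : remaining_(lifetimeSeconds) {}
    void update(float dtSeconds) override { remaining_ -= dtSeconds; }
    bool isAlive() const override { return remaining_ > 0.0f; }
private:
    float remaining_;
};

// The game loop only knows the pattern, not the concrete classes.
void updateAll(std::vector<std::unique_ptr<Entity>>& entities, float dtSeconds) {
    for (auto& entity : entities) {
        assert(entity != nullptr);  // an empty slot would be a caller bug
        entity->update(dtSeconds);
    }
    // Drop everything that reported itself as finished.
    entities.erase(
        std::remove_if(entities.begin(), entities.end(),
                       [](const std::unique_ptr<Entity>& e) { return !e->isAlive(); }),
        entities.end());
}
```

Adding a new kind of monster or pickup then means writing one more class that follows the pattern, while updateAll() and the rest of the loop stay untouched.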
Quote:Original post by Anonymous Poster
Extensively test each component you write by itself before you add it.


This is commonly known as unit testing, and it's a very good idea. I've been investigating unit testing recently, and I've decided on CppUnit, which automates C++ unit testing. There's also JUnit, which does unit testing for Java.
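For anyone who hasn't seen a unit test before, here is a bare-bones sketch that uses nothing but assert from the standard library. It is not CppUnit code (a framework would organize these checks into suites and report failures more nicely), and the applyDamage() function is a made-up component used only as something to test.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical game component: applies damage to a health value.
// Health never drops below zero, and negative damage counts as no damage.
int applyDamage(int health, int damage) {
    if (damage < 0) {
        damage = 0;
    }
    return std::max(0, health - damage);
}

// The unit test: exercises the component by itself, including edge cases.
void testApplyDamage() {
    assert(applyDamage(10, 3) == 7);    // ordinary hit
    assert(applyDamage(10, 50) == 0);   // overkill clamps at zero
    assert(applyDamage(10, -5) == 10);  // negative damage is ignored
    assert(applyDamage(0, 1) == 0);     // already at zero stays at zero
}
```

Running testApplyDamage() after every change to the component is a cheap way to find out immediately if one of those cases stopped working.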

I've worked on large projects and I'm telling you that the key to not getting yourself in the mess that you're in is documentation. I know that we developers aren't keen on documentation, but it can really make a difference in a large project.

First of all, it's nice to have a requirements document. This can be pretty short. It pretty much says what your software needs to do and spells out specifically what features it has. This is important when dealing with clients, as it makes it clear to both parties what will be in the delivered software. For hobby programmers, that kind of thing isn't needed, but a requirements document helps keep you focused on what you need to do, and prevents feature bloat. Feature bloat often occurs when you think in the middle of the project, "Oooh, that would be neat!", and then add it. This is generally a bad idea, can cause design problems, and you may end up never completing anything. It's vital to work towards a goal, a point where you can say that it's done. If you want to add more features after you've defined the requirements, make a list of them somewhere and make them into the requirements for the next version. Note that no implementation details should be included in the requirements document.

Secondly, it's very helpful to have some kind of design document before you begin a project, even if the project is not large. Small projects can sometimes get much bigger than intended. A design document is in my opinion the most difficult part of the project. This is where you specify how you will implement the features you specified in the requirements document. This is where you do most of your thinking: you specify what libraries you use, define your classes and modules, and make diagrams showing how the classes and modules interact, etc. That's important so your code doesn't become a hopeless kludge, and it reminds you of what is happening in your project. Note that you do not usually need to go into little details here, like individual functions and so forth. The design document is more of a guide to follow while coding, not a detailed set of instructions. Your code may very well deviate from the design in minor ways. Also, don't be afraid to revise your design document if you see design flaws while coding.

Finally, the absolute most important thing is to *document your code*. Make comments and plenty of them. I like to keep a narrative in my comment of what is happening. Comments don't need to be things like "Assign '2' to the variable x". That you can see plainly in the code. They should be more like "Check to make sure that the surface is still available". It's like explaining to someone what you are doing and why you are doing it. That someone will be you when you come back later and have forgotten what you were doing.
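Here is a small sketch of what that kind of narrative commenting could look like. The Surface struct is a made-up stand-in rather than any real graphics API, and the reasons given in the comments are assumptions for the sake of the example.

```cpp
#include <cassert>

// Hypothetical stand-in for a drawing surface; not a real graphics API.
struct Surface {
    bool lost = false;
    int framesDrawn = 0;
    void restore() { lost = false; }
};

// Draws one frame. Returns false if the frame had to be skipped.
bool drawFrame(Surface* surface) {
    // Without a surface there is nothing to draw on. Skip the frame
    // instead of crashing; the caller simply tries again next tick.
    if (surface == nullptr) {
        return false;
    }

    // Check to make sure that the surface is still available. The system
    // may take it away (for example when the window is minimized), and
    // drawing to a lost surface would fail, so restore it first.
    if (surface->lost) {
        surface->restore();
    }
    assert(!surface->lost);  // from here on the surface must be usable

    // An unhelpful comment here would be "add 1 to framesDrawn".
    // The useful one says why: the counter is what a frame-rate display
    // (hypothetical, not shown) would read.
    ++surface->framesDrawn;
    return true;
}
```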

I suggest putting a comment header at the top of your file, saying what it is in it. Put comments at the top of your classes, saying what they do. Put comments above your functions, saying what they do, what parameters they accept, what they return, and what needs to have happened before calling it. Put comments in the actual code, narrating what is happening. It's annoying and time consuming, I know, but it will save you much more time later on when you're trying to figure out what is happening.

I suggest putting comments in the format an automatic documentation generator can read. Javadoc for Java is a great tool, and for C++, I personally use Doxygen, which generates some nice code documentation. The documentation files that these tools produce are very nice for reference purposes.
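As an illustration of comments written for a documentation generator, here is a short sketch using Doxygen's @file, @brief, @param, @return and @pre commands. The TileMap class itself is hypothetical and exists only to have something to document.

```cpp
/**
 * @file tilemap.h
 * @brief Hypothetical tile map used to illustrate Doxygen-style comments.
 */
#include <cassert>
#include <cstddef>
#include <vector>

/**
 * @brief Fixed-size grid of tile ids for one level.
 */
class TileMap {
public:
    /**
     * @brief Creates a map with every tile set to id 0.
     * @param width  Number of columns; must be greater than zero.
     * @param height Number of rows; must be greater than zero.
     */
    TileMap(int width, int height)
        : width_(width), height_(height),
          tiles_(static_cast<std::size_t>(width) * static_cast<std::size_t>(height), 0) {}

    /**
     * @brief Reports whether a coordinate lies inside the map.
     * @param x Column index.
     * @param y Row index.
     * @return true if (x, y) is a valid tile position.
     */
    bool contains(int x, int y) const {
        return x >= 0 && x < width_ && y >= 0 && y < height_;
    }

    /**
     * @brief Returns the tile id stored at a position.
     * @param x Column index.
     * @param y Row index.
     * @return The tile id at (x, y).
     * @pre contains(x, y) is true.
     */
    int tileAt(int x, int y) const {
        assert(contains(x, y));
        return tiles_[index(x, y)];
    }

    /**
     * @brief Stores a tile id at a position.
     * @param x      Column index.
     * @param y      Row index.
     * @param tileId The id to store.
     * @pre contains(x, y) is true.
     */
    void setTile(int x, int y, int tileId) {
        assert(contains(x, y));
        tiles_[index(x, y)] = tileId;
    }

private:
    /// Converts a 2D position into an index into the flat tile array.
    std::size_t index(int x, int y) const {
        return static_cast<std::size_t>(y) * static_cast<std::size_t>(width_)
             + static_cast<std::size_t>(x);
    }

    int width_;
    int height_;
    std::vector<int> tiles_;
};
```

The @pre lines cover the "what needs to have happened before calling it" part, which is often the first thing forgotten when coming back to old code.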

As other posters have been saying, use source control. It is a nice backup device when you lose something, and keeps track of your changes for you. When I first used source control, I saw it as minorly useful, but now I consider it indispensable. CVS and Subversion are free and commonly available. I found them (well, at least CVS) hard to understand at first, but there are some really good documents out there on the web that are good for learning them.

EDIT: corrected minor spelling and grammar errors
Mess with it in ways that make it smaller. Fish out the redundancy and make it use a function. Look for similar things and get the function to handle them too, adding only as much power as you need (and make sure that they're similar enough that you can handle things in the same basic way). Consider making small helper classes for various things. Use polymorphism and rip out switch statements. Avoid "explicitly not doing" things (e.g. empty else blocks, if/else structures that just return a boolean in each case anyway, etc.).

I.E. learn the fine art of refactoring.
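A small sketch of the "use polymorphism and rip out switch statements" step, assuming a made-up set of monster types. The "before" version is shown only as a comment; none of these names come from the original poster's game.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Before refactoring, the damage rules might have lived in a switch that
// every new monster type had to be added to:
//
//   switch (monster.type) {
//       case SLIME: damage = 1; break;
//       case BAT:   damage = 2; break;
//   }

// After: each monster type carries its own rule.
class Monster {
public:
    virtual ~Monster() = default;
    virtual int attackDamage() const = 0;
};

class Slime : public Monster {
public:
    int attackDamage() const override { return 1; }
};

class Bat : public Monster {
public:
    int attackDamage() const override { return 2; }
};

// The calling code no longer needs to know which types exist.
int totalDamage(const std::vector<std::unique_ptr<Monster>>& monsters) {
    int total = 0;
    for (const auto& monster : monsters) {
        assert(monster != nullptr);  // an empty slot would be a caller bug
        total += monster->attackDamage();
    }
    return total;
}
```

Adding a third monster type now means adding one class, instead of hunting down every switch that mentions the type enum.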

And don't be afraid! The "if it ain't broke don't fix it" mentality leads to just these sorts of messes. It's ok to think that a non-broken thing shouldn't be fixed, but only if you note:

- if it contains ugly redundancy, it's broken.
- if it's not clear, it's broken.
- if the comments don't correspond to what's being done, it's broken.
- if the comments don't explain why you're doing something the way you do it (as opposed to just restating what the code does), it's broken. (Sometimes the best comment is none at all.)
- if it contains ugly redundancy, it's broken.
(And yes, the irony is very intentional.)

Don't worry about repeating data so much... first check that your existing code doesn't repeat data, then add the new code. Then check if the new code repeats data from the old code. If it does, there's some redundancy for you; so fix it. Breaking encapsulation isn't going to happen if you got your keywords right in the old code; you'll get an error at compile-time. That's what the 'private' and 'protected' declarations are for, after all. And bugs... well, that's a fact of life (or at least of programming); just make sure you know what to test for. And each time you get a new bit of code working, make sure it didn't break anything that worked before (i.e., you should "run regression (tests)").

And ultimately, the best thing is to just get experience making these changes; over time you'll learn to see all kinds of potential changes and it will become second nature to make them.
Quote:Original post by Zahlman
- if it contains ugly redundancy, it's broken.
- if it's not clear, it's broken.
- if the comments don't correspond to what's being done, it's broken.
- if the comments don't explain why you're doing something the way you do it (as opposed to just restating what the code does), it's broken. (Sometimes the best comment is none at all.)
- if it contains ugly redundancy, it's broken.
(And yes, the irony is very intentional.)


You forgot to add:
- If making a change in one place breaks a bunch of other places, it's broken

That's the most important one! If you can figure out why so many pieces of code are so tightly coupled with this code, and change it so that they're not, then you make it easier to make changes in the future without breaking things. It's also a strong justification for unit testing, as you can run the test suite after making some changes, and it will usually show you what broke fairly quickly (if you did it right).

However, when you've already built up a large base of code, it can be a very large task to do either one completely. The best advice I can give is to push on with the changes, but when things start to break, never use band-aids to fix the problems. Band-aids cover up the symptoms of each problem, but the underlying design flaws are still there, allowing them to pop up later when you make another change. If you do that a lot, then after a while you'll have band-aids on top of band-aids all over the program, and then small changes can break a whole bunch of them.

This is why many developers prefer refactoring. Instead of making the code messier when you fix a problem, you try to make it cleaner. If that means rewriting a whole class and changing all the places that call it, that's usually not a big deal. Often a few cut-n-paste and global search-n-replace operations will do the trick. Of course, the sooner you make these changes, the better. The longer you wait when you realize something needs to be refactored, the more painful it is going to be.

When a change in class A breaks code in other places, such as classes B and C, it may mean that your class is not as well-encapsulated as you thought, or that the other classes are too tightly coupled with class A. Look at the class relationships and find out if there is a way to separate them a bit more so that what you do in class A only affects what is in class A.

This usually means replacing lower-level public methods in A with higher-level public methods. If B and C are calling get/set methods for individual member variables in A, try to find a way to get rid of them and use more high-level methods that perform actions. The "Pragmatic Programmer" book has some advice I like that goes something like this. This is a train:

order.getCustomer().getAddress().getState().getSalesTaxRate()

If you try to change any of the classes in this train, you can cause a train wreck. If you use trains a lot, you box yourself in and make it very difficult to make changes to your code. In this case, you can clean it up by providing a method like this (note how it replaces lower-level functions with a higher-level one):

order.calculateSalesTax()

This keeps the client code from having to know what's inside the order object. It's not very likely that the code using the order object cares what the sales tax rate is in a particular state, it just needs to know how much sales tax to charge for the current order. Some orders may just have an address tied to them and not an actual customer in the database. Some orders will be from different countries and won't have states.
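Translated into C++, the train and its replacement could be sketched roughly as below. The class layout is an assumption made up for this example (the book's own version may well differ); only the method names come from the two snippets above.

```cpp
#include <cassert>

class State {
public:
    explicit State(double salesTaxRate) : salesTaxRate_(salesTaxRate) {}
    double getSalesTaxRate() const { return salesTaxRate_; }
private:
    double salesTaxRate_;
};

class Address {
public:
    explicit Address(State state) : state_(state) {}
    const State& getState() const { return state_; }
private:
    State state_;
};

class Customer {
public:
    explicit Customer(Address address) : address_(address) {}
    const Address& getAddress() const { return address_; }
private:
    Address address_;
};

class Order {
public:
    Order(Customer customer, double subtotal)
        : customer_(customer), subtotal_(subtotal) {
        assert(subtotal >= 0.0);  // a negative order total would be a bug
    }

    // Low-level accessor: exposing it is what invites the "train".
    const Customer& getCustomer() const { return customer_; }

    // Higher-level method: callers ask for the result and never see how
    // the rate is looked up. If orders later stop needing a customer or
    // a state, only this method has to change.
    double calculateSalesTax() const {
        return subtotal_ * customer_.getAddress().getState().getSalesTaxRate();
    }

private:
    Customer customer_;
    double subtotal_;
};
```

The chain of calls still exists, but it now lives in exactly one place inside Order, so client code depends on one method instead of on four classes.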

You get the point. ;-) Public members are often bad, but individual get/set methods for private members can be just as bad. Sometimes you can't help it, and for simple classes or structs it doesn't matter. If you have a 3D vector class, you can go ahead and make every member and method in it public. If you have a camera class with a position vector, it's ok (and really necessary) to have a getPosition() method. But as your objects and their relationships become more complex, they need to get more high-level.

