What should be stored in a repository?

Started by
21 comments, last by Mybowlcut 12 years, 11 months ago

Personally, I always use Visual Studio, which has a free edition (Express).

If you can't agree on a common one, then you can use a tool like CMake to automatically generate project files for several different IDEs.

Ah. Good point, I had forgotten about Express.

I don't know if I'd go quite as far as Hodgeman suggests -- though I can certainly see what he's getting at, and in a professional environment the additional risk-mitigation delivered by that solution is worth the cost of maintaining it. I've seen quite a number of AAA build systems, ranging from "just VS project files" to very complicated messes implemented in no fewer than 4 distinct languages. Of the dozen or so I've seen across as many game studios, I'd say that it's a nearly even split between those that do include all the tools and those that assume a suitable suite of tools has been installed somewhere on the system.

IMO, what should be in the repository is all the *source* data (code, art assets, configurations, scripts, third-party headers/libraries, etc) that is necessary to build the software. This probably also includes the source data for any necessary in-house tools that are specific to the primary application.


A couple things to consider:
Be wary of unnecessary or user-specific files polluting your repository -- things like Windows' thumbnail files or machine/user-specific config files (such as Visual Studio user preference files). Most revision control systems let you configure files with certain extensions to be ignored, which is what I'd recommend.

Following on, it's probably best not to check in any intermediate files -- it's too easy for an unsuspecting dev to change them, and then wonder what's going on when the intermediate file has been regenerated. A good example: if you are using CMake, don't include the makefiles or VS project files it generates in the repo. In fact, disallow those file types from the repository as described above. This can be seen as an extension of the "DRY" principle (Don't Repeat Yourself) -- it's good for databases, it's good for code, and it turns out that it's good for repositories too (which are basically databases of code).
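
To make the ignore-list idea concrete, here is a rough C++ sketch of the kind of filter such a list amounts to. The function name and the specific entries are only illustrative assumptions (real VCS ignore lists are usually glob patterns set up in the tool's own config, not code you write yourself):

```cpp
#include <array>
#include <filesystem>
#include <string>
#include <string_view>

namespace fs = std::filesystem;

// Illustrative entries only (assumptions, not a complete list):
// per-user Visual Studio files and the Windows thumbnail cache.
constexpr std::array<std::string_view, 2> kIgnoredExtensions = {".suo", ".user"};
constexpr std::array<std::string_view, 1> kIgnoredNames = {"Thumbs.db"};

// Hypothetical helper: returns true if `file` looks like a user- or
// machine-specific file that should stay out of the repository.
bool should_ignore(const fs::path& file) {
    const std::string name = file.filename().string();
    const std::string ext = file.extension().string();
    for (std::string_view ignored : kIgnoredNames) {
        if (name == ignored) return true;
    }
    for (std::string_view ignored : kIgnoredExtensions) {
        if (ext == ignored) return true;
    }
    return false;
}
```

If you generate project files with CMake, the generated makefiles and project files would go on the same list.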

An argument can be made for keeping the code/build and art repositories as separate entities. Artists and programmers tend to work quite differently, so it should be no surprise that they may not be served well by a single tool -- imagine teaching your art team how to use Git, or shackling your code team to some fancy GUI-based VCS. Also, many of the typical text-focused VCSs don't deal well with binary files like sounds or images, or don't provide/integrate with additional tooling (such as visual diff programs). Another argument is that art seems to be holding, at least for now, to the old checkout-modify-checkin model, while many coders now prefer the more modern, decentralized workflows made possible by DVCSs such as Git, Mercurial, Bazaar, Fossil and others. Using this separated setup, you would have a 3-part build -- one part builds the tools and executables, another builds the finalized art assets from the source assets ("cooks" them), and another sucks up all those files to create the disc image or installer package.

throw table_exception("(? ???)? ? ???");

Should the repository include the boost library if it's used?
I'd normally say yes, because I like to have all the project dependencies in there, so you can just check out the repo and build on any PC.

...but boost is so insanely huge that depending on your version control software, it can completely slow everything down...
Solutions to this that I've seen are:
1) Only check in the parts of boost that you're actually using.
2) Check in a ZIP file containing boost, and tell people to unzip it after they've checked out the repo.


Ok, thanks. Do you believe that in the future you can take for granted that people downloading your repo have boost on their computer? Of course this depends on context, but will boost become that much of a standard library?
A great post, thanks Ravyne. I'd like to ask a few questions:

I have a root folder "E:\Dev\Projects\SSZS". This folder will store the entire project. It has two folders inside it, "project" (the Visual Studio project) and "repository" (the project hosted on SVN). This seems like a nice structure because you don't mix IDE-specific stuff with the repository stuff. The only problem is that if I run the code below, it can't find the font file, because I'm running the application from a folder separate from the repository folder:

MyFont.LoadFromFile("gui\\fonts\\arial.ttf", 50);

What is the way around this? Do you keep your IDE project in with the repository files?
Also, is it possible to limit what file extensions the repository (e.g. Google Code) will take? Or will each user have to set it up in their SVN client?
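
One possible way around the working-directory problem, as a minimal sketch: build asset paths from an explicit root instead of relying on wherever the application happens to be started. The `asset_path` helper and the idea of an "asset root" are assumptions for illustration, not something from the thread; where the root comes from (a command-line argument, a setting, the debugger's working directory) is up to you.

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: joins an asset root (e.g. the "repository" folder)
// with a path relative to it, and tidies up any "." or ".." components.
fs::path asset_path(const fs::path& asset_root, const fs::path& relative) {
    return (asset_root / relative).lexically_normal();
}
```

The font call would then look something like `MyFont.LoadFromFile(asset_path(root, "gui/fonts/arial.ttf").string(), 50);`, with `root` pointing at the repository folder. Alternatively, Visual Studio lets you set the working directory used when debugging, which avoids touching the code at all.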

I've always wondered about only using parts of boost. Is it advisable in the context of wanting to save space? Is it easy to do?


Checking in boost does not need to slow everything down.

You can have multiple repositories, or multiple depots, or multiple sub-projects within a single repository.

I recommend that you have a tree that is separate from your main project. In the professional world I've seen it called "common", "external", "CM", and a few other names. This way you can still have the stuff and still recover, but it doesn't clutter your daily use machine.


You need to understand your own purpose for your version control system.

If your purpose is to provide a history and safety net for your own projects as you travel between home and school and other sites, then you probably don't need much more than just your source code. That will let you roll back minor bugs, but is otherwise of little value.

If your purpose is to collaborate with a large number of people then you need to share everything. You can't assume they'll have the same version of boost and other libraries, or the same version of compilers, or the same version of Maya exporters, so all those items need to be there.

If your purpose is to be able to re-create any build from any time in history by simply knowing a build number, then you absolutely need to store your tools, external programs, and external libraries. You will even want to store patches and service releases as you incorporate them into the game. You need to store all the compilers, since a compiler update can change how builds are generated. You need to store the matching versions of Photoshop and Maya and other data-centric programs, since differences between versions may introduce small differences in the files, in turn causing the build to generate different results.


While it is true that these are big items, they change infrequently. For any substantial versioning system you are already going to need a rather nice set of machines; the added storage cost for a few extra gigabytes is trivial in comparison to the other costs.

Yeah, in the case where we stored a zipped copy of boost, it was in a separate "thirdparty" repo. The problem was that simply checking out this repo for the first time, or updating the repo when someone added a new 3rd-party library, became ridiculously slow with the un-zipped boost present.


Yeah, it's a pretty occasional hiccup because it's in another, infrequently changing repo, but these infrequent waits add up with large team sizes. The reason we went over to storing a zipped version of boost in that case was that the time saved per "thirdparty" update, multiplied by the team size, was over a full day of lost man-hours. They didn't like the idea of spending hundreds of dollars just to update our (free) dependencies.

Do you have any answers to my questions? I'm really unsure of how to go about storing IDE project files and the working copy of the project...

You only want to *avoid* storing such things if they are being generated by a tool such as CMake, for the reasons I mentioned above. Namely, if you store the project files and someone changes them, then someone else regenerates the project files from CMake, it will overwrite those changes. Instead, you want to encourage the first guy to make the necessary changes in the CMake configuration, and a good way to do that is to make sure that the VS project files are never checked in. Every time the CMake config is updated, everyone has to get the latest version and re-run CMake to generate the new VS files. This also allows them to muck with their own project files without worrying about breaking everyone else -- it lets them test out new configs locally before committing them to the CMake config for all to use.


If your project *only* targets Visual Studio as its development environment, then by all means go ahead and store the project and solution files -- the ones you have to watch out for are the machine- or user-specific files, things such as IDE customization. Another caveat is that if you store the solution, and the solution names a dependency on, let's say, Boost, which is referenced by the full path "C:/libs/boost/1_49" or something, then *everyone* that builds must also install boost at that exact path (they can't have it somewhere else, otherwise they'll need to change the path in the project file for theirs, and when they check in they break everyone else's build).

One way around this is to come up with a standard directory structure for these sorts of dependencies (3rd party libs, SDKs, tools, etc), and create environment variables on each machine for each relevant section. On my machines, I define the environment variables DEVBIN for tool binaries, DEVSDK for SDKs, DEVLIB for libraries, and DEVSRC where I store all my projects. Of course, the structure under each has to be identical across machines, but at least this way I can move them anywhere on my hard disk -- for example, on my desktop I keep this stuff on my RAID drive E, but on my laptop I don't have a secondary drive or partition, so it's all on drive C.

This is a good middle ground between full user-centric customization and enforcing a strict machine format.
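
The same lookup can be done in code, for example in a small build helper or tool. This is only a sketch under assumptions: `path_under_env` is a made-up name, and "DEVLIB" and "boost/1_49" are just the examples from above.

```cpp
#include <cstdlib>
#include <filesystem>
#include <optional>

namespace fs = std::filesystem;

// Hypothetical helper: resolves `relative` under the directory named by the
// environment variable `var` (e.g. "DEVLIB"). Returns std::nullopt when the
// variable is unset or empty, so the caller can report a clear setup error
// instead of silently using a wrong path.
std::optional<fs::path> path_under_env(const char* var, const fs::path& relative) {
    const char* root = std::getenv(var);
    if (root == nullptr || *root == '\0') {
        return std::nullopt;
    }
    return fs::path(root) / relative;
}
```

So `path_under_env("DEVLIB", "boost/1_49")` gives the boost directory on whatever drive that machine keeps its libraries, and an empty result means the machine hasn't been set up yet.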


This topic is closed to new replies.
