3rd-Party Libraries and Source Control

Started by Dead1ock
5 comments, last by Codeka 14 years, 7 months ago
I was wondering what everyone's consensus is on storing 3rd-party libraries in source control. I've been on projects where all 3rd-party libraries were required to be pre-built for each supported platform/compiler in a "libs" folder, with a tagged tarball of the source committed alongside. I have also been on projects that strictly said "no pre-compiled libraries or 3rd-party libraries in source control".

I see some very good reasons to have all 3rd-party libraries committed at least in pre-compiled form: in the future you can compile the source code without worrying that a library you use no longer exists, and it makes it easy for newcomers to hit the ground running. However, where does this stop? Some 3rd-party libraries are HUGE, and this can create some extremely long checkout/clone times.

Is it more efficient to put these 3rd-party libraries in source control in plain source form, compiled along with your project? This has its advantages, because you don't have to keep pre-compiled versions of the libraries for each platform. However, imagine putting something like Boost or Qt in there, which take hours to build. Although the 3rd-party libraries would only compile on the first build of the project, it would be a very big inconvenience for people to go through.

Or should I not commit any 3rd-party libraries at all? This saves space, and as long as the user has all the required libraries compiled and knows how to edit configuration files, it can be really efficient for download and build speeds. But this scheme makes setting up a fresh checkout of your project complex, which is more than likely going to scare away any newcomers. And a library you depend on could possibly not even exist 2 or 3 years from now.

What is the most efficient way to handle these libraries?

Thanks,
- Dead1ock
I see little to no reason not to check in both. Unless your source control machine is severely hurting for space (in which case it won't be useful for long), I would suggest checking in the source for 3rd-party libraries (if you have it) along with the compiled libs (in various flavors: debug/release/C-runtime variants).

This does a couple things:
1) The libs are already compiled, so no one has to waste time compiling them.
2) Your projects can reliably know where the .lib to link with is, since it sits at a fixed/relative location in source control (property sheets can make this more flexible, if supported).
3) People have easy access to the source to build new variants of the libs that weren't needed before but are now, and to make minor modifications/adjustments.
4) You can define a workspace that is "absolutely everything" needed to compile.

In general it is my personal opinion that this should live outside your general source's path/depot, so that you can branch your code without having to duplicate your 3rd-party source and hence not waste any space (since 3rd-party code is usually relatively static). Clearly there is some need for revisions when updating 3rd-party libs; my favorite implementation of that so far is to simply put different versions of 3rd-party software into version-named subfolders within the depot and link against the specific version your source code currently uses. The include paths providing the version's location can easily be adjusted with property sheets in each branch of the code, allowing different branches to use different versions if necessary.
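For concreteness, the depot layout I mean looks something like this (library names and versions here are just illustrative):

    //depot/3rdparty/zlib/1.2.3/include
    //depot/3rdparty/zlib/1.2.3/lib/win32/debug/zlib.lib
    //depot/3rdparty/zlib/1.2.5/...

Then each branch's property sheet only has to define one macro to retarget it at a different version. A .vsprops fragment, as a sketch ($(DepotRoot) is a hypothetical macro pointing at your depot checkout):

    <UserMacro Name="ZlibRoot" Value="$(DepotRoot)\3rdparty\zlib\1.2.5" />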
// Full Sail graduate with a passion for games
// This post in no way indicates my being awake when writing it
I personally never store any 3rd-party libraries, in source or binary form, nor, in general, anything that is either available elsewhere (like 3rd-party libraries) or machine-generated.

When I had my first "industry" job, I was shocked to see that it was common practice to store everything under version control: 3rd-party libraries, applications, and even the right version of the compiler to make the project compile. I didn't like the practice, but I did see many good things in doing so. Most of them, however, were more business-related than software-related.

There are a variety of reasons to store 3rd-party libraries (in binary form) in source control systems.

  • No need to configure and install 3rd-party dependencies. However, you get in trouble once you have different operating systems, processor architectures, or compiler versions to support. If you can stick with one combination, say win32-x86_64-msvc9, or you're using a language like Java, it may work to store 3rd-party libs. In any other case, this is pretty damn constraining.
  • You can define a workspace that contains everything needed to compile and run the program. Again, this works as long as you stick to one OS / one CPU architecture / one compiler version, etc. How far are you willing to go with this? At one big American company I worked for, the convention was to put even the installers for the correct Microsoft Visual C++ version in the source control tree. How convenient is that?
  • A new or different version of a dependency library will not break your app. A new, non-backward-compatible version of your dependency may appear, and you will either have to stick to the old version, fix your software to require the new version, or incorporate a workaround to support both. Or you could store the 3rd-party library in your source control and do nothing. If a software system in production breaks down because of a new dependency version, the client may not be willing to pay for the fix, so it is cheaper to keep the correct version bundled with the software sources and ensure this never happens. However, if you're not in a situation where you have to care about production systems, your software will simply grow out of date, and you're better off updating it for the new dependency versions.
  • Your boss or your marketing dept may think it's a practical solution to version mismatches. That doesn't mean it actually is, and your software will turn obsolete quicker.
  • All the reasons in this list are bad reasons. No matter how convenient they might seem at first, they mostly don't work out in practice; you are only adding more constraints to your development cycle. Even at best, all my experiences with these "workspace in version control" solutions have started with reading the README.txt and tweaking compiler paths, and mostly it would have been easier to install the dependencies manually.


If you're free from marketing and business constraints, I recommend these best practices:

  • No 3rd-party libraries in version control. Especially no binaries; they impose the most restrictions. Source code is much more versatile than binaries, but it may be a pain in the ass to make it compile, and it requires the dependencies of the dependency to be included too.
  • Don't store what you can compute. No binaries in source control, and no other machine-generated data or code either, like stuff you used a script to generate or export (but do include the scripts that do the generating, if applicable). This also applies to intermediate build files, like Makefiles in a ./configure-based system (or MSVC solution files in a CMake-based system).
  • If you must include dependencies, use svn:externals or similar (a quick sketch follows this list). Usually, however, you are better off installing dependencies manually (assuming correct versions of installers, distribution packages, etc. are easily available).
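The svn:externals approach looks roughly like this, with a hypothetical vendor URL (using the pre-1.5 "dir -rREV URL" property format):

    svn propset svn:externals "3rdparty/zlib -r1234 http://svn.example.com/vendor/zlib/trunk" .
    svn commit -m "pin zlib via svn:externals"
    svn update    # pulls the external into 3rdparty/zlib

Pinning to a specific revision at least keeps the external from changing underneath you.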


I kind of like the previous poster's idea of having a separate workspace-repository with the dependencies. It wouldn't impose those limitations or force a user to actually use those libraries, but they are there and available if another user wishes to use them. Your original source tree will contain only your code and won't be cluttered with a ton of 3rd-party libraries and script hacks to get them built.

And remember: even Windows runs on multiple processor architectures these days. A single binary distribution just won't cut it, and if you add different compilers, compiler versions, and other operating systems to the equation, the solution to the initial problem becomes worse than not solving the problem at all.

rant of the day,
-Riku
Quote: Original post by riku
I kind of like the previous poster's idea of having a separate workspace-repository with the dependencies. It wouldn't impose those limitations or force a user to actually use those libraries, but they are there and available if another user wishes to use them. Your original source tree will contain only your code and won't be cluttered with a ton of 3rd-party libraries and script hacks to get them built.


It's a requirement for most commercial SDKs and libs. A lot of EULAs for middleware and DCC packages bind you to only use the code/libs/tools for the duration of the project, or for the license period you've paid for. If you check that stuff into your main code repository, there's only one way to get rid of it all at the end of the project: delete the entire repository.
I do keep all major dependencies in my Mercurial repository (where "major" means anything that doesn't come with the OS or compiler, excepting Boost).

However, I restrict dependencies to open-source libraries, store them all in source form, and adapt all of them to use the same cross-platform build system (in this case, CMake). The build process on all supported platforms (Windows, Mac, and Linux) is a simple 'cmake .. && make' (or you can optionally use Xcode on Mac and Visual Studio on Windows).

I find that most of my preferred dependencies are not in the major package repositories (apt, yum, and MacPorts), and Windows of course doesn't have a package manager, so keeping everything in a single source repository with a single build system significantly eases cross-platform development.
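The top-level CMakeLists.txt ends up looking roughly like this (dependency and target names here are just placeholders; each vendored dep carries a CMakeLists.txt adapted to define a library target):

    cmake_minimum_required(VERSION 2.6)
    project(MyGame)

    # each vendored dependency builds with the same system and defines its own target
    add_subdirectory(deps/zlib)
    add_subdirectory(deps/libpng)

    add_executable(game src/main.cpp)
    target_link_libraries(game png z)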

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

Quote: Original post by dclyde
I see little to no reason not to check in both. [...] The include paths providing the version's location can easily be adjusted with property sheets in each branch of the code, allowing different branches to use different versions if necessary.


Thank you (and everyone else) for the replies,

When it comes to tools that are required to build the source repository, what are my options? The project I'm currently working on requires qmake for makefile generation. Should I include just the source for these tools and make a "bootstrap" shell script that compiles and installs them (something like the sketch below)? Or, for big libraries like Qt (which contains qmake) and Boost, should I make the user/developer download and compile/install them because of their sheer size?
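Something like this rough sketch is what I have in mind (paths hypothetical, and I realize building all of Qt just to get qmake may be overkill):

    #!/bin/sh
    # bootstrap.sh: build the tools the main build needs (e.g. qmake)
    # into a local prefix so developers don't need them preinstalled
    set -e
    PREFIX="$(pwd)/tools"
    cd 3rdparty/qt-src                # hypothetical vendored Qt source
    ./configure -prefix "$PREFIX"     # Qt's configure takes a single-dash -prefix
    make && make install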

As a side note, I'm using Gitorious for my project hosting, so hard drive space is not a problem.

On a note of branching these environments, what would you recommend as being the best scheme for implementing this? A repository per platform/compile? As well, if these 3rdparty libraries are checked out from a different repository, would I do this trough gits equivalent of "svn:externals"?
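Something along these lines, I'm guessing (URL and path hypothetical):

    git submodule add git://example.com/vendor/zlib.git 3rdparty/zlib
    git submodule update --init   # needed after a fresh clone to fetch the contents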

Thanks for the help guys,
- Dead1ock
For my projects, what I usually do is have an "external" folder next to my main "source" folder. The "external" folder does not get checked into source control, but it has a specific directory structure and contains all the headers and libraries that my project needs to build. For example:
...\trunk
        \external
            \include
            \lib
                \win32
                    \debug
                    \release
                \x64
                    \debug
                    \release
        \source
The idea is that all my projects are set up to reference the include and library directories under "external". When a new team member comes on board, I can just zip up my external folder and send it to him. If he's using a different compiler than me, then obviously he'll have to set it up himself, but that's the idea anyway.

The two main exceptions to this are Boost and DirectX. I expect everybody to have those installed separately and set up within Visual Studio already.

Also, this setup is specific to my Windows builds. On Linux, I just expect everybody to apt-get the dependencies (autoconf is used to detect when they've forgotten something; see the fragment below) :-)
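The autoconf check is just the usual AC_CHECK_LIB sort of thing, roughly (the library and function names here are only illustrative):

    # configure.ac fragment: bail out early if a dependency was forgotten
    AC_CHECK_LIB([png], [png_create_read_struct], [],
                 [AC_MSG_ERROR([libpng is required but was not found])])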

