Fun with GCC (or A Lesson Relearned)
I had a collection of C code that I'd built up over the years. I suppose I still have it, but it's sitting on the hard drive of a closeted dormant computer that I don't want to bother setting up. Besides, I'm not really happy with that bit of code anymore. My style has evolved over the years and I don't write C in quite the same way as I used to. Plus, I've learned a thing or two since I started putting all of that together. Using it again without rewriting it would just bug me too much. So recently, I set out to start over with it.
In a nutshell, I'm putting together a package of C libraries that I can use for different sorts of apps. Granted, most of my hobby coding is in D nowadays, but I don't want to lose my skill at C. It took me too many years to get to where I am to just give it up completely (I did toy with doing my rewrite in C++, but gave up on that rather quickly -- D has me too spoiled to touch C++ anymore). It's all organized rather neatly under a self-contained directory tree. These days, I prefer to avoid IDEs and work from the command line and a text editor (a licensed version of SublimeText 2 is my editor of choice), so I need a good tool to manage my build process. I went with premake4.
I've been using premake for quite a while without any difficulty, so it was only natural to use it again for the rewrite. There are different ways to use premake, one master config file for every project in the "workspace" or a separate config file for each project (my preference). Both have their pros and cons, but in the end the way I want to set up my source tree made me realize early on that I'm going to hit the cons no matter which approach I take. Nothing major, mind you, but when working from the command line every bit of convenience counts. I did make an attempt to restructure things and combine my multiple premake files into one master script, but I managed to run afoul of GCC's pickiness with the order in which libraries are specified on the command line (see below). I realized I was either going to have to make the config script even more complex with some custom Lua, deal with the annoyances of the multiple premake script approach, or do something different.
For my D projects these days, I use a custom build script for compilation, written in D. It was simple to put together and I can copy it from project to project with only minor modifications. And it's dead easy to maintain across projects. I've actually got several different versions of it, as it improves with each new project I create. I realized it wouldn't take too much to knock up a version that could compile my C projects. So that's what I did. Less than 10 minutes after I opened the file I had it automatically pulling in C files and compiling libraries just fine. Then it came time for the executables.
If you've ever used gcc before, you've likely encountered a situation when you were getting "undefined references" all over the place even though you are 100% certain you specified the correct library in the linker settings of your IDE and that you configured the library path properly. After consulting Google, you will have learned that gcc takes its list of libraries and processes them in the reverse of the order in which you passed them along. So if library B depends on library A, you have to pass them in this order: "gcc -lB -lA...". Doing it in the reverse will cause linker errors. This is a lesson I learned long ago, so I rarely run into that sort of error anymore. But, as it turns out, I didn't understand the issue as well as I'd thought.
In my build script, I have separate functions to compile, archive (create a static lib) and link (create an executable). All source file names are pulled from a directory, appended to an array, and passed to the compile function for individual compilation. If the project currently being processed is a library, then the file name extensions are changed from ".c" to ".o", all of the names globbed together into one string, and passed along to the archive function where the library is created. If the project is an executable, a list of required libraries is handed off to the link function along with the object file names for the executable to be created. It was here that things broke down. More undefined references.
This one really had me stumped. The premake build system had been working fine until I combined the configurations into one, resulting in undefined references when compiling. Using the verbose option showed me that the libraries were not being passed in the correct order. But with my D build script, I was certain they were properly set up. The verbose option confirmed it. So what the hell was going on?
Google, unfortunately, showed me nothing. I was at a total dead end. I kept staring at my script, changing things here and there randomly. Staring at the directory tree, moving things around. Sitting in my chair and doing nothing but thinking and cursing. After over an hour of mounting frustration, a thought suddenly popped into my head. A small adjustment to my build script and compilation was successful.
The problem that bit me was that any object files you pass to gcc during the link step need to be specified on the command line before the libraries on which they depend. Given that the libraries are just collections of objects, that makes perfect sense. I'd just never known it or had to care about it before. The string I was sending to the OS originally looked something like "gcc -lfoo -lbar baz.o buz.o...", whereas it should have been "gcc baz.o buz.o -lfoo -lbar...". It doesn't matter in which order the object files are specified, they just need to appear in the list before the libraries. How many years now have I been using gcc?
So that's how what should have been a less-than-20-minute side project turned into a nearly hour-and-a-half time sink. If you are going to work on the command line, it pays to know your compiler inside and out.