Stages of compilation of a C++ application

Started by
3 comments, last by stonemetal 15 years, 9 months ago
Hi, I'm trying to brush up on my computer science knowledge, and to this end I'm trying to figure out what happens during compilation of a C++ application. However, Google presents a few different explanations. Is anyone aware of a generally accepted description of the process? From what I understand it goes...

Preprocessor
In this stage all the preprocessor commands (e.g. #include, #define...anything with a hash) are dealt with. The relevant files and macros are pasted into the appropriate places (so you have all the function prototypes you need).

Compile
At this stage the program is translated into assembly code. The syntax of the C++ code is analysed and if there are mistakes (missing semicolons, extra brackets, missing function definitions) the compiler will flag an error.

Assembler
This stage translates the assembly code into machine code and produces object files. ...anything else of note?

Linker
This stage attaches any important libraries and builds the final executable.

I'm fairly sure the first two steps are correct, but the last two I'm not so confident on.
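To make the preprocessor stage concrete, here is a minimal sketch showing that macros are replaced as plain text before the compiler proper parses anything. The macro and function names (BUFFER_SIZE, SQUARE, STRINGIFY, square_of_next and so on) are made up for illustration. With GCC or Clang, the -E option should print the preprocessed source so you can see the expansion yourself.

```cpp
#include <cassert>
#include <string>

// Object-like and function-like macros (hypothetical names). Both are
// replaced textually by the preprocessor before the code is parsed.
#define BUFFER_SIZE 16
#define SQUARE(x) ((x) * (x))

// The # operator turns a macro argument into a string literal, which shows
// that the preprocessor works on text rather than on values.
#define STRINGIFY(x) #x

// After preprocessing, the body reads: return ((n + 1) * (n + 1));
int square_of_next(int n) {
    return SQUARE(n + 1);
}

// The argument is never evaluated here; it is simply quoted.
std::string quoted_expression() {
    return STRINGIFY(1 + 2);
}

// After preprocessing, the body reads: return 16;
int buffer_size() {
    return BUFFER_SIZE;
}

// Usage: these checks pass when called from main().
void check_macros() {
    assert(square_of_next(3) == 16);
    assert(quoted_expression() == "1 + 2");
    assert(buffer_size() == 16);
}
```

The extra parentheses in SQUARE matter precisely because the substitution is textual: without them, SQUARE(n + 1) would expand to n + 1 * n + 1.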
Quote:Compile At this stage the program is translated into assembly code. The syntax of the C++ code is analysed and if there are mistakes (missing semicolons, extra brackets, missing function definitions) the compiler will flag an error.

Not quite true. The compiler compiles code into object files, which contain machine code, not assembly code. A lot of compilers do allow you to output the assembly code as well, but that output is not used in this process.

Quote:Assembler This stage translates the assembly code into machine code and produces object files. ...anything else of note?

It's only used if the compiler supports inline assembly and inline assembly code is used in the project. Otherwise, it's not needed. It is not a separate stage of the compile process; the compiler invokes it to translate assembly code.

Quote:Linker This stage attaches any important libraries and builds the final executable.

Almost true. The linker binds together all of the linker symbols in the built object files. That means it ensures every symbol is resolved (found in one of the other object files) and builds the final binary, placing the resolved data in it along with the code.

The linker only links with other libraries if it is told to. By default, linkers are normally set up to link with the standard libraries, but this is not required; it depends on how you use them.

---

I suppose I am primarily targeting several build tools here. As there is no strict standard, it is possible that other tools do things differently.
Sounds about right. Of course, you could split the compile step up into quite a few separate passes, but they all belong to the compile step you described: turning source code into assembly.

And this is a bit simplified too. These steps aren't always cleanly separated. The preprocessor typically isn't a separate executable as it once was, but runs early on inside the compiler, which means it may be more or less interleaved with the compile step. Most compilers also allow you to at least partially merge the compile and link steps. MSVC has link-time code generation, for example, which more or less postpones compilation until link time so that it can make better global optimizations.
And as said above, the assembler step isn't usually a separate step either, but is done near the end by the compiler. I'd say it makes sense to list it as a separate step conceptually though.

And of course, this distinction isn't really required by the C++ standard in any case.

But yeah, that's just nitpicking. Overall, I'd say you're right. :)
As far as I know, there's no "Assembler" stage on most compilers. The compiler itself generates object files from the source code by translating it directly into machine code. There's really no need to generate intermediate assembly code.

The linker's job is to combine all the object files and library files into a finished executable and ensure that all symbols are resolved (for example, that a variable declared with "extern int foo;" is actually defined somewhere).
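As a rough sketch of what "resolving symbols" means, the snippet below keeps declarations and definitions apart in a single file. The names shared_counter, next_id and take_two_ids are hypothetical. In a real project the definitions at the bottom would usually sit in a different source file, and matching them to the uses above would be the linker's job.

```cpp
#include <cassert>

// Declarations only: these promise that the symbols exist somewhere.
// In a multi-file project they would normally live in a header.
extern int shared_counter;  // hypothetical global, defined further down
int next_id();              // hypothetical function, defined further down

// This function compiles using nothing but the declarations above. If it
// lived in its own source file, its object file would list shared_counter
// and next_id as unresolved symbols for the linker to fill in.
int take_two_ids() {
    int first = next_id();
    int second = next_id();
    assert(second == first + 1);
    return shared_counter;
}

// The definitions. Placed in another source file, they would end up in a
// second object file, and the linker would connect the two.
int shared_counter = 0;

int next_id() {
    return ++shared_counter;
}
```

If the two definitions were deleted, each function would still compile on its own, and the failure would typically show up only at link time as an "undefined reference" or "unresolved external symbol" error, depending on the toolchain.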

These links might help:

Wikipedia - Object code
Wikipedia - Linker

Edit: Seems I was a bit too slow. [smile]
Template code generation (logically; not sure if it is an actual separate step) happens after macro processing but before compilation.
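A small sketch of that ordering, with made-up names (twice, TWICE_INT, forty_two, doubled_word): the preprocessor rewrites the macro as text first, and only afterwards does the compiler see a template being used with a concrete type and generate code for it.

```cpp
#include <cassert>
#include <string>

// A function template is only a pattern; no code is generated for it
// until it is used with a concrete type.
template <typename T>
T twice(T value) {
    return value + value;
}

// Hypothetical macro: the preprocessor rewrites TWICE_INT(x) to
// twice<int>(x) as plain text. It knows nothing about templates.
#define TWICE_INT(x) twice<int>(x)

// After macro expansion the compiler sees twice<int>(21) and
// instantiates twice for int.
int forty_two() {
    return TWICE_INT(21);
}

// Using the template with a different type triggers a separate
// instantiation, this time for std::string.
std::string doubled_word() {
    return twice(std::string("ab"));
}

// Usage: these checks pass when called from main().
void check_templates() {
    assert(forty_two() == 42);
    assert(doubled_word() == "abab");
}
```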
