
How C++ Programs Are Compiled (A Brief Look)


I was having difficulties figuring out how to properly compile and link a C++ program I was working on, so I did some research on how compiling actually works. I felt so informed that I decided to post about it on my blog, and then I was so proud of the blog post that I thought I should spam this forum with a link to it. So I present to you an introductory look at just how C and C++ programs are compiled, from the basics of preprocessing to the difference between static and dynamic libraries. I hope you find it informative and maybe even useful for your own programming: http://dunoid.org/index.php/2016/03/02/the-c-compilation-process/

Edited by Dunoid


You can always start a developer journal on this site, even if all you want to do there is post excerpts that link to your real blog. That's perfectly fine ("spamming" the forums isn't though).

 

That being addressed now, we don't need to keep piling on OP's rep now, do we?

 

 
 
He's got a net positive now, mainly due to his (very mature and reasoned) response to Apoch's critiques. I feel like the system is working as designed.
Edited by Josh Petrie


A brief suggestion: I would not use basic information about compilers as a portion of a resume. The topic is actually more complex than what your blog is covering. Here's a veeeeeery basic explanation. A lot of details are skipped.

 

[Image: compiler.gif]

 

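In reality, compilers are usually divided up into several stages. GCC is a good example: the gcc command is really a driver that runs separate programs one after another (the preprocessor and compiler proper, then the assembler, then the linker).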
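If you want to watch the stages one at a time, GCC lets you stop after each of them. Here is a small sketch; the file name square.cpp, the macro, and the function are all made up for the example, while -E, -S, and -c are standard GCC options.

```cpp
// square.cpp: a tiny translation unit for watching each stage separately.
// The file name, macro, and function are hypothetical.
//
//   g++ -E square.cpp -o square.ii    stop after preprocessing
//   g++ -S square.cpp -o square.s     stop after compiling to assembly
//   g++ -c square.cpp -o square.o     stop after assembling to an object file
//
// Linking is a further step that combines object files into an executable;
// it needs one of the object files to define main().

#define SQUARE_BIAS 0   // expanded by the preprocessor, so it is gone in the -E output

int square_plus_bias(int x)
{
    return x * x + SQUARE_BIAS;
}
```

Opening square.ii and square.s in a text editor shows what the next stage actually receives.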

 

First stage is the lexer, which tokenizes your code. Tokens are short strings with a defined kind and rules, such as keywords, identifiers, numbers, and operators, and they are usually described with regular expressions. Working with tokens instead of raw characters makes the next stages easier to manage. (The context-free grammar comes in at the next stage, the parser. "Context free" means, roughly, that each grammar rule applies the same way no matter what surrounds it, so all we care about is the structure.) You can look this up at some point; most compiler textbooks cover it. Here's a link: https://en.wikipedia.org/wiki/Lexical_analysis
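To make tokenizing a bit more concrete, here is a minimal sketch of a lexer. It is nowhere near what a real C++ compiler does; the token kinds and the tokenize function are invented for the example, and it only knows identifiers, whole numbers, and one-character symbols.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical token kinds for this sketch; real C++ has many more.
enum class TokenKind { Identifier, Number, Symbol };

struct Token {
    TokenKind kind;
    std::string text;
};

// Split source text into tokens, skipping whitespace.
std::vector<Token> tokenize(const std::string& src)
{
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;
        } else if (std::isalpha(c) || c == '_') {
            // Identifier or keyword: a letter or underscore, then letters, digits, underscores.
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) {
                ++i;
            }
            tokens.push_back({TokenKind::Identifier, src.substr(start, i - start)});
        } else if (std::isdigit(c)) {
            // Number: a run of decimal digits.
            std::size_t start = i;
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) {
                ++i;
            }
            tokens.push_back({TokenKind::Number, src.substr(start, i - start)});
        } else {
            // Everything else is a one-character symbol in this sketch,
            // so "++" comes out as two '+' tokens.
            tokens.push_back({TokenKind::Symbol, std::string(1, src[i])});
            ++i;
        }
    }
    return tokens;
}
```

Calling tokenize("int a = 0;") gives five tokens: int, a, =, 0, and ;. Note that this sketch does not tell keywords apart from identifiers, which a real lexer would.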

 

The lexer is usually combined with a parser, also called the syntactic analyzer. This is an error-checking stage: it analyzes the string of tokens to make sure your code follows the grammatical rules of the language.
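As a rough illustration of what "follows the grammatical rules" means, here is a sketch of a recursive-descent syntax checker for a toy grammar of additions and parentheses. The grammar and the SyntaxChecker class are assumptions made up for this example, and it reads characters directly instead of tokens to keep it short.

```cpp
#include <cctype>
#include <string>

// Grammar for this sketch (hypothetical, far smaller than C++):
//   expr := term ('+' term)*
//   term := NUMBER | '(' expr ')'
class SyntaxChecker {
public:
    explicit SyntaxChecker(const std::string& text) : text_(text) {}

    // True when the whole input matches the grammar.
    bool accepts()
    {
        pos_ = 0;
        if (!expr()) return false;
        skipSpaces();
        return pos_ == text_.size();
    }

private:
    void skipSpaces()
    {
        while (pos_ < text_.size() &&
               std::isspace(static_cast<unsigned char>(text_[pos_]))) {
            ++pos_;
        }
    }

    bool expr()
    {
        if (!term()) return false;
        skipSpaces();
        while (pos_ < text_.size() && text_[pos_] == '+') {
            ++pos_;                      // consume '+'
            if (!term()) return false;   // a '+' must be followed by a term
            skipSpaces();
        }
        return true;
    }

    bool term()
    {
        skipSpaces();
        if (pos_ >= text_.size()) return false;
        if (text_[pos_] == '(') {
            ++pos_;                      // consume '('
            if (!expr()) return false;
            skipSpaces();
            if (pos_ >= text_.size() || text_[pos_] != ')') return false;
            ++pos_;                      // consume ')'
            return true;
        }
        if (!std::isdigit(static_cast<unsigned char>(text_[pos_]))) return false;
        while (pos_ < text_.size() &&
               std::isdigit(static_cast<unsigned char>(text_[pos_]))) {
            ++pos_;
        }
        return true;
    }

    std::string text_;
    std::size_t pos_ = 0;
};
```

With this, SyntaxChecker("1 + (2 + 3)").accepts() is true, while "1 + + 2" and "(1 + 2" are rejected. A real parser would also report where and why the input failed, and would build a tree as it goes.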

 

Somewhere along those lines, you have an optimizer. The parser arranges your tokens into a tree structure, the abstract syntax tree. The optimizer looks for known patterns in that tree (or in a lower-level intermediate form built from it) and tries to improve the code where it can.
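One of the simplest patterns an optimizer can look for is an operation on two constants, which it can compute ahead of time (constant folding). Below is a minimal sketch on a toy expression tree; the Expr type and the helper functions are invented for the example and do not reflect how any particular compiler represents code.

```cpp
#include <memory>
#include <utility>

// A hypothetical expression tree: constants, opaque variables, and additions.
enum class Kind { Constant, Variable, Add };

struct Expr {
    Kind kind = Kind::Constant;
    int value = 0;                  // meaningful only for Kind::Constant
    std::unique_ptr<Expr> left;     // used only for Kind::Add
    std::unique_ptr<Expr> right;    // used only for Kind::Add
};

std::unique_ptr<Expr> constant(int v)
{
    auto e = std::make_unique<Expr>();
    e->kind = Kind::Constant;
    e->value = v;
    return e;
}

std::unique_ptr<Expr> variable()
{
    auto e = std::make_unique<Expr>();
    e->kind = Kind::Variable;
    return e;
}

std::unique_ptr<Expr> add(std::unique_ptr<Expr> l, std::unique_ptr<Expr> r)
{
    auto e = std::make_unique<Expr>();
    e->kind = Kind::Add;
    e->left = std::move(l);
    e->right = std::move(r);
    return e;
}

// Replace any Add whose operands both fold to constants with a single constant.
std::unique_ptr<Expr> fold(std::unique_ptr<Expr> e)
{
    if (e->kind != Kind::Add) return e;
    e->left = fold(std::move(e->left));
    e->right = fold(std::move(e->right));
    if (e->left->kind == Kind::Constant && e->right->kind == Kind::Constant) {
        return constant(e->left->value + e->right->value);
    }
    return e;
}
```

Folding add(constant(2), constant(3)) yields a single constant node holding 5. Folding add(variable(), add(constant(1), constant(2))) leaves the outer addition in place, because the variable's value is unknown, but its right side becomes the constant 3.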

 

Example:

int a = 0;
for (int i = 0; i < 30; ++i)
{
    ++a;
}

The optimizer will see this for loop. Notice that there's no real need for it: all it does is add one to a a fixed number of times, so the final value can be worked out at compile time. When the optimizer makes its edit, the code will look like this.

int a = 29;

Once the code passes the syntax check and gets optimized, it's passed to the code generation stage, which turns the compiler's internal representation of the program into assembly code. This assembly code is implementation specific: different compilers do different things and produce different assembly for their target devices. This may seem redundant, but it's an important stage for how things get linked, among other things. I'm not talking about it because it gets complex. REAL COMPLEX. I barely understand it.

 

The assembly code is then converted by the assembler into machine code, following the rules of the target instruction set. Machine code is binary; hex is just a more compact way of writing the same bytes. Some scripting languages and virtual machines use bytecode for their own virtual instruction set instead. That... is also pretty specific to the targeted system.
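On the hex versus binary point, a short sketch may help: the two are different spellings of the same byte. The helper names below are made up; the byte 0xC3 is used because it is the one-byte near-return instruction on x86.

```cpp
#include <bitset>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Write one byte as eight binary digits.
std::string asBinary(std::uint8_t byte)
{
    return std::bitset<8>(byte).to_string();
}

// Write the same byte as two lowercase hex digits.
std::string asHex(std::uint8_t byte)
{
    std::ostringstream out;
    out << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(byte);
    return out.str();
}
```

asBinary(0xC3) gives "11000011" and asHex(0xC3) gives "c3". The byte stored in the object file is the same either way; only the notation a tool uses to display it changes.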

Edited by Tangletail



[OCD]

Just needed to correct this -- a would be equal to 30 in the given example, not 29.

[/OCD]


Probably not?

When i = 29, a = 29. At the end of that iteration, i is incremented to 30, which ends the for loop. The loop body only runs while i < 30; i may not be equal to thirty.

Edited by Tangletail



for (int i=0; i < K; ++i)
{
   // This code is executed K times
   // (OCD disclaimer 1: normal cases where K >= 0 && K < int.MaxValue)
   // (OCD disclaimer 2: and where the optimizer didn't optimize out the loop.)
   // (OCD disclaimer 3: and where 'i' and 'K' are not modified in any other way than what is seen here.)
   // (OCD disclaimer 4: and where the processor, RAM, etc. do not have hardware defects.)
   // (OCD disclaimer 5: and where the thread/process/computer is not shut down abruptly.)
}
(OCD edit: forgot the "int")
(OCD edit 2: moved the first disclaimer to line up nicer with the others)
Edited by Nypyren
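For anyone who wants to check the count rather than argue about it, here is a small sketch that makes the compiler do the counting. The function name countIterations is made up for the example, and it assumes a compiler with C++14 or later, since constexpr functions with loops need that.

```cpp
// Count how many times the body of `for (int i = 0; i < limit; ++i)` runs.
constexpr int countIterations(int limit)
{
    int a = 0;
    for (int i = 0; i < limit; ++i) {
        ++a;   // runs once for each i in 0, 1, ..., limit - 1
    }
    return a;
}

// Evaluated at compile time: the build fails if the count is not 30.
static_assert(countIterations(30) == 30, "the loop body runs 30 times");
```

The body runs for i = 0 through i = 29, which is 30 values, so a ends up as 30. When i is 29 the body has not yet run for that iteration, so a is 29 going in and 30 coming out.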
