how machine code is generated

This topic is 4748 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


Whisking right on by a lot of important steps, my understanding of the compilation process ends with lexed/parsed/validated code being saved in the form of a binary tree, with each section of every statement being saved as a node on the tree. The compiler then traverses the tree (node by node) and matches machine code 'templates' to the source/object code instructions it recognizes. The swapped-in templates are then optimized and saved as bytecode to the executable file. (1) Am I at least somewhere on target? (2) This is the *big* one: where do these machine code templates come from? If I wanted to write my own compiler, would I have to know machine code (for the targeted platform), or do the processor manufacturers supply libraries of these templates? Where do they come from, and how are they obtained?

1. You're in the ballpark. For a native compiler the output is machine code rather than bytecode; it is stored in object files (.obj for most Windows compilers), and the linker then merges those into the executable.

2. The templates are data structures, often defined in a header file as an array with roughly one element per opcode. Processor manufacturers supply documentation for their CPUs, and compiler writers use those documents to build the templates. You can find docs for x86-based CPUs at www.sandpile.org.
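
To make the "array of templates" idea concrete, here is a minimal sketch in Python rather than a C header. The names `Template`, `TEMPLATES` and `encode` are made up for illustration, and the table covers only four 32-bit x86 encodings copied by hand from the manufacturer's documentation; a real compiler's table is far larger and also handles register and memory operands.

```python
import struct
from typing import NamedTuple, Optional


class Template(NamedTuple):
    """One hypothetical template entry: fixed opcode bytes plus operand layout."""
    mnemonic: str
    opcode: bytes     # fixed leading byte(s) of the instruction
    imm_format: str   # struct format of the immediate operand, "" if none


# A tiny hand-written table, more or less one element per opcode.
TEMPLATES = {
    t.mnemonic: t
    for t in (
        Template("nop", b"\x90", ""),
        Template("ret", b"\xc3", ""),
        Template("mov eax, imm32", b"\xb8", "<I"),  # little-endian 32-bit immediate
        Template("add eax, imm32", b"\x05", "<I"),
    )
}


def encode(mnemonic: str, imm: Optional[int] = None) -> bytes:
    """Look up a template and fill in its immediate operand, if it has one."""
    template = TEMPLATES[mnemonic]
    if not template.imm_format:
        if imm is not None:
            raise ValueError(f"{mnemonic} takes no operand")
        return template.opcode
    if imm is None:
        raise ValueError(f"{mnemonic} needs an immediate operand")
    # Mask to 32 bits so negative values wrap to their two's-complement form.
    return template.opcode + struct.pack(template.imm_format, imm & 0xFFFFFFFF)


# Machine code for: mov eax, 40 / add eax, 2 / ret
machine_code = (
    encode("mov eax, imm32", 40) + encode("add eax, imm32", 2) + encode("ret")
)
print(machine_code.hex())  # b8280000000502000000c3
```

The code generator never "knows" x86 beyond what is in the table; retargeting to another CPU would mostly mean writing a different table from that CPU's manuals.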

Quote:
Original post by helmslar
(2) This is the *big* one: where do these machine code templates come from? If I wanted to write my own compiler, would I have to know machine code (for the targeted platform), or do the processor manufacturers supply libraries of these templates? Where do they come from, and how are they obtained?


It is common to use templates for the machine code of an abstract machine that doesn't actually exist, but has a relatively simple set of instructions. The instruction set is designed so that each of those instructions can be easily mapped to one or more "real" instructions once the target machine is known. Most optimizations will be done on the "intermediate code", although some may be possible post-translation (if your compiler is smart enough to, say, generate optimized SSE/MMX routines).
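
As a rough sketch of that two-step idea, the following Python flattens an expression tree into code for an imaginary stack machine, then maps each abstract instruction to a few lines of 32-bit x86 assembly text. The tuple-based tree format, the `to_ir` and `lower` functions and the `LOWERING` table are all assumptions made for this example, and the generated assembly is deliberately naive (no register allocation, no optimization).

```python
def to_ir(node):
    """Flatten an expression tree into postfix code for an abstract stack machine.

    A node is either an int or a tuple (op, left, right).
    """
    if isinstance(node, int):
        return [("push", node)]
    op, left, right = node
    return to_ir(left) + to_ir(right) + [(op, None)]


# Each abstract instruction maps to one or more "real" instructions.
LOWERING = {
    "add": ["pop ebx", "pop eax", "add eax, ebx", "push eax"],
    "sub": ["pop ebx", "pop eax", "sub eax, ebx", "push eax"],
    "mul": ["pop ebx", "pop eax", "imul eax, ebx", "push eax"],
}


def lower(ir):
    """Translate abstract stack-machine code into x86 assembly text."""
    asm = []
    for opcode, arg in ir:
        if opcode == "push":
            asm.append(f"push {arg}")
        else:
            asm.extend(LOWERING[opcode])
    return asm


tree = ("add", 2, ("mul", 3, 4))   # 2 + 3 * 4
ir = to_ir(tree)
for line in lower(ir):
    print(line)
```

Only `LOWERING` and `lower` depend on the target; `to_ir`, and any optimization pass that works on the intermediate code, could be shared across targets.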

In some languages/environments, this "intermediate code" is actually what gets executed - well, interpreted. A program looks at each opcode in the intermediate code, looks it up in a table, and executes a corresponding fragment of machine code. (Conceptually, anyway; presumably it is possible to optimize this process - and also to save the bits of machine code in sequence as they are looked up, and use that information directly on subsequent runs of the function in question, which is essentially what a just-in-time compiler does.) This is the principle behind the Java VM and, more loosely, the .NET platform, whose runtime normally JIT-compiles its intermediate code to machine code rather than interpreting it. The Java HotSpot VM takes things one step further: speed optimizations are not done on the .class file bytecode (well, relatively few anyway), but instead the VM runs "optimizer" threads alongside the rest of the program, which track which bits are most often used and compile and optimize those pieces into machine code on the fly.
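
The lookup-and-execute loop can be sketched in a few lines of Python. This is a toy stack machine, not how the JVM or the .NET runtime is actually implemented: the opcode names, the `(opcode, arg)` instruction format and the `run` function are invented for the example, and each table entry is a Python function standing in for a fragment of machine code.

```python
import operator


def run(program):
    """Interpret a list of (opcode, arg) pairs on a simple operand stack."""
    stack = []

    def binary(fn):
        # Build a handler that pops two operands and pushes fn(left, right).
        def handler(_arg):
            right = stack.pop()
            left = stack.pop()
            stack.append(fn(left, right))
        return handler

    # The dispatch table: one entry per opcode.
    handlers = {
        "push": stack.append,
        "add": binary(operator.add),
        "sub": binary(operator.sub),
        "mul": binary(operator.mul),
    }

    for opcode, arg in program:
        handlers[opcode](arg)   # look the opcode up, run its fragment
    return stack.pop()


program = [("push", 2), ("push", 3), ("push", 4), ("mul", None), ("add", None)]
print(run(program))  # 14
```

A JIT-style runtime would, in effect, replace the loop for a frequently executed function with the concatenated machine code of the fragments it would have dispatched to.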

At least, that's my understanding of how it works...

Note that for non-performance-critical applications, it is often fine to just interpret the intermediate "bytecode" continually. Python works like this: your script is compiled to bytecode (cached in a .pyc file for imported modules), which is interpreted by the Python VM. Also, some file formats are actually really a kind of bytecode - as I have just discovered (well, rediscovered in far more detail), MIDI works this way.
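
The Python case is easy to inspect from within Python itself, using the built-in `compile` function and the standard `dis` module. The snippet below is only a quick illustration; the source string and variable names are arbitrary, and the exact opcode names printed differ between CPython versions.

```python
import dis

source = "result = (2 + 3) * x"

# Compile source text to a code object holding CPython bytecode.
code = compile(source, "<example>", "exec")

# List the bytecode instructions the Python VM will interpret.
instructions = list(dis.get_instructions(code))
for ins in instructions:
    print(ins.opname, ins.argrepr)

# Hand the compiled bytecode to the VM for execution.
namespace = {"x": 4}
exec(code, namespace)
print(namespace["result"])  # 20
```

The raw bytes are available as `code.co_code`; a .pyc file is essentially a serialized code object with a small header in front.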

Interpretation is a powerful technique. To master Teh Matricks(TM), you must not draw too sharp a conceptual line between "data" and "code". To do so is to ignore Turing's Big Idea.
