Compilers articles

Vilem Otte

It's me, bugging around again.

I finally got back to work on my compilers-related series. It is aimed at people who have never touched automata, VMs, or compilers in general (or, more likely, at those who use them without knowing what goes on inside), so I tried to keep the code base quite small, and still manageable and readable for the average reader.

The funny thing is that I've never actually worked on any big compiler (like GCC). So the series may read like the work of someone who tried to understand how compilers work and how to build one, with formal languages and automata theory in mind (or at least whatever stayed there after university). While writing the compiler by myself as a form of education and challenge, I realized how little of this is generally covered, and that a lot of people don't fully understand what goes on behind the scenes. That motivated me to learn more about the theory of compilers and languages, and to write at least a short series of articles to introduce others to the subject (and possibly help them avoid a lot of the trouble I ran into while making my first compiler).

In fact, the original final version of the compiler (from before I started working on the articles) was able to compile something like this:

    int test(int* x)
    {
        x[0] = *x * 2;
        return -1;
    }

    int mult(int** x)
    {
        int* y = x[0];
        y[0] = x[0][0] * 5;
        return -1;
    }

    int main()
    {
        int z[3] = {1, 2, 3};
        int x = 7;
        int y = test(&x);
        y = mult(&z);
        while (x > 0)
        {
            y = y * 2;
            x = x - 1;
        }
        return z[0];
    }

into assembly like this:

    .data
    .text
    test:
        mov eax, 0
        push eax
        mov ebx, [ebp+0]
        mov eax, ebx
        mov eax, [eax]
        push eax
        mov eax, 2
        pop ebx
        mul eax, ebx
        pop ebx
        mov edx, 4
        mul ebx, edx
        mov edx, [ebp+0]
        mov [ebx+edx], eax
        mov eax, 1
        neg eax
        pop ebx
        push eax
        mov eax, 1
        add ebx, eax
        mov eip, ebx
    mult:
        mov eax, 0
        mov ebx, 4
        mul eax, ebx
        mov ebx, [ebp+0]
        mov eax, [eax+ebx]
        mov [ebp+8], eax
        add esp, 4
        mov eax, 0
        push eax
        mov eax, 0
        mov ebx, 4
        mul eax, ebx
        mov ebx, [ebp+0]
        mov eax, [eax+ebx]
        mov ebx, eax
        mov eax, 0
        mul eax, 4
        mov eax, [ebx+eax]
        push eax
        mov eax, 5
        pop ebx
        mul eax, ebx
        pop ebx
        mov edx, 4
        mul ebx, edx
        mov edx, [ebp+8]
        mov [ebx+edx], eax
        mov eax, 1
        neg eax
        sub esp, 4
        pop ebx
        push eax
        mov eax, 1
        add ebx, eax
        mov eip, ebx
    main:
        mov eax, esp
        add eax, 4
        mov [ebp+4], eax
        add esp, 4
        mov eax, 1
        push eax
        mov eax, 2
        push eax
        mov eax, 3
        push eax
        mov eax, 7
        mov [ebp+20], eax
        add esp, 4
        push ebp
        mov ecx, esp
        mov ebx, [ebp+20]
        mov eax, ebx
        mov eax, edx
        push eax
        mov ebp, ecx
        push eip
        call test
        pop eax
        sub esp, 4
        mov esp, ebp
        pop ebp
        mov [ebp+24], eax
        add esp, 4
        push ebp
        mov ecx, esp
        mov ebx, [ebp+4]
        mov eax, ebx
        mov eax, edx
        push eax
        mov ebp, ecx
        push eip
        call mult
        pop eax
        sub esp, 4
        mov esp, ebp
        pop ebp
        mov [ebp+24], eax
    L0:
        mov ebx, [ebp+20]
        mov eax, ebx
        push eax
        mov eax, 0
        pop ebx
        sub eax, ebx
        neg eax
        jle L1
        mov ebx, [ebp+24]
        mov eax, ebx
        push eax
        mov eax, 2
        pop ebx
        mul eax, ebx
        mov [ebp+24], eax
        mov ebx, [ebp+20]
        mov eax, ebx
        push eax
        mov eax, 1
        pop ebx
        sub eax, ebx
        neg eax
        mov [ebp+20], eax
        jmp L0
    L1:
        mov eax, 0
        mov ebx, 4
        mul eax, ebx
        mov ebx, [ebp+4]
        mov eax, [eax+ebx]
        sub esp, 24
        pop ebx
        push eax
        mov eax, 1
        add ebx, eax
        mov eip, ebx
    __start:
        push ebp
        mov ebp, esp
        push eip
        call main
        pop eax
        sub esp, 0
        mov esp, ebp
        pop ebp

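The binary expressions in that listing all seem to follow one pattern: evaluate the left operand into eax, push it, evaluate the right operand into eax, pop the left one into ebx, then apply the operation (subtraction computes right minus left and then negates). The following is only a minimal sketch of how such a generator could look, not the code of the original compiler; the names Expr, AsmBuffer, emit and gen_expr are made up for illustration, and only constants, multiplication and subtraction are handled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical expression node: a constant leaf or a binary operation. */
typedef enum { EXPR_CONST, EXPR_MUL, EXPR_SUB } ExprKind;

typedef struct Expr {
    ExprKind kind;
    int value;                /* used by EXPR_CONST only */
    const struct Expr *lhs;   /* used by binary nodes only */
    const struct Expr *rhs;
} Expr;

/* Fixed-size text buffer that collects the generated assembly. */
typedef struct {
    char text[1024];
    size_t len;
} AsmBuffer;

/* Append one instruction followed by a newline. */
void emit(AsmBuffer *out, const char *line)
{
    size_t n = strlen(line);
    assert(out->len + n + 2 <= sizeof out->text); /* sketch: buffer never grows */
    memcpy(out->text + out->len, line, n);
    out->len += n;
    out->text[out->len++] = '\n';
    out->text[out->len] = '\0';
}

/* Generate code that leaves the value of e in eax. */
void gen_expr(AsmBuffer *out, const Expr *e)
{
    char line[64];

    assert(e != NULL);
    if (e->kind == EXPR_CONST) {
        snprintf(line, sizeof line, "mov eax, %d", e->value);
        emit(out, line);
        return;
    }

    gen_expr(out, e->lhs);    /* left operand ends up in eax ... */
    emit(out, "push eax");    /* ... and is parked on the stack  */
    gen_expr(out, e->rhs);    /* right operand ends up in eax    */
    emit(out, "pop ebx");     /* left operand comes back in ebx  */

    if (e->kind == EXPR_MUL) {
        emit(out, "mul eax, ebx");
    } else {
        /* eax = rhs - lhs, so negate to get lhs - rhs, as in the listing */
        emit(out, "sub eax, ebx");
        emit(out, "neg eax");
    }
}
```

Building a multiplication node over the constants 7 and 2 and passing it to gen_expr yields the same five-instruction shape that appears for `y * 2` in the loop body, with the variable load replaced by a constant load. Note that `mul eax, ebx` is the two-operand form used by this VM's assembly, not the real x86 encoding of mul.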
Although it was structured in a procedural style, it wasn't good enough to be presented. So for the articles I reworked everything from scratch with a better structure, covering not just the practical implementation but also touching on theory from time to time.

At that point I worked out everything I want to put in the article(s):


  • Arithmetic operations on integers
  • Pointer arithmetic, plus the reference and dereference operators
  • Variables and arrays
  • Standard control constructs (if, do-while, while, for)
  • Function calls (which means defining and using a calling convention)
  • Interaction with the host application (the one running the VM)
  • More types: shorts, bytes and floats (and, of course, type promotion)
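Of these items, the control constructs map most directly onto the earlier listing, where the while loop became a label (L0), the condition, a conditional jump to the exit label (L1), the body, and a jump back to L0. Here is a rough sketch of that lowering in C, assuming the condition and body have already been generated as text; LabelGen, new_label and gen_while are hypothetical names, and the exit jump is passed in because it depends on the comparison (jle for `x > 0` in the listing).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical label counter; the listing's L0/L1 suggest sequential numbering. */
typedef struct {
    int next_label;
} LabelGen;

int new_label(LabelGen *g)
{
    return g->next_label++;
}

/*
 * Write the skeleton of a while loop into buf.
 * cond and body are already-generated assembly text, each ending in '\n'.
 * exit_jump is the jump taken when the condition is false (e.g. "jle").
 * Returns what snprintf returns: the number of characters needed.
 */
int gen_while(LabelGen *g, char *buf, size_t size,
              const char *cond, const char *exit_jump, const char *body)
{
    int top, end;

    assert(buf != NULL && size > 0);
    top = new_label(g);   /* loop head, re-evaluates the condition */
    end = new_label(g);   /* first instruction after the loop      */

    return snprintf(buf, size,
                    "L%d:\n"
                    "%s"
                    "    %s L%d\n"
                    "%s"
                    "    jmp L%d\n"
                    "L%d:\n",
                    top, cond, exit_jump, end, body, top, end);
}
```

The other constructs in the list can be lowered with the same two ingredients, fresh labels and jumps; a do-while, for example, would place the condition after the body and jump back to the top while it holds.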

So far, the articles are still somewhere in the middle, slowly getting finished. I'm already planning what comes next, but let's keep that for the next entry.

1 Comment


A fascinating subject. I have no formal education in it but have also worked out the process of converting a language into pseudo-assembly to run through a VM and it is not just rewarding in itself, but gives one a far better grasp of what our real-world compilers are doing for us. A subject to be recommended to all programmers.
