Language DefinitionIn the beginning we need to define our language. For start it is going to be fairly simple (hopefully we are going to extend it further) and our expressions will only consist of numbers (integer values), operators (for the beginning we're fine with +, -, * and /) and parentheses.
Backus-Naur FormFor language definition we are going to use so-called BNF (Backus-Naur Form) which is a method to describe context-free grammar. The form is actually set of simple rules written as: [code]
AutomataWith this grammar, we can generate any possible expression that is valid in the language, which is great, but... we do not need to generate all valid inputs of our language, that will be left for the user, we need to accept the language and during that process it is needed to generate the code that is simple enough for some sort of virtual machine to execute. As the language is context-free, it is possible to use a recursive descent parser to accept (e.g. it is valid expression) or reject (e.g. it is invalid expression) the input. I'm not going to write too much about parsers in the series - I challenge users to try writing them. Of course you are free to use Flex or any other generator (but I do not recommend this, unless you know how exactly you would parse something - these production tools are awesome, but only when you know how the actual parsing works). What is a recursive descent parser? It is a top-down parser, built from mutually recursive procedures. Each of these procedures implements often one (and sometimes more) of productions of the grammar it recognizes. The exciting thing about this is that it closely resembles BNF!
Compilation and ExecutionThis is a process of producing machine code from some other language. Let me present a simple abstraction here, let us have a 'program': "Compute three plus four" Such sentence is, well... not really useful, we need to translate it into something we can work with. One of such formal languages is math. If we translate such sentence into: [code] 3+4 [/code] We know a way to compute it (there are multiple computation models how to compute this), let me present one: [code] MOVE reg0 3 MOVE reg1 4 ADD reg0 reg1 PRINT reg0 [/code] Having a machine with 2 registers, allowing for 3 instructions, where:
- Instruction MOVE takes 2nd argument (value) and moves it into 1st argument.
- Instruction ADD takes value stored in register in 2nd argument and adds it to value stored in register in 1st argument
- Instruction PRINT prints the value in 1st argument onto screen