How/when do compilers do syntax checking?
Members - Reputation: 140
Posted 02 January 2012 - 09:29 PM
So I'm asking you guys, since I really don't know where else to look. How do compilers perform syntax checking, and when do they do it? Do they:
- check the syntax of all source files, then proceed to compile only if there are no typos or syntax errors
or do they:
- compile the source, catch errors along the way, and then decide whether to continue or stop the compilation?
Right now, the latter is what I'm using. My compiler compiles all the source and aborts as soon as it encounters an error (it also writes the error to a log file). The thing is, I think there are way too many error cases to catch, so this approach is kind of ugly, and of course there are MANY times when my script compiler just crashes without reporting anything (a bug I can't reproduce). Do you have any suggestions on how to write a good syntax checker? Thanks, btw.
Moderators - Reputation: 20123
Posted 02 January 2012 - 10:35 PM
Members - Reputation: 257
Posted 03 January 2012 - 02:08 AM
There are a lot of libraries that generate a lexer/parser for you, and the generated parser basically does the grammar checking. Have a look at lex & yacc or ANTLR. For my latest scripting language I used ANTLR, and it's a very good library!
Crossbones+ - Reputation: 3726
Posted 03 January 2012 - 05:01 PM
- Lexing step: The first thing the compiler does is read in the characters and split them into logical bits (tokens). If your language forbids something odd, like certain Unicode characters, the error shows up here and prevents the other steps from running.
- Parsing step: The second step is to take those tokens and pull out the language syntax. If you require a class name between 'class' and the opening bracket, that sort of thing is detected here.
- Semantic analysis: As you pull the syntax out into abstract structures, you can run into issues like methods with the same name, or constructor-looking syntax that doesn't have the same name as the class it's in. These errors tend to be more ad hoc; they are the 'rules' of the language rather than the syntax itself.
For me, I have an error collection that is returned from each step. If it is populated, I don't proceed to the next step.
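To make the three steps and the error collection concrete, here is a rough sketch in Python. It is not taken from any real compiler: the toy language (only `class Name { method name; }` declarations) and every function name (`lex`, `parse`, `check`, `compile_source`) are made up for illustration. Each step returns its own list of errors, and the driver only moves on if that list is empty.

```python
import re

# Hypothetical toy language: class Name { method name; ... }
TOKEN_RE = re.compile(
    r"(?P<space>\s+)|(?P<word>[A-Za-z_]\w*)|(?P<punct>[{};])|(?P<bad>.)"
)
KEYWORDS = {"class", "method"}


def lex(source):
    """Step 1: split characters into tokens, reporting disallowed characters."""
    tokens, errors = [], []
    for m in TOKEN_RE.finditer(source):
        if m.lastgroup == "bad":
            errors.append(f"lex: unexpected character {m.group()!r} at offset {m.start()}")
        elif m.lastgroup != "space":
            tokens.append(m.group())
    return tokens, errors


def is_name(tok):
    """A name is any word token that is not a keyword."""
    return tok not in KEYWORDS and (tok[0].isalpha() or tok[0] == "_")


def parse(tokens):
    """Step 2: check the token order; stops at the first syntax error."""
    classes, errors, i = [], [], 0

    def expect(pred, what):
        nonlocal i
        tok = tokens[i] if i < len(tokens) else None
        if tok is not None and pred(tok):
            i += 1
            return tok
        errors.append(f"parse: expected {what}, found {tok!r}")
        return None

    while i < len(tokens) and not errors:
        if expect(lambda t: t == "class", "'class'") is None:
            break
        name = expect(is_name, "a class name")
        if name is None or expect(lambda t: t == "{", "'{'") is None:
            break
        methods = []
        while i < len(tokens) and tokens[i] == "method":
            i += 1
            method = expect(is_name, "a method name")
            if method is None or expect(lambda t: t == ";", "';'") is None:
                break
            methods.append(method)
        if errors or expect(lambda t: t == "}", "'}'") is None:
            break
        classes.append((name, methods))
    return classes, errors


def check(classes):
    """Step 3: language rules that the grammar alone cannot express."""
    errors = []
    for name, methods in classes:
        seen = set()
        for method in methods:
            if method in seen:
                errors.append(f"semantic: duplicate method {method!r} in class {name!r}")
            seen.add(method)
    return errors


def compile_source(source):
    """Run each step in order; return early if a step collected any errors."""
    tokens, errors = lex(source)
    if errors:
        return errors
    classes, errors = parse(tokens)
    if errors:
        return errors
    return check(classes)


print(compile_source("class A { method f; method f; }"))
```

Note that the semantic step collects every duplicate it finds instead of stopping at the first one, while the parser here gives up after one error because it has no recovery logic. Real parsers usually try to resynchronize (for example by skipping to the next ';' or '}') so they can report more than one syntax error per run.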