How/when do compilers do syntax checking?


5 replies to this topic

#1 Triad_prague   Members   -  Reputation: 140


Posted 02 January 2012 - 09:29 PM

Hi, I've finished a small scripting language that resembles C. I named it clsl (short for C-Like Scripting Language). I'm still missing one critical piece, though: the syntax checker!

So I'm asking you guys, since I really don't know where else to look. How do compilers perform syntax checking, and when do they do it? Do they:
- check the syntax of all source files first, and only proceed to compile if there are no typos or syntax errors,
or do they:
- compile the source, catch errors along the way, and then decide whether to continue or stop the compilation?
The second approach is what I'm using right now. The compiler compiles all the source and aborts once it encounters any error (it also writes the error to a log file). The thing is, I think there are way too many error cases to catch, so this approach is kind of ugly, and of course there are MANY times where my script compiler just crashes and doesn't report anything (a non-reproducible bug/flaw). Do you have any suggestions on how to write a good syntax checker? Thanks, by the way.
the hardest part is the beginning...


#2 frob   Moderators   -  Reputation: 19006


Posted 02 January 2012 - 10:35 PM

The second one. In compiler theory and textbooks, all the phases are separate and distinct. In real life, good compilers can do several things simultaneously. They report each error as soon as it is encountered, so the compiler won't crash without a message. It is also good practice to continue after an error, to give the programmer information about as many issues as possible, since an error message may stem from an issue at another location.
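To make the "continue after an error" idea concrete, here is a minimal sketch of one common recovery technique, often called panic-mode recovery. It is not taken from any particular compiler: the toy grammar (statements of the form `name = number ;`) and the names `tokenize` and `check_syntax` are made up for illustration. On a bad statement it records the error, skips ahead to the next `;`, and keeps checking.

```python
import re

# Toy token pattern: numbers, names, '=' and ';', newlines, blanks, anything else.
TOKEN_RE = re.compile(
    r"(?P<num>[0-9]+)|(?P<name>[A-Za-z_]\w*)|(?P<punct>[=;])"
    r"|(?P<nl>\n)|(?P<ws>[ \t\r]+)|(?P<bad>.)"
)

def tokenize(source):
    """Yield (kind, text, line) tuples, dropping whitespace."""
    line = 1
    for m in TOKEN_RE.finditer(source):
        kind = m.lastgroup
        if kind == "nl":
            line += 1
        elif kind != "ws":
            yield (kind, m.group(), line)

def check_syntax(source):
    """Return a list of error strings for statements of the form `name = number ;`."""
    tokens = list(tokenize(source))
    errors = []
    pos = 0

    def expect(kind, text=None):
        # Consume the current token only if it matches what the grammar wants.
        nonlocal pos
        if pos < len(tokens) and tokens[pos][0] == kind:
            if text is None or tokens[pos][1] == text:
                pos += 1
                return True
        return False

    while pos < len(tokens):
        start_line = tokens[pos][2]
        ok = (expect("name") and expect("punct", "=")
              and expect("num") and expect("punct", ";"))
        if not ok:
            if pos < len(tokens):
                errors.append(f"line {tokens[pos][2]}: unexpected {tokens[pos][1]!r}")
            else:
                errors.append(f"line {start_line}: unexpected end of input")
            # Panic-mode recovery: skip to just past the next ';' and resume.
            while pos < len(tokens) and tokens[pos][1] != ";":
                pos += 1
            pos += 1
    return errors

for message in check_syntax("x = 1;\ny 2;\nz = ;\nw = 3;\n"):
    print(message)
```

Both broken statements (lines 2 and 3) get reported in a single run, and the valid statement on line 4 is still checked. The choice of `;` as the synchronization point is an assumption that suits a C-like language; other grammars may need different recovery points.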
Check out my personal indie blog at bryanwagstaff.com.

#3 edd   Members   -  Reputation: 2105


Posted 03 January 2012 - 12:34 AM

Have a read of this and follow some of the more obvious links therein. Traditionally, by the time the AST is built, the syntax checking will have been done.
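As a rough illustration of why the syntax check is already finished once the AST exists, here is a small recursive-descent sketch for arithmetic with `+`, `*` and parentheses. The grammar, the nested-tuple AST shape, and the name `parse_expression` are all assumptions chosen for the example, not anything from clsl. The parser either returns a complete tree or raises `SyntaxError`; there is no separate checking pass.

```python
import re

def parse_expression(text):
    """Parse `+`/`*` arithmetic with parentheses into a nested-tuple AST."""
    tokens = re.findall(r"[0-9]+|[()+*]|\S", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def factor():
        # factor := NUMBER | '(' expr ')'
        tok = advance()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if re.fullmatch(r"[0-9]+", tok):
            return ("num", int(tok))
        if tok == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        # term := factor ('*' factor)*
        node = factor()
        while peek() == "*":
            advance()
            node = ("mul", node, factor())
        return node

    def expr():
        # expr := term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = ("add", node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse_expression("1 + 2 * 3"))
```

Malformed input such as `(1 + 2` never produces a tree at all, which is the sense in which a built AST implies valid syntax. This version stops at the first error; combining it with error recovery is a separate design decision.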

#4 jeroenb   Members   -  Reputation: 257


Posted 03 January 2012 - 02:08 AM

When you parse your files for statements, you should already be doing some form of syntax checking; otherwise you might be processing an invalid file.
There are a lot of libraries that generate a lexer/parser for you, which basically do some form of grammar checking. Have a look at lex & yacc or ANTLR. For my latest scripting language I used ANTLR, and it's a very good library!

Crafter 2D: the open source 2D game framework

Blog: Crafter 2D
Twitter: @crafter_2d


#5 Triad_prague   Members   -  Reputation: 140


Posted 03 January 2012 - 07:30 AM

+1 for the responses. You've shed some light here. I'll be back here with more questions later...

EDIT : it seems that I'm on the right track. Just need to catch more error cases :)

#6 Telastyn   Crossbones+   -  Reputation: 3724


Posted 03 January 2012 - 05:01 PM

There are three steps in traditional compilers where these sorts of things are caught. Most compilers report all the errors from one step together rather than stopping at the first.

- Lexing step: The first thing the compiler does is read in the characters and split them into logical bits (tokens). If your language forbids something, such as certain Unicode characters, the error is raised here and prevents the later steps from running.
- Parsing step: The second step is to take those tokens and pull out the language syntax. If you require a class name between 'class' and the opening bracket, that sort of thing will be detected here.
- Semantic analysis: As you pull the syntax out into abstract structures, you can run into issues like methods with the same name, or constructor-looking syntax that doesn't have the same name as the class it's in. These errors tend to be more ad hoc and come from the 'rules' of the language rather than the syntax itself.

For me, I have an error collection that is returned from each step. If it is populated, I don't proceed to the next step.
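The "error collection per step" approach could look something like the sketch below. It is only an illustration under assumptions: the toy input format (`class NAME { member; member; }`) and the function names `lex`, `parse`, `analyze` and `compile_source` are invented here, and a real compiler would track line numbers as well. Each step returns its own list of errors, and the driver stops before the next step if that list is not empty.

```python
import re

def lex(source):
    """Step 1: split into tokens; collect characters the toy language forbids."""
    tokens, errors = [], []
    for m in re.finditer(r"[A-Za-z_]\w*|[{};]|\S", source):
        text = m.group()
        if re.fullmatch(r"[A-Za-z_]\w*|[{};]", text):
            tokens.append(text)
        else:
            errors.append(f"lex: illegal character {text!r}")
    return tokens, errors

def is_name(token):
    return token not in ("class", "{", "}", ";")

def parse(tokens):
    """Step 2: expect `class NAME { member; member; ... }`."""
    if tokens[:1] != ["class"]:
        return None, ["parse: expected 'class'"]
    if len(tokens) < 2 or not is_name(tokens[1]):
        return None, ["parse: expected a class name after 'class'"]
    if tokens[2:3] != ["{"] or tokens[-1:] != ["}"]:
        return None, ["parse: expected '{ ... }' around the class body"]
    body = tokens[3:-1]
    members, errors = [], []
    for i in range(0, len(body), 2):
        pair = body[i:i + 2]
        if len(pair) == 2 and is_name(pair[0]) and pair[1] == ";":
            members.append(pair[0])
        else:
            errors.append(f"parse: malformed member near {pair[0]!r}")
    return {"name": tokens[1], "members": members}, errors

def analyze(tree):
    """Step 3: language rules beyond syntax, here duplicate member names."""
    errors, seen = [], set()
    for member in tree["members"]:
        if member in seen:
            errors.append(f"semantic: duplicate member {member!r}")
        seen.add(member)
    return errors

def compile_source(source):
    """Run the steps in order; stop as soon as one reports any errors."""
    tokens, errors = lex(source)
    if errors:
        return errors
    tree, errors = parse(tokens)
    if errors:
        return errors
    return analyze(tree)

print(compile_source("class A { f; f; }"))
```

Stopping between steps keeps later steps simple, since they can assume well-formed input, while each step is still free to gather several errors of its own kind before returning.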






