Writing a compiler?

Started by
25 comments, last by FearedPixel 21 years, 1 month ago
For my university project next year, I am planning on writing a compiler for a language I created. At this point I guess I am a bit ahead of myself, but I have over a year to research the topic and understand it. I have a few years of C/C++ experience, half a year of C#, and I know a bit of assembly. I guess I will be writing the compiler itself in C#.

My question is, does anyone know any resources (websites/books) that would help me? I guess I would need information about the Windows exe protocol (so I know exactly what makes up an executable), and how exactly I could convert my high-level statements into assembly.

I have already done some research into tokenizers and how my code should be parsed. The area where I think I am lost is the step after that: actually converting to assembly and building an executable (although I have done exercises that involved converting loops and conditions into assembly, etc.). Any help appreciated.

[edited by - FearedPixel on March 4, 2003 7:46:06 AM]
I've heard Compilers: Principles, Techniques, and Tools is a good book. Haven't read it though.
"-1 x -1 = +1 is stupid and evil."-- Gene Ray
the dragon book is the de-facto standard

here be dragons

[edited by - petewood on March 4, 2003 7:54:52 AM]
Look here for some additional info

http://www.scifac.ru.ac.za/compilers/conts.htm

You should probably check out the Lex & Yacc link in my sig too.

If it's an interpreted language, you could check out the scripting language tutorials here at gamedev, and I think they have one at www.flipcode.com too.

God speed.

#define Email Lex & Yacc Function Pointers Virtual Terrain Knowledge Base Real Programmers
"Evolution is NOT a mistake"
I found this nice link

http://netsoc.kst.dit.ie/files/books/Computers/

Check that "Compilers And Compiler Generators" link.
Game Scripting Mastery would probably be quite relevant to this. It teaches you to create a compiler and virtual machine for a scripting language.

I've got it, and I highly recommend it.
I agree that the Dragon Book (Aho et al.) is excellent.

I used it to write my first interpreter and it gave me (what I feel is) a very strong understanding of the underlying concepts behind compilers, languages, and language design.

I absolutely recommend checking it out of the library (80 dollars is steep for a book!)

Thanks everyone for the help.

I intend to make it compiled; however, if that turns out to be too difficult I will look into making it interpreted, and produce exes which are simply the interpreter with the program attached to the end of it.
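
One way the "interpreter with the program attached" idea could work is a small trailer at the very end of the file that records how long the attached program is. The sketch below is in Python and works on byte strings only; the `FPXL` marker, the trailer layout, and the `attach`/`extract` names are all assumptions made up for illustration, not part of any Windows format.

```python
import struct

MAGIC = b"FPXL"                  # hypothetical 4-byte marker, chosen for this sketch only
TRAILER = struct.Struct("<4sI")  # marker + payload length, little-endian

def attach(stub: bytes, program: bytes) -> bytes:
    """Return stub + program + trailer; the stub stands in for the interpreter exe."""
    return stub + program + TRAILER.pack(MAGIC, len(program))

def extract(image: bytes) -> bytes:
    """Recover the attached program by reading the trailer at the end of the image."""
    if len(image) < TRAILER.size:
        raise ValueError("no attached program found")
    magic, length = TRAILER.unpack(image[-TRAILER.size:])
    if magic != MAGIC:
        raise ValueError("no attached program found")
    end = len(image) - TRAILER.size
    return image[end - length:end]
```

In a real build the interpreter would presumably open its own executable file at startup and run `extract` on its contents; that step is left out here.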

Either way, I definitely want to be able to produce Windows executables.
quote:Original post by FearedPixel
I guess I will be writing the compiler itself in C#.

Why? C# isn't noted for its text processing capabilities, nor for low-level operations (though you really don't need low-level operations to write a simple compiler - you do to write a competitive compiler, though). Believe it or not, Perl would be a better option to write a compiler in than C# (syntax validation, tokenization, pattern recognition, assembly source generation... all text operations, all easier/better done in Perl).

Not that you can't do it in C# or that C# is a poor choice. I'm just wondering if you're constrained by, say, the languages you know as opposed to the best tool for the job...

quote:My question is, does anyone know any resources (websites/books) that would help me? I guess I would need information about the windows exe protocol (so I know exactly what makes up an executable), and how exactly I could convert my high-level statements into assembly.

I don't think you need to know anything about Windows' executable "protocol". All you need to know is how to generate correct (and preferably fast) assembly and then invoke the assembler.
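
A minimal sketch of "generate assembly as text", in Python, assuming a made-up mini AST where a node is either an integer or a tuple `(op, left, right)`. It emits naive 32-bit x86 in Intel syntax using the stack for temporaries; the register choices and the `gen`/`compile_expr` names are illustrative assumptions, and no attempt is made at the "fast" part.

```python
# Hypothetical mini AST: an int literal, or a tuple (op, left, right).
OPS = {"+": "add eax, ecx", "-": "sub eax, ecx", "*": "imul eax, ecx"}

def gen(node, out):
    """Append instructions that leave the value of `node` in eax."""
    if isinstance(node, int):
        out.append(f"mov eax, {node}")
        return
    op, left, right = node
    gen(left, out)
    out.append("push eax")      # save the left operand on the stack
    gen(right, out)
    out.append("mov ecx, eax")  # right operand -> ecx
    out.append("pop eax")       # left operand -> eax
    out.append(OPS[op])         # eax = left <op> right

def compile_expr(node):
    """Return the assembly text for one expression, ending in ret."""
    out = []
    gen(node, out)
    out.append("ret")
    return "\n".join(out)

print(compile_expr(("+", 1, ("*", 2, 3))))
```

The output is plain text; it would still need a label, section directives, and whatever else the chosen assembler expects before it could be assembled.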

quote:I have already done some research in tokenizers and how my code should be parsed, the area which I think I am lost in is the step after that, actually converting to assembly and builing an executable.

If you've parsed the code, then you've gathered the information necessary to determine the unambiguous intent of the code (if there's ambiguity not covered by the language definition, spit the code in the user's face) and can emit the appropriate assembly language constructs, taking care of issues like register allocation, the frame pointer and so forth. Personally, I dislike writing assembly (except on a SPARC!) so I would have written my compiler to emit C and then invoked GCC. That's how cfront (the original C++ frontend) worked.
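
The emit-C route can be sketched in a few lines of Python. This assumes a made-up mini AST (an integer, or a tuple `(op, left, right)` with `op` one of `+`, `-`, `*`); `to_c` and `emit_program` are hypothetical names, and this is only an illustration of the idea, not how cfront itself was structured.

```python
def to_c(node):
    """Translate the mini AST into a fully parenthesised C expression."""
    if isinstance(node, int):
        return str(node)
    op, left, right = node
    return f"({to_c(left)} {op} {to_c(right)})"

def emit_program(node):
    """Wrap the expression in a complete C program that prints its value."""
    return (
        "#include <stdio.h>\n"
        "int main(void) {\n"
        f'    printf("%d\\n", {to_c(node)});\n'
        "    return 0;\n"
        "}\n"
    )

print(emit_program(("+", 1, ("*", 2, 3))))
```

Writing that text to a file such as `out.c` and running `gcc out.c -o out` would then leave register allocation and the frame pointer to the C compiler.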

Another thing you might want to consider is to define your language such that it can both be runtime interpreted (very useful during development/debugging) and compiled (usually smaller distributable, [marginally] faster execution) - maybe even JIT compiled! Supporting reflection and introspection would also be cool, and very instructive.
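
As a rough illustration of the interpreted side, the parsed tree can be walked and evaluated directly instead of being turned into assembly or C. The sketch is in Python and assumes a made-up mini AST (an integer, or a tuple `(op, left, right)`); `evaluate` is a hypothetical name.

```python
import operator

# Map each operator symbol in the mini AST to the function that implements it.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(node):
    """Tree-walking interpreter: compute the value of the expression directly."""
    if isinstance(node, int):
        return node
    op, left, right = node
    return OPS[op](evaluate(left), evaluate(right))

print(evaluate(("+", 1, ("*", 2, 3))))
```

Because the interpreter and a code generator would both consume the same tree, the front end (tokenizer and parser) only has to be written once.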

Are you required to have designed the language, or can you extend/implement an existing language (e.g. adding multipass reference resolution to C++, maybe packages too, to eliminate #includes or at least eliminate the need for forward referencing, or write a native compiler for Python)?

Check out flex++/bison++; they are good tools for creating scanners and parsers (things that all compilers need).

I'm in a compiler course right now, and we use these tools extensively. They wouldn't be useful for a C# project, but maybe C# has something similar.
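
For a sense of what a generated scanner does, here is a hand-written equivalent in Python built on the standard `re` module. The token names and patterns in `TOKEN_SPEC` are assumptions for a toy language, not anything flex++ produces.

```python
import re

# Hypothetical token set for a toy language: (token name, regular expression).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
]
# One combined pattern with a named group per token kind.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Split text into (kind, lexeme) pairs, dropping whitespace."""
    pos = 0
    tokens = []
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(tokenize("x = 3 + 42"))
```

The list of `(kind, lexeme)` pairs is what a parser would then consume.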
