Writing a compiler - how?
Writing a compiler - how?
Hi there,
I'm new on this forum, so please, don't be angry with me if I ask something stupid. And sorry for my poor English, it is not my native tongue.
Well, I would like to write a compiler for a subset of C, but I don't know, where to start. It would be a "fun" project for me, nothing serious. What should I read/learn to do this? I would not use lexer/parser generator, because I would like to learn how a whole compiler build up. Oh, and I'm not newbie in programming.
Thanks for the answers,
tloszk
I'm new on this forum, so please, don't be angry with me if I ask something stupid. And sorry for my poor English, it is not my native tongue.
Well, I would like to write a compiler for a subset of C, but I don't know, where to start. It would be a "fun" project for me, nothing serious. What should I read/learn to do this? I would not use lexer/parser generator, because I would like to learn how a whole compiler build up. Oh, and I'm not newbie in programming.
Thanks for the answers,
tloszk
Re: Writing a compiler - how?
Here's a copy of the classic compiler tutorial by Crenshaw:
http://compilers.iecc.com/crenshaw/
And here's a small C-like compiler that might give some insight:
http://exmortis.narod.ru/comp_src/bflat050.zip
http://compilers.iecc.com/crenshaw/
And here's a small C-like compiler that might give some insight:
http://exmortis.narod.ru/comp_src/bflat050.zip
Re: Writing a compiler - how?
Look at Tiny C Compiler - http://bellard.org/tcc/
It is really TINY opensource compiler able to produce executables (linker is built-in). This project may be used as startup for yours.
It is really TINY opensource compiler able to produce executables (linker is built-in). This project may be used as startup for yours.
Re: Writing a compiler - how?
I'm currently writing a language interpreter for my os, parts of the concept are similar e.g. parsing, recursive descent, and evaluating constant expressions for optimization. you're welcome to read the source but its a basic interpreter not c, so ymmv...
Re: Writing a compiler - how?
I always marvel at how often a given implementation of a concept is confused for study material on the field in general.tloszk wrote:What should I read/learn to do this? I would not use lexer/parser generator, because I would like to learn how a whole compiler build up.
+1 for the book links from berkus. The Dragon Book is to compiler building what K&R is to the C language.
Every good solution is obvious once you've found it.
Re: Writing a compiler - how?
+1 for Dragon Book. It's a must have for both student and collectors.
Re: Writing a compiler - how?
Thanks for all the answers, as I thought, it is not an easy finger exercise. So I will read the Dragon Book, but until I did not do that, I won't write a single line of code, because it would be a piece of waste.
So thanks for you all!
So thanks for you all!
Re: Writing a compiler - how?
Hi,tloszk wrote:Thanks for all the answers, as I thought, it is not an easy finger exercise. So I will read the Dragon Book, but until I did not do that, I won't write a single line of code, because it would be a piece of waste.
So thanks for you all!
You might want to write some code as you go along - the dragon book covers all aspects.
Also, you may want to read up on lex and yacc - those tools can really help you once you know what lexing and parsing entail, to get a compiler up and running in a reasonable amount of time.
If you need any specific help, feel free to ask me - I'm a compiler engineer by trade!
Cheers,
James
Re: Writing a compiler - how?
Hi tloszk ,
Free online links for you learn Compiler Construction. Hopefully it is of some use to you
Free online links for you learn Compiler Construction. Hopefully it is of some use to you
- Free Video Lectures on Compiler Development from University of Washington
- Source of of Compiler Blue , it makes use of .Net reflection to generate the final assembly. It makes use of a recursive descent parser
- Compiler Construction by Writh . Note : Direct link to pdf.
- NPTEL compiler development Course from IIT Kanpur
- Practical Compiler Development
Re: Writing a compiler - how?
Free Online Course:
Planned Launch April 23rd - duration 8 Weeks
https://www.coursera.org/course/compilers
Planned Launch April 23rd - duration 8 Weeks
https://www.coursera.org/course/compilers
50₰
Re: Writing a compiler - how?
I recently wrote my own programming language which compiles to javascript. The first version took me about 20 minutes (though the compiler is in python) so I would not say this is a crazy project (although c is a hard language). The default Compiler Construction: Theory and practice by William A Barrett, Rodney M. Bates, David A. Gustafson and John D. Couch. Essentially the key to creating a compiler would be to first look for tokens like ;,{,:,\n which seperate lines. Then if there is a { or : look for keywords such as if, else etc. Try to determine if there is an assignment using the "=". Hunt for types like "int" and so on. After you've got through of lexing and parsing generating code is easy.
Get back to work!
Github
Github
Re: Writing a compiler - how?
Argh! What?! No! Argh!!! Are you seriously suggesting that horrific ad-hoc approach is a substitute for proper lexing and parsing!?Essentially the key to creating a compiler would be to first look for tokens like ;,{,:,\n which seperate lines. Then if there is a { or : look for keywords such as if, else etc. Try to determine if there is an assignment using the "=". Hunt for types like "int" and so on. After you've got through of lexing and parsing generating code is easy.
Re: Writing a compiler - how?
@ACcurrent:
Seriously. That is basically violating everything that is written in any of the books recommended here. If you go about lexing a language procedurally like you are suggesting, you will get nowhere.
Especially since it's so completely unnecessary with C, for which complete BNF descriptions exist.
Seriously. That is basically violating everything that is written in any of the books recommended here. If you go about lexing a language procedurally like you are suggesting, you will get nowhere.
Especially since it's so completely unnecessary with C, for which complete BNF descriptions exist.
Every good solution is obvious once you've found it.
Re: Writing a compiler - how?
It works so... I'm compromising speed as a compiler's speed does not really matter, its more the speed of the executable code that matters. And after all, the 1st version of JS# was written in 20 minutes. Though yes, I do agree on the fact that some times the method I described above is not really going to cut it. In fact I have found that using "ad-hoc" methods makes it much easier to write optimization loops. The code can get messy though and trudging through it can be a pain. Essentially what I mean is that the main challenge is lexing and tokenization. After one tokenizes the language the work becomes easy.
Get back to work!
Github
Github
Re: Writing a compiler - how?
No. I'm sorry, I can support many opinions that I don't share, but this one is just plain wrong.ACcurrent wrote:It works so... I'm compromising speed as a compiler's speed does not really matter, its more the speed of the executable code that matters. And after all, the 1st version of JS# was written in 20 minutes. Though yes, I do agree on the fact that some times the method I described above is not really going to cut it. In fact I have found that using "ad-hoc" methods makes it much easier to write optimization loops. The code can get messy though and trudging through it can be a pain. Essentially what I mean is that the main challenge is lexing and tokenization. After one tokenizes the language the work becomes easy.
Not only that, but you advocated your utterly braindead solution to someone else, and that I cannot abide.
Think about it this way, Lexing matches regular languages. You can do that with regular expressions. Parsing matches context-free or context-sensitive languages. Many languages (including all C dialects, Java, and anything with the '(Type)x' cast notation) require help from the semantic analysis step to parse.
Once parsed, the AST then has to be semantically analysed for semantic flaws. Unless you don't care about error checking for some reason.
Then, that AST has to be converted into some simpler intermediate representation to flatten structure and prepare for optimisation. Let's assume you don't care about optimisation for the moment as that's more complex.
To prepare for code generation, you need a way of converting your IR into machine instructions for your chosen target. Not only that but you'll need to know all about the target's ABI and instruction set.
A nontrivial compiler is the most difficult program that exists on a computer, kernel included.. It cannot be done in 131 lines of the world's most crufty Python. What you implemented was an (extremely poor) source-to-source translator.