How to make a compiler
- Schol-R-LEA
- Member
- Posts: 1925
- Joined: Fri Oct 27, 2006 9:42 am
- Location: Athens, GA, USA
Re: How to make a compiler
Actually, in terms of the time spent, the dominant part of most compilers is the lexical analyzer - which in most cases, amounts to a Deterministic Finite State Automaton, whether formally constructed or as an ad-hoc equivalent. Either way, I would have a lot of trouble seeing that as a graph coloring problem.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
Re: How to make a compiler
Only if your compiler is an optimizing one.mariuszp wrote:It's literally just lots of graph problems; graph coloring, vertex reachability, etc.
Re: How to make a compiler
That's only true if you either use an existing backend or don't care about optimization. Lexical analysis is solved and there are so many algorithms to do it easily: recursive descend, PEG, LR parsers generators, or even Earley parsers that can handle any CFG.Schol-R-LEA wrote:Actually, in terms of the time spent, the dominant part of most compilers is the lexical analyzer - which in most cases, amounts to a Deterministic Finite State Automaton, whether formally constructed or as an ad-hoc equivalent. Either way, I would have a lot of trouble seeing that as a graph coloring problem.
A LLVM-style Greedy RA is superior to graph coloring RAs in practice, prove me wrong. Graph coloring just cannot easily handle non-homogeneuos register files (and of course, it's also quite expensive). When I wrote an implementation (only a bit over 1k SLOC, can do live range splitting but not register spilling right now) inspired by the LLVM one, I was surprised (i) how natural it is, and (ii) how good the results are if you tune it a bit (for smallish functions with only a few basic blocks/variables, the result is often optimal). It's incredibly easy to tune -- when you see a missed optimization, it's usually straightforward to see how to extend the RA to handle it.mariuszp wrote:It's literally just lots of graph problems; graph coloring, vertex reachability, etc. Prove me wrong.
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].
- Schol-R-LEA
- Member
- Posts: 1925
- Joined: Fri Oct 27, 2006 9:42 am
- Location: Athens, GA, USA
Re: How to make a compiler
facepalm I can't believe I fell for that. Yeah, that can describe more or less any practical algorithm, can't it (as P is a subset of NP).mariuszp wrote:All problems in NP have a polynomial-time reduction to graph coloring.Schol-R-LEA wrote:Actually, in terms of the time spent, the dominant part of most compilers is the lexical analyzer - which in most cases, amounts to a Deterministic Finite State Automaton, whether formally constructed or as an ad-hoc equivalent. Either way, I would have a lot of trouble seeing that as a graph coloring problem.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
Re: How to make a compiler
Doesn't hold when you want to parse C++. (And context-sensitivity gives you a PSPACE-complete problem.)
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].
Re: How to make a compiler
The lexer is actually the simplest and least time consuming part of writing a compiler. There's also pretty much only two ways to design a lexer; one which transforms the entire source file into a token array in a single step which is then handed off to the parser, or one which generates tokens for the parser on the fly. I personally prefer the former design for reasons of performance and code simplicity.Schol-R-LEA wrote:Actually, in terms of the time spent, the dominant part of most compilers is the lexical analyzer - which in most cases, amounts to a Deterministic Finite State Automaton, whether formally constructed or as an ad-hoc equivalent. Either way, I would have a lot of trouble seeing that as a graph coloring problem.
- Schol-R-LEA
- Member
- Posts: 1925
- Joined: Fri Oct 27, 2006 9:42 am
- Location: Athens, GA, USA
Re: How to make a compiler
Sorry if I wasn't clear; I wasn't referring to the implementation time, but the running time during a compile. Depending on the number of potential transitions, and the amount of optimization the compiler performs, lexical analysis can consume up to 80% of the running time (the textbook Modern Compiler Design by Grune, et. al., explains why in detail).Qbyte wrote:The lexer is actually the simplest and least time consuming part of writing a compiler. There's also pretty much only two ways to design a lexer; one which transforms the entire source file into a token array in a single step which is then handed off to the parser, or one which generates tokens for the parser on the fly. I personally prefer the former design for reasons of performance and code simplicity.Schol-R-LEA wrote:Actually, in terms of the time spent, the dominant part of most compilers is the lexical analyzer - which in most cases, amounts to a Deterministic Finite State Automaton, whether formally constructed or as an ad-hoc equivalent. Either way, I would have a lot of trouble seeing that as a graph coloring problem.
Writing a lexical analyzer is pretty easy, comparatively. Writing an efficient one is another story, though as you said, it still is usually simpler and faster to implement than the other components.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
Re: How to make a compiler
Hi mariuszp,
With availability of lot of video courses online, you can do anything you set your mind to. You will need to invest some time and energy for it. With a combination of a book ( a good free one suggested already by others ) and online resources, you should be able to get it done. The classic text by Aho et el is another good reference. Hopefully you can find a low priced used copy online.
I think, these videos are very instructional.
https://courses.cs.washington.edu/cours ... index.html
--Thomas
With availability of lot of video courses online, you can do anything you set your mind to. You will need to invest some time and energy for it. With a combination of a book ( a good free one suggested already by others ) and online resources, you should be able to get it done. The classic text by Aho et el is another good reference. Hopefully you can find a low priced used copy online.
I think, these videos are very instructional.
https://courses.cs.washington.edu/cours ... index.html
--Thomas
Re: How to make a compiler
I remember enjoying reading this too.
https://holub.com/compiler/. The author has released the book and sources in public domain. Read this if you want to learn how to write a lex and yacc clone. However i did not like the design of the final compiler he came up with. There are far more modern and more modular ways to do it. But a good read for sure.
--Thomas
https://holub.com/compiler/. The author has released the book and sources in public domain. Read this if you want to learn how to write a lex and yacc clone. However i did not like the design of the final compiler he came up with. There are far more modern and more modular ways to do it. But a good read for sure.
--Thomas
Re: How to make a compiler
Boost.Spirit is (IMHO) a strong C++ alternative to the C (lex/flex, yacc/bison) way of going about things.
Upsides:
Additionally, let me mention Crafting Interpreters by Bob Nystrom as an introduction on how to make an interpreter (which isn't that different from making a compiler, really). I found that piece to be more enlightening than many more theoretical approaches, especially if you're a beginner in the art and can't wrap your head around all the related lingo yet.
Upsides:
- Integration. No additional syntaxes, no additional tools, no additional building steps.
- Direct generation of C++ object hierarchy.
- Modularity and re-usability of subparsers. (For example, I've created a Boost.Spirit parser for "quoted strings" that includes the usual C escapes, octal, hexadecimal, and Unicode character constants, and generates a UTF-8 std::string. And re-used it in multiple projects since.)
- In combination with Boost iterators, can give very good error messages on failed parses with very little effort.
- Compile times. (Can be alleviated significantly by modularization, see above.)
- "Classic" C++ template error messages (when building the parser) can be daunting. (Might get better with Contracts.)
Additionally, let me mention Crafting Interpreters by Bob Nystrom as an introduction on how to make an interpreter (which isn't that different from making a compiler, really). I found that piece to be more enlightening than many more theoretical approaches, especially if you're a beginner in the art and can't wrap your head around all the related lingo yet.
Every good solution is obvious once you've found it.
Re: How to make a compiler
I presume, that figure of 80% corresponds to non- or nearly non-optimizing compilers.Schol-R-LEA wrote:Depending on the number of potential transitions, and the amount of optimization the compiler performs, lexical analysis can consume up to 80% of the running time (the textbook Modern Compiler Design by Grune, et. al., explains why in detail).
The languages that allow writing code with exponential compile times (where the compiler needs to try this or that in order to resolve ambiguity, potentially many times nested) are probably a separate category of their own.
Re: How to make a compiler
I like things like http://people.cs.pitt.edu/~soffa/research/Comp/gurr.ps as ways to make graph colouring or analysis go through, though I'm admittedly coming from the point of view of someone not really interested enough in compilers to code up an implementation of it or a more recent algorithm in the same vein.Korona wrote:A LLVM-style Greedy RA is superior to graph coloring RAs in practice, prove me wrong. Graph coloring just cannot easily handle non-homogeneuos register files (and of course, it's also quite expensive). When I wrote an implementation (only a bit over 1k SLOC, can do live range splitting but not register spilling right now) inspired by the LLVM one, I was surprised (i) how natural it is, and (ii) how good the results are if you tune it a bit (for smallish functions with only a few basic blocks/variables, the result is often optimal). It's incredibly easy to tune -- when you see a missed optimization, it's usually straightforward to see how to extend the RA to handle it.