Writing a compiler - how?

JamesM · Post by **JamesM** » Wed May 02, 2012 10:41 am

turdus wrote:
JamesM wrote:... I think some people are missing the point.

RPN is extremely easy to parse as there is no precedence. However, you need to have your language defined such that it uses RPN.
Sorry, but you're totally wrong. Any infix notation with different precedence operators can be converted into predecende-free postfix notation without problems, and it's totally independent of the language.

Have you checked the link I posted?

No, it can't. It can't by definition - the precedence and associativity of operators are part of the language. Therefore, a mapping between infix and postfix operators cannot exist that is independent of the language.

If you can convert to postfix, you have defined precedence, ordering and associativity and as such can use stuff like the shunting yard algorithm to move things around.

But using RPN is not a drop-in replacement for recursive descent or stack-automaton parsing; RPN + shunting yard or a similar algorithm is.

Thomas · Post by **Thomas** » Wed May 02, 2012 1:17 pm

Take everything I say with a grain of salt , I do not work with compilers , but once I have found bugs in the code generated and reported it to the compiler team

. That's the most compilerly thing I have done

. Later this week , I will try to post a mind map summarizing whatever I learn 't hoping that it is of some use to someone.

--Thomas

turdus · Post by **turdus** » Wed May 02, 2012 1:41 pm

JamesM wrote:...the precedence and associativity of operators are part of the language.

No. These rules are defined by pure math, and therefore common among all languages. It doesn't matter you wrote the following in C, in Java, PHP or in BASIC, it's the same.

Code: Select all

(1+2)*3-4<5+(7/6)

Of course language may define new operators or use different pattern (like .lt.), but it will be definitely called broken if returns false as result for the above.

Thomas · Post by **Thomas** » Wed May 02, 2012 1:51 pm

No. These rules are defined by pure math, and therefore common among all languages

Incorrect ! . James is right.

It doesn't matter you wrote the following in C, in Java, PHP or in BASIC, it's the same.

They are mostly imperative programming languages
--Thomas

Combuster · Post by **Combuster** » Wed May 02, 2012 1:55 pm

Still, the most common operator in maths doesn't function equally between formal maths, C and Basic and many other languages, and then we're not even talking about a detail like precedence. Yes I'm talking about the single =

And nobody complains about that being broken even though it's obviously wrong.

Rudster816 · Post by **Rudster816** » Wed May 02, 2012 3:29 pm

Combuster wrote:Still, the most common operator in maths doesn't function equally between formal maths, C and Basic and many other languages, and then we're not even talking about a detail like precedence. Yes I'm talking about the single =

And nobody complains about that being broken even though it's obviously wrong.

Undoubtedly the most illogical part of the C operator precedence is the >=, >, etc operators taking precedence over the && operator.
E.g.

Code: Select all

if (x < 10 && x >= 0)

means

Code: Select all

if ((x < (10 && x)) >= 0)

As far as parsing goes, the order of operations does present a problem in that you must read the entire expression, and iterate through all of the operators to find which to evaluate first, second, third, etc. The effort is further complicated by the fact that some operators are evaluated left to right while others are left to write. The fact that the order for C like languages is somewhat unintuitive is pretty much irrelevant to the compiler writer though.

AndrewAPrice · Post by **AndrewAPrice** » Wed May 02, 2012 3:57 pm

Rudster816 wrote:Undoubtedly the most illogical part of the C operator precedence is the >=, >, etc operators taking precedence over the && operator.

It shouldn't. You must be using a non-conforming compiler?
http://en.wikipedia.org/wiki/Operators_ ... precedence

Rudster816 · Post by **Rudster816** » Wed May 02, 2012 4:39 pm

MessiahAndrw wrote:
Rudster816 wrote:Undoubtedly the most illogical part of the C operator precedence is the >=, >, etc operators taking precedence over the && operator.
It shouldn't. You must be using a non-conforming compiler?
http://en.wikipedia.org/wiki/Operators_ ... precedence

I think you need to read your own link

. Number 8 is >, <, >=, <=. Number 13 is &&.

mnovotny · Post by **mnovotny** » Wed May 02, 2012 4:50 pm

Rudster816 wrote:
Code: Select all
if (x < 10 && x >= 0) 
means
Code: Select all
if ((x < (10 && x)) >= 0) 

That's Pascal you are talking about. In C the expression evaluates as:

Code: Select all

if ((x < 10) && (x >= 0))

AndrewAPrice · Post by **AndrewAPrice** » Wed May 02, 2012 5:39 pm

Rudster816 wrote:I think you need to read your own link . Number 8 is >, <, >=, <=. Number 13 is &&.

I stand corrected. May explain a couple of bugs I've encountered in the past.

I've got into the habit of explicitly using brackets lately since I deal with many languages day-to-day, just for situations like this!

Jezze · Post by **Jezze** » Wed May 02, 2012 5:42 pm

I saw no one had posted the url to this awesome tutorial.

It shows how to create a small subset-of-c-language for llvm!

http://gnuu.org/2009/09/18/writing-your ... -compiler/

Oh, and the source for the tutorial can be found here:

https://github.com/lsegal/my_toy_compiler

gerryg400 · Post by **gerryg400** » Wed May 02, 2012 8:37 pm

Rudster816 wrote:Undoubtedly the most illogical part of the C operator precedence is the >=, >, etc operators taking precedence over the && operator.
E.g.
Code: Select all
if (x < 10 && x >= 0) 
means
Code: Select all
if ((x < (10 && x)) >= 0) 

You said that ">=,>,etc" take precedence over "&&" so doesn't that mean ?

Code: Select all

if ((x < 10) && (x >= 0))

IMHO the only little problem with C precedence is the bitwise & and | operators having lower precedence than the (in)equality operators. The others feel pretty intuitive to me.

turdus · Post by **turdus** » Thu May 03, 2012 6:44 am

Combuster wrote:Yes I'm talking about the single =

"Single =" is not an operator, just a mere symbol. "Let" and "equals" are. And they work as expected for every language, regardless to representation.

@Thomas: can you name any sane language where addition has higher precedence than multiplication? Or can you name a language where associativity of parentheses differs from usual? Or speaking of logical connectives, where negation, conjuntion or disjuntion differs from their mathematical equivalent?

Thomas · Post by **Thomas** » Thu May 03, 2012 8:38 am

Dear turdus,

@Thomas: can you name any sane language where addition has higher precedence than multiplication? Or can you name a language where associativity of parentheses differs from usual? Or speaking of logical connectives, where negation, conjuntion or disjuntion differs from their mathematical equivalent?

You are mixing concept and implementation together. The class of languages you mentioned represent imperative programming languages and most of them have a similar syntax . take for example FORTH , there is no need for a parenthesis or precedence

Precedence and operators are for humans and therefore part of the language. In "pure" mathematics there are only relations /funtions defined over a set . Addition and Subtraction etc are basically functions defined over the set of real , natural , complex numbers .... symbols + , - etc are just for the sake of convenience but leads to ambiguity . Complex operations are represented as composition of functions.

eg a+b * c ...mathematically it would be put as SUM_OF(a , PRODUCT(b,c) ) where SUM_OF and PRODUCT are functions. Notice that here no precedence problems occur.

--Thomas

turdus · Post by **turdus** » Thu May 03, 2012 9:43 am

Dear Thomas,

You are right, but I think you answered a different question.
I think you (not you but in general) are mixing concepts and implementations. For me, it's clear that there're separate abstractions (taking only expressions into count):
1. tokenizer: convert symbols to individual items which have meaning and properties (this is a constant, this is an operator etc.)
2. analizer: examine correlation between tokens by semantics means (create a machine parseable flow)
3. runtime: calculates the value of the expression

So for example, 1. will convert "==" into "equals" operator. 2. can convert from infix to postfix for example, or use postfix input. 3. calculates the value of the expression (totally independent from 1. and 2.).

As for forth, it's still need precedence, it's simply leave this to the programmer by forcing prefix notation. But, as long as infix, prefix and postfix notation can be converted to each other, this does not modify the output at 2.

I hope you got what I mean: a language can specify the way how precedence is represented (with operator tables, explicit pharenteses or forcing the programmer to use a specific notation, whatever), but it cannot change the way how to interpret it. A language can specify the symbols for things (let can be "=" or ":=", greater than can be ">" or ".gt.") but it cannot change the meaning or precedence of these operators.

I understand that this seems strange most of you, because they teach you how to program in a specific language, and therefore you learn simplier definitions with as few abstraction level as possible. That's how most universities do.

For me at the university in the first 2 semesters we learned programming by only matchematical mean with paper and pencil. I'm pretty sure you cannot imagine this (or even imagine how to learn programming without a computer). You can bet most of you would cry if had to understand the matchematical formula of a simple loop (for the courious, I've attached it). So my learning language was math. Later I had courses that tought how specific languages like C, ADA, Eiffel, Java etc. implement that theory.

This way of teaching complicates things, takes long, no concrete results for years, and therefore not used by most universities. But it gaves you the understanding of programming in it's purest essential form.

OSDev.org

Writing a compiler - how?

Re: Writing a compiler - how?

Re: Writing a compiler - how?

Re: Writing a compiler - how?

Re: Writing a compiler - how?

Re: Writing a compiler - how?

Re: Writing a compiler - how?

Re: Writing a compiler - how?

Re: Writing a compiler - how?

Re: Writing a compiler - how?

Re: Writing a compiler - how?

Re: Writing a compiler - how?

Re: Writing a compiler - how?

Re: Writing a compiler - how?

Re: Writing a compiler - how?

Re: Writing a compiler - how?