Writing a compiler - how?

Programming, for all ages and all languages.
User avatar
JamesM
Member
Member
Posts: 2935
Joined: Tue Jul 10, 2007 5:27 am
Location: York, United Kingdom
Contact:

Re: Writing a compiler - how?

Post by JamesM »

turdus wrote:
JamesM wrote:... I think some people are missing the point.

RPN is extremely easy to parse as there is no precedence. However, you need to have your language defined such that it uses RPN.
Sorry, but you're totally wrong. Any infix notation with different precedence operators can be converted into predecende-free postfix notation without problems, and it's totally independent of the language.

Have you checked the link I posted?
No, it can't. It can't by definition - the precedence and associativity of operators are part of the language. Therefore, a mapping between infix and postfix operators cannot exist that is independent of the language.

If you can convert to postfix, you have defined precedence, ordering and associativity and as such can use stuff like the shunting yard algorithm to move things around.

But using RPN is not a drop-in replacement for recursive descent or stack-automaton parsing; RPN + shunting yard or a similar algorithm is.
User avatar
Thomas
Member
Member
Posts: 281
Joined: Thu Jun 04, 2009 11:12 pm

Re: Writing a compiler - how?

Post by Thomas »

Take everything I say with a grain of salt , I do not work with compilers , but once I have found bugs in the code generated and reported it to the compiler team :mrgreen: . That's the most compilerly thing I have done :mrgreen: . Later this week , I will try to post a mind map summarizing whatever I learn 't hoping that it is of some use to someone.

--Thomas
User avatar
turdus
Member
Member
Posts: 496
Joined: Tue Feb 08, 2011 1:58 pm

Re: Writing a compiler - how?

Post by turdus »

JamesM wrote:...the precedence and associativity of operators are part of the language.
No. These rules are defined by pure math, and therefore common among all languages. It doesn't matter you wrote the following in C, in Java, PHP or in BASIC, it's the same.

Code: Select all

(1+2)*3-4<5+(7/6)
Of course language may define new operators or use different pattern (like .lt.), but it will be definitely called broken if returns false as result for the above.
User avatar
Thomas
Member
Member
Posts: 281
Joined: Thu Jun 04, 2009 11:12 pm

Re: Writing a compiler - how?

Post by Thomas »

No. These rules are defined by pure math, and therefore common among all languages
Incorrect ! . James is right.
It doesn't matter you wrote the following in C, in Java, PHP or in BASIC, it's the same.
They are mostly imperative programming languages
--Thomas
Last edited by Thomas on Wed May 02, 2012 1:57 pm, edited 1 time in total.
User avatar
Combuster
Member
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance
Contact:

Re: Writing a compiler - how?

Post by Combuster »

Still, the most common operator in maths doesn't function equally between formal maths, C and Basic and many other languages, and then we're not even talking about a detail like precedence. Yes I'm talking about the single =

And nobody complains about that being broken even though it's obviously wrong. :wink:
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
Rudster816
Member
Member
Posts: 141
Joined: Thu Jun 17, 2010 2:36 am

Re: Writing a compiler - how?

Post by Rudster816 »

Combuster wrote:Still, the most common operator in maths doesn't function equally between formal maths, C and Basic and many other languages, and then we're not even talking about a detail like precedence. Yes I'm talking about the single =

And nobody complains about that being broken even though it's obviously wrong. :wink:
Undoubtedly the most illogical part of the C operator precedence is the >=, >, etc operators taking precedence over the && operator.
E.g.

Code: Select all

if (x < 10 && x >= 0) 
means

Code: Select all

if ((x < (10 && x)) >= 0) 

As far as parsing goes, the order of operations does present a problem in that you must read the entire expression, and iterate through all of the operators to find which to evaluate first, second, third, etc. The effort is further complicated by the fact that some operators are evaluated left to right while others are left to write. The fact that the order for C like languages is somewhat unintuitive is pretty much irrelevant to the compiler writer though.
User avatar
AndrewAPrice
Member
Member
Posts: 2299
Joined: Mon Jun 05, 2006 11:00 pm
Location: USA (and Australia)

Re: Writing a compiler - how?

Post by AndrewAPrice »

Rudster816 wrote:Undoubtedly the most illogical part of the C operator precedence is the >=, >, etc operators taking precedence over the && operator.
It shouldn't. You must be using a non-conforming compiler?
http://en.wikipedia.org/wiki/Operators_ ... precedence
My OS is Perception.
Rudster816
Member
Member
Posts: 141
Joined: Thu Jun 17, 2010 2:36 am

Re: Writing a compiler - how?

Post by Rudster816 »

MessiahAndrw wrote:
Rudster816 wrote:Undoubtedly the most illogical part of the C operator precedence is the >=, >, etc operators taking precedence over the && operator.
It shouldn't. You must be using a non-conforming compiler?
http://en.wikipedia.org/wiki/Operators_ ... precedence
I think you need to read your own link ;). Number 8 is >, <, >=, <=. Number 13 is &&.
User avatar
mnovotny
Member
Member
Posts: 27
Joined: Mon Aug 10, 2009 2:54 am
Location: Slovakia

Re: Writing a compiler - how?

Post by mnovotny »

Rudster816 wrote:

Code: Select all

if (x < 10 && x >= 0) 
means

Code: Select all

if ((x < (10 && x)) >= 0) 
That's Pascal you are talking about. In C the expression evaluates as:

Code: Select all

if ((x < 10) && (x >= 0))
Keep the world with all its sin
It's not fit for livin' in
User avatar
AndrewAPrice
Member
Member
Posts: 2299
Joined: Mon Jun 05, 2006 11:00 pm
Location: USA (and Australia)

Re: Writing a compiler - how?

Post by AndrewAPrice »

Rudster816 wrote:I think you need to read your own link ;). Number 8 is >, <, >=, <=. Number 13 is &&.
I stand corrected. May explain a couple of bugs I've encountered in the past.

I've got into the habit of explicitly using brackets lately since I deal with many languages day-to-day, just for situations like this!
My OS is Perception.
User avatar
Jezze
Member
Member
Posts: 395
Joined: Thu Jul 26, 2007 1:53 am
Libera.chat IRC: jfu
Contact:

Re: Writing a compiler - how?

Post by Jezze »

I saw no one had posted the url to this awesome tutorial.

It shows how to create a small subset-of-c-language for llvm!

http://gnuu.org/2009/09/18/writing-your ... -compiler/

Oh, and the source for the tutorial can be found here:

https://github.com/lsegal/my_toy_compiler
Fudge - Simplicity, clarity and speed.
http://github.com/Jezze/fudge/
gerryg400
Member
Member
Posts: 1801
Joined: Thu Mar 25, 2010 11:26 pm
Location: Melbourne, Australia

Re: Writing a compiler - how?

Post by gerryg400 »

Rudster816 wrote:Undoubtedly the most illogical part of the C operator precedence is the >=, >, etc operators taking precedence over the && operator.
E.g.

Code: Select all

if (x < 10 && x >= 0) 
means

Code: Select all

if ((x < (10 && x)) >= 0) 
You said that ">=,>,etc" take precedence over "&&" so doesn't that mean ?

Code: Select all

if ((x < 10) && (x >= 0)) 
IMHO the only little problem with C precedence is the bitwise & and | operators having lower precedence than the (in)equality operators. The others feel pretty intuitive to me.
If a trainstation is where trains stop, what is a workstation ?
User avatar
turdus
Member
Member
Posts: 496
Joined: Tue Feb 08, 2011 1:58 pm

Re: Writing a compiler - how?

Post by turdus »

Combuster wrote:Yes I'm talking about the single =
"Single =" is not an operator, just a mere symbol. "Let" and "equals" are. And they work as expected for every language, regardless to representation.

@Thomas: can you name any sane language where addition has higher precedence than multiplication? Or can you name a language where associativity of parentheses differs from usual? Or speaking of logical connectives, where negation, conjuntion or disjuntion differs from their mathematical equivalent?
User avatar
Thomas
Member
Member
Posts: 281
Joined: Thu Jun 04, 2009 11:12 pm

Re: Writing a compiler - how?

Post by Thomas »

Dear turdus,
@Thomas: can you name any sane language where addition has higher precedence than multiplication? Or can you name a language where associativity of parentheses differs from usual? Or speaking of logical connectives, where negation, conjuntion or disjuntion differs from their mathematical equivalent?
You are mixing concept and implementation together. The class of languages you mentioned represent imperative programming languages and most of them have a similar syntax . take for example FORTH , there is no need for a parenthesis or precedence

Precedence and operators are for humans and therefore part of the language. In "pure" mathematics there are only relations /funtions defined over a set . Addition and Subtraction etc are basically functions defined over the set of real , natural , complex numbers .... symbols + , - etc are just for the sake of convenience but leads to ambiguity . Complex operations are represented as composition of functions.

eg a+b * c ...mathematically it would be put as SUM_OF(a , PRODUCT(b,c) ) where SUM_OF and PRODUCT are functions. Notice that here no precedence problems occur.

--Thomas
User avatar
turdus
Member
Member
Posts: 496
Joined: Tue Feb 08, 2011 1:58 pm

Re: Writing a compiler - how?

Post by turdus »

Dear Thomas,

You are right, but I think you answered a different question.
I think you (not you but in general) are mixing concepts and implementations. For me, it's clear that there're separate abstractions (taking only expressions into count):
1. tokenizer: convert symbols to individual items which have meaning and properties (this is a constant, this is an operator etc.)
2. analizer: examine correlation between tokens by semantics means (create a machine parseable flow)
3. runtime: calculates the value of the expression

So for example, 1. will convert "==" into "equals" operator. 2. can convert from infix to postfix for example, or use postfix input. 3. calculates the value of the expression (totally independent from 1. and 2.).

As for forth, it's still need precedence, it's simply leave this to the programmer by forcing prefix notation. But, as long as infix, prefix and postfix notation can be converted to each other, this does not modify the output at 2.

I hope you got what I mean: a language can specify the way how precedence is represented (with operator tables, explicit pharenteses or forcing the programmer to use a specific notation, whatever), but it cannot change the way how to interpret it. A language can specify the symbols for things (let can be "=" or ":=", greater than can be ">" or ".gt.") but it cannot change the meaning or precedence of these operators.

I understand that this seems strange most of you, because they teach you how to program in a specific language, and therefore you learn simplier definitions with as few abstraction level as possible. That's how most universities do.

For me at the university in the first 2 semesters we learned programming by only matchematical mean with paper and pencil. I'm pretty sure you cannot imagine this (or even imagine how to learn programming without a computer). You can bet most of you would cry if had to understand the matchematical formula of a simple loop (for the courious, I've attached it). So my learning language was math. Later I had courses that tought how specific languages like C, ADA, Eiffel, Java etc. implement that theory.

This way of teaching complicates things, takes long, no concrete results for years, and therefore not used by most universities. But it gaves you the understanding of programming in it's purest essential form.
Attachments
math formula of a loop
math formula of a loop
loop.gif (11.28 KiB) Viewed 2441 times
Post Reply