a new compiler built on C

bsag
Posts: 4
Joined: Fri Feb 24, 2006 12:00 am

a new compiler built on C

Post by bsag »

Hi, I am a new user and still learning to program.
I have some programming experience in C, some in C++, and have also completed a year-long course on ASM.
I am currently building a new compiler using the C language,
but people around me prefer C++ to C.
Could anyone suggest which would be the better option to use?
Besides this, I have decided to do an automation project.
Any suggestions?
carbonBased
Member
Posts: 382
Joined: Sat Nov 20, 2004 12:00 am
Location: Wellesley, Ontario, Canada

Re: a new compiler built on C

Post by carbonBased »

I started writing my own compiler in C some time back, as well. Haven't worked on it in a while, but I see no real downsides to using C.

There were a couple of places where I would've liked to have had a C++/STL string class, but by and large, I don't see a big difference. You use what you like the best.

Cheers,
Jeff
bsag
Posts: 4
Joined: Fri Feb 24, 2006 12:00 am

Re: a new compiler built on C

Post by bsag »

Now, having gone with C, I have made a start, and the lexical analyzer and parser are taking shape.
My main question is whether I could include some new methods that depart from the traditional approach.
People usually use a stack for identifying tokens, but that brings the problem of putting every construct into postfix, so separate code for the postfix conversion becomes essential. I have now devised code that works pretty well directly on infix input. The real problem is that the code for token identification and the code for pattern recognition get embedded into one another.
Separating the two would not be a big deal, but the time complexity increases badly, while keeping them as a single piece of code means reusability is lost. So what would you suggest: keep them embedded to save time, or separate them to increase modularity?
carbonBased
Member
Posts: 382
Joined: Sat Nov 20, 2004 12:00 am
Location: Wellesley, Ontario, Canada

Re: a new compiler built on C

Post by carbonBased »

I'm not entirely sure I follow you... I can say that my own compiler clearly separated the tokenization, the token parser, and the compiler portions.

Essentially, the tokenizer read in strings and output token objects (i.e., C structs).

The token parser then read those token objects and reordered them (into reverse Polish notation).

Finally, the compiler portion took the reordered tokens and output assembly language to a file.
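
To make the first of those stages concrete, here is a minimal sketch of a tokenizer that turns a string into token structs. It is not Jeff's actual code: the names Token, TokenKind and next_token, the 31-character lexeme limit, and the rule that every operator is a single character are all assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds; a real compiler would have many more. */
typedef enum { TOK_NUMBER, TOK_IDENT, TOK_OP, TOK_END } TokenKind;

typedef struct {
    TokenKind kind;
    char text[32];   /* lexeme, truncated if longer than 31 characters */
} Token;

/* Read the next token starting at *src and advance *src past it. */
Token next_token(const char **src)
{
    Token tok = { TOK_END, "" };
    const char *p;
    size_t n = 0;

    assert(src != NULL && *src != NULL);
    p = *src;

    while (isspace((unsigned char)*p))
        p++;

    if (*p == '\0') {
        /* end of input: tok stays TOK_END with an empty lexeme */
    } else if (isdigit((unsigned char)*p)) {
        tok.kind = TOK_NUMBER;
        while (isdigit((unsigned char)*p)) {
            if (n < sizeof tok.text - 1)
                tok.text[n++] = *p;
            p++;
        }
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        tok.kind = TOK_IDENT;
        while (isalnum((unsigned char)*p) || *p == '_') {
            if (n < sizeof tok.text - 1)
                tok.text[n++] = *p;
            p++;
        }
    } else {
        /* anything else is treated as a one-character operator */
        tok.kind = TOK_OP;
        tok.text[n++] = *p++;
    }

    tok.text[n] = '\0';
    *src = p;
    return tok;
}
```

Calling next_token repeatedly on "x1 = 42+y" yields an identifier, an operator, a number, an operator, an identifier, and then TOK_END. Because the tokenizer knows nothing about grammar, the later stages can be changed without touching it.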

Hopefully that helps.

Cheers,
Jeff
Da_Maestro
Member
Posts: 144
Joined: Tue Oct 26, 2004 11:00 pm
Location: Australia

Re: a new compiler built on C

Post by Da_Maestro »

Infix notation is hard to decode directly, and I suggest using your previous idea of postfix notation. The main thing to watch in the conversion to postfix is associativity: a naive conversion treats every operator as left-to-right, while C and C++ have operators that are right-to-left ("=" for example).

Postfix is neat because you can use a stack to convert from infix to postfix, and all postfix expressions can be evaluated using a stack (anyone who has programmed the JVM directly should be familiar with this).
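
A minimal sketch of both uses of the stack follows, assuming single-character operands and only the operators = + - * / plus parentheses. The conversion is the standard shunting-yard technique, which the post does not name; the function names to_postfix and eval_postfix and the fixed stack size of 64 are made up for illustration. Note the right_assoc check, which is what keeps "=" right-to-left.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Precedence of a binary operator; higher binds tighter, 0 = not an operator. */
static int prec(char op)
{
    switch (op) {
    case '=':           return 1;
    case '+': case '-': return 2;
    case '*': case '/': return 3;
    default:            return 0;
    }
}

static int right_assoc(char op) { return op == '='; }

/* Convert infix to postfix. Operands are single letters or digits.
   out needs room for strlen(infix) + 1 bytes. Returns 0 on success, -1 on error. */
int to_postfix(const char *infix, char *out)
{
    char stack[64];
    size_t top = 0, n = 0;

    assert(infix != NULL && out != NULL);
    for (const char *p = infix; *p; p++) {
        char c = *p;
        if (isspace((unsigned char)c))
            continue;
        if (isalnum((unsigned char)c)) {
            out[n++] = c;                       /* operands go straight out */
        } else if (c == '(') {
            if (top == sizeof stack) return -1;
            stack[top++] = c;
        } else if (c == ')') {
            while (top > 0 && stack[top - 1] != '(')
                out[n++] = stack[--top];
            if (top == 0) return -1;            /* unmatched ')' */
            top--;                              /* discard the '(' */
        } else if (prec(c) > 0) {
            /* pop operators that bind tighter, or equally if c is left-to-right */
            while (top > 0 && stack[top - 1] != '(' &&
                   (prec(stack[top - 1]) > prec(c) ||
                    (prec(stack[top - 1]) == prec(c) && !right_assoc(c))))
                out[n++] = stack[--top];
            if (top == sizeof stack) return -1;
            stack[top++] = c;
        } else {
            return -1;                          /* unknown character */
        }
    }
    while (top > 0) {
        if (stack[top - 1] == '(') return -1;   /* unmatched '(' */
        out[n++] = stack[--top];
    }
    out[n] = '\0';
    return 0;
}

/* Evaluate a postfix string of single-digit operands and + - * /. */
int eval_postfix(const char *post, int *result)
{
    int stack[64];
    size_t top = 0;

    for (; *post; post++) {
        char c = *post;
        if (isdigit((unsigned char)c)) {
            if (top == 64) return -1;
            stack[top++] = c - '0';
        } else {
            if (top < 2) return -1;
            int b = stack[--top], a = stack[--top];   /* pop, pop */
            switch (c) {                              /* operate, push */
            case '+': stack[top++] = a + b; break;
            case '-': stack[top++] = a - b; break;
            case '*': stack[top++] = a * b; break;
            case '/': if (b == 0) return -1; stack[top++] = a / b; break;
            default:  return -1;
            }
        }
    }
    if (top != 1) return -1;
    *result = stack[0];
    return 0;
}
```

With this sketch, "a-b-c" becomes "ab-c-" (left-to-right) while "a=b=c" becomes "abc==" (right-to-left), so the associativity is settled during conversion and the postfix form itself needs no precedence rules.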

Compiling C code is relatively easy. There are three basic steps:
1. Parse the file. Sort between tokens and identifiers and reconstruct each statement using an internal postfix notation.
2. Once you have everything in postfix notation you are halfway to having machine code. Re-interpret identifiers in your statements as push commands and operators as pop-pop-operate-push commands.
3. Change identifier references to relative addresses.
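
Steps 2 and 3 can be sketched roughly as below. Everything specific here is an assumption for illustration: the mnemonics and the registers r0/r1 belong to a made-up stack machine, not a real instruction set, and var_offset stands in for a symbol table by placing variable 'a' at offset 0 from a base pointer, 'b' at 4, and so on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical layout: 'a' lives at bp+0, 'b' at bp+4, and so on.
   A real compiler would look the offset up in a symbol table. */
static int var_offset(char name) { return (name - 'a') * 4; }

/* Translate a postfix string (single lowercase-letter identifiers and the
   operators + - * /) into made-up stack-machine mnemonics written to out.
   Returns 0 on success, -1 on malformed input or if out is too small. */
int gen_code(const char *post, char *out, size_t cap)
{
    size_t used = 0;
    int depth = 0;   /* simulated stack depth, used to catch malformed input */

    assert(post != NULL && out != NULL);
    if (cap == 0)
        return -1;
    out[0] = '\0';

    for (; *post; post++) {
        char line[64];
        char c = *post;

        if (c >= 'a' && c <= 'z') {
            /* identifier: push the value at its relative address */
            snprintf(line, sizeof line, "push [bp+%d]\n", var_offset(c));
            depth++;
        } else {
            const char *mn = c == '+' ? "add" : c == '-' ? "sub" :
                             c == '*' ? "mul" : c == '/' ? "div" : NULL;
            if (mn == NULL || depth < 2)
                return -1;
            /* operator: pop right operand, pop left operand, operate, push */
            snprintf(line, sizeof line,
                     "pop r1\npop r0\n%s r0, r1\npush r0\n", mn);
            depth--;
        }

        size_t len = strlen(line);
        if (used + len + 1 > cap)
            return -1;
        memcpy(out + used, line, len + 1);
        used += len;
    }
    return depth == 1 ? 0 : -1;   /* exactly one result must remain */
}
```

For the postfix input "ab+" this emits two push lines followed by the pop, pop, add, push sequence. The first pop receives the right-hand operand, which matters for "-" and "/".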

My method here won't compile full programs, but it will compile individual code blocks and those can be put together to make full programs. Happy coding!
Two things are infinite: The universe and human stupidity. But I'm not quite sure about the universe.
--- Albert Einstein
bsag
Posts: 4
Joined: Fri Feb 24, 2006 12:00 am

Re: a new compiler built on C

Post by bsag »

Hi again,
Thanks for your reply. I tried coding both the infix and the postfix decoding, and did find the postfix decoding easier. But then I went back to the infix format and started to modularize the entire program, and believe me, the program turned out just right, with no use of a stack. The code block to convert infix notation to postfix also need not be included, so the programs ended up more or less the same size.
I do believe that the traditional approach of using postfix would be better, but I also think that a little bit of innovation won't hurt. I hope that at least my project won't get stalled because of this so-called innovation. But yet...
And now I believe I can complete my entire parser code within a month or so. I know I am a little slow.
Thanks again.