Intermediate representation

Programming, for all ages and all languages.
Post Reply
User avatar
dc0d32
Member
Member
Posts: 69
Joined: Thu Jun 09, 2005 11:00 pm
Location: Right here

Intermediate representation

Post by dc0d32 »

Hi,

I'm _trying_ to resume my CIL/JVM bytecode -> native compiler project. I'm in indecision about the intermediate representation to use for [future] optimizer plug ins to work on, and generate target code.

Currently I'm thinking of simple 3 address codes or so called quadruples with basic blocks [which is even partially implemented], with which sub-optimal code generation looks trivial.

As an alternative, regular binary trees can be used, which; I believe will provide more flexibility. We can even use basic blocks with each basic block as regular tree.

I've never worked on a compiler project before, nor have I really analyzed source of one. [well, I have looked at gcc & lcc sources, but they are way huge]. Thats where the indecision comes from.

Please enlighten me. I need that !

Thanks in advance

[I know, there is a lot of other stuff i need to worry about at the same time, like GC, threading, generics n the rest. I'm going by "Start slow, naive" law as they say :) ]

~Prashant
User avatar
Combuster
Member
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance
Contact:

Post by Combuster »

Basically, you have the constructs of one particular language, and you need to translate those to the constructs of another (assembly)

Some theory behind compilers and other stuff
Functional starting code for the university's compiler assignment (compiles a subset of java to assembly)
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
User avatar
dc0d32
Member
Member
Posts: 69
Joined: Thu Jun 09, 2005 11:00 pm
Location: Right here

Post by dc0d32 »

Thanks for the quick reply Combuster!

I know neither Haskell nor (if I'm not mistaken) Dutch ! [ will try n learn it(Haskell) in near future though :D]

Since we're reading byte code, guess there is no need to worry about parsing and all that automaton theory. After all, byte code is one intermediate representation, only thing is that it can not be readily processed.
We are actually writing just the backend of a hypothetical java->asm compiler, along with an optimizer which optimizes the IR even further; doing what the java compiler's optimizer didn't.

~Prashant
User avatar
babernat
Member
Member
Posts: 42
Joined: Tue Jul 03, 2007 6:53 am
Location: Colorado USA

Post by babernat »

One thing that could help you make your decision is whether or not you'd like to take advantage of any optimization in your process. Since jvm or clr bytecode is somewhat regular and produced for a virtual machine, it is reasonable to expect you could take advantage of some optimizations since you know which architecture you're compiling for. At the very least you would want to accomplish what something like jit compiling already does on the Java platform.

If you want to go down the optimization route, you should probably first translate things into a SSA because there are algorithms available for optimization using this form. As a hint you can look at

http://gcc.gnu.org/onlinedocs/gccint/Tr ... l#Tree-SSA

or pick up any compiler book covering optimization.
Thanks for all the fish.
User avatar
JamesM
Member
Member
Posts: 2935
Joined: Tue Jul 10, 2007 5:27 am
Location: York, United Kingdom
Contact:

Post by JamesM »

Currently I'm thinking of simple 3 address codes or so called quadruples with basic blocks [which is even partially implemented], with which sub-optimal code generation looks trivial.
I could tell you more about that, but unfortunately my employer has lots of patents in that area... :S
User avatar
Alboin
Member
Member
Posts: 1466
Joined: Thu Jan 04, 2007 3:29 pm
Location: Noricum and Pannonia

Post by Alboin »

JamesM wrote:
Currently I'm thinking of simple 3 address codes or so called quadruples with basic blocks [which is even partially implemented], with which sub-optimal code generation looks trivial.
I could tell you more about that, but unfortunately my employer has lots of patents in that area... :S
Aren't patents public? (ie. not secret.)
C8H10N4O2 | #446691 | Trust the nodes.
User avatar
JamesM
Member
Member
Posts: 2935
Joined: Tue Jul 10, 2007 5:27 am
Location: York, United Kingdom
Contact:

Post by JamesM »

Alboin wrote:
JamesM wrote:
Currently I'm thinking of simple 3 address codes or so called quadruples with basic blocks [which is even partially implemented], with which sub-optimal code generation looks trivial.
I could tell you more about that, but unfortunately my employer has lots of patents in that area... :S
Aren't patents public? (ie. not secret.)
Patents pending. And most of the docs are marked "secret", so that sort of tells me the information I need ;)
Post Reply