Page 1 of 1

A compiler in progress

Posted: Tue Sep 07, 2004 3:14 pm
by Candy
Well, I appear to have chosen a course called PTI, Programming Language (taal) Implementation. Objective is, implement a language in 10 weeks, 8 hours a week work.

Objective for week one (that ends coming thursday) was to implement a compiler for the starting expression language. This language consisted of addition, substraction, multiplication, division, negation and precedence (parentheses handling, plus multiply-over-add). Since there are lots of people here who want to know how a compiler works, and I don't consider this one a load of work (it actually cost around 10 hours, which isn't too much) I'm releasing this compiler (for expressions) under the BSD license.

Combining this with the compiler tutorial here is hard since it's going to go to a C compiler written in C++, whereas the writer of the tutorial uses other languages, both for implementation and for source. The target language is nasm-compatible assembly, which at the time of writing is very shoddy (I had 2 days, only got the assignment on monday, had to be done by thursday) and abuses the x86 as a stack machine, and constantly cleans all registers. This should be gone by the 5th or 6th week, as I'll by then have looked at enough graph coloring algorithms to allow register allocation.

For a tutorial, wait until christmas, then I'll have some day off :). Might be coming weekend. Anyway, the code is here, and it's open (just not open enough to call it PD, which only means that I'd like my name to go with it). By the later compiler iterations it'll probably go pd anyway.

The point is, people that are looking for a very simple compiler (10.5k source), try this one. It might not disappoint.

Note that of course, there are other ways to do X and this is only one way to do it. Some parts are done badly because of time constraints, and there is very little comment in the code (although I've tried to keep the code quite clear I wasn't able to add much comment. Will do so tomorrow and upload a new one).

PS: df, could we upload .tar.bz2's and .tar.gz's? .zip really sucks at code compression. The .bz2 was 66% of this size.

Re:A compiler in progress

Posted: Fri Sep 22, 2006 8:10 pm
by Kevin McGuire
Neato. Did you ever get enough time to finish it? ???
Seems like a interesting idea for someone to implement something like the project Singularity does.

Re:A compiler in progress

Posted: Sat Sep 23, 2006 9:13 am
by Candy
Kevin McGuire wrote: Neato. Did you ever get enough time to finish it? ???
Seems like a interesting idea for someone to implement something like the project Singularity does.
I got it finished enough to make the course, which wasn't too much. The course required teamwork and the guy I was making it with didn't do his part. So, all in all, if I search for it, it'll be a backend of a C-ish language that compiles to perfectly correct but very unoptimal assembly code. Since I wasn't about to write a register allocator, it treats the x86 as stack machine, and since I wanted a pretty straightforward translation it doesn't bother to optimize out any pointless moves. The frontend could use a bit of work, but the backend could as well. My main problem with the (current) front is that it's written in bison/yacc with lex/flex, which I dislike for syntax and such.

PS: the post you reacted on is very very old :).

Re:A compiler in progress

Posted: Sat Sep 23, 2006 1:05 pm
by Kevin McGuire
I like to go digging in the archives sometimes. I find all sorts of interesting questions, answers, and topics. :)

What kind of assembling method did you use for generating the x86 instructions. I am interested because a little while back I wrote one, but instead of doing a macro assembler I ended up making a list of instructions and flags almost straight from the x86 documentation. I never completely finished it, but the method did work. What method did you use?

Re:A compiler in progress

Posted: Sat Sep 23, 2006 1:35 pm
by Candy
Kevin McGuire wrote: I like to go digging in the archives sometimes. I find all sorts of interesting questions, answers, and topics. :)

What kind of assembling method did you use for generating the x86 instructions. I am interested because a little while back I wrote one, but instead of doing a macro assembler I ended up making a list of instructions and flags almost straight from the x86 documentation. I never completely finished it, but the method did work. What method did you use?
At the time, just plain using an existing assembler. Right now (as in, somehow concurrently with a bunch of other things) I'm considering making a template assembler which reads in a template for the given architecture from a file, filters on a given processor type and applies the transformations stored in that file to the input resulting in assembled output. I've got a small bit of x86 done (making the ModRM stuff into generic assembly instructions is hell) and the entire PIC16 series, to show that it's sort of independant, at least in language. For the actual assembler, I haven't got a byte of code to show. Haven't got quite as much time for it as I'd hoped and I've finally gotten back to OS dev itself, which should come in the first place.

The instructions that are in it are in the print16.asm file in the boot loader, and solely those. It should be able to assemble them. There is an assembler language rationale on my SVN that should explain the what's and why's of the language. All stuff on the svn is PD, so if you see something you like, just take at your will.

Re:A compiler in progress

Posted: Sat Sep 23, 2006 1:57 pm
by Kevin McGuire
Thanks, I will at least open it up and read a good summary. Yeah, the modRM and SIB bytes were a nightmare and tooks days to feel comfortable writting code to manipulate them. I hated the fact that they made it so complicated for the x86 instruction set.

I just looked over it and what did catch my eye was
u? encodes an unsigned type of ? bits. u1 encodes a bit, u2 encodes a value between 0 and 3, u8 encodes a byte from 0 to 255 and u32 encodes a 32-bit unsigned value. s? encodes a signed type of ? bits. s2 is from -
At which I had a very similar thought over the years for a machine independant language or meta language.

I must go get my bordom fix -- going to splice a parallel cable and put some resistors and LEDs on the end for fun. Thanks for the information. I appreciate it.

Re:A compiler in progress

Posted: Sat Sep 23, 2006 4:14 pm
by Candy
Kevin McGuire wrote: Thanks, I will at least open it up and read a good summary. Yeah, the modRM and SIB bytes were a nightmare and tooks days to feel comfortable writting code to manipulate them. I hated the fact that they made it so complicated for the x86 instruction set.
You should look at the definitions I already have for encodings that use ModR/M. SIB is a breeze, it just uses the register mappings themselves and the one exception (ESP) you can catch and do elsewhere. Provisions for that are available.

Having a mapping to map (bx+di), (bp+si), si, bx and a plain imm16 is not in any way ordered. You have to write them all out.
I just looked over it and what did catch my eye was
u? encodes an unsigned type of ? bits. u1 encodes a bit, u2 encodes a value between 0 and 3, u8 encodes a byte from 0 to 255 and u32 encodes a 32-bit unsigned value. s? encodes a signed type of ? bits. s2 is from -
At which I had a very similar thought over the years for a machine independant language or meta language.
Well... you require that. The PIC instruction set atm uses it for the register access too, since it has 128 registers (u7).

The different bit is the i7, which is all signed values plus all unsigned values. It allows you to define a "don't care what the user specifies - cram it in 7 bits if there's some way it can fit", for instance for the "add byte to other byte" instructions you can just use an i8 and let the user enter -64 or 192 and it'll pretend to do the add you specify and just encode it as a byte. The add will do the same either case and your user can use both constants.
I must go get my bordom fix -- going to splice a parallel cable and put some resistors and LEDs on the end for fun. Thanks for the information. I appreciate it.
Enjoy. If you want to discuss this stuff, I'm always willing to talk design things. My MSN and ICQ in my profile are still accurate (even though the domain doesn't exist).

Re:A compiler in progress

Posted: Sun Sep 24, 2006 7:43 pm
by df
i added some stuff so bz2 type things should work

Re:A compiler in progress

Posted: Mon Sep 25, 2006 10:09 am
by Candy
df wrote: i added some stuff so bz2 type things should work
Thanks. Do you happen to know where the attachment went?

Re:A compiler in progress

Posted: Mon Sep 25, 2006 12:04 pm
by bubach
i was wondering that too, if you take backups before deleting old attachments?

Re:A compiler in progress

Posted: Mon Sep 25, 2006 1:16 pm
by df
bubach;
i dont make backups, i just nuke the files in the directory once every few years when space runs out.

if you want em permanant, link offsite to them.

candy;
doesnt look like anything uploaded.. there is no week1.zip file in the attachment directory...

i just increased the upload space + size restrictions per file

Re:A compiler in progress

Posted: Mon Sep 25, 2006 2:48 pm
by Candy
df wrote: candy;
doesnt look like anything uploaded.. there is no week1.zip file in the attachment directory...

i just increased the upload space + size restrictions per file
<swap>
bubach;
i dont make backups, i just nuke the files in the directory once every few years when space runs out.

if you want em permanant, link offsite to them.
That explains where my upload went. It was from 2004.

Re:A compiler in progress

Posted: Fri Sep 29, 2006 10:00 am
by df
*cough* goes and looks at the posts date.. *cough*
oh. 2004. ;)

(ive only ever cleaned out the upload directory i think.. once?)

Re:A compiler in progress

Posted: Fri Sep 29, 2006 4:48 pm
by Kevin McGuire
*laughing* :)