OS-Dev Compiler

Tux

OS-Dev Compiler

Post by Tux »

I have started developing my own compiler, using NASM as the assembler. Here is some code that shows a few commands in the language. (Even /**/ and // comments are implemented.)

$type bin,0 /*Binary means it doesn't have the
header startup stuff, the 0 is compression (Not implemented yet) */

$pos clip,0x100000 /*Clip means the program can't be loaded anywhere other than 0x100000*/

start(); /*Jumps to start, if it was an exe instead of a bin then the compiler would do this automatically*/

halt; /*After start is done, this stops execution so it doesn't run into any variables that follow*/

db name="Tux"; //Creates a global byte array
dd address; //Creates a global double

//The top may be confusing, but it is just options :)
start(): //A function
address=name.a; //Sets it to address of name
[address].b[1]="e"; //Changes the u in Tux to e
retn; //Means return null;


This all may look a little blurry, but after some explanation, you will understand it.
Here are some things you should know:

File types:
bin : Has no exe headers and no auto jump to start
exe : Has headers and auto jumps to start();
lib : Has function tree
myt : Has no function tree

Variables:
There are 6 types of variables.
db : declare global byte
dw : declare global word
dd : declare global double
tb : make a temporary byte on the stack
tw : make a temporary word on the stack
td : make a temporary double on the stack

Variable structures:
Unlike in C, I am adding Object Oriented Variables.
A variable consists of
.a = as in the example, it gets the address of the var
.b = helps view or edit 1 byte in an array
.w = helps view or edit a word in an array
.d = helps view or edit a double in an array

Changing values:
There are different ways to change a value.
We saw in the example that we can use a pointer ([]);
We can also do this:
db stuff="Hello!";
start():
[stuff.a].b[0]="M"; //Changes Hello! to Mello!
retn;

OR

db stuff="Hello!";
start():
stuff.b[0]="M"; /*Changes Hello! to Mello! without a pointer */
retn;

OR

db stuff="Hello!";
start():
stuff="Mello!"; //Changes Hello! to Mello! directly
retn;

Bugs:
A variable is compiled as data so if I do this:
start():
db zero=0;
zero++;
retn;

The variable zero would be executed as if it were code! I need to fix that by adding jump statements around the data. That will hopefully be fixed in the next version.
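
A minimal sketch of the jump-over-data fix, written in Python purely as a stand-in for the compiler's code generator. The function name `emit_protected_data` and the `skip_data_N` label scheme are hypothetical, not part of the compiler described above. It only builds NASM source text in which a `jmp` skips the inline variable:

```python
def emit_protected_data(label, directive, value, counter):
    """Return NASM lines that declare data inline but jump over it.

    A plain (non-local) label is used for the jump target on purpose:
    a NASM local label such as ".skip" would attach to the data label
    that precedes it, so the earlier jmp would not find it.
    """
    skip = f"skip_data_{counter}"  # hypothetical naming scheme
    return [
        f"    jmp {skip}",               # execution never falls into the data
        f"{label} {directive} {value}",  # the variable itself
        f"{skip}:",                      # code resumes here
    ]


lines = ["start:"]
lines += emit_protected_data("zero", "db", "0", 0)
lines += ["    inc byte [zero]", "    ret"]
asm = "\n".join(lines)
print(asm)
```

The alternative is to collect all declarations and emit them after the last instruction, which avoids the extra jumps at the cost of a second pass over the function body.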

The compiler is still in development, though. I just want opinions.
Tux

Re:OS-Dev Compiler

Post by Tux »

[0] <-- messes everything up

Replace The circle thingie with [ZERO]

The forum can't display 0 in []
unknown user

Re:OS-Dev Compiler

Post by unknown user »

sounds pretty cool ^^;.
Tux

Re:OS-Dev Compiler

Post by Tux »

Thanks, this is how a VERY simple malloc function would work.

$type bin,0
$pos clip,0x100000
$protect yes /*This gives the compiler the ok to put jumps so the variables are protected and not executed*/

$define ZERO 0
//Jump to start();
start();

/*Cheap version of malloc: it hands out memory but never frees it, so memory is wasted*/
dd pmalloc=0x200000; //Start of free store
malloc(db len): //Malloc function
pmalloc+=len; /*Advance the free-store pointer so that the area is taken*/
retd pmalloc-len; //return double

//Start of main function
start():
dd point;
point=malloc(100); /*Allocates 100 bytes and saves starting address to point*/
[point].b[ZERO]="H"; //Sets the first byte to "H"
[point+1]="ello!"; /*Something new, but this sets the next sizeof("ello!") bytes after point+1 to "ello!" */
retn; //Return void

BTW, [point]="Hello!"; could be done too.
[point].b[1]; is the same thing as [point+1].b[ZERO];
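
The allocator above is what is usually called a bump allocator. As a rough model of its behaviour (Python used only for illustration; the class name `BumpAllocator` is made up here, and `next_free` plays the role of `pmalloc`):

```python
class BumpAllocator:
    """Models the malloc above: hand out memory by advancing one pointer."""

    def __init__(self, start=0x200000):
        self.next_free = start  # corresponds to pmalloc, the start of free store

    def malloc(self, length):
        address = self.next_free   # the block starts at the current pointer
        self.next_free += length   # bump the pointer past the block
        return address             # same value as "retd pmalloc-len"


heap = BumpAllocator()
first = heap.malloc(100)
second = heap.malloc(16)
print(hex(first), hex(second))
```

Two caveats, both following from the post itself: nothing is ever returned to the free store, and if `db len` really is a single byte, one call cannot request more than 255 bytes.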


Many of you are probably asking where you can get this language. I am making it right now. Some features are implemented, but it is not usable yet. Most of the error checking depends on NASM, though.
Pype.Clicker

Re:OS-Dev Compiler

Post by Pype.Clicker »

fine. Don't forget to write down the exact grammar of your language (look at BNF grammars, it could help), so that other people can use it too.

Also make sure the goals of your language are clearly defined and that it will really bring advantages for OS development, or you'll simply be losing time.

Think about issues like platform independence, local/global variables (the lack of local variables is what makes assembler so hard to use), function parameters (how many? how to describe and check them?), and lexical scope (so that a name can be shadowed by another one).

And finally, the hardest : think about useful error messages.
Tim

Re:OS-Dev Compiler

Post by Tim »

Tux wrote:Replace The circle thingie with [ZERO]
Enclose any source code you post in [ code ] blocks.

For instance:

i_can_write_zeroes[0]
Tux

Re:OS-Dev Compiler

Post by Tux »

Thanks, let me try:

 test 
Btw, I started redesigning the compiler's code. Now it's faster because it mainly does 3 loops.
Tim

Re:OS-Dev Compiler

Post by Tim »

Hmm. Your language looks like some of C's predecessors (BCPL, B, etc.).
Pype.Clicker

Re:OS-Dev Compiler

Post by Pype.Clicker »

Just a question, do you plan to support complex expressions such as

[var]=(y<<6+y<<5+x)>>2+0xb800
and how exactly could I get the value of a pixel if the address of the pixel is in p_off? Is it [p_off], or [[p_off]]?

Try to keep in mind that issues like register allocation for expressions and expression optimization (such as generating y<<6 rather than y*64) are not trivial at all on x86 computers.

This doesn't mean that you should give up. Compilers are a highly interesting, but also pretty complicated, subject.
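
One thing a grammar for such expressions has to settle is operator precedence. The sketch below uses Python only because it shares C's rule that `+` binds tighter than `<<`; the precedence of Tux's language is not stated in this thread, and the function names are invented for the illustration. It shows that the unparenthesized form does not mean what a shift-as-multiply reading would suggest, and that `y<<6` and `y*64` agree once the parentheses are explicit:

```python
def unparenthesized(y, x):
    # Parsed as (y << (6 + y)) << (5 + x) under C-style precedence.
    return y << 6 + y << 5 + x


def explicit_shifts(y, x):
    # The shift-as-multiply reading, with the grouping spelled out.
    return (y << 6) + (y << 5) + x


def with_multiplies(y, x):
    # What the shifts stand for: y*64 + y*32 + x.
    return y * 64 + y * 32 + x


print(unparenthesized(1, 0), explicit_shifts(1, 0), with_multiplies(1, 0))
```

Whichever rule the language picks, writing it down in the grammar avoids this kind of surprise.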
Tux

Re:OS-Dev Compiler

Post by Tux »

About getting a value in p_off.
p_off is just an integer.
The [] symbols make it a pointer.
[p_off] means read an unspecified number of bytes from the address p_off points to.

Let's say each pixel is 1 byte for example.
So, you do this.
[p_off].b[0]; //Gets the first byte (First pixel)
[p_off].b[1]; //Gets the second byte (2nd pixel)

If each pixel was 2 bytes (a word) , then you would do this:

[p_off].w[0]; //Gets the first word (First pixel)
[p_off].w[1]; //Gets the second word (2nd pixel)

About the format.
This is how the compiler sees it

[1].2[3];

The read formula is this:
1+(2*3)
Let's apply it to: [0x100000].d[2];
That's 0x100000 + (4*2)=0x100008
Why? A .d means a double which is 4 bytes.
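
The same formula as a small Python sketch (illustration only; `ELEMENT_SIZE` and `element_address` are names invented here, and the word size of 2 bytes is taken from the pixel example above):

```python
# Size in bytes of each accessor: .b = byte, .w = word, .d = double.
ELEMENT_SIZE = {"b": 1, "w": 2, "d": 4}


def element_address(base, kind, index):
    """Address read by [base].kind[index]: base + (size * index)."""
    return base + ELEMENT_SIZE[kind] * index


print(hex(element_address(0x100000, "d", 2)))
```

This also confirms the earlier remark that [point].b[1] and [point+1].b[0] name the same byte.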

Hopefully that is what you wished to know.
I really would like to support complex expressions. After years of algebra, expect the compiler to be solving some equations. But it's just 1 person with 1 project. I sometimes wish someone would take my idea and make it for me. If anybody wants the language overview, I can give it to them.
Tim

Re:OS-Dev Compiler

Post by Tim »

I posted this link recently:
http://compilers.iecc.com/crenshaw/

This is a great tutorial on compiler design, as it builds you from the basics right through to a working compiler, and explains everything.
Tux

Re:OS-Dev Compiler

Post by Tux »

I finished the code organizer part of the core. When I rewrote it this time, I added some room for these preprocessor commands.

//This shows off what you can do with the xc preprocessor

$define show_cursor 0 //0=NO,1=YES
...
$ifdef show_cursor //Is show_cursor defined?
$ifdef show_cursor 0 //If show_cursor defined to 0
..
$else //If it is defined to anything else
..
$endif
$endif


Some stuff to think about in the engine:
a=b(20) could mean variable b*20 or function b(20)
I will need to make a list of functions and variables.
Then scan it to see if a variable is a variable or a function being called. Problem is, this is a lot of looping.
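
A possible way around the looping, sketched in Python rather than Rapid-Q and under the assumption that every name is declared before it is used: keep one hash table keyed by name instead of two lists. The class and method names below are hypothetical.

```python
class SymbolTable:
    """Maps each declared name to "variable" or "function"."""

    def __init__(self):
        self.kinds = {}

    def declare(self, name, kind):
        if name in self.kinds:
            raise ValueError(f"{name} is already declared")
        self.kinds[name] = kind

    def meaning_of_parens(self, name):
        """Decide what name(...) means: a call or a multiplication."""
        kind = self.kinds.get(name)  # single lookup, no scan over a list
        if kind == "function":
            return "call"
        if kind == "variable":
            return "multiply"
        raise NameError(f"{name} is not declared")


table = SymbolTable()
table.declare("b", "variable")
table.declare("malloc", "function")
print(table.meaning_of_parens("b"), table.meaning_of_parens("malloc"))
```

A dictionary lookup takes roughly constant time on average, so resolving a=b(20) does not get slower as the program declares more names.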
tom1000000

Re:OS-Dev Compiler

Post by tom1000000 »

Really well done on your compiler.

Do you have a link to your compiler's source code? Or are you keeping it closed source?
Tux

Re:OS-Dev Compiler

Post by Tux »

It's still in development, though. The source will be open. I am redoing the whole core just so it is easier to read. BTW, I am using Rapid-Q. It's a freeware form of Basic. Don't boo. Basic has so many good string functions. Best of all, I don't have to worry about my string size. Rapid-Q is bytecode-interpreted, though. If you are wondering why I didn't choose C, here is a list:

1) Basic programs are slow, but I am running a 2.2 GHz P4 with only about 10 programs running, so I really don't notice the compiler taking a long time.
2) C is a language for projects you code over a long time. Basic is more of a 1-2-3 language: you don't have to worry about declaring variables, the size of strings, or what a function does.

Don't worry C fans, I will port the code maybe when it's fully working. This compiler is just for educational purposes to me. But I am willing to share. :)

One design point I would like opinions on: I am making function arguments global in the generated asm, unlike in C/C++.

To clear up some confusion:
tb makes a temporary byte. That is a byte on the stack.
db makes a permanent byte.

If I do db empty[100];

The compiler will put this in the asm file.

empty db 0,0,0,0...0,0,0,0 (100 times)
That wastes file space! So, that's where the compression comes in. Problem is, you have two choices.
1) Let this take 100 bytes anyway
2) Use compression and add a special type of file format. So it would look like this:
(sizeof code block)<...code...>0x07<100:0>(sizeof code block)<...code...> (<100:0> stands for one hundred 0s)
If anybody has any other way, then I will gladly look into it.
Tim

Re:OS-Dev Compiler

Post by Tim »

Tux wrote:If I do db empty[100];

The compiler will put this in the asm file.

empty db 0,0,0,0...0,0,0,0 (100 times)
That wastes file space! So, that's where the compression comes in. Problem is, you have two choices.
1) Let this take 100 bytes anyway
2) Use compression and add a special type of file format. So it would look like this:
(sizeof code block)<...code...>0x07<100:0>(sizeof code block)<...code...> (<100:0> stands for one hundred 0s)
If anybody has any other way, then I will gladly look into it.
No compression needed; this is what the .bss section is for. All uninitialised variables go into the .bss section instead of the .data section; the .bss section doesn't exist on disk, but it does have an address and a length. The .bss section is zeroed by the OS at load time.
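
As a rough sketch of how a code generator could apply this (Python for illustration only; `emit_array` is a made-up helper, not something from Tux's compiler): arrays without an initial value go to .bss as a NASM `resb` reservation, and only initialised ones are written out as data.

```python
def emit_array(name, length, initial=None):
    """Return (section, NASM line) for a byte array declaration.

    Uninitialised arrays only reserve space; they add nothing to the file.
    """
    if initial is None:
        return ".bss", f"{name} resb {length}"
    # Initialised data has to be stored in the file, one byte per element.
    return ".data", f"{name} times {length} db {initial}"


for section, line in (emit_array("empty", 100), emit_array("ones", 4, 1)):
    print(f"section {section}")
    print(f"    {line}")
```

For a flat binary with no OS loader underneath it, whatever loads the image would presumably have to clear the .bss range itself before jumping to the code.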