Re: compiled or interpreted

Posted: Tue May 01, 2012 2:33 pm
by AndrewAPrice
berkus wrote:The idea of different frontends compiling into the same bytecode supported by the backend still holds for the DIY if you want to avoid the proliferation of "different compilers".
I agree with you there. I just don't want to use LLVM - its goals and design considerations differ from mine, particularly with regard to memory safety.

Re: compiled or interpreted

Posted: Tue May 01, 2012 2:45 pm
by Rudster816
MessiahAndrw wrote:
berkus wrote:Enter LLVM.

Where different frontends (C, Lua, Fortran, whatever) compile into the same bytecode representation, from which a single optimizing translator can generate code for nearly any architecture.
But that's less fun than DIY.
But DIY is entirely impractical. You'll learn a lot from doing it yourself, and that is invaluable, but from a usability standpoint, porting LLVM or even GCC is by far the superior option. If you're developing a new ISA, porting GCC (and to a slightly lesser extent LLVM) to it opens up a million doors that creating your own compiler doesn't. Once you've got it ported, you pretty much just have to sit back and relax as thousands of open source projects develop for your platform courtesy of GCC\LLVM compatibility. You'll of course have to figure out the whole OS situation, but porting Linux to a new hardware ISA isn't difficult, so you can take care of that too if you want to go that route.

In short, if you actually want the possibility of other people using your OS with a managed userland, providing a backend for LLVM is the obvious way to go. If you can emulate or provide basic POSIX calls, you'll make your job a lot easier. But that pretty much goes without saying.

Re: compiled or interpreted

Posted: Tue May 01, 2012 3:18 pm
by AndrewAPrice
Rudster816 wrote:In short, if you actually want the possibility of other people using your OS with a managed userland, providing a backend for LLVM is the obvious way to go.
Choosing LLVM bytecode would go against the core architecture of my OS and the reason I chose to use bytecode in the first place. I'm not trying to attract as many users as possible; I want to develop the kind of OS I dream about.

LLVM's bytecode is about being a common instruction set for front ends to compile to, allowing the backend to deal with platform-dependent code generation and optimisation.

My bytecode is about aiming for memory safety and inter-process compatibility (shared interfaces and objects), and is more akin to Java or .NET bytecode.
Rudster816 wrote:If you can emulate or provide basic POSIX calls, you'll make your job a lot easier. But that pretty much goes without saying.
I could always implement a userland POSIX compatibility layer that allows you to run binutils, GCC, bash. But at this stage, I really don't care about making Yet Another POSIX Clone.

I got into OS development for the beauty of starting from scratch - for building the perfect software platform. My aim isn't to get people running Bash, KDE, Firefox on top of my OS to the point where it looks, behaves, and feels like any other UNIX compatible system.

Re: compiled or interpreted

Posted: Tue May 01, 2012 4:50 pm
by Rudster816
MessiahAndrw wrote:
Rudster816 wrote:In short, if you actually want the possibility of other people using your OS with a managed userland, providing a backend for LLVM is the obvious way to go.
Choosing LLVM bytecode would go against the core architecture of my OS and the reason I chose to use bytecode in the first place. I'm not trying to attract as many users as possible; I want to develop the kind of OS I dream about.

LLVM's bytecode is about being a common instruction set for front ends to compile to, allowing the backend to deal with platform-dependent code generation and optimisation.

My bytecode is about aiming for memory safety and inter-process compatibility (shared interfaces and objects), and is more akin to Java or .NET bytecode.
Rudster816 wrote:If you can emulate or provide basic POSIX calls, you'll make your job a lot easier. But that pretty much goes without saying.
I could always implement a userland POSIX compatibility layer that allows you to run binutils, GCC, bash. But at this stage, I really don't care about making Yet Another POSIX Clone.

I got into OS development for the beauty of starting from scratch - for building the perfect software platform. My aim isn't to get people running Bash, KDE, Firefox on top of my OS to the point where it looks, behaves, and feels like any other UNIX compatible system.
I never said that you should use the LLVM bytecode for your userland. What I meant was that you create an LLVM backend for YOUR bytecode\ISA. That way users can write code in any LLVM frontend language, with just the work of writing the "include" files (your OS's headers\class libraries\whatever) and mapping them to your OS's API. You can then use LLVM to statically compile the source into your bytecode, and execute it on your VM.
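As a rough illustration of the arrangement described above (several frontends, one shared intermediate form, one backend that targets a custom bytecode), here is a toy sketch in Python. It does not use LLVM or its API; a real LLVM backend is written in C++ against LLVM's own infrastructure. The two frontends, the tuple-based IR, the opcodes `OP_PUSH`, `OP_ADD`, `OP_MUL`, and the `run` VM are all hypothetical and exist only to show the shape of the pipeline.

```python
import ast

# Shared IR (hypothetical): a list of tuples, ("push", n), ("add",) or ("mul",).

def frontend_infix(source):
    """Toy frontend #1: infix arithmetic such as "2 + 3 * 4"."""
    def lower(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return [("push", node.value)]
        if isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Mult)):
            op = "add" if isinstance(node.op, ast.Add) else "mul"
            # Post-order walk: operands first, then the operator.
            return lower(node.left) + lower(node.right) + [(op,)]
        raise SyntaxError("unsupported construct")
    return lower(ast.parse(source, mode="eval").body)

def frontend_rpn(source):
    """Toy frontend #2: reverse Polish notation such as "2 3 4 * +"."""
    ops = {"+": ("add",), "*": ("mul",)}
    return [ops[tok] if tok in ops else ("push", int(tok)) for tok in source.split()]

# Hypothetical opcodes for a made-up stack VM (operands limited to 0..255).
OP_PUSH, OP_ADD, OP_MUL = 0x01, 0x02, 0x03

def backend(ir):
    """The single backend: lowers the shared IR to the custom bytecode."""
    out = bytearray()
    for instr in ir:
        if instr[0] == "push":
            if not 0 <= instr[1] <= 0xFF:
                raise ValueError("operand does not fit in one byte")
            out += bytes([OP_PUSH, instr[1]])
        elif instr[0] == "add":
            out.append(OP_ADD)
        elif instr[0] == "mul":
            out.append(OP_MUL)
        else:
            raise ValueError(f"unknown IR instruction {instr!r}")
    return bytes(out)

def run(code):
    """Minimal VM that executes the custom bytecode."""
    stack, pc = [], 0
    while pc < len(code):
        op = code[pc]
        if op == OP_PUSH:
            stack.append(code[pc + 1])
            pc += 2
        elif op in (OP_ADD, OP_MUL):
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if op == OP_ADD else a * b)
            pc += 1
        else:
            raise ValueError(f"unknown opcode {op:#x}")
    return stack.pop()
```

Both `backend(frontend_infix("2 + 3 * 4"))` and `backend(frontend_rpn("2 3 4 * +"))` go through the same backend, so adding another source language only means writing another frontend.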

Re: compiled or interpreted

Posted: Tue May 01, 2012 5:44 pm
by AndrewAPrice
Rudster816 wrote:I never said that you should use the LLVM bytecode for your userland. What I meant was that you create an LLVM backend for YOUR bytecode\ISA. That way users can write code in any LLVM frontend language, with just the work of writing the "include" files (your OS's headers\class libraries\whatever) and mapping them to your OS's API. You can then use LLVM to statically compile the source into your bytecode, and execute it on your VM.
I'm sorry, I misinterpreted what you were trying to say. That certainly is possible. There are many C to JVM compilers out there. Some convert C straight into Java, while others compile C to MIPS binaries, then translate those to Java bytecode.

But translating LLVM code (which allows arbitrary memory access) into my bytecode (which does not) would be an interesting challenge. I've seen some C-to-Java compilers get around this by treating memory as a large array, and pointers as indices into the array. As long as you provided your own malloc/free implementation that could grow or shrink this array during run time, this would be fairly flexible, albeit slow.
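The memory-as-array trick could be sketched roughly like this in Python. This is an illustration under stated assumptions, not how any particular compiler does it: the class name `FlatMemory`, its methods, the byte-sized cells, and the size-bucketed free list are all hypothetical, and only growing the array is shown (shrinking is left out).

```python
class FlatMemory:
    """Emulates a C-style address space on top of one bounds-checked array."""

    def __init__(self, initial_size=64):
        self.cells = bytearray(initial_size)  # the "large array"
        self.brk = 1                          # next unused address; 0 is kept as NULL
        self.free_lists = {}                  # block size -> list of freed addresses
        self.sizes = {}                       # live address -> block size

    def malloc(self, size):
        if size <= 0:
            raise ValueError("size must be positive")
        bucket = self.free_lists.get(size)
        if bucket:
            addr = bucket.pop()               # reuse a freed block of the same size
        else:
            addr = self.brk
            self.brk += size
            while self.brk > len(self.cells): # grow the backing array by doubling
                self.cells.extend(bytes(max(1, len(self.cells))))
        self.sizes[addr] = size
        return addr                           # a "pointer" is just an index

    def free(self, addr):
        size = self.sizes.pop(addr)           # KeyError on double free or bad pointer
        self.free_lists.setdefault(size, []).append(addr)

    def load(self, addr):
        self._check(addr)
        return self.cells[addr]

    def store(self, addr, value):
        self._check(addr)
        self.cells[addr] = value & 0xFF       # cells are bytes in this sketch

    def _check(self, addr):
        # Every access is checked against the emulated address space, so a
        # stray pointer can corrupt the guest's own data but not the host VM.
        if not 0 < addr < self.brk:
            raise IndexError(f"address {addr} is outside the emulated memory")
```

Pointer arithmetic then becomes plain integer arithmetic: after `p = mem.malloc(4)`, the expression `mem.store(p + 2, 0x41)` writes the third byte of the block. The extra check and indirection on every load and store is where the slowness comes from.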

It would be even more interesting to see how you would handle pointer arithmetic on functions, self-modifying code, or another JIT that tries to output x86 code.