ByteCode Based Apps

essial · Post by **essial** » Sat May 30, 2009 4:28 pm

Now that my OS has a few video drivers, keyboard/mouse drivers, ATA/ATAPI, SCSI and SATA drivers, memory manager, etc; I'm getting to the point where I am going to work on my bytecode system, and I wanted to run it by you guys and see what you think before I get too far. I've got a LOT of details written up but I'll keep it as brief as possible.

On my system, I plan on ditching hardware multitasking for software multitasking. This includes ditching timer interrupts for switching states and such. My goal is to minimize context switches as much as possible. On top of that, 100% of the non-OS apps will be run as managed bytecode (which is partially specced out by me). This means apps do NOT run natively on the processor but are instead managed by an interpreter of sorts. I've spent a lot of time trying to design things to run as efficiently as possible, of course.
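To make the software-multitasking idea concrete, here is a minimal sketch, assuming the interpreter itself decides when to switch tasks by counting executed instructions. The three-opcode bytecode, the `Task` class and the `budget` parameter are all invented for illustration and are not essial's actual design.

```python
from collections import deque

# Hypothetical bytecode, invented for this sketch:
#   ("push", n)  push a constant
#   ("add",)     pop two values, push their sum
#   ("halt",)    stop the task

class Task:
    def __init__(self, name, code):
        self.name = name
        self.code = code
        self.pc = 0        # program counter; with the stack, the whole saved context
        self.stack = []
        self.done = False

def step(task):
    """Execute exactly one bytecode instruction of the task."""
    op = task.code[task.pc]
    task.pc += 1
    if op[0] == "push":
        task.stack.append(op[1])
    elif op[0] == "add":
        b, a = task.stack.pop(), task.stack.pop()
        task.stack.append(a + b)
    elif op[0] == "halt":
        task.done = True
    else:
        raise ValueError(f"unknown opcode {op[0]!r}")

def run(tasks, budget=2):
    """Round-robin scheduler: each task gets `budget` instructions per turn."""
    ready = deque(tasks)
    trace = []  # order in which tasks were given the CPU
    while ready:
        task = ready.popleft()
        trace.append(task.name)
        for _ in range(budget):
            if task.done:
                break
            step(task)
        if not task.done:
            ready.append(task)  # the "context switch" is just re-queueing

    return trace

a = Task("a", [("push", 1), ("push", 2), ("add",), ("halt",)])
b = Task("b", [("push", 10), ("halt",)])
trace = run([a, b])
```

Because the interpreter owns the program counter and stack of every task, switching needs no register save or privilege change, which is the saving the post is after. The cost is that every instruction goes through `step`.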

Another major thing that my OS does is that applications do *not* run in tight "listening" loops. If the app has nothing to do, my OS won't even process information from it. Instead, applications will have callback hooks. For example, if you create a simple dialog application with a label called "Hello World" and a button called "Quit", the window would show, and then there will be no more application processing until the button is pressed. Of course, applications can request details such as mouse movements, but these are not sent by default (again, there is no message loop). This is done by something I call "event entry points" (which you may call delegates or simply "event handlers"). The idea is that applications that want to be object oriented (aka, listen in a generic tight loop) CAN be, without wasting CPU time. Of course, batch-style apps will simply run from start to end without stopping just fine.
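The "event entry point" model can be sketched roughly as follows. This is only an illustration: the `Kernel` class, its `on` and `dispatch` methods and the widget names are hypothetical stand-ins, not the actual API of essial's OS.

```python
class Kernel:
    """Toy stand-in for the OS side; every name here is hypothetical."""

    def __init__(self):
        self.handlers = {}  # (widget, event) -> callback
        self.running = True

    def on(self, widget, event, callback):
        # An app opts in to an event; events nobody registered are never delivered.
        self.handlers[(widget, event)] = callback

    def dispatch(self, widget, event):
        """Called by the OS when input arrives. Returns True if app code ran."""
        callback = self.handlers.get((widget, event))
        if callback is None:
            return False  # nobody asked for this event: no app code runs
        callback()
        return True

kernel = Kernel()
log = []

def on_quit_pressed():
    log.append("quit pressed")
    kernel.running = False

# Application "main": build the dialog, register one hook, then return.
# There is no message loop on the application side.
kernel.on("quit_button", "press", on_quit_pressed)

moved = kernel.dispatch("window", "mouse_move")    # not requested, so dropped
pressed = kernel.dispatch("quit_button", "press")  # runs the registered hook
```

Between the registration and the button press, the application consumes no interpreter time at all, which is the point of the design described above.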

Also, the entire windowing/control subsystem is embedded in the kernel itself. With this, creating widgets is very quick and efficient; for example, creating a window takes 6 bytes of bytecode. This way the entire desktop can have a unified look and feel (I HATE having to run Gtk and Qt apps side by..), and the extra processing required by 3rd party control libraries goes away. Of course, you CAN create your own controls, but by default the built-in ones will be used.

And lastly (and hopefully most obviously), since all apps will be bytecode, it's easy enough to write a virtual machine of sorts that can run my OS apps on Windows and Linux (I don't have a Mac yet), which would let anyone who wants to help out do so without having to develop in an incomplete OS.

Anyway, I just want to know if anyone has any words of wisdom for me before I start working on this. And, of course, any questions or ideas that you may have to add to this. I plan on starting work on the bytecode compiler, the non-native interpreter, and the OS interpreter at the same time (keeping them in sync feature-wise) -- on Monday, after I finish speccing out the bytecode this weekend.

Thanks for taking the time to read this!

NickJohnson · Post by **NickJohnson** » Sat May 30, 2009 8:54 pm

Sounds pretty interesting, having the entire OS run as bytecode. The issue is that you're not going to get very good performance out of it, regardless of how well you optimize. Something like Java is only able to reach maybe 1/3 native speed, and that's after having very complex JIT compilation. That's probably going to outweigh the userspace multitasking and event driven design many times over. The issue is that with userspace multitasking, you can't really use machine code (i.e. using JIT or other dynamic compilation). I would instead use normal multitasking and at least build in the possibility for dynamic compilation. There's nothing wrong with the design conceptually, but I think it would be kind of less fun if it made a Core 2 run like a Pentium II.

You should also look at the Inferno operating system - it's based on a bytecode interpreted userland as well.

JohnnyTheDon · Post by **JohnnyTheDon** » Sun May 31, 2009 12:05 am

NickJohnson wrote:Sounds pretty interesting, having the entire OS run as bytecode. The issue is that you're not going to get very good performance out of it, regardless of how well you optimize. Something like Java is only able to reach maybe 1/3 native speed, and that's after having very complex JIT compilation. That's probably going to outweigh the userspace multitasking and event driven design many times over. The issue is that with userspace multitasking, you can't really use machine code (i.e. using JIT or other dynamic compilation). I would instead use normal multitasking and at least build in the possibility for dynamic compilation. There's nothing wrong with the design conceptually, but I think it would be kind of less fun if it made a Core 2 run like a Pentium II.

One of the issues with Java is that its JIT is so complex. The HotSpot VM only JIT compiles some of the code and interprets the rest to improve load times. You could quite easily make a virtual machine run as fast as conventionally compiled code; it would just do final compilation with all optimizations to native code at runtime (and take forever to load).

If you use caching, you can get the best of both worlds: the load times of quick and dirty JIT, along with the runtime performance of effective and lengthy JIT. At some point you might want to make a JIT compiler that can be adjusted to balance load times and performance. For example, a relatively small application loaded from a website would be run so that it loads quickly, and a more complex desktop application would compile itself when it is installed and then just run from the cached native code.
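A rough sketch of the caching idea, assuming the cache is keyed by a hash of the bytecode so that an unchanged application is only translated once. `CompileCache` and `fake_compile` are made-up names, and the in-memory dictionary stands in for what would presumably be an on-disk store.

```python
import hashlib

class CompileCache:
    """Maps a digest of the bytecode to its translated form (hypothetical design)."""

    def __init__(self, compile_fn):
        self.compile_fn = compile_fn  # the slow, optimizing translator
        self.store = {}               # digest -> translated code
        self.compiles = 0             # how many times the slow path ran

    def load(self, bytecode: bytes) -> bytes:
        key = hashlib.sha256(bytecode).hexdigest()
        if key not in self.store:
            # Cache miss: pay the full compilation cost once.
            self.store[key] = self.compile_fn(bytecode)
            self.compiles += 1
        return self.store[key]

def fake_compile(bytecode: bytes) -> bytes:
    # Placeholder for real code generation.
    return b"NATIVE:" + bytecode

cache = CompileCache(fake_compile)
app = bytes([0x01, 0x02, 0x03])
first = cache.load(app)   # compiles
second = cache.load(app)  # served from the cache
```

Keying on the content rather than the file name means an updated application is recompiled automatically, at the price of hashing the bytecode on every load.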

AndrewAPrice · Post by **AndrewAPrice** » Sun Aug 09, 2009 12:46 am

Why not translate the byte code to native machine code as it loads and achieve native performance? Then execute the native code in ring 0. You can filter out the bad instructions (if the program being loaded is not a driver and outp is detected, report an error), as well as add memory checks (ban pointer arithmetic and do run-time bounds checks on arrays, much like in .NET).
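The load-time filtering and run-time bounds checking suggested here might look something like the sketch below. The opcode names (including `outp` and `inp` as bytecode-level instructions), `VerifyError` and the `is_driver` flag are assumptions made for the example.

```python
# Hypothetical opcode set, invented for this sketch.
PRIVILEGED = {"outp", "inp"}
KNOWN = {"push", "add", "load_elem", "outp", "inp", "halt"}

class VerifyError(Exception):
    pass

def verify(code, is_driver=False):
    """Reject a program before translation if it uses anything it may not."""
    for index, (op, *args) in enumerate(code):
        if op not in KNOWN:
            raise VerifyError(f"instruction {index}: unknown opcode {op!r}")
        if op in PRIVILEGED and not is_driver:
            raise VerifyError(f"instruction {index}: {op!r} is driver-only")
    return True

def load_elem(array, index):
    """The run-time check that translated code would perform on each array access."""
    if not 0 <= index < len(array):
        raise IndexError(f"index {index} out of bounds for length {len(array)}")
    return array[index]

app = [("push", 1), ("outp", 0x3F8), ("halt",)]
try:
    verify(app)
    rejected = False
except VerifyError:
    rejected = True
```

With these inputs the ordinary application is rejected, while the same code passes when loaded with `is_driver=True`. A real verifier would also have to check stack depth, types and jump targets before the result could be trusted in ring 0.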

geomagas · Post by **geomagas** » Fri Sep 04, 2009 2:07 am

Hi,
I'm planning to go a similar way for my os, so I thought I'd add my 2c.
I am assuming that you will use a JIT approach and not an interpreter one.
Using bytecode for your _userland_ code has advantages, most notably portability (any third-party app as well as a great deal of the os itself will run on any machine your kernel supports).
Others include:
- Runtime optimizations, as your compiler will be able to produce optimal native code for the arch/cpu features your process is about to run (mmx being a good example)
- Security, as you will be able to run everything in ring0, being certain that your compiler does not produce "naughty" code (but that also depends on your bytecode implementation). This will also make your kernel more simple and fast.
- The thing you mention about implementing a virtual machine for another os, to be able (you and others) to test your bytecode is a good option too.
On the other hand:
Of course, the biggest downside is having to implement a compiler, which can be painful. But IMHO, it's a good drill.
Another disadvantage is the obvious one: having to compile everything before it runs dramatically increases load times, which could cancel out the advantages of the optimizations above. That's why my plan is to maintain a cache (list) of compiled processes. That way (a) you only have to JIT a process once, and (b) you could initialize the list, at startup, with a certain number of processes that, for your own reasons, you think must be precompiled. Note, however, that this can prove highly memory-consuming...
But some parts cannot be bytecode -- obviously the compiler for one. The same goes for the "deepest" part of your kernel, the part that is machine-specific, needs to be as fast as possible, and that other things (i.e. the compiler itself) depend on. For example, your compiler will have to allocate memory. The memory allocation routine cannot be written in bytecode, because the compiler will probably need to allocate memory in order to compile the memory manager (infinite recursion).
IMHO the above includes things like the scheduler, but not things like the GUI that you mention you plan to include in the kernel. I find this rather contradictory, but maybe I didn't understand correctly, so would you mind elaborating a little on this?
Nevertheless, I'd be willing to test anything you have ready, when you have it. Keep up.
Regards

ru2aqare · Post by **ru2aqare** » Fri Sep 04, 2009 5:39 am

geomagas wrote: - Security, as you will be able to run everything in ring0, being certain that your compiler does not produce "naughty" code (but that also depends on your bytecode implementation). This will also make your kernel more simple and fast.
...
Thats why my plan is to maintain a cache (list) of compiled processes. That way (a) you only have to JIT a process once, and (b) you could initialize the list, at startup, with a certain number of processes that, for your own reasons, you think they must be precompiled.

How do you prevent the user from tampering with precompiled processes (executables)? How do you prevent the user from taking the HDD out, replacing or hacking the executables on another machine, and inserting the HDD in?

JohnnyTheDon · Post by **JohnnyTheDon** » Fri Sep 04, 2009 8:43 am

ru2aqare wrote:
geomagas wrote: - Security, as you will be able to run everything in ring0, being certain that your compiler does not produce "naughty" code (but that also depends on your bytecode implementation). This will also make your kernel more simple and fast.
...
Thats why my plan is to maintain a cache (list) of compiled processes. That way (a) you only have to JIT a process once, and (b) you could initialize the list, at startup, with a certain number of processes that, for your own reasons, you think they must be precompiled.
How do you prevent the user from tampering with precompiled processes (executables)? How do you prevent the user from taking the HDD out, replacing or hacking the executables on another machine, and inserting the HDD in?

How do you prevent anyone with physical access to any computer from editing executable code, even when it is statically compiled? That has nothing to do with dynamic translation or byte code.

Craze Frog · Post by **Craze Frog** » Fri Sep 04, 2009 8:58 am

MessiahAndrw wrote:Why not translate the byte code to native machine code as it loads and achieve native performance?

Because if loading a program takes 1 minute instead of 3 seconds then you haven't achieved native performance in any way. Generating native code that is as optimized as GCC -O3 takes a lot of time.

geppyfx · Post by **geppyfx** » Sat Sep 05, 2009 1:01 pm

ru2aqare wrote:How do you prevent the user from tampering with precompiled processes (executables)? How do you prevent the user from taking the HDD out, replacing or hacking the executables on another machine, and inserting the HDD in?

Encryption and hashing seem like the only solution to prevent modification of native executables.
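As a sketch of the hashing half of this suggestion: a plain hash stored next to the executable could be rewritten by whoever rewrote the executable, so the example below assumes a keyed hash (HMAC) instead. The `seal` and `verify_seal` helpers and the demo key are hypothetical. Where the key is kept is exactly the physical-access problem raised earlier, and the sketch does not solve it.

```python
import hashlib
import hmac

def seal(native_code: bytes, key: bytes) -> bytes:
    """Compute a tag to store alongside the cached native code."""
    return hmac.new(key, native_code, hashlib.sha256).digest()

def verify_seal(native_code: bytes, tag: bytes, key: bytes) -> bool:
    """Recompute the tag at load time and compare in constant time."""
    expected = hmac.new(key, native_code, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

key = b"demo key, not a real secret"
image = b"\x90\x90\xc3"  # stand-in for a cached native executable
tag = seal(image, key)

ok = verify_seal(image, tag, key)
tampered_ok = verify_seal(b"\x90\xcc\xc3", tag, key)  # one byte changed
```

If verification fails, the loader could simply discard the cached native code and recompile from the bytecode, which still has to pass the compiler's own checks.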

xvedejas · Post by **xvedejas** » Sat Sep 05, 2009 1:34 pm

geomagas wrote:Hi,
I'm planning to go a similar way for my os, so I thought I'd add my 2c.
I am assuming that you will use a JIT approach and not an interpreter one.

You assume wrong. I've talked with him and he is definitely not going with the JIT approach. I would like to point out that he is currently busy (bought a new house, lot of work to be done) and hasn't been working on his kernel. He plans on releasing the source code when he gets more time.

I'm doing a project extremely similar to his and have emailed him about collaborating. He hasn't yet responded but I think it would be interesting for the three of us to get together. You can find out more about my project (which is not quite as far along as essial's) by visiting the link in my signature.

geomagas · Post by **geomagas** » Sun Sep 06, 2009 11:56 pm

Hi,

If my assumption was wrong then I guess you can discard pretty much my whole post...

JohnnyTheDon, you took the words right out of my mouth.

Furthermore, about security:
When I said security I was referring to the security provided by a bytecode compiler, in terms of code production, and definitely not total and overall security about everything. For example, if there's no in/out equivalent command in your bytecode, how can a compiler produce such machine code? Or, if your native code is loaded and executed from isolated memory spaces, how can it tamper with other portions of memory? That falls in the "portion of the very basic security that a kernel must provide" category, but security as a whole is a much bigger chapter, which includes e.g. ACLs, protection against, say, viruses etc.

As for the scenario about a user unplugging a disk and screwing with it on another machine, locking your (apparently important) server in a room and throwing away all extra keys is an administrator's job, not an osdever's one.

Regards

ru2aqare · Post by **ru2aqare** » Mon Sep 07, 2009 3:54 am

geomagas wrote:For example, if there's no in/out equivalent command in your bytecode, how can a compiler produce such machine code?

The answer is bugs in the compiler (although I admit it's unlikely). This is somewhat similar to the reason why some of the OS testers won't accept an OS image that contains an ATA driver. Even if the driver is programmed to only read the disk, there is no proof that it won't go haywire and overwrite your harddisk.

geomagas wrote: As for the scenario about a user unplugging a disk and screwing with it on another machine, locking your (apparently important) server in a room and throwing away all extra keys is an administrator's job, not an osdever's one.

I only brought that up to highlight that the issue exists, and you need to have some kind of solution to it. Of course the entire PC architecture is insecure from this viewpoint, and there is nothing you can do to secure it without going the XBOX/PS3/etc route.

geomagas · Post by **geomagas** » Mon Sep 07, 2009 4:19 am

Hi,

Well, imho starting a conversation about bugs would be irrelevant. Every piece of s/w can have bugs, and a lot of people will tell you that it's highly probable, too. But when you discuss a way to implement things, you try to figure out what would work best, all ways being bug-free.

ru2aqare wrote: the issue exists

True

ru2aqare wrote: and you need to have some kind of solution to it

False. I still think it's someone else's job, and thinking about how to do another person's job is just a waste of braincells.

Regards

fronty · Post by **fronty** » Mon Sep 07, 2009 9:16 am

geomagas wrote:But some parts cannot be bytecode -- obviously the compiler for one.

Why not? Ever heard of bootstrapping? You can write a compiler for a subset of your language in some other language, then write a compiler for your language and compile it with your first compiler. Ta-da, you have a compiler for your language in bytecode.

JohnnyTheDon · Post by **JohnnyTheDon** » Mon Sep 07, 2009 2:41 pm

fronty wrote:
geomagas wrote:But some parts cannot be bytecode -- obviously the compiler for one.
Why not? Ever heard of bootstrapping? You can write a compiler for a subset of your language in some other language, then write a compiler for your language and compile it with your first compiler. Ta-da, you have a compiler for your language in bytecode.

I do something similar to this. I wrote my compiler in my own language, and also wrote a program that converts it to Ruby. It's kind of slow, but it is only used for compiling the compiler.

OSDev.org
