Multiaddress space message passing virtual machine

Posted: Mon Jul 15, 2013 11:07 pm
by zeitue
Would it be possible to make a Multiaddress space message passing virtual machine?
I've looked at AROS and Java's JVM.
Each JVM instance can only run one program, and it runs in a single global address space. It is a stack machine (operands are kept on an operand stack rather than in registers) and loads classes as its libraries.
AROS is an exokernel when running on bare metal and a virtual machine when hosted on another OS.
AROS is also a single address space OS when running in either state.
On AROS the dynamic link libraries are relocatable ELF objects. The first time a library is opened, it is loaded from disk and relocated to the start address it was loaded at. On AROS and Amiga-like systems, memory is shared between all code running on the system as a single big memory region. This allows all programs to use the library at the address where it was loaded.

Other systems, including Windows and Unix, have a different virtual address space for each process. Here too the OS tries to load the shared library only once and it then maps the same library in the address space of each of the processes using it. The library may thus be located at different addresses in the different spaces and the OS has to handle this.

Windows will first try to locate the shared library at a single location in memory and map it to the same memory region in each process that uses the library. If this is not possible the library will be duplicated in memory. On most Unix systems this problem is avoided by letting the compiler generate position independent code, i.e. code that works at any position in memory without having to be relocated. Depending on the architecture, this type of code may have more or less impact on the speed of the generated code.
My idea is to use multiple address spaces and to keep the programs and services isolated from each other.
I'm thinking of something like microkernel message passing between address spaces.
This could allow many extendable features such as file systems and other things to be added without too much trouble?
In a sense I'm trying to replace the idea of relocatable ELF files or Java class files with standard dynamically linked ELF or bytecode executables, each having their own address space.
Would it be possible to have a multi-address space virtual machine with a message passing interface?
Final note: one of the major goals of this is to be able to use certain servers from my microkernel-like system in this hosted virtual machine.

Re: Multiaddress space message passing virtual machine

Posted: Tue Jul 16, 2013 12:29 am
by Mikemk
Do you mean something similar to Paging?

Re: Multiaddress space message passing virtual machine

Posted: Tue Jul 16, 2013 12:57 am
by zeitue
m12 wrote:Do you mean something similar to Paging?
Thanks for the reply.
Technically yes, I want virtual memory and paging in a virtual machine that would run on Linux, Mac OS X, and FreeBSD.
I only want to get what I need from the host. What I want to know is whether my virtual machine can be in multiple address spaces on these host operating systems, and whether or not I can use a microkernel-like design.
This is all just theory at the moment.

Re: Multiaddress space message passing virtual machine

Posted: Fri Jul 19, 2013 4:21 pm
by Casm
zeitue wrote:
m12 wrote:Do you mean something similar to Paging?
Thanks for the reply.
Technically yes, I want virtual memory and paging in a virtual machine that would run on Linux, Mac OS X, and FreeBSD.
I only want to get what I need from the host. What I want to know is whether my virtual machine can be in multiple address spaces on these host operating systems, and whether or not I can use a microkernel-like design.
This is all just theory at the moment.
Programs like VirtualBox have to do a pretty convincing emulation of the hardware for Windows or Linux to run on them. So in principle paging can be implemented on a virtual machine. It all comes down to whether you are familiar enough with x86 processors to be able to pull it off.

Re: Multiaddress space message passing virtual machine

Posted: Fri Jul 19, 2013 4:52 pm
by zeitue
Casm wrote:
zeitue wrote:
m12 wrote:Do you mean something similar to Paging?
Thanks for the reply.
Technically yes, I want virtual memory and paging in a virtual machine that would run on Linux, Mac OS X, and FreeBSD.
I only want to get what I need from the host. What I want to know is whether my virtual machine can be in multiple address spaces on these host operating systems, and whether or not I can use a microkernel-like design.
This is all just theory at the moment.
Programs like VirtualBox have to do a pretty convincing emulation of the hardware for Windows or Linux to run on them. So in principle paging can be implemented on a virtual machine. It all comes down to whether you are familiar enough with x86 processors to be able to pull it off.
I plan for this to be multi-arch, but what I want to know is: can I use separate memory addresses from the host OS as my internal virtual memory, and use a microkernel-like message passing method?
In a way I'm trying to let the guest (my VM) have more open access to things and be less locked down.
Also, does VirtualBox allow for this, or am I off?

Re: Multiaddress space message passing virtual machine

Posted: Fri Jul 19, 2013 5:14 pm
by zeitue
Maybe what I'm thinking of is a hypervisor that runs on each platform, is merged with a microkernel, and then allows microkernel services to run on top of it.
If a memory manager ran on top of the hypervisor on top of Linux/BSD, would I require support in the hypervisor for the memory manager, or could I use an RPC to connect to the host and request memory from the host OS?
I'm willing to learn anything I need to in order to make this project work.

Re: Multiaddress space message passing virtual machine

Posted: Fri Jul 19, 2013 9:48 pm
by Brendan
Hi,
zeitue wrote:Would it be possible to make a Multiaddress space message passing virtual machine?
Yes.

The only problem I see here is that you need to decide on more specific goals. Here's a list of things for you to decide:
  • Will it be hosted (run on top of another OS), run on bare metal, or support both?
  • What will the virtual machine execute? Will it be some sort of byte-code (e.g. like Java), native (80x86, ARM?) instructions, or something else? If byte-code is involved, will you interpret it, JIT it, or do something else?
  • How will message passing behave? Will messages be fixed size (and how big) or variable size (with what maximum size)? Will they be moved or copied? How will sending/receiving messages affect scheduling (e.g. synchronous, asynchronous)? Will each thread get one "message port", or will threads be able to allocate an arbitrary number of message ports? Can multiple threads share the same message port/s? Can a thread broadcast a message to all message ports, or to a subset of all message ports? How will security work (can any thread send to any message port, or are there restrictions)?
Depending on all of the above; you could end up with (e.g.) an OS that executes native code, that uses "fork()" to create address spaces, with the kernel in shared memory and sockets for message passing; or a single address space OS (that uses a single multi-threaded process on the host OS) that interprets byte-code, emulates multiple address spaces for guests and uses "pthread_cond_signal" (and buffers) for message passing; or ....
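
The second option above (one multi-threaded host process, with a condition variable and a buffer behind each message port) might look roughly like the following. This is only a sketch in Python rather than C; `threading.Condition.notify` plays the role of `pthread_cond_signal`, and the `MessagePort` class, its method names, and `run_demo` are hypothetical names invented for illustration.

```python
import threading
from collections import deque


class MessagePort:
    """Hypothetical in-process message port: a buffer guarded by a condition variable."""

    def __init__(self):
        self._buffer = deque()
        self._cond = threading.Condition()

    def send(self, message):
        # Asynchronous send: copy the message into the buffer and wake one receiver.
        with self._cond:
            self._buffer.append(bytes(message))
            self._cond.notify()  # analogous to pthread_cond_signal

    def receive(self):
        # Blocking receive: sleep until at least one message is buffered.
        with self._cond:
            while not self._buffer:
                self._cond.wait()
            return self._buffer.popleft()


def run_demo():
    port = MessagePort()
    received = []

    def server():
        # A "server" thread that handles exactly three messages.
        for _ in range(3):
            received.append(port.receive())

    worker = threading.Thread(target=server)
    worker.start()
    for text in (b"open", b"read", b"close"):
        port.send(text)
    worker.join()
    return received
```

Here messages are copied (not moved) and sending never blocks; the questions in the list above are exactly the knobs such a sketch would have to settle differently for a synchronous design.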


Cheers,

Brendan

Re: Multiaddress space message passing virtual machine

Posted: Sat Jul 20, 2013 12:48 am
by zeitue
Thanks for the reply, Brendan :D
Brendan wrote:Will it be hosted (run on top of another OS), run on bare metal, or support both?
I planned to do both: one an implementation of L4 with user-space memory, VFS, and process servers, and the other a hosted virtual machine.
Brendan wrote:What will the virtual machine execute? Will it be some sort of byte-code (e.g. like Java), native (80x86, ARM?) instructions, or something else? If byte-code is involved, will you interpret it, JIT it, or do something else?
I planned for a bit of both. I planned to have two servers that would run on both: an ELF loader (native, whatever the host CPU is) and a BFF loader (a JIT-compiled bytecode).
Brendan wrote:How will message passing behave? Will messages be fixed size (and how big) or variable size (with what maximum size)? Will they be moved or copied? How will sending/receiving messages affect scheduling (e.g. synchronous, asynchronous)? Will each thread get one "message port", or will threads be able to allocate an arbitrary number of message ports? Can multiple threads share the same message port/s? Can a thread broadcast a message to all message ports, or to a subset of all message ports? How will security work (can any thread send to any message port, or are there restrictions)?
I plan to implement L4-like message passing:
  • Small messages are passed in registers.
  • Synchronous, with an asynchronous framework on top.
  • Messages can be passed by copying or mapping.
  • Messages will be variable length.


Would the VM, when hosted on Linux or BSD for example, have to be in a single address space, or can I have multiple address spaces on these platforms?
Also, when hosted I'm trying for very little abstraction, so it will be a bit like an extension to the host :mrgreen:
Kind of like AROS, but with multiple address spaces.

Re: Multiaddress space message passing virtual machine

Posted: Sat Jul 20, 2013 7:02 pm
by Brendan
Hi,
zeitue wrote:Would the VM when hosted on Linux or BSD for example, have to be in a single address space or can I have multiple address spaces on these platforms?
For ELF/native executables running "hosted", you can't emulate multiple address spaces while only actually using one. Therefore you'd need to use "fork()" to create multiple address spaces. Your kernel would be implemented as a shared library that uses shared memory for global data. You'd also have a process (for the host OS) that takes care of "booting", which loads your "kernel" shared library. When a new process is started, your kernel would "fork()", clean up the copy of the old address space, load the new process' executable into the address space, etc. In this case you'd have to rely on the host OS to do the actual scheduling; which means that for messaging you'd need to find something that causes similar behaviour (e.g. possibly datagram sockets where your kernel/library doesn't provide a simple "send" and only provides "send, then block until reply is received").
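
A rough sketch of the "fork() plus datagram sockets" arrangement described above, in Python for brevity and for Unix-like hosts only. The names `call`, `serve_one`, and `run_demo` are hypothetical; a real kernel library would load the new process' executable into the child instead of running a Python function, and would keep its global data in shared memory.

```python
import os
import socket


def call(sock, request):
    """Send a request, then block until the reply is received (no plain 'send')."""
    sock.send(request)
    return sock.recv(4096)


def serve_one(sock):
    """Guest 'server' side: receive one request and reply with it upper-cased."""
    request = sock.recv(4096)
    sock.send(request.upper())


def run_demo():
    # A connected pair of Unix datagram sockets stands in for a message channel.
    parent_end, child_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    pid = os.fork()  # the child gets its own copy-on-write address space
    if pid == 0:
        # Child: a separate host process, isolated from the parent's memory.
        parent_end.close()
        try:
            serve_one(child_end)
        finally:
            os._exit(0)  # leave without running the parent's cleanup handlers
    child_end.close()
    reply = call(parent_end, b"ping")
    os.waitpid(pid, 0)
    parent_end.close()
    return reply
```

Because `call` blocks in `recv` until the server answers, the host scheduler ends up running the receiver next, which approximates synchronous IPC without the guest kernel doing any scheduling of its own.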

For BFF JIT bytecode running "hosted", it'd be the same as above; except that the ELF executable would be the BFF virtual machine (which would load a "BFF executable file" and execute it).

On top of that you'd need pretend device drivers. For example, rather than having a video card device driver that responds to your messages that talks to the video card, you'd have a device driver that responds to your messages that talks to the host OS.

For the "bare metal" version, your boot code and kernel would need to be very different (and do everything itself, including creating/managing virtual address spaces, scheduling, etc); and you'd need device drivers that talk to hardware. All normal executable files would run on both "hosted" and "bare metal" versions of the OS though; including native/ELF executables and BFF bytecode executables and the "BFF virtual machine", and things like file systems, VFS, network stack, etc.

Note: a long time ago I tried doing something slightly similar, to allow applications designed for my OS to be recompiled and run on Linux. It's not easy and there are a lot of little details where it's hard to get the behaviour your executables expect and suppress things that your applications don't expect. For example; I couldn't get scheduling to behave right, so I decided I'd ignore my OS's thread priorities and stop caring about getting it right; and signals don't exist on my OS and messages are used instead, so I had to write a bunch of signal handlers (and gave up when that turned into a huge nightmare - e.g. trying to get the signal handler for "SIGHUP" to send a message when the signal occurs while you're in the middle of sending a message and have already acquired mutexes, etc).


Cheers,

Brendan

Re: Multiaddress space message passing virtual machine

Posted: Mon Jul 22, 2013 1:08 am
by zeitue
I've found a few things that might be relevant to this idea:
something called a green process and a green thread.
These features come from Erlang and Inferno OS.
Do you think this might work?
Some bits of information:
Green Threads
Green Processes
There are several contributing factors:

  • Erlang processes use a lightweight cooperative threading model (preemptive at the Erlang level, but under the control of a cooperatively scheduled runtime). This means that it is much cheaper to switch context, because they only switch at known, controlled points and therefore don't have to save the entire CPU state (normal, SSE and FPU registers, address space mapping, etc.).
  • Erlang processes use dynamically allocated stacks, which start very small and grow as necessary. This permits the spawning of many thousands (even millions) of Erlang processes without sucking up all available RAM.
  • Erlang used to be single-threaded, meaning that there was no requirement to ensure thread-safety between processes. It now supports SMP, but the interaction between Erlang processes on the same scheduler/core is still very lightweight (there are separate run queues per core).
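
To make the "switch only at known, controlled points" idea concrete, here is a minimal cooperative scheduler built on Python generators. It is a sketch of green threads in general, not of how the Erlang runtime is implemented; `worker`, `run_all`, and `run_demo` are hypothetical names.

```python
from collections import deque


def worker(name, steps, log):
    """A green thread: runs until it voluntarily yields control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # cooperative switch point; only the generator frame is saved


def run_all(tasks):
    """Round-robin scheduler running every green thread on one host thread."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)  # resume the task until its next yield
        except StopIteration:
            continue  # task finished; drop it from the run queue
        ready.append(task)


def run_demo():
    log = []
    run_all([worker("a", 2, log), worker("b", 3, log)])
    return log
```

Calling `run_demo()` returns `['a:0', 'b:0', 'a:1', 'b:1', 'b:2']`: the two tasks interleave on a single host thread, and a task that never yields would starve the other.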

Re: Multiaddress space message passing virtual machine

Posted: Mon Jul 22, 2013 11:23 pm
by zeitue
Brendan wrote: For ELF/native executables running "hosted", you can't emulate multiple address spaces while only actually using one. Therefore you'd need to use "fork()" to create multiple address spaces. Your kernel would be implemented as a shared library that uses shared memory for global data. You'd also have a process (for the host OS) that takes care of "booting", which loads your "kernel" shared library. When a new process is started, your kernel would "fork()", clean up the copy of the old address space, load the new process' executable into the address space, etc. In this case you'd have to rely on the host OS to do the actual scheduling; which means that for messaging you'd need to find something that causes similar behaviour (e.g. possibly datagram sockets where your kernel/library doesn't provide a simple "send" and only provides "send, then block until reply is received").

For BFF JIT bytecode running "hosted", it'd be the same as above; except that the ELF executable would be the BFF virtual machine (which would load a "BFF executable file" and execute it).

On top of that you'd need pretend device drivers. For example, rather than having a video card device driver that responds to your messages that talks to the video card, you'd have a device driver that responds to your messages that talks to the host OS.
Okay, this seems like it won't work for me. May I ask a few questions, please?
If the virtual machine runs inside a single address space:
  • Would it be limited to running a single process inside the virtual machine, or could I run millions?
  • Would it be possible to have protection between the multiple processes in the virtual machine, similar to what virtual memory protection gives, or not?
  • Would processes be able to have multiple threads?
  • Would Inter Process Communication(IPC)/message passing be possible?
  • Would the ELF files loaded by the machine have to be Libraries or object files like AROS does or could they be full ELF binary executables?
  • Could the virtual machine be extended with server like processes running inside the virtual machine or would they have to be libraries?
  • Could BFF (Bytecode File Format) executables be run in the same virtual machine alongside the native ELF executables?
  • Can I send commands to the Virtual machine by use of a daemon or some kind of launcher running on the host?
  • Would remote procedure call be possible inside the virtual machine?
  • Would remote procedure call be possible from inside the virtual machine to the host?
  • Would I be able to schedule the processes in the virtual machine?
  • Would paging be possible inside the virtual machine? (perhaps paging could be done by emulating a page table?)
  • Would swapping be possible inside the virtual machine?
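
The "emulating a page table" idea in the paging question can be sketched as below. It assumes every guest memory access goes through the VM (as it would in an interpreter), so it does not by itself protect native code running directly on the host CPU. The `AddressSpace` class, `PageFault`, and the 4 KiB page size are assumptions made for illustration.

```python
PAGE_SIZE = 4096


class PageFault(Exception):
    """Raised when a guest access is unmapped or violates page permissions."""


class AddressSpace:
    """Hypothetical per-process page table: virtual page -> (frame, writable)."""

    def __init__(self, memory):
        self.memory = memory  # shared bytearray standing in for guest RAM
        self.pages = {}

    def map_page(self, vpage, frame, writable=False):
        self.pages[vpage] = (frame, writable)

    def translate(self, vaddr, write=False):
        # Split the virtual address into page number and offset, then look it up.
        vpage, offset = divmod(vaddr, PAGE_SIZE)
        entry = self.pages.get(vpage)
        if entry is None:
            raise PageFault(f"unmapped address {vaddr:#x}")
        frame, writable = entry
        if write and not writable:
            raise PageFault(f"write to read-only address {vaddr:#x}")
        return frame * PAGE_SIZE + offset

    def load(self, vaddr):
        return self.memory[self.translate(vaddr)]

    def store(self, vaddr, value):
        self.memory[self.translate(vaddr, write=True)] = value


def run_demo():
    ram = bytearray(4 * PAGE_SIZE)
    proc_a, proc_b = AddressSpace(ram), AddressSpace(ram)
    proc_a.map_page(0, frame=1, writable=True)
    proc_b.map_page(0, frame=2, writable=True)
    proc_a.store(0x10, 0xAA)  # same virtual address, different frames
    proc_b.store(0x10, 0xBB)
    return proc_a.load(0x10), proc_b.load(0x10)
```

Both guest processes use virtual address 0x10, yet each sees its own value, and an access to an unmapped page raises `PageFault`, which is where a swapping layer could hook in.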

Re: Multiaddress space message passing virtual machine

Posted: Tue Jul 23, 2013 12:12 am
by Brendan
Hi,
zeitue wrote:I've found a few things that might be interesting to this idea
something called a green process and green thread.
These features come from Erlang and Inferno OS.
do you think this might work?
It works; but the real question is how well it will work.

Normally I'm against green threads because it makes it impossible for anything to make good decisions about where CPU time is used. For example; imagine 2 normal processes that both have a high priority green thread and a low priority green thread. When the high priority green thread is idle and only the low priority green thread is running, how does one process know if it should switch to the other process or not? Each process doesn't know about the priority and state of other processes' threads and can't decide, and the kernel doesn't know about the priority and state of any green thread and can't decide, therefore nothing can make good decisions and it's fundamentally broken (CPU/s end up wasting time doing unimportant things when there's much more important work to do).

However; your specific case is not the normal case, and my "nothing can make good decisions" objection may not apply to your specific case.
zeitue wrote:If the virtual machine runs inside a single address space
  • Would it be limited to running a single process inside the virtual machine, or could I run millions?
  • Would it be possible to have protection between the multiple processes in the virtual machine, similar to what virtual memory protection gives, or not?
  • Would processes be able to have multiple threads?
  • Would Inter Process Communication(IPC)/message passing be possible?
  • Would the ELF files loaded by the machine have to be Libraries or object files like AROS does or could they be full ELF binary executables?
  • Could the virtual machine be extended with server like processes running inside the virtual machine or would they have to be libraries?
  • Could BFF (Bytecode File Format) executables be run in the same virtual machine alongside the native ELF executables?
  • Can I send commands to the Virtual machine by use of a daemon or some kind of launcher running on the host?
  • Would remote procedure call be possible inside the virtual machine?
  • Would remote procedure call be possible from inside the virtual machine to the host?
  • Would I be able to schedule the processes in the virtual machine?
It's easy to prove that all of these things are possible, simply by considering something like Bochs - anything that is possible when running on "bare metal" is possible inside a virtual machine.


Cheers,

Brendan

Re: Multiaddress space message passing virtual machine

Posted: Tue Jul 23, 2013 12:24 am
by zeitue
Hi, thanks for more information :D
Brendan wrote:Hi,
zeitue wrote:If the virtual machine runs inside a single address space
  • Would it be limited to running a single process inside the virtual machine, or could I run millions?
  • Would it be possible to have protection between the multiple processes in the virtual machine, similar to what virtual memory protection gives, or not?
  • Would processes be able to have multiple threads?
  • Would Inter Process Communication(IPC)/message passing be possible?
  • Would the ELF files loaded by the machine have to be Libraries or object files like AROS does or could they be full ELF binary executables?
  • Could the virtual machine be extended with server like processes running inside the virtual machine or would they have to be libraries?
  • Could BFF (Bytecode File Format) executables be run in the same virtual machine alongside the native ELF executables?
  • Can I send commands to the Virtual machine by use of a daemon or some kind of launcher running on the host?
  • Would remote procedure call be possible inside the virtual machine?
  • Would remote procedure call be possible from inside the virtual machine to the host?
  • Would I be able to schedule the processes in the virtual machine?
It's easy to prove that all of these things are possible, simply by considering something like Bochs - anything that is possible when running on "bare metal" is possible inside a virtual machine.

Cheers,

Brendan
OK, so all this is possible, but does that mean I have to emulate the host's CPU like Bochs?
Or would there be other ways to implement it and use the CPU of the host?
Also, I assume my design will be a register-based virtual machine.

Re: Multiaddress space message passing virtual machine

Posted: Tue Jul 23, 2013 12:53 am
by Brendan
Hi,
zeitue wrote:OK, so all this is possible, but does that mean I have to emulate the host's CPU like Bochs?
Or would there be other ways to implement it and use the CPU of the host?
Also, I assume my design will be a register-based virtual machine.
Bochs emulates 80x86 CPU/s regardless of what the host CPU is (e.g. if you run Bochs on an ARM, Itanium or PowerPC, then it doesn't emulate the host CPU).

For your "BFF executables" you will need to emulate the entire guest CPU (your "BFF machine").
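
As an illustration of what "emulating the guest CPU" means for a register-based bytecode, here is a toy interpreter loop. The instruction set (`loadi`, `add`, `jnz`, `halt`) is invented for this sketch and is not the BFF format, which is not specified anywhere in this thread; a JIT would translate the same instructions to host code instead of dispatching on them one at a time.

```python
def run(program, num_registers=4):
    """Interpret a toy register-based bytecode; returns the final register file.

    The opcodes are hypothetical and exist only for this example.
    """
    regs = [0] * num_registers
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "loadi":      # loadi dst, constant
            regs[args[0]] = args[1]
        elif op == "add":      # add dst, src1, src2
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "jnz":      # jnz reg, target: jump if register is non-zero
            if regs[args[0]] != 0:
                pc = args[1]
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return regs


# Sum 3 + 2 + 1 into r0, counting r1 down to zero.
PROGRAM = [
    ("loadi", 0, 0),    # r0 = 0 (accumulator)
    ("loadi", 1, 3),    # r1 = 3 (counter)
    ("loadi", 2, -1),   # r2 = -1 (decrement)
    ("add", 0, 0, 1),   # r0 += r1
    ("add", 1, 1, 2),   # r1 -= 1
    ("jnz", 1, 3),      # loop back to instruction 3 while r1 != 0
    ("halt",),
]
```

Running `run(PROGRAM)` leaves 6 in register 0. Because the interpreter sees every instruction, it can also check every memory access, which is why this route can give protection inside a single host address space.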

For your "native 80x86 ELF executables", you may or may not need to emulate some or all of the 80x86 guest CPUs when running on 80x86 host CPUs, depending on how you implement it. I'd expect that if you use a single host address space for multiple guest executables, then:
  • you will need to use position independent executables and have no protection, or
  • you will need to use position independent executables and use some sort of managed code (possibly including inventing your own tool-chain to generate special "native 80x86 ELF executables"), or
  • you will need to emulate at least most of the CPU
I'd also expect that if you use one host address space for each guest executable, then you can have protection without emulating any of the CPU, and without requiring special executables or tools.


Cheers,

Brendan

Re: Multiaddress space message passing virtual machine

Posted: Tue Jul 23, 2013 1:35 am
by zeitue
Thank you for helping me and giving me direction; otherwise I'd be lost on this.
Brendan wrote:I'd also expect that if you use one host address space for each guest executable, then you can have protection without emulating any of the CPU, and without requiring special executables or tools.
Would this be like before, when you mentioned the kernel would have to be a shared library?

Brendan wrote:
  • you will need to use position independent executables and have no protection, or
  • you will need to use position independent executables and use some sort of managed code (possibly including inventing your own tool-chain to generate special "native 80x86 ELF executables"), or
  • you will need to emulate at least most of the CPU
I think I'd rather implement an emulated CPU than make a tool chain for this.
Brendan wrote:For your "native 80x86 ELF executables", you may or may not need to emulate some or all of the 80x86 guest CPUs when running on 80x86 host CPUs, depending on how you implement it. I'd expect that if you use a single host address space for multiple guest executables,
This would be the same for ARM, PowerPC, or other architectures, right?

Brendan wrote:For your "BFF executables" you will need to emulate the entire guest CPU (your "BFF machine").
Maybe the best design is to implement a virtual host CPU and also a virtual BFF CPU next to it in the same machine? Though that might be just too strange.

  • If the CPU is emulated and I wish to have virtual addresses/memory protection, do I need a virtual MMU?
  • Do you think it's possible, or a good idea, to link ELF binaries/libraries to BFF binaries/libraries?