Single address space OS design: please review my thoughts

Brendan · Post by **Brendan** » Thu Apr 17, 2014 8:01 pm

Hi,

ababo wrote:
To get a consistent snapshot you will need to stop the world (otherwise things are changing while you're trying to save to disk).
I think domains (namespaces) should be treated separately. This means to make a consistent incremental snapshot we should stop the entire domain (not the whole system).

This only works if there's no communication (where "no communication" implies "nobody would notice if it didn't exist"). For example, imagine there are two things called Foo and Bar. Foo is waiting for something to do and you save a snapshot of Foo. Then Bar sends a message to Foo and starts waiting for a reply, and you save a snapshot of Bar. The system is rebooted. You load the snapshot for Foo (from before the message was sent) and the snapshot for Bar (from after the message was sent), and now Bar will wait for the reply forever because the snapshots weren't synchronised.
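
A minimal sketch of the Foo/Bar scenario, assuming a toy model in which a domain is just an inbox plus a "waiting for reply" flag. The names `Domain`, `send`, `run_scenario` and `is_stuck` are hypothetical and only illustrate the ordering problem:

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Domain:
    name: str
    inbox: list = field(default_factory=list)
    waiting_for_reply: bool = False


def send(sender, receiver, msg):
    """Deliver msg to the receiver and leave the sender waiting for a reply."""
    receiver.inbox.append((sender.name, msg))
    sender.waiting_for_reply = True


def run_scenario(synchronised):
    """Return the (Foo, Bar) states that would be restored after a reboot."""
    foo, bar = Domain("Foo"), Domain("Bar")
    if synchronised:
        send(bar, foo, "request")
        # Both domains are stopped and saved at the same point in time.
        snap_foo, snap_bar = copy.deepcopy(foo), copy.deepcopy(bar)
    else:
        snap_foo = copy.deepcopy(foo)  # Foo saved before the message
        send(bar, foo, "request")
        snap_bar = copy.deepcopy(bar)  # Bar saved after the message
    return snap_foo, snap_bar


def is_stuck(foo, bar):
    """Bar is stuck if it waits for a reply to a message Foo never saw."""
    got_request = any(sender == bar.name for sender, _ in foo.inbox)
    return bar.waiting_for_reply and not got_request
```

With per-domain snapshots taken at different times, `is_stuck(*run_scenario(False))` is true; when both are saved at the same point it is false.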

Cheers,

Brendan

Owen · Post by **Owen** » Thu Apr 17, 2014 9:41 pm

My biggest problem with any persistent OS is this:

I've written MyDatabase v1. I've loaded 1GB of data into its persistent process.

Now I've made MyDatabase v2. How do I upgrade?

Rusky · Post by **Rusky** » Fri Apr 18, 2014 8:15 am

You import the data into MyDatabase v2 using shared memory and close MyDatabase v1?

SpyderTL · Post by **SpyderTL** » Fri Apr 18, 2014 2:40 pm

Mapping external device memory to a single 64-bit memory address has got me thinking, since I've just started a new 64-bit OS project myself.

So, if I understand correctly, you could map a 1 TB hard drive into your address space starting at, say, 0x100000 (probably much higher...).

So, obviously, the CPU can't "see" the data at this address directly. It would be expecting there to be RAM or a memory-mapped device located at that address. So this implies that you are using virtual addresses (pages) which themselves are mapped to physical memory so that the CPU can actually see the data. But it seems like you now have a chicken-and-egg scenario, where virtual memory is copied to physical memory, which can be paged out to the hard drive, which is memory mapped to virtual memory, which is paged into physical memory, and so on.

I'm not even sure if I can wrap my head around this enough to determine if this is a valid problem or not, so maybe someone else can elaborate on what I'm trying to say...

Still, an interesting topic though.

ababo · Post by **ababo** » Fri Apr 18, 2014 3:45 pm

To Brendan:

Brendan wrote:This only works if there's no communication (where "no communication" implies "nobody would notice if it didn't exist"). For example, imagine there are two things called Foo and Bar. Foo is waiting for something to do and you save a snapshot of Foo. Then Bar sends a message to Foo and starts waiting for a reply, and you save a snapshot of Bar. The system is rebooted. You load the snapshot for Foo (from before the message was sent) and the snapshot for Bar (from after the message was sent), and now Bar will wait for the reply forever because the snapshots weren't synchronised.

It depends on the implementation of inter-domain IPC calls, which should support a configurable timeout period: if a domain is not ready within N microseconds, the IPC call fails with a timeout. Extrapolated to higher-level distributed computing, this principle would make it possible to move domains between different physical machines (let's assume that domains are identified by UUID).
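
A rough sketch of what such a call could look like, assuming domains are modelled as request queues and the timeout is given in seconds rather than microseconds. `Domain`, `ipc_call` and `IpcTimeout` are hypothetical names, not an existing API:

```python
import queue
import threading
import uuid


class IpcTimeout(Exception):
    """Raised when the target domain does not answer in time."""


class Domain:
    def __init__(self):
        self.id = uuid.uuid4()         # domains are identified by UUID
        self.requests = queue.Queue()  # incoming (payload, reply_box) pairs

    def serve_one(self, handler):
        """Handle a single request and post the reply."""
        payload, reply_box = self.requests.get()
        reply_box.put(handler(payload))


def ipc_call(target, payload, timeout_s):
    """Send payload to target and wait at most timeout_s for the reply."""
    reply_box = queue.Queue(maxsize=1)
    target.requests.put((payload, reply_box))
    try:
        return reply_box.get(timeout=timeout_s)
    except queue.Empty:
        raise IpcTimeout(f"domain {target.id} did not reply") from None
```

If nobody is serving the target domain, `ipc_call(domain, "ping", 0.05)` raises `IpcTimeout` instead of blocking forever. The timeout only turns the hang into an error; the caller still has to decide whether to retry or give up.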

To SpyderTL:

SpyderTL wrote:So, obviously, the CPU can't "see" the data at this address directly. It would be expecting there to be RAM or a memory-mapped device located at that address. So this implies that you are using virtual addresses (pages) which themselves are mapped to physical memory so that the CPU can actually see the data. But it seems like you now have a chicken-and-egg scenario, where virtual memory is copied to physical memory, which can be paged out to the hard drive, which is memory mapped to virtual memory, which is paged into physical memory, and so on.

You should read about the page fault interrupt (http://en.wikipedia.org/wiki/Page_fault).

SpyderTL · Post by **SpyderTL** » Mon Apr 21, 2014 10:24 am

I understand how page faults work. My issue is with the recursive nature of mapping physical memory to virtual memory, and device storage "memory" to physical memory. After posting my last post, I've been thinking about how to correctly word my concern. This is the best that I've come up with so far:

The way I see it, the VMM functionality was created to allow you to use hard drive storage to "virtually" increase your physical memory. Your concept is flipping this around, and using physical memory to "virtually" access hard drive storage. By itself, this seems like it should be feasible.

However, the problem comes if/when you also want to use the same mechanism for both swapping physical memory to disk, and "swapping" disk storage to physical memory. At some point, you are probably going to run into a situation where the data that started on the hard drive is going to be swapped back to the hard drive in a different location. Or your physical memory is going to get swapped to the hard drive, and then back into physical memory at a different location, and I can imagine this cascading into a possible endless loop.

You would need some way to prevent this endless loop, but I can't seem to think of a way to pull this off without getting dizzy and light headed.

The only way I can see this working would be to either abandon the virtual memory approach (no swapping to disk), or to, say, partition the disk into data and swapfile partitions and not map the swapfile area into your single address space.

I'm not saying that it's impossible. I'm just saying that it makes my head hurt the more I think about it...

Rusky · Post by **Rusky** » Mon Apr 21, 2014 6:42 pm

Just look at normal RAM usage as virtually accessing the swap file, a specific case of accessing any old file. Linux essentially does this: mmap is used both to map in files lazily (including executables) and to allocate memory.
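
To illustrate the two uses of mmap mentioned here, a small user-space sketch with Python's standard `mmap` module: one file-backed mapping and one anonymous mapping that acts as plain memory. The function name `demo` and the temporary file are only for illustration:

```python
import mmap
import os
import tempfile


def demo():
    # File-backed mapping: the file's bytes appear as ordinary memory.
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello from disk")
        with mmap.mmap(fd, 0) as file_map:   # length 0 maps the whole file
            from_file = bytes(file_map[:5])  # read through the mapping
            file_map[0:5] = b"HELLO"         # write through the mapping
            file_map.flush()
        os.lseek(fd, 0, os.SEEK_SET)
        on_disk = os.read(fd, 5)             # the file itself has changed
    finally:
        os.close(fd)
        os.remove(path)

    # Anonymous mapping: no file behind it, used like allocated memory.
    with mmap.mmap(-1, 4096) as anon_map:
        anon_map[0:3] = b"abc"
        from_anon = bytes(anon_map[0:3])

    return from_file, on_disk, from_anon
```

Both mappings are accessed in exactly the same way; the only difference is what backs the pages.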

max · Post by **max** » Tue Apr 22, 2014 2:19 am

Hey ababo,

I haven't read everything in this thread yet^^ but I can tell you about my experiences, with a - I think quite similar - approach.
My OS also has a flat memory model. There's a (Java) virtual machine running inside of it, too; therefore all processes run in kernel mode. Paging is enabled, but all processes use the same page directory. The lower megabyte is identity-mapped for the kernel code and data, and everything above it is a contiguous area of free memory used as the system heap.

One problem I ran into lately was malloc. Imagine the following scenario:
- there's a process running, doing some stuff and mallocing a block
- then, an interrupt comes up (for example, a key hit)
- the interrupt handler calls some Java method in the virtual machine
- the VM tries to do malloc, too - problem: the thread is already doing malloc
If I use a mutex here, this would be a deadlock.
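
The scenario above can be sketched in user space, assuming the heap is guarded by a plain non-reentrant lock and the interrupt is simulated as a callback that runs on the same thread in the middle of the allocation. `heap_lock`, `malloc` and the `interrupt` parameter are hypothetical stand-ins, and a try-lock is used so the sketch reports the deadlock instead of hanging:

```python
import threading

heap_lock = threading.Lock()  # non-reentrant, like a plain mutex


def malloc(size, interrupt=None):
    """Allocate size bytes; `interrupt` simulates an IRQ firing mid-call."""
    # A blocking acquire would hang here forever on re-entry, so the
    # sketch uses a try-lock and raises instead.
    if not heap_lock.acquire(blocking=False):
        raise RuntimeError("deadlock: malloc re-entered with heap lock held")
    try:
        if interrupt is not None:
            interrupt()  # handler runs on the same thread, lock still held
        return bytearray(size)
    finally:
        heap_lock.release()


def key_interrupt_handler():
    """Stand-in for the VM code that allocates while handling a key hit."""
    return malloc(16)
```

Calling `malloc(64)` succeeds, while `malloc(64, interrupt=key_interrupt_handler)` raises, because the handler tries to take a lock that its own thread already holds.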

My solution for this problem was using software interrupts for system calls. Then it looks like this:
- there's a process running, doing some stuff and performing a system call to malloc
- the system is now interrupted, therefore all other interrupts are waiting for the current interrupt to finish
- once malloc has finished, the interrupt returns
- the waiting keyhit-interrupt can simply call malloc, because all other calls to malloc have finished
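
A simplified model of this ordering, assuming interrupts that arrive during a system call are queued and only delivered once the call has returned. The `Kernel` class and its methods are hypothetical and stand in for what the interrupt hardware does:

```python
from collections import deque


class Kernel:
    def __init__(self):
        self.in_syscall = False
        self.pending = deque()  # interrupts waiting for the syscall to end
        self.log = []

    def raise_interrupt(self, handler):
        """Run the handler now, or queue it if a system call is active."""
        if self.in_syscall:
            self.pending.append(handler)
        else:
            handler()

    def syscall_malloc(self, size, during=None):
        """malloc as a system call; `during` simulates an event mid-call."""
        if self.in_syscall:
            raise RuntimeError("malloc re-entered")
        self.in_syscall = True
        try:
            if during is not None:
                during()
            self.log.append(f"malloc({size})")
            block = bytearray(size)
        finally:
            self.in_syscall = False
        # The system call has finished: deliver the waiting interrupts.
        while self.pending:
            self.pending.popleft()()
        return block
```

For example:

```python
k = Kernel()
k.syscall_malloc(64, during=lambda: k.raise_interrupt(lambda: k.syscall_malloc(16)))
print(k.log)  # ['malloc(64)', 'malloc(16)']
```

The key-hit handler's malloc runs only after the first malloc has completed, so the allocator is never entered twice.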

Well this is just one thing I had, maybe it helps you. It's a little bit complicated to deal with the entire reentrancy stuff when having a completely flat memory model, because all processes operate on the same methods with the same global variables. Keep that in mind

Peace

SpyderTL · Post by **SpyderTL** » Tue Apr 22, 2014 11:01 am

Rusky wrote:Just look at normal RAM usage as virtually accessing the swap file, a specific case of accessing any old file. Linux essentially does this: mmap is used both to map in files lazily (including executables) and to allocate memory.

That makes more sense. (Assuming you identity map your kernel memory)

The kernel swap file code doesn't use the virtual address space, internally, which is what was confusing me. So the worst that could happen is that malloc'd memory could be paged out to the swapfile on the hard drive, and that could be swapped in if it was ever read by an application through its virtual address, but that would pretty much be the end of it.

SpyderTL · Post by **SpyderTL** » Tue Apr 22, 2014 11:09 am

This thread makes this statement from the Wiki make sense...

Memory management

In concept, CPU caches and RAM simply become cache layers on top of your hard drive, which represents your "real" memory limitation.

embryo · Post by **embryo** » Wed Apr 23, 2014 3:47 am

max wrote:My OS also has a flat memory model. There's a (Java) virtual machine running inside of it

Interesting! Do you have a Java OS? Or just a JVM as an OS process? Why is Java there? What is it for?

max · Post by **max** » Wed Apr 23, 2014 4:21 am

embryo wrote:
max wrote:My OS also has a flat memory model. There's a (Java) virtual machine running inside of it
Interesting! Do you have a Java OS? Or just a JVM as an OS process? Why is Java there? What is it for?

Hey embryo,
I've seen your work but haven't found the time to read your entire description yet; I'll do that soon.

We have developed a JVM and the kernel is there to serve the very low-level needs of the VM. On startup, paging, memory allocation and the scheduler are set up, then control is handed to the VM. Threads in the VM are equal to system processes. Then, the "ghost.Startup" class is loaded, which opens a console. All the drivers are written in Java, too. This is done by providing some JNI methods (CPU.writePort, CPU.readPort, Memory.write, Memory.read, Memory.copyArrayToMemory and so on) that allow low-level accesses (soon we'll add security features by restricting the classloaders to deny access to these classes for standard classes).

Also, there is an interrupt bridge: when an interrupt needs to be handled, a Java class is called (namely "ghost.system.Interrupts") via JNI calls. This class then redirects the interrupt to the right handler, for example to map a key interrupt to a real key and add it to the key buffer.
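
The dispatch idea can be sketched roughly as follows. This is not the actual "ghost.system.Interrupts" code: the class layout, the method names and the tiny scancode table are assumptions made only to show the redirect-to-handler step:

```python
class Interrupts:
    """Hypothetical bridge: the kernel calls dispatch() for every interrupt."""

    def __init__(self):
        self.handlers = {}

    def register(self, irq, handler):
        self.handlers[irq] = handler

    def dispatch(self, irq, data):
        """Redirect the interrupt to its handler; report whether one existed."""
        handler = self.handlers.get(irq)
        if handler is None:
            return False
        handler(data)
        return True


class Keyboard:
    # Illustrative subset of a scancode-to-key table (assumption).
    KEYMAP = {0x1E: "a", 0x30: "b", 0x2E: "c"}

    def __init__(self):
        self.buffer = []  # the key buffer filled by the handler

    def on_interrupt(self, scancode):
        key = self.KEYMAP.get(scancode)
        if key is not None:
            self.buffer.append(key)
```

A driver would register itself once, for example `bridge.register(1, keyboard.on_interrupt)`, and every later `bridge.dispatch(1, scancode)` ends up in the keyboard's buffer.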

Your approach is very interesting too. I'll read a little more about it and then I can give you more feedback.

embryo · Post by **embryo** » Wed Apr 23, 2014 5:52 am

Hey max,

max wrote:We have developed a JVM and the kernel is there to serve the very lowlevel needs of the VM...

Do you have a thread here about your project? It would be better to talk there so as not to derail this thread.

However, your approach is interesting and I still have some questions

max · Post by **max** » Wed Apr 23, 2014 6:06 am

embryo wrote:Hey max,
max wrote:We have developed a JVM and the kernel is there to serve the very lowlevel needs of the VM...
Do you have a thread here about your project? It would be better to talk there so as not to derail this thread.

However, your approach is interesting and I still have some questions

I'll make a thread and notify you via PM

OSDev.org
