Seriously simple VMM design I came up with yesterday
Posted: Tue May 09, 2006 4:52 am
First of all, sorry for the seriously long post (the long poster is back, please panic). I hope that the text is relatively easy to read though.
Ok, this thing assumes there's some IPC facility, because it's going to be built on top of a microkernel, but the same thing could be used for macrokernels, in which case the whole thing would be even simpler.
I'm going to implement this hopefully today, but thought I'd throw it here, so people can review if it has some stupid flaws.
Terminology I'm going to use:
- frame: about 4k chunk of physical memory
- page: about 4k chunk of virtual memory
- object: something that can be sent messages
- (page)directory: "hardware" table from virtual to physical
Design is (basically) this:
1. There's a big array with one entry per frame. Each entry has a map_count and an object*.
2. Each "present" directory entry points to a frame. Obviously.
Each "not present" directory entry contains an object*.
3. The magical object is the eternal representation of a page. This object knows if the page is resident on some frame. It also knows how many times the page is mapped. Finally, it knows whether the resident page is dirty, and knows where its backing store is.
So starting from a virtual address, we have either a pointer to an object or a frame number. The object knows how to give us a frame with the page, and the frame array tells us what page is on a given frame.
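To make the three pieces concrete, here's a rough sketch of how they could look in C. The field names, the bit-0 "present" flag and the 12-bit frame shift are my assumptions for illustration (loosely x86-like), not a finished layout; a real directory entry has more flag bits to respect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_PRESENT 0x1u  /* assumed: bit 0 marks a present entry */
#define FRAME_SHIFT 12    /* assumed: 4k frames */

/* The "eternal" representation of one page (hypothetical fields). */
struct page_object {
    int      resident;     /* non-zero if the page currently sits on a frame */
    uint32_t frame;        /* frame number, valid only when resident */
    unsigned map_count;    /* number of virtual mappings of this page */
    int      dirty;        /* resident copy differs from backing store */
    uint64_t store_offset; /* where the backing store keeps the page */
};

/* One entry of the big per-frame array. */
struct frame_entry {
    unsigned            map_count; /* present entries pointing at this frame */
    struct page_object *object;    /* page held by this frame, or NULL */
};

typedef uintptr_t pte_t; /* one directory entry, one machine word */

static pte_t pte_make_present(uint32_t frame)
{
    return ((uintptr_t)frame << FRAME_SHIFT) | PTE_PRESENT;
}

static pte_t pte_make_absent(struct page_object *obj)
{
    /* the object pointer must leave bit 0 clear, so alignment matters */
    assert(((uintptr_t)obj & PTE_PRESENT) == 0);
    return (uintptr_t)obj;
}

/* From any entry, find the page object: directly, or via the frame array. */
static struct page_object *pte_object(pte_t pte, struct frame_entry *frames)
{
    if (pte & PTE_PRESENT)
        return frames[pte >> FRAME_SHIFT].object;
    return (struct page_object *)pte;
}
```

Either way you start, you end up at the same page object, which is the whole point of the design.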
The object knows how many times it's mapped (that is, virtual mappings) and a frame knows how many times it's mapped (that is, physical mappings). There's no way to figure out WHERE a given page (or frame) is mapped, though.
So there's a clock-collector, which has access to a process list, where processes are moved to the top of the list each time they either visit a CPU or have been scanned. The clock then just scans through the page directories one process at a time, starting from the process that's at the bottom of the list (that is, longest time not run and not scanned).
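The process list could be as simple as a doubly linked list with a "touch" operation. This is only a sketch; the names (proc_list, list_touch, list_next_victim) are made up for illustration and there's no locking here.

```c
#include <assert.h>
#include <stddef.h>

struct process {
    struct process *prev, *next;
    int pid;
};

struct proc_list {
    struct process *head; /* most recently run or scanned */
    struct process *tail; /* longest time not run and not scanned */
};

/* Remove p from the list; a no-op if p is not linked in. */
static void list_unlink(struct proc_list *l, struct process *p)
{
    if (p->prev)
        p->prev->next = p->next;
    else if (l->head == p)
        l->head = p->next;
    if (p->next)
        p->next->prev = p->prev;
    else if (l->tail == p)
        l->tail = p->prev;
    p->prev = p->next = NULL;
}

/* Call when p visits a CPU or has just been scanned: move it to the top. */
static void list_touch(struct proc_list *l, struct process *p)
{
    list_unlink(l, p);
    p->next = l->head;
    if (l->head)
        l->head->prev = p;
    else
        l->tail = p;
    l->head = p;
}

/* The clock starts scanning at the bottom of the list. */
static struct process *list_next_victim(const struct proc_list *l)
{
    return l->tail;
}
```

After scanning a victim, the collector would call list_touch() on it, so the next scan picks whoever is now at the bottom.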
For each accessed page, just reset the accessed bit.
For each unaccessed page, notify the page object if it's dirty, and unmap the frame. Normal clock policy. As an added bonus, one can declare the process "collected" after a given number of frames have been unmapped, to avoid stealing all of its pages at once.
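Roughly, the scan over one page table might look like this. The flag values follow the usual x86 bit positions (present, accessed, dirty), but the function name, the quota parameter and the flat table are assumptions for the sketch. A real kernel would also have to invalidate the TLB entry for every page it touches here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_PRESENT  0x01u
#define PTE_ACCESSED 0x20u
#define PTE_DIRTY    0x40u
#define FRAME_SHIFT  12

struct page_object { unsigned map_count; int dirty; };
struct frame_entry { unsigned map_count; struct page_object *object; };

/* Scan n entries of one table, unmapping at most quota frames.
 * Returns how many frames were unmapped. */
static size_t clock_scan(uintptr_t *table, size_t n,
                         struct frame_entry *frames, size_t quota)
{
    size_t unmapped = 0;

    for (size_t i = 0; i < n && unmapped < quota; i++) {
        uintptr_t pte = table[i];

        if (!(pte & PTE_PRESENT))
            continue; /* already holds an object*, nothing to do */

        if (pte & PTE_ACCESSED) {
            /* used recently: clear the bit and give it another round */
            table[i] = pte & ~(uintptr_t)PTE_ACCESSED;
            continue;
        }

        struct frame_entry *f = &frames[pte >> FRAME_SHIFT];
        struct page_object *obj = f->object;

        if (pte & PTE_DIRTY)
            obj->dirty = 1; /* notify the page object */

        f->map_count--; /* one physical mapping fewer */
        /* the virtual mapping stays, so obj->map_count is untouched */
        table[i] = (uintptr_t)obj; /* not-present entry holds the object* */
        unmapped++;
    }
    return unmapped;
}
```

The quota is the "collected after a given number of frames" idea: once it's hit, the scan stops and the process goes back to the top of the list.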
Once a frame has a mapping count of zero, it is either freed (if it's clean) or sent to backing store if it's dirty (and freed once written, if nobody has needed it in the meantime).
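The decision for a frame whose count just dropped could be sketched like this. The enum and function names are hypothetical, and the actual write to store is left out; frame_write_done() stands for whatever runs when the store reports the write as finished.

```c
#include <assert.h>
#include <stddef.h>

struct page_object { int resident; int dirty; };
struct frame_entry { unsigned map_count; struct page_object *object; };

enum frame_action { FRAME_KEEP, FRAME_FREE, FRAME_WRITEBACK };

/* What to do with a frame after one of its mappings was removed. */
static enum frame_action frame_unmapped(const struct frame_entry *f)
{
    if (f->map_count > 0)
        return FRAME_KEEP; /* still mapped somewhere */
    return f->object->dirty ? FRAME_WRITEBACK : FRAME_FREE;
}

/* Detach the page from the frame so the frame can be reused. */
static void frame_free(struct frame_entry *f)
{
    f->object->resident = 0;
    f->object = NULL;
}

/* Called once the write to store has finished.
 * Returns 1 if the frame was freed, 0 if somebody mapped it meanwhile. */
static int frame_write_done(struct frame_entry *f)
{
    f->object->dirty = 0;
    if (f->map_count > 0)
        return 0; /* it was needed again: keep it resident */
    frame_free(f);
    return 1;
}
```

Note that a frame being written stays usable: if somebody faults on the page before the write completes, it's simply mapped again and never freed.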
Shared memory can be done simply by mapping the same page object in several address spaces (or even several places in one address space) and copy-on-write is trivially done by having the object know it's a copy of something else.
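For the copy-on-write part, one possible shape is a pointer from the copy back to its source. This is a simplified sketch: plain buffers stand in for frames, the names copy_of, page_read and page_write_fault are invented here, and it assumes the source page is kept read-only for as long as copies refer to it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096

struct page_object {
    struct page_object *copy_of; /* source page, or NULL if not a copy */
    unsigned char      *data;    /* private contents, or NULL if none yet */
};

/* Reads follow the chain until some object actually holds the data. */
static const unsigned char *page_read(const struct page_object *obj)
{
    while (obj->data == NULL && obj->copy_of != NULL)
        obj = obj->copy_of;
    return obj->data;
}

/* Write fault: give the object its own copy in `fresh` if it has none.
 * Returns 1 if a copy was made, 0 if the page was already private. */
static int page_write_fault(struct page_object *obj, unsigned char *fresh)
{
    if (obj->data != NULL)
        return 0;

    const unsigned char *src = page_read(obj);
    assert(src != NULL); /* assumed: the source always has contents */
    memcpy(fresh, src, PAGE_SIZE);
    obj->data = fresh;
    obj->copy_of = NULL; /* no longer depends on the source */
    return 1;
}
```

Until the first write, the copy costs nothing but the small object itself.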
Since everything is done with single pages/frames, there are no complicated data structures or anything like that to care about. Once the page has been unmapped (map count zero) the page object can be destroyed.
If backing store is in userspace, the page object simply needs to message it with the details of "read me this" and "write this" and maybe "I don't need you anymore".
As long as the page is on some frame, the same frame can be mapped elsewhere. As long as the frame is mapped somewhere, we never write it to store. So there is (slow) userspace traffic only when the operation would be slow anyway, namely, loading page from store, and writing page to store.
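The three messages and the "no traffic unless it's slow anyway" rule could look something like this. The message layout, page_id and the function name are all assumptions; how the message actually gets sent depends on the IPC facility, which isn't shown.

```c
#include <assert.h>
#include <stdint.h>

/* "read me this", "write this", "I don't need you anymore" */
enum store_op { STORE_READ, STORE_WRITE, STORE_RELEASE };

struct store_msg {
    enum store_op op;
    uint64_t      page_id; /* hypothetical: how the store names the page */
    uint32_t      frame;   /* frame to fill or write out; unused for RELEASE */
};

struct page_object {
    uint64_t page_id;
    int      resident;
    uint32_t frame;
    int      dirty;
};

/* On a fault: does the store need to hear about it?
 * Returns 1 and fills *msg if a read is needed, 0 if the page is
 * already on a frame and can just be mapped. */
static int page_in_request(const struct page_object *obj,
                           uint32_t free_frame, struct store_msg *msg)
{
    if (obj->resident)
        return 0; /* resident: map the existing frame, no userspace traffic */

    msg->op = STORE_READ;
    msg->page_id = obj->page_id;
    msg->frame = free_frame;
    return 1;
}
```

A fault on a resident page never leaves the kernel; only a genuine page-in produces a message.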
Finally, demand loading needs page objects created when a file is mapped, but then proceeds exactly as a normal page-in.
Disadvantages:
1. Microkernel purists won't like the central paging policy.
2. Doing everything per page consumes some memory for each frame, and some memory for each page. Some memory will be wasted anyway, and I think a few dozen bytes for each virtual page is ok. (I'm going to keep them in non-pageable kernel heap for now, possibly moving them elsewhere if it seems to be a problem.)
3. There's no easy way to unmap a specific frame, other than scanning directories until nobody has it mapped. I don't think I want to support that anyway.
----
The big question now is, why haven't I done this before?
(Somebody here will probably tell me in a minute.)
PS: has the post length limit finally been raised?