Hi,
wlnirvana wrote:Hi Brendan, thank you very much for the detailed response. However, I am still not entirely clear about it.
Brendan wrote:Then I had a function to find an unused process ID, and once a process ID was assigned I could do "new_page_directory_virtual_addr = (new_process_ID << 12) + page_directory_mapping_area_start;" to figure out where the page directory for that process was mapped and copy the kernel's page directory entries into it (which would "auto-allocate" the physical page if it wasn't already allocated because the whole area is "allocate on write").
Does this imply that all page directories are assigned virtual addresses (within the kernel memory space) at fixed offsets from page_directory_mapping_area_start, with the process ID supplying everything above the 12 offset bits?

If each page directory is 4 KiB (i.e. the same size as a page), and for simplicity we assume the lower 12 bits of page_directory_mapping_area_start are all zeros (i.e. (page_directory_mapping_area_start & 0x00000FFF) == 0), then the page directory of process x would take exactly the whole 4 KiB virtual page located at x << 12. (If I may, although not very relevant, I would like to confirm that this address x << 12 itself will be translated to a physical one by looking up the kernel's page directory.)
If my understanding above is correct, then to avoid the use of a heap (or any other dynamic memory management) in the virtual memory manager, a lot of virtual pages in the kernel space have to be reserved for potential paging usage. This is because the process ID can range from 0 to, say, 65535. Consequently:
The page directory of process 0 will be held at virtual address 0x00000000, and it is going to take up to 0x00000FFF.
The page directory of process 1 will be held at virtual address 0x00001000, and it is going to take up to 0x00001FFF.
...
The page directory of process 65535 will be held at virtual address 0x0FFFF000, and it is going to take up to 0x0FFFFFFF.
To support up to 65536 processes without making paging depend on a heap, all the kernel virtual address space from 0x00000000 to 0x0FFFFFFF must be reserved so that the page directory of process x can be mapped with a guarantee!!
That's mostly correct; except that you're forgetting that there's a "page_directory_mapping_area_base", so:
- The page directory of process 0 will be held at virtual address "page_directory_mapping_area_base + 0x00000000"
- The page directory of process 1 will be held at virtual address "page_directory_mapping_area_base + 0x00001000"
...
- The page directory of process 65535 will be held at virtual address "page_directory_mapping_area_base + 0x0FFFF000"
In other words, you can do:
Code: Select all
page_directory_virtual_address = page_directory_mapping_area_base + (process_ID << 12);
Note that I use the "recursive mapping trick" so that the page table entries for the current virtual address space are also accessible; and because kernel space is always the same in all virtual address spaces, this means that the page table entries for the kernel's pages are always accessible. In that case, you can fetch the physical address of a page directory from the page tables, like:
Code: Select all
page_directory_physical_address = *(uint32_t *)((page_directory_virtual_address >> 12) * 4 + recursive_mapping_base) & 0xFFFFF000;
And if you don't need the virtual address you can combine both of these to get:
Code: Select all
K = (page_directory_mapping_area_base >> 12) * 4 + recursive_mapping_base;
page_directory_physical_address = *(uint32_t *)(K + process_ID * 4) & 0xFFFFF000;
Note that K is a constant that can be calculated at compile time; so this ends up being two instructions - e.g. "mov eax,[K + eax*4]" followed by "and eax,0xFFFFF000".
In other words; to determine the virtual address of a page directory from a process ID it takes two instructions and no memory accesses (and no potential cache misses); and to determine the physical address of a page directory from a process ID it takes two instructions and one memory access (and one potential cache miss). Of course you'd also cache the physical address of a process' page directory in each of its threads' "thread data" so that it's a little more convenient to set CR3 during task switches.
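To make both lookups concrete, here's a minimal C sketch. The two base constants are hypothetical values chosen for illustration (your kernel's actual layout will differ), and pd_physical_address only works inside a kernel that really has the recursive mapping set up:

```c
#include <stdint.h>

/* Hypothetical layout constants; real values depend on where your
 * kernel puts these areas in its virtual address space. */
#define PAGE_DIR_MAPPING_AREA_BASE  0xE0000000u
#define RECURSIVE_MAPPING_BASE      0xFFC00000u

/* Virtual address of a process' page directory: pure arithmetic,
 * no memory accesses, no potential cache misses. */
static inline uint32_t pd_virtual_address(uint32_t process_ID)
{
    return PAGE_DIR_MAPPING_AREA_BASE + (process_ID << 12);
}

/* Address of the page table entry that maps that page directory;
 * the "K" part folds to a compile-time constant. */
static inline uint32_t pd_pte_slot(uint32_t process_ID)
{
    const uint32_t K = (PAGE_DIR_MAPPING_AREA_BASE >> 12) * 4
                     + RECURSIVE_MAPPING_BASE;
    return K + process_ID * 4;
}

/* Physical address of the page directory: one memory access into
 * the recursively mapped page tables (kernel context only). */
static inline uint32_t pd_physical_address(uint32_t process_ID)
{
    return *(volatile uint32_t *)(uintptr_t)pd_pte_slot(process_ID)
           & 0xFFFFF000u;
}
```

With these constants, the whole physical-address lookup is exactly the "mov eax,[K + eax*4]" / "and eax,0xFFFFF000" pair described above.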
I also use a similar approach for other things. For example; each process has a "process data area" (to store the process' name, how much CPU time it consumed, which threads belong to it, if it has special access to range/s of IO ports or memory mapped IO areas, etc), and each thread has a "thread data area" (to contain the thread's state used during task switches, for its kernel stack, "link" fields used for scheduler's queues, etc).
For message queues; in my case, each message queue is implemented as "linked list of 4 KiB blocks, where each block contains one or more variable length entries (one variable length entry per message)", and where memory management is done with an O(1) "stack of 4 KiB message queue blocks". A large area of kernel space is set aside for these message queue blocks (e.g. from 0xC0000000 to 0xD0000000); and that area uses the same "allocate on write" approach that's used for the "page directory mapping area" (and the "process data mapping area" and the "thread data mapping area").
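As a rough sketch of that O(1) "stack of 4 KiB message queue blocks" (the names here are mine, not Brendan's; a real kernel would initialise next_fresh_block/area_end to the reserved "allocate on write" area, e.g. 0xC0000000 to 0xD0000000):

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 4096u

/* Head of the free stack; NULL means "no previously freed block, so
 * take a fresh block from the never-used part of the area". */
static void *free_stack_head = NULL;
static uintptr_t next_fresh_block;   /* next never-used block */
static uintptr_t area_end;           /* exclusive end of the area */

/* O(1) pop: reuse the most recently freed block if any, otherwise
 * carve a fresh block off the area (writing to it is what triggers
 * the kernel's "allocate on write" physical allocation). */
void *block_alloc(void)
{
    if (free_stack_head != NULL) {
        void *block = free_stack_head;
        free_stack_head = *(void **)block;  /* next link lives in the block */
        return block;
    }
    if (next_fresh_block >= area_end) {
        return NULL;                        /* area exhausted */
    }
    void *block = (void *)next_fresh_block;
    next_fresh_block += BLOCK_SIZE;
    return block;
}

/* O(1) push: store the old head in the first bytes of the freed block. */
void block_free(void *block)
{
    *(void **)block = free_stack_head;
    free_stack_head = block;
}
```

Because the free list threads through the blocks themselves, the allocator needs no metadata of its own, which is why it stays so much simpler and faster than a general-purpose heap.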
Of course a micro-kernel doesn't contain much more than this (other than a few smaller things in the ".bss", etc). In other words; the physical memory management mostly uses free page stacks and doesn't need a heap; the virtual memory management is mostly done with page tables (and the "page directory mapping area") and doesn't need a heap; all of the process and thread management and scheduling is done with "process/thread data mapping areas" that don't need a heap; messaging/message queues have their own "stack of 4 KiB blocks" allocator that is much simpler and faster than a heap; and because it's a micro-kernel there's almost nothing else in kernel space so the kernel has no need for a heap.
There is one thing that I haven't mentioned (and probably should). Because almost everything uses "allocate on write", over time the amount of physical RAM the kernel is using grows. For example, if you create 100 processes it will allocate about 1200 KiB of physical RAM (for page directories, process data structures and one thread data structure per process); and when you destroy those 100 processes the physical RAM that was allocated will remain allocated (to improve performance by avoiding the need to re-allocate it the next time it's used).

If/when the kernel is running low on physical RAM it asks all of the pieces to free some; and all the pieces (virtual memory manager, scheduler, messaging code, etc) have code to find "previously allocated but currently unused" pages and convert them back to "allocate on write" (causing the physical RAM to be freed). This is typically relatively simple - e.g. if you have code to find a free process ID (or find a free thread ID, or...) then it's not that hard to find all of the free process IDs (or all of the free thread IDs, or...) and tell the virtual memory manager to make sure the corresponding pages are "allocate on write" (even if they already are). Once that's done the virtual memory manager may scan kernel space looking for page tables that can also be freed.

Of course this is typically done in a more progressive way (e.g. if the kernel is only slightly worried about free physical memory then only free some pages and stop, without freeing as much as possible) and is sensitive to CPU load (e.g. be more aggressive about freeing pages if the kernel has nothing better to do with CPU time), in the hope of avoiding/reducing "garbage collection stalls" under load.
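A minimal sketch of one such reclamation pass, for the page directory mapping area. The bitmap representation and the vmm_mark_allocate_on_write() call are my assumptions (here the VMM call is a counting stub; in a real kernel it would revert the page's mapping and free its physical frame):

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_PROCESSES  65536u
#define PD_AREA_BASE   0xE0000000u   /* hypothetical layout constant */

/* One bit per process ID, set while the ID is in use (hypothetical
 * representation - any "find free ID" structure would work). */
static uint32_t process_id_bitmap[MAX_PROCESSES / 32];

/* Stub standing in for the real VMM call that reverts one page to
 * "allocate on write" (freeing its physical frame if it has one). */
static uint32_t pages_reverted;
static void vmm_mark_allocate_on_write(uint32_t virtual_address)
{
    (void)virtual_address;
    pages_reverted++;
}

/* Walk the free process IDs and hand their page directory pages back;
 * "max_pages" lets the caller stop early when it's only slightly
 * worried about free physical memory, or sweep everything when the
 * CPU has nothing better to do. */
static uint32_t reclaim_page_directories(uint32_t max_pages)
{
    uint32_t freed = 0;
    for (uint32_t id = 0; id < MAX_PROCESSES && freed < max_pages; id++) {
        bool in_use = (process_id_bitmap[id / 32] >> (id % 32)) & 1u;
        if (!in_use) {
            vmm_mark_allocate_on_write(PD_AREA_BASE + (id << 12));
            freed++;
        }
    }
    return freed;
}
```

The same loop shape works for thread data areas, process data areas and message queue blocks; only the "is this slot in use" test and the base address change.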
Cheers,
Brendan