Virtual Address Allocation
Posted: Fri Jun 13, 2014 12:06 pm
In my OS, I'm starting to implement basic memory management components. I've currently created a free list manager, which passes out free pages from a linked list (where there are lists per-cache color per-NUMA domain), and paging code, which creates a new page directory for the kernel, modules, and data structures and switches to it. Now I want to implement a kernel heap, since most other components of the kernel will depend on it. This made me step back a little bit and think about how the heap is allocated, not in physical memory (I have that down already), but in virtual memory. I'm not satisfied with just placing the heap at some static location, I want to be able to manage the kernel address space dynamically. Furthermore, other components like the module loader will need this support as well in order to place modules somewhere in the address space. This got me wondering about how I would actually go about implementing a virtual address allocator.
For simplicity, I could implement a virtual address allocator by just keeping a pointer to free virtual addresses and incrementing that pointer every time an allocation is made. I actually do this early on in my bootloader, since it needs to allocate several data structures for the kernel (namely the page frame database, per-CPU and NUMA domain areas, Local APIC MMIO region). Such a method is fine in a bootloader, since those virtual addresses will never be reclaimed, but it's not ideal in a kernel where you can free regions and create holes in the address space. Say the kernel allocates 4 pages, a module allocates 4 pages, the module frees all those pages, and then the kernel allocates a page, you'd be ignoring 4 virtual address space pages.
Another idea (which I got from reavengrey on the IRC) is managing the virtual address space using a linked list of free regions. Although it seems like you'd require a working kernel heap for this, I realized (or so I thought) that you actually don't. You can start out with one large, statically-allocated, region entry covering the whole address space. Each time you allocate a virtual address, you go through the linked list, find the first entry with a suitable size, and modify it to account for the fact that you allocated some pages. Allocating a new region entry is only required if you free some virtual addresses and end up creating a hole between two other free regions.
However, I then realized that upon allocating a virtual address range, you'd want to move it into a different data structure to keep track of used memory. This would require allocating new region entries upon allocations, creating a deadlocked dependency between the virtual address allocator and the kernel heap. Say you allocate 0x1000 bytes of kernel address space. This requires allocating a new region entry, so the memory manager asks the kernel heap for some memory for that. Let's say that the heap is out of virtual address space and needs more, so it decides to ask the kernel for more. Now it's impossible to continue and the kernel deadlocks. This scenario actually seems pretty likely, since the only way to expand the kernel heap is to get more space for virtual addresses.
So anyway, my question is what people think of these 2 mechanisms, how could they potentially be improved, or what are other methods I could use?
For simplicity, I could implement a virtual address allocator by just keeping a pointer to free virtual addresses and incrementing that pointer every time an allocation is made. I actually do this early on in my bootloader, since it needs to allocate several data structures for the kernel (namely the page frame database, per-CPU and NUMA domain areas, Local APIC MMIO region). Such a method is fine in a bootloader, since those virtual addresses will never be reclaimed, but it's not ideal in a kernel where you can free regions and create holes in the address space. Say the kernel allocates 4 pages, a module allocates 4 pages, the module frees all those pages, and then the kernel allocates a page, you'd be ignoring 4 virtual address space pages.
Another idea (which I got from reavengrey on the IRC) is managing the virtual address space using a linked list of free regions. Although it seems like you'd require a working kernel heap for this, I realized (or so I thought) that you actually don't. You can start out with one large, statically-allocated, region entry covering the whole address space. Each time you allocate a virtual address, you go through the linked list, find the first entry with a suitable size, and modify it to account for the fact that you allocated some pages. Allocating a new region entry is only required if you free some virtual addresses and end up creating a hole between two other free regions.
However, I then realized that upon allocating a virtual address range, you'd want to move it into a different data structure to keep track of used memory. This would require allocating new region entries upon allocations, creating a deadlocked dependency between the virtual address allocator and the kernel heap. Say you allocate 0x1000 bytes of kernel address space. This requires allocating a new region entry, so the memory manager asks the kernel heap for some memory for that. Let's say that the heap is out of virtual address space and needs more, so it decides to ask the kernel for more. Now it's impossible to continue and the kernel deadlocks. This scenario actually seems pretty likely, since the only way to expand the kernel heap is to get more space for virtual addresses.
So anyway, my question is what people think of these 2 mechanisms, how could they potentially be improved, or what are other methods I could use?