Implementing a Virtual Address Allocator with a Balanced Binary Tree

zerone015 · Post by **zerone015** » Thu Mar 13, 2025 6:26 am

Hello,

I have finished implementing a physical frame allocator and am now working on a virtual address allocator. I plan to implement it using a balanced binary tree because, depending on the choice of key, allocation and merging can be faster than a flat list.

However, my plan might have issues, as I lack extensive kernel development experience and find it difficult to consider various scenarios. So, I’d like to ask for advice.

Maintaining Only Available Spaces:
I am considering maintaining only nodes for available memory spaces. To determine the size of an allocated space, I plan to use one of the bits from positions 9–11 in the page table entry as a contiguous bit.
When freeing memory, I will traverse the page table using the virtual address to determine the total size and then free it. Because of this, I don’t see the need to include additional nodes for allocated spaces.
Of course, maintaining those nodes would allow for very fast size retrieval, but it would also consume more memory and increase N, which seems unnecessary. Does this approach seem reasonable?

Choosing the Key:
I am debating whether to use size or base address as the key.
If I use size, allocation will be faster.
If I use base address, merging during deallocation will be faster.
However, since deallocation always follows allocation and most memory (except the heap) will not be freed until the process exits, it seems better to prioritize fast allocation. Is this the right approach?

Efficiently Reclaiming Frames on Process Exit:
I have not yet implemented process-related functionality, but in the next step, when processes are introduced, the kernel will need to reclaim physical frames when a process calls exit().
My current idea is to iterate through the page table of the exiting process and call frame_free() for every entry where the Present bit is set. However, if the process was using the full 3GB of virtual address space, this would require scanning 768KB entries, which seems inefficient. How can I improve this process?

Any insights or suggestions would be greatly appreciated! Thank you.

nullplan · Post by **nullplan** » Thu Mar 13, 2025 10:00 am

Honestly, I would make two virtual allocators, one for the kernel and one for user space.

See, I'm building a Unix-like system, and in such systems, all of userspace is either free or a file mapping. Some of it is mapping anonymous files, granted, (so they have no name and only exist in memory) but files nonetheless.

But in kernel space, virtual addresses work completely differently. Maintaining only available spaces actually makes sense there. It doesn't make sense in user space, where you need to be able to react to a page fault by loading the necessary part of the underlying file.

In both cases, I wouldn't start with "I wanna use a tree", but more with "I need a collection, let's see which one is best here." I think for the kernel side you will need a collection of free space nodes keyed by size, so when an allocation request comes in you can quickly identify the best fit. So you'd need a collection that quickly allows you to identify the smallest free chunk above the requested size. And look at that, a tree can do that. A tree keyed by size!

For deallocation you will want to look up the adjacent addresses, yes, so you can merge and prevent memory fragmentation. So here's my heretical proposal: Have two trees.

For the userspace allocator, I would have a similar structure, since you can be asked to allocate virtual memory by size. But for managing the space on process exit, I would suggest a different design: You use a different allocator for each virtual space. So you have your two trees for the kernel, and your structures as needed for each virtual space. Once the use count of the virtual space drops to zero, you go through all the file maps in the file mapping tree, and unmap each of them. If that makes a file's refcount drop to zero, you close and free the file (and if the file was anonymous, it will cease to be at that point).

zerone015 · Post by **zerone015** » Thu Mar 13, 2025 11:18 am

My goal is to implement a kernel similar to UNIX. I knew that UNIX manages everything as files, but I wasn't familiar with the details, so the information you provided about user space was very helpful. Thank you.

However, there is something I find puzzling. Is it really effective to have two trees in kernel space?

I was thinking about having one tree with the size as the key and another with the base address as the key.
The idea is that when allocating, we would search the tree with the size as the key, and when deallocating, we would search the tree with the base address as the key.

This seems like the best approach, but the problem is that when allocating, we need to remove the node from both trees, and similarly, when deallocating, we need to add the node back to both trees and merge them. So, it seems like using two trees would double the memory usage and slow things down compared to just using one tree. Am I misunderstanding this?

nullplan · Post by **nullplan** » Thu Mar 13, 2025 11:52 am

zerone015 wrote: ↑Thu Mar 13, 2025 11:18 am This seems like the best approach, but the problem is that when allocating, we need to remove the node from both trees, and similarly, when deallocating, we need to add the node back to both trees and merge them. So, it seems like using two trees would double the memory usage and slow things down compared to just using one tree. Am I misunderstanding this?

Probably. If you only have one tree, then doing the thing that tree is not good at requires a complete search of the entire tree. If you only had one tree keyed on the base address, then finding a node in the tree that has at least a given size requires complete traversal. Similarly, if you only have one tree keyed on size, then finding the nodes for the adjacent addresses also requires complete traversal.

With the two trees approach, you look up the node you want in the tree that is good for that operation, and then you get the other datum you need to look up the same node in the other tree. For example at allocation time, you look up a node in the size tree, and then from that you get the base address to find (and delete) the node in the address tree.

Actually, now that I think about it, a simple tree might not be the best data structure for the size tree after all. The problem is that when you go left, you have no guarantee that won't take you below the limit you were looking for. And once below, you have no guarantee any amount of going right will take you above it again. You could implement an interval tree for the size tree (which is where all nodes also contain the maximum of any node below that node), but that requires more reading.

zerone015 · Post by **zerone015** » Thu Mar 13, 2025 12:07 pm

nullplan wrote: ↑Thu Mar 13, 2025 11:52 am With the two trees approach, you look up the node you want in the tree that is good for that operation, and then you get the other datum you need to look up the same node in the other tree. For example at allocation time, you look up a node in the size tree, and then from that you get the base address to find (and delete) the node in the address tree.

Oh my goodness! How did I not think of first finding a node in the tree where size is the key, and then using the base address of that node to search in the tree where base address is the key? I was so foolish. Your advice was absolutely brilliant. I'll also give deep thought to the issue you mentioned and study interval trees as well. Thank you!

thewrongchristian · Post by **thewrongchristian** » Thu Mar 13, 2025 7:18 pm

zerone015 wrote: ↑Thu Mar 13, 2025 6:26 am Hello,

I have finished implementing a physical frame allocator and am now working on a virtual address allocator. I plan to implement it using a balanced binary tree because, depending on the choice of key, allocation and merging can be faster than a flat list.

However, my plan might have issues, as I lack extensive kernel development experience and find it difficult to consider various scenarios. So, I’d like to ask for advice.

Maintaining Only Available Spaces:
I am considering maintaining only nodes for available memory spaces. To determine the size of an allocated space, I plan to use one of the bits from positions 9–11 in the page table entry as a contiguous bit.
When freeing memory, I will traverse the page table using the virtual address to determine the total size and then free it. Because of this, I don’t see the need to include additional nodes for allocated spaces.
Of course, maintaining those nodes would allow for very fast size retrieval, but it would also consume more memory and increase N, which seems unnecessary. Does this approach seem reasonable?

I mostly maintain only the address space I've allocated.

I do track kernel address space gaps, but on reflection, I don't think I need to do that, as I can derive all that information from the allocated address space as I do in the per process user address space.

zerone015 wrote: ↑Thu Mar 13, 2025 6:26 am Choosing the Key:
I am debating whether to use size or base address as the key.
If I use size, allocation will be faster.
If I use base address, merging during deallocation will be faster.
However, since deallocation always follows allocation and most memory (except the heap) will not be freed until the process exits, it seems better to prioritize fast allocation. Is this the right approach?

In my address allocator, new allocations are located by finding the first free gap between segments that will fit the new allocation. I don't keep a separate list of free regions by size, everything is done using virtual address based linear scans.

For reference, my kernel address space allocation routine, bearing in mind it's doing a linear search for the first fit hole, does not register AT ALL in my profiling tests.

zerone015 wrote: ↑Thu Mar 13, 2025 6:26 am Efficiently Reclaiming Frames on Process Exit:
I have not yet implemented process-related functionality, but in the next step, when processes are introduced, the kernel will need to reclaim physical frames when a process calls exit().
My current idea is to iterate through the page table of the exiting process and call frame_free() for every entry where the Present bit is set. However, if the process was using the full 3GB of virtual address space, this would require scanning 768KB entries, which seems inefficient. How can I improve this process?

Any insights or suggestions would be greatly appreciated! Thank you.

I manage both my kernel address space (top of address space) and the user address space each using a BST keyed by virtual address. The tree represents the map from base address of a segment of memory to a description of that segment (size of the region, memory backing information, permissions).

The kernel address space map is a single global map, shared by all processes.

The user address space map is per process.

In short, I would recommend doing the minimum you need to get it working. Keep it simple.

Don't over-optimize trying to anticipate bottlenecks. Measure to find bottlenecks, then address them as you find them.

zerone015 · Post by **zerone015** » Fri Mar 14, 2025 2:40 am

thewrongchristian wrote: ↑Thu Mar 13, 2025 7:18 pm

In my address allocator, new allocations are located by finding the first free gap between segments that will fit the new allocation. I don't keep a separate list of free regions by size, everything is done using virtual address based linear scans.

For reference, my kernel address space allocation routine, bearing in mind it's doing a linear search for the first fit hole, does not register AT ALL in my profiling tests.

I initially considered optimization because I saw the statement on the wiki: "If virtual memory gets fragmented and the list gets longer, that may become an issue." This made me concerned about potential performance degradation if the address space became heavily fragmented. But now that I think about it again, I’m not sure fragmentation would actually be severe enough to cause a bottleneck.

thewrongchristian wrote: ↑Thu Mar 13, 2025 7:18 pm
In short, I would recommend doing the minimum you need to get it working. Keep it simple.

Don't over-optimize trying to anticipate bottlenecks. Measure to find bottlenecks, then address them as you find them.

I’ll stop overthinking and focus on implementing a simple version first, then optimize after identifying any issues. Thanks for the advice!

rdos · Post by **rdos** » Fri Mar 14, 2025 8:15 am

There are two types of allocations that are relevant for both userspace and kernel: Byte aligned allocators and page aligned. I tihink you are discussing the byte aligned here, but the page aligned is important too. I implement the byte aligned allocator with a control block mechanism (I know this isn't the most effective), and the page aligned allocator in the page tables by using free bits in the page table entries. In kernel, I have separate functions for allocating byte aligned and page aligned virtual memory, and in user space the runtime library will determine method, which builds on a heap and will not use the page aligned allocator. Still, user space can allocate page aligned memory through syscalls.

zerone015 · Post by **zerone015** » Sat Mar 15, 2025 4:15 am

rdos wrote: ↑Fri Mar 14, 2025 8:15 am There are two types of allocations that are relevant for both userspace and kernel: Byte aligned allocators and page aligned. I tihink you are discussing the byte aligned here, but the page aligned is important too. I implement the byte aligned allocator with a control block mechanism (I know this isn't the most effective), and the page aligned allocator in the page tables by using free bits in the page table entries. In kernel, I have separate functions for allocating byte aligned and page aligned virtual memory, and in user space the runtime library will determine method, which builds on a heap and will not use the page aligned allocator. Still, user space can allocate page aligned memory through syscalls.

I'm actually talking about the page-aligned allocator. I also plan to implement the page-aligned allocator using free bits in the page table entries.

By "control block mechanism," do you mean a method where additional metadata is stored along with allocated memory? If so, I think that approach is quite effective. I've looked into the source code of ptmalloc2, and it also manages the heap in a similar way.

Thanks for sharing your experience!

OSDev.org

Implementing a Virtual Address Allocator with a Balanced Binary Tree

Implementing a Virtual Address Allocator with a Balanced Binary Tree

Re: Implementing a Virtual Address Allocator with a Balanced Binary Tree

Re: Implementing a Virtual Address Allocator with a Balanced Binary Tree

Re: Implementing a Virtual Address Allocator with a Balanced Binary Tree

Re: Implementing a Virtual Address Allocator with a Balanced Binary Tree

Re: Implementing a Virtual Address Allocator with a Balanced Binary Tree

Re: Implementing a Virtual Address Allocator with a Balanced Binary Tree

Re: Implementing a Virtual Address Allocator with a Balanced Binary Tree

Re: Implementing a Virtual Address Allocator with a Balanced Binary Tree