
Contracting a binary tree heap

Posted: Wed Nov 25, 2009 2:26 pm
by computafreak
I've been redesigning my memory manager lately and have ended up with two binary trees, each sorted by block size. The free blocks are represented in one tree and the used blocks in the other, making deallocation a simple matter of removing a node from one tree and adding it to the other. I can also bring it closer to best fit by inspecting the left and right nodes.
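To make that concrete, the descriptor looks roughly like this (a simplified sketch, not my actual code; tree_insert/tree_remove stand in for the size-keyed BST routines):

#include <stddef.h>
#include <stdint.h>

/* One descriptor per block, kept in a BST keyed on Size. */
typedef struct Block {
    uintptr_t     Address;   /* start address of the block */
    size_t        Size;      /* length in bytes            */
    struct Block *Left;      /* size-keyed BST children    */
    struct Block *Right;
} Block;

/* Stand-ins for the size-keyed BST routines. */
void tree_insert(Block **root, Block *node);
void tree_remove(Block **root, Block *node);

Block *freeTree = NULL;   /* blocks available for allocation */
Block *usedTree = NULL;   /* blocks currently handed out     */

/* Deallocation is just moving the descriptor from one tree to the other. */
void release_block(Block *blk)
{
    tree_remove(&usedTree, blk);
    tree_insert(&freeTree, blk);
    /* Best-fit tweak: the in-order neighbours of a node are the next
       closest sizes, so the allocator can peek left/right for a tighter fit. */
}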

I've run into a small roadblock when I try to contract the 'heap' of available pages: I simply can't figure out a practical way to remove the nodes that make up a page from the free list. The only safe way I can think of is to start at the smallest free address and look for nodes whose start address is previousNode->Address + previousNode->Size. When the total size of that contiguous run reaches 0x1000 bytes (a page), I remove all of those nodes from the free list and release the pages they cover, and I repeat this as many times as I need to.
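Using the descriptor from the sketch above, what I have in mind would look something like this (free_lowest_address/free_next_by_address are hypothetical helpers; getting that address-ordered walk cheaply out of a size-keyed tree is exactly my problem):

#define PAGE_SIZE 0x1000u

/* Assumed helpers: visit the free blocks in ascending address order.
 * With a size-keyed tree each step currently means searching the whole
 * tree, which is the part I can't make fast. */
Block *free_lowest_address(void);
Block *free_next_by_address(Block *blk);

void free_pages(uintptr_t start, size_t length);   /* hand pages back */

void contract_heap(void)
{
    Block *runStart = free_lowest_address();
    while (runStart != NULL) {
        uintptr_t end   = runStart->Address + runStart->Size;
        size_t    total = runStart->Size;
        Block    *next  = free_next_by_address(runStart);

        /* Extend the run while the next free block starts exactly where
           the previous one ends (previous->Address + previous->Size). */
        while (next != NULL && next->Address == end) {
            end   += next->Size;
            total += next->Size;
            next   = free_next_by_address(next);
        }

        if (total >= PAGE_SIZE) {
            /* tree_remove() every node in the run from the free tree,
               then release the page(s) the run covers (rounding to page
               boundaries in the real thing). */
            free_pages(runStart->Address, total);
        }
        runStart = next;
    }
}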

In practice, though, this eludes me; more precisely, I don't know how to make it run quickly. Something else I've been considering is adding an extra field to the memory block descriptor: a pointer to the next block in the same page. That would reduce the contraction to walking the 'chain' a block starts until its total size reaches 0x1000, but adding an extra pointer to every block of memory seems wasteful, and it looks like it would slow down the insertion of nodes into the tree.
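For reference, the variant with the extra field would look something like this (again just a sketch, names made up):

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Descriptor with the extra field: every block points at the next block
 * inside the same page, so a page's blocks form a chain that can be
 * summed without searching the size-keyed tree at all. */
typedef struct ChainedBlock {
    uintptr_t            Address;
    size_t               Size;
    struct ChainedBlock *Left, *Right;    /* size-keyed BST links        */
    struct ChainedBlock *NextInPage;      /* the extra pointer per block */
} ChainedBlock;

/* Contraction becomes: walk one chain and see whether the free blocks
 * in it add up to a whole page. */
static bool page_chain_is_free(const ChainedBlock *first)
{
    size_t total = 0;
    for (const ChainedBlock *b = first; b != NULL; b = b->NextInPage)
        total += b->Size;        /* assumes the chain links free blocks only */
    return total >= 0x1000;
}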

Is there any other way of contracting a binary tree heap that I can't think of, or any way to make this faster?

Re: Contracting a binary tree heap

Posted: Fri Nov 27, 2009 6:21 am
by mybura
The word shrinking is a bit more appropriate for the topic. Anyway, two things worth mentioning:

1. Allocating/deallocating memory should not be a bottleneck for any OS, as it is supposed to allocate large blocks for applications, not small ones (keep in mind that the runtime environments of applications do memory allocation internally; they request large memory blocks from the OS). If it does handle small ones, it is too involved in memory management, since different applications use memory differently. If an OS allocates memory for an application every 5 minutes, the 5 ms lost to a slower algorithm really does not matter (unless we are talking about an RTOS, which you wouldn't be running such a wide variety of applications on anyway).

2. Directly related to your question: speed is a tradeoff against fragmentation, so making the contraction faster will cost you in fragmentation. The cause of this is not the algorithm you use to keep track of memory; it's a function of the allocation patterns, which are not in your control but are determined by the applications that run on the OS.

Why don't you defer looking for consecutive blocks until the CPU is idle? That is, when an app requests a block, give it the first free block that is big enough, resizing it as needed, and later, when the CPU is idle, merge any eligible blocks in the tree (reducing future search times). When an allocation request arrives that no free block is big enough to satisfy, you fall back to another algorithm that will produce a block of the requested size, at the price of waiting for the memory manager to sort out the fragmentation. If the machine is idle enough to avoid that worst case, you win. If it hits the worst case too often, chances are the machine is so busy that you either need more physical RAM or are running too many applications at once (the OS can't purchase RAM or convince users to run fewer apps).
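Something along these lines, very roughly (the function names are made up, just to show the shape of it):

#include <stddef.h>

/* Made-up interfaces over your trees. */
void *first_fit_alloc(size_t size);        /* fast path: first block big enough */
void  merge_adjacent_free_blocks(void);    /* the deferred coalescing pass      */

void *kernel_alloc(size_t size)
{
    void *block = first_fit_alloc(size);
    if (block != NULL)
        return block;                      /* common case: no coalescing needed */

    /* Slow path: nothing big enough right now, so pay the price of
       sorting out the fragmentation before retrying. */
    merge_adjacent_free_blocks();
    return first_fit_alloc(size);
}

/* Run from the idle loop, so the merging costs nothing the apps notice. */
void idle_housekeeping(void)
{
    merge_adjacent_free_blocks();
}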

Re: Contracting a binary tree heap

Posted: Fri Nov 27, 2009 10:08 am
by Gigasoft
In a typical OS, applications don't use the system heap, since they always allocate whole pages. The OP is probably talking about allocations which the OS itself and the drivers make.

Re: Contracting a binary tree heap

Posted: Fri Nov 27, 2009 11:07 am
by computafreak
Thank you for your reply. I've since made a minor change to my memory manager design which makes things a lot easier (basically the memory structures are now at the front of the memory block they represent - it's far more efficient to release N pages now). I've been using 'contracting' because my first operating system was built from JamesM's tutorial code. Although I've since completely rewritten it several times, the terminology has stuck.
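To be clear about the change: the descriptor now sits at the start of the memory it describes, along the lines of this simplified sketch (not my actual code):

#include <stddef.h>

/* The header lives at the front of the block it represents, so there is
 * no separate pool of node structures to clean up: freeing N pages just
 * unlinks the headers living inside them. */
typedef struct Header {
    size_t         Size;           /* size of the block, header included */
    struct Header *Left, *Right;   /* size-keyed BST links               */
} Header;

/* Usable memory starts right after the header... */
static inline void *payload_of(Header *h)
{
    return (void *)(h + 1);
}

/* ...and getting from a payload pointer back to its header is just as cheap. */
static inline Header *header_of(void *payload)
{
    return ((Header *)payload) - 1;
}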
mybura wrote:1. Allocating/deallocating memory should not be a bottleneck for any OS, as it is supposed to allocate large blocks for applications, not small ones (keep in mind that the runtime environments of applications do memory allocation internally; they request large memory blocks from the OS). If it does handle small ones, it is too involved in memory management, since different applications use memory differently.
I understand what you mean, but this is just the internal allocator; I only use it to set up the timer, scheduler, linker and other boot-only stuff. For the user-space side, I've got a simple page allocator that allocates user-accessible pages and hands them to a malloc()/operator new implementation.
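The split looks roughly like this (hand-wavy sketch with made-up names, not my real allocator):

#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000u

/* Kernel side: the page allocator only ever hands out whole,
 * user-accessible pages (name made up for the sketch). */
void *alloc_user_pages(size_t count);

/* User side: a toy bump allocator carving blocks out of those pages,
 * standing in for the real malloc()/operator new implementation. */
static uint8_t *current   = NULL;
static size_t   remaining = 0;

void *malloc(size_t size)
{
    size = (size + 15u) & ~(size_t)15u;        /* keep 16-byte alignment */
    if (size > remaining) {
        size_t pages = (size + PAGE_SIZE - 1) / PAGE_SIZE;
        current = alloc_user_pages(pages);     /* ask the kernel for pages */
        if (current == NULL)
            return NULL;
        remaining = pages * PAGE_SIZE;         /* (the old tail is simply
                                                   leaked in this sketch) */
    }
    void *block = current;
    current   += size;
    remaining -= size;
    return block;
}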
mybura wrote:later, when the CPU is idle, merge any eligible blocks in the tree
I've got this planned, and my current heap already has a basic method to do this. But it's just a nicety to improve performance; for now, I need to get the heap itself working, so that I can have some nodes to reorganise :)