O(1) Page Allocation, Freeing and State Checking?

tharkun · Post by **tharkun** » Fri May 10, 2013 2:26 pm

I've recently gotten back into OSDev and have been working on my kernel's page frame allocator. Originally I was going to simply use a bitmap, but I decided pretty quickly that it seemed like a bit of a kludge. The design I've come up with is a hybrid version of a stack and bitmap. The entries are stored in memory as an array and can be indexed as such, but each entry also includes a pointer to the next entry with free pages. When an entry runs out of free pages, it's popped from the stack.

Code: Select all

struct entry {
        struct entry *next;
        uintptr_t bitmap;
};

This is the allocation function, it finds the first free bit in the entry at the top of stack. Almost O(1), as I'm not sure how GCC implements __builtin_ffsll on various different architectures, though even if it's not O(1) it has a maximum number of 64 iterations on x86-64 and 32 on x86 and ARM.

Code: Select all

page_t alloc_page(void)
{
        int bit = __builtin_ffsll(~head->bitmap) - 1;

        head->bitmap |= (1 << i);

        struct entry *tmp = head;

        if (~head->bitmap == 0) {
                head = head->next;
                tmp->next = NULL;
        }

        return ((tmp - bitmap_base) * PAGES_PER_ENTRY + bit) * PAGE_SIZE;
}

To free a page, the allocator unsets the bit and if needed place the entry back onto the stack. This is O(1).

Code: Select all

void free_page(page_t page)
{
        struct entry *e = &bitmap_base[(page / PAGE_SIZE) / PAGES_PER_ENTRY];
        int bit = (page / PAGE_SIZE) % PAGES_PER_ENTRY;
        bool was_full = (~e->bitmap == 0);
        e->bitmap &= ~(1 << bit);
        /* Push back onto the stack if needed */
        if (e->next == NULL && was_full) {
                e->next = head;
                head = entry;
        }
}

To check the state of the page, the allocator can find the correct entry by indexing the entries as an array. Again, should be O(1)

Code: Select all

bool test_page(page_t page)
{
        struct entry *e = &bitmap_base[(page / PAGE_SIZE) / PAGES_PER_ENTRY];
        int bit = (page / PAGE_SIZE) % PAGES_PER_ENTRY;
        return e->bitmap & (1 << bit);
}

AFAICT and unless there is something I am missing this page allocator should run in constant time, with the advantages of both a stack and bitmap, while using only twice the space of the latter.

What are the pros and cons of this approach compared to other allocators?

OSwhatever · Post by **OSwhatever** » Fri May 10, 2013 3:55 pm

The method is constant time and it would work quite well I think. In practice you can also make this method lock free, if you pop the first element, clear/set bits and then put it back in again if necessary. Stacks are nice as they can easily be made lock free.

Drawback would be that finding consecutive free pages of any desired length would just like with a bitmap, be harder.

Kevin · Post by **Kevin** » Mon May 13, 2013 8:07 am

tharkun wrote:Almost O(1), as I'm not sure how GCC implements __builtin_ffsll on various different architectures, though even if it's not O(1) it has a maximum number of 64 iterations on x86-64 and 32 on x86 and ARM.

A maximum of 64 iterations is O(1) by definition.

What are the pros and cons of this approach compared to other allocators?

The obvious cost is that you're trading memory for performance.

Mikemk · Post by **Mikemk** » Mon May 13, 2013 8:20 am

OSwhatever wrote:Drawback would be that finding consecutive free pages of any desired length would just like with a bitmap, be harder.

In theory, an os could, at boot, make a stack from a list of all possible pages. When searching for x pages to allocate, it could pop x numbers off that stack. freeing pages would simply push them on the stack, and run invlpg [n] for each page.

gravaera · Post by **gravaera** » Mon May 13, 2013 10:33 am

Yo:

This is actually pretty good :O

The thing that really makes it really good is that you can build a very fast contiguous allocator on top of this using some extra metadata.

--Peace out,
gravaera

Mikemk · Post by **Mikemk** » Mon May 13, 2013 11:25 am

One potential improvement on my idea is to have multiple allocation stacks for different sizes of contiguous memory - for example, 1, 2, 4, 8, and 16 pages, etc. up to whatever max you choose to have.
Then, in malloc() or equivalent, divide the size by 4096 and round up
Then the OS has two choices - allocate one from the smallest size greater (faster) or allocate several based on the bits (less ram used)
If the needed stack is empty, you can use two of the smaller size units. If stack 1 is empty, you can pick a larger one and split it up.
For example, consider the starting stacks

Code: Select all

|        |------|------|------|------|------|------|------|------|
|Stack   |    1 |    2 |    4 |    8 |   16 |   32 |   64 |  128 |
|        |------|------|------|------|------|------|------|------|
|Entries |   19 |    1 |    4 |   62 |    0 |    2 |    0 |    1 |
|        |------|------|------|------|------|------|------|------|

and code

Code: Select all

malloc(160262) ; allocate 160262 bytes
malloc(978256) ; This program uses a lot of memory.

160262 / 4096 rounded up = 40. If the OS goes for speed, it can allocate a single 64 page entry. Since there's not one, try 2 32 page entries, which exist. Then it gets a request for 239 pages. Since 128 is the largest size, it tries allocating two 128-page entries. Only one is available, so it takes it and tries for 2 64s. None, it tries for 4 32s. sorry, lets get 8 16s. Or maybe 16 8s, which is available, so it finishes.

If it goes for ram usage, it can look and see that bits 32 and eight are set, and allocate one of each. Then comes a request for 239 pages. going by the bits, it takes one 128, then since 64 isn't available, it tries for three 32s. one one, lets get two 16s. none, lets get five eights, one four, one two, and one 1.

FallenAvatar · Post by **FallenAvatar** » Mon May 13, 2013 4:38 pm

m12 wrote:...

Stop trying to thread-jack...

OT: This is a very cool setup. And it looks like you are right about it being O(1) in each of those cases. Do you have any way of handling marking a certain page as used tho? Like if you need to use the below 1MB boundrary for some hardware, or any other special cases like that?

- Monk

greyOne · Post by **greyOne** » Tue May 21, 2013 9:46 am

OSwhatever wrote:Drawback would be that finding consecutive free pages of any desired length would just like with a bitmap, be harder.

On the other hand,
How often do you really need to find consecutive page frames?

Most of the time - unless you're actually dealing with physical addresses -
The address translation of paging itself (by mapping them consecutively) can make any page frames virtually consecutive.

OSwhatever · Post by **OSwhatever** » Tue May 21, 2013 12:34 pm

greyOne wrote:On the other hand,
How often do you really need to find consecutive page frames?

Most of the time - unless you're actually dealing with physical addresses -
The address translation of paging itself (by mapping them consecutively) can make any page frames virtually consecutive.

There are several design choices you make here in order to make your kernel to work with only one or arbitrary page sizes. In many contemporary kernels you simply work with one page granularity, for example demand paging, swapping virtual area expansion etc work by 4K page basis and that will make things much simpler. However, I know that some kernels add some extra pages on the first memory exception of a program of both text and data area. This in order to make the program appear to start faster. Here you suddenly have the possibility to use a larger page than only the smallest one. You can for example expand virtual heap area (sbrk) in 16KB or 64KB chunks. As you notice, there many opportunities to play around with different page sizes but you will get the penalty with higher complexity. With ever increasing amount of memory, the more benefit of larger page sizes as the TLB will more quickly saturated and I think the trend will be that kernels will try to work with larger page sizes if possible.

bluemoon · Post by **bluemoon** » Tue May 21, 2013 3:35 pm

OSwhatever wrote:However, I know that some kernels add some extra pages on the first memory exception of a program of both text and data area. This in order to make the program appear to start faster. Here you suddenly have the possibility to use a larger page than only the smallest one. You can for example expand virtual heap area (sbrk) in 16KB or 64KB chunks.

My solution is to provide API to explicitly allocate a chunk of memory instead of demand paging. My system still works with 4K pages, but without #PF overhead this is acceptably quick even with 16 calls to page_alloc just to get 64KB chunk - which resolve to less than hundred of operations.

OSwhatever · Post by **OSwhatever** » Tue May 21, 2013 3:53 pm

bluemoon wrote:My solution is to provide API to explicitly allocate a chunk of memory instead of demand paging. My system still works with 4K pages, but without #PF overhead this is acceptably quick even with 16 calls to page_alloc just to get 64KB chunk - which resolve to less than hundred of operations.

The overhead of the page allocator is not that of a problem, also there is so much other things going on during a virtual space allocation that not much time is spent there compared to other things unless the PF algorithm is really bad. The greatest benefit of larger pages is less TLB congestion which is noticeable during execution.

OSDev.org

O(1) Page Allocation, Freeing and State Checking?

O(1) Page Allocation, Freeing and State Checking?

Re: O(1) Page Allocation, Freeing and State Checking?

Re: O(1) Page Allocation, Freeing and State Checking?

Re: O(1) Page Allocation, Freeing and State Checking?

Re: O(1) Page Allocation, Freeing and State Checking?

Re: O(1) Page Allocation, Freeing and State Checking?

Re: O(1) Page Allocation, Freeing and State Checking?

Re: O(1) Page Allocation, Freeing and State Checking?

Re: O(1) Page Allocation, Freeing and State Checking?

Re: O(1) Page Allocation, Freeing and State Checking?

Re: O(1) Page Allocation, Freeing and State Checking?