Implementing a stack-based page frame allocator

Posted: Mon Mar 29, 2021 4:26 pm
by SHZhang
I have recently been trying to write a page frame allocator, using information from sources such as Page Frame Allocation on the wiki. Looking at the common algorithms, it seemed that keeping the list of free pages in a stack is the simplest one that has good time complexity (constant time per allocation and free).

According to a forum post (I don't remember which), the way this is usually done is: for every free physical page, a pointer is stored in the page (say, at the beginning) providing the physical address of the page immediately below it on the stack. As long as the kernel keeps track of where the top of the stack is, pages can be pushed to or popped from the stack.
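That scheme can be sketched in a few lines, assuming free physical pages can be dereferenced directly (e.g. paging disabled, or all RAM mapped one-to-one); the names here are illustrative, not from any particular kernel:

```c
#include <stdint.h>

typedef uintptr_t phys_t;

static phys_t stack_top = 0;   /* physical address of the top free page, 0 = empty */

/* Free a page: store the old top in its first word, then make it the new top. */
static void page_push(phys_t page)
{
	*(phys_t *)page = stack_top;
	stack_top = page;
}

/* Allocate a page: read the "next" pointer out of the current top. */
static phys_t page_pop(void)
{
	phys_t page = stack_top;
	if (page)
		stack_top = *(phys_t *)page;
	return page;
}
```

Both operations touch one word of one page, which is where the constant time per allocation/free comes from.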

The problem arises once paging is enabled and memory can no longer be accessed simply by physical address: we cannot read the "next page" pointer (we know only the physical address of the top of the stack), nor write a pointer into a newly freed page (since only the physical address of the page being freed is available).

Disabling paging is obviously not a reasonable solution. Another possibility that I have considered is: whenever one needs to read/write a physical address in this process, temporarily map the address to a dedicated location in the address space (using the recursive mapping trick to modify the paging structures), and perform the needed operation from there. However, this does not seem to be a commonly used approach, and I have concerns about its efficiency, since one has to map the page into the address space before modifying it.

What is the proper way to deal with this situation? Or should I store the allocator's stack in some other way?

Re: Implementing a stack-based page frame allocator

Posted: Mon Mar 29, 2021 7:38 pm
by sj95126
I was the one who suggested that approach. The pointers in the free pages need to be linear addresses, which means you have to identity map at least some of your RAM. This probably won't work well in a 32-bit environment (or even 32-bit with PAE).

But in long mode, the linear space is massive. I map all free RAM linearly in the 0xffffe+ range, so e.g. linear page 0xffffe00000000000 is physical page 0 (this is in addition to the kernel's usual space at 0xffff8+). Then I use those linear addresses to build the free list. I also use the addresses in that range to update page tables, including adding new PTs, etc. There may be other uses for it I haven't encountered yet.
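With the whole of RAM mapped at a fixed offset like this, converting between physical and linear addresses is just an add or subtract. A minimal sketch (the base address is taken from the post above; the helper names are made up, and addresses are kept as plain integers here rather than pointers):

```c
#include <stdint.h>

#define DIRECT_MAP_BASE 0xffffe00000000000ULL

/* Linear address at which a given physical address is visible. */
static inline uint64_t phys_to_virt(uint64_t phys)
{
	return phys + DIRECT_MAP_BASE;
}

/* Inverse: recover the physical address from a direct-map linear address. */
static inline uint64_t virt_to_phys(uint64_t virt)
{
	return virt - DIRECT_MAP_BASE;
}
```

Any free page's next-pointer can then be read or written through `phys_to_virt()` without touching the page tables per operation.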

It does come at a cost of about 0.2% of your RAM for extra page tables, but you have to pick your tradeoffs.

Re: Implementing a stack-based page frame allocator

Posted: Mon Mar 29, 2021 8:53 pm
by SHZhang
Thanks for the reply! I am developing in a 32-bit environment, which rules out the possibility of identity mapping all of RAM.

Do you think that temporarily mapping the physical page to some virtual address on each allocation/free is a reasonable way to work around this?

Re: Implementing a stack-based page frame allocator

Posted: Tue Mar 30, 2021 1:34 pm
by sj95126
SHZhang wrote: Do you think that temporarily mapping the physical page to some virtual address on each allocation/free is a reasonable way to work around this?
That's actually what I used to do when adding a new PT to the paging structures, in case I was in a situation where the page table had to reference itself. I used one fixed linear address to map and unmap a physical page for the time it needed updating. I didn't like it; it just felt clunky. Among other things, you have to invalidate the TLB entry every time you reuse the address, to make sure it's pointing to the correct page.

There is another way. You could do something similar to filesystem cluster chaining. Create a contiguous chunk of memory (with a valid linear mapping) that's an array of physical page addresses, one entry per page frame. Each location would contain the address of the next free page. When you need one, take it just like the linked-list stack approach:

newpage = HEAD
HEAD = next_free[HEAD]

This would have about 0.1% overhead, e.g. for a system with 4GB of RAM, you'd need an array of 1 million 32-bit pointers.
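A sketch of that array-based chain, with illustrative identifiers (frame numbers instead of full addresses, and frame 0 treated as reserved so that 0 can mean "list empty"):

```c
#include <stdint.h>

#define NFRAMES 1024u                /* e.g. 4 MiB of RAM with 4 KiB frames */

static uint32_t next_free[NFRAMES];  /* the linearly mapped array, one entry per frame */
static uint32_t head;                /* first free frame, 0 = none */

static void frame_free(uint32_t frame)
{
	next_free[frame] = head;
	head = frame;
}

static uint32_t frame_alloc(void)
{
	uint32_t frame = head;
	if (frame)
		head = next_free[frame]; /* HEAD = next_free[HEAD] */
	return frame;
}
```

Since the array itself always has a valid linear mapping, neither operation needs to touch the free page's contents, which is the whole point of the scheme.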

Re: Implementing a stack-based page frame allocator

Posted: Tue Mar 30, 2021 5:58 pm
by AndrewAPrice
What I do is keep a page table mapped to virtual memory that I use as my working space. This gives me multiple virtual addresses where I can map any arbitrary physical address when I want to work on it.

Then I only have to keep up with the physical addresses of page directories, etc. and map them when needed.

I also use a stack of free pages.

Re: Implementing a stack-based page frame allocator

Posted: Wed Mar 31, 2021 3:12 pm
by thewrongchristian
sj95126 wrote:
SHZhang wrote: Do you think that temporarily mapping the physical page to some virtual address on each allocation/free is a reasonable way to work around this?
That's actually what I used to do to add a new PT to a page table, in case I was in a situation where the page table had to reference itself. I used one fixed linear address to map and unmap a physical page during the time it needed updating. I didn't like it; it just felt clunky. Among other things, you have to invalidate the entry in the TLB every time you reuse the address, to make sure it's pointing to the correct page.
So long as your temporary mapping is protected by a lock, you don't need to invalidate anything.

You take the lock, ensuring nothing else is doing PMM, safe in the knowledge you can map/unmap the temporary mapping without anyone else referencing that memory area. Your page alloc then becomes:

Code: Select all

// Physical page number
typedef uint32_t page_t;

spin_t pmmlock;
static page_t first_free;	// top of the free-page stack, 0 = empty
page_t * tempmap;		// fixed linear address used for the temporary mapping

page_t page_alloc()
{
	spinlock(pmmlock);
	const page_t page = first_free;
	// If we have a page available, remove it from the stack
	if (page) {
		// map the page into VM so we can read its next pointer
		mmu_map(page, tempmap);

		// tempmap now points to a page_t, which is the next free page after this one
		first_free = *tempmap;
	}
	spinunlock(pmmlock);

	// At this point, page is our newly allocated page, or 0 if the stack was empty
	return page;
}

void page_free(page_t page)
{
	spinlock(pmmlock);
	// map the page being freed into VM
	mmu_map(page, tempmap);

	// put the old stack top page into newly freed page
	*tempmap = first_free;
	first_free = page;

	// done - unlock
	spinunlock(pmmlock);
}
Disclaimer - knocked out code, uncompiled, untested.

The beauty of this, though, is that the lock prevents concurrent access to the virtual address from all CPUs, and the first thing alloc and free do before updating the stack is remap the temporary mapping as required, overwriting the previous one. So we don't even need to unmap the page once we're done; we can just leave the temporary mapping in place, avoiding any TLB flush (and TLB shootdown IPI).

sj95126 wrote: There is another way. You could do something similar to filesystem cluster chaining. Create a contiguous chunk of memory (with a valid linear mapping) that's an array of physical page addresses, one entry per page frame. Each location would contain the address of the next free page. When you need one, take it just like the linked-list stack approach:

newpage = HEAD
HEAD = next_free[HEAD]

This would have about 0.1% overhead, e.g. for a system with 4GB of RAM, you'd need an array of 1 million 32-bit pointers.

You'd probably want some per-page metadata anyway, for use in your VMM. It can hold page status (locked, dirty, accessed, age, usage count), and it can also be unioned with your free-list chain:

Code: Select all

struct page_info_t {
	union {
		struct {
			// flags for dirty, locked, referenced etc.
			unsigned int flags;
			// aging information for page eviction
			unsigned int age;
			// COW usage count
			unsigned int copies;
			// reverse mapping information, one entry per mapping of this page
			unsigned int nrmap;
			struct {
				asid as;
				void * p;
			} * rmap;
		} vmpage;
		page_t nextfree;
	} u;
};
You can allocate an array of the above at kernel bootstrap, with an entry per physical page, and sequentially free each page, which will also initialize the array as a side effect.
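A minimal sketch of that bootstrap pass, with the metadata half of the union elided; `page_is_usable()` is a hypothetical stand-in for whatever memory-map query (e.g. from the bootloader) the kernel actually has:

```c
#include <stdint.h>

typedef uint32_t page_t;

#define NFRAMES 256u

static struct {
	union {
		/* VMM metadata omitted for brevity */
		page_t nextfree;
	} u;
} page_info[NFRAMES];

static page_t first_free;	/* 0 = stack empty; frame 0 is reserved */

static void page_free(page_t page)
{
	page_info[page].u.nextfree = first_free;
	first_free = page;
}

static int page_is_usable(page_t page)
{
	return page >= 1;	/* placeholder: everything but frame 0 */
}

/* Free every usable frame once: this threads the free list through
 * u.nextfree, initializing the array and the stack in a single pass. */
static void pmm_bootstrap(void)
{
	for (page_t p = 1; p < NFRAMES; p++)
		if (page_is_usable(p))
			page_free(p);
}
```

After the loop, the stack top is the last frame freed, and every entry already holds a valid chain link.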

Re: Implementing a stack-based page frame allocator

Posted: Wed Mar 31, 2021 4:05 pm
by sj95126
thewrongchristian wrote: So long as your temporary mapping is protected by a lock, you don't need to invalidate anything.

You take the lock, ensuring nothing else is doing PMM, safe in the knowledge you can map/unmap the temporary mapping without anyone else referencing that memory area.
In the context of the question about reusing a single linear mapping, it's not an issue of exclusive use.

In the original suggestion, you use the free pages themselves to build a linked list. Allocating a page means taking the first entry off the list and changing HEAD to HEAD->next.

When you add a page to the list, you have to write a "next" pointer in the first 4 or 8 bytes of the page, which means you need a valid linear mapping. Since identity mapping all free memory in 32-bit mode isn't practical, the OP's question was about using a single linear mapping which would temporarily point to whichever free page needs its pointer written.

So if you rapidly add many pages to the list (like setting up the list for the first time) and you keep changing the single temporary mapping to point to different physical pages, isn't the TLB going to cache the old entry? Why wouldn't you have to invalidate after each mapping change?

I may be missing something in your answer.

Re: Implementing a stack-based page frame allocator

Posted: Wed Mar 31, 2021 4:48 pm
by thewrongchristian
sj95126 wrote:
thewrongchristian wrote: So long as your temporary mapping is protected by a lock, you don't need to invalidate anything.

You take the lock, ensuring nothing else is doing PMM, safe in the knowledge you can map/unmap the temporary mapping without anyone else referencing that memory area.
In the context of the question about reusing a single linear mapping, it's not an issue of exclusive use.

In the original suggestion, you use the free pages themselves to build a linked list. Allocating a page means taking the first entry off the list and changing HEAD to HEAD->next.

When you add a page to the list, you have to write a "next" pointer in the first 4 or 8 bytes of the page, which means you need a valid linear mapping. Since identity mapping all free memory in 32-bit mode isn't practical, the OP's question was about using a single linear mapping which would temporarily point to whichever free page needs its pointer written.

So if you rapidly add many pages to the list (like setting up the list for the first time) and you keep changing the single temporary mapping to point to different physical pages, isn't the TLB going to cache the old entry? Why wouldn't you have to invalidate after each mapping change?

I may be missing something in your answer.

The TLB will hold only a single mapping per virtual address. When you map your page at the temporary mapping address (in my example, via mmu_map()), that function will of course have to invalidate any existing mapping that might still be cached in the TLB. With a hardware-managed TLB, such as on x86, that takes the form of a local "invlpg <address>". I neglected to mention that, sorry for the confusion.

I was more making the point that after use, the temporary mapping doesn't need to be invalidated, because it is no longer referenced and will be overwritten the next time a page is allocated or freed. This avoids the slow case of invalidating page mappings under SMP, where an IPI is required to tell other CPUs that might have the mapping cached to remove it from their TLBs. That is slow, and never necessary here.