Idea for virtual memory allocator: yea or nay? [verdict:nay]

austanss · Post by **austanss** » Sat Jan 30, 2021 8:34 pm

I've come up with a detailed plan for a virtual memory allocator: and I wanted to garner your thoughts and answer your questions.

So, first of all, what happens when you call `malloc` or 'kmalloc'?

0. If the caller wants more than 4 kilobytes, return nullptr: request pages instead
1. The allocator checks the first page, if it does not exist, request one and format it.
2. The allocator views the first page's header for some attributes:
- The "full" bit: if it is set, proceed to the next page, if it does not exist, request one and format it.
- The "kernel" bit: if it is set, and the allocator is not the kernel function, proceed to the next page, if it does not exist, request one and format it.
- The "locked" bit: if it is set, don't touch it: something important is going on here, proceed to the next page, if it does not exist, request one and format it.
- Next 5 bits are reserved for possible future use.
- Next 7 bytes are padding to keep 8-byte alignment.
3. Once a suitable page is found, the allocator will check the bitmap of the page for a free block large enough to accommodate the allocation, if one does not exist, repeat step 2.
4. Once all checks are passed, the allocator will set the corresponding bits of the newly-allocated block to reflect the allocation
5. Once the bits are set, the allocator will return the starting address of the allocation.

OK, what happens when you call 'free' or 'kfree'?

Lets refer to the opposite of the allocator, or the free-er, as the "liberator".

0. If the supplied address is nowhere to be found in the allocation pages, do nothing and return.
1. The liberator will locate the corresponding allocation page for the address.
2. The allocator views the page's header for some attributes:
- The "kernel" bit: if it is set, and the liberator is not the kernel function, do nothing and return.
- The "full" bit: if it is set, unset it.
- The "locked" bit: if it is set, don't touch it: do nothing and return.
3. If all checks are passed, the liberator will proceed to unset the corresponding bit and return. *(see footnote)

I'm wondering what you think of this design, and I'll be answering any questions you may have.

**footnote: I must ask: how do I know if a previously allocated address takes up more than one allocatable chunk of space, or bit in the bitmap?

Octocontrabass · Post by **Octocontrabass** » Sat Jan 30, 2021 8:56 pm

Why are you putting the user-mode heap in your kernel?

Have you considered using a linked list for your kernel heap?

austanss · Post by **austanss** » Sat Jan 30, 2021 9:16 pm

Octocontrabass wrote:Why are you putting the user-mode heap in your kernel?

Have you considered using a linked list for your kernel heap?

What do you mean the "putting the user-mode heap in your kernel"?

If you are referring to the user-mode malloc, I plan on using that for malloc syscalls. Or, should the kernel only provide page allocation methods, and it's up to the language to deal with malloc?

Octocontrabass · Post by **Octocontrabass** » Sat Jan 30, 2021 9:29 pm

The kernel should only provide page allocation system calls. Your C runtime library's malloc will use the page allocation system calls as necessary. Other language runtimes may use those system calls to implement a completely different allocator that isn't related to malloc.

neon · Post by **neon** » Sat Jan 30, 2021 9:50 pm

Hi,

To me it just sounds like you are trying to combine the functionality of multiple allocators into one. You would typically want to reserve a system pool and have a dedicated heap allocator use it -- no need to worry about "locked" or "kernel" bits. The allocator should only ever be used by the kernel and/or kernel services -- don't worry about user mode. The heap allocator itself can be just a simple free list or free stack.

austanss · Post by **austanss** » Sat Jan 30, 2021 9:53 pm

Octocontrabass wrote:The kernel should only provide page allocation system calls. Your C runtime library's malloc will use the page allocation system calls as necessary. Other language runtimes may use those system calls to implement a completely different allocator that isn't related to malloc.

Quick question: is there a C runtime library function for page allocation? If so, what is it's signature?

austanss · Post by **austanss** » Sat Jan 30, 2021 9:53 pm

neon wrote:Hi,

To me it just sounds like you are trying to combine the functionality of multiple allocators into one. You would typically want to reserve a system pool and have a dedicated heap allocator use it -- no need to worry about "locked" or "kernel" bits. The allocator should only ever be used by the kernel and/or kernel services -- don't worry about user mode. The heap allocator itself can be just a simple free list or free stack.

OK, that makes things a lot simpler!

austanss · Post by **austanss** » Sat Jan 30, 2021 9:54 pm

So, I've taken nays, and the nays have spoken. What is the best approach for a virtual memory allocator, kernel-only?

Octocontrabass · Post by **Octocontrabass** » Sat Jan 30, 2021 10:20 pm

rizxt wrote:Quick question: is there a C runtime library function for page allocation?

No.

rizxt wrote:What is the best approach for a virtual memory allocator, kernel-only?

Worry about that when you have enough of a kernel to measure different implementations. For now, pick something simple like a free list.

neon · Post by **neon** » Sat Jan 30, 2021 10:46 pm

Hi,

You already have 2 of the layers recognized (PFN bitmap and the need for a heap allocator) -- the heap allocator would need a way to reserve system PTE's to map when expanding (hence the need for a system pool.)

I typically recommend 3 layers:
Frame allocator -- PFN free stack, PFN free list, PFN bitmap, et. al.
System pool -- PTE free list, zone allocator, et. al.
Heap -- Free stack, free list, slab, et. al.

The heap expands by reserving system PTE's for its use. Mapping IO space would also reserve system PTE's so the size of the system pool is limited. The heap allocator should only ever use it to expand (not allocate) and free it if the system needs more resources. The system pool represents a dedicated area of the kernel address space.

Mix and match whatever methods you think would be easiest or best. Our boot loader uses (PFN free stack, zone allocator, free list) whereas our executive uses (PFN free stack, PTE free list, slab). Would be interesting to see what others have done though.

Any case, free list is one of the simpler methods when writing a heap allocator. You'll need a general memory allocator method anyways.

nullplan · Post by **nullplan** » Sat Jan 30, 2021 10:57 pm

rizxt wrote:Quick question: is there a C runtime library function for page allocation? If so, what is it's signature?

In POSIX systems, the call most commonly used for that is mmap.

Code: Select all

void *mmap(void *hint, size_t length, int prot, int flags, int fd, off_t offset);

This function is used to map files into memory. However, there is typically a flag MAP_ANONYMOUS that allows mapping pages from no file at all. And when there isn't, you can typically map /dev/zero for the same effect. The meaning of the "hint" address depends on MAP_FIXED being set in the flags. If it isn't, hint is just a loose lower bound on the address, and can be ignored entirely. If it is set, then hint is the binding address where the memory has to be mapped, replacing whatever mappings were at that address before. Not only is it common to use MAP_FIXED on already-allocated addresses, doing so is the only safe way to use it. For example, I have written shared memory functions that create guard pages around the memory mapping. The way I do that is to allocate anonymous PROT_NONE memory that is two pages larger than the mapping is supposed to be, and then mapping with MAP_FIXED from the file to one page into the area.

Code: Select all

void *base = mmap(0, shmlen + 2 * PAGE_SIZE, PROT_NONE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
if (base != MAP_FAILED)
  p = mmap((char*)base + PAGE_SIZE, shmlen, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED, fd, 0);

This is better than mapping the shared memory somewhere and then using MAP_FIXED to put a guard page in front of it, since this way I know there is definitely room for a guard page.

There is also brk(), but let's not go there. brk() makes assumptions about address space layout that just don't hold anymore. As soon as dynamic libraries enter the mix, all bets are off.

Anyway, userspace malloc can use this call to enlarge its heap, and then just manage it. Most malloc implementations never shrink the heap. If you free() some memory, that memory is available for further malloc() in the same process, but not returned to the system. See, what mmap() does is two things: allocation of virtual addresses, and allocation of physical memory. The virtual addresses don't burden any other process, and for a lot of reasons most malloc() implementations want to keep them around and reuse them. The physical memory does burden other processes. So there is also a system call madvise(), that you can use to advise the kernel you won't be needed some memory for the time being. This allows the kernel to deallocate the physical memory backing the region. If it is accessed again, fresh memory can be allocated and zeroed (that is part of madvise() contract).

For the kernel, I have so far found it far better to use caches. malloc() implementations with a shrinking heap are hard to do, so I try to see how far I can get without any malloc() implementation in kernel. Caches go a bit in a different direction: Each cache can only allocate a single type of object. The cache knows how to construct elements and how to destroy them. Since there is only one type of object in the cache, I essentially have fixed length array on my hands. This (and the linked list structure of the thing) make it easy to deallocate physical backing memory, and shrink the virtual kernel memory, too. So if I ever do run low on memory, I can start freeing physical memory that is not needed right now, so it can be used for other tasks.

nexos · Post by **nexos** » Sun Jan 31, 2021 7:12 am

A mediocre VMM could be done, but why not do a good one? Simply put, a good VMM is very complex. You need multiple layers:
The page frame allocator. This part works with physical memory. It allocates and frees chunks of physical memory. It initializes certain regions based on whether they are usable or not. For finding free pages, you can use a simple free stack or bitmap. A free stack is better IMO, as a bitmap is very slow, although you can optimize it. In my system, I plan on having to PMMs. One for the kernel, one for user space. The reason for this is because I am making a microkernel, and the servers should have more control. The user space allocator in my OS will work with zones, i.e, a user zone, a kernel zone, an ISA DMA zone, and a PCI DMA zone. Also, for a good system, every page should be represented in some array (like 4BSD's cmap or Windows' PFN database). This will contain some info, namely, the address of its PTE, so you can back map a physical address to a virtual one. That is uesful for some page swapping.

Also, you will need an MMU layer. This part creates page tables and address space, and maps and unmaps things into it. It manages TLB consistency (which is quite hard on SMP), and it handles allocating the CPU's address space structures for every process. This is probably the simplest layer

Next comes a part that could be mind blowing in complexity. The VMM's operation depends wildly on whether or not you are using tricks such as demand paging, page swapping, shared memory, and copy on write, or if you have a simple mediocore one. The user VMM's operation allocates pages. Generally, there is some high level data structure (i.e. a b-tree or a two level array plus a free list, which is what I plan on doing), that manages pages. It contains info to high level for the MMU structures. For example, info about shared memory, demand paged or swapping status, and many other things. How you do this will probably decide memory use and system speed. So think carefully about, and read the related theory. The kernel VMM will probably just be a free list managing pages. You could use a complex approach for the kernel, but I wouldn't

Finally comes the heap allocator. The heap allocator for the kernel will contains several free lists, probably one for each object type, like Solaris's slab. You could also implement a b-tree based allocator that would allow for allocations of any size. For user mode, you generally will use a b-tree, but note that the user mode heap goes in the C runtime. This C runtime uses mmap or brk to allocate the pages it needs.

Finally, you also need to think about optimizations, like O(1) algorithms, page coloring, and lazy TLB shootdowns just to name a few.

A good starting guide might be https://wiki.osdev.org/Brendan%27s_Memo ... ment_Guide

austanss · Post by **austanss** » Sun Jan 31, 2021 8:09 am

nexos wrote:think about optimizations, like O(1) algorithms

That's a software unicorn.

nexos · Post by **nexos** » Sun Jan 31, 2021 8:14 am

rizxt wrote:
nexos wrote:think about optimizations, like O(1) algorithms
That's a software unicorn.

What do you mean by that? Just graph the function O(1), and you'll see why speed hungry devs like myself are crazy about! Memory allocation happens a lot. It must be fast, else the system is slow.

austanss · Post by **austanss** » Sun Jan 31, 2021 8:19 am

nexos wrote:
rizxt wrote:
nexos wrote:think about optimizations, like O(1) algorithms
That's a software unicorn.
What do you mean by that? Just graph the function O(1), and you'll see why speed hungry devs like myself are crazy about! Memory allocation happens a lot. It must be fast, else the system is slow.

I never understood the reasoning behind O(1) algorithms.

When you throw more parameters at it, it still takes the same amount of processing time?

That doesn't make any sense, unless you were discarding extra parameters.

OSDev.org

Idea for virtual memory allocator: yea or nay? [verdict:nay]

Idea for virtual memory allocator: yea or nay? [verdict:nay]

Re: Design idea for virtual memory allocator: yea or nay?

Re: Design idea for virtual memory allocator: yea or nay?

Re: Design idea for virtual memory allocator: yea or nay?

Re: Design idea for virtual memory allocator: yea or nay?

Re: Design idea for virtual memory allocator: yea or nay?

Re: Design idea for virtual memory allocator: yea or nay?

Re: Design idea for virtual memory allocator: yea or nay?

Re: Design idea for virtual memory allocator: yea or nay?

Re: Design idea for virtual memory allocator: yea or nay?

Re: Design idea for virtual memory allocator: yea or nay?

Re: Idea for virtual memory allocator: yea or nay? [verdict:

Re: Idea for virtual memory allocator: yea or nay? [verdict:

Re: Idea for virtual memory allocator: yea or nay? [verdict:

Re: Idea for virtual memory allocator: yea or nay? [verdict: