
Re: Page Frame Allocation

Posted: Sun Mar 14, 2021 3:23 am
by rpio
...

Re: Page Frame Allocation

Posted: Sun Mar 14, 2021 6:41 am
by thewrongchristian
ngx wrote: And should the frames be zeroed on pfree()? Because otherwise I think it is a security risk that exposes the memory contents of the previous user.
You could clear page frames on pfree().

But, it might be better to clear page frames on allocation, depending on what the allocation is for. If you're allocating memory for an I/O read from a device, your memory is going to be overwritten anyway, so there's no point clearing it. The device DMA will do the job for you.

If you're getting a page frame that will be mapped to user space, then you'll absolutely have to initialize it (either from a device like above, or with zero if this will be anonymous memory.)

This is where a linked list allocator comes in useful. You can have multiple linked lists: the default list would work like your current bitmap allocator, doling out uninitialized pages, while a separate list tracks free pages that are already zeroed out. Then you can have an idle task which just takes uninitialized pages, zeros them, and deposits them on the zero-page list. If your machine is idle anyway, you might as well be doing useful work.

Of course, you can also have a hybrid allocator, which tracks uninitialized pages using your bitmap, but caches zero pages in a list.
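
As a rough sketch of that idea (the names are made up, frames are 4KB, free frames are assumed directly addressable, e.g. identity-mapped, and no locking is shown - a real kernel would want a lock around both lists):

Code:

#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096

/* Each free frame stores the pointer to the next free frame in its first word. */
struct frame {
    struct frame *next;
};

static struct frame *free_stack;   /* free frames with stale contents */
static struct frame *zero_stack;   /* free frames already zero-filled */

/* For I/O buffers etc., where the contents will be overwritten anyway. */
void *frame_alloc_any(void)
{
    struct frame *f = free_stack;
    if (f) {
        free_stack = f->next;
        return f;
    }
    f = zero_stack;                /* fall back to the zeroed pool */
    if (f)
        zero_stack = f->next;
    return f;
}

/* For anonymous user memory, which must not leak old contents. */
void *frame_alloc_zero(void)
{
    struct frame *f = zero_stack;
    if (f) {
        zero_stack = f->next;
        f->next = NULL;            /* the link word is the only non-zero word left */
        return f;
    }
    f = frame_alloc_any();         /* nothing pre-zeroed: clear one now */
    if (f)
        memset(f, 0, PAGE_SIZE);
    return f;
}

void frame_free(void *page)
{
    struct frame *f = page;
    f->next = free_stack;          /* freed frames go back to the "dirty" pool */
    free_stack = f;
}

/* Called from the idle loop: move one dirty frame to the zeroed pool. */
void idle_zero_one(void)
{
    struct frame *f = free_stack;
    if (!f)
        return;
    free_stack = f->next;
    memset(f, 0, PAGE_SIZE);
    f->next = zero_stack;
    zero_stack = f;
}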

Re: Page Frame Allocation

Posted: Sun Mar 14, 2021 8:59 am
by rpio
...

Re: Page Frame Allocation

Posted: Sun Mar 14, 2021 9:32 am
by sj95126
ngx wrote: I like the algorithm of managing pages with a linked list that acts like a stack. But the problem I have with it is that I think it will be extremely slow to initialize (it is only a theory, I haven't conducted any tests) if I run my OS on a real PC (the oldest PC I can find has 8GB of RAM) or anything with a GB of RAM or more. Am I wrong somewhere? Inserting the address of the next free frame into each frame in the system would take a LONG time.
Well, I do my real hardware testing on a 10+ year old laptop, and initializing 4GB of RAM as a linked list slows down the boot by at most about half a second. It does require more reads and writes than a bitmap, including updating the pointer to the head of the list. Some detailed testing would need to be done to determine if the speed of allocating frames by popping the first item off the stack makes up for the increased memory activity.
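
For a sense of scale, the whole initialization pass is one pointer write per free frame, something like this sketch (names made up, 4KB frames, and it assumes free RAM is directly addressable at this point, e.g. identity-mapped):

Code:

#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096

struct frame { struct frame *next; };
static struct frame *free_stack;

/* Push every whole frame of one free region from the firmware memory map. */
void free_stack_add_region(uintptr_t base, size_t len)
{
    uintptr_t start = (base + PAGE_SIZE - 1) & ~(uintptr_t)(PAGE_SIZE - 1);
    uintptr_t end   = (base + len) & ~(uintptr_t)(PAGE_SIZE - 1);

    for (uintptr_t addr = start; addr < end; addr += PAGE_SIZE) {
        struct frame *f = (struct frame *)addr;
        f->next = free_stack;      /* one write into the frame itself... */
        free_stack = f;            /* ...plus the head update */
    }
}

For 4GB of 4KB frames that loop runs about a million times, which is roughly where that half second goes.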

I feel like I opened a can of worms suggesting the linked list. It does have its advantages, like zero search time to return a free frame, and it requires no additional memory. I put it out there as an alternative idea. You know what it replaced? Literally a fixed array that "allocated" the frames in the 1MB-2MB range so the kernel could hand out a few pages to get userspace up and running. Implementing a linked list took all of about 10 minutes. Working out the algorithm for locating the correct bit offset would have taken (me) longer. And I had just recently fixed a major design flaw in the way I was high-mapping addresses in 4-level paging tables. I was in no mood for another round of the chicken-and-egg problems inherent in a memory manager bootstrapping itself. I was close to getting a major piece of things done and I just wanted something that WORKED.

Re: Page Frame Allocation

Posted: Sun Mar 14, 2021 9:38 am
by rpio
...

Re: Page Frame Allocation

Posted: Sun Mar 14, 2021 9:48 am
by thewrongchristian
ngx wrote: I like the algorithm of managing pages with a linked list that acts like a stack. But the problem I have with it is that I think it will be extremely slow to initialize (it is only a theory, I haven't conducted any tests) if I run my OS on a real PC (the oldest PC I can find has 8GB of RAM) or anything with a GB of RAM or more. Am I wrong somewhere? Inserting the address of the next free frame into each frame in the system would take a LONG time.
LONG time is relative. To a GHz CPU, yes, for 8GB of RAM you'll have to write on the order of 2,000,000 words (one per 4KB frame), almost all of which will probably be cache misses, and in CPU terms it will take an age.

But in human terms, on that same GHz CPU, writing to DDR3 memory, how long do you realistically think it will take? It might be a significant fraction of a second, but I'd be surprised if it took more than a second. And it'll be done once, at boot up, when there is lots of other initialization going on as well, all of which takes several seconds.

But you can also fill the stack in parallel, much like the idle-task suggestion for clearing pages.
ngx wrote: Another problem I have is: what if an already-free frame is freed again? Then it will go into the linked list twice (or even more times) and could possibly get used by two different programs at the same time, causing serious problems.
LOL, ha, welcome to C and systems programming.

I personally don't have that problem. My kernel is GC based, so page frames are only freed when they are no longer referenced.
ngx wrote: Also - how do I allocate several consecutive frames (for huge pages...)?
That's another argument for a hybrid system.

At the lowest level, manage pages using a bitmap, which is easier to use for multiple-page allocations.

But zero pages tend to be used for anonymously mapped memory anyway, where you get no benefit from contiguous page allocation (page translation takes care of that).
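
To illustrate the bitmap half of such a hybrid, a first-fit scan for n consecutive free frames could look like this (the bitmap size and names are arbitrary for the sketch; 1 = free):

Code:

#include <stddef.h>
#include <stdint.h>

#define FRAME_COUNT  (1024 * 1024)              /* example: 4GB of 4KB frames */
static uint64_t frame_bitmap[FRAME_COUNT / 64]; /* 1 bit per frame, 1 = free */

static int bit_is_free(size_t i)
{
    return (frame_bitmap[i / 64] >> (i % 64)) & 1;
}

static void bit_clear(size_t i)
{
    frame_bitmap[i / 64] &= ~(1ULL << (i % 64));
}

/* First-fit search for n consecutive free frames; returns the first frame
 * number, or (size_t)-1 on failure.  O(FRAME_COUNT) in the worst case. */
size_t frames_alloc_contiguous(size_t n)
{
    size_t run = 0;

    for (size_t i = 0; i < FRAME_COUNT; i++) {
        run = bit_is_free(i) ? run + 1 : 0;
        if (run == n) {
            size_t first = i - n + 1;
            for (size_t j = first; j <= i; j++)
                bit_clear(j);                   /* mark the whole run allocated */
            return first;
        }
    }
    return (size_t)-1;
}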

Re: Page Frame Allocation

Posted: Sun Mar 14, 2021 10:20 am
by rpio
...

Re: Page Frame Allocation

Posted: Sun Mar 14, 2021 10:23 am
by thewrongchristian
ngx wrote: If the boot time is fast even at 4GB of RAM then it is pretty good and pushes me even more toward using this algorithm. But I still can't think of a way of handing out contiguous frames. All other things are very good - like requiring no additional memory, which would also need to be mapped into virtual memory.
Ask yourself why you particularly want contiguous allocations.

If it's for hardware I/O, most devices can do scatter/gather I/O in a single transfer, so you don't need contiguous allocations for that.

If it's for user memory, paging hardware protects you from needing contiguous allocations. The only benefit of contiguous allocations for user memory is the possibility of using huge page mapping, which benefits some applications like database management systems which manage their own caching.

If it's to use huge page mapping for non-user pages, then it's either for mapping kernel text/data, which will already be contiguous, or it'll be to map devices like a framebuffer, which is already contiguous and not managed by your page allocator anyway.

TL;DR

I'm not sure it's worth optimising for use cases you're unlikely to need in the short or medium term. First, get it working. Then get it working fast.

Re: Page Frame Allocation

Posted: Sun Mar 14, 2021 10:45 am
by rpio
...

Re: Page Frame Allocation

Posted: Sun Mar 14, 2021 11:05 am
by sj95126
ngx wrote: Actually I agree with you here, and I would probably go with this algorithm and just forget about huge pages for now. But the thing I still think about is how I would stop already-free pages from being freed again - I thought of maybe having some sort of checksum or something like that, but I don't know how to implement it.
Well, it is the kernel's responsibility to manage this. Nothing can or should allocate memory that the kernel doesn't hand to it.

But as a general idea, think of memory allocations as two types: simple and not-simple.

Simple memory allocations are, well, simple. Something asks for a frame. Maybe it's to create a stack page for a user process. Maybe it's to extend the kernel heap. Either way, it's allocated to a single purpose and mapped to a linear address. If and when you free that address, you can return the frame to the free list.

Not-simple allocations are more difficult. Shared memory is one example. You don't want to free a frame just because a process stops using a shared memory segment. Maybe it's needed by another process, or a device driver. So you'll probably need a secondary method to track not-simple allocations.

But how to tell them apart when you're freeing memory? Well, here's one way. You have three bits (9, 10, 11) in each page table entry you can use for anything you want. Set those bits accordingly when you create a linear address mapping to a newly allocated frame. When you free an address, check the bits. Is this entry used for shared memory? MMIO? A stack barrier? You have 3 bits, so 7 non-zero combinations to work with; the eighth (000) means a simple allocation, so just return the frame to the free list. If not, consult your secondary source.
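
A sketch of that dispatch, assuming x86-64 page table entries (bits 9-11 really are ignored by the MMU; the enum values, helper names, and the frame_free()/shm_drop_ref() calls are just illustrative):

Code:

#include <stdint.h>

/* Bits 9-11 of an x86 page table entry are available to software.
 * Here they encode what kind of mapping the entry is. */
#define PTE_AVL_SHIFT  9
#define PTE_AVL_MASK   (0x7ULL << PTE_AVL_SHIFT)
#define PTE_ADDR_MASK  0x000FFFFFFFFFF000ULL   /* physical address bits 12-51 */

enum pte_kind {
    PTE_KIND_SIMPLE = 0,   /* plain allocation: just return the frame */
    PTE_KIND_SHARED = 1,   /* shared memory: ask the secondary tracker */
    PTE_KIND_MMIO   = 2,   /* device memory: never goes on the free list */
    PTE_KIND_GUARD  = 3,   /* stack barrier page: nothing to release */
};

/* Assumed to exist elsewhere in the kernel. */
void frame_free(uint64_t paddr);
void shm_drop_ref(uint64_t paddr);

static inline uint64_t pte_set_kind(uint64_t pte, enum pte_kind kind)
{
    return (pte & ~PTE_AVL_MASK) | ((uint64_t)kind << PTE_AVL_SHIFT);
}

/* Called when a linear address is unmapped. */
void on_unmap(uint64_t pte)
{
    uint64_t frame = pte & PTE_ADDR_MASK;

    switch ((enum pte_kind)((pte & PTE_AVL_MASK) >> PTE_AVL_SHIFT)) {
    case PTE_KIND_SIMPLE:
        frame_free(frame);         /* straight back to the free list */
        break;
    case PTE_KIND_SHARED:
        shm_drop_ref(frame);       /* the secondary source decides */
        break;
    case PTE_KIND_MMIO:
    case PTE_KIND_GUARD:
        break;                     /* no physical frame to release */
    }
}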

Re: Page Frame Allocation

Posted: Sun Mar 14, 2021 11:23 am
by thewrongchristian
ngx wrote:
thewrongchristian wrote: I'm not sure it's worth optimising for use cases you're unlikely to need in the short or medium term. First, get it working. Then get it working fast.
Actually I agree with you here, and I would probably go with this algorithm and just forget about huge pages for now. But the thing I still think about is how I would stop already-free pages from being freed again - I thought of maybe having some sort of checksum or something like that, but I don't know how to implement it.
In my own VM implementation, all page references are wrapped in a vmpage_t structure, which contains:

- The physical page number represented by this object.
- Flags, to track details like whether the page is dirty, pinned, or has been referenced, and whether it's managed by the PMM (some pages, like MMIO, are not).
- A count of COW mappings.
- A back pointer to any virtual mappings to this page (used when invalidating memory mappings.)

So the lifecycle of physical page references comes down to the lifecycle of the above structure. The COW mapping count acts as a reference count to determine whether a page is shared (the count is the number of copies, so it is 0 when the page is exclusively referenced by a single virtual mapping).

So, if you use a similar scheme, you don't release your page until the referencing VM structure is also released. And so long as you can correctly track VM usage, you should also be able to reliably track physical page usage.

My GC scheme helps me with that, so the vmpage_t won't be released until there are no more references to it in any of the VM structures. But I could just as easily have handled it using reference counts and careful programming.
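
To make that concrete, a stripped-down structure along those lines might look something like this (field and flag names simplified for illustration, not the literal definition):

Code:

#include <stdint.h>

struct address_space;              /* owning address space, defined elsewhere */

/* One virtual mapping of a physical page, so all mappings can be
 * found and invalidated when the page goes away. */
struct vm_mapping {
    struct address_space *as;
    uintptr_t vaddr;
    struct vm_mapping *next;
};

#define VMPAGE_DIRTY     (1u << 0)
#define VMPAGE_PINNED    (1u << 1)
#define VMPAGE_ACCESSED  (1u << 2)
#define VMPAGE_PMM       (1u << 3) /* frame owned by the physical allocator;
                                      clear for MMIO and similar pages */

/* Per-physical-page bookkeeping. */
struct vmpage {
    uintptr_t pfn;                 /* physical page number */
    uint32_t flags;                /* VMPAGE_* bits above */
    uint32_t cow_count;            /* number of copy-on-write duplicates;
                                      0 = one exclusive mapping */
    struct vm_mapping *mappings;   /* back pointers to the virtual mappings */
};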

Re: Page Frame Allocation

Posted: Sun Mar 14, 2021 11:08 pm
by Octocontrabass
ngx wrote: Isn't the linked list very slow to initialize (in this case)?
No, because the linked list will track regions instead of individual pages. You might need to split or merge regions as drivers start and stop, but it will usually be smaller than a bitmap covering the same range of addresses. You won't track ordinary memory allocations in the linked list, only MMIO. (But you might still add ordinary memory to the linked list, flagged as "not MMIO" so drivers can't try to access it.)
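
A small sketch of what such a region list could hold (names are arbitrary; the point is that it stores ranges, not individual pages):

Code:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One physical address range.  A machine needs only a handful of these,
 * so the list stays far smaller than a bitmap over the same addresses. */
struct phys_region {
    uintptr_t base;                /* start of the range */
    uintptr_t len;                 /* length in bytes */
    bool mmio;                     /* true: device registers;
                                      false: ordinary RAM, listed so drivers
                                      can't claim it as MMIO */
    struct phys_region *next;
};

static struct phys_region *region_list;

/* Find the region containing addr, or NULL if it isn't described. */
struct phys_region *region_find(uintptr_t addr)
{
    for (struct phys_region *r = region_list; r; r = r->next)
        if (addr >= r->base && addr - r->base < r->len)
            return r;
    return NULL;
}
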
ngx wrote: Also I wanted to ask - how do I make sure that the bitmap is mapped into virtual memory if I just put it in the largest free area, because I don't have any allocator when the OS starts (and I still don't have a VMM at all)?
Once you have a VMM, you'll tell it to map the physical addresses belonging to the bitmap to a reasonable virtual address.
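
In other words, once a VMM exists it boils down to a loop like this (vmm_map_page() and its flag are hypothetical placeholders for whatever your mapping call ends up being):

Code:

#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096

/* Hypothetical VMM call: map one physical frame at one virtual address. */
void vmm_map_page(uintptr_t vaddr, uintptr_t paddr, unsigned flags);
#define VMM_WRITABLE 0x1

/* Map the already-populated bitmap, page by page, at a chosen kernel address. */
void map_frame_bitmap(uintptr_t bitmap_phys, size_t bitmap_size, uintptr_t bitmap_virt)
{
    for (size_t off = 0; off < bitmap_size; off += PAGE_SIZE)
        vmm_map_page(bitmap_virt + off, bitmap_phys + off, VMM_WRITABLE);
}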

Re: Page Frame Allocation

Posted: Mon Mar 15, 2021 1:51 am
by rdos
sj95126 wrote: Well, here's one way. You have three bits (9, 10, 11) in each page table entry you can use for anything you want. Set those bits accordingly when you create a linear address mapping to a newly allocated frame. When you free an address, check the bits. Is this entry used for shared memory? MMIO? A stack barrier? You have 3 bits, so 7 non-zero combinations to work with; the eighth (000) means a simple allocation, so just return the frame to the free list. If not, consult your secondary source.
I use bit 11 to indicate that a page is a duplicate/alias and thus that the physical memory should not be freed when the linear page is freed. I use this bit for aliasing but also for MMIO, so if the linear area of the MMIO is freed, the physical memory will not be added as free.

I use bit 10 for copy-on-write (COW): when a page has this bit set and a write is attempted, a new copy of the page is allocated and the COW bits are cleared. However, COW requires more complicated logic, and there are other things that need managing here as well. Since I typically won't use fork to create new address spaces, but rather launch applications with spawn or directly replace the forked environment with exec(), I don't feel there is a need for extensive reference counting.
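
For illustration, a write-fault handler built around software bit 10 might be sketched like this; the bit positions match what I described, but the helper functions and what happens to the old frame afterwards are simplifications:

Code:

#include <stdint.h>
#include <string.h>

#define PAGE_SIZE      4096
#define PTE_PRESENT    (1ULL << 0)
#define PTE_WRITABLE   (1ULL << 1)
#define PTE_SW_COW     (1ULL << 10)  /* software bit 10: copy-on-write */
#define PTE_SW_ALIAS   (1ULL << 11)  /* software bit 11: alias/MMIO, never freed */
#define PTE_ADDR_MASK  0x000FFFFFFFFFF000ULL

/* Assumed to exist elsewhere in the kernel. */
uintptr_t frame_alloc(void);               /* returns a free physical frame   */
void *phys_to_virt(uintptr_t paddr);       /* kernel view of a physical frame */
void tlb_flush_page(uintptr_t vaddr);

/* Handle a write fault on a page whose PTE has the COW bit set:
 * allocate a fresh frame, copy the contents, and remap writable. */
void cow_write_fault(uint64_t *pte, uintptr_t fault_vaddr)
{
    uintptr_t old_frame = *pte & PTE_ADDR_MASK;
    uintptr_t new_frame = frame_alloc();

    memcpy(phys_to_virt(new_frame), phys_to_virt(old_frame), PAGE_SIZE);

    /* Point the PTE at the private copy, clear the COW bit, allow writes.
     * Whether old_frame is freed or kept depends on how sharing is tracked. */
    *pte = (*pte & ~(PTE_ADDR_MASK | PTE_SW_COW))
         | (uint64_t)new_frame | PTE_WRITABLE | PTE_PRESENT;

    tlb_flush_page(fault_vaddr);
}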