Questions about paging / where to put page directory

mangaluve · Post by **mangaluve** » Wed Mar 03, 2010 6:58 am

I haven't been doing any OS development lately but now I'm about to take it up again. For once, I want to do it right from the beginning. So I've read up on some stuff and I am about to start coding. My main problem now is where to store everything in the memory, and paging. I'm gonna use a self-mapping page directory (haven't done that before but it seems to have great benefits). So I think I'll have a "stage 2" boot loader, that will set up the page directory (self mapped) and enable paging, before moving to the real kernel.

I thought about reserving space at the bottom of memory for the page directory, and a long array that represents free/used pages. I cannot really have this "free pages mask" among the kernel code, since it must exist before I load the kernel. Is this a good way of doing this? I want everything to be as flexible as possible, and I want to have as few "hard-coded" physical locations as possible. My hope is to get paging up and running, so the actual kernel can be placed anywhere in (physical) memory. I want to avoid identity-mapping the memory for the kernel. Also, how should I handle the kernel stack? Should I allocate a special page for this? And where in memory should I preferably put all this (the page directory, the mask of free pages and so on)? It feels like those things should have pre-determined physical addresses.

I also suppose I have to temporarily identity-map the code for this loader (so everything won't break when I enable paging)?
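For reference, here is a minimal sketch of what such a stage 2 loader might build for 32-bit, non-PAE paging: one page table that identity-maps the first 4 MiB (assuming the loader itself lives there) and a directory whose last slot points back at itself. The names `build_boot_tables`, `pte_virt_addr`, `pde_virt_addr` and the choice of slot 1023 for the self-map are assumptions made for illustration, not something from the thread.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_PRESENT  0x001u
#define PAGE_WRITABLE 0x002u
#define ENTRIES       1024u
#define SELF_SLOT     1023u   /* assumption: the last directory slot holds the self-map */

/* Fill a page directory and one page table (32-bit, non-PAE paging).
 * pd_phys and pt_phys are the page-aligned physical addresses of the two
 * tables; pd and pt are pointers the loader can write through (the same
 * values as the physical addresses while paging is still off). */
void build_boot_tables(uint32_t *pd, uint32_t pd_phys,
                       uint32_t *pt, uint32_t pt_phys)
{
    assert((pd_phys & 0xFFFu) == 0 && (pt_phys & 0xFFFu) == 0);

    for (uint32_t i = 0; i < ENTRIES; i++) {
        pd[i] = 0;                                             /* not present */
        pt[i] = (i * 0x1000u) | PAGE_PRESENT | PAGE_WRITABLE;  /* page i -> frame i */
    }
    pd[0]         = pt_phys | PAGE_PRESENT | PAGE_WRITABLE;    /* temporary identity map, 0..4 MiB */
    pd[SELF_SLOT] = pd_phys | PAGE_PRESENT | PAGE_WRITABLE;    /* the directory maps itself */
}

/* With the self-map in slot 1023, the page table entry for `virt`
 * becomes visible at this virtual address once paging is on. */
uint32_t pte_virt_addr(uint32_t virt)
{
    return 0xFFC00000u + (virt >> 12) * 4u;
}

/* Likewise, the page directory entry for `virt` appears here. */
uint32_t pde_virt_addr(uint32_t virt)
{
    return 0xFFFFF000u + (virt >> 22) * 4u;
}
```

Once the kernel is running at its final virtual address, `pd[0]` can be cleared again, so the identity mapping only has to exist for the jump across the paging switch.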

gravaera · Post by **gravaera** » Wed Mar 03, 2010 7:49 am

Hi:

These are all really basic questions that get asked many times on this forum. I think you'd get (more than) ample results from perusing old threads.

mangaluve wrote:My main problem now is where to store everything in the memory

You may do this wherever you please. It's an implementation detail constrained only by the chipset and firmware you're writing for. In your case, anywhere above 1MB should do.

mangaluve wrote:I'm gonna use a self-mapping page directory (haven't done that before but it seems to have great benefits).

This trick is really nice and seems great, as long as your kernel remains fully synchronous. But if you delve into asynchronous syscalls, or asynchronous event notifications to userspace, good design would most likely make you have to abandon it, or use it in conjunction with something else.

mangaluve wrote:and a long array that represents free/used pages. I cannot really have this "free pages mask" among the kernel code, since it must exist before I load the kernel. Is this a good way of doing this?

As a rule, every kernel developer on the PC encounters this circular dependency on a memory map. Some use GRUB as a solution. Others load the Memory Manager in stages, etc. It's really just up to you.

As you can see, a lot of it is design related, and generally kernel specific. With nothing more than a bit of reading, you could easily decide most of it for yourself.

--Good luck,
gravaera

mangaluve · Post by **mangaluve** » Wed Mar 03, 2010 7:52 am

Thanks a lot!

Just a short follow up, what about the 1 MB limit, you said I should put my kernel above the first 1 MB of memory. Why is that?

Brendan · Post by **Brendan** » Wed Mar 03, 2010 8:55 am

Hi,

mangaluve wrote:Just a short follow up, what about the 1 MB limit, you said I should put my kernel above the first 1 MB of memory. Why is that?

Mostly because about half of the first 1 MiB is used by firmware (EBDA, video, ROMs, etc.), and most kernels are larger than 512 KiB.

Cheers,

Brendan

mangaluve · Post by **mangaluve** » Wed Mar 03, 2010 3:13 pm

Thanks a lot! I guess I could just identity-map the first MB and not touch it after that..?

The main thing I'm wondering is where to store page information, such as free pages. I thought about having a big array, with one bit for each page, saying whether it's free or not, at a pre-determined location. Is this good enough? As I said, I want to minimize the number of "hardcoded" physical addresses; I'd prefer to fetch all memory from a page allocator as soon as possible.

Maybe I could use some space just above the 1 MB mark (physical) to store the page directory and the free-pages-array, and map it to the top of my virtual memory? Of course, it would be nice to have this free-pages-array on a dynamic kernel heap, but maybe that's not good (since I need a page allocator for my heap)? I'm just interested in different ways, pros/cons and so on...

Brendan · Post by **Brendan** » Thu Mar 04, 2010 3:31 am

Hi,

mangaluve wrote:Thanks a lot! I guess I could just identity-map the first MB and not touch it after that..?

That's one option...

mangaluve wrote:The main thing I'm wondering is where to store page information, such as free pages. I thought about having a big array, with one bit for each page, saying whether it's free or not, at a pre-determined location. Is this good enough? As I said, I want to minimize the number of "hardcoded" physical addresses; I'd prefer to fetch all memory from a page allocator as soon as possible.

A bitmap (with one used/free bit for each page) typically involves searching for free bits, which can be more expensive (slower) than alternative methods. Of course I don't know how good "good enough" is (my idea of "good enough for my OS" can be very different to your idea of "good enough" for your OS).

I usually use a bitmap for memory below 16 MiB (because "free page stacks" can't be used to allocate physically contiguous pages), then one group of "free page stacks" for memory between 16 MiB and 4 GiB, and a second group of "free page stacks" for memory above 4 GiB.

mangaluve wrote:Maybe I could use some space just above the 1 MB mark (physical) to store the page directory and the free-pages-array, and map it to the top of my virtual memory? Of course, it would be nice to have this free-pages-array on a dynamic kernel heap, but maybe that's not good (since I need a page allocator for my heap)? I'm just interested in different ways, pros/cons and so on...

These sorts of decisions depend on the OS design, which mostly depends on you.

For my OS, the "first stage" of the boot code gets a "physical address space map" (e.g. from the BIOS - int 0x15, eax = 0xE820) and uses it to initialise a physical memory manager. Then the boot code (the rest of the first stage, and all later stages) uses the physical memory manager to dynamically allocate everything, including any page directories, etc. the kernel uses. The kernel itself doesn't really do much initialisation (most of the hard work is done before the kernel is started). This is probably the most flexible/powerful way to do it, but it's also probably the most complex/messy way too.
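As a rough illustration of the "use the map to initialise a physical memory manager" step (a sketch only, not the actual boot code described above), the loop below walks an already-fetched E820-style map and hands every whole usable 4 KiB page to an allocator callback. The struct layout, `feed_usable_pages` and the `add_page` callback are hypothetical names, and the map is assumed to be sorted and free of overlapping entries, which real firmware does not guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE   4096ull
#define E820_USABLE 1u          /* type 1 = usable RAM */

/* Fields of one map entry, assumed to be already copied out of the
 * raw BIOS buffer (this is not the exact in-memory BIOS layout). */
struct e820_entry {
    uint64_t base;
    uint64_t length;
    uint32_t type;
};

/* Calls add_page(phys) for every whole usable page at or above min_addr
 * and returns how many pages were found. add_page may be NULL to just count. */
size_t feed_usable_pages(const struct e820_entry *map, size_t count,
                         uint64_t min_addr, void (*add_page)(uint64_t phys))
{
    size_t pages = 0;

    for (size_t i = 0; i < count; i++) {
        if (map[i].type != E820_USABLE)
            continue;                                  /* reserved, ACPI, etc. */

        uint64_t start = map[i].base;
        uint64_t end   = map[i].base + map[i].length;

        if (start < min_addr)
            start = min_addr;                          /* e.g. skip the first 1 MiB */
        start = (start + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);   /* round up   */
        end  &= ~(PAGE_SIZE - 1);                              /* round down */

        for (uint64_t p = start; p < end; p += PAGE_SIZE) {
            if (add_page)
                add_page(p);
            pages++;
        }
    }
    return pages;
}
```

Passing `0x100000` as `min_addr` leaves everything below 1 MiB alone, which matches the earlier advice in this thread.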

Cheers,

Brendan

mangaluve · Post by **mangaluve** » Thu Mar 04, 2010 3:55 am

Thanks! I think I'll have some kind of list of free pages; it seems fun to implement. I want to separate the physical memory management from the rest of the OS. One thing I was thinking about, though: suppose the first 4 bytes in every free page point to the next free page, as a linked list (I'll probably use a buddy system). Now suppose I want to manipulate this, allocating pages and changing those pointers. The problem is that I have paging enabled, and I haven't mapped any virtual address to those physical pages. Should I switch off paging temporarily when my physical memory manager is called? Or is there a better way? You always have such elegant ideas.

Isn't it correct that when I enable/disable paging, I _MUST_ do it from identity-mapped code?

I've been thinking some more about it, and I see quite a few problems. I'd like to load the physical page allocator before the kernel. Fine, I load it somewhere. But then (in another stage, or in the kernel), I want to enable paging. But I still need to reach the page-allocator code somehow. So I have to map the page-allocator to virtual memory. But since the page-allocator must be able to run without paging (when I boot), it has to be identity mapped? Ideally, the page allocator shouldn't know anything at all about paging.

Selenic · Post by **Selenic** » Thu Mar 04, 2010 11:46 am

One point about free space bitmaps is that you can check 32 pages (or 64 on a 64-bit machine) in one instruction by comparing each word of the bitmap to the 'all allocated' value. You can also combine the two and use a free list for 4K (and larger non-contiguous) allocations and the bitmap for multi-page contiguous ones, which could improve performance.
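A minimal sketch of that word-at-a-time check, assuming 32-bit words and the convention that a set bit means "page in use" (the names and the `NO_PAGE` sentinel are made up for this example):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_PAGE ((size_t)-1)    /* returned when every page is in use */

/* Assumed convention: bit set = page in use, bit clear = page free. */
size_t bitmap_alloc_page(uint32_t *bitmap, size_t words)
{
    for (size_t w = 0; w < words; w++) {
        if (bitmap[w] == 0xFFFFFFFFu)
            continue;                       /* 32 used pages skipped with one compare */

        for (unsigned bit = 0; bit < 32; bit++) {
            if (!(bitmap[w] & (1u << bit))) {
                bitmap[w] |= 1u << bit;     /* mark the page as used */
                return w * 32 + bit;        /* page frame number */
            }
        }
    }
    return NO_PAGE;
}

void bitmap_free_page(uint32_t *bitmap, size_t page)
{
    assert(bitmap[page / 32] & (1u << (page % 32)));   /* catches a double free */
    bitmap[page / 32] &= ~(1u << (page % 32));
}
```

The inner loop could be replaced by a bit-scan instruction, but the outer compare is where most of the saving comes from.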

mangaluve wrote:One thing I was thinking about, though: suppose the first 4 bytes in every free page point to the next free page, as a linked list (I'll probably use a buddy system). Now suppose I want to manipulate this, allocating pages and changing those pointers. The problem is that I have paging enabled, and I haven't mapped any virtual address to those physical pages.

What I'd do to solve this is to have a block of reserved virtual memory where you keep a giant stack of free pages (other data is stored elsewhere). Notice that you can map and unmap pages as they are needed, so when you empty a page of the stack you can allocate that one next, and when you need to extend the stack you can map in the just-freed page at the top. You therefore have no physical overhead when all memory is in use, and only about 1/1024 of the size of physical memory in virtual overhead (one 4-byte entry per 4 KiB page).
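Stripped of the map/unmap part, the stack itself is tiny. The sketch below assumes the reserved virtual region is already backed by memory and uses invented names (`struct page_stack`, `page_stack_push`, `page_stack_pop`); the trick of backing the stack with the freed pages themselves is deliberately not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct page_stack {
    uint32_t *slots;      /* base of the reserved virtual region (assumed mapped) */
    size_t    capacity;   /* number of entries the region can hold */
    size_t    count;      /* number of free pages currently on the stack */
};

/* Push a freed physical page. Returns 0 on success, -1 if the region is full. */
int page_stack_push(struct page_stack *s, uint32_t phys)
{
    assert((phys & 0xFFFu) == 0);           /* must be page aligned */
    if (s->count == s->capacity)
        return -1;
    s->slots[s->count++] = phys;
    return 0;
}

/* Pop a free physical page. Returns 0 when the stack is empty, which
 * assumes physical frame 0 is never handed to the allocator. */
uint32_t page_stack_pop(struct page_stack *s)
{
    if (s->count == 0)
        return 0;
    return s->slots[--s->count];
}
```

Both operations touch only the top of the stack, so allocation and freeing stay constant-time regardless of how much memory is installed.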

mangaluve wrote:Isn't it correct that when I enable/disable paging, I _MUST_ do it from identity-mapped code?

Yes. It also has about the same overhead as a context switch, as it has to flush the TLB and any virtual caches (which I think AMD's L1 caches are, but I don't know about Intel's).

mangaluve wrote:I'd like to load the physical page allocator before the kernel. Fine, I load it somewhere. But then (in another stage, or in the kernel), I want to enable paging. But I still need to reach the page-allocator code somehow. So I have to map the page-allocator to virtual memory. But since the page-allocator must be able to run without paging (when I boot), it has to be identity mapped? Ideally, the page allocator shouldn't know anything at all about paging.

There are a couple of solutions I can think of off the top of my head:
- Everything pre-paging uses a separate allocator (required for bootloaders such as GRUB which don't necessarily boot your kernel)
- Use the Higher Half With GDT trick
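For the second option, the arithmetic behind the trick is just 32-bit wraparound: the segment base is chosen so that base + link address lands on the load address. A small sketch, with hypothetical helper names and with the example addresses (linked at 0xC0100000, loaded at 0x00100000) picked purely for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Segment base that makes code linked at kernel_virt run correctly while it
 * is physically loaded at kernel_phys and paging is still off. Unsigned
 * 32-bit arithmetic wraps modulo 2^32, which is the same wraparound the
 * segment base + offset addition relies on. */
uint32_t gdt_base_for(uint32_t kernel_virt, uint32_t kernel_phys)
{
    return kernel_phys - kernel_virt;
}

/* Linear address produced for `offset` through a segment with this base. */
uint32_t linear_address(uint32_t segment_base, uint32_t offset)
{
    return segment_base + offset;
}
```

With those example addresses the base comes out as 0x40000000. After paging is enabled with the kernel mapped at its link address, the segments are reloaded with base 0 and the allocator code keeps running at the same virtual addresses.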

mangaluve · Post by **mangaluve** » Thu Mar 04, 2010 2:08 pm

Thanks! I'm still not sure how to do it, though... I'd really like to know a way to store the free pages in a linked list; isn't that the way Linux does it? I'm mostly just curious about how it can be done, even if I choose the stack approach.

osdnlo · Post by **osdnlo** » Thu Mar 04, 2010 5:57 pm

Wouldn't a linked list just make performance lower?

Brendan · Post by **Brendan** » Fri Mar 05, 2010 1:00 am

Hi,

osdnlo wrote:Wouldn't a linked list just make performance lower?

Depending on how it's done, no.

To free a page, you do "new_free_page->next = top_free_page; top_free_page = new_free_page;" and to allocate a page you do "allocated_page = top_free_page; top_free_page = allocated_page->next;". There's never any searching/sorting (it's "O(1)" for single pages and "O(n)" for n pages), it can be done with as little as one variable ("top_free_page") and doesn't touch many cache lines.

The problem is that they're physical pages, which makes it tricky to access them using virtual addressing. If you map all free pages into kernel space (for example) then this problem disappears. Otherwise there are several tricks that could be used.

For example:

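#include <stddef.h>

// Physical address of the first page on the free list (NULL when the list is empty)
static void *top_free_page = NULL;

// Page must remain mapped into the virtual address space until after it's freed

void free_phys_page(void *virtual_address, void *physical_address) {
    // Acquire re-entrancy lock if necessary
    *(void **)virtual_address = top_free_page;   // Old head becomes this page's "next" link
    top_free_page = physical_address;            // This page becomes the new head
    // Release re-entrancy lock if necessary
}


// Page must be mapped into the virtual address space, then allocated

void *start_alloc_phys_page(void) {
    // Acquire re-entrancy lock if necessary
    return top_free_page;                        // Physical address the caller must map next
}

void complete_alloc_phys_page(void *virtual_address, void *physical_address) {
    (void)physical_address;                      // Not needed to unlink; kept to mirror free_phys_page()
    top_free_page = *(void **)virtual_address;   // Page's "next" link becomes the new head
    // Release re-entrancy lock if necessary
}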

This complicates things a little, and causes an unavoidable TLB miss when a page is allocated (although often pages are accessed soon after they're allocated so the TLB miss would happen a little later anyway).

Cheers,

Brendan

mangaluve · Post by **mangaluve** » Fri Mar 05, 2010 2:01 am

Thanks!

Brendan, in your OS, you mentioned that you first set up a physical page allocator, then you add paging in the next stage and so on. The physical page allocator has to be used later on in your code (for instance when you want to add a page to the kernel heap). But how do you call this code, which was created before paging "existed", from paged code? Are you identity-mapping it in the kernel space? Are there any parts of an OS that _HAVE_ to be identity mapped? After all, it's recommended that you put the kernel in the upper part of virtual memory...

As for the linked list, perhaps it is sufficient to map the first free page to a fixed virtual location? And when I manipulate the page pointers, I could temporarily map them or something...
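One way that "temporarily map them" idea might look is sketched below. It runs on a host by faking physical memory with an array, so `fake_ram` and `map_temp` are stand-ins invented for this example: in a real kernel `map_temp` would point one reserved page table entry at the given frame, invalidate that TLB entry, and return the window's fixed virtual address. `NO_PAGE` as the list terminator is also an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u
#define NUM_PAGES 8u
#define NO_PAGE   0xFFFFFFFFu   /* list terminator; assumed never a valid frame address */

/* Stand-in for RAM so the sketch runs on a host; a kernel has no such array. */
static uint32_t fake_ram[NUM_PAGES][PAGE_SIZE / sizeof(uint32_t)];

/* Hypothetical helper: returns a pointer through which the frame at
 * `phys` can be accessed (the "fixed virtual location" in a kernel). */
static uint32_t *map_temp(uint32_t phys)
{
    assert(phys % PAGE_SIZE == 0 && phys / PAGE_SIZE < NUM_PAGES);
    return fake_ram[phys / PAGE_SIZE];
}

/* Physical address of the first free page, or NO_PAGE if there is none. */
static uint32_t top_free_page = NO_PAGE;

void free_page(uint32_t phys)
{
    uint32_t *window = map_temp(phys);
    window[0] = top_free_page;      /* first 4 bytes of the page hold the next link */
    top_free_page = phys;
}

uint32_t alloc_page(void)
{
    uint32_t phys = top_free_page;
    if (phys == NO_PAGE)
        return NO_PAGE;             /* out of memory */
    uint32_t *window = map_temp(phys);
    top_free_page = window[0];      /* unlink the head */
    return phys;
}
```

The allocator only ever deals in physical addresses here; the single window is the one place where paging leaks into it.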

OSDev.org
