physical memory manager
Posted: Tue Mar 04, 2014 2:53 am
by teodori
Hello, I have a question concerning the physical memory manager. After entering 64-bit long mode, my kernel reads the memory map made by INT 0x15 with EAX=0xE820. I extract the addresses of all usable physical memory pages, but I don't know where to put them. I want to use the stack-based approach. Should I set a pointer at a given place in memory, then add an address and increase the pointer?
Re: physical memory manager
Posted: Tue Mar 04, 2014 3:17 am
by Brendan
Hi,
teodori wrote:Hello, I have a question concerning the physical memory manager. After entering 64-bit long mode, my kernel reads the memory map made by INT 0x15 with EAX=0xE820. I extract the addresses of all usable physical memory pages, but I don't know where to put them. I want to use the stack-based approach. Should I set a pointer at a given place in memory, then add an address and increase the pointer?
Typically, the stack method is actually a singly linked list. You store the physical address of the first page of the list somewhere (e.g. a "top of stack" variable in the kernel's data), and the first 4 or 8 bytes of each free physical page contain the physical address of the next physical page in the linked list (or some sort of end-of-list marker if it's the last page, like 0xFFFFFFFF, because 0x00000000 is a valid physical address).
Once paging is enabled, to allocate a page you get its physical address from the "top of stack" variable and map the physical page into the virtual address space, then fetch the "next" field from the page and update the "top of stack" variable. To free a page you do the reverse - you store the previous "top of stack" value in the page, then unmap the page, then set the "top of stack" variable to the physical address of the page you just freed.
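A minimal sketch of the alloc/free steps described above, written as user-space C so it can be run and tested. The array `ram` stands in for physical memory (a real kernel would map the page before touching it), and the names `free_page`, `alloc_page` and `END_OF_LIST` are hypothetical. It assumes an 8-byte "next" field with an all-ones end marker.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE   4096u
#define NUM_PAGES   16u          /* size of the simulated RAM (assumption) */
#define END_OF_LIST UINT64_MAX   /* 0 is a valid physical address, so it cannot be the marker */

/* Simulated physical RAM. In a real kernel the page would be mapped into
 * the virtual address space before its first bytes are read or written. */
static uint8_t ram[NUM_PAGES * PAGE_SIZE];

/* Physical address of the first free page, or END_OF_LIST if none. */
static uint64_t top_of_stack = END_OF_LIST;

/* Push a free page: store the old top in the page, then make the page the new top. */
void free_page(uint64_t phys)
{
    memcpy(&ram[phys], &top_of_stack, sizeof top_of_stack);
    top_of_stack = phys;
}

/* Pop a free page: read its "next" field and make that the new top.
 * Returns END_OF_LIST when no free pages are left. */
uint64_t alloc_page(void)
{
    uint64_t phys = top_of_stack;
    if (phys != END_OF_LIST)
        memcpy(&top_of_stack, &ram[phys], sizeof top_of_stack);
    return phys;
}
```

Pages come back in last-in, first-out order, and only the page at the top of the stack is ever touched.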
Also note that, because some PCI devices can only work with 32-bit physical addresses, you're probably going to need a minimum of 2 different free page stacks. Where possible you'd use the stack for RAM above 0x0000000100000000 (and if this stack is empty, or if a device driver requires a page with a 32-bit physical address, then you use the other stack).
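The allocation policy for two stacks might look roughly like this. It is only a sketch: the array-backed `struct page_stack` is a stand-in for a real free page stack (which would keep its links inside the free pages), and `push_free_page`, `alloc_page_any` and `NO_PAGE` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NO_PAGE   UINT64_MAX
#define MAX_PAGES 64
#define LIMIT_4G  0x100000000ULL

/* Stand-in for a free page stack, kept in an array for simplicity. */
struct page_stack {
    uint64_t pages[MAX_PAGES];
    size_t   count;
};

static struct page_stack low_stack;   /* pages with 32-bit physical addresses */
static struct page_stack high_stack;  /* pages at or above 4 GiB */

static uint64_t pop(struct page_stack *s)
{
    return s->count ? s->pages[--s->count] : NO_PAGE;
}

/* Put a free page on the stack that matches its physical address. */
void push_free_page(uint64_t phys)
{
    struct page_stack *s = (phys < LIMIT_4G) ? &low_stack : &high_stack;
    if (s->count < MAX_PAGES)
        s->pages[s->count++] = phys;
}

/* need_32bit: the caller (e.g. a driver for a 32-bit-only PCI device)
 * requires a physical address below 4 GiB. */
uint64_t alloc_page_any(bool need_32bit)
{
    if (!need_32bit) {
        /* Prefer high memory so that low pages stay available for devices. */
        uint64_t phys = pop(&high_stack);
        if (phys != NO_PAGE)
            return phys;
    }
    return pop(&low_stack);
}
```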
Of course there are reasons (scalability, NUMA support, etc.) to have many free page stacks. For example, I normally end up with more than 1024 free page stacks (mostly due to "page colouring").
Cheers,
Brendan
Re: physical memory manager
Posted: Tue Mar 04, 2014 4:03 am
by teodori
Thank you. But if I use a linked list and store the next entry in a physical page, I can't use DMA because it will overwrite an entry and break my linked list. Is the linked list you are speaking of stored in a reserved memory space? I thought about 2 different address types: below 16 MiB, and all the rest above. So I need at least one more for PCI devices that have no more than 32 bits for memory addressing.
When my kernel is loaded, 64-bit long mode is already active. The first 32 MiB are mapped to the first 32 MiB of physical memory. All paging tables are stored from 0x1800000.
Re: physical memory manager
Posted: Tue Mar 04, 2014 4:36 am
by Brendan
Hi,
teodori wrote:Thank you. But if I use a linked list and store the next entry in a physical page, I can't use DMA because it will overwrite an entry and break my linked list.
If you're using DMA to transfer data to/from free pages then your code has serious bugs (e.g. you've forgotten to allocate the pages, which would remove them from the free page stack).
teodori wrote:Is the linked list you are speaking of stored in a reserved memory space?
The "next" links are stored in the free pages themselves. There's no need to have a special reserved area for this (and also no need to have the entire list mapped into the virtual address space because you're only using the "top of stack" and never inserting pages into the middle).
teodori wrote:I thought about 2 different address types: below 16 MiB, and all the rest above. So I need at least one more for PCI devices that have no more than 32 bits for memory addressing.
Because of ISA DMA restrictions (e.g. where you need to search for "n physically contiguous pages that don't cross a 64 KiB boundary"), I'd use a bitmap or something for RAM below 16 MiB. I'd use the same bitmap for all cases where you need physically contiguous pages (fortunately these cases are rare - sane PCI hardware has scatter-gather to avoid the need for physically contiguous pages).
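As an illustration of the bitmap idea, here is a rough sketch of a search for "n physically contiguous pages that don't cross a 64 KiB boundary" in the first 16 MiB. The one-bit-per-page layout and the names `free_bitmap`, `bitmap_set_free` and `bitmap_alloc_contiguous` are assumptions made for the example; it uses a simple linear scan rather than anything clever.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LOW_PAGES     4096u    /* 16 MiB / 4 KiB */
#define PAGES_PER_64K 16u      /* 64 KiB / 4 KiB */
#define NOT_FOUND     SIZE_MAX

/* One bit per page below 16 MiB; 1 = free, 0 = used or not RAM. */
static uint8_t free_bitmap[LOW_PAGES / 8];

static bool page_is_free(size_t page)
{
    return (free_bitmap[page / 8] >> (page % 8)) & 1u;
}

void bitmap_set_free(size_t page, bool is_free)
{
    if (is_free)
        free_bitmap[page / 8] |= (uint8_t)(1u << (page % 8));
    else
        free_bitmap[page / 8] &= (uint8_t)~(1u << (page % 8));
}

/* Find and allocate n contiguous free pages that stay inside a single
 * 64 KiB window. Returns the first page number, or NOT_FOUND. */
size_t bitmap_alloc_contiguous(size_t n)
{
    if (n == 0 || n > PAGES_PER_64K)
        return NOT_FOUND;

    for (size_t start = 0; start + n <= LOW_PAGES; start++) {
        /* Skip candidates whose first and last page lie in different windows. */
        if (start / PAGES_PER_64K != (start + n - 1) / PAGES_PER_64K)
            continue;

        size_t i = 0;
        while (i < n && page_is_free(start + i))
            i++;
        if (i == n) {
            for (i = 0; i < n; i++)
                bitmap_set_free(start + i, false);
            return start;
        }
    }
    return NOT_FOUND;
}
```

The physical address of the returned run is the page number multiplied by 4096.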
teodori wrote:When my kernel is loaded, 64-bit long mode is already active. The first 32 MiB are mapped to the first 32 MiB of physical memory. All paging tables are stored from 0x1800000.
That will make building the free page stacks a tiny little bit harder - e.g. you'd need to map the pages somewhere somehow (e.g. using a temporary page table), then (for each page) call your normal "freePage()" function to unmap the page and add it to the appropriate free page stack (or bitmap).
Of course if your boot code is yours, for physical pages with 32-bit addresses it's easier to do it before starting your kernel (e.g. build the free page stack/s for pages with 32-bit physical address while in protected mode). Typically this is something I'd do early anyway; so that I can use the free page stack to allocate pages that are used for paging tables, the kernel, etc. That way I know I'm allocating "known good" RAM and (e.g.) if the firmware says that an area of RAM is faulty I'm not blindly loading the kernel into faulty RAM.
Cheers,
Brendan
Re: physical memory manager
Posted: Tue Mar 04, 2014 5:58 am
by teodori
Hm, I am completely lost... Don't you have some code examples?
Here is my function displaying the memory map:
Code:
void mm_init(){
    uint8_t i, str[32];
    /* Entries are read as 6 dwords each: base, length, type, then one more dword */
    uint32_t* mm_bios_smap = (uint32_t*)MM_BIOS_SMAP_ADDR;
    struct mm_map_t mm_map[32];
    for(i = 0; i < 32; i++){
        /* Widen to 64 bits before shifting; shifting a uint32_t by 32 is undefined */
        mm_map[i].base = ((uint64_t)mm_bios_smap[1] << 32) | mm_bios_smap[0];
        mm_map[i].length = ((uint64_t)mm_bios_smap[3] << 32) | mm_bios_smap[2];
        mm_map[i].type = mm_bios_smap[4];
        /* Stop at the first all-zero entry */
        if( (mm_map[i].base == 0) && (mm_map[i].length == 0) )
            break;
        mm_bios_smap += 6;
        string_from_uint64_hex(mm_map[i].base, str);
        vga_write(0x0, 0xf, str, 80 + i * 80);
        string_from_uint64_hex(mm_map[i].length, str);
        vga_write(0x0, 0xf, str, 80 + i * 80 + 15);
        if(mm_map[i].type == 1)
            vga_write(0x0, 0xf, "Memory", 80 + i * 80 + 30);
        else if(mm_map[i].type == 2)
            vga_write(0x0, 0xf, "Reserved", 80 + i * 80 + 30);
        else if(mm_map[i].type == 3)
            vga_write(0x0, 0xf, "ACPI", 80 + i * 80 + 30);
        else if(mm_map[i].type == 4)
            vga_write(0x0, 0xf, "NVS", 80 + i * 80 + 30);
        else if(mm_map[i].type == 5)
            vga_write(0x0, 0xf, "Unusable", 80 + i * 80 + 30);
        else if(mm_map[i].type == 6)
            vga_write(0x0, 0xf, "Disabled", 80 + i * 80 + 30);
        else
            vga_write(0x0, 0xf, "Undefined", 80 + i * 80 + 30);
    }
}
Here is the part before:
And here before main gets called:
Code:
.code32
# Clear Memory for Paging Data
clrl %eax
movl $0x00006400, %ecx
movl $0x01800000, %edi
rep stosl
# B Register contains Page Table Flags
# Set Entry Present (Bit 0)
# Set Entry Read & Write (Bit 1)
# Set Entry Page-Level Writethrough (Bit 3)
# Set Entry Page-Level Cache Disable (Bit 4)
clrl %ebx
orl $(1<<0), %ebx
orl $(1<<1), %ebx
orl $(1<<3), %ebx
orl $(1<<4), %ebx
# Create 1 PML4 Entry
movl $0x01800000, %edi
movl %edi, %eax
orl %ebx, %eax
addl $0x00001000, %eax
movl %eax, (%edi)
# Create 1 PDP Entry
addl $0x00001000, %edi
movl %edi, %eax
orl %ebx, %eax
addl $0x00001000, %eax
movl %eax, (%edi)
# Create 16 PD Entries
addl $0x00001000, %edi
movl %edi, %eax
orl %ebx, %eax
addl $0x00001000, %eax
movl $0x00000010, %ecx
movl %edi, %edx
create_pd_entry:
movl %eax, (%edi)
addl $0x00001000, %eax
addl $0x00000008, %edi
loop create_pd_entry
movl %edx, %edi
# Create 8192 PT Entries
addl $0x00001000, %edi
clrl %eax
orl %ebx, %eax
movl $0x00002000, %ecx
create_pt_entry:
movl %eax, (%edi)
addl $0x00001000, %eax
addl $0x00000008, %edi
loop create_pt_entry
# Use Model Specific Register 0xc0000080
# Read from Model Specific Register
# Enable System-Call Extension (Bit 0)
# Enable Long Mode (Bit 8)
# Enable No Execute (Bit 11)
# Write to Model Specific Register
movl $0xc0000080, %ecx
rdmsr
orl $(1<<0), %eax
orl $(1<<8), %eax
orl $(1<<11), %eax
wrmsr
# Read from Control Register 4
# Enable Physical Address Extension (Bit 5)
# Enable Page Global Enabled (Bit 7)
# Write to Control Register 4
movl %cr4, %eax
orl $(1<<5), %eax
orl $(1<<7), %eax
movl %eax, %cr4
# Set PML4 address
# Set Page-Level Writethrough (Bit 3)
# Set Page-Level Cache Disable (Bit 4)
# Write to Control Register 3
movl $0x01800000, %eax
orl $(1<<3), %eax
orl $(1<<4), %eax
movl %eax, %cr3
# Read from Control Register 0
# Enable Paging (Bit 31)
# Write to Control Register 0
movl %cr0, %eax
orl $(1<<31), %eax
movl %eax, %cr0
# Load Global Descriptor Table
# Load Interrupt Descriptor Table
lgdt gdt64ptr
lidt idt64ptr
# Setup all Segment Registers
# and reload Code Segment, Instruction Pointer
movw $0x0020, %ax
movw %ax, %ds
movw %ax, %es
movw %ax, %fs
movw %ax, %gs
movw %ax, %ss
ljmp $0x0008, $jump_to_lm
jump_to_lm:
.code64
# Setup the stack
movq $0x7e00, %rbp
movq %rbp, %rsp
# Call Main Function
call main
Re: physical memory manager
Posted: Tue Mar 04, 2014 6:31 am
by Brendan
Hi,
Ok; what I'm suggesting is...
Before this part:
Code:
.code32
# Clear Memory for Paging Data
clrl %eax
movl $0x00006400, %ecx
movl $0x01800000, %edi
rep stosl
...while you're in protected mode you'd set up the free page stack/s for physical memory with 32-bit addresses. You'd also implement a little "allocPhysicalPage()" function that allocates pages while you're in 32-bit protected mode.
Then, during the part that sets up paging tables you'd allocate pages (using the "allocPhysicalPage()" function that allocates pages while you're in 32-bit protected mode) rather than using hard-coded physical addresses for those.
Then (after that, not before), you'd have code that loads the kernel. Basically; (in a loop to load the entire kernel), switch to real mode, load "n pages" of the kernel into a buffer, switch back to protected mode, allocate "n pages" (using the "allocPhysicalPage()" function again) and copy from the buffer into those pages and map them into the paging tables.
Also, do the same for anything else that you've used hard-coded physical addresses for (e.g. kernel stack, kernel's ".bss", the memory map itself, etc). Now you've got a system where most things are dynamically allocated rather than hard-coded; that can tolerate things like faulty RAM at an unfortunate address without crashing and/or refusing to boot.
When you start the kernel you'd pass the physical address of the first page on the free page stack to it. That way the kernel can use it to allocate physical pages (with 32-bit addresses) as soon as it's started.
For pages below 0x01000000 (if you're using a bitmap for those) and pages above 0x0000000100000000; you'd implement functions to free a page; then you'd have another "freePhysicalPage()" function that uses the page's physical address to decide if the page should be put in the bitmap or on the "32-bit free page stack" or on the "64-bit free page stack"; and then call that allocator's function to free the page. Then you can implement a "freeVirtualPage()" function (which calls the "freePhysicalPage()" function).
Once all that's done, you'd scan the memory map and find pages that aren't already in the "32-bit free page stack" (e.g. usable pages below 0x01000000 and usable pages above 0x0000000100000000). For each of these pages, you'd map them temporarily into the virtual address space, then call your "freeVirtualPage()" function (which will free the pages and add them to whichever free physical memory manager is suitable). This makes it easy to initialise both the bitmap and the "64-bit free page stack" with very little code (given that you've already got code to free physical pages that you will need anyway).
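The scan-and-free step could be sketched as below. This is an illustration only: `struct mem_region` is a simplified stand-in for a memory map entry, the three allocators are replaced by counters so the dispatch is visible, the temporary mapping step is left out, and pages occupied by the kernel or boot data are not excluded. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL
#define LIMIT_16M 0x01000000ULL
#define LIMIT_4G  0x100000000ULL

struct mem_region {       /* simplified memory map entry */
    uint64_t base;
    uint64_t length;
    uint32_t type;        /* 1 = usable RAM */
};

/* Stand-ins for the three real allocators: they only count pages here. */
static uint64_t bitmap_pages, stack32_pages, stack64_pages;

/* Hand a page to whichever allocator owns its physical address range. */
void free_physical_page(uint64_t phys)
{
    if (phys < LIMIT_16M)
        bitmap_pages++;      /* would set the page's bit in the bitmap */
    else if (phys < LIMIT_4G)
        stack32_pages++;     /* would push onto the 32-bit free page stack */
    else
        stack64_pages++;     /* would push onto the 64-bit free page stack */
}

/* Free every whole page inside a usable region, except pages that the
 * boot code is assumed to have put on the 32-bit stack already. */
void free_usable_regions(const struct mem_region *map, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (map[i].type != 1)
            continue;
        /* Round the start up and the end down to page boundaries. */
        uint64_t start = (map[i].base + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
        uint64_t end   = (map[i].base + map[i].length) & ~(PAGE_SIZE - 1);
        for (uint64_t phys = start; phys < end; phys += PAGE_SIZE) {
            if (phys >= LIMIT_16M && phys < LIMIT_4G)
                continue;    /* already on the 32-bit free page stack */
            free_physical_page(phys);
        }
    }
}
```

Because initialisation goes through the same `free_physical_page` that normal frees use, the bitmap and the 64-bit stack need no separate set-up code.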
Please note that:
- I don't care much about your existing code - it can be changed/deleted and replaced with better code if you want.
- I don't have example code (I have code, but none is suitable as a simple example). If a programmer needs example code then they aren't a programmer (a description of how the software would work should be enough).
- I'm only suggesting one way of doing things. You may choose any other way if you wish.
Cheers,
Brendan
Re: physical memory manager
Posted: Wed Mar 05, 2014 1:23 am
by teodori
Sorry, but I am still lost. I don't know how to use a linked list, while I don't have a malloc function and in this case where do I store a linked list entry? At the beginning of a physical page or at a reserved memory space? If I store it at the beginning of each page, I don't have continuous memory, there will be a tiny space every 0x1000 bytes used. And where do I save my page tables, in a continuous space or dispersed throughout the whole memory? How do I allocate memory of smaller size than 0x1000 bytes when page table flags apply to an entire page? I know how paging works, but not how to implement it correctly.
Re: physical memory manager
Posted: Wed Mar 05, 2014 2:05 am
by Velko
teodori wrote:I don't know how to use a linked list, while I don't have a malloc function and in this case where do I store a linked list entry? At the beginning of a physical page or at a reserved memory space?
At the beginning of a physical page.
teodori wrote:If I store it at the beginning of each page, I don't have continuous memory, there will be a tiny space every 0x1000 bytes used.
Linked list entries should be placed there only for free pages - the pages which are not used for anything else. Once you allocate a page, you remove it from the free list and the linked list entry is not needed anymore. Then you can use the whole page as you see fit.
teodori wrote:And where do I save my page tables, in a continuous space or dispersed throughout the whole memory?
I'm using dispersed physical, contiguous virtual, but that is just my preference.
teodori wrote:How do I allocate memory of smaller size than 0x1000 bytes when page table flags apply to an entire page? I know how paging works, but not how to implement it correctly.
You allocate whole pages and then hand them over to some implementation of malloc(). In turn malloc() should take care of memory chunks smaller than 4K.
Re: physical memory manager
Posted: Wed Mar 05, 2014 2:20 am
by Brendan
Hi,
teodori wrote:Sorry but I am still lost. I don't know how to use a linked list, while I don't have a malloc function and in this case where do I store a linked list entry? At the beginning of a physical page or at a reserved memory space?
You'd store a "physical address of next page on the list" at the beginning of each free physical page on the list.
teodori wrote:If I store it at the beginning of each page, I don't have continuous memory, there will be a tiny space every 0x1000 bytes used.
You don't need physically contiguous memory for 99% of everything. For the remaining 1% (device drivers for old devices that do want contiguous memory) you'd use the bitmap (or something else) for memory below 0x01000000 that you needed for ISA DMA anyway.
For each free page there will be a 4 or 8 byte "physical address of next page on the list", plus 4092 or 4088 bytes of unused RAM. This is fine because the page is free. If you actually wanted to use the page then it wouldn't be on the free page stack to begin with (basically, instead of not using any of that free RAM you'd be using a tiny piece of it while it's free).
teodori wrote:And where do I save my page tables, in a continuous space or dispersed throughout the whole memory?
There's never any need for page tables to be in a physically contiguous area for any reason whatsoever.
teodori wrote:How do I allocate memory of smaller size than 0x1000 bytes when page table flags apply to an entire page. I know how paging works, but not how to implement it correctly.
There are typically 3 layers of memory managers. The first is the physical memory manager (which is what we're talking about here). The second layer is the virtual memory manager (which uses the physical memory manager to allocate/free physical pages).
The third layer is "some sort of heap". It could be "malloc()/free()" and/or "new()/delete()" and/or objstacks and/or whatever else (and may or may not involve a garbage collector). This relies on the virtual memory management (e.g. it might use things like "sbrk()" or "mmap()" to request virtual pages and then split them up into smaller pieces). For normal processes, this has nothing to do with the kernel at all - e.g. different languages have different run-time libraries and do their own thing.
For your kernel's own "some sort of heap", it's your kernel so you can implement whatever you like. For example, you could have a general purpose "kmalloc()" and "kfree()" that works like "malloc()/free()". However, these suck a lot (even in user-space) because you get very poor control over cache locality, and it can be a lot better (and less convenient) to design "special purpose allocators" for each special purpose.
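One simple kind of special purpose allocator is a fixed-size object pool that splits a single page into equal pieces. The sketch below assumes the caller has already obtained a mapped 4 KiB page from the lower layers; `struct pool`, `pool_init`, `pool_alloc` and `pool_free` are hypothetical names, and it handles just one page.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

struct pool {
    void  *free_list;   /* first free object, or NULL when the pool is empty */
    size_t obj_size;
};

/* Carve one page into equally sized objects and chain them together.
 * obj_size must be at least sizeof(void *) and a multiple of the
 * alignment the objects need. */
void pool_init(struct pool *p, void *page, size_t obj_size)
{
    uint8_t *base = page;

    p->free_list = NULL;
    p->obj_size  = obj_size;
    for (size_t off = 0; off + obj_size <= PAGE_SIZE; off += obj_size) {
        void **obj = (void **)(base + off);
        *obj = p->free_list;     /* the link lives inside the free object */
        p->free_list = obj;
    }
}

/* Take one object from the pool, or return NULL if none are left. */
void *pool_alloc(struct pool *p)
{
    void **obj = p->free_list;
    if (obj)
        p->free_list = *obj;
    return obj;
}

/* Give an object back to the pool it came from. */
void pool_free(struct pool *p, void *ptr)
{
    *(void **)ptr = p->free_list;
    p->free_list = ptr;
}
```

Objects of one type end up packed together in the same page, which is one way to get more control over cache locality than a general purpose "kmalloc()" gives.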
Cheers,
Brendan
Re: physical memory manager
Posted: Wed Mar 05, 2014 2:32 am
by teodori
@both: thank you. That helped a lot.
I'll begin writing my physical memory manager now.
Re: physical memory manager
Posted: Tue Mar 11, 2014 5:58 pm
by teodori
I have another question. Should my virtual memory map my physical memory for the memory manager's part? When I browse through my page tables, the addresses I retrieve from a page table entry are all physical addresses. Without that I cannot browse through the page tables, because the addresses I get do not match the virtual addresses. To me this seems like a chicken and egg problem....
Re: physical memory manager
Posted: Tue Mar 11, 2014 9:51 pm
by Brendan
Hi,
teodori wrote:I have another question. Should my virtual memory map my physical memory for the memory manager's part?
As far as I'm concerned; mapping "all of" physical memory into (a part of) the virtual address space is a beginner mistake; because:
- it's not necessary for anything
- it may not fit (e.g. computer with 8 GiB of memory and kernel using PAE with only 4 GiB of virtual address space). Note: this isn't likely to be a problem for long mode any time soon
- kernels that do this tend to rely on it, leading to a kernel that has lost any/all benefits that could've been gained from using "paging tricks" in kernel space
teodori wrote:When I browse through my page tables, the addresses I retrieve from a page table entry are all physical addresses. Without that I cannot browse through the page tables, because the addresses I get do not match the virtual addresses. To me this seems like a chicken and egg problem....
You can map the paging tables (for a virtual address space) directly into that virtual address space; so that you can access everything without caring about any of the physical addresses. There is a tricky way of doing this.
Imagine if you have a physical page of RAM that's used for the PML4, and the last entry in the PML4 points to the physical address of the PML4 itself (so that the CPU thinks the PML4 is a PDPT). In this case:
- For virtual addresses 0xFFFFFFFFFFFFF000 to 0xFFFFFFFFFFFFFFFF:
  - the CPU uses the PML4 entry to find the PDPT (which is the same physical page as the PML4)
  - then uses the PML4 as if it was a PDPT to find the PD (which is the same physical page as the PML4)
  - then uses the PML4 as if it was a PD to find the PT (which is the same physical page as the PML4)
  - then uses the PML4 as if it was a PT to find the page (which is the same physical page as the PML4)
  - and the end result is that virtual addresses 0xFFFFFFFFFFFFF000 to 0xFFFFFFFFFFFFFFFF end up being a map of the PML4 entries
- For virtual addresses 0xFFFFFFFFFFE00000 to 0xFFFFFFFFFFFFFFFF:
  - the CPU uses the PML4 entry to find the PDPT (which is the same physical page as the PML4)
  - then uses the PML4 as if it was a PDPT to find the PD (which is the same physical page as the PML4)
  - then uses the PML4 as if it was a PD to find the PT (which is the same physical page as the PML4)
  - then uses the PML4 as if it was a PT to find the page (which is the same physical page as a PDPT)
  - and the end result is that virtual addresses 0xFFFFFFFFFFE00000 to 0xFFFFFFFFFFFFFFFF end up being a map of all PDPT entries
- For virtual addresses 0xFFFFFFFFC0000000 to 0xFFFFFFFFFFFFFFFF:
  - the CPU uses the PML4 entry to find the PDPT (which is the same physical page as the PML4)
  - then uses the PML4 as if it was a PDPT to find the PD (which is the same physical page as the PML4)
  - then uses the PML4 as if it was a PD to find the PT (which is the same physical page as a PDPT)
  - then uses the PDPT as if it was a PT to find the page (which is the same physical page as a PD)
  - and the end result is that virtual addresses 0xFFFFFFFFC0000000 to 0xFFFFFFFFFFFFFFFF end up being a map of all PD entries
- For virtual addresses 0xFFFFFF8000000000 to 0xFFFFFFFFFFFFFFFF:
  - the CPU uses the PML4 entry to find the PDPT (which is the same physical page as the PML4)
  - then uses the PML4 as if it was a PDPT to find the PD (which is the same physical page as a PDPT)
  - then uses the PDPT as if it was a PD to find the PT (which is the same physical page as a PD)
  - then uses the PD as if it was a PT to find the page (which is the same physical page as a PT)
  - and the end result is that virtual addresses 0xFFFFFF8000000000 to 0xFFFFFFFFFFFFFFFF end up being a map of all PT entries
Basically; it costs nothing to map all paging structures for a virtual address space into a 512 GiB area of that virtual address space, so that you can access everything.
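With the last PML4 entry pointing at the PML4 itself, the virtual address of the entry that translates any given address follows from plain arithmetic on the four ranges listed above. The helper names below are hypothetical, and the sketch assumes 4-level paging with the self-reference in entry 511.

```c
#include <stdint.h>

/* Bases of the four windows when PML4 entry 511 points at the PML4 itself. */
#define PT_WINDOW   0xFFFFFF8000000000ULL  /* all page tables, 512 GiB */
#define PD_WINDOW   0xFFFFFFFFC0000000ULL  /* all page directories, 1 GiB */
#define PDPT_WINDOW 0xFFFFFFFFFFE00000ULL  /* all PDPTs, 2 MiB */
#define PML4_WINDOW 0xFFFFFFFFFFFFF000ULL  /* the PML4 itself, 4 KiB */

/* Each function returns the virtual address of the 8-byte entry that
 * translates vaddr at that level. Only the low 48 bits of vaddr are used. */
uint64_t pte_address(uint64_t vaddr)
{
    return PT_WINDOW + ((vaddr >> 12) & 0xFFFFFFFFFULL) * 8;
}

uint64_t pde_address(uint64_t vaddr)
{
    return PD_WINDOW + ((vaddr >> 21) & 0x7FFFFFFULL) * 8;
}

uint64_t pdpte_address(uint64_t vaddr)
{
    return PDPT_WINDOW + ((vaddr >> 30) & 0x3FFFFULL) * 8;
}

uint64_t pml4e_address(uint64_t vaddr)
{
    return PML4_WINDOW + ((vaddr >> 39) & 0x1FFULL) * 8;
}
```

For example, the page directory entry covering virtual address 0x200000 is found at 0xFFFFFFFFC0000008. Reading an entry through these addresses is only safe when every level above it is present.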
Cheers,
Brendan
Re: physical memory manager
Posted: Wed Mar 12, 2014 3:38 pm
by teodori
Hello Brendan, thank you. Why not use the last entry of every page table to point to its own physical address? That would make it easier to access the page tables.
Re: physical memory manager
Posted: Wed Mar 12, 2014 8:05 pm
by Brendan
Hi,
teodori wrote:Hello Brendan, thank you. Why not use the last entry of every page table to point to its own physical address? That would make it easier to access the page tables.
You can't access a page table unless you know it's present (otherwise you'd get a page fault), so you have to check the page directory entry to see if the page table is present first. Of course you can't check the page directory entry unless you know the page directory is present, so you have to check the PDPT entry; but the PDPT also might not be present, so you have to check the PML4 entry first.
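The top-down order of those checks can be sketched as a loop over the four levels. To keep it runnable outside a kernel, the paging structures are simulated in an array where table i has "physical address" i * 4096; a kernel with a self-mapped PML4 would read the same entries through that mapping instead. The names are hypothetical and large pages (the PS bit) are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

#define PRESENT    0x1ULL
#define ADDR_MASK  0x000FFFFFFFFFF000ULL
#define NUM_TABLES 8

/* Simulated paging structures: table i lives at "physical address" i * 4096. */
static uint64_t tables[NUM_TABLES][512];

static uint64_t *table_at(uint64_t phys)
{
    return tables[phys >> 12];
}

/* Returns true only if every level, from the PML4 entry down to the
 * page table entry, is marked present. */
bool is_page_mapped(uint64_t pml4_phys, uint64_t vaddr)
{
    /* Shifts that select the PML4, PDPT, PD and PT index in turn. */
    static const unsigned shifts[4] = { 39, 30, 21, 12 };
    uint64_t table = pml4_phys;

    for (int level = 0; level < 4; level++) {
        uint64_t entry = table_at(table)[(vaddr >> shifts[level]) & 0x1FF];
        if (!(entry & PRESENT))
            return false;            /* stop: the next level does not exist */
        table = entry & ADDR_MASK;   /* physical address of the next level */
    }
    return true;
}
```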
Also, you'd end up with a "page table mapping" every 2 MiB throughout the virtual address space if you did that. You could have a special area of the virtual address space for page table mappings to avoid having them scattered everywhere; but then you'd have to allocate extra page/s of RAM for the paging structures for that area.
However (as an alternative); you could limit processes and the kernel to 512 GiB of space each; so that there's only one PDPT for user-space and one for kernel-space; and then make an entry in each PDPT point to that PDPT. In the same way you could limit processes and the kernel to 1 GiB of space each, or 2 MiB of space each, or 4 KiB of space each. I'm not sure I'd do this though, as you never know when someone is going to want to (e.g.) memory map a large file (or memory map a large RAID array).
Cheers,
Brendan
Re: physical memory manager
Posted: Mon Mar 17, 2014 12:41 pm
by teodori
Sorry, at first I didn't understand your trick to find the physical address of a page table.
The memory part of a kernel is not so easy... OK, so the moment I create a new page table, I set its last entry to its own physical address. Entries 1 to 511 are used to point to other page tables or to the offset, and entry 512 is used to point to the page table itself. This seems like a pretty smart move.