Question about virtual memory mapping
Posted: Thu May 26, 2022 9:06 pm
by wux
Hello, I'm quite new to OS development and I hope this isn't a dumb question.
I am writing my virtual memory manager with PAE paging. Right now, whenever a page is mapped that does not have an existing page table, the mapping function creates all the necessary structures (PDPT, PDT, PT) on pages provided by the page frame allocator. However, these pages have to be mapped into virtual memory before the structures can be written to them, which obviously creates an issue, as this is all happening within the mapping function itself.
Am I going about this in a terrible way? I have looked around a lot and I can't seem to find anyone running into the same issue, so I might be doing something completely wrong.
Thanks!
Re: Question about virtual memory mapping
Posted: Sat May 28, 2022 8:59 pm
by nullplan
So, basically the problem is how to access memory you only have a physical address of, at least during page frame creation. There are three common approaches:
1. Recursive mapping
You can make one entry in the PDPT point to the PDPT. That way, all paging structures are always available to be tampered with. The downside is that that requires one entire entry in the highest-level paging structure. While in 32-bit non-PAE paging, this would only be the 1024th part of all virtual memory, and in 64-bit paging it would be the 512th part, in 32-bit PAE paging it is one quarter of virtual address space. Also, in 32-bit PAE paging in particular, you would have to align the PDPT to page alignment and make sure the rest of the page is kept as zero for the trick to work. You can also only ever access the current paging structures this way. Changing the structures of another process is going to be hard, and accessing any physical memory besides paging structures will still require temporary mappings, which leads me to:
2. Temporary mapping
You prepare some page mapping and define it as a CPU-local temporary address. Say you use 0xFFFF_F000 as your temp address, then you make sure the last PDPT and PDT entries are always filled with valid addresses, and you have the last PT always mapped somewhere. Then, if you need to change some physical memory that is not yet mapped, you map it to the temporary address, invalidate the TLB for the address, and access it. That way, you can, for example, zero out a page of memory before linking it in the paging structures (else you would have a mapping to god-knows-where for a moment, and could get invalid TLB entries you didn't even realize).
3. Linear mapping
You map as much of physical memory as you need to some starting address in your kernel. E.g. if you have a higher-half kernel, you map up to 1GB of physical memory to 0xC000_0000. That way, getting a virtual address for a physical one is simply an addition. The downside is you can only handle a limited amount of physical memory this way.
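To make the "simply an addition" part concrete, here is a minimal sketch of the translation helpers for the layout in the example above (physical 0 to 1GB mapped at 0xC000_0000). The names LINEAR_BASE, LINEAR_SIZE, phys_is_linear, phys_to_virt and virt_to_phys are made up for illustration, and a real kernel would pick its own window size.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumed layout: physical 0 .. 1 GB is mapped at virtual
 * 0xC0000000 .. 0xFFFFFFFF. All names here are hypothetical. */
#define LINEAR_BASE 0xC0000000u
#define LINEAR_SIZE 0x40000000u   /* 1 GB */

/* True if the physical address is covered by the linear window.
 * Takes 64 bits because PAE physical addresses can exceed 4 GB. */
static inline bool phys_is_linear(uint64_t phys)
{
    return phys < LINEAR_SIZE;
}

/* Physical -> virtual. Only meaningful if phys_is_linear(phys). */
static inline uint32_t phys_to_virt(uint32_t phys)
{
    return phys + LINEAR_BASE;
}

/* Virtual -> physical. Only meaningful for addresses in the window. */
static inline uint32_t virt_to_phys(uint32_t virt)
{
    return virt - LINEAR_BASE;
}
```

With this in place, a freshly allocated page table frame at physical 0x1000 can be zeroed through phys_to_virt(0x1000), which is 0xC0001000, without any extra mapping step. Anything for which phys_is_linear() is false still needs one of the other two approaches.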
In case you are wondering, Linux uses number three but supplements it with a dynamic mapping approach for things where that is possible. That is, it maps the first 896 MB of physical memory to 0xC000_0000, and in that range, physical-to-virtual translation is easy. For the remaining 128 MB of virtual kernel space, the mappings are dynamically swapped out so that other memory may be used. There is, however, a discussion going on about removing that second part from the kernel, because it is really big and complicated (and error-prone), and these days you tend not to have 32-bit systems with lots of physical memory. You have 32-bit systems with little memory (up to 1GB), and systems with more memory tend to be 64-bit, where the same problem really doesn't arise (my current approach can handle up to 64 TB of physical memory, so there is a bit of headroom left).
I would suggest you start with number three for now. Being able to handle 1GB of physical memory is better than nothing, right?
Re: Question about virtual memory mapping
Posted: Wed Jun 01, 2022 1:53 pm
by rdos
I prefer recursive mapping for accessing the page tables. I map both the system page tables and the current process page tables, and thus use two entries in the highest level table for 32-bit paging and four entries for PAE. This is a very small part of the virtual address space (1/512th and 1/256th). Actually, I always reserve 1/256th (16 MB) for simplicity so I can support both 32-bit and PAE. I map the process page tables at 0xBF800000 and the system page tables at 0xC0000000. I set the boundary between user process and kernel at 0xC0000000. Every process has the 0xC0000000 to 0xFFFFFFFF area mapped to the same area, effectively by copying the highest level page entries. The system page mapping is needed because it contains the allocated highest level page entries for kernel space. These are allocated on demand (in the system table), and are copied to the process table mapping when it is accessed.
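As a rough illustration of why a recursive mapping makes the tables easy to reach, here is the address arithmetic for the simplest textbook case: 32-bit non-PAE paging with the last page directory entry pointing back at the page directory. This is only a sketch under that assumption. It is not the 0xBF800000 / 0xC0000000 layout described above, and the names RECURSIVE_SLOT, PT_WINDOW, PD_VADDR, pde_vaddr and pte_vaddr are hypothetical.

```c
#include <stdint.h>

/* Assumption: 32-bit non-PAE paging, 4-byte entries, and entry 1023 of
 * the page directory points at the page directory itself. */
#define RECURSIVE_SLOT 1023u
#define PT_WINDOW (RECURSIVE_SLOT << 22)                 /* 0xFFC00000 */
#define PD_VADDR  (PT_WINDOW | (RECURSIVE_SLOT << 12))   /* 0xFFFFF000 */

/* Virtual address of the page directory entry that covers vaddr. */
static inline uint32_t pde_vaddr(uint32_t vaddr)
{
    return PD_VADDR + (vaddr >> 22) * 4u;
}

/* Virtual address of the page table entry that maps vaddr.
 * Only safe to dereference if the covering PDE is present. */
static inline uint32_t pte_vaddr(uint32_t vaddr)
{
    return PT_WINDOW + (vaddr >> 12) * 4u;
}
```

In this scheme all page tables of the current address space appear as one 4 MB array starting at PT_WINDOW, and the page directory shows up as the last page of that array. That is why it only ever gives access to the currently loaded structures.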
Physical memory is another issue. It's a pretty poor idea to map free physical memory in the virtual address space (at least in 32-bit). A better idea is to create bitmap allocators where each bit corresponds to one 4k page. This way, only 32 bytes of physical & linear address space are required per 1MB of physical memory. 32MB is enough to handle 1TB of physical memory. This also makes it relatively easy (and fast) to allocate 2MB/4MB aligned physical memory that can be mapped at the page directory level. I map the physical memory bitmap at 0xFB000000, with a size of 0x2800000 (40 MB). Thus, I can handle 1.25TB of physical memory.
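The sizing above can be checked with a few lines of C. What follows is a minimal sketch of a one-bit-per-4k-page bitmap, not the allocator described in this post. The type frame_bitmap and the functions bitmap_bytes, frame_in_use, frame_alloc and frame_free are invented for illustration, and the linear scan is the simplest possible search rather than a fast one.

```c
#include <stdint.h>
#include <stddef.h>

#define PAGE_SIZE 4096u

/* Bytes of bitmap needed to track mem_bytes of physical memory,
 * using one bit per 4k page. */
static inline uint64_t bitmap_bytes(uint64_t mem_bytes)
{
    uint64_t pages = mem_bytes / PAGE_SIZE;
    return (pages + 7) / 8;
}

/* Hypothetical allocator state: a set bit means the page is in use. */
typedef struct {
    uint8_t *bits;   /* backing storage, at least (pages + 7) / 8 bytes */
    size_t   pages;  /* number of pages tracked */
} frame_bitmap;

static inline int frame_in_use(const frame_bitmap *b, size_t page)
{
    return (b->bits[page / 8] >> (page % 8)) & 1;
}

/* Finds the lowest free page, marks it used and returns its physical
 * address, or UINT64_MAX if every page is taken. */
static inline uint64_t frame_alloc(frame_bitmap *b)
{
    for (size_t i = 0; i < b->pages; i++) {
        if (!frame_in_use(b, i)) {
            b->bits[i / 8] |= (uint8_t)(1u << (i % 8));
            return (uint64_t)i * PAGE_SIZE;
        }
    }
    return UINT64_MAX;
}

/* Marks the page containing phys as free again. */
static inline void frame_free(frame_bitmap *b, uint64_t phys)
{
    size_t page = (size_t)(phys / PAGE_SIZE);
    b->bits[page / 8] &= (uint8_t)~(1u << (page % 8));
}
```

bitmap_bytes() gives 32 bytes for 1MB and 32MB for 1TB, matching the figures above. Because only the bitmap itself has to be mapped, the free pages never need a virtual address until they are handed out.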
There is seldom any reason to access physical memory directly; you can typically initialize it after you have mapped it into the virtual address space. The only exception is if it is used for memory maps for PCI devices, but then you can map the link blocks in virtual memory. I have a set of support functions that make translating between physical and linear addresses (in both directions) simple. It works by building a local memory allocator in a 4k page, and linking additional pages on demand. The header structure has both physical & linear base addresses for each 4k page used.