Octocontrabass wrote:The only reason I can think of to do this is to access the page tables: you must map the same physical memory as both a page and a table in order to read or write the table.
I don't think pointing at a page table from a PDE or other higher-level directory entry is normally called "mapping". The term comes from the notion of mapping a linear address to a physical address. Merely having a directory present at some level and on some physical page doesn't (at least in my view) count as a mapping. That said, you typically do want such pages mapped in at some linear address, so you can manipulate them.
Ethin, since it sounds at this point like you have a misunderstanding at some conceptual level, I'll try to explain the paging mechanism the way I think of it.
First thing to note is that the mechanism is primarily about mapping linear addresses to physical addresses.
2nd thing to note is that, regardless of how many levels of "paging" you have (5, 4, or fewer), conceptually it works in the same way: each level is a directory (regardless of what it is called) which divides up the linear address range into a number of entries, where each entry specifies the physical address of the next level directory. The exception is the final level, i.e. the page table level, whose entries provide the physical address of the actual page the linear address will map to. To map a linear address to a physical address, you essentially peel off a certain number of bits of the linear address to find the index within the current level directory (starting at the top), and take the physical address of the next level directory from that entry.
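To make the "peeling off bits" concrete, here is a minimal sketch, assuming 4 KiB pages and 9 index bits per level (which is what x86-64 long mode uses). The names level_index and page_offset are made up for illustration, and the level numbering (1 = PT up to 5 = PML5) is just a convention chosen here.

```c
#include <assert.h>
#include <stdint.h>

/* Levels are numbered from the bottom for this sketch:
 * 1 = PT, 2 = PD, 3 = PDPT, 4 = PML4, 5 = PML5. */
enum { PAGE_SHIFT = 12, INDEX_BITS = 9 };

/* Return the 9-bit index that `linear` selects in the directory at `level`. */
static unsigned level_index(uint64_t linear, int level)
{
    assert(level >= 1 && level <= 5);
    return (unsigned)((linear >> (PAGE_SHIFT + INDEX_BITS * (level - 1))) & 0x1FF);
}

/* Return the offset of `linear` within its final 4 KiB page. */
static unsigned page_offset(uint64_t linear)
{
    return (unsigned)(linear & 0xFFF);
}
```

So a walk starts with level_index(addr, 4) (or 5 with 5-level paging) and works down to level 1, with page_offset(addr) added to the page's physical address at the end.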
Ethin wrote:Oh okay, now I understand. Your right, I'd want to have as many regions available as possible. For filling the tables, could I just generate random addresses in bits 51:12 since these are (linear) addresses?
They are not linear addresses. They are physical addresses.
The linear address determines how (by what linear address) the memory will be accessed once it is mapped. The physical address determines what physical memory is actually accessed. Usually when you want to map some memory you either want to map a specific address (eg a framebuffer might be accessible via a specific physical address) or you want to choose an address that is available (not mapped elsewhere). I don't think there's much point randomising this. For the linear address, you may randomise it but must make sure that it's not already in use, so it can't be just completely random.
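As a rough illustration of what goes into an entry, the sketch below builds one from a physical address plus flag bits and extracts the address again. The macro and function names are invented for this example; the bit positions (present = bit 0, writable = bit 1, address in bits 51:12) follow the usual x86-64 entry layout, and other flags are left out.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT   (1ULL << 0)
#define PTE_WRITABLE  (1ULL << 1)
#define PTE_ADDR_MASK 0x000FFFFFFFFFF000ULL  /* bits 51:12 */

/* Combine a physical address with flag bits to form an entry. */
static uint64_t make_entry(uint64_t phys, uint64_t flags)
{
    assert((phys & ~PTE_ADDR_MASK) == 0);  /* 4 KiB aligned and below 2^52 */
    assert((flags & PTE_ADDR_MASK) == 0);  /* flags must not overlap the address */
    return phys | flags;
}

/* Extract the physical address stored in an entry. */
static uint64_t entry_phys(uint64_t entry)
{
    return entry & PTE_ADDR_MASK;
}
```

For example, mapping a framebuffer that sits at a (hypothetical) physical address 0xFD000000 would mean writing make_entry(0xFD000000, PTE_PRESENT | PTE_WRITABLE) into the PTE selected by whatever linear address you picked.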
The complete mapping process is something like:
- find a suitable linear address
- find a suitable physical address
- (step 3) figure out the page table, and the entry within, for the chosen linear address
- write the chosen physical address in that entry
Step (3) is complicated by the fact that to write to the page table, you'll need to have the page table itself mapped into memory. Also "walking the hierarchy" which you referred to earlier is theoretically possible, but you need to be aware that because the entries store physical addresses, you need at each level for the directory page itself to be mapped in to the linear address space at some address so you can read it, and you need some way to determine what that address is.
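Here is a sketch of why the walk needs some physical-to-linear translation at every step. It runs against a small simulated "physical memory" array so it can be tried in user space; phys_to_virt is a hypothetical helper standing in for whatever your kernel uses (identity mapping, a fixed offset, recursive mapping, etc.). It assumes 4 levels and 4 KiB pages only, and ignores large pages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRY_PRESENT 1ULL
#define ADDR_MASK     0x000FFFFFFFFFF000ULL  /* bits 51:12 */
#define SIM_PAGES     8

/* Simulated physical memory: page N lives at physical address N * 4096. */
static uint64_t sim_ram[SIM_PAGES][512];

/* Hypothetical helper: turn the physical address of a directory page into
 * a pointer we can dereference. Entries only ever hold physical addresses,
 * so every level of the walk has to go through something like this. */
static uint64_t *phys_to_virt(uint64_t phys)
{
    assert(phys % 4096 == 0 && phys / 4096 < SIM_PAGES);
    return sim_ram[phys / 4096];
}

/* Walk PML4 -> PDPT -> PD -> PT. Returns 1 and stores the translated
 * physical address on success, or 0 if an entry is not present. */
static int translate(uint64_t pml4_phys, uint64_t linear, uint64_t *phys_out)
{
    uint64_t table_phys = pml4_phys;
    for (int shift = 39; shift >= 12; shift -= 9) {
        uint64_t entry = phys_to_virt(table_phys)[(linear >> shift) & 0x1FF];
        if (!(entry & ENTRY_PRESENT))
            return 0;
        table_phys = entry & ADDR_MASK;  /* next level, or the final page */
    }
    *phys_out = table_phys | (linear & 0xFFF);
    return 1;
}
```

In a real kernel the interesting part is entirely inside phys_to_virt: if a directory page isn't mapped anywhere, there is nothing valid to return.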
Probably the easiest way to start out is:
- go with a single entry in PML4 (and PML5) as you were first thinking; this lets you map 512GB which is easily enough to start with.
- this means you need a single PDPT, but this should have the full 512 entries
- meaning you need 512 PD pages, each with 512 entries
- and therefore you need 512 * 512 PT pages - that's a full gigabyte worth, mind!
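The arithmetic behind those counts can be checked with a few lines. This is just a sketch with made-up names (pts_needed, pt_memory_bytes), assuming 4 KiB pages and 512 entries per table, so each PT covers 2 MiB of linear address space.

```c
#include <assert.h>
#include <stdint.h>

enum { ENTRIES_PER_TABLE = 512 };
#define PAGE_BYTES 4096ULL

/* Number of page tables needed to map `bytes` of linear address space
 * with 4 KiB pages, rounded up. */
static uint64_t pts_needed(uint64_t bytes)
{
    uint64_t per_pt = ENTRIES_PER_TABLE * PAGE_BYTES;  /* 2 MiB per PT */
    return (bytes + per_pt - 1) / per_pt;
}

/* Memory consumed by those page tables alone (one 4 KiB page each). */
static uint64_t pt_memory_bytes(uint64_t bytes)
{
    return pts_needed(bytes) * PAGE_BYTES;
}
```

For the full 512GB, pts_needed gives 512 * 512 = 262144 PTs and pt_memory_bytes gives 1 GiB, matching the figure above. Mapping only 4GB would need 2048 PTs, i.e. 8 MiB.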
It should be obvious that this scheme potentially wastes a heap of memory, so you might choose not to have the full 512 * 512 PTs, but that means you won't be able to map the full 512GB range (at least, not without allocating more PTs later on, or using larger-sized pages). However, you can arrange all the required directory pages in a pretty straightforward layout somewhere in memory, which makes it easy to find them when you want to manipulate the mappings later on. Eg:
- At address P: 1 PML5
- At P+4k: 1 PML4
- At P+8k: 1 PDPT
- At P+12k: a series of 512 PDs
- At P+12k+(512*4k): a series of 512*512 PTs (or fewer, if you don't need a full 512GB worth)
Also, in the beginning it's probably easiest to set up a one-to-one linear-to-physical mapping, i.e. map each linear address to the same physical address. This also means you can easily walk the paging directory hierarchy without worrying about the physical/linear difference, but you don't actually need to, because it's trivial to find the page table entry for a particular linear address: just divide it by 4096, and that's an index into the series of PTs that begin at P+12k+(512*4k), treated as one long array of entries. To get this set up, note that you have 512 PDs each with 512 entries, therefore a total of 512*512 PDEs, laid out so that you can consider them as a single array; the first should refer to the first PT, the 2nd to the 2nd PT, and so on. In the same way, the 512 PDPT entries should refer to the 512 PDs in order.
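A minimal sketch of that setup, treating the PDEs and the PT entries as two flat arrays. The function names and flag macros are invented for the example, and the base addresses are passed in as parameters rather than hard-coded so it doesn't depend on any particular layout. In an identity-mapped kernel the ptes pointer and pt_phys would be the same address; they are separate here only so the sketch can run in user space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PG_PRESENT  (1ULL << 0)
#define PG_WRITABLE (1ULL << 1)
#define PG_SIZE     4096ULL

/* Point the first n_pts PDEs at n_pts consecutive page tables starting at
 * physical address pt_phys, and fill those tables (seen as one flat array
 * of n_pts * 512 entries) so that linear page i maps to physical page i. */
static void identity_fill(uint64_t *pdes, uint64_t *ptes,
                          uint64_t pt_phys, size_t n_pts)
{
    for (size_t i = 0; i < n_pts; i++)
        pdes[i] = (pt_phys + i * PG_SIZE) | PG_PRESENT | PG_WRITABLE;
    for (size_t i = 0; i < n_pts * 512; i++)
        ptes[i] = (i * PG_SIZE) | PG_PRESENT | PG_WRITABLE;
}

/* The PTE for a linear address is entry (linear / 4096) of the flat array.
 * The caller must make sure the address is within the mapped range. */
static uint64_t *pte_for(uint64_t *ptes, uint64_t linear)
{
    return &ptes[linear / PG_SIZE];
}
```

Changing a mapping later is then just a write through pte_for(ptes, addr), followed by a TLB flush for that address on real hardware.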
Once you've got that set up and working, then you can think about improving it and extending it.