Xcp wrote:I'm losing my mind because of this so please be nice with me...
BTDT, so I can relate.
Xcp wrote:
Are pages mapped just "|=" with 3 (present and writable), right? Like... sorry but I'm blocked right now 0.0. I think that I did what you are describing but I was directly reserve a region from the virtual memory manager without passing by a kernel heap allocator. Should I add it?
Slow down a little. At the moment you are trying to enable paging. To that end, it is perfectly acceptable to use a temporary environment that gets fixed up for use with the kernel later when you're trying to actually manage paging.
Xcp wrote:
I find a physical block that best-fits and return a pointer to that region of memory. Once I did this, I make a page setting it as present and to the frame (from 0 to 4MB). Then I add it to the page table. Should I map the page table before accessing it? Always "|=" it with 3?? And before it the page directory too? I don't know what I'm talking about at this point.
That is what the recursive paging hack is all about: You always have all the paging structures mapped, at constant addresses, so yes, mapping is just "set the value at that address to that value". So, if we return to the PAE example from my last post, and eliding any error handling for clarity:
Code:
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_RECURSE 2
/* Entry (x, y, z) as seen through the recursive map at 0x80000000. */
#define PTE(x, y, z) (*(volatile uint64_t*)(0x80000000 | ((x) << 21) | ((y) << 12) | ((z) << 3)))
#define PF_PRESENT (1ull << 0)
#define PF_WRITABLE (1ull << 1)
#define PF_USER (1ull << 2)

uint64_t alloc_phys_page(void); /* have fun writing that one. */

void map_vaddr(uint32_t vaddr, uint64_t paddr, size_t len, uint64_t flags) {
    /* Round the range out to whole pages. */
    uint32_t al = (vaddr & 4095);
    len += al;
    vaddr -= al;
    paddr -= al;
    assert(!(paddr & 4095));
    len = (len + 4095) & -4096;
    while (len) {
        int pdpi = (vaddr >> 30) & 3;
        int pdi = (vaddr >> 21) & 0x1ff;
        int pti = (vaddr >> 12) & 0x1ff;
        assert(pdpi != PTE_RECURSE);
        /* Allocate missing intermediate tables. This assumes alloc_phys_page()
           returns zeroed pages; otherwise clear the new table before using it. */
        if (!(PTE(PTE_RECURSE, PTE_RECURSE, pdpi) & PF_PRESENT))
            PTE(PTE_RECURSE, PTE_RECURSE, pdpi) = alloc_phys_page() | PF_PRESENT | PF_WRITABLE;
        if (!(PTE(PTE_RECURSE, pdpi, pdi) & PF_PRESENT))
            PTE(PTE_RECURSE, pdpi, pdi) = alloc_phys_page() | PF_PRESENT | PF_WRITABLE;
        PTE(pdpi, pdi, pti) = paddr | flags;
        /* The "m" operand must be the byte at vaddr, not the pointer value. */
        asm volatile("invlpg %0" : : "m"(*(volatile char*)vaddr) : "memory");
        vaddr += 4096;
        paddr += 4096;
        len -= 4096;
    }
}
The flags argument is 64 bits to support the NX bit, for which you need PAE, by the way. And the invlpg is only there to support remapping an area. Still, the code might cause spurious page faults, which you might need to handle (that is, you will get a page fault and then notice that the address is actually mapped in, so you just return from the page fault handler).
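To make the index arithmetic concrete, here is a small user-space sketch (not kernel code) that splits a virtual address the same way map_vaddr does and computes where the matching entries would show up in the recursive window. The names split_vaddr, pte_addr and RECURSE_BASE are made up for this illustration; the 0x80000000 base and slot 2 are taken from the example above.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_RECURSE  2u           /* recursive slot, as in the example above */
#define RECURSE_BASE 0x80000000u  /* PTE_RECURSE << 30 */

/* Hypothetical helper: the three PAE indices of a 32-bit virtual address. */
struct pae_index {
    uint32_t pdpi;  /* bits 31..30: which 1GB region      */
    uint32_t pdi;   /* bits 29..21: page directory index  */
    uint32_t pti;   /* bits 20..12: page table index      */
};

static struct pae_index split_vaddr(uint32_t vaddr)
{
    struct pae_index ix;
    ix.pdpi = (vaddr >> 30) & 3;
    ix.pdi  = (vaddr >> 21) & 0x1ff;
    ix.pti  = (vaddr >> 12) & 0x1ff;
    return ix;
}

/* Hypothetical helper: address of the 8-byte entry (x, y, z) inside the
 * recursive window. This is the same arithmetic as the PTE() macro, but it
 * returns the address instead of dereferencing it. */
static uint32_t pte_addr(uint32_t x, uint32_t y, uint32_t z)
{
    assert(x < 4 && y < 512 && z < 512);
    return RECURSE_BASE | (x << 21) | (y << 12) | (z << 3);
}
```

For vaddr 0xc0100000 this gives pdpi 3, pdi 0 and pti 256, so pte_addr(3, 0, 256) is 0x80600800 (the page table entry) and pte_addr(PTE_RECURSE, 3, 0) is 0x80403000 (the directory entry pointing at that page table).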
Xcp wrote:Here you're talking about the process A that starts from 0x00100000. Is this because the first megabyte is always identity mapped (the pointer to the page table should always be the same as the one for the kernel)?
Nope. You need the identity map only to turn paging on and jump to the kernel (higher half). Then you can remove it. You only ever need it again for SMP trampolines (though the other processors have their own CR3 and thus can have their own paging structures) or to turn paging off again, should you wish to shut down with APM. Most BIOSes these days don't support APM, though.
Xcp wrote:I think that I just set up 4MB pages and not PAE, however I'm considering it. I've read something about. Before I thought it was ok to not implement it, but now I think it's good and useful, isn't it?
You are entirely correct. 0x10 is bit 4, which is the PSE bit. PAE is bit 5, which would be 0x20.
PSE defines bit 7 of the page directory entry to be the page size bit. If set, the page directory entry defines a 4MB translation (else it's a normal 4kB translation). And it appears (I hadn't read that part of the handbook yet) that AMD extended PSE paging to allow for a 40-bit physical address space when using 4MB paging.
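In case it helps to see the bits written down, here is a minimal sketch of the constants involved and of building a 4MB page directory entry. The macro and function names are my own, not from any header, and the helper only covers physical addresses below 4GB (it ignores the extended high address bits mentioned above).

```c
#include <assert.h>
#include <stdint.h>

/* CR4 feature bits (names are illustrative). */
#define CR4_PSE (1u << 4)   /* 0x10: page size extension, allows 4MB pages */
#define CR4_PAE (1u << 5)   /* 0x20: physical address extension            */

/* Page directory entry bits for 32-bit, non-PAE paging. */
#define PDE_PRESENT  (1u << 0)
#define PDE_WRITABLE (1u << 1)
#define PDE_USER     (1u << 2)
#define PDE_PS       (1u << 7)   /* page size: set means a 4MB translation */

/* Hypothetical helper: build a PDE that maps a 4MB page at 'phys'.
 * 'phys' must be 4MB aligned; only 32-bit physical addresses are handled. */
static uint32_t make_4mb_pde(uint32_t phys, uint32_t flags)
{
    assert((phys & 0x3FFFFFu) == 0);
    return phys | flags | PDE_PS;
}
```

With that, make_4mb_pde(0, PDE_PRESENT | PDE_WRITABLE) comes out as 0x83, which is the value used for a supervisor 4MB page at physical address 0.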
So, as long as you are OK with the coarse-grained paging, there is hardly a need for PAE, now.
PAE is useful if you ever decide to go 64-bit, or you want more than 4GB of physical memory without giving up 4kB pages. Or (I don't know about this) maybe AMD's extension to the 4MB page directory entry is not portable to Intel CPUs.
Xcp wrote:
Should I map the addresses near eip so?
Sorry, I was confused. Your code should actually work out. It's just a bit of a hack.
The way it was written, you just add 0xc0000000 to every address in your kernel and are done. Meanwhile, I have my entire kernel linked to the -2GB line, and the code which initializes paging does so by parsing the ELF header. My loader also does not depend on the physical load address of the kernel. I have to have something like that since multiboot 1 does not support 64-bit files.
But for starters your code should work.
Xcp wrote:What are you suggesting in here? I'm wasting 25% of the address space, is it sill worth it?
Depends. What else would you have put there? Remember, we are only talking about virtual space here, you're not wasting any RAM. The kernel is loaded to 0xc0000000, userspace will likely not be able to use memory above 0x80000000 anyway, so you might as well put the recursive map there. There is more than enough space for kernel vspace left after the end of the kernel. And virtual memory is a resource that is wasted if unused. A glass half full is a glass 50% larger than it needs to be!
Xcp wrote:Finally I'm trying to enable 4kb paging, but it triple faults if I not turn on that bit before enable paging. I thought that it was a previous paging set up just to put the kernel on the higher half and then disable it. Then, if I'm not disabling it, I have to manage the BootPageDirectory mapping to make it works. I have tried something but it doesn't seem to change anything.
And all that just from me mixing up the bits in CR4.
No, actually, if you put back the code you had, then your page directory needs to have a 4MB page at physical address 0 right at the start, which it does have. It needs the same entry at index 768. That is, if your kernel is below 3MB in size (it gets loaded to 1MB, right? The end of the kernel must not exceed the 4MB line). So that is:
Code:
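.p2align 12
BootPD: .int 0x83 /* present writable supervisor 4MB page at 0 */
        .fill 767, 4, 0
        .int 0x83 /* same 4MB page again at index 768 (0xc0000000) */
        .fill 254, 4, 0
        /* Recursive entry. It must hold the physical address, so this assumes
           BootPD is linked in the higher half at 0xc0000000 + load address.
           GAS cannot OR a relocatable symbol; since BootPD is page aligned,
           + 3 gives the same result as | 3. */
        .int BootPD - 0xc0000000 + 3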
That should do it. Maps 4MB from 0 to 0 and from 0xc0000000 to 0. And puts a recursive mapping right at the end (conflicts with Xen but who cares). Works so long as the kernel is small enough (its end must be before the 4MB line). If it gets larger, just extend the second mapping (.int 0x00400083; .int 0x00800083; .int 0x00c00083; etc.) and reduce the second repeat count accordingly.
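If you would rather fill the directory at run time than count .fill entries by hand, the same layout can be sketched in C. This is only an illustration under my own assumptions: build_boot_pd and large_pages_needed are made-up names, pd_phys is the physical address of the directory, and kernel_end_phys is the physical address just past the end of the kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PD_ENTRIES  1024u
#define LARGE_PAGE  0x400000u  /* 4MB */
#define KERNEL_SLOT 768u       /* 0xc0000000 >> 22 */
#define PDE_4MB_RW  0x83u      /* present | writable | 4MB page */

/* Number of 4MB pages needed to cover physical memory from 0 up to
 * kernel_end_phys (exclusive). */
static uint32_t large_pages_needed(uint32_t kernel_end_phys)
{
    return (kernel_end_phys + LARGE_PAGE - 1) / LARGE_PAGE;
}

/* Fill 'pd' with the layout described above: an identity map of the first
 * 4MB, the kernel window starting at index 768, and a recursive entry in
 * the last slot. */
static void build_boot_pd(uint32_t pd[PD_ENTRIES], uint32_t pd_phys,
                          uint32_t kernel_end_phys)
{
    uint32_t n = large_pages_needed(kernel_end_phys);
    uint32_t i;

    assert(n >= 1 && KERNEL_SLOT + n < PD_ENTRIES); /* slot 1023 stays free */
    assert((pd_phys & 0xFFFu) == 0);                /* page aligned */

    memset(pd, 0, PD_ENTRIES * sizeof pd[0]);
    pd[0] = PDE_4MB_RW;                             /* 0 -> 0 */
    for (i = 0; i < n; i++)                         /* 0xc0000000 + i*4MB -> i*4MB */
        pd[KERNEL_SLOT + i] = (i * LARGE_PAGE) | PDE_4MB_RW;
    pd[PD_ENTRIES - 1] = pd_phys | 3u;              /* recursive mapping */
}
```

A kernel ending below the 4MB line yields exactly the table from the assembly above. One ending at, say, 9MB gets entries 0x83, 0x00400083 and 0x00800083 at indices 768 to 770.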