mapping reserved memory causes page faults
Posted: Fri Nov 13, 2020 11:21 am
by sj95126
I hope I can explain this problem properly - I'm not in a position to post the code (long story).
My kernel is 64-bit using PML4 paging.
After reading the E820 memory table early in the boot process, I then iterate over it and add any regions marked "reserved" to the page tables as read-only. I do this so that a) those pages can't get reused, and b) I can read from them if necessary (say, if ACPI data is stored there). These pages are mapped into the load-high memory space (e.g. physical page X is mapped as 0xFFFF800000000000+X).
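As a rough illustration of the mapping pass described above, here is a minimal sketch that computes which pages a reserved E820 entry covers and where they would land in the high half. The struct layout follows the 20-byte INT 15h, EAX=E820 entry; the name reserved_span, the HIGH_HALF_BASE constant, and the outward rounding are assumptions for illustration, not the poster's actual code (which was not shared).

```c
#include <stdint.h>

#define PAGE_SIZE      0x1000ULL
#define HIGH_HALF_BASE 0xFFFF800000000000ULL /* assumed load-high offset */
#define E820_RESERVED  2u

/* One 20-byte INT 15h, EAX=E820 entry (packed attribute is GCC/Clang syntax). */
struct e820_entry {
    uint64_t base;
    uint64_t length;
    uint32_t type;
} __attribute__((packed));

/*
 * Hypothetical helper: for a reserved entry, store the first physical page
 * and its high-half virtual address, and return the number of 4 KiB pages
 * that cover the region. Returns 0 for any other type or an empty entry.
 * Does not guard against base + length overflowing 64 bits.
 */
uint64_t reserved_span(const struct e820_entry *e,
                       uint64_t *first_phys, uint64_t *first_virt)
{
    if (e->type != E820_RESERVED || e->length == 0)
        return 0;

    /* Round outward so partially covered pages are included. */
    uint64_t start = e->base & ~(PAGE_SIZE - 1);
    uint64_t end = (e->base + e->length + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);

    *first_phys = start;
    *first_virt = HIGH_HALF_BASE + start;
    return (end - start) / PAGE_SIZE;
}
```

A caller would loop over the returned page count and install one read-only PTE per page with whatever mapping routine the kernel already uses.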
This works just fine on multiple virtual machines - QEMU, Bochs, and VirtualBox - but I end up with a page fault on every physical machine I try. The fault is generally the same on any one machine, but varies from machine to machine - one machine gets an error code of 7, another a 0, etc.
I don't think there's anything wrong with my page table manipulation code, because as I said, this works fine on virtual machines. In fact, I even had one virtual machine map the same reserved areas as a particular physical machine uses, and everything was fine, so it doesn't seem like I'm making page table calculation mistakes.
Note that I'm NOT actually reading these reserved areas, just mapping them. (I mean, yes, the idea is to possibly read them, but it's the mapping, not reading, that's causing a page fault).
Am I missing something obvious? I could even understand some kind of fault if I touched those reserved regions, but I'm only mapping them. It seems like the MMU wouldn't even be involved, if you aren't performing a page translation.
Re: mapping reserved memory causes page faults
Posted: Fri Nov 13, 2020 11:32 am
by Octocontrabass
sj95126 wrote:one machine gets an error code of 7
This error code indicates a page fault in user mode. If you're running in ring 0, that would mean either your exception handler is giving you bogus information or something other than a page fault is calling your exception handler.
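For reference, a small sketch of how the low bits of the page-fault error code break down. The bit positions are architectural; the struct and function names are made up for this example.

```c
#include <stdint.h>
#include <stdbool.h>

/* x86 page-fault error code bits. */
#define PF_PRESENT (1u << 0) /* 0 = page not present, 1 = protection violation */
#define PF_WRITE   (1u << 1) /* 0 = read access, 1 = write access */
#define PF_USER    (1u << 2) /* 0 = supervisor mode, 1 = user mode (CPL 3) */
#define PF_RSVD    (1u << 3) /* reserved bit set in a paging-structure entry */
#define PF_IFETCH  (1u << 4) /* fault was an instruction fetch */

struct pf_info {
    bool present, write, user, reserved_bit, instr_fetch;
};

/* Hypothetical helper: split an error code into named flags. */
struct pf_info decode_pf_error(uint32_t code)
{
    struct pf_info info;
    info.present      = (code & PF_PRESENT) != 0;
    info.write        = (code & PF_WRITE) != 0;
    info.user         = (code & PF_USER) != 0;
    info.reserved_bit = (code & PF_RSVD) != 0;
    info.instr_fetch  = (code & PF_IFETCH) != 0;
    return info;
}
```

With this breakdown, 7 decodes as a user-mode write to a present page, and 0 as a supervisor-mode read of a non-present page.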
Re: mapping reserved memory causes page faults
Posted: Fri Nov 13, 2020 11:33 am
by nullplan
Have you tried reading CR2 to see what access is failing? Have you tried printing the return RIP to see what instruction is faulting? That should give you a clue as to what is actually failing here. Mapping pages should not cause page faults, unless they were mapped incorrectly, so I think such a supposition on your part is just conjecture. Also, "it works on QEMU but fails on real hardware" is something quite common in this forum, and it is usually due to access to uninitialized memory or registers. But that won't help you, so focus on the issue at hand. Find the faulting code and the fault address.
Re: mapping reserved memory causes page faults
Posted: Fri Nov 13, 2020 12:59 pm
by sj95126
nullplan wrote:Have you tried reading CR2 to see what access is failing? Have you tried printing the return RIP to see what instruction is faulting? That should give you a clue as to what is actually failing here. Mapping pages should not cause page faults, unless they were mapped incorrectly, so I think such a supposition on your part is just conjecture. Also, "it works on QEMU but fails on real hardware" is something quite common in this forum, and it is usually due to access to uninitialized memory or registers. But that won't help you, so focus on the issue at hand. Find the faulting code and the fault address.
Yes, I've done all that. That's why it doesn't make sense. The location of the fault is in a section of code that DOESN'T fault on a virtual machine with exactly the same inputs.
I only get page faults when I map reserved memory. If I disable that, all the other memory-related tasks the kernel does - creating user-level processes, allocating pages for data structures, stacks, etc. etc. work just fine. And those tasks use the same routine to add to page tables that the map_reserved() call uses. The code works fine. The page faults make no sense. So I'm at a loss.
Re: mapping reserved memory causes page faults
Posted: Fri Nov 13, 2020 1:07 pm
by nexos
Are you accessing regions of type 2 or 5? On some machines that might cause a #PF (although a #GP would seem more fitting in that case). Just a suggestion.
Re: mapping reserved memory causes page faults
Posted: Fri Nov 13, 2020 1:14 pm
by sj95126
nexos wrote:Are you accessing regions of type 2 or 5? On some machines that might cause a #PF (although a #GP would seem more fitting in that case). Just a suggestion.
I'm *mapping* type 2 but not actually reading from it. I figured there might be some real hardware out there that would #PF if you *read* types 2 or 5, but I'm not actually reading it.
It's strange that the virtual machines wouldn't behave the same way. If the CPU or MMU is supposed to #PF if you come anywhere near reserved memory, they should be replicating that behavior.
Re: mapping reserved memory causes page faults
Posted: Fri Nov 13, 2020 1:28 pm
by nexos
At what point does it page fault? When you add the PTE, or when you flush it to the TLB? I personally don't see any advantage to mapping memory types 2 and 5. They probably don't contain anything useful. It probably works on QEMU because AFAIK, QEMU doesn't create reserved memory regions as it has no need to (i.e., no memory holes, no bad memory blocks, etc).
Re: mapping reserved memory causes page faults
Posted: Fri Nov 13, 2020 2:54 pm
by thewrongchristian
sj95126 wrote:nullplan wrote:Have you tried reading CR2 to see what access is failing?
Yes, I've done all that. That's why it doesn't make sense. The location of the fault is in a section of code that DOESN'T fault on a virtual machine with exactly the same inputs.
What was the actual address in CR2? Did it match the RIP address? Was it 0 or non-zero? What was its alignment?
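To answer questions like these from a raw CR2 value, it can help to split the address into its 4-level paging indices. The following is only a sketch assuming 4 KiB pages and 48-bit virtual addresses; split_va and is_canonical are illustrative names.

```c
#include <stdint.h>

struct va_parts {
    unsigned pml4, pdpt, pd, pt; /* 9-bit table indices */
    unsigned offset;             /* byte offset within the 4 KiB page */
};

/* Split a virtual address into 4-level paging indices (4 KiB pages). */
struct va_parts split_va(uint64_t va)
{
    struct va_parts p;
    p.pml4   = (unsigned)((va >> 39) & 0x1FF);
    p.pdpt   = (unsigned)((va >> 30) & 0x1FF);
    p.pd     = (unsigned)((va >> 21) & 0x1FF);
    p.pt     = (unsigned)((va >> 12) & 0x1FF);
    p.offset = (unsigned)(va & 0xFFF);
    return p;
}

/* With 48-bit virtual addresses, bits 63..47 must all be equal. */
int is_canonical(uint64_t va)
{
    uint64_t top = va >> 47; /* the upper 17 bits */
    return top == 0 || top == 0x1FFFF;
}
```

Printing these fields next to the faulting RIP makes it easier to see which table the access went through and whether the address is aligned the way the code expects.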
sj95126 wrote:
I only get page faults when I map reserved memory. If I disable that, all the other memory-related tasks the kernel does - creating user-level processes, allocating pages for data structures, stacks, etc. etc. work just fine. And those tasks use the same routine to add to page tables that the map_reserved() call uses. The code works fine. The page faults make no sense. So I'm at a loss.
As it's early in the boot process, might it just be some missing corner case in the provisioning of intermediate levels of page directories? It could be that the mapping in question just happens to be on a boundary between page directories, a case that is handled more generally once the kernel is fully initialized.
I know my bootstrap page table is limited in the amount of memory that can be mapped until the bootstrap is complete.
Re: mapping reserved memory causes page faults
Posted: Fri Nov 13, 2020 4:33 pm
by kzinti
"Reserved" memory is just that... reserved. You shouldn't try to map it or read it.
Re: mapping reserved memory causes page faults
Posted: Sat Nov 14, 2020 11:53 am
by sj95126
kzinti wrote:"Reserved" memory is just that... reserved. You shouldn't try to map it or read it.
You would think so, but on the QEMU virtual machine I'm running it on, ACPI data is stored in reserved memory (type 2) not ACPI reclaimable (type 3). The ACPI RSDP is located at 0xf58d0, which is within the range (0xf0000-0xfffff) marked reserved.
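For context, on BIOS systems the RSDP sits on a 16-byte boundary and is identified by the signature "RSD PTR " plus a checksum over its first 20 bytes, which is why 0xf58d0 is a plausible location. A minimal sketch of that scan over an already-mapped buffer follows; the function names are hypothetical and only the ACPI 1.0 portion of the structure is validated.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RSDP_V1_LEN 20 /* size of the ACPI 1.0 RSDP structure */

/* The first 20 bytes of a valid RSDP sum to 0 modulo 256. */
int rsdp_checksum_ok(const uint8_t *p)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < RSDP_V1_LEN; i++)
        sum = (uint8_t)(sum + p[i]);
    return sum == 0;
}

/*
 * Scan a region on 16-byte boundaries for a valid RSDP.
 * Returns the byte offset of the match, or -1 if none is found.
 */
long find_rsdp(const uint8_t *region, size_t len)
{
    for (size_t off = 0; off + RSDP_V1_LEN <= len; off += 16) {
        if (memcmp(region + off, "RSD PTR ", 8) == 0 &&
            rsdp_checksum_ok(region + off))
            return (long)off;
    }
    return -1;
}
```

In a kernel, region would point at the mapped view of 0xE0000-0xFFFFF (or the first KiB of the EBDA), which is exactly why that range has to be readable.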
Re: mapping reserved memory causes page faults
Posted: Sat Nov 14, 2020 2:27 pm
by sj95126
nexos wrote:At what point does it page fault? When you add the PTE, or when you flush it to the TLB? I personally don't see any advantage to mapping memory types 2 and 5. They probably don't contain anything useful. It probably works on QEMU because AFAIK, QEMU doesn't create reserved memory regions as it has no need to (i.e., no memory holes, no bad memory blocks, etc).
The page fault happens while it's populating a PT. I'm not explicitly flushing anything to the TLB because I don't (at least at the point of the page fault) actually access any of those mapped pages.
As I said in another post, the reason I'm mapping them is because QEMU is putting ACPI data in areas marked reserved (type 2). I can't know if any of the physical machines may be doing the same because it's faulting before I get to ACPI.
I went back and looked at one of the page faults (I had to snap photos of the screen to capture the output). The value of CR2 is an address halfway into a PT page. I've populated the PT halfway, so it's not like that's an unmapped range or invalid memory. The page fault code is 9, which means reserved bit violation, but as I said, I've run this same code with the same input values on three different virtual machines, and unless they're all ignoring the reserved bit rules, I can't see how it's a problem with my page table code.
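One way to audit entries for a reserved bit violation is to test each present PTE against the bits the CPU treats as reserved. The sketch below assumes 4-level paging with 4 KiB PTEs, where the address bits from MAXPHYADDR up to bit 51 must be zero and bit 63 is reserved unless EFER.NXE is set. MAXPHYADDR would come from CPUID leaf 0x80000008, EAX[7:0]. The function names are illustrative, and nothing in this thread establishes that this is the actual cause.

```c
#include <stdint.h>

/* Bits that must be zero in a 4 KiB PTE for the given CPU configuration. */
uint64_t pte_reserved_mask(unsigned maxphyaddr, int nxe_enabled)
{
    uint64_t mask = 0;

    /* Physical address bits MAXPHYADDR..51 are reserved. */
    if (maxphyaddr < 52)
        mask = ((1ULL << 52) - 1) & ~((1ULL << maxphyaddr) - 1);

    /* Bit 63 (execute-disable) is reserved while EFER.NXE is clear. */
    if (!nxe_enabled)
        mask |= 1ULL << 63;

    return mask;
}

/* Reserved bits are only checked for entries with P = 1. */
int pte_has_reserved_bits(uint64_t pte, unsigned maxphyaddr, int nxe_enabled)
{
    if ((pte & 1) == 0)
        return 0;
    return (pte & pte_reserved_mask(maxphyaddr, nxe_enabled)) != 0;
}
```

Running a check like this over every entry just before it is written, with the real machine's MAXPHYADDR and NXE state, would show whether any entry differs from what the hardware accepts.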
Anyway, the general consensus here seems to be that while reading reserved memory conceivably could result in a fault, mapping it should not. Whether you should map reserved memory is a different discussion. I guess I could still have some obscure bug that gets tickled by hardware that runs much faster than virtual machines, but this early in the boot process interrupts are still disabled, and I don't enable any additional CPUs yet.
I'll keep looking.
Re: mapping reserved memory causes page faults
Posted: Sat Nov 14, 2020 4:13 pm
by Octocontrabass
sj95126 wrote:this early in the boot process interrupts are still disabled,
Are you sure? That's one of the things I can think of that would explain the nonsense error codes. (Unless you really are in ring 3 already?)
Re: mapping reserved memory causes page faults
Posted: Sat Nov 14, 2020 5:06 pm
by nexos
If you are using GRUB to boot, this may be the problem:
Wiki wrote:Another problem is that the "type" field is defined as "1 = usable RAM" and "anything else is unusable". Despite what the multi-boot specification says, lots of people assume that the type field is taken directly from INT 15h, EAX=E820 (and in older versions of GRUB it is). However GRUB 2 supports booting from UEFI/EFI (and other sources) and code that assumes the type field is taken directly from INT 15h, EAX=E820 will become broken. This means that (until a new multi-boot specification is released) you shouldn't make assumptions about the type, and can't do things like reclaiming the "ACPI reclaimable" areas or supporting S4/hibernate states (as an OS needs to save/restore areas marked as "ACPI NVS" to do that). Fortunately a new version of the multi-boot specification should be released soon which hopefully fixes this problem (but unfortunately, you won't be able to tell if your OS was started from "GRUB-legacy" or "GRUB 2", unless it adopts the new multi-boot header and becomes incompatible with GRUB-legacy).
Re: mapping reserved memory causes page faults
Posted: Sat Nov 14, 2020 6:53 pm
by sj95126
Octocontrabass wrote:sj95126 wrote:this early in the boot process interrupts are still disabled,
Are you sure? That's one of the things I can think of that would explain the nonsense error codes. (Unless you really are in ring 3 already?)
That was my first thought - that's usually what causes this sort of thing.
I haven't run any ring 3 code yet at this point - the kernel is still getting up and running. And I'm definitely sure that interrupts are disabled, because the function where page tables are updated (and where the #PF occurs) starts by saving rflags and doing a cli. (It is inline asm, which gcc has a way of messing up, but I have done an objdump of the .o file and it's doing the right thing.) I don't have proper locks in place yet, so I just disable interrupts and re-enable them afterwards, if they were enabled before.
At this stage of the boot process, there is an IDT, and every handler that's valid (present=1) halts. The interrupts on the PICs haven't been remapped yet, so even if the timer fired, one of the architecture-defined exception handlers would catch it.
Re: mapping reserved memory causes page faults
Posted: Sat Nov 14, 2020 7:38 pm
by Octocontrabass
Inline assembly to mess with rflags should be okay as long as you do it carefully, since your kernel is compiled with -mno-red-zone. Since you haven't remapped the PICs, the only IRQ that could reach your page fault handler without causing a page fault is IRQ6 from the floppy drive, and I suspect you're not using one of those.
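The arithmetic behind the IRQ6 remark, as a tiny sketch: with the BIOS default setup the master PIC delivers IRQ0-7 on vectors 8-15, so IRQ6 arrives on vector 14, the same vector as #PF. A hardware interrupt does not push an error code, so a handler entered this way would read an unrelated stack value as the "error code". The constant and function names here are just for illustration.

```c
#define PIC1_DEFAULT_BASE 0x08u /* master PIC vector base left by the BIOS */
#define VECTOR_PAGE_FAULT 14u   /* #PF exception vector */

/* Vector on which a master-PIC IRQ (0-7) is delivered for a given base. */
unsigned irq_to_vector(unsigned irq, unsigned base)
{
    return base + irq;
}

/* Nonzero if the IRQ would be delivered to the page fault handler. */
int irq_aliases_page_fault(unsigned irq, unsigned base)
{
    return irq_to_vector(irq, base) == VECTOR_PAGE_FAULT;
}
```

After remapping the master PIC to a base of 0x20 or higher, none of IRQ0-7 can alias an exception vector.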
I guess that leaves some fault in the page fault handler itself that causes it to report nonsense instead of the actual error?