Re: mapping reserved memory causes page faults
Posted: Sat Nov 14, 2020 11:00 pm
by sj95126
Octocontrabass wrote:Inline assembly to mess with rflags should be okay since your kernel is compiled with -mno-red-zone as long as you do it carefully. Since you haven't remapped the PICs, the only IRQ that could reach your page fault handler without causing a page fault is IRQ6 from the floppy drive, and I suspect you're not using one of those.
No floppy drives on the physical machines, and I'm not touching a floppy controller (or any other storage controller) before the fault occurs. My bootsect loads the kernel using the BIOS in real mode before entering long mode.
My inline assembly only does this:
Code:
__asm__ __volatile__ ("pushfq ; popq %0 ; cli" : "=r"(rflags) : : );
and later:
if (rflags & 0x200) {
    __asm__ __volatile__ ("sti");
}
I don't like using inline asm for much more than that because of how unreliable it can be. But I have verified from objdump -S that the instructions are exactly what and where they should be.
I guess that leaves some fault in the page fault handler itself that causes it to report nonsense instead of the actual error?
I guess it's possible but because of this and other problems, at the moment my page fault handler is about as simple as you can get:
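Code:
    movabsq $err_msg_page_fault, %rdi   # arg 1: format string
    movq (%rsp), %rsi                   # arg 2: error code pushed by the CPU
    movq 8(%rsp), %rdx                  # arg 3: faulting instruction pointer
    movq %cr2, %rcx                     # arg 4: faulting linear address
    xorq %rax, %rax                     # no vector registers used by the variadic call
    call kprintf
    hlt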
As an example, this would print:
PAGE FAULT: code 9, EIP=0xffff800000012b88, CR2=0xffff8000000202460
(this simple handler could certainly do with more detailed output)
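As a rough sketch of what more detailed output could start from, the error code can be split into its architectural bits. The names `pf_info` and `decode_pf_error` are made up for this illustration and are not part of my kernel. With the value above, code 9 has bit 0 and bit 3 set, which reads as a supervisor-mode read that hit a present page whose paging-structure entry has a reserved bit set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low five bits of the x86 page-fault error code. */
#define PF_PRESENT  (1u << 0)  /* 1 = protection violation, 0 = page not present */
#define PF_WRITE    (1u << 1)  /* 1 = write access, 0 = read access */
#define PF_USER     (1u << 2)  /* 1 = access came from user mode */
#define PF_RESERVED (1u << 3)  /* 1 = reserved bit set in a paging-structure entry */
#define PF_FETCH    (1u << 4)  /* 1 = instruction fetch */

/* Hypothetical container for the decoded bits. */
struct pf_info {
    bool present;
    bool write;
    bool user;
    bool reserved;
    bool fetch;
};

/* Hypothetical helper: split the error code into named flags. */
struct pf_info decode_pf_error(uint64_t code)
{
    struct pf_info info = {
        .present  = (code & PF_PRESENT)  != 0,
        .write    = (code & PF_WRITE)    != 0,
        .user     = (code & PF_USER)     != 0,
        .reserved = (code & PF_RESERVED) != 0,
        .fetch    = (code & PF_FETCH)    != 0,
    };
    return info;
}
```

A handler could pass these flags to its print routine so that a reserved-bit fault (which usually points at a corrupt or uninitialized table entry) stands out from an ordinary not-present fault.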
Re: mapping reserved memory causes page faults
Posted: Sat Nov 14, 2020 11:06 pm
by sj95126
nexos wrote:If you are using GRUB to boot, this may be the problem:
No GRUB - I'm using my own boot sector, legacy BIOS booting.
But as a data point, I did boot up a really old version of Ubuntu (10.04), which uses a pre-GRUB 2 boot loader. It identifies the memory regions exactly the same way I do (reserved, not ACPI reclaimable) and clearly identifies the ACPI data as being located inside those reserved regions. So at least I can be reassured that my code is working properly in that respect.
Re: mapping reserved memory causes page faults
Posted: Sun Nov 15, 2020 1:04 am
by linuxyne
Some suggestions.
- Write a separate utility that does very little except mapping the reserved mem.
- Map using coarser granularities (2MB, 1GB, etc.).
- Avoid using the NX bit.
- Try identity map.
- Try 32-bit paging.
- Try reading the reserved mem without turning on the MMU.
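For the coarser-granularity and no-NX suggestions, a minimal sketch of building a 2 MiB page directory entry might look like the following. The macro and function names are invented for this example; only the bit positions (P = bit 0, R/W = bit 1, PS = bit 7, NX = bit 63) come from the x86-64 paging format. NX is simply never set here.

```c
#include <stdint.h>

#define PTE_PRESENT  (1ULL << 0)
#define PTE_WRITABLE (1ULL << 1)
#define PDE_HUGE     (1ULL << 7)            /* PS bit: this PD entry maps a 2 MiB page */
#define HUGE_2M_MASK 0x000FFFFFFFE00000ULL  /* physical address bits 51:21 */

/* Hypothetical helper: build a PD entry for the 2 MiB-aligned frame
 * that contains phys. Bit 63 (NX) is left clear. */
static inline uint64_t make_pde_2m(uint64_t phys, uint64_t flags)
{
    return (phys & HUGE_2M_MASK) | flags | PDE_HUGE | PTE_PRESENT;
}
```

Because the entry lives directly in the page directory, no page table page has to be allocated for the region, which takes one level of table-update code out of the picture while debugging.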
Re: mapping reserved memory causes page faults
Posted: Sun Nov 15, 2020 4:37 pm
by zaval
The ACPI RSDP is located at 0xf58d0, which is within the range (0xf0000-0xfffff) marked reserved.
this is a correct range for IA-PC RSDP location and it's "read only BIOS". it cannot be ACPI reclaimable memory, but whether it should be reserved is a question too. interestingly, your loader reads it after mapping? what if you tried doing this before, on a physical machine?
Re: mapping reserved memory causes page faults
Posted: Sun Nov 15, 2020 6:47 pm
by Gigasoft
Why does your page fault message contain an extra 0 at the end? Besides, the lower 32 bits of the address look like a typical value for the EFLAGS register when in Virtual 8086 mode, if you ignore the extra 0.
Re: mapping reserved memory causes page faults
Posted: Sun Nov 15, 2020 7:19 pm
by sj95126
Gigasoft wrote:Why does your page fault message contain an extra 0 at the end? Besides, the lower 32 bits of the address look like a typical value for the EFLAGS register when in Virtual 8086 mode, if you ignore the extra 0.
Ah, typo. I typed that by hand, because the only record I have of the crash is snapping a photo of the screen. It should be CR2=0xffff800000202460.
There's nothing wrong with the final digits, despite their resemblance to flags (I don't use VM86 mode). As I start mapping regions, I have to add PDPEs, PDs, and PTs to the paging tables, and my initial page allocator starts handing out pages at 0x200000 (high-address mapped as 0xffff800000200000). The address in question is in a PT, and is the third page handed out (after a page for the PDPE and PD). Hence, between 202000 and 202fff.
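The arithmetic behind that can be written out as a small sketch. It assumes the high mapping is a plain linear offset of 0xffff800000000000, which is what the 0x200000 to 0xffff800000200000 example implies; the helper names are invented for illustration.

```c
#include <stdint.h>

#define KERNEL_BASE 0xffff800000000000ULL  /* assumed linear high-half offset */
#define ALLOC_START 0x200000ULL            /* first page the allocator hands out */
#define PAGE_SIZE   0x1000ULL

/* Physical address behind a high-mapped virtual address. */
static inline uint64_t virt_to_phys(uint64_t virt)
{
    return virt - KERNEL_BASE;
}

/* Zero-based index of the allocator page that contains phys. */
static inline uint64_t alloc_page_index(uint64_t phys)
{
    return (phys - ALLOC_START) / PAGE_SIZE;
}

/* Index of the 8-byte entry within that table page. */
static inline uint64_t table_entry_index(uint64_t phys)
{
    return (phys % PAGE_SIZE) / 8;
}
```

For CR2=0xffff800000202460 this gives physical 0x202460, allocator page index 2 (the third page), and entry 140 within that page.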
Re: mapping reserved memory causes page faults
Posted: Sun Nov 15, 2020 7:27 pm
by sj95126
zaval wrote:
The ACPI RSDP is located at 0xf58d0, which is within the range (0xf0000-0xfffff) marked reserved.
this is a correct range for IA-PC RSDP location and it's "read only BIOS". it cannot be ACPI reclaimable memory, but whether it should be reserved is a question too. interestingly, your loader reads it after mapping? what if you tried doing this before, on a physical machine?
Fair point. I should have given a better example, which is this: the ACPI tables pointed to by the RSDP exist in the range 0x7fe0000-0x7ffffff, which is also marked reserved (type 2).
Re: mapping reserved memory causes page faults
Posted: Sun Nov 15, 2020 10:05 pm
by kzinti
Are you having this problem when mapping the ACPI memory range (0x7fe0000-0x7ffffff)?
In my case, I don't ever map "reserved" memory explicitly, but I certainly am mapping the ACPI memory on QEMU, Bochs and real hardware. I've never had problems like what you describe.
In fact I map the entire first 4 GB at boot time (R/W) and don't have any problems, even with type 2 memory regions in the first 4 GB.
Re: mapping reserved memory causes page faults
Posted: Mon Nov 16, 2020 1:45 am
by Gigasoft
What is the instruction at 0xffff800000012b88?
Re: mapping reserved memory causes page faults
Posted: Mon Nov 16, 2020 8:31 am
by MichaelPetch
As others have mentioned, I'd be concerned about the possibility that an IRQ6 came in. Since you apparently didn't remap the PICs, you could have received such an interrupt in response to a previous disk access. I don't know which interrupts the BIOS uses when you boot from USB in FDD (Floppy Disk Emulation) mode. Do you happen to be doing that on real hardware?
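To spell out why IRQ6 in particular matters: with the PICs left at their BIOS defaults, the master PIC delivers IRQ0-7 on vectors 0x08-0x0F and the slave delivers IRQ8-15 on 0x70-0x77, so IRQ6 arrives on vector 0x0E, the same vector as the page fault exception. A small sketch of that mapping (the function and macro names are just for illustration):

```c
#include <stdint.h>

#define PIC_MASTER_BIOS_BASE 0x08u  /* BIOS default: IRQ0-7  -> vectors 0x08-0x0F */
#define PIC_SLAVE_BIOS_BASE  0x70u  /* BIOS default: IRQ8-15 -> vectors 0x70-0x77 */
#define VECTOR_PAGE_FAULT    0x0Eu  /* #PF exception vector */

/* Vector an IRQ (0-15) is delivered on when the PICs have not been remapped. */
static inline uint8_t bios_irq_to_vector(unsigned irq)
{
    return (uint8_t)(irq < 8 ? PIC_MASTER_BIOS_BASE + irq
                             : PIC_SLAVE_BIOS_BASE + (irq - 8));
}
```

An IRQ arriving on that vector pushes no error code, so a handler that reads the error code from the top of the stack would report whatever happens to be there.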
You could disable all the interrupts on the PICs and flush the pending ones while still in real mode with something like:
Code:
    ; Disable IRQs on the Master and Slave PICs
    mov al, 0xFF                ; Bits that are 1 disable interrupts, 0 = enable
    out 0xA1, al                ; Disable all interrupts on Slave PIC
    out 0x21, al                ; Disable all interrupts on Master PIC
    ; Enable interrupts
    sti
    ; Flush any pending IRQs
    mov cx, 8
    ; Do a loop to allow pending interrupts to be processed.
    ; Execute enough instructions to process all 16 interrupts.
.irqflush:
    dec cx
    jnz .irqflush
As for your page table related problems, are you assuming the memory holding your PTs, PD, PDP, PML4, etc. is already zeroed? On real hardware, memory may not be zeroed out. Another source of problems could be if you assume memory is zero and you haven't correctly zeroed out the BSS area that may be used by the C/C++ code (variables expected to be initialized to zero that have nonzero values could cause bizarre behaviour in your code). That could cause weird and wonderful issues. I see a lot of custom bootloaders that read kernels and forget that step. Don't ever assume the memory you may be using is zero. Maybe that isn't your problem, but I am warning you just in case.
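A minimal sketch of the kind of clearing I mean is below. The function name is made up, and the linker symbols mentioned in the comment (`__bss_start`, `__bss_end`) are assumptions about a typical linker script, not something I know about your build.

```c
#include <stdint.h>

/* Zero every byte in [start, end). The volatile pointer keeps the compiler
 * from turning the loop into a memset call, which a freestanding kernel may
 * not provide this early in boot. */
void zero_region(void *start, void *end)
{
    volatile uint8_t *p = start;
    while (p < (volatile uint8_t *)end) {
        *p++ = 0;
    }
}

/* Typical early-boot use, assuming the linker script exports these
 * (hypothetical) symbols around .bss:
 *
 *     extern char __bss_start[], __bss_end[];
 *     zero_region(__bss_start, __bss_end);
 *
 * The same call can clear each freshly allocated page table page before
 * any entries are written into it. */
```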
Hopefully you aren't using the area that the EBDA sits in just below 0xA0000. That should be considered reserved as well and you shouldn't put your own code or data there.
I know you can't show your code, so I can only make educated guesses.
Re: mapping reserved memory causes page faults
Posted: Tue Nov 17, 2020 4:54 pm
by sj95126
nexos wrote:At what point does it page fault? When you add the PTE, or when you flush it to the TLB?
Your comment has been rattling around in my skull for a few days, and the lightbulb finally went off. The problem isn't that I wasn't flushing data to the TLB, I wasn't flushing data FROM the TLB.
This is so embarrassing.
While booting, after jumping into the high-mapped space, I remove the old low memory mapping from the page tables, but I wasn't invalidating the TLB. That stale entry was sticking around and obscuring a different bug in the page table update code. I don't have a fix for that yet, but disabling the removal of the low memory mapping made things work for now. Unfortunately, it then immediately crashes for a different, unrelated reason, but one thing at a time.
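For anyone who lands here with the same symptom, this is roughly the shape of what was missing, as a sketch and not my actual code. All names are invented for the example. The invalidation step is passed in as a callback so the logic can be tried in user space with the recording stub; in ring 0 the callback would be the `invlpg` wrapper (the instruction is privileged).

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000ULL

/* Callback type so the unmap logic can run with or without a real TLB. */
typedef void (*tlb_invalidate_fn)(uint64_t virt);

#if defined(__x86_64__) && defined(__GNUC__)
/* Kernel-only: drop the cached translation for the page containing virt. */
static inline void invlpg(uint64_t virt)
{
    __asm__ __volatile__ ("invlpg (%0)" : : "r"(virt) : "memory");
}
#endif

/* Hosted stand-in that only records what would have been invalidated. */
size_t   invalidate_count;
uint64_t last_invalidated;
void record_invalidate(uint64_t virt)
{
    invalidate_count++;
    last_invalidated = virt;
}

/* Clear count consecutive entries of one page table starting at first_index,
 * invalidating each virtual page right after its entry is cleared. */
void unmap_pages(uint64_t *pt, size_t first_index, size_t count,
                 uint64_t virt_base, tlb_invalidate_fn invalidate)
{
    for (size_t i = 0; i < count; i++) {
        pt[first_index + i] = 0;
        invalidate(virt_base + i * PAGE_SIZE);
    }
}
```

When a whole top-level entry is removed, as with dropping the low identity mapping, reloading CR3 is the other common option, since it flushes all non-global TLB entries at once.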
Thanks, everyone. Sorry for chasing the wrong issue.
Re: mapping reserved memory causes page faults
Posted: Tue Nov 17, 2020 10:50 pm
by sj95126
MichaelPetch wrote:As for your page table related problems, are you assuming the memory holding your PTs, PD, PDP, PML4, etc. is already zeroed? On real hardware, memory may not be zeroed out. Another source of problems could be if you assume memory is zero and you haven't correctly zeroed out the BSS area that may be used by the C/C++ code (variables expected to be initialized to zero that have nonzero values could cause bizarre behaviour in your code). That could cause weird and wonderful issues. I see a lot of custom bootloaders that read kernels and forget that step. Don't ever assume the memory you may be using is zero. Maybe that isn't your problem, but I am warning you just in case.
Hopefully you aren't using the area that the EBDA sits in just below 0xA0000. That should be considered reserved as well and you shouldn't put your own code or data there.
The page table spaces are definitely cleared. The very first thing my bootsect does is zero out the space between the IVT/BDA and 0x7c00, so I can create page tables and a few other things. The kernel loads at 0x10000 and is still quite small. I don't write anything else below 1 MB.
I would load the kernel a little lower but there's a most obnoxious Bochs bug that was causing me problems if I did that.
Re: mapping reserved memory causes page faults
Posted: Tue Nov 17, 2020 11:11 pm
by Octocontrabass
sj95126 wrote:I would load the kernel a little lower but there's a most obnoxious Bochs bug that was causing me problems if I did that.
Is it the bug where INT 0x13 can't load data across a 64k boundary? If so, that's a feature: some real hardware won't cross 64k boundaries either.
Re: mapping reserved memory causes page faults
Posted: Wed Nov 18, 2020 12:16 am
by sj95126
Octocontrabass wrote:sj95126 wrote:I would load the kernel a little lower but there's a most obnoxious Bochs bug that was causing me problems if I did that.
Is it the bug where INT 0x13 can't load data across a 64k boundary? If so, that's a feature: some real hardware won't cross 64k boundaries either.
It's not that one, but it's similar. It's a weird combination of factors, but basically a multi-sector LBA read to xxxx:0000, where xxxx is just below 1000h, will throw errors when you reach segment 1000h. This doesn't happen with any other value, like, say, segment 2000h.
It happened with a cdrom device; I didn't test with a HD. Bochs doesn't support LBA for floppies.
Re: mapping reserved memory causes page faults
Posted: Wed Nov 18, 2020 4:22 am
by MichaelPetch
sj95126 wrote:It's not that one, but it's similar. It's a weird combination of factors, but basically a multi-sector LBA read to xxxx:0000, where xxxx is just below 1000h, will throw errors when you reach segment 1000h. This doesn't happen with any other value, like, say, segment 2000h.
A value just below 0x1000:0x0000 (let us say 0x0fe0:0x0000) when used with a multisector read would cross the 64KiB DMA barrier at physical address 0x10000.
However, if I had to guess, what is going on here is that your bootloader didn't set SS:SP to some place where the stack wouldn't be clobbered. In Bochs the stack is originally set to start at 0x0000:0x0000, which means the first word pushed on the stack will be at 0x0000:0xfffe. That would be just below physical address 0x10000 as well. If you overwrite the stack while the disk interrupt is executing, it will do weird and wonderful things.
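Both conditions are easy to check ahead of a read. Here is a small sketch with made-up helper names; the sector size is a parameter because a CD-ROM read typically uses 2048-byte sectors where a hard disk uses 512.

```c
#include <stdbool.h>
#include <stdint.h>

/* Real-mode linear address: segment * 16 + offset. */
static inline uint32_t linear_addr(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/* True if reading `sectors` sectors (at least 1) of `sector_size` bytes into
 * seg:off spans a 64 KiB physical boundary. */
static inline bool crosses_64k(uint16_t seg, uint16_t off,
                               uint32_t sectors, uint32_t sector_size)
{
    uint32_t first = linear_addr(seg, off);
    uint32_t last  = first + sectors * sector_size - 1;
    return (first >> 16) != (last >> 16);
}
```

With the 0x0fe0:0x0000 example, a single 512-byte sector ends at 0xffff and stays inside the first 64 KiB, while a second sector (or a single 2048-byte sector) runs past 0x10000. The same range also covers 0x0000:0xfffe, which is where the first pushed word lands if SS:SP is left at 0x0000:0x0000.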