[Solved] - SMP trampoline page faulting with allocated stack
Posted: Thu Oct 01, 2020 2:52 pm
I've been working on SMP trampoline code for the past few days and I am running into an issue I can't seem to figure out.
We are talking 32-bit x86 here, but I have the exact same problem with 64-bit x86.
1) I allocate a page in low memory (typically it is at 0x1000)
2) I copy the trampoline code from the kernel image to 0x1000
3) I write some data to 0x1F00 (parameters for the trampoline: the CR3, stack and entry point to use)
4) Do the INIT-SIPI-SIPI dance
5) The trampoline executes and everything looks good until the end, where I push something onto the stack. Then according to QEMU I get a page fault.
Code:
# Setup stack
movl 0xF08(%ebx), %esp
# Jump to kernel
movl %ebx, %eax
addl $0x0F00, %eax # eax = TrampolineContext*
pushl %eax # Param 1: TrampolineContext* --> PAGE FAULT
call *0xF0C(%ebx)
Page fault details from QEMU:
Code:
check_exception old: 0xffffffff new 0xe
0: v=0e e=000b i=0 cpl=0 IP=0008:00001082 pc=00001082 SP=0010:ff7fe000 CR2=ff7fdffc
EAX=00001f00 EBX=00001000 ECX=00000000 EDX=000006f3
ESI=00000000 EDI=00000000 EBP=00000000 ESP=ff7fe000
EIP=00001082 EFL=00000006 [-----P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA]
CS =0008 00000000 ffffffff 00cf9a00 DPL=0 CS32 [-R-]
SS =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA]
DS =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA]
FS =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA]
GS =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT= 00001090 00000017
IDT= 00000000 00000000
CR0=80000011 CR2=ff7fdffc CR3=bffce000 CR4=000000a0
DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000
DR6=ffff0ff0 DR7=00000400
CCS=00000f00 CCD=00001f00 CCO=ADDL
EFER=0000000000000000
6) Something is not working with the stack I dynamically allocate. If I hardcode the stack location anywhere in the first 4 GB of memory or if I place the stack inside the trampoline page (say at 0x1F00), everything works just fine. I don't get any page fault.
7) I believe it works when I place the stack at VMA < 4 GB because the first 4 GB of physical memory were identity-mapped at boot time. I believe it doesn't work when I allocate the stack dynamically because of some memory cache/TLB synchronization issue between the two processors.
8) I tried memory barriers, full flushing and so on by reloading CR3 on the main processor. Nothing seems to make it work.
9) Accessing the stack from the BSP (after mapping it in VM) works just fine. Only the APs have problems.
I am using recursive page mapping at the moment (PAE). Here are the changes made to the page table when I allocate the stack:
Code:
Stack physical address 0x0013a000, mapping it to virtual address 0xFF7FD000, flags 0x3 (present + write)
PML3 - previous 0xbffcd001, now 0xbffcd001 (unchanged, as expected)
PML2 - previous 0xbffc9163, now 0xbffc9163 (unchanged, as expected)
PML1 - previous 0x00000000, now 0x0013a103 (looks good, this is physical address with PAGE GLOBAL + PAGE WRITE + PAGE PRESENT)
Again, accessing that memory from the BSP after the allocation + mapping works fine. In fact the VM allocation function has been running fine for months for other purposes.
Accessing that same memory from the AP (namely when pushing a value on the stack) results in a page fault (exception 0xE):
Code:
check_exception old: 0xffffffff new 0xe
0: v=0e e=000b i=0 cpl=0 IP=0008:00001082 pc=00001082 SP=0010:ff7fe000 CR2=ff7fdffc
EAX=00001f00 EBX=00001000 ECX=00000000 EDX=000006f3
ESI=00000000 EDI=00000000 EBP=00000000 ESP=ff7fe000
EIP=00001082 EFL=00000006 [-----P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
- Error code 0x0b --> PAGEFAULT RESERVED + PAGEFAULT WRITE + PAGEFAULT PRESENT
- %esp is where I expect it to be: 0xff7fe000
- CR2 is 0xff7fdffc (%esp - 4), which is where a stack push from that %esp writes, so the push is what triggers the page fault
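For reference, the error code reported by QEMU (e=000b) and the CR2 value can be checked mechanically. This is only a minimal sketch in C, assuming the architectural x86 page fault error code layout. The macro and function names are made up for illustration and are not taken from the kernel source.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 page fault error code bits (architectural layout). */
#define PF_PRESENT  (1u << 0)  /* fault on a present page (not a missing mapping) */
#define PF_WRITE    (1u << 1)  /* the access was a write */
#define PF_USER     (1u << 2)  /* the access came from CPL 3 */
#define PF_RESERVED (1u << 3)  /* a reserved bit was set in a paging-structure entry */
#define PF_FETCH    (1u << 4)  /* the access was an instruction fetch */

/* Hypothetical helper: true if every bit in `flags` is set in `error`. */
static bool pf_has(uint32_t error, uint32_t flags)
{
    return (error & flags) == flags;
}

/* Hypothetical helper: address written by a 32-bit push with the given %esp. */
static uint32_t push_target(uint32_t esp)
{
    return esp - 4;
}
```

With the values from the dump, `pf_has(0x0b, PF_PRESENT | PF_WRITE | PF_RESERVED)` is true, the user and fetch bits are clear, and `push_target(0xff7fe000)` gives 0xff7fdffc, which matches CR2.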
I've also tried debugging with Bochs and I get the same behaviour. Manually inspecting the memory and verifying the page tables didn't give me any clue as to what is going on.
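When checking the page tables by hand, it helps to know which entries the walk for the faulting address goes through. A minimal sketch in C, assuming 32-bit PAE paging with 4 KiB pages (the struct and function names are hypothetical):

```c
#include <stdint.h>

/* Indices used by a 32-bit PAE page walk for a 4 KiB page. */
struct pae_indices {
    unsigned pdpt;   /* bits 31:30, selects 1 of 4 PDPT entries */
    unsigned pd;     /* bits 29:21, selects 1 of 512 PD entries */
    unsigned pt;     /* bits 20:12, selects 1 of 512 PT entries */
    unsigned offset; /* bits 11:0, byte offset inside the page  */
};

/* Split a 32-bit virtual address into its PAE page walk indices. */
static struct pae_indices pae_split(uint32_t vaddr)
{
    struct pae_indices ix;
    ix.pdpt   = (vaddr >> 30) & 0x3;
    ix.pd     = (vaddr >> 21) & 0x1ff;
    ix.pt     = (vaddr >> 12) & 0x1ff;
    ix.offset = vaddr & 0xfff;
    return ix;
}
```

For CR2 = 0xff7fdffc this gives PDPT index 3, PD index 507, PT index 509 and offset 0xffc, so the page being touched is 0xff7fd000, the same virtual address the stack was mapped at.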
Real hardware also resets, so likely the same problem.
I have no clue why the "reserved" bit is set in the page fault error code. My code never sets any reserved bits and I am careful to zero out all memory pages when I allocate them.
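The reserved bit in the error code means the page walk found a reserved bit set in one of the paging-structure entries it read. As a debugging aid, here is a minimal sketch in C of what counts as reserved in a PAE page table entry for a 4 KiB page, as far as I understand the Intel manual. The function name is hypothetical, and `maxphyaddr` and `nxe` are assumptions the caller has to supply (CPUID leaf 0x80000008 and the EFER.NXE value of the CPU doing the walk).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns the bits of a PAE page table entry (4 KiB page)
 * that the CPU would treat as reserved, or 0 if there are none.
 *   maxphyaddr: physical address width from CPUID leaf 0x80000008 (e.g. 36)
 *   nxe:        value of EFER.NXE on the CPU doing the page walk
 */
static uint64_t pae_pte_reserved_bits(uint64_t pte, unsigned maxphyaddr, bool nxe)
{
    assert(maxphyaddr >= 32 && maxphyaddr <= 52);

    /* Reserved bits are only checked in entries that are present. */
    if ((pte & 1) == 0)
        return 0;

    /* Bits 62:MAXPHYADDR must be zero. */
    uint64_t reserved = ((1ULL << 63) - 1) & ~((1ULL << maxphyaddr) - 1);

    /* Bit 63 (NX/XD) is reserved unless EFER.NXE is set. */
    if (!nxe)
        reserved |= 1ULL << 63;

    return pte & reserved;
}
```

The check has to be run on the full 64-bit entry with the state of the CPU that faults, not the BSP. The values in the page table log are printed with 8 hex digits, so they say nothing about bits 32 to 63 of each entry.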
You can find the code where I allocate the stack and wake up the APs here:
https://github.com/kiznit/rainbow-os/bl ... #L100-L102
The stack access triggering the page fault:
https://github.com/kiznit/rainbow-os/bl ... smp.S#L107
You can also see where I access the stack from the BSP before starting the APs (no page fault here):
https://github.com/kiznit/rainbow-os/bl ... u.cpp#L112
Thanks for any help or comments here.