OSDev.org

The Place to Start for Operating System Developers

All times are UTC - 6 hours




 Post subject: AP Trampoline Crashing After Setting the PG Bit
PostPosted: Tue Dec 19, 2023 4:13 pm 

Joined: Tue Dec 19, 2023 3:58 pm
Posts: 1
Hello everyone! This is my first time diving into SMP. I have the initialization and startup sequence working, and the APs are jumping into the trampoline code. I am using jmp $ instructions as pseudo-breakpoints in gdb, but there seems to be a bug somewhere that prevents the code from executing past the instruction that sets the PG bit.

While I can't say for certain that the APs are crashing, gdb indicates they are "running" but stuck at:
Code:
0x0000000000000000 in ?? ()

I haven't been able to figure out definitively why this is occurring, but with my limited experience, I believe it is a page fault. It is especially puzzling because the PML4 (pm14 in the code below) being loaded into cr3 in the trampoline code is the same PML4 being used by the BSP. For context, I am copying relevant pointers to a location under 1MB with the following prep function:

Code:
void prepare_core(uint64_t stack, uint64_t pm14, uint64_t ap_entry, uint64_t idt, uint64_t gdt) {
    uint64_t* copy_addr = (uint64_t*)(0x500 + KERNEL_VIRT_BASE_ADDR);
    copy_addr[0] = 0;
    copy_addr[1] = stack;
    copy_addr[2] = pm14;
    copy_addr[3] = ap_entry;
    copy_addr[4] = idt;
    copy_addr[5] = gdt;
}
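
For reference, the slot layout that prepare_core writes can be sketched as a struct. This is only an illustration, assuming the six slots are consecutive uint64_t values starting at 0x500, as the trampoline's 0x500 + 8 through 0x500 + 40 offsets suggest. The names ap_mailbox, fill_mailbox, and flag are hypothetical and do not appear in the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct mirroring the six uint64_t slots prepare_core writes.
 * "flag" is an illustrative name; the other fields reuse the parameter names. */
struct ap_mailbox {
    uint64_t flag;      /* slot 0: cleared here; the trampoline stores 1 at 0x500 */
    uint64_t stack;     /* slot 1: trampoline reads [0x500 + 8]  */
    uint64_t pm14;      /* slot 2: trampoline reads [0x500 + 16] */
    uint64_t ap_entry;  /* slot 3: trampoline reads [0x500 + 24] */
    uint64_t idt;       /* slot 4: trampoline reads [0x500 + 32] */
    uint64_t gdt;       /* slot 5: trampoline reads [0x500 + 40] */
};

/* Compile-time checks that the field offsets match the trampoline's offsets. */
static_assert(offsetof(struct ap_mailbox, stack) == 8, "stack offset");
static_assert(offsetof(struct ap_mailbox, pm14) == 16, "pm14 offset");
static_assert(offsetof(struct ap_mailbox, ap_entry) == 24, "ap_entry offset");
static_assert(offsetof(struct ap_mailbox, idt) == 32, "idt offset");
static_assert(offsetof(struct ap_mailbox, gdt) == 40, "gdt offset");

/* Same stores as prepare_core, but through named fields. The caller passes
 * the destination, so this sketch assumes nothing about how 0x500 is mapped. */
static void fill_mailbox(volatile struct ap_mailbox *m, uint64_t stack,
                         uint64_t pm14, uint64_t ap_entry,
                         uint64_t idt, uint64_t gdt) {
    m->flag = 0;
    m->stack = stack;
    m->pm14 = pm14;
    m->ap_entry = ap_entry;
    m->idt = idt;
    m->gdt = gdt;
}
```

With a layout like this, a mismatch between the C side and the assembly offsets would fail at compile time instead of at boot.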


And here is the trampoline code. Execution seems to stop right after the instruction to set the PG bit.
Code:
[bits 16]

global ap_trampoline

ap_trampoline:
    cli
    cld
    jmp 0x0:(0x1000 + init_32 - ap_trampoline)

gdt_start:
gdt_null:
    dd 0x0
    dd 0x0

gdt_code:
    dw 0xffff
    dw 0
    db 0
    db 0x9a
    db 11001111b
    db 0

gdt_data:
    dw 0xffff
    dw 0
    db 0
    db 0x92
    db 11001111b
    db 0
gdt_end:

gdt_ptr:
    dw (0x1000 + gdt_end - ap_trampoline) - (0x1000 + gdt_start - ap_trampoline) - 1
    dd (0x1000 + gdt_start - ap_trampoline)

init_32:
    mov word [0x500], 1
    mov eax, cr0
    or eax, 1
    mov cr0, eax
    lgdt [0x1000 + gdt_ptr - ap_trampoline]
    jmp 0x08:(0x1000 + setup_32 - ap_trampoline)

[bits 32]
setup_32:
    mov eax, dword [0x500 + 16] ; pm14
    mov cr3, eax

    mov eax, cr4
    or eax, 1 << 5              ; PAE bit
    mov cr4, eax

    mov ecx, 0xc0000080
    rdmsr
    or eax, 1 << 8              ; LM bit
    wrmsr
    mov eax, cr0
    or eax, 1 << 31             ; PG bit
    mov cr0, eax
    jmp $
   
    jmp 0x08:(0x1000 + setup_64 - ap_trampoline)

[bits 64]
setup_64:
    lgdt [0x500 + 40]
    lidt [0x500 + 32]
    mov ax, 0x10
    mov ds, ax
    mov es, ax
    mov ss, ax
    mov ax, 0x20
    mov gs, ax
    mov fs, ax

    mov rsp, qword [0x500 + 8]  ; Stack
    mov rax, qword [0x500 + 24] ; AP entry point

    jmp rax


Any insight would be greatly appreciated! Thank you!


 Post subject: Re: AP Trampoline Crashing After Setting the PG Bit
PostPosted: Mon Feb 12, 2024 5:56 pm 
Member

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5146
Using Limine to start the APs is definitely easier than a real-mode trampoline.


 Post subject: Re: AP Trampoline Crashing After Setting the PG Bit
PostPosted: Mon Feb 12, 2024 10:39 pm 
Member

Joined: Wed Aug 30, 2017 8:24 am
Posts: 1605
So I see you load the GDT only after setting protected mode. That is pretty sus to me, as are those address calculations. Using fixed addresses limits your design such that you can only start one CPU at a time. I have a self-contained trampoline design, so I can start as many CPUs as there are free pages in low memory.

One thing I notice directly is that the PML4 address is a 64-bit number. However, with the legacy SMP startup method, you must ensure that the new CPU's PML4 is in the low 4GB, as you can initially only set 32 bits of CR3. Is your initial PML4 in the low 4GB? Because setting PG and then having all hell break loose is normally a sign that something didn't work out right in the page table. Is the trampoline identity mapped in the initial PML4? My SMP booting code creates a copy of the PML4 for use in the booting CPU, and sets up an identity map in it by setting its first entry equal to the 256th (because that is where I put the linear memory map). The AP startup code must then undo this hack and flush all global pages (which means clearing CR4.PGE and setting it again). Your code must also identity map the first page as that is where you've hidden all the data.
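
The two page-table points above can be sketched in C. This is a minimal illustration under stated assumptions, not the actual code from either poster: the helper names pml4_fits_32bit_cr3 and make_boot_pml4 are hypothetical, a PML4 is treated as a plain array of 512 entries, and "the 256th" entry is assumed to mean index 256 (the first higher-half slot). Aliasing entry 0 to that slot only yields an identity map if the linear map places physical address 0 at the very start of the slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PML4_ENTRIES 512u

/* True if the physical address can be loaded with a 32-bit `mov cr3, eax`:
 * it must be 4 KiB aligned and lie below 4 GiB. */
static bool pml4_fits_32bit_cr3(uint64_t pml4_phys) {
    return (pml4_phys & 0xFFFu) == 0 && pml4_phys < (1ull << 32);
}

/* Build a temporary boot PML4: copy the kernel's table, then alias entry 0
 * to the entry that holds the linear memory map (index assumed by caller). */
static void make_boot_pml4(uint64_t *boot, const uint64_t *kernel,
                           unsigned linear_map_index) {
    assert(linear_map_index < PML4_ENTRIES);
    memcpy(boot, kernel, PML4_ENTRIES * sizeof(uint64_t));
    boot[0] = kernel[linear_map_index];
}
```

Once the AP is running from its final mapping, entry 0 would be cleared again and the global TLB entries flushed, as described above. That step needs privileged instructions and is left out of this sketch.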

One more thing: Your 64-bit code is misaligning the stack. If the function pointer given as "ap_entry" is a C function, you are violating the ABI. But this can be easily fixed by replacing the jmp at the end with a call. Nobody expects that function to return, anyway, but you can put a ud2 as terminator afterwards if you're scared.
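
The alignment arithmetic behind this can be sketched as follows. It assumes the System V AMD64 ABI (rsp + 8 must be a multiple of 16 at function entry, because a call has just pushed an 8-byte return address) and assumes the stack top handed to the AP is itself 16-byte aligned, which the original post does not state. All function names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* System V AMD64: at function entry, rsp must be 8 modulo 16. */
static bool entry_rsp_is_abi_aligned(uint64_t rsp_at_entry) {
    return rsp_at_entry % 16 == 8;
}

/* `jmp rax` leaves rsp untouched, so the callee sees the raw stack top. */
static uint64_t rsp_after_jmp(uint64_t stack_top) {
    return stack_top;
}

/* `call rax` pushes an 8-byte return address before entering the callee. */
static uint64_t rsp_after_call(uint64_t stack_top) {
    assert(stack_top >= 8);
    return stack_top - 8;
}
```

For a 16-byte-aligned stack top, rsp_after_jmp fails the check and rsp_after_call passes it, which is why swapping the final jmp for a call is enough.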

Octocontrabass wrote:
Using Limine to start the APs is definitely easier than a real-mode trampoline.

Why, Octo, if we wanted things to be easy, we wouldn't be at this hobby. Using Limine to do our dirty work will hardly teach us anything.

_________________
Carpe diem!

