UEFI: Going back to 32-bit mode for a bootloader

Post by techdude17 »

Hi!

I'm working on a small UEFI bootloader and have run into a problem. I load the kernel image into memory at the requested (virtual) load address, but now I'm confused about how to properly start execution.

My 64-bit kernel expects a 32-bit environment so it can craft page tables and transition to long mode, so here's my load process:
- Set up a 32/64-bit GDT/GDTR in low memory and load it (this part works fine)
- Disable IRQs
- Use "lretq" to go to a small 32-bit code area that jumps to the kernel

Full code:

.global platform_bootKernelImage
platform_bootKernelImage:
    // Switch stacks
    mov %rdx, %rsp
    
    // Push address of kernel
    push %rdi

    // Load GDT
    lgdt (%rsi)

    // Disable IRQs
    cli

    // Can't disable paging

    // Go to protected mode
    push $0x8
    lea .inProtectedMode(%rip), %rax
    push %rax
    lretq

.code32

.inProtectedMode:
    movw $0x10, %ax
    movw %ax, %ds

    movw %ax, %es
    movw %ax, %fs
    movw %ax, %gs
    movw %ax, %ss

    // Load dummy variables
    mov $0x2BADB002, %eax
    mov $0xDEADDEAD, %ebp

    // Jump to kernel
    jmpl *0(%esp)
    
    cli
    hlt
This works for a short period of time before SP starts pointing to what it was in UEFI mode at a seemingly random instruction.

I have no idea what could possibly be going on, but I don't disable paging beforehand, so that might be the reason. Does anyone have suggestions on what to do or how to solve this?
Post by Octocontrabass »

techdude17 wrote: Sun Apr 13, 2025 12:15 pm
"My 64-bit kernel expects a 32-bit environment so it can craft page tables and transition to long mode,"
That sounds backwards. Instead of making your UEFI bootloader switch to 32-bit mode to call the kernel, why don't you make your BIOS bootloader switch to 64-bit mode? (Or, since you're using Multiboot, why not use GRUB on UEFI too?)
techdude17 wrote: Sun Apr 13, 2025 12:15 pm
"This works for a short period of time before SP starts pointing to what it was in UEFI mode at a seemingly random instruction."
Have you checked QEMU's interrupt log ("-d int") to see if it might be caused by an exception?
Post by techdude17 »

1. Not sure what you mean - I already use GRUB for the kernel in Multiboot mode but it still loads in 32-bit mode so I have a trampoline to get to 64-bit mode
2. First thing I thought of as well, but interrupts are disabled. SP randomly points to what it was during UEFI code (the trampoline loads its own stack), and then of course there's a page fault because that address is no longer mapped
Post by sebihepp »

The code you have shown does not jump to protected mode. You are still in long mode, just in its compatibility mode (segment CM32).
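
For reference, a minimal sketch of how the execution mode follows from the register state rather than from the code segment alone. This is not code from the thread: `cpu_mode` and the constant names are made up for illustration, and the flags argument is assumed to use the layout of the upper 32 bits of a segment descriptor, which appears to be what QEMU prints in the fourth column of its segment lines.

```python
# Hypothetical helper for reading QEMU register dumps; not part of any tool.
EFER_LME = 1 << 8    # long mode enable (set by software)
EFER_LMA = 1 << 10   # long mode active (set by the CPU once LME=1 and CR0.PG=1)

DESC_L = 1 << 21     # 64-bit code segment flag
DESC_D = 1 << 22     # default operand size: 1 = 32-bit, 0 = 16-bit


def cpu_mode(efer: int, cs_flags: int) -> str:
    """Name the execution mode implied by EFER and the CS descriptor flags."""
    if not efer & EFER_LMA:
        # Assumption: CR0.PE is set, so this is not real mode.
        return "protected mode (legacy)"
    if cs_flags & DESC_L:
        return "64-bit mode"
    if cs_flags & DESC_D:
        return "compatibility mode, 32-bit"
    return "compatibility mode, 16-bit"


print(cpu_mode(0xD00, 0x00AF9A00))  # 64-bit mode
print(cpu_mode(0xD00, 0x00CF9A00))  # compatibility mode, 32-bit
print(cpu_mode(0x000, 0x00CF9A00))  # protected mode (legacy)
```

As long as EFER.LMA stays set, loading a flat 32-bit code segment only selects compatibility mode. Actually leaving long mode would additionally mean clearing CR0.PG from identity-mapped code and then clearing EFER.LME.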
Post by Octocontrabass »

techdude17 wrote: Mon Apr 14, 2025 6:34 am
"1. Not sure what you mean - I already use GRUB for the kernel in Multiboot mode but it still loads in 32-bit mode so I have a trampoline to get to 64-bit mode"
Right. Why is that trampoline inside your kernel? If it were a separate binary, your UEFI bootloader could skip the trampoline and jump directly into 64-bit kernel code without switching back to 32-bit mode.
techdude17 wrote: Mon Apr 14, 2025 6:34 am
"2. First thing I thought of as well, but interrupts are disabled."
Exceptions can't be disabled. Exceptions are a type of interrupt. Have you checked QEMU's interrupt log?
Post by techdude17 »

I disagree with that design of having a separate executable. It overcomplicates a simple task and brings the kernel further from a standardized operating environment.

Did not realize I don't actually exit Long Mode but shouldn't it not matter? I'll update the code but my kernel trampoline successfully sets up its page tables and jumps to proper long mode.

And yes, I also meant that I checked QEMU's interrupt log with -d int. There are no IRQs after leaving the UEFI app, and the next entry is the page fault after SP points to an unmapped address
Post by Octocontrabass »

techdude17 wrote: Mon Apr 14, 2025 2:39 pm
"It overcomplicates a simple task"
Please explain how switching to 32-bit mode when loading a 64-bit kernel on 64-bit UEFI is not overcomplicating a simple task.
techdude17 wrote: Mon Apr 14, 2025 2:39 pm
"and brings the kernel further from a standardized operating environment."
What exactly is a "standardized operating environment"?
techdude17 wrote: Mon Apr 14, 2025 2:39 pm
"Did not realize I don't actually exit Long Mode but shouldn't it not matter?"
It depends on the assumptions your trampoline code makes about the CPU state.
techdude17 wrote: Mon Apr 14, 2025 2:39 pm
"And yes, I also meant that I checked QEMU's interrupt log with -d int. There are no IRQs after leaving the UEFI app, and the next entry is the page fault after SP points to an unmapped address"
What does the interrupt log say about the page fault? It might be possible to work backwards from the faulting instruction to find the problem.
Post by techdude17 »

Octocontrabass wrote:
"Please explain how switching to 32-bit mode when loading a 64-bit kernel on 64-bit UEFI is not overcomplicating a simple task."
This argument is pointless and my kernel design is my kernel design. Thank you for the suggestion, but I'd rather stick to the Multiboot standard and start the kernel in 32-bit protected mode. As well as that this loader is designed to load any Multiboot compliant kernel (including 32-bit ones, which means that I will have to fix the compatibility mode)
Octocontrabass wrote:
"It depends on the assumptions your trampoline code makes about the CPU state."
Well, the trampoline code DOES assume that it's passed a plain 32-bit protected mode environment. It enables LME, sets up page tables, configures a few other things, and executes correctly all the way through when started from the UEFI bootloader (I followed it in GDB).
Octocontrabass wrote:
"What does the interrupt log say about the page fault? It might be possible to work backwards from the faulting instruction to find the problem."
This is the interrupt log:

Code:

   455: v=68 e=0000 i=0 cpl=0 IP=0038:0000000006746718 pc=0000000006746718 SP=0030:0000000007f101b8 env->regs[R_EAX]=0000000000157218
RAX=0000000000157218 RBX=0000000007f10468 RCX=0000000000180ae3 RDX=000000000006c779
RSI=0000000000000000 RDI=0000000000157218 RBP=0000000007f10200 RSP=0000000007f101b8
R8 =00000000001c3991 R9 =0000000000000001 R10=0000000000000000 R11=0000000000000000
R12=0000000000000000 R13=00000000067e5618 R14=00000000066adbc0 R15=0000000006712ca8
RIP=0000000006746718 RFL=00000283 [--S---C] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0030 0000000000000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
CS =0038 0000000000000000 ffffffff 00af9a00 DPL=0 CS64 [-R-]
SS =0030 0000000000000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
DS =0030 0000000000000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
FS =0030 0000000000000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
GS =0030 0000000000000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT
TR =0000 0000000000000000 0000ffff 00008b00 DPL=0 TSS64-busy
GDT=     00000000079de000 00000047
IDT=     0000000007470018 00000fff
CR0=80010033 CR2=0000000000000000 CR3=0000000007c01000 CR4=00000668
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
CCS=0000000000000081 CCD=fffffffffffbd152 CCO=EFLAGS
EFER=0000000000000d00
check_exception old: 0xffffffff new 0xe
   456: v=0e e=0002 i=0 cpl=0 IP=0008:0000000000112e0e pc=0000000000112e0e SP=0010:0000000007f10468 CR2=0000000007f10464
EAX=80010032 EBX=04000087 ECX=c0000080 EDX=2badb002
ESI=00000000 EDI=001b2060 EBP=001c31ec ESP=07f10468
EIP=00112e0e EFL=00000082 [--S----] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
CS =0008 00000000 ffffffff 00cf9a00 DPL=0 CS32 [-R-]
SS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
DS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
FS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
GS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS64-busy
GDT=     00000000001c3a00 0000004f
IDT=     0000000007470018 00000fff
CR0=80010033 CR2=0000000007f10464 CR3=00000000001af000 CR4=00000668
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
CCS=00000000 CCD=c0000080 CCO=DECL
EFER=0000000000000d00
check_exception old: 0xe new 0xe
As shown, the first interrupt (v=68) occurs in my UEFI bootloader (PC = 6746718, and SP = 0030:7f10468). It isn't shown, but after the trampoline code executes, a new stack is loaded properly and ESP is updated.

The next interrupt occurs right after the trampoline code has executed. SP has somehow switched back to 7f10468 (with proper data segments), and since I don't map that address in my kernel, it page faults (and then triple faults, since a fault with SP pointing to unmapped memory can't be handled).
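
One way to work backwards from entry 456 is to decode the error code and compare CR2 with SP. The following is only a sketch, assuming the standard x86 page-fault error code bits; `decode_pf_error` is a hypothetical helper, and the three values are copied from the log above.

```python
def decode_pf_error(code: int) -> dict:
    """Split an x86 page-fault error code into its architectural bits."""
    return {
        "present": bool(code & 0x01),            # False = page not present
        "write": bool(code & 0x02),              # True = fault on a write
        "user": bool(code & 0x04),               # True = fault at CPL 3
        "reserved_bit": bool(code & 0x08),
        "instruction_fetch": bool(code & 0x10),
    }


info = decode_pf_error(0x0002)        # e=0002 from entry 456
esp, cr2 = 0x07F10468, 0x07F10464     # ESP and CR2 from the same entry

print(info["present"], info["write"], info["user"])  # False True False
print(esp - cr2)                                     # 4
```

Read this way, e=0002 is a supervisor write to a non-present page, and the faulting address lies exactly 4 bytes below ESP, which is what a 4-byte push would touch.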

The first function that executes after the boot code is arch_main (again, as far as I can see the boot code all runs properly: SP points to the proper value and nothing else happens). Here is its start:

Code:

0000000000112e00 <arch_main>:
  112e00:	55                   	push   %rbp
  112e01:	48 89 e5             	mov    %rsp,%rbp
  112e04:	41 54                	push   %r12
  112e06:	49 89 fc             	mov    %rdi,%r12
  112e09:	bf 60 20 1b 00       	mov    $0x1b2060,%edi
  112e0e:	53                   	push   %rbx
(Worth noting: RBP is zeroed out by the trampoline.)

The crash happens at 112e0e, when it tries to access the stack. I have no idea why it's crashing or where it gets the UEFI stack from.
Post by Octocontrabass »

techdude17 wrote: Mon Apr 14, 2025 5:13 pm
"As well as that this loader is designed to load any Multiboot compliant kernel (including 32-bit ones, which means that I will have to fix the compatibility mode)"
...So then why not use GRUB?
techdude17 wrote: Mon Apr 14, 2025 5:13 pm

Code:

CS =0008 00000000 ffffffff 00cf9a00 DPL=0 CS32 [-R-]
You're trying to execute 64-bit code with the CPU in 32-bit mode. One of those instructions is decoded as a MOV to ESP, and it's just coincidence that the source operand is close to the UEFI stack pointer.
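
To illustrate, here is a rough sketch that decodes the arch_main prologue bytes from the listing above twice, once as 64-bit code and once as 32-bit code. It is a toy decoder written for this example only: `decode` and `PROLOGUE` are made-up names, and it handles just the handful of opcodes that occur in that listing.

```python
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
          "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]

# Bytes 112e00..112e0e of arch_main, copied from the disassembly in the thread.
PROLOGUE = bytes.fromhex("55 4889e5 4154 4989fc bf60201b00 53")


def decode(code: bytes, long_mode: bool) -> list:
    """Decode only the opcodes used in PROLOGUE, in AT&T syntax."""
    out, i = [], 0
    while i < len(code):
        rex, op = 0, code[i]
        if 0x40 <= op <= 0x4F:
            if long_mode:
                rex, i = op, i + 1            # REX prefix for the next opcode
                op = code[i]
            else:                             # one-byte INC/DEC r32 instead
                kind = "inc" if op < 0x48 else "dec"
                out.append(f"{kind} %{REGS32[op & 7]}")
                i += 1
                continue
        w, r, b = (rex >> 3) & 1, (rex >> 2) & 1, rex & 1
        if 0x50 <= op <= 0x57:                # PUSH r
            regs = REGS64 if long_mode else REGS32
            out.append(f"push %{regs[(op & 7) | (b << 3)]}")
            i += 1
        elif op == 0x89:                      # MOV r/m, r (register form only)
            modrm = code[i + 1]
            if modrm >> 6 != 3 or (rex and not w):
                raise ValueError("operand form not handled in this sketch")
            regs = REGS64 if w else REGS32
            src = regs[((modrm >> 3) & 7) | (r << 3)]
            dst = regs[(modrm & 7) | (b << 3)]
            out.append(f"mov %{src},%{dst}")
            i += 2
        elif 0xB8 <= op <= 0xBF and not rex:  # MOV r32, imm32
            imm = int.from_bytes(code[i + 1:i + 5], "little")
            out.append(f"mov ${imm:#x},%{REGS32[op & 7]}")
            i += 5
        else:
            raise ValueError(f"opcode {op:#04x} not handled in this sketch")
    return out


for mode in (True, False):
    print("64-bit:" if mode else "32-bit:", "; ".join(decode(PROLOGUE, mode)))
```

In the 32-bit decoding, the REX prefixes become separate inc/dec instructions, and the bytes 89 fc that were "mov %rdi,%r12" turn into "mov %edi,%esp". The faulting ESP in the log is therefore whatever EDI held on entry to arch_main, and the following "push %ebx" at 112e0e is the first access through it.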
Post by techdude17 »

Octocontrabass wrote:
"You're trying to execute 64-bit code with the CPU in 32-bit mode. One of those instructions is decoded as a MOV to ESP, and it's just coincidence that the source operand is close to the UEFI stack"
Oops... that was the problem.