
Can't Get Long Mode to Work

Posted: Thu Feb 18, 2021 3:42 am
by SKC
Hi.
I'm currently working on my loader, and I'm having trouble entering long mode. Actually, I can enter long mode, but something weird is happening to the segment registers. I use the following code to enter long mode:

Code:

# JumpToKernel(void*,multiboot_info_t*)
.global JumpToKernel
.type JumpToKernel, @function
JumpToKernel:
    pushl %ebp
    movl %esp,%ebp
    # Disable paging
    movl %CR0,%eax
    andl $0x7FFFFFFF,%eax
    movl %eax,%CR0
    # Enable PAE
    movl %CR4,%eax
    orl $0x20,%eax
    movl %eax,%CR4
    # Load PML4
    movl $IdentityL4,%eax
    movl %eax,%CR3
    # Enable Long Mode
    movl $0xC0000080,%ecx
    rdmsr
    orl $0x100,%eax
    wrmsr
    # Load GDT
    lgdt (gdtr)
    # Move multiboot info pointer to ebx
    movl 12(%ebp),%ebx
    # Enable paging (and long mode)
    movl %CR0,%eax
    orl $0x8000000,%eax
    movl %eax,%CR0
    # Move the kernel's entry point to ecx
    movl 8(%ebp),%ecx
    # Jump to long mode
    pushl $0x8
    pushl $code64start
    retfl

.code64
.global code64start
.align 8
code64start:
    movw $0x10,%ax # Data descriptor
    # Update the segment registers
    movw %ax,%ds
    movw %ax,%es
    movw %ax,%fs
    movw %ax,%gs
    movw %ax,%ss
    # Zero the high 32 bits of the kernel's starting address
    xorq %rax,%rax
    movl %ecx,%eax
    # Jump to kernel
    jmpq %rax
Qemu's CPU dump:

Code:

check_exception old: 0xffffffff new 0xd
     0: v=0d e=0010 i=0 cpl=0 IP=0008:000000000010007c pc=000000000010007c SP=0018:0000000000102f54 env->regs[R_EAX]=00000000d88e0010
EAX=d88e0010 EBX=00010000 ECX=0020f2bb EDX=00000000
ESI=0000000d EDI=00000000 EBP=00102f54 ESP=00102f54
EIP=0010007c EFL=00000006 [-----P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010 00000000 00000000 00009100 DPL=0 DS16 [--A]
CS =0008 00000000 00000000 00209a00 DPL=0 CS64 [-R-]
SS =0018 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
DS =0018 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
FS =0010 00000000 00000000 00009100 DPL=0 DS16 [--A]
GS =0010 00000000 00000000 00009100 DPL=0 DS16 [--A]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT=     00107000 00000047
IDT=     00000000 00000000
CR0=08000011 CR2=00000000 CR3=00106000 CR4=00000020
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
CCS=00000008 CCD=08000011 CCO=LOGICL
EFER=0000000000000100
I don't really know what's happening there... I also don't know why Qemu thinks the data descriptors are 16-bit. Sometimes I get a GPF, other times I get BR. I did lots of testing over the past few days, and it looks like changing SS or DS causes the exceptions. I already verified that the CPU switches to long mode, and it looks like the problem is just changing the segment registers. I also tried setting the descriptor to other values (I use 0x900000000000, as AMD suggests) and setting the registers to 0, but the same thing happens. I also read as many tutorials as I could find, but I still couldn't figure out what's wrong.

Does anyone have an idea what I am doing wrong?
Thanks in advance.

Re: Can't Get Long Mode to Work

Posted: Thu Feb 18, 2021 4:59 am
by iansjack
The first thing to determine is where in your code the exception is happening. The qemu dump shows that you are still in 32-bit mode, which explains why the segment registers still have the values set by multiboot. It looks like you haven't reached .code64.

What instruction in your code is at address 0x0010007c?

Re: Can't Get Long Mode to Work

Posted: Thu Feb 18, 2021 5:30 am
by SKC
Hi.
Thanks for the reply.
iansjack wrote: The qemu dump shows that you are still in 32-bit mode, which explains why the segment registers still have the values set by multiboot.
It did look weird to me that Qemu shows 32-bit registers instead of 64-bit ones. I guess that explains it.
iansjack wrote: It looks like you haven't reached .code64.

What instruction in your code is at address 0x0010007c?
code64start is located at 0x100070.
The instruction at 0x10007C is:

Code:

movw %ax,%ss
I just tested again and placed "jmp ." right after code64start, and it caused UD. So I think this verifies that the CPU is in protected mode... But why? It loaded the code descriptor, and Qemu says "CS64" in the dump, so it should be in long mode, right?

Re: Can't Get Long Mode to Work

Posted: Thu Feb 18, 2021 6:09 am
by quirck

Code:

    movl %CR0,%eax
    orl  $0x80000000,%eax
    movl %eax,%CR0
Check the number of zeroes: you've set a reserved bit (bit 27) instead of PG (bit 31).
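One way to make this kind of miscount harder is to name control-register bits by position instead of typing the hex literal. The following is a minimal C sketch of that idea, not code from the loader in this thread; `CR0_PE`, `CR0_PG` and `bit_index` are illustrative names chosen here.

```c
#include <assert.h>
#include <stdint.h>

/* Control-register bits written as shifts, so the bit number is visible. */
#define CR0_PE (UINT32_C(1) << 0)   /* protected mode enable */
#define CR0_PG (UINT32_C(1) << 31)  /* paging enable */

/* Compile-time checks: the intended mask has eight hex digits,
 * the mistyped one has seven and lands on bit 27. */
static_assert(CR0_PG == UINT32_C(0x80000000), "PG is bit 31");
static_assert(UINT32_C(0x8000000) == (UINT32_C(1) << 27),
              "the seven-digit literal is bit 27");

/* Return the index of the single set bit in mask,
 * or -1 if mask does not have exactly one bit set. */
static int bit_index(uint32_t mask)
{
    if (mask == 0 || (mask & (mask - 1)) != 0)
        return -1;

    int index = 0;
    while ((mask >>= 1) != 0)
        index++;
    return index;
}
```

Calling `bit_index` on a suspicious literal gives the bit number to look up in the manual. In assembly the same habit would be to write the mask as a shift expression and let the assembler evaluate it.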

Re: Can't Get Long Mode to Work

Posted: Thu Feb 18, 2021 6:12 am
by iansjack
It looks to me as if there is something wrong with your GDT, specifically selector 0x10. Check that the descriptor is a valid 64-bit descriptor and that the memory referenced by SS:RSP is mapped and marked as present in your page table.
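The pointer to selector 0x10 can be read straight from the `e=0010` field of the first dump, assuming that field is the error code pushed with the exception. A small C sketch of decoding a selector-style error code follows; the struct and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of a selector-style exception error code,
 * the format used by #GP, #TS, #NP and #SS. */
struct selector_error {
    bool external;   /* bit 0: caused by an event external to the program */
    bool in_idt;     /* bit 1: the index refers to the IDT */
    bool in_ldt;     /* bit 2: the index refers to the LDT (if in_idt is 0) */
    uint16_t index;  /* bits 3..15: index into the descriptor table */
};

/* Split a raw error code into its fields. */
static struct selector_error decode_selector_error(uint16_t code)
{
    struct selector_error e;
    e.external = (code & 0x1) != 0;
    e.in_idt   = (code & 0x2) != 0;
    e.in_ldt   = (code & 0x4) != 0;
    e.index    = (uint16_t)(code >> 3);
    return e;
}
```

For `e=0010` this gives index 2 in the GDT, which is the entry that selector 0x10 names.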

Re: Can't Get Long Mode to Work

Posted: Thu Feb 18, 2021 6:51 am
by SKC
Well, if Qemu says that the segment is 16-bit, then something is wrong for sure. I just can't figure out what's wrong.
As I said, I use 0x900000000000 (bits 47 and 44 set). AMD's manuals say that the other bits are ignored. I tried setting other bits as well (the ones that are supposed to be ignored) but it looks like it didn't solve my problem. Also, AFAIK, long mode ignores the data segment registers, so they can contain junk.
For the page table, I did make a mistake, but it's not related (I mapped only the first 2MB, and jumping to the kernel would have caused a page fault). Now, I set present, CacheDisable and ReadWrite for every table (PML4-PDE). I use 2MB pages and I map the first 1GB. Before and after I fixed my tables, Qemu didn't set CR2, so I assume no page faults occurred.
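For reference, the layout described here (one PML4 entry, one PDPT entry, and a page directory of 512 entries of 2 MiB each, which covers exactly 1 GiB) can be sketched in C as below. This is not SKC's code: the struct, the function and the `pdpt_phys`/`pd_phys` parameters are assumptions made for the example, and the real tables must be 4 KiB aligned in physical memory. The sketch also sets the PS bit in each page-directory entry, which is what marks an entry as a 2 MiB page.

```c
#include <stdint.h>

#define PTE_PRESENT  (UINT64_C(1) << 0)
#define PTE_WRITABLE (UINT64_C(1) << 1)
#define PTE_PCD      (UINT64_C(1) << 4)  /* page-level cache disable */
#define PTE_PS       (UINT64_C(1) << 7)  /* in a PD entry: maps a 2 MiB page */

#define ENTRIES        512
#define HUGE_PAGE_SIZE (UINT64_C(2) << 20)  /* 2 MiB */

/* Hypothetical container for the three tables. */
struct identity_tables {
    uint64_t pml4[ENTRIES];
    uint64_t pdpt[ENTRIES];
    uint64_t pd[ENTRIES];
};

/* Fill the tables so that virtual 0..1 GiB maps to physical 0..1 GiB.
 * pdpt_phys and pd_phys are the physical addresses where the pdpt and pd
 * arrays will live; they are passed in so the sketch does not depend on
 * how the loader places its tables. */
static void build_identity_1gib(struct identity_tables *t,
                                uint64_t pdpt_phys, uint64_t pd_phys)
{
    const uint64_t flags = PTE_PRESENT | PTE_WRITABLE | PTE_PCD;

    for (int i = 0; i < ENTRIES; i++) {
        t->pml4[i] = 0;
        t->pdpt[i] = 0;
    }
    t->pml4[0] = pdpt_phys | flags;  /* PML4[0] points to the PDPT */
    t->pdpt[0] = pd_phys | flags;    /* PDPT[0] points to the PD */

    /* Each PD entry maps one 2 MiB page at its own physical address. */
    for (uint64_t i = 0; i < ENTRIES; i++)
        t->pd[i] = (i * HUGE_PAGE_SIZE) | flags | PTE_PS;
}
```

The physical address of the `pml4` array is what would then be loaded into CR3.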

Re: Can't Get Long Mode to Work

Posted: Thu Feb 18, 2021 7:08 am
by iansjack
Long mode does not ignore ds and ss. They most definitely cannot contain junk.

Re: Can't Get Long Mode to Work

Posted: Thu Feb 18, 2021 9:32 am
by SKC
OK.
I changed the descriptor to 0x920000000000 (bits 41, 44 & 47 set), and that caused BR. Look at eip and eax. My kernel starts at 0x40F2BB, so something trashed eax. I think that the CPU didn't complete the switch to long mode, and that it's still executing instructions in protected mode. That's probably why "jmp ." causes a triple fault. After that, I replaced the instructions after "mov %ax,%ss" with "jmp ." and it caused the same exception, but with different eax and eip. I also read here that the segment registers can be set to 0, so I tried that, and I got a GPF.

Maybe my code descriptor is bad? I use 0x209A0000000000 (bits 41, 43, 44, 47, 53 set).
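The bit lists quoted in this thread can be checked mechanically. Below is a hedged C sketch that decodes the flag bits of a raw 8-byte descriptor; the names are illustrative. The `loadable_into_ss` helper encodes the architectural rule that a non-null selector loaded into SS must name a present, writable data segment. That would be consistent with the dumps (the #GP at `movw %ax,%ss` with 0x900000000000, and SS loading once bit 41 is set), but it is an interpretation added here, not something the posters stated. Bit 54 (D/B) is clear in both data descriptors, which is probably why Qemu labels them DS16.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits of a code/data segment descriptor, by bit position
 * in the 64-bit raw value. */
struct descriptor_info {
    bool present;       /* bit 47: P */
    bool code_or_data;  /* bit 44: S (0 would mean a system descriptor) */
    bool executable;    /* bit 43: 1 = code segment, 0 = data segment */
    bool read_write;    /* bit 41: readable (code) or writable (data) */
    bool long_mode;     /* bit 53: L, 64-bit code segment */
    bool default_big;   /* bit 54: D/B */
};

/* Test a single bit of a 64-bit value. */
static bool bit(uint64_t value, unsigned n)
{
    return ((value >> n) & 1) != 0;
}

static struct descriptor_info decode_descriptor(uint64_t raw)
{
    struct descriptor_info d;
    d.present      = bit(raw, 47);
    d.code_or_data = bit(raw, 44);
    d.executable   = bit(raw, 43);
    d.read_write   = bit(raw, 41);
    d.long_mode    = bit(raw, 53);
    d.default_big  = bit(raw, 54);
    return d;
}

/* True if the descriptor is a present, writable data segment. */
static bool loadable_into_ss(uint64_t raw)
{
    struct descriptor_info d = decode_descriptor(raw);
    return d.present && d.code_or_data && !d.executable && d.read_write;
}
```

Decoding 0x209A0000000000 gives a present, readable code segment with L set, so the code descriptor itself matches what SKC describes.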

First dump:

Code:

check_exception old: 0xffffffff new 0x5
     0: v=05 e=0000 i=0 cpl=0 IP=0008:000000000000f2bf pc=000000000000f2bf SP=0010:0000000000102f56 env->regs[R_EAX]=00000000d88ef2bb
EAX=d88ef2bb EBX=00010000 ECX=0040f2bb EDX=00000000
ESI=0000000f EDI=00000000 EBP=00102f54 ESP=00102f56
EIP=0000f2bf EFL=00000046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010 00000000 00000000 00009300 DPL=0 DS16 [-WA]
CS =0008 00000000 00000000 00209a00 DPL=0 CS64 [-R-]
SS =0010 00000000 00000000 00009300 DPL=0 DS16 [-WA]
DS =0018 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
FS =0010 00000000 00000000 00009300 DPL=0 DS16 [-WA]
GS =0010 00000000 00000000 00009300 DPL=0 DS16 [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT=     00106000 00000047
IDT=     00000000 00000000
CR0=08000011 CR2=00000000 CR3=00105000 CR4=00000020
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
CCS=00000044 CCD=d88e000f CCO=EFLAGS
EFER=0000000000000100
With jmp .:

Code:

check_exception old: 0xffffffff new 0x0
     0: v=00 e=0000 i=0 cpl=0 IP=0008:00000000000001c1 pc=00000000000001c1 SP=0010:0000000000102ecc env->regs[R_EAX]=00000000d88e0000
EAX=d88e0000 EBX=00010000 ECX=00400000 EDX=00000000
ESI=0000000d EDI=00000000 EBP=00102f54 ESP=00102ecc
EIP=000001c1 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010 00000000 00000000 00009300 DPL=0 DS16 [-WA]
CS =0008 00000000 00000000 00209a00 DPL=0 CS64 [-R-]
SS =0010 00000000 00000000 00009300 DPL=0 DS16 [-WA]
DS =0018 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
FS =0010 00000000 00000000 00009300 DPL=0 DS16 [-WA]
GS =0010 00000000 00000000 00009300 DPL=0 DS16 [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT=     00106000 00000047
IDT=     00000000 00000000
CR0=08000011 CR2=00000000 CR3=00105000 CR4=00000020
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
CCS=00000000 CCD=00000145 CCO=INCW
EFER=0000000000000100
With 0:

Code:

check_exception old: 0xffffffff new 0xd
     0: v=0d e=0000 i=0 cpl=0 IP=0008:000000000010007c pc=000000000010007c SP=0018:0000000000102f54 env->regs[R_EAX]=00000000d88e0000
EAX=d88e0000 EBX=00010000 ECX=0040f2bb EDX=00000000
ESI=0000000d EDI=00000000 EBP=00102f54 ESP=00102f54
EIP=0010007c EFL=00000006 [-----P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 00000000 00000000
CS =0008 00000000 00000000 00209a00 DPL=0 CS64 [-R-]
SS =0018 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
DS =0018 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
FS =0000 00000000 00000000 00000000
GS =0000 00000000 00000000 00000000
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT=     00106000 00000047
IDT=     00000000 00000000
CR0=08000011 CR2=00000000 CR3=00105000 CR4=00000020
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
CCS=00000008 CCD=08000011 CCO=LOGICL
EFER=0000000000000100

Re: Can't Get Long Mode to Work

Posted: Thu Feb 18, 2021 9:50 am
by quirck
SKC wrote:

Code:

CR0=08000011 CR2=00000000 CR3=00105000 CR4=00000020
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
CCS=00000044 CCD=d88e000f CCO=EFLAGS
EFER=0000000000000100
CR0 has bit 27 set, PG bit is clear.
EFER has bit 10 clear, so LMA = 0, long mode is not active.
Check the instruction writing to cr0.
SKC wrote:

Code:

    movl %CR0,%eax
    orl $0x8000000,%eax # <--- Too few zeroes
    movl %eax,%CR0
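The same reading of the dump can be written down as a few bit tests. This is a minimal C sketch, assuming the CR0, CR4 and EFER values are copied by hand from the Qemu log; `mode_report` and `decode_mode` are names invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_PG   (UINT64_C(1) << 31)  /* paging enabled */
#define CR4_PAE  (UINT64_C(1) << 5)   /* physical address extension */
#define EFER_LME (UINT64_C(1) << 8)   /* long mode enable (set by software) */
#define EFER_LMA (UINT64_C(1) << 10)  /* long mode active (set by the CPU) */

/* Which of the long-mode prerequisites are visible in a register dump. */
struct mode_report {
    bool paging;
    bool pae;
    bool lme;
    bool lma;
};

static struct mode_report decode_mode(uint64_t cr0, uint64_t cr4,
                                      uint64_t efer)
{
    struct mode_report r;
    r.paging = (cr0 & CR0_PG) != 0;
    r.pae    = (cr4 & CR4_PAE) != 0;
    r.lme    = (efer & EFER_LME) != 0;
    r.lma    = (efer & EFER_LMA) != 0;
    return r;
}
```

With the values from the dumps above (CR0=08000011, CR4=00000020, EFER=0000000000000100), PAE and LME come out set while paging and LMA come out clear, which matches the diagnosis.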

Re: Can't Get Long Mode to Work

Posted: Thu Feb 18, 2021 10:01 am
by SKC
quirck wrote: EFER has bit 10 clear, so LMA = 0, long mode is not active.
I was just going to post that I found what's wrong... (I mean, I knew I made a dumb mistake, but I didn't think it would be that dumb...)

Fixed CR0 (somehow I didn't see the first post) and added this to activate long mode:

Code:

movl $0xC0000080,%ecx
rdmsr
orl $0x400,%eax
wrmsr
It works great.

iansjack and quirck, lots of thanks for helping me.

Re: Can't Get Long Mode to Work

Posted: Thu Feb 18, 2021 10:06 am
by quirck
Actually, when you set IA32_EFER.LME = 1 and then turn on paging via CR0.PG = 1, the processor sets the IA32_EFER.LMA bit automatically to confirm that it is now in long mode. The LMA bit is read-only.

Re: Can't Get Long Mode to Work

Posted: Thu Feb 18, 2021 10:14 am
by SKC
quirck wrote: Actually, when you set IA32_EFER.LME = 1 and then turn on paging via CR0.PG = 1, the processor sets the IA32_EFER.LMA bit automatically to confirm that it is now in long mode. The LMA bit is read-only.
Yes. I thought that I had misread the AMD manuals, so I read them again and actually misread them, and now I've read them for the 3rd time. I should have noticed from the beginning that EFER is not OK...