Higher Half kernel load, linker error

Posted: Thu Mar 18, 2021 4:55 am
by finarfin
Hi all,

This time I'm trying to load the kernel in the higher half, starting at address 0xC0000000.

The first thing I did was update the bootloader to map the kernel memory into the higher half (I mapped the first 4 MiB), using 2 MiB pages:

Code:

    ; Now time to map the kernel in the higher half
    mov eax, hhk_p2_table   
    or eax, 0b11
    mov dword [fbb_p2_table + 0], eax

    mov eax, 0x00000000 ; This is the base address of the kernel
    or eax, 0b10000011
    mov [hhk_p2_table + 0], eax

    mov eax, 0x00200000
    or eax, 0b10000011
    mov [hhk_p2_table + 8 * 1], eax ;  Multiply by 1 just to highlight the entry number

    ; All set... now we are nearly ready to enter 64-bit mode
    ; cr3 can only be loaded from another register,
    ; so move the p4_table address into eax first,
    ; then into cr3
    mov eax, (p4_table - KERNEL_VIRTUAL_BASE_ADDRESS)
    mov cr3, eax

(I hope I understood correctly what needed to be done, and that the code above is not totally wrong! :shock: )
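
One way to sanity-check which table entries a mapping like this touches is to split the virtual address into its paging indices. A minimal sketch in C, assuming 4-level paging with 2 MiB pages (the function names are made up for illustration and are not part of the kernel above):

```c
#include <stdint.h>

/* Each paging level consumes 9 bits of the virtual address.
 * With 2 MiB pages the walk stops at the P2 level, and the low
 * 21 bits are the offset inside the page. */
static inline unsigned p4_index(uint64_t vaddr) { return (unsigned)((vaddr >> 39) & 0x1FF); }
static inline unsigned p3_index(uint64_t vaddr) { return (unsigned)((vaddr >> 30) & 0x1FF); }
static inline unsigned p2_index(uint64_t vaddr) { return (unsigned)((vaddr >> 21) & 0x1FF); }
```

With these, 0xC0000000 decodes to P4 entry 0, P3 entry 3, P2 entry 0, so the huge-page entry for it belongs in entry 0 of whichever P2 table is hooked into P3 entry 3.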

Then I updated the linker script (I have to admit that, despite all the tutorials I read, linker scripts are something I only have a rough idea about and still haven't fully understood):

Code:

ENTRY(start)

SECTIONS {
    . = 0x00100000;

    _kernel_start =.;
    .boot :
    {
        /* Be sure that multiboot header is at the beginning */
        *(.multiboot_header)
    }

    .text :
    {
        *(.text)
    }

    . += 0xC0000000;
	/* Add a symbol that indicates the start address of the kernel. */
	.text ALIGN (4K) : AT (ADDR (.text) - 0xC0000000)
	{
		*(.text)
	}
	.rodata ALIGN (4K) : AT (ADDR (.rodata) - 0xC0000000)
	{
		*(.rodata)
	}
	.data ALIGN (4K) : AT (ADDR (.data) - 0xC0000000)
	{
		*(.data)
	}
	.bss ALIGN (4K) : AT (ADDR (.bss) - 0xC0000000)
	{
		*(COMMON)
		*(.bss)
		*(.bootstrap_stack)
	}

    _kernel_end = .;
}
OK, the code above is copied from the wiki tutorial https://wiki.osdev.org/Higher_Half_x86_Bare_Bones; I probably need to adapt it to my actual code.

Now when I try to build the OS I get this linker error:

Code:

main.c:(.text+0x59): relocation truncated to fit: R_X86_64_PC32 against symbol `tagfb' defined in .bss section in build/kernel/main.o
main.c:(.text+0x65): relocation truncated to fit: R_X86_64_PC32 against symbol `tagfb' defined in .bss section in build/kernel/main.o
main.c:(.text+0x96): relocation truncated to fit: R_X86_64_PC32 against symbol `tagfb' defined in .bss section in build/kernel/main.o
main.c:(.text+0xc6): relocation truncated to fit: R_X86_64_PC32 against symbol `tagfb' defined in .bss section in build/kernel/main.o
main.c:(.text+0xf6): relocation truncated to fit: R_X86_64_PC32 against symbol `tagfb' defined in .bss section in build/kernel/main.o
main.c:(.text+0x127): relocation truncated to fit: R_X86_64_PC32 against symbol `tagfb' defined in .bss section in build/kernel/main.o
main.c:(.text+0x158): relocation truncated to fit: R_X86_64_PC32 against symbol `tagfb' defined in .bss section in build/kernel/main.o
main.c:(.text+0x17e): relocation truncated to fit: R_X86_64_PC32 against symbol `tagfb' defined in .bss section in build/kernel/main.o
main.c:(.text+0x1b9): relocation truncated to fit: R_X86_64_PC32 against symbol `tagmem' defined in .bss section in build/kernel/main.o
main.c:(.text+0x1d5): relocation truncated to fit: R_X86_64_PC32 against symbol `tagmem' defined in .bss section in build/kernel/main.o
where tagfb and tagmem contain multiboot header information. I think the problem is mostly in the linker script, which probably needs to be adapted, but I'm not sure what I need to change.

Can someone help me?

Another question: if I understood correctly, now that the kernel is mapped in the higher half together with the first 4 MiB of memory, the VGA memory that starts at 0xb8000 (I know it's deprecated :), and I'm planning to remove it soon) should now start at 0xC00b8000, correct?
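
The arithmetic behind that question can be written down as a small helper. A sketch in C, assuming the 0xC0000000 offset and the 4 MiB mapped window described in this post (the macro and function names are hypothetical):

```c
#include <stdint.h>

#define KERNEL_VIRTUAL_BASE 0xC0000000ULL /* higher-half offset used in this post */
#define HH_MAPPED_BYTES     0x400000ULL   /* only the first 4 MiB are mapped */

/* Higher-half alias of a physical address, or 0 if the address
 * lies outside the window that was actually mapped. */
static inline uint64_t phys_to_higher_half(uint64_t paddr)
{
    if (paddr >= HH_MAPPED_BYTES)
        return 0;
    return paddr + KERNEL_VIRTUAL_BASE;
}
```

The alias only exists for physical addresses below the mapped size, which is why the helper refuses anything at or above 4 MiB.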

One more question: the tutorial says that, with this mapping, the kernel's .text and .rodata are marked as writable, but I'm not sure how I should map them as non-writable (I mean, they are just portions of memory, so maybe they are in a segment that also contains parts that have to be writable?)

EDIT 1
OK, I did some tests, and I'm sure it is my linker script that is messed up. But as I said above, the more I try to understand it, the foggier my knowledge of it gets, lol.
What I understood is that the sections are messed up and have to be adapted to my kernel, and that the portion of the code that does the relocation must still be accessible in the lower half, so those sections should be declared in the 0x100000 part of the linker script.
Currently I have the following sections:
  • section .multiboot_header -> in the multiboot header file
  • section .text -> The bootloader code
  • section .bss -> Still in the bootloader file with the page tables data
  • section .rodata -> with the gdt data
Now I tried to update the sections:
  • Section .text has become .multiboot.text, and I updated the linker script, replacing the .text part (in the 0x100000 area) with .multiboot.text
  • I also added a section .text that starts right after paging is enabled (it should now refer to the kernel in the higher half)
  • I added a label higher_half and the code to jump to it just after paging is enabled. This is my change:

    Code:

      lea ebx, [higher_half]
      jmp ebx
    section .text
    higher_half:
      ;gdt load code
      ;and jmp to C function.
    
It still doesn't work, but now the error is no longer in main.c but somewhere else:

Code:

psf.c:(.text+0x55): relocation truncated to fit: R_X86_64_32S against symbol `_binary_fonts_default_psf_start' defined in .data section in build/default.o
psf.c:(.text+0x75): relocation truncated to fit: R_X86_64_32S against symbol `_binary_fonts_default_psf_start' defined in .data section in build/default.o
build/kernel/arch/x86_64/cpu/idt.o: in function `interrupts_handler':
idt.c:(.text+0x14): relocation truncated to fit: R_X86_64_32S against `.rodata'
build/kernel/arch/x86_64/cpu/idt.o: in function `init_idt':
idt.c:(.text+0x96): relocation truncated to fit: R_X86_64_32S against `.bss'
build/kernel/arch/x86_64/cpu/idt.o: in function `set_idt_entry':
idt.c:(.text+0x4ea): relocation truncated to fit: R_X86_64_32S against `.bss'
idt.c:(.text+0x4f5): relocation truncated to fit: R_X86_64_32S against `.bss'
idt.c:(.text+0x504): relocation truncated to fit: R_X86_64_32S against `.bss'
idt.c:(.text+0x50a): relocation truncated to fit: R_X86_64_32S against `.bss'
idt.c:(.text+0x511): relocation truncated to fit: R_X86_64_32S against `.bss'
idt.c:(.text+0x518): relocation truncated to fit: R_X86_64_32S against `.bss'
Is that progress? :)
P.S. I'm using NASM to assemble the assembly code.
End of edit 1

Edit 2
OK, digging into the forum I found that, in order to link all references properly, I had to add -mcmodel=large to the gcc compile flags.

Now it links correctly, but when it boots it is probably triple-faulting when trying to enable paging.
I'm not sure what the problem can be: could it be related to how I mapped the kernel in the page table, or should I still investigate the linker script?
End of edit 2
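
For context on the "relocation truncated to fit" errors: R_X86_64_32S stores an absolute address in a 32-bit field that is sign-extended to 64 bits, and R_X86_64_PC32 stores a signed 32-bit distance from the referencing instruction. A rough sketch in C of both range checks (the function names are hypothetical, and any addresses fed to them are illustrative only):

```c
#include <stdint.h>
#include <stdbool.h>

/* True if `value` can be stored in 32 bits and sign-extended back
 * to the same 64-bit value (the R_X86_64_32S constraint): it must
 * lie in the lowest 2 GiB or the highest 2 GiB of the address space. */
static inline bool fits_sign_extended_32(uint64_t value)
{
    return value <= 0x7FFFFFFFULL || value >= 0xFFFFFFFF80000000ULL;
}

/* True if the distance from `place` to `target` fits in a signed
 * 32-bit displacement (the R_X86_64_PC32 constraint). */
static inline bool fits_pc_relative_32(uint64_t target, uint64_t place)
{
    uint64_t delta = target - place; /* wraps modulo 2^64 */
    return fits_sign_extended_32(delta);
}
```

An address such as 0xC0100000 has bit 31 set, so it would sign-extend to 0xFFFFFFFFC0100000 and fails the first check. -mcmodel=large typically sidesteps both relocation kinds by using full 64-bit absolute addresses.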

Re: Higher Half kernel load, linker error

Posted: Thu Mar 18, 2021 9:13 am
by nullplan
The linker error messages imply you are linking a 64-bit executable, yet the rest of the stuff shown here implies a 32-bit kernel. Are you maybe not using a cross-compiler? You really should. I suggest you install clang, that way you can switch targets with the "--target" switch.

Re: Higher Half kernel load, linker error

Posted: Thu Mar 18, 2021 9:17 am
by finarfin
Hi nullplan,

Yeah, I'm using a cross-compiler.

I added the flag -mcmodel=large to gcc, and now it links, but as described in edit 2, when I try to enable paging I get a triple fault, so I'm guessing there is still something messed up in the memory layout.

This is my updated linker script:

Code:

ENTRY(start)

SECTIONS {
    . = 1M;

    _kernel_start =.;
    .multiboot_header :
    {
        /* Be sure that multiboot header is at the beginning */
        *(.multiboot_header)
    }

    .multiboot.text :
    {
        *(.multiboot.text)
    }

    . += 0xC0000000;
	/* Add a symbol that indicates the start address of the kernel. */
	.text ALIGN (4K) : AT (ADDR (.text) - 0xC0000000)
	{
		*(.text)
	}
	.rodata ALIGN (4K) : AT (ADDR (.rodata) - 0xC0000000)
	{
		*(.rodata)
	}
	.data ALIGN (4K) : AT (ADDR (.data) - 0xC0000000)
	{
		*(.data)
	}
	.bss ALIGN (4K) : AT (ADDR (.bss) - 0xC0000000)
	{
		*(.bss)
	}

    _kernel_end = .;
}

Another change is in the assembly code that maps the kernel; I think the previous version was wrong.
The new page table setup looks like this:

Code:

mov eax, p3_table   ; Copy p3_table address in eax
    or eax, 0b11        ; set writable and present bits to 1
    mov dword [p4_table + 0], eax   ; Copy eax content into the entry 0 of p4 table

    mov eax, p2_table   ; Let's do it again, with p2_table
    or eax, 0b11       ; Set the writable and present bits
    mov dword [p3_table + 0], eax   ; Copy eax content in the 0th entry of p3

    ; Now let's prepare a loop...
    mov ecx, 0  ; Loop counter
    
    .map_p2_table:
        mov eax, 0x200000   ; Size of the page
        mul ecx             ; Multiply by counter
        or eax, 0b10000011 ; We set: huge page bit, writable and present 

        ; Moving the computed value into p2_table entry defined by ecx * 8
        ; ecx is the counter, 8 is the size of a single entry
        mov [p2_table + ecx * 8], eax

        inc ecx             ; Let's increase ecx
        cmp ecx, 512        ; have we reached 512 ?
                            ; each table is 4k size. Each entry is 8bytes
                            ; that is 512 entries in a table
        
        jne .map_p2_table   ; if ecx < 512 then loop

    ; This section is temporary, it is here only to test the framebuffer features!
    ; It will be removed once memory management is implemented
    mov eax, fbb_p2_table
    or eax, 0b11
    mov dword [p3_table + 8 * 3], eax

    mov eax, 0xFD000000
    or eax, 0b10000011
    mov dword [fbb_p2_table + 8 * 488], eax

    ; Now time to map the kernel in the higher half
    ; mov eax, hhk_p2_table   
    ; or eax, 0b11
    ; mov dword [fbb_p2_table + 0], eax

    mov eax, 0x000000 ; This is the base address of the kernel
    or eax, 0b10000011
    mov dword [fbb_p2_table + 0], eax

    mov eax, 0x200000
    or eax, 0b10000011
    mov dword [fbb_p2_table + 8 * 1], eax ;  Multiply by 1 just to highlight the entry number

    ; All set... now we are nearly ready to enter 64-bit mode
    ; cr3 can only be loaded from another register,
    ; so move the p4_table address into eax first,
    ; then into cr3
    mov eax, (p4_table - 0xC0000000)
    mov cr3, eax

; Now we can enable PAE
    ; To do it we need to modify cr4, so first let's copy it into eax
    ; we can't modify cr registers directly
    mov eax, cr4
    or eax, 1 << 5  ; Physical address extension bit
    mov cr4, eax
    
    ; Now set up the long mode bit
    mov ecx, 0xC0000080
    ; rdmsr reads a model specific register (msr)
    ; it copies the low 32 bits of the msr into eax (the high 32 bits go into edx)
    rdmsr
    or eax, 1 << 8
    ; write back the value
    wrmsr
    
    ; Now it is time to enable paging
    mov eax, cr0    ;cr0 contains the values we want to change
    or eax, 1 << 31 ; Paging bit
    or eax, 1 << 16 ; Write protect, cpu  can't write to read-only pages when
                    ; privilege level is 0
    mov cr0, eax    ; write back cr0
    
    hlt 
    lea ebx, [higher_half]
    jmp ebx

The hlt instruction on the third-to-last line is never reached. (I tried putting hlt just before setting cr0, and there it is reached.)

Re: Higher Half kernel load, linker error

Posted: Thu Mar 18, 2021 10:26 am
by Velko
finarfin wrote:so i'm guessing there is still something meesed up in the memory layout.
Not sure about the memory layout, but the way you compile it is definitely messed up. I'll pretty much repeat what nullplan said before.

You're trying to produce a 64-bit executable. That is indicated by the "relocation truncated to fit: R_X86_64_PC32" error messages and your -mcmodel=large fix. Yet the assembly code (register names) is for 32-bit mode. Also, 0xC0000000 is the higher half for 32-bit, but not even close for 64-bit (that would be more like 0xFFFF800000000000).
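
To make the "not even close" point concrete, here is a small sketch in C, assuming 48-bit virtual addresses (4-level paging), where bits 63..47 must all be equal for an address to be canonical. The function names are invented for illustration:

```c
#include <stdint.h>
#include <stdbool.h>

/* Canonical: the 17 highest bits are all zeros or all ones. */
static inline bool is_canonical(uint64_t vaddr)
{
    uint64_t top = vaddr >> 47;
    return top == 0 || top == 0x1FFFF;
}

/* Higher half: canonical with the 17 highest bits all ones,
 * i.e. 0xFFFF800000000000 and above. */
static inline bool is_higher_half(uint64_t vaddr)
{
    return (vaddr >> 47) == 0x1FFFF;
}
```

Under that assumption 0xC0000000 is a perfectly valid address, but it sits in the lower half of the 64-bit address space.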

So the main question is: what type of kernel are you trying to make, 32-bit or 64-bit? You said you're using a cross-compiler. What is its target architecture?

If you are making a 32-bit kernel (I suggest you do, until you get a better understanding), you might have better luck using the -m32 compiler option (or, better yet, building a 32-bit cross-compiler).

Re: Higher Half kernel load, linker error

Posted: Thu Mar 18, 2021 10:36 am
by finarfin
Velko wrote:
finarfin wrote:so i'm guessing there is still something meesed up in the memory layout.
Not sure about memory layout, but the way how you compile it is definitely messed up. I'll pretty much will repeat what nullplan said before

You're trying to produce a 64-bit executable. That is indicated by relocation truncated to fit: R_X86_64_PC32 error messages and your -mcmodel=large fix. Yet, the assembly code (register names) are for 32-bit mode. Also, 0xC0000000 is higher half for 32-bit, but not even close for 64 (that would be more like 0xFFFF800000000000).
I fixed that part; it now compiles and links without problems (as specified in the edit and in my reply).
Velko wrote: So the main question is - what type of kernel you are trying to make? 32 or 64 bit? You said, you're using cross-compiler. What is the target architecture for that?
64bit, gcc target architecture is: x86_64
Velko wrote: If you are making 32 bit kernel (I suggest you do, until you get a better understanding), you might get better luck using -m32 compiler option (or better yet - producing 32-bit cross compiler).
No, I decided to try to make a 64-bit kernel, and I have no interest in doing a 32-bit one (also because I already did that in the past).
Velko wrote: If you still insist on 64-bit: I'm not sure if anything can be re-used. Probably your main.c and psf.c (from what I can tell), but linker scripts and assembly code needs big changes.
That is what I'm doing: I'm trying to follow the tutorial in the wiki, and another very similar tutorial from elsewhere. My linker script has already been heavily updated, and a lot of changes have been made to the assembly code, even though in theory the first part, before mapping the kernel, shouldn't need many changes. Again, the linking part is now fine (I had to add the -mcmodel=large flag), but I'm getting a triple fault as soon as I enable paging. That is the actual problem now, and I'm trying to figure out what it can be. Probably something is not mapped correctly, or I'm missing some changes somewhere.

Re: Higher Half kernel load, linker error

Posted: Thu Mar 18, 2021 2:02 pm
by finarfin
I dumped the registers at the point where the CPU raises the exception; this is the state:
check_exception old: 0xffffffff new 0xe
0: v=0e e=0000 i=0 cpl=0 IP=0010:00000000001000d8 pc=00000000001000d8 SP=0018:000000000007ff00 CR2=00000000001000d8
EAX=80010011 EBX=0010e620 ECX=c0000080 EDX=00000000
ESI=36d76289 EDI=0010e620 EBP=00000000 ESP=0007ff00
EIP=001000d8 EFL=00200086 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0018 00000000 ffffffff 00cf9300 DPL=0 DS [-WA]
CS =0010 00000000 ffffffff 00cf9a00 DPL=0 CS32 [-R-]
SS =0018 00000000 ffffffff 00cf9300 DPL=0 DS [-WA]
DS =0018 00000000 ffffffff 00cf9300 DPL=0 DS [-WA]
FS =0018 00000000 ffffffff 00cf9300 DPL=0 DS [-WA]
GS =0018 00000000 ffffffff 00cf9300 DPL=0 DS [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS64-busy
GDT= 00000000000010b0 00000020
IDT= 0000000000000000 00000000
CR0=80010011 CR2=00000000001000d8 CR3=0000000000108000 CR4=00000020
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
CCS=00000200 CCD=80010011 CCO=LOGICL
EFER=0000000000000500
So it looks like the triple fault is caused by the unhandled exception. Now I need to understand why! :)
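
Reading the dump: v=0e is vector 14 (page fault), e=0000 is the error code, and CR2 holds the faulting address, which here equals EIP. A small sketch in C for decoding the error code bits (the struct and function names are invented for illustration):

```c
#include <stdint.h>
#include <stdbool.h>

struct pf_info {
    bool present;      /* bit 0: 0 = page not present, 1 = protection violation */
    bool write;        /* bit 1: 0 = read access, 1 = write access */
    bool user;         /* bit 2: 0 = supervisor mode, 1 = user mode */
    bool reserved_bit; /* bit 3: a reserved bit was set in a paging entry */
    bool instr_fetch;  /* bit 4: instruction fetch (only reported with NX or SMEP) */
};

static inline struct pf_info decode_page_fault(uint32_t err)
{
    struct pf_info info = {
        .present      = (err >> 0) & 1u,
        .write        = (err >> 1) & 1u,
        .user         = (err >> 2) & 1u,
        .reserved_bit = (err >> 3) & 1u,
        .instr_fetch  = (err >> 4) & 1u,
    };
    return info;
}
```

For e=0000 every flag is clear: a supervisor-mode read of a page that is not present. Since CR2 equals EIP, the access that failed looks like the fetch of the next instruction after paging was switched on, which suggests the low identity mapping is not actually in the tables CR3 points to.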

Re: Higher Half kernel load, linker error

Posted: Thu Mar 18, 2021 2:24 pm
by Velko
Ok, 64-bit it is. For starters you have to get Compatibility Mode going - all code for this mode should still be 32-bit. Paging structures should be set up as for 64-bit mode.

You will not be able to run C code compiled for 64-bit, as you need to switch to Long Mode first (set up GDT appropriately and do a long jump to 64-bit code). You should be able to reach your hlt instruction however.

What are the addresses of your p4_table, p3_table etc.? When loading into cr3, you subtract 0xC0000000 from p4_table and error dump shows that cr3 is 0x108000. So it must be located somewhere in "upper half". Yet, when filling the tables you treat the addresses as-is.

Everything that you put in page tables should be physical addresses, and while paging is not enabled, you can not write in "high" addresses either (if you do not have some crazy setup using GDT).

Re: Higher Half kernel load, linker error

Posted: Thu Mar 18, 2021 5:28 pm
by finarfin
Velko wrote:Ok, 64-bit it is. For starters you have to get Compatibility Mode going - all code for this mode should still be 32-bit. Paging structures should be set up as for 64-bit mode.

You will not be able to run C code compiled for 64-bit, as you need to switch to Long Mode first (set up GDT appropriately and do a long jump to 64-bit code).
Sorry, maybe I wasn't totally clear: all of that is done. I have a basic kernel with IDT, framebuffer and I/O routines that works fine if loaded in the lower half (at 0x100000).
Velko wrote: You should be able to reach your hlt instruction however.
Yeah, that is actually my problem: if I try to move stuff to the higher half, I get this page fault.
Velko wrote: What are the addresses of your p4_table, p3_table etc.? When loading into cr3, you subtract 0xC0000000 from p4_table and error dump shows that cr3 is 0x108000. So it must be located somewhere in "upper half". Yet, when filling the tables you treat the addresses as-is.
The kernel starts at 0x100000.
Velko wrote: Everything that you put in page tables should be physical addresses, and while paging is not enabled, you can not write in "high" addresses either (if you do not have some crazy setup using GDT).
Yes, I know. But I probably forgot to do something needed to move the kernel to the higher half; I'm not sure.

This is the full boot.s code:

Code:

global start
extern kernel_start


section .multiboot.text
bits 32

start:
    mov edi, ebx ; Address of multiboot structure
    mov esi, eax ; Magic number
    ; For now we are going to use 2 MiB pages,
    ; so we need only 3 table levels instead of 4
    mov eax, p3_table   ; Copy p3_table address in eax
    or eax, 0b11        ; set writable and present bits to 1
    mov dword [p4_table + 0], eax   ; Copy eax content into the entry 0 of p4 table

    mov eax, p2_table   ; Let's do it again, with p2_table
    or eax, 0b11       ; Set the writable and present bits
    mov dword [p3_table + 0], eax   ; Copy eax content in the 0th entry of p3

    ; Now let's prepare a loop...
    mov ecx, 0  ; Loop counter
    
    .map_p2_table:
        mov eax, 0x200000   ; Size of the page
        mul ecx             ; Multiply by counter
        or eax, 0b10000011 ; We set: huge page bit, writable and present 

        ; Moving the computed value into p2_table entry defined by ecx * 8
        ; ecx is the counter, 8 is the size of a single entry
        mov [p2_table + ecx * 8], eax

        inc ecx             ; Let's increase ecx
        cmp ecx, 512        ; have we reached 512 ?
                            ; each table is 4k size. Each entry is 8bytes
                            ; that is 512 entries in a table
        
        jne .map_p2_table   ; if ecx < 512 then loop

    ; This section is temporary, it is here only to test the framebuffer features!
    ; It will be removed once memory management is implemented
    mov eax, fbb_p2_table
    or eax, 0b11
    mov dword [p3_table + 8 * 3], eax

    mov eax, 0xFD000000
    or eax, 0b10000011
    mov [fbb_p2_table + 8 * 488], eax

    ; Now time to map the kernel in the higher half
    mov eax, hhk_p2_table   
    or eax, 0b11
    mov dword [fbb_p2_table + 0], eax

    mov eax, 0x00000000 ; This is the base address of the kernel
    or eax, 0b10000011
    mov [hhk_p2_table + 0], eax

    mov eax, 0x00200000
    or eax, 0b10000011
    mov [hhk_p2_table + 8 * 1], eax ;  Multiply by 1 just to highlight the entry number

    ; All set... now we are nearly ready to enter 64-bit mode
    ; cr3 can only be loaded from another register,
    ; so move the p4_table address into eax first,
    ; then into cr3
    mov eax, (p4_table - 0xC0000000)
    mov cr3, eax

    ; Now we can enable PAE
    ; To do it we need to modify cr4, so first let's copy it into eax
    ; we can't modify cr registers directly
    mov eax, cr4
    or eax, 1 << 5  ; Physical address extension bit
    mov cr4, eax
    
    ; Now set up the long mode bit
    mov ecx, 0xC0000080
    ; rdmsr reads a model specific register (msr)
    ; it copies the low 32 bits of the msr into eax (the high 32 bits go into edx)
    rdmsr
    or eax, 1 << 8
    ; write back the value
    wrmsr
    
    ; Now it is time to enable paging
    mov eax, cr0    ;cr0 contains the values we want to change
    or eax, 1 << 31 ; Paging bit
    or eax, 1 << 16 ; Write protect, cpu  can't write to read-only pages when
                    ; privilege level is 0
    mov cr0, eax    ; write back cr0
    
    lea ebx, [higher_half]
    jmp ebx
section .text
higher_half:
    ; load gdt 
    lgdt [gdt64.pointer]
    

    ; update segment selectors
    mov ax, gdt64.data
    mov ss, ax  ; Stack segment selector
    mov ds, ax  ; data segment register
    mov es, ax  ; extra segment register
    mov fs, ax  ; extra segment register
    mov gs, ax  ; extra segment register
    ; Far jump to long mode
    jmp  gdt64.code:kernel_start

section .bss

align 4096
p4_table:
    resb 4096
p3_table:
    resb 4096
p2_table:
    resb 4096
%ifdef SMALL_PAGES
; if SMALL_PAGES is defined it means we are using 4k pages
; For now the first 8mb will be mapped for the kernel.
; This part is not implemented yet 
pt1_table:
    resb 4096
pt2_table:
    resb 4096
pt3_table:
    resb 4096
pt4_table:
    resb 4096
%endif
; This section is temporary to test the framebuffer
align 4096
fbb_p3_table:
    resb 4096
fbb_p2_table:
    resb 4096

; This section is for loading the kernel in the higher half (0xC0000000)
; In the P2 table (page directory), entry 0 maps the first 2 MiB of the kernel
hhk_p2_table:
    resb 4096

section .rodata

; gdt table needs at least 3 entries:
;     the first entry is always null
;     the other two are data segment and code segment.
gdt64:
    dq  0	;first entry = 0
.code equ $ - gdt64
    ; set the following values:
    ; descriptor type: bit 44 has to be 1 for code and data segments
    ; present: bit 47 has to be  1 if the entry is valid
    ; read/write: bit 41 1 means that is readable
    ; executable: bit 43 it has to be 1 for code segments
    ; 64bit: bit 53 1 if this is a 64bit gdt
    dq (1 <<44) | (1 << 47) | (1 << 41) | (1 << 43) | (1 << 53)  ;second entry=code
.data equ $ - gdt64
    dq (1 << 44) | (1 << 47) | (1 << 41)	;third entry = data

.pointer:
    dw .pointer - gdt64 - 1
    dq gdt64

To recap:

  • GDT/IDT done
  • Basic video output done
  • Paging, long mode and 64-bit fully enabled

All the stuff above works fine if the kernel is loaded at 0x100000. So I'm pretty sure something is messed up with the address space/page directories.

Re: Higher Half kernel load, linker error

Posted: Thu Mar 18, 2021 6:56 pm
by sj95126
I think you have a few things out of order. I never wrote mine to go from 32-bit to 64-bit so I may be misreading it (my bootsect goes straight from real mode to long mode). But I think you want to hold off on the jump to higher half until after you've jumped into the 64-bit code path. You should also load the 64-bit GDT before enabling long mode.

Is the reason you're using a higher-half address like 0xC0000000 so that you can jump to it directly from 32-bit mode? It's probably better to separate the two: jump to a low address to enter 64-bit mode, then trampoline to a high address (in 64-bit kernels, usually in the space above 0xffff800000000000).

Again, mine is slightly different, but I do the following:

- create page tables and set cr3
- set cr4.PAE
- set long mode bit in EFER MSR
- load GDT
- set cr0.PG and cr0.PE (you're already in PE)
- jump to new code segment
- load data segment registers
- trampoline to high load address
- adjust stack, GDT, etc. to use high load address
- remove low memory mapping from page tables
- jump to kernel
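
The register bits touched by the first few steps of that list can be written down as constants. A sketch in C, assuming the standard x86-64 bit positions (the helper names are hypothetical; in a real boot path these values are written from assembly with mov to cr0/cr4 and with wrmsr):

```c
#include <stdint.h>

#define CR0_PE   (1u << 0)   /* protected mode enable */
#define CR0_PG   (1u << 31)  /* paging */
#define CR4_PAE  (1u << 5)   /* physical address extension */
#define EFER_MSR 0xC0000080u /* MSR number loaded into ecx before rdmsr/wrmsr */
#define EFER_LME (1u << 8)   /* long mode enable */

/* Each helper returns the new register value for one step. */
static inline uint32_t cr4_with_pae(uint32_t cr4)       { return cr4 | CR4_PAE; }
static inline uint32_t efer_with_lme(uint32_t efer_low) { return efer_low | EFER_LME; }
static inline uint32_t cr0_with_paging(uint32_t cr0)    { return cr0 | CR0_PG | CR0_PE; }
```

Keeping the steps as separate helpers mirrors the ordering in the list: PAE and LME are set first, and the write that sets PG is the one that actually activates long mode.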

Re: Higher Half kernel load, linker error

Posted: Thu Mar 18, 2021 10:20 pm
by nullplan
sj95126 wrote:I think you have a few things out of order. I never wrote mine to go from 32-bit to 64-bit so I may be misreading it (my bootsect goes straight from real mode to long mode). But I think you want to hold off on the jump to higher half until after you've jumped into the 64-bit code path. You should also load the 64-bit GDT before enabling long mode.
Yep, that is basically it. You need a 64-bit part to your lower-half trampoline.

Due to the way GDT segments work, you only require a single GDT that contains both 64-bit segments and 32-bit segments. In protected mode, you just never reference the 64-bit segments, and everything will be fine. After the mode switch, you do a "far" jump, which won't go very far, just to the next instruction, but it loads the 64-bit code segment, meaning you will be in 64-bit mode from the next instruction onward. And then you can jump to arbitrary 64-bit addresses.

Code:

bits 32
[...]
  lgdt [gdt64.pointer]
  mov eax, cr0
  bts eax, 31
  mov cr0, eax
  ; we are in Long Compatibility mode now
  jmp 8:next

bits 64
next:
  mov ax, 16
  mov ds, ax
  mov es, ax
  mov fs, ax
  mov gs, ax
  mov ss, ax
  jmp higher_half
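
The selectors 8 and 16 in the snippet are byte offsets into the GDT, i.e. entries 1 and 2. A sketch in C of how such descriptors can be composed, using the same bit positions as the gdt64 table in boot.s earlier in the thread (the macro and function names are made up here):

```c
#include <stdint.h>

#define GDT_READ_WRITE (1ULL << 41) /* readable code / writable data */
#define GDT_EXECUTABLE (1ULL << 43) /* set for code segments */
#define GDT_CODE_DATA  (1ULL << 44) /* descriptor type: code or data */
#define GDT_PRESENT    (1ULL << 47) /* entry is valid */
#define GDT_LONG_MODE  (1ULL << 53) /* 64-bit code segment */

static inline uint64_t gdt_code64(void)
{
    return GDT_CODE_DATA | GDT_PRESENT | GDT_READ_WRITE | GDT_EXECUTABLE | GDT_LONG_MODE;
}

static inline uint64_t gdt_data(void)
{
    return GDT_CODE_DATA | GDT_PRESENT | GDT_READ_WRITE;
}

/* A selector is the descriptor index shifted left by 3
 * (the low 3 bits hold the table indicator and privilege level). */
static inline unsigned selector_index(uint16_t selector)
{
    return selector >> 3;
}
```

The code descriptor comes out as 0x00209A0000000000 and the data descriptor as 0x0000920000000000.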

Re: Higher Half kernel load, linker error

Posted: Fri Mar 19, 2021 1:52 am
by Velko
The kernel starts at 0x100000.
Yes, physical addresses. Logical addresses are assigned by linker, and by means of linker script you have specified that .bss contents should have 0xc0000000+ addresses. That is fine. Still, while there is no address calculation enabled (no paging or segmentation tricks), you have to do the address translation yourself.

Here's a small snippet of your code, I changed the comments to describe what is really happening there.

Code:

    mov eax, p3_table   ; Load 0xC0109000 (linker-assigned address of p3_table) into eax
    or eax, 0b11        ; set writable and present bits to 1
    mov dword [p4_table + 0], eax   ; Store contents of eax into address 0xC0108000
You take logical address of p3_table, and try to store it in a place where p4_table[0] will be accessible after paging is enabled.

Hint: there's a place in code, where you did it correctly:

Code:

    mov eax, (p4_table - 0xC0000000)
    mov cr3, eax
Here you took logical address of p4_table, adjusted it for higher-half offset, getting the physical address back. Then loaded it into cr3.
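
The same adjustment applies to every value stored in a table, not just the one loaded into cr3. A minimal sketch in C of what a correct table entry looks like, assuming the 0xC0000000 offset from the linker script and the addresses quoted in the snippet above (macro and function names are hypothetical):

```c
#include <stdint.h>

#define KERNEL_VIRTUAL_BASE 0xC0000000ULL
#define PTE_PRESENT  (1ULL << 0)
#define PTE_WRITABLE (1ULL << 1)

/* Convert a linker-assigned (higher-half) address back to the
 * physical address the code is actually loaded at. */
static inline uint64_t virt_to_phys(uint64_t linked_addr)
{
    return linked_addr - KERNEL_VIRTUAL_BASE;
}

/* Entry pointing at a next-level table: physical address plus flags. */
static inline uint64_t table_entry(uint64_t linked_table_addr)
{
    return virt_to_phys(linked_table_addr) | PTE_PRESENT | PTE_WRITABLE;
}
```

With p3_table linked at 0xC0109000, the entry stored in p4_table[0] should be 0x109003, and the store itself has to target physical 0x108000 while paging is still off.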

Re: Higher Half kernel load, linker error

Posted: Fri Mar 19, 2021 10:02 am
by iansjack
Your register dump shows that, whatever you may think, you are running in 32-bit mode.

Re: Higher Half kernel load, linker error

Posted: Fri Mar 19, 2021 12:55 pm
by finarfin
iansjack wrote:Your register dump shows that, whatever you may think, you are running in 32-bit mode.
Well, yes! Because I still haven't jumped to 64-bit code.
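
Both statements fit the dump, which shows EFER=0x500 together with a CS32 code segment. A tiny sketch in C for reading the two relevant EFER bits (the function names are hypothetical):

```c
#include <stdint.h>
#include <stdbool.h>

#define EFER_LME (1ULL << 8)  /* long mode enable: requested by software */
#define EFER_LMA (1ULL << 10) /* long mode active: set by the CPU once paging is on */

static inline bool long_mode_enabled(uint64_t efer) { return (efer & EFER_LME) != 0; }
static inline bool long_mode_active(uint64_t efer)  { return (efer & EFER_LMA) != 0; }
```

0x500 has both bits set, so long mode is active, but with a 32-bit code segment loaded the CPU is still executing in compatibility mode until the far jump to a 64-bit code segment.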

Re: Higher Half kernel load, linker error

Posted: Fri Mar 19, 2021 1:31 pm
by finarfin
nullplan wrote:
sj95126 wrote:I think you have a few things out of order. I never wrote mine to go from 32-bit to 64-bit so I may be misreading it (my bootsect goes straight from real mode to long mode). But I think you want to hold off on the jump to higher half until after you've jumped into the 64-bit code path. You should also load the 64-bit GDT before enabling long mode.
Yep, that is basically it. You need a 64-bit part to your lower-half trampoline.

Due to the way GDT segments work, you only require a single GDT that contains both 64-bit segments and 32-bit segments. In protected mode, you just never reference the 64-bit segments, and everything will be fine. After the mode switch, you do a "far" jump, which won't go very far, just to the next instruction, but it loads the 64-bit code segment, meaning you will be in 64-bit mode from the next instruction onward. And then you can jump to arbitrary 64-bit addresses.

Code:

bits 32
[...]
  lgdt [gdt64.pointer]
  mov eax, cr0
  bts eax, 31
  mov cr0, eax
  ; we are in Long Compatibility mode now
  jmp 8:next

bits 64
next:
  mov ax, 16
  mov ds, ax
  mov es, ax
  mov fs, ax
  mov gs, ax
  mov ss, ax
  jmp higher_half
Ah OK, my bad. I was expecting that moving the kernel to the higher half in 64-bit would be more or less the same as doing it in 32-bit, so I was just trying to follow the instructions in the wiki tutorial.

So in the 64-bit world, the move to the higher half can be done after I have set everything up in the 32-bit world?

And I was using 0xC0000000 because I saw many examples using this address, but I didn't notice that they were all 32-bit examples.
nullplan wrote:
sj95126 wrote:I think you have a few things out of order. I never wrote mine to go from 32-bit to 64-bit so I may be misreading it (my bootsect goes straight from real mode to long mode). But I think you want to hold off on the jump to higher half until after you've jumped into the 64-bit code path. You should also load the 64-bit GDT before enabling long mode.
Yep, that is basically it. You need a 64-bit part to your lower-half trampoline.

Due to the way GDT segments work, you only require a single GDT that contains both 64-bit segments and 32-bit segments. In protected mode, you just never reference the 64-bit segments, and everything will be fine. After the mode switch, you do a "far" jump, which won't go very far, just to the next instruction, but it loads the 64-bit code segment, meaning you will be in 64-bit mode from the next instruction onward. And then you can jump to arbitrary 64-bit addresses.
Does that mean my GDT should have 4 entries: 2 for the 32-bit data and code segments and 2 for the 64-bit versions?

I see that in your example you enable paging after loading the GDT and not before. Is that strictly necessary, or can I keep enabling paging just before loading the GDT?

When you do the jmp 8:next, are the instructions after the label already in the higher half?

But then, if I'm going to load the kernel at 0xffff800000000000, the address can't fit in a 32-bit register, so I suppose the linker script should be different, correct? From what I understood from @velko's comment, one of my problems is that I am using symbols in the assembly as if they were still placed at the old addresses (not subtracting 0xC0000000), and in the 32-bit world I can't subtract a 64-bit address (I suppose).

Is there a good bare-bones example of a 64-bit kernel loaded in the higher half?

EDIT
I think this can be a good example: https://github.com/charpointer/celesteo ... asm/boot.s?

There is only one thing not totally clear to me: all the initialization is done in 32-bit mode, OK, got it, but when initializing the data structures I see subtractions like p4_table - 0xFFFFFFFF80000000, which AFAIK is a 64-bit value. Is that possible?

Also, reading that example, it looks like I need just one code and one data segment in the GDT. Is that correct?

Another thing I don't understand in his linker script is why he does something like:

Code:

.data ALIGN(4K) :  AT(ADDR(.data) - kern_virt_offset)
	{
		sections_data = . - kern_virt_offset;
                KEEP(*(.data*))
                sections_data_end = . - kern_virt_offset;
	}

while in many other projects I see something similar to this:

Code:

   .data ALIGN (0x1000) : AT(ADDR(.data) - 0xC0000000) {
       *(.data)
   }

   .bss : AT(ADDR(.bss) - 0xC0000000) {
       _sbss = .;
       *(COMMON)
       *(.bss)
       _ebss = .;
   }

Is there a big difference? (I see that in some cases labels like _sbss or sections_data_end are added, and in other projects not. Is there a reason for them, or do they just depend on the developer's choices?)

Ah, one last thing: what is the purpose of the newly allocated stack, and why is it needed? And is there any convention about its size?

Re: Higher Half kernel load, linker error

Posted: Fri Mar 19, 2021 11:54 pm
by Octocontrabass
finarfin wrote:There is only one thing not totally clear to me, all the initalization is done at 32 bit, ok got it, but when initalizing the data strcuture i see the subtractions like: p4_table - 0xFFFFFFFF80000000 that afaik is a 64bit value. Is that possible??
The result is correct after truncation to 32 bits so it's fine. (The linker might complain but it will still work.) Using that address for the kernel VMA has an additional advantage: you can use -mcmodel=kernel instead of -mcmodel=large.
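
To see why the truncation works: the difference between a symbol linked in the top 2 GiB and 0xFFFFFFFF80000000 is a small physical address, and the same low 32 bits come out even if both operands are truncated first. A quick sketch in C (the function names and any symbol address passed in are hypothetical):

```c
#include <stdint.h>

#define KERNEL_VMA 0xFFFFFFFF80000000ULL

/* Subtract in 64 bits, then keep the low 32 bits. */
static inline uint32_t phys_from_full(uint64_t linked_addr)
{
    return (uint32_t)(linked_addr - KERNEL_VMA);
}

/* Truncate both operands to 32 bits first, then subtract.
 * Unsigned arithmetic wraps modulo 2^32, so the result matches. */
static inline uint32_t phys_from_truncated(uint64_t linked_addr)
{
    return (uint32_t)linked_addr - (uint32_t)KERNEL_VMA;
}
```

For a table linked at, say, 0xFFFFFFFF80108000, both routes give 0x108000, which is a value 32-bit code can load into cr3.
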
finarfin wrote:Btw reading that example looks like that i can just need one code and one data segment of the gdt, is that correct?
It works as long as you don't load any segment registers in 32-bit mode (other than the jump to 64-bit mode).
finarfin wrote:Is there a big difference? (and i see that in some case labels like _sbss, section_data_end are added, and in other projects not. Is there a reason for them? (or maybe are just depending on some of the developer choices?)
Those two linker scripts are pretty much the same thing. The labels are used to find out where the section was loaded and how big it is. If you need to do that, you'll add labels like that to your linker script.
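
As an illustration of how such labels get used from C: a kernel would typically declare them as `extern char _sbss[], _ebss[];` and pass them to helpers like the ones below. This is only a sketch with made-up helper names; it takes plain pointers so it does not depend on any particular linker script:

```c
#include <stddef.h>
#include <stdint.h>

/* Size in bytes of the region between two linker-provided labels. */
static inline size_t region_size(const void *start, const void *end)
{
    return (size_t)((const uint8_t *)end - (const uint8_t *)start);
}

/* Zero a region byte by byte, e.g. to clear .bss before entering C code. */
static inline void zero_region(void *start, void *end)
{
    for (uint8_t *p = start; p != (uint8_t *)end; ++p)
        *p = 0;
}
```

With the second linker script above, clearing .bss would then be `zero_region(_sbss, _ebss)`.
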
finarfin wrote:Ah one last thing; the purpose of the new stack allocated, why is needed? And what about the size is there any convention?
Multiboot does not provide a stack. If you want to use a stack, you must set it up yourself. The example you linked reuses the same memory for the stack in 32-bit and 64-bit mode. I don't think there's any standard for how big the stack needs to be, especially since different kernels will have different stack usage needs. (Linux uses 8kiB, though, so that might be a good starting point until you're ready to figure out in more detail exactly what your kernel needs.)
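
For illustration, reserving such a stack from C could look like the sketch below. It assumes the 8 KiB starting point mentioned above and a 16-byte alignment; the names are hypothetical, and the actual load of the stack pointer still has to happen in assembly:

```c
#include <stdint.h>

#define BOOT_STACK_SIZE 8192u /* starting point only; tune to the kernel's needs */

/* Statically reserved stack memory, 16-byte aligned (C11 _Alignas). */
static _Alignas(16) uint8_t boot_stack[BOOT_STACK_SIZE];

/* The x86 stack grows downwards, so the initial stack pointer
 * is the address just past the end of the array. */
static inline uint8_t *boot_stack_top(void)
{
    return boot_stack + BOOT_STACK_SIZE;
}
```

Because the size is a multiple of 16, the top of the stack keeps the same alignment as its base.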