Kernel jumps back to GRUB after setting up higher half

tabz
Member
Posts: 35
Joined: Fri Apr 20, 2018 9:15 am
Location: Cambridge, UK

Post by tabz »

I just set my kernel up to operate in the higher half of memory and knew that I'd have to change my identity page mapping from 0 -> size of kernel to 3 GiB -> 4 GiB. However, my kernel now jumps back into GRUB code immediately after loading the kernel page directory into the cr3 register. It then runs my kernel again, jumps back to GRUB after setting the page directory again, and the loop continues forever. Does anyone have any idea what I could be doing wrong? Something tells me that I may have the identity mapping physical and virtual addresses mixed up/wrong, but I've struggled to find the problem by myself. The last debug log statement I see is "Setting paging directory".

I know it's jumping back into GRUB code because the screen says "SeaBIOS <some version>", followed by other details and "Welcome to grub!".

* Kernel source
* Kernel main C file
* Paging C file
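
For reference, with 4 KiB pages a 32-bit linear address splits into a page-directory index, a page-table index and a page offset, which is why a kernel based at 3 GiB ends up in the upper quarter of the page directory. A minimal C sketch of that split follows; the helper names (`pd_index`, `pt_index`, `page_offset`, `make_vaddr`) are made up for illustration and are not taken from the kernel source linked above.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit x86 paging with 4 KiB pages: 10 bits of page-directory index,
 * 10 bits of page-table index, 12 bits of offset within the page. */
static inline uint32_t pd_index(uint32_t vaddr)    { return vaddr >> 22; }
static inline uint32_t pt_index(uint32_t vaddr)    { return (vaddr >> 12) & 0x3FFu; }
static inline uint32_t page_offset(uint32_t vaddr) { return vaddr & 0xFFFu; }

/* Rebuild a linear address from its three parts (hypothetical helper). */
static inline uint32_t make_vaddr(uint32_t pd, uint32_t pt, uint32_t off)
{
    assert(pd < 1024 && pt < 1024 && off < 4096);
    return (pd << 22) | (pt << 12) | off;
}
```

With these helpers, `pd_index(0xC0000000)` evaluates to 768 and `pt_index(0x00100000)` to 256, so the 1 MiB load address is entry 256 of whichever page table covers the first 4 MiB.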
electrodeyt
Posts: 3
Joined: Sun Jun 25, 2017 10:00 am

Post by electrodeyt »

It's triple faulting.
Octocontrabass
Member
Posts: 5586
Joined: Mon Mar 25, 2013 7:01 pm

Post by Octocontrabass »

tabz wrote:Something tells me that I may have the identity mapping physical and virtual addresses mixed up/wrong,
"Identity mapping" is where you set up the virtual address to be the same as the physical address. I haven't looked too hard at your code, but I didn't see any identity mapping. I only saw the part that maps the kernel to ~3GB.
tabz wrote:I know it's jumping back into GRUB code because the screen says "SeaBIOS <some version>"
SeaBIOS is not GRUB, SeaBIOS is the BIOS. If you're seeing the BIOS, the computer probably rebooted. If the computer rebooted, you probably caused a triple fault.

Post by tabz »

Thanks for the feedback, would you have any idea why it might be triple faulting? I'm guessing it might be for a whole host of reasons.

I've realised I shouldn't be identity mapping as obviously the kernel is loaded in at 1 MiB in physical memory with 0 as the physical addressing base, but uses 3 GiB as the virtual addressing base, so identity mapping would obviously be a bad idea. I've updated my code accordingly.

Post by Octocontrabass »

tabz wrote:Thanks for the feedback, would you have any idea why it might be triple faulting? I'm guessing it might be for a whole host of reasons.
Some virtual machines will give you some information about the cause of the triple fault. If the one you're using doesn't do that, try running your OS in Bochs and see what it says.

You'll get even more insight by using a debugger.
tabz wrote:I've realised I shouldn't be identity mapping as obviously the kernel is loaded in at 1 MiB in physical memory with 0 as the physical addressing base, but uses 3 GiB as the virtual addressing base, so identity mapping would obviously be a bad idea.
Identity mapping is a requirement of the x86 architecture. You can't enable paging without identity mapping at least one page (the page containing the code that enables paging). You can remove the identity mapping once paging is enabled.
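
The effect can be illustrated with a small software model of the two-level page walk. This is only a sketch that runs as an ordinary user-space C program: `struct model_mmu`, `fill_first_4mib` and `translate` are hypothetical names, and C pointers stand in for the physical page-table addresses a real page directory would hold.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRIES      1024u
#define PAGE_PRESENT 0x001u
#define PAGE_WRITE   0x002u
#define NOT_MAPPED   0xFFFFFFFFu   /* sentinel used only by this model */

/* Model only: each directory slot holds a C pointer to a page table (or NULL)
 * instead of a physical frame address. */
struct model_mmu {
    const uint32_t *tables[ENTRIES];
};

/* Fill a page table so that entry i maps physical frame i (covers 0..4 MiB). */
static void fill_first_4mib(uint32_t table[ENTRIES])
{
    for (uint32_t i = 0; i < ENTRIES; i++)
        table[i] = (i << 12) | PAGE_PRESENT | PAGE_WRITE;
}

/* Translate a linear address the way a two-level walk with 4 KiB pages would. */
static uint32_t translate(const struct model_mmu *mmu, uint32_t vaddr)
{
    assert(mmu != NULL);
    const uint32_t *table = mmu->tables[vaddr >> 22];
    if (table == NULL)
        return NOT_MAPPED;                 /* directory entry not present */
    uint32_t pte = table[(vaddr >> 12) & 0x3FFu];
    if (!(pte & PAGE_PRESENT))
        return NOT_MAPPED;                 /* page table entry not present */
    return (pte & 0xFFFFF000u) | (vaddr & 0xFFFu);
}
```

If only directory slot 768 points at the table, `translate` resolves 0xC0101234 but reports the low address 0x00101234 (where the code enabling paging would still be running) as unmapped. Pointing slot 0 at the same table as well makes both addresses resolve to the same physical address.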

That brings up another issue: you're trying to enable paging from within C code, but there's no way to change the base address mid-function. That's why examples like this one set up the initial page tables and enable paging before doing anything else. (That example also sets up the initial page table in assembly. There's no reason you couldn't do it in C, but it would make linking more complicated.)

And speaking of enabling paging... are you setting the right bit in CR0?
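
For reference, paging is controlled by bit 31 of CR0 (PG), and bit 16 (WP) makes the CPU honour read-only pages in supervisor mode as well. A minimal sketch of computing the new CR0 value in C, assuming the register itself is read and written from assembly; the macro and function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

#define CR0_PE (1u << 0)    /* protected mode enable */
#define CR0_WP (1u << 16)   /* write protect: supervisor honours read-only pages */
#define CR0_PG (1u << 31)   /* paging enable */

/* Compute the value to write back to CR0 (hypothetical helper; the actual
 * move to and from %cr0 still has to be done in assembly). */
static uint32_t cr0_with_paging(uint32_t cr0)
{
    /* Paging requires protected mode, so PE must already be set. */
    assert(cr0 & CR0_PE);
    return cr0 | CR0_PG | CR0_WP;
}
```

The two bits together form the mask 0x80010000, so a CR0 value of 0x00000011 would become 0x80010011.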

Post by tabz »

So this issue is occurring again. I can confirm that my exception handlers work and that GRUB isn't launching my entry symbol (start), so I know this isn't because of a triple fault. I checked this by attaching GDB to the QEMU instance and setting a breakpoint (both software and hardware) on the start symbol, then continuing execution, but it keeps looping and never reaches the breakpoint.

Useful links:
* Link script
* Boot assembly file

You can check the dev branch in the above links for working examples (but without the higher half).

Here are some relevant extracts:

```
ENTRY(start)

KERNEL_ADDR_OFFSET = 0xC0000000;
KERNEL_VADDR_START = 0xC0100000;

SECTIONS
{
    /* The kernel will be loaded at 1MB, but will be mapped to 3GB + 1MB  */
    . = KERNEL_VADDR_START;

    .text ALIGN(4K) : AT (ADDR (.text) - KERNEL_ADDR_OFFSET)
    {
        *(.multiboot)
        *(.text)
    }

    .rodata ALIGN(4K) : AT (ADDR (.rodata) - KERNEL_ADDR_OFFSET)
    {
        *(.rodata)
    }

    .data ALIGN(4K) : AT (ADDR (.data) - KERNEL_ADDR_OFFSET)
    {
        *(.data)
    }

    .bss ALIGN(4K) : AT (ADDR (.bss) - KERNEL_ADDR_OFFSET)
    {
        *(COMMON)
        *(.bss)
    }

    KERNEL_VADDR_END = .;
}
```
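
The `AT (...)` expressions in the linker script make each section's load address equal to its virtual address minus `KERNEL_ADDR_OFFSET`. The same arithmetic in C, as a sketch: the helpers `virt_to_phys` and `phys_to_virt` are hypothetical and only valid for addresses inside this fixed kernel mapping.

```c
#include <assert.h>
#include <stdint.h>

#define KERNEL_ADDR_OFFSET 0xC0000000u   /* same value as in the linker script */

/* Virtual (link-time) address -> physical (load) address. */
static uint32_t virt_to_phys(uint32_t vaddr)
{
    assert(vaddr >= KERNEL_ADDR_OFFSET);
    return vaddr - KERNEL_ADDR_OFFSET;
}

/* Physical address -> virtual address inside the kernel mapping. */
static uint32_t phys_to_virt(uint32_t paddr)
{
    assert(paddr < 0x40000000u);   /* only 1 GiB fits above 0xC0000000 */
    return paddr + KERNEL_ADDR_OFFSET;
}
```

`virt_to_phys(0xC0100000)` gives 0x00100000, i.e. `KERNEL_VADDR_START` corresponds to the 1 MiB load address.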

```
.align 4096
boot_page_directory:
    .fill 4096
boot_page_table1:
    .fill 4096

.section .text
.global start
.global kernel_stack
.extern kmain
start:
    mov $(boot_page_directory - KERNEL_ADDR_OFFSET), %esi
    # Start with first page
    mov $0, %esi
    # Fill 1023 pages to map 4MB. Will need to change if the kernel's size goes beyond that
    mov $1023, %ecx

1:
    # Make sure we're not mapping beyond the kernel
    cmpl $(KERNEL_VADDR_START - KERNEL_ADDR_OFFSET), %esi
    jl 2f
    cmpl $(KERNEL_VADDR_END - KERNEL_ADDR_OFFSET), %esi
    jge 3f
    movl %esi, %edx
    # Make this page present and writable. Note this will make read-only sections writable, which can be bad
    orl $0x003, %edx
    # Move into current page
    movl %edx, (%esi)
2:
    # Go to next page 4KB up
    addl $4096, %esi
    # Entries are 4 bytes large
    addl $4, %edi
    loop 1b
3:
    # Map VGA video memory to 0xC03FF000 as present and writable.
    movl $(0x000B8000 | 0x003), boot_page_table1 - KERNEL_ADDR_OFFSET + 1023 * 4
    # Map the page table to both virtual addresses 0x00000000 and 0xC0000000.
    movl $(boot_page_table1 - KERNEL_ADDR_OFFSET + 0x003), boot_page_directory - KERNEL_ADDR_OFFSET
    movl $(boot_page_table1 - KERNEL_ADDR_OFFSET + 0x003), boot_page_directory - KERNEL_ADDR_OFFSET + 768 * 4
    # Load the page directory
    movl $(boot_page_directory - KERNEL_ADDR_OFFSET), %ecx
    movl %ecx, %cr3
    # Enable paging and the write-protect bit.
    movl %cr0, %ecx
    orl $0x80010000, %ecx
    movl %ecx, %cr0
    # Jump to higher half with an absolute jump.
    lea start_higher_half, %ecx
    jmp *%ecx

start_higher_half:
    # Unmap the identity mapping as it is now unnecessary.
    movl $0, boot_page_directory
    mov $kernel_stack_end, %esp
    # Flush the TLB so the changes take effect.
    movl %cr3, %ecx
    movl %ecx, %cr3
    # Push magic number from bootloader
    push %eax
    # Push multiboot header address
    push %ebx
    cli
    call kmain
loop:    jmp loop
```
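
Going by its comments, the fill loop in the boot assembly seems intended to map every 4 KiB page between the kernel's physical start and end into `boot_page_table1`, with the last entry reserved for VGA memory. A hedged C rendering of that intent (not code from the repository; `fill_boot_page_table` and its parameters are hypothetical) can serve as a reference to compare against.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE  4096u
#define PAGE_FLAGS 0x003u        /* present | writable */
#define VGA_PHYS   0x000B8000u   /* VGA text buffer */

/* 'table' plays the role of boot_page_table1; kernel_start and kernel_end
 * are physical addresses, with kernel_end exclusive. */
static void fill_boot_page_table(uint32_t table[1024],
                                 uint32_t kernel_start, uint32_t kernel_end)
{
    assert(kernel_start <= kernel_end);
    for (uint32_t i = 0; i < 1023; i++) {
        uint32_t phys = i * PAGE_SIZE;      /* frame this entry would map */
        if (phys >= kernel_start && phys < kernel_end)
            table[i] = phys | PAGE_FLAGS;   /* store goes through the table pointer */
        else
            table[i] = 0;                   /* outside the kernel: not present */
    }
    table[1023] = VGA_PHYS | PAGE_FLAGS;    /* last entry maps VGA memory */
}
```

Note that the sketch keeps two separate values, a physical address and a position in the table. In the assembly as posted, `%esi` is loaded twice, the store goes through `%esi`, and `%edi` is advanced without being initialised, so it may be worth checking the listing against this intent.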
MichaelPetch
Member
Posts: 798
Joined: Fri Aug 26, 2016 1:41 pm
Libera.chat IRC: mpetch

Post by MichaelPetch »

For things like this I recommend Bochs. It is very easy to query the page tables. I set a breakpoint on the instruction that turns on paging, stepped across it, and issued the command info tab, which printed this:

```
0x0000000000000000-0x0000000000000fff -> 0x00000010d000-0x00000010dfff
0x00000000003ff000-0x00000000003fffff -> 0x0000000b8000-0x0000000b8fff
0x00000000c0000000-0x00000000c0000fff -> 0x00000010d000-0x00000010dfff
0x00000000c03ff000-0x00000000c03fffff -> 0x0000000b8000-0x0000000b8fff
```
This is the mapping it sees (linear addresses on the left, physical on the right). The problem is that the instruction I was executing was at memory address 0x101xxx. You have no mapping for where your actual code is! As soon as paging is enabled your code is not mapped and the instructions can't continue. You need to identity map the region of code where paging is enabled so that instructions can continue before and after paging is enabled. After paging is enabled you can then jump into the higher half.
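
To make the point concrete, the four ranges from the dump can be put in a small lookup table and queried for the address of the running code. This is just an illustration in plain C; `struct mapping`, `bochs_dump` and `lookup` are hypothetical names, and 0x00101000 is used as a stand-in for the "0x101xxx" address mentioned in the diagnosis.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mapping { uint32_t lin_start, lin_end, phys_start; };

/* The four ranges reported by the Bochs "info tab" output. */
static const struct mapping bochs_dump[] = {
    { 0x00000000u, 0x00000FFFu, 0x0010D000u },
    { 0x003FF000u, 0x003FFFFFu, 0x000B8000u },
    { 0xC0000000u, 0xC0000FFFu, 0x0010D000u },
    { 0xC03FF000u, 0xC03FFFFFu, 0x000B8000u },
};

/* Return 1 and store the physical address if 'lin' is mapped, else 0. */
static int lookup(uint32_t lin, uint32_t *phys)
{
    assert(phys != NULL);
    for (size_t i = 0; i < sizeof bochs_dump / sizeof bochs_dump[0]; i++) {
        const struct mapping *m = &bochs_dump[i];
        if (lin >= m->lin_start && lin <= m->lin_end) {
            *phys = m->phys_start + (lin - m->lin_start);
            return 1;
        }
    }
    return 0;
}
```

Neither 0x00101000 nor its higher-half alias 0xC0101000 falls inside any of the four ranges, so the instruction fetch right after paging is enabled has nothing to translate to.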
Last edited by MichaelPetch on Tue Jan 29, 2019 2:25 am, edited 2 times in total.

Post by tabz »

I will definitely try Bochs to see what info I can get. My only thought is that surely it can't be because of my higher half paging setup since, according to GDB, my code isn't even being reached.

**EDIT:** Just realised that of course you managed to get Bochs to reach my entry symbol, so it must be a QEMU/GDB issue on my side.

Post by MichaelPetch »

Setting breakpoints at entry with GDB/QEMU is likely failing because of the way you have done your linker script and code. It is based on the higher half tutorial, which I consider flawed. It is far better IMHO to separate out the code that runs in low memory from the higher half. I wrote some commentary on this in another OSDev forum post (written for NASM, but it can be adapted for GNU assembler): viewtopic.php?p=282158#p282158 . This method also means you don't need to do offset fixups by subtracting KERNEL_ADDR_OFFSET, and it makes it possible to set breakpoints in QEMU on code that runs before paging is enabled.
You may be able to get a breakpoint on start if you set it this way: b *(start-0xc0000000) . This accounts for the fact that the code was loaded in the lower memory region above 0x100000 but had a VMA in the higher half.

Bochs is better for paging-related bugs since you can easily dump the page table mappings in a human-readable format. There are issues with Bochs in x86-64 long mode and page tables, but for 32-bit code it works very well.

Post by tabz »

Thanks for the link. I've read through your explanation and it is a really good idea. Also, it now seems so obvious why GDB wasn't triggering the breakpoint! I really should have realised that quicker :D
I've implemented your suggestion of keeping the higher half sections separate and am now getting a triple fault when changing the page directory later on, but I haven't had time to investigate why yet. I suspect it could be because I'm giving a bogus physical address (now that the kernel isn't where it thinks it is in memory) or because of some mis-mapping due to the kernel being in the higher half.

Post by MichaelPetch »

If you split low and high half, remember that you'll want to place the initial page table and page directory in the lower half multiboot data section. I'd have to see your revised code to understand why it fails.

Post by tabz »

MichaelPetch wrote:If you split low and high half, remember that you'll want to place the initial page table and page directory in the lower half multiboot data section. I'd have to see your revised code to understand why it fails.
The initial page directory (the one set up before jumping to the higher half) is in the lower half at the moment. You can check today's (29th January) commits to the f/higher-half branch here to see the changes.

I'm currently investigating what could be causing this but would of course appreciate any pointers you may have :).

Post by MichaelPetch »

Your new code at least gets into the higher half and reaches kmain. So your initial identity mapping worked and allowed you to get to the higher half. The issue is in your paging.c. It seems like your code is mapping all the pages between 0x00000000 and 0x1ffffff (first 32MiB) to the same physical memory address 0xfffff000. I doubt that is what you wanted. Because your higher half kernel is no longer in mapped memory it fails to work the moment you set CR3 to the new page directory.
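
Since paging.c is not shown in the thread, the following is only a generic sketch of what a range mapping usually looks like: each page table entry gets its own physical frame, advancing by 4 KiB per entry, rather than every entry receiving the same address. The function name `map_range` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE    4096u
#define PAGE_PRESENT 0x001u
#define PAGE_WRITE   0x002u

/* Fill 'count' consecutive entries of one page table so that entry
 * first + i maps the physical frame at phys_base + i * 4 KiB. */
static void map_range(uint32_t table[1024], uint32_t first,
                      uint32_t count, uint32_t phys_base)
{
    assert((phys_base & (PAGE_SIZE - 1)) == 0);   /* frame must be page aligned */
    assert(first <= 1024 && count <= 1024 - first);
    for (uint32_t i = 0; i < count; i++)
        table[first + i] = (phys_base + i * PAGE_SIZE) | PAGE_PRESENT | PAGE_WRITE;
}
```

Calling `map_range(table, 0, 1024, 0)` would map the first 4 MiB one-to-one; a quick sanity check on any such routine is that neighbouring entries differ by exactly 0x1000.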

Post by tabz »

MichaelPetch wrote:Your new code at least gets into the higher half and reaches kmain. So your initial identity mapping worked and allowed you to get to the higher half. The issue is in your paging.c. It seems like your code is mapping all the pages between 0x00000000 and 0x1ffffff (first 32MiB) to the same physical memory address 0xfffff000. I doubt that is what you wanted. Because your higher half kernel is no longer in mapped memory it fails to work the moment you set CR3 to the new page directory.
Yeah, that's my thinking as well. I just pushed a commit that adds a TODO to fix that.

I also rejigged my heap calculation code because I had the suspicion that my kernel heap wouldn't be mapped since it expands outside of the mapped 4MiB. I've added some extra page directory entries to map 28MiB in total (4MiB for kernel code, 4MiB for temporary allocation space used before heap is set up and 20MiB for the kernel heap) so that it resembles the following:

0 MiB ---> 4 MiB: Kernel code
4 MiB ---> 8 MiB: Kernel pile (temp allocation space)
8 MiB ---> 28 MiB: Kernel heap. May need to be expanded based on demand later.

Below is the code that should achieve that:

```
# 20 MiB kernel heap. Changing this will require changing number of kernel pages
.set KERNEL_HEAP_SIZE, 0x1400000
.global KERNEL_HEAP_SIZE
# 4 MiB temp allocation space. Changing this will require changing number of kernel pages
.set KERNEL_TEMP_ALLOC_SIZE, 0x400000
.global KERNEL_TEMP_ALLOC_SIZE
.set KERNEL_ADDR_OFFSET, 0xC0000000
.set KERNEL_PAGE_NUMBER, KERNEL_ADDR_OFFSET >> 22
# One for kernel code, one for temporary allocation space and 5 for the heap. Each covers 4MiB
.set KERNEL_NUM_UPPER_PAGES, 7

# Multiboot data put in lower memory
.section .multiboot.data, "a"
.align 4
.long  MAGIC
.long  FLAGS
.long  CHECKSUM

.align 4096
# The initial page directory used to boot into the higher half
# Maps eight 4MiB pages: one identity-mapping the first 4MiB, and seven in the higher half for the
# kernel code, kernel pile and kernel heap
boot_page_directory:
    # Set lower half page
    .long 0x00000083
    # Leave the directory entries below the higher half empty (4 zero bytes each)
    .fill (KERNEL_PAGE_NUMBER - 1), 4
    # Set higher half pages, starting from physical address 0 and increasing by 4MiB
    .long 0x00000083
    .long 0x00000083 | (1 << 22)
    .long 0x00000083 | (2 << 22)
    .long 0x00000083 | (3 << 22)
    .long 0x00000083 | (4 << 22)
    .long 0x00000083 | (5 << 22)
    .long 0x00000083 | (6 << 22)
    # Leave the remaining directory entries empty (4 zero bytes each)
    .fill (1024 - KERNEL_PAGE_NUMBER - KERNEL_NUM_UPPER_PAGES), 4
```
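
The same directory layout expressed in C can make the entry positions easier to check. This is a sketch that mirrors the constants from the assembly; `build_boot_page_directory` and `PDE_4MIB_RW` are hypothetical names, and the function only fills an array rather than touching any real paging structure.

```c
#include <assert.h>
#include <stdint.h>

#define KERNEL_ADDR_OFFSET     0xC0000000u
#define KERNEL_PAGE_NUMBER     (KERNEL_ADDR_OFFSET >> 22)   /* 768 */
#define KERNEL_NUM_UPPER_PAGES 7u
#define PDE_4MIB_RW            0x083u   /* present | writable | 4 MiB page */

static void build_boot_page_directory(uint32_t dir[1024])
{
    for (uint32_t i = 0; i < 1024; i++)
        dir[i] = 0;                       /* everything unmapped by default */

    dir[0] = PDE_4MIB_RW;                 /* identity map the first 4 MiB */

    /* Higher half: entry 768 + i maps physical 4 MiB page number i. */
    for (uint32_t i = 0; i < KERNEL_NUM_UPPER_PAGES; i++)
        dir[KERNEL_PAGE_NUMBER + i] = (i << 22) | PDE_4MIB_RW;

    /* Linear 0x00000000 and 0xC0000000 alias the same physical 4 MiB. */
    assert(dir[KERNEL_PAGE_NUMBER] == dir[0]);
}
```

Entries 768 to 774 end up covering physical 0 to 28 MiB, matching the code/pile/heap layout listed in the post, and every other entry apart from entry 0 stays zero.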

Post by tabz »

So I've found an issue where the higher half is entered with 4MiB pages, but I'd like to use 4KiB pages when setting up paging after entering the higher half. These are the possible approaches:

1. Use 4MiB paging at first, disable 4MiB pages then change to the new page directory. This causes a fault since as soon as you disable 4MiB pages it can't execute the next instruction since the page directory is set up for 4MiB pages.
2. Use 4MiB paging forever. This would work but doesn't give much granularity.
3. Use 4MiB pages at first, set the page directory then disable 4MiB pages. Similar issue to #1 but causes a fault as the page directory is set up for 4KiB pages but 4MiB pages are enabled.
4. Use 4KiB pages forever. Would work but will be more of a pain when setting up the higher half, as it would also require page tables.

What do you think would be the best approach and am I missing something that would make approach 1/3 work?
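
One detail that may be relevant to approaches 1 and 3: on 32-bit x86, CR4.PSE only permits 4 MiB pages. The page size is still chosen per page-directory entry by bit 7 (PS), so an entry with PS clear refers to an ordinary 4 KiB page table even while PSE is enabled. Whether that explains the faults described above depends on code not shown in the thread. The following is a user-space software model of such a mixed walk, with hypothetical names (`struct model_pde`, `translate_pse`) and C pointers standing in for physical page-table addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PDE_PRESENT 0x001u
#define PDE_PS      0x080u        /* bit 7: entry maps a 4 MiB page directly */
#define NOT_MAPPED  0xFFFFFFFFu   /* sentinel used only by this model */

/* One directory slot of the model: the raw entry plus, for PS = 0 entries,
 * a pointer standing in for the page table the entry would reference. */
struct model_pde {
    uint32_t        raw;
    const uint32_t *table;
};

/* Walk as a CPU with CR4.PSE = 1 would: page size is decided per entry. */
static uint32_t translate_pse(const struct model_pde dir[1024], uint32_t vaddr)
{
    const struct model_pde *pde = &dir[vaddr >> 22];
    if (!(pde->raw & PDE_PRESENT))
        return NOT_MAPPED;
    if (pde->raw & PDE_PS)                         /* 4 MiB page */
        return (pde->raw & 0xFFC00000u) | (vaddr & 0x003FFFFFu);
    assert(pde->table != NULL);                    /* 4 KiB pages via a table */
    uint32_t pte = pde->table[(vaddr >> 12) & 0x3FFu];
    if (!(pte & 0x001u))
        return NOT_MAPPED;
    return (pte & 0xFFFFF000u) | (vaddr & 0xFFFu);
}
```

In this model a directory can hold a 4 MiB entry for the kernel in slot 768 and a 4 KiB page table in slot 769 at the same time, which suggests a new directory built from 4 KiB tables could be loaded without first turning PSE off.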