Jump to kernel entry point fails

CRoemheld
Member
Member
Posts: 55
Joined: Wed May 02, 2018 1:26 pm
Libera.chat IRC: CRoemheld

Post by CRoemheld »

I'm trying to jump into the 64-bit kernel entry point by jumping to the address provided by the elf64_ehdr->e_entry field. I'm following the approach described in the tutorial using a separate loader. In my example, the kernel entry point is located at

ffffffff80200314 g     F .text	000000000000001d _entry

The address from the e_entry field is the same as the address of the kernel entry point, so I was expecting the next instruction to be the kernel stack setup instruction, mov $kernel_stack_top, %rsp. However, what I got is the following: https://imageshack.com/a/img922/4364/wK6FZC.png

This confuses me quite a bit, so I'm wondering whether I missed something or whether the approach from the tutorial is correct. After the mentioned jump, the OS comes to a halt and the screen goes black.

I hope you can explain what went wrong here.
iansjack
Member
Member
Posts: 4706
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Post by iansjack »

Well, obviously your elf loader hasn't loaded the file correctly. But without seeing your code it's unlikely that anyone will guess what your error is.

Do you have a link to an online repository of your code? Someone may be willing to check it for you.
Last edited by iansjack on Fri Jun 01, 2018 6:50 am, edited 1 time in total.

Post by CRoemheld »

I uploaded the current state of development here: https://github.com/croemheld/cr0S-public. You will need an i686 as well as an x86-64 cross compiler for this, as the loader is compiled to a 32-bit ELF file and the kernel to a 64-bit ELF file.

I am loading the kernel as a module, just as specified in the tutorial.
jnc100
Member
Member
Posts: 775
Joined: Mon Apr 09, 2007 12:10 pm
Location: London, UK

Post by jnc100 »

Your bootstrap is compiled as 32-bit code, so the kmain function pointer is also 32 bits long. The kernel is linked to the higher half, so the top 32 bits of the address are lost.

You're advised to switch to 64-bit mode, reload the segments and jump to the kernel all from the same assembly function, which lets you intersperse 64-bit code. As there is no direct call to a 64-bit address, you would need either an indirect call, or to push the address onto the stack (along with the other appropriate bits) and iret.
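The truncation described above can be reproduced on any host with a few lines of C. This is only a sketch, not code from the repository: truncate_to_32, split_entry and join_entry are hypothetical helper names, and the address is the _entry value quoted in the first post.

```c
#include <assert.h>
#include <stdint.h>

/* Models what is left of a 64-bit entry address once it has been stored
 * in a 32-bit pointer-sized variable: only the low 32 bits survive. */
static uint32_t truncate_to_32(uint64_t entry)
{
    return (uint32_t)entry;
}

/* Splits the address into two halves that 32-bit code can carry around
 * (hypothetical helper, for illustration only). */
static void split_entry(uint64_t entry, uint32_t *lo, uint32_t *hi)
{
    assert(lo && hi);
    *lo = (uint32_t)entry;
    *hi = (uint32_t)(entry >> 32);
}

/* Reassembles the full address once 64-bit code is running. */
static uint64_t join_entry(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}
```

With 0xffffffff80200314 as input, the truncated value is 0x80200314, which is a completely different (lower-half) address. Carrying both halves separately and joining them in 64-bit code is one possible way around that.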

Regards,
John.

Post by CRoemheld »

jnc100 wrote: Your bootstrap is compiled as 32-bit code, so the kmain function pointer is also 32 bits long. The kernel is linked to the higher half, so the top 32 bits of the address are lost.
True, but for this case I hardcoded the address of the entry point into the assembly function, including the upper 32 bits (0xffffffff00000000), so the address I jump to seems to be set up correctly.
jnc100 wrote: You're advised to switch to 64-bit mode, reload the segments and jump to the kernel all from the same assembly function, which lets you intersperse 64-bit code. As there is no direct call to a 64-bit address, you would need either an indirect call, or to push the address onto the stack (along with the other appropriate bits) and iret.
Isn't that what I'm doing?

Code: Select all

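_load_cs64:
	movl 4(%esp), %ebx		# first stack argument (still 32-bit code)

	ljmp $0x08, $.reload_cs64	# far jump reloads CS with selector 0x08

.code64

.reload_cs64:
	cli

	xchg %bx, %bx			# Bochs magic breakpoint

	movw $0x10, %ax			# reload data segments with selector 0x10
	movw %ax, %ds
	movw %ax, %es
	movw %ax, %fs
	movw %ax, %gs

	mov $0xffffffff80200314, %rax	# hardcoded kernel entry point
	jmp *%rax			# indirect jump (AT&T syntax needs the '*')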
I also tried your other approach of pushing the address onto the stack as a qword and using iret. The result is the same: the screen goes blank, but Bochs shows that I'm not jumping to the address of the kernel entry point. Instead I end up in the ISR stub which pushes 0x000000000000000d onto the stack and continues with what I assume is the common interrupt handler. If my assumption is right, the value pushed by the stub (0x0d) is the vector number of a general protection fault:
https://imageshack.com/a/img922/5637/xPOKEm.png

I'm getting more confused now...

Post by iansjack »

As I understood the original screen print, the code just isn't loaded at the location.

I'm afraid that I haven't had time to look at your code closely, but are you sure that you are moving the code from the load location to the higher-half location? Or are you mapping the higher half to the location where the code is loaded? Have you looked at your page tables to make sure they are correct?

The ultimate answer, IMO, is to single-step through your code in a debugger. I'm not familiar with Bochs, so I don't know if you can step through code this early in the proceedings, but if you were to run it under SimNow you certainly could. This would let you check that the page mappings are correct, that the code is loaded at the correct location, and that you are jumping to that location. (Obviously, at least one of these can't be true - my understanding is that the second one isn't.)

I'm a big fan of running code this way as it gives you a deeper understanding of what is happening, and exactly where code is failing, than just trying random corrections.

Post by CRoemheld »

Okay, if the code is not loaded at that location, the error would be in the paging mechanism.
Can you tell me whether the mappings I set up to get the loader to jump into the kernel are correct?

Each entry in the following list is given in this order:
virtual address -> physical address, size, physical page size
  - Identity map of the first 8 MiB (0x0 -> 0x0, (8 << 20), 4 KiB)
  - Map of 1 GiB physical address space (0xffff880000000000 -> 0x0, (1 << 30), 2 MiB)
  - Map of 512 MiB kernel address space (0xffffffff80000000 -> 0x0, (512 << 20), 2 MiB)
  - Map of kernel module address space (as seen in the screenshot, last "memmap" print) (0xffffffff80200000 -> 0x0031F000, (8 << 20), 4 KiB)
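When checking mappings like the ones listed above by hand, it can help to see which paging-structure entries a given virtual address selects. The following hosted C sketch assumes standard 4-level x86-64 paging with 4 KiB pages; pt_index, split_vaddr and is_canonical are hypothetical names, not code from the repository.

```c
#include <assert.h>
#include <stdint.h>

/* Indices into the four paging levels for one virtual address. */
struct pt_index {
    unsigned pml4;  /* bits 47..39 */
    unsigned pdpt;  /* bits 38..30 */
    unsigned pd;    /* bits 29..21 */
    unsigned pt;    /* bits 20..12 (unused for 2 MiB pages) */
};

/* Splits a virtual address into its 9-bit table indices. */
static struct pt_index split_vaddr(uint64_t vaddr)
{
    struct pt_index i;
    i.pml4 = (unsigned)((vaddr >> 39) & 0x1ff);
    i.pdpt = (unsigned)((vaddr >> 30) & 0x1ff);
    i.pd   = (unsigned)((vaddr >> 21) & 0x1ff);
    i.pt   = (unsigned)((vaddr >> 12) & 0x1ff);
    return i;
}

/* With 48-bit virtual addresses, bits 63..47 must all be equal.
 * Jumping to a non-canonical address raises a general protection fault. */
static int is_canonical(uint64_t vaddr)
{
    uint64_t top = vaddr >> 47;          /* the 17 uppermost bits */
    return top == 0 || top == 0x1ffff;
}
```

For 0xffffffff80200000 this yields PML4 entry 511, PDPT entry 510, PD entry 1 and PT entry 0, so those are the entries to inspect in the emulator's page table dump.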
The kernel is loaded at 0xffffffff80200000:

Code: Select all

ENTRY(_entry)

KERNEL_BASE_ADDR = 0xffffffff80200000;

SECTIONS
{
	. = KERNEL_BASE_ADDR;
	_kernel = .;

	.text ALIGN(4K) : AT(ADDR(.text) - KERNEL_BASE_ADDR)
...

Post by iansjack »

When you say "The kernel is loaded at ...", that's what you are telling the linker. But it is up to your loader to actually place the code at that address. If loaded as a module by grub it will be loaded somewhere around the 1MB mark (the multiboot structure will tell you exactly where). You have to either ensure that that address is mapped to your load address or that the code is moved to the physical location that corresponds with the load address.

Post by CRoemheld »

In this case it would be the last memmap I did: 0xffffffff80200000 -> 0x0031F000, (8 << 20), 4 KiB page size
Both the virtual and the physical address above are taken from the ELF header of the kernel module loaded by GRUB. So unless the code I uploaded to GitHub (see my second post) is incorrect, I assume the problem lies elsewhere. But the information I provided is all I can offer, as I have no idea where the actual problem resides.

Post by iansjack »

OK.

I've compiled your code to see the state immediately after GRUB has loaded your bootstrap code and kernel module. The kernel is actually located at 0x11e000. You find this by reading the multiboot boot information and following the link to the module. Note that on a different computer, and with changes to the source, the module could end up at a different location so you have to find it dynamically rather than hard-coding any address.

You'll note that this doesn't correspond to any of the addresses in your ELF header. A multiboot module is just a plain file - it's just loaded into memory at a location that the loader chooses, and isn't interpreted as an ELF file. So it's your responsibility to move this file to the location that you want it at. I haven't followed your code enough to see if you do that but I'm guessing from your comments that you haven't. In any case, it's certainly not in that location when you come to jump to it.

You need to:

1. Find the location of the loaded module (see the multiboot specification).
2. Relocate it to its desired place in memory (which would be 0x200000 as things stand, since that's what you told the linker).
3. Arrange your page map so that the desired virtual address maps to that physical address.
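Step 1 can be sketched in C roughly as follows. The struct layouts follow the Multiboot (version 1) boot information format, but the struct and function names are made up for this example, and the caller is assumed to have already turned mods_addr into a usable pointer (in a 32-bit loader with identity mapping that is a plain cast).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MULTIBOOT_INFO_MODS (1u << 3)   /* flags bit 3: mods_* fields are valid */

/* Leading fields of the Multiboot 1 information structure. */
struct mb_info_head {
    uint32_t flags;
    uint32_t mem_lower;
    uint32_t mem_upper;
    uint32_t boot_device;
    uint32_t cmdline;
    uint32_t mods_count;
    uint32_t mods_addr;     /* physical address of the module array */
};

/* One entry of the module array. */
struct mb_module {
    uint32_t mod_start;     /* physical start address of the file */
    uint32_t mod_end;       /* physical end address of the file */
    uint32_t string;
    uint32_t reserved;
};

/* Reports where the first module was placed. 'mods' is mods_addr already
 * converted to a pointer by the caller. Returns 0 on success, -1 if the
 * boot loader passed no module. */
static int first_module_range(const struct mb_info_head *mbi,
                              const struct mb_module *mods,
                              uint32_t *start, uint32_t *size)
{
    assert(mbi && start && size);
    if (!(mbi->flags & MULTIBOOT_INFO_MODS) || mbi->mods_count == 0 || !mods)
        return -1;
    *start = mods[0].mod_start;
    *size  = mods[0].mod_end - mods[0].mod_start;
    return 0;
}
```

The start address obtained this way is where the raw ELF file sits, which is the input for steps 2 and 3.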

Note that when relocating the module you will have to relocate each of the ELF sections to the correct place, so it's not a simple memmove. You have to look at the ELF headers to determine where each section should go, and how long it is.

Once that is all done the code will be in the correct place and the jump to it should work. If you run under a debugger (or under qemu with the monitor enabled and some judicious "jmp ."s inserted) you can inspect the registers and memory at various stages to see that all is going to plan.

(PS: I'm sorry if my initial reply was a bit rude. It wasn't until I looked at your code that I realised I had seen it before and you had already linked to your repository. It's not a bad idea to repeat the link in every new thread, just to allow for forgetful old guys like me. :) )

Post by CRoemheld »

Thanks a lot for your patience, I got a little bit further regarding the sections, so I want to show you what I researched and what now works:

I can now display all sections with their virtual address as well as their physical address. One point though, you said:
iansjack wrote: Relocate it to its desired place in memory (which would be 0x200000 as things stand, since that's what you told the linker).
As you can see in the following picture, 0x200000 is the offset from the ELF64 file header to the section, not the actual physical address.
Clearing up this mistake let me dump the sections along with their information: https://imageshack.com/a/img923/5238/Fdi3uy.png
As you can see, the output in my OS matches the values from objdump. Next I wanted to check whether the physical addresses are also correct, so I dumped the physical memory and got the following screen: https://imageshack.com/a/img924/3290/cyRlI1.png

Here you see the bochs emulator on the left, dumping the physical memory at start address 0x31F000, which is mapped through 0xffffffff80200000. This is the start of the .text section in the elf file. On the right you see the sublime text editor with the disassembled elf file of the kernel and you can compare the bochs memdump and the disassembled file: The bytes match perfectly, meaning this is where the sections are loaded.

In this case, what would be my next step? So far I have only retrieved all the information about the sections and segments dynamically.

The code seems to be in memory as well, so the only thing left would be mapping the virtual addresses to the physical addresses of the individual sections of the ELF file with the specified sizes, right?

PS: What about the .symtab, .strtab and .shstrtab sections, though? Those three aren't listed in objdump but only in my OS.
alexfru
Member
Member
Posts: 1112
Joined: Tue Mar 04, 2014 5:27 am

Post by alexfru »

iansjack wrote: Note that when relocating the module you will have to relocate each of the ELF sections to the correct place, so it's not a simple memmove. You have to look at the ELF headers to determine where each section should go, and how long it is.
When loading an ELF program, shouldn't one look at segments instead of sections? There may be no sections at all in a valid program.

Post by CRoemheld »

alexfru wrote:When loading an ELF program, shouldn't one look at segments instead of sections? There may be no sections at all in a valid program.
Well, in my case all the sections are grouped into one segment, which you can see in the screenshots I provided. So would that mean I would only need to map the virtual address 0xffffffff80200000 to the physical address 0x31f000 with a size of 0xc020? I'm also not sure whether to map the individual section addresses or the segment addresses.
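Since mappings are made in whole pages, a size such as 0xc020 has to be rounded up to page granularity first. A small hosted C sketch of that arithmetic, assuming 4 KiB pages (pages_spanned is a hypothetical helper, not something from the repository):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_4K 0x1000ull

/* Number of 4 KiB pages touched by the byte range [vaddr, vaddr + size).
 * Assumes size > 0 and that vaddr + size does not wrap around. */
static uint64_t pages_spanned(uint64_t vaddr, uint64_t size)
{
    assert(size > 0);
    uint64_t first = vaddr & ~(PAGE_SIZE_4K - 1);              /* page of the first byte */
    uint64_t last  = (vaddr + size - 1) & ~(PAGE_SIZE_4K - 1); /* page of the last byte */
    return (last - first) / PAGE_SIZE_4K + 1;
}
```

For a range starting at 0xffffffff80200000 with a size of 0xc020 this gives 13 pages.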

Post by iansjack »

alexfru wrote:
iansjack wrote: Note that when relocating the module you will have to relocate each of the ELF sections to the correct place, so it's not a simple memmove. You have to look at the ELF headers to determine where each section should go, and how long it is.
When loading an ELF program, shouldn't one look at segments instead of sections? There may be no sections at all in a valid program.
Oh dear! Sorry, I didn't mean to mislead.

I'll leave it up to the more knowledgeable to help.
nullplan
Member
Member
Posts: 1801
Joined: Wed Aug 30, 2017 8:24 am

Post by nullplan »

You are supposed to map each segment of type PT_LOAD at the address p_vaddr, taking its contents from p_offset bytes into the image onward, with a total size of p_memsz. The part between p_filesz and p_memsz has to be cleared to 0. Note that p_vaddr may not be page-aligned, but the misalignment of p_vaddr is equal to the misalignment of p_offset (meaning that if you load the image page-aligned, you can map every address correctly). Also, for the kernel, you might want to align the break between code and data to the page size. For one, that should get you two segments you can map with different options depending on p_flags.
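The loading rule described above might look roughly like this in C. It is a hosted sketch under several assumptions: the struct definitions mirror the ELF64 header and program header layouts, load_segments is a hypothetical name, and dest stands for the memory that will later be mapped at the virtual address base. Page mapping and p_flags handling are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PT_LOAD_TYPE 1u   /* value of PT_LOAD in the ELF specification */

struct elf64_ehdr {
    unsigned char e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version;
    uint64_t e_entry, e_phoff, e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx;
};

struct elf64_phdr {
    uint32_t p_type, p_flags;
    uint64_t p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_align;
};

/* Copies every PT_LOAD segment of 'image' into 'dest', which represents
 * the virtual range [base, base + size). Bytes between p_filesz and
 * p_memsz are zeroed. Returns 0 and stores e_entry on success, -1 on error. */
static int load_segments(const uint8_t *image, uint8_t *dest,
                         uint64_t base, uint64_t size, uint64_t *entry)
{
    struct elf64_ehdr eh;
    assert(image && dest && entry);
    memcpy(&eh, image, sizeof eh);
    if (memcmp(eh.e_ident, "\177ELF", 4) != 0)
        return -1;
    for (uint16_t i = 0; i < eh.e_phnum; i++) {
        struct elf64_phdr ph;
        memcpy(&ph, image + eh.e_phoff + (uint64_t)i * eh.e_phentsize, sizeof ph);
        if (ph.p_type != PT_LOAD_TYPE)
            continue;                       /* only loadable segments matter */
        if (ph.p_filesz > ph.p_memsz || ph.p_vaddr < base ||
            ph.p_memsz > size || ph.p_vaddr - base > size - ph.p_memsz)
            return -1;                      /* segment does not fit in dest */
        uint8_t *to = dest + (ph.p_vaddr - base);
        memcpy(to, image + ph.p_offset, ph.p_filesz);            /* file contents */
        memset(to + ph.p_filesz, 0, ph.p_memsz - ph.p_filesz);   /* zero the rest */
    }
    *entry = eh.e_entry;
    return 0;
}
```

A real loader would also validate the image length and the ELF class and machine fields before trusting any of the offsets.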

Also note (later, when you are loading user programs) that the break between code and data need not be page-aligned. In that case it is necessary to load the page containing the break twice. Generally, in the user program loader you would load each segment separately (whether lazily or eagerly doesn't matter). That wastes a page but makes it impossible to overwrite the last page of code by overwriting the first page of data.

Anyway, once you're done, the entry point should be mapped correctly. You can just use the raw entry point.
Carpe diem!