Early stages of OS, DIY bootloader, GCC 0x08040000 address?

gsap
Posts: 2
Joined: Tue Sep 01, 2020 7:57 pm

Post by gsap »

Hello OSDev! Thank you for keeping up the forums and the wiki.

I was hoping to talk to someone about an issue I'm currently experiencing.

I'm writing my own operating system here https://github.com/isaprykin/os and it's early days. The OS is peculiar in that it uses its own bootloader and CMake. So far it loads the 512 bytes of the bootable partition, which then runs the kernel C code at the fixed 0x1000 address. I was trying to follow all the typical OS tutorials on the internet and they had me implement IDT handlers next. However, that's not going well because GCC generates addresses north of 0x08040000.

So if my C code that is located at 0x1000 has a global static variable such as `struct idt_interrupt_gate idt_entries[2];`, its address is going to be 0x0804xxxx. I read that it's a pretty much hardcoded behavior in GCC. I can add RAM to my Bochs config :D, but after some reading I feel that's where I'm supposed to add paging. I'm assuming that's what folks would suggest here too, although I remember reading about tricks with GDT where addresses wrap around.

I have two questions:
1) I know that I'm going to have to manipulate pages in the OS, and I don't want to do anything smart in the ASM layer; rather, I want to keep the complexity in the C code. But C variables that aren't automatic/local don't work without paging. What's a good design that allows me to keep paging to the minimum in the ASM layer and enable the C code? I remember that I expected to figure that out after learning about "identity mapping", but it's still not clear.

2) In my research I looked at the GRUB 0.97 version that was mentioned @ https://asghonim.wordpress.com/2013/11/ ... urce-code/. It mixes the ASM code with C code just like I'm trying to, but it doesn't do anything with paging. How does it work then?

--------------------
I've been wanting to write this post for around 4 weeks, so I've thought the points through. There's a small chance I missed a detail, although I don't think so. I also have notes of the IDT failure here https://gist.github.com/isaprykin/af53b ... 0806906cc0 from my early debugging.
xeyes
Member
Posts: 212
Joined: Mon Dec 07, 2020 8:09 am

Re: Early stages of OS, DIY bootloader, GCC 0x08040000 addre

Post by xeyes »

I think the entry-at-128MB thing is some sort of ELF convention. It's not "pretty much hardcoded", and while you can make it work, you probably don't want that for a kernel (one reason, for example: when you start to load ELF programs later, you'll notice that many of them also want to be at 128MB due to said convention).

You can give this option to gcc (ld, actually) to move it around

Code: Select all

-Ttext=<address you want, such as 0x100000>
Your questions:
1.
C variables that aren't automatic/local don't work without paging.
They should work without paging, as you probably know many people only turn on paging after entry into C code, and it is unlikely that their page-tables are allocated on the call stack.

The issue you are having might be that LMA (where you loaded the code at, around 0x1000 in your case) is different from VMA (where the code thinks it should be, around 128MB in your case). You need to either change LMA (load it at 128MB) or VMA (use the -Ttext option).
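To make the LMA/VMA mismatch concrete, here is a small C sketch of the arithmetic. The function name `loaded_address` and the sample numbers in the usage note are made up for illustration; they are not taken from the repository.

```c
#include <assert.h>
#include <stdint.h>

/* Given an image linked to run at link_base (VMA) but copied to load_base
 * (LMA), return where the bytes of a symbol really end up in memory.
 * The code itself still references symbol_vma, which is the problem. */
static uint32_t loaded_address(uint32_t symbol_vma, uint32_t link_base,
                               uint32_t load_base)
{
    assert(symbol_vma >= link_base);   /* symbol must belong to the image */
    return symbol_vma - link_base + load_base;
}
```

For example, with a hypothetical link base of 0x08048000, a load address of 0x1000, and a variable linked at 0x0804a020, `loaded_address` returns 0x3020, while the compiled code keeps reading and writing 0x0804a020. Making the two bases equal (by loading elsewhere or by relinking with -Ttext) removes the difference.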
What's a good design that allows me to keep paging to the minimum in the ASM layer
Not sure if it's actually good, and this is obviously for 32-bit only, but one way to do minimal paging is to map the first big chunk, like 0-1GB, to itself and also to a higher aperture that contains your C code's VMA, for example 3-4GB for a 'higher half kernel'.

That way the ASM code, the C code, and the CPU will all happily work regardless of whether they think the address is based at 3GB or 0GB, and you can clean up the 'mess' in C code later by un-mapping the lower aperture.
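A minimal sketch of that double mapping, assuming 32-bit paging with 4 MiB pages (which needs CR4.PSE set before paging is enabled). The function and macro names are hypothetical. The directory has to be 4 KiB aligned, and loading it into CR3 and setting CR0.PG is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PDE_PRESENT     0x001u
#define PDE_WRITABLE    0x002u
#define PDE_4MIB        0x080u  /* PS bit: entry maps a 4 MiB page */
#define ENTRIES_PER_GIB 256u    /* 1 GiB / 4 MiB */
#define HIGH_HALF_INDEX 768u    /* 3 GiB / 4 MiB */

/* Make physical 0..1 GiB visible at virtual 0..1 GiB (identity) and
 * again at virtual 3..4 GiB (higher half). */
static void build_dual_mapping(uint32_t page_directory[1024])
{
    assert(page_directory != NULL);
    for (uint32_t i = 0; i < 1024; i++)
        page_directory[i] = 0;                         /* not present */

    for (uint32_t i = 0; i < ENTRIES_PER_GIB; i++) {
        uint32_t entry = (i << 22) | PDE_PRESENT | PDE_WRITABLE | PDE_4MIB;
        page_directory[i] = entry;                     /* low aperture  */
        page_directory[HIGH_HALF_INDEX + i] = entry;   /* high aperture */
    }
}

/* Later cleanup from C: drop the identity aperture once nothing runs
 * from low addresses. A real kernel would also reload CR3 afterwards
 * so stale translations are flushed. */
static void unmap_low_aperture(uint32_t page_directory[1024])
{
    assert(page_directory != NULL);
    for (uint32_t i = 0; i < ENTRIES_PER_GIB; i++)
        page_directory[i] = 0;
}
```

Using 4 MiB pages keeps the early setup to a single 4 KiB table; with 4 KiB pages the same idea works but needs one page table per 4 MiB mapped.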

2. see answer 1.
Octocontrabass
Member
Posts: 5568
Joined: Mon Mar 25, 2013 7:01 pm

Re: Early stages of OS, DIY bootloader, GCC 0x08040000 addre

Post by Octocontrabass »

gsap wrote:The OS is peculiar in that it uses its own bootloader
Fair warning: writing a bootloader that works in one emulator is easy, but writing a bootloader that works everywhere is not.
gsap wrote:GCC generates addresses north of 0x08040000.
This tells me your linker script is either missing some sections or not being used at all. Just browsing your code I see you're missing .rodata, but you can objdump some binaries to see if anything else might need to be in there.
gsap wrote:after some reading I feel that's where I'm supposed to add paging.
It's a good idea to add paging sooner rather than later, since it has effects across your entire kernel, but you don't really need it until you want to start isolating address spaces from each other. You need a memory map before you can use paging to manage memory, though, and I notice your bootloader doesn't provide a memory map.
gsap wrote:But C variables that aren't automatic/local don't work without paging.
This also tells me your linker script is either missing some sections or not being used at all. There's nothing special about global variables that requires paging.
gsap wrote:What's a good design that allows me to keep paging to the minimum in the ASM layer and enable the C code?
Write a C program that runs on the build machine to generate some hardcoded page tables to map the memory where your kernel will be loaded to the virtual address where your kernel will be run, then include those page tables somewhere in your bootloader. Don't forget to identity-map the code that enables paging in those tables!
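As a rough sketch of such a build-time generator, the two helpers below fill one page table with 4 KiB mappings and print it as assembler data. Everything here is an assumption for illustration: the names `map_range` and `emit_table`, the GNU as directives `.balign` and `.long`, and the choice to leave out the page directory that would point at this table.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define PTE_PRESENT  0x1u
#define PTE_WRITABLE 0x2u
#define PAGE_SIZE    4096u

/* Point `pages` consecutive entries, starting at table index
 * `first_index`, at consecutive 4 KiB frames beginning at `phys_base`. */
static void map_range(uint32_t table[1024], uint32_t first_index,
                      uint32_t phys_base, uint32_t pages)
{
    assert(phys_base % PAGE_SIZE == 0);
    assert(first_index + pages <= 1024);
    for (uint32_t i = 0; i < pages; i++)
        table[first_index + i] =
            (phys_base + i * PAGE_SIZE) | PTE_PRESENT | PTE_WRITABLE;
}

/* Write the table as assembler data (GNU as syntax assumed) that a
 * bootloader source file could include. Returns the entry count. */
static int emit_table(FILE *out, const char *label,
                      const uint32_t table[1024])
{
    int written = 0;
    fprintf(out, ".balign 4096\n%s:\n", label);
    for (int i = 0; i < 1024; i++, written++)
        fprintf(out, "    .long 0x%08x\n", (unsigned)table[i]);
    return written;
}
```

A small `main` on the build machine could call `map_range(table, 1, 0x1000, 4)` to identity-map four pages at 0x1000 (table index 1 covers virtual 0x1000 in the first page table), call it again for the kernel's virtual range in the table that covers it, and then `emit_table(stdout, "boot_page_table", table)` to produce the file.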
gsap wrote:In my research I looked at the GRUB 0.97 version that was mentioned @ [...]. It mixes the ASM code with C code just like I'm trying to, but it doesn't do anything with paging. How does it work then?
The linker is supposed to be able to put code and data at (almost) any address. Instead, you should be asking why the linker isn't doing that for you.
gsap wrote:I also have notes of the IDT failure here
Is the CPU supposed to be executing an INT3 instruction, or is the CPU executing garbage due to some other issue that may be completely unrelated to your IDT?
gsap
Posts: 2
Joined: Tue Sep 01, 2020 7:57 pm

Re: Early stages of OS, DIY bootloader, GCC 0x08040000 addre

Post by gsap »

Thank you for taking a look and considering my situation. I was stuck a bit.

Your responses mainly suggested that

Code: Select all

-Ttext 0x1000
should work. That was the first thing I tried, so I was quite confused. Today I realized that I was applying the linker options (and later the linker script) to the wrong step. I have two steps in this CMake setup. First, all the C code gets combined into an .elf file. Then I convert the .elf to a .bin in a subsequent step that involves `ld`, and I append the .bin file to the bootsector. I was setting the linker options on the `ld` call during the .elf to .bin conversion instead of on the step that builds the .elf. Indeed, that's all that GRUB 0.97 is doing: just

Code: Select all

-nostdlib -Wl,-N -Wl,-Ttext -Wl,7C00
.

I fixed my code here. I can now make further progress and set more IDT handlers. I want a modern system that's going to be 64 bit and maybe RISC compatible too later.
Octocontrabass wrote:gsap wrote:
The OS is peculiar in that it uses its own bootloader

Fair warning: writing a bootloader that works in one emulator is easy, but writing a bootloader that works everywhere is not.
It'd be nice to learn. I read that one difference is that emulators initialize RAM while the hardware doesn't. What are other potential problems for making it run on the real hardware?
Octocontrabass wrote:I also have notes of the IDT failure here

Is the CPU supposed to be executing an INT3 instruction, or is the CPU executing garbage due to some other issue that may be completely unrelated to your IDT?
I was getting

Code: Select all

00014708026e[CPU0  ] interrupt(): gate descriptor is not valid sys seg (vector=0x01)
00014708026e[CPU0  ] interrupt(): gate descriptor is not valid sys seg (vector=0x0d)
00014708026e[CPU0  ] interrupt(): gate descriptor is not valid sys seg (vector=0x08)
and then Bochs would reboot. It's because the address recorded for the IDT was way off.
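For reference, this is roughly what one valid 8-byte entry looks like in 32-bit protected mode; if the CPU reads the table from the wrong address, it most likely finds bytes whose type field does not match, which would fit the Bochs message above. The struct and helper names are hypothetical and not the ones used in the repository, and the `packed` attribute assumes GCC or Clang.

```c
#include <assert.h>
#include <stdint.h>

/* One protected-mode IDT entry (interrupt gate). */
struct idt_gate {
    uint16_t offset_low;   /* handler address bits 0..15        */
    uint16_t selector;     /* code segment selector in the GDT  */
    uint8_t  zero;         /* reserved, must be 0               */
    uint8_t  type_attr;    /* P, DPL, and gate type             */
    uint16_t offset_high;  /* handler address bits 16..31       */
} __attribute__((packed));

static_assert(sizeof(struct idt_gate) == 8, "IDT gate must be 8 bytes");

/* P=1, DPL=0, type 0xE (32-bit interrupt gate). */
#define IDT_PRESENT_INT_GATE32 0x8Eu

/* Split a handler address across the two offset fields. */
static struct idt_gate make_gate(uint32_t handler, uint16_t selector)
{
    struct idt_gate g;
    g.offset_low  = (uint16_t)(handler & 0xFFFFu);
    g.selector    = selector;
    g.zero        = 0;
    g.type_attr   = IDT_PRESENT_INT_GATE32;
    g.offset_high = (uint16_t)(handler >> 16);
    return g;
}
```

Both the handler address passed to `make_gate` and the table address given to `lidt` come from the linker, so they are only right when the image is linked for the address it is loaded at.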

Here are some logs: before, after. Here's what that kernel.elf in the middle looks like.
Octocontrabass
Member
Posts: 5568
Joined: Mon Mar 25, 2013 7:01 pm

Re: Early stages of OS, DIY bootloader, GCC 0x08040000 addre

Post by Octocontrabass »

gsap wrote:I read that one difference is that emulators initialize RAM while the hardware doesn't. What are other potential problems for making it run on the real hardware?
Emulators may initialize registers differently as well. We frequently see examples posted here where the author forgot to initialize segment registers (including CS) or the flags register before using instructions that rely on them.

Emulators don't boot from USB. On an emulator, you never have to worry about the BIOS being "clever" and skipping your MBR or overwriting part of your code with a BPB you're not using. Recently I've discovered some Award BIOS versions that fail to boot the Windows 10 installer unless you adjust the partition table to use the second entry instead of the first; I haven't yet figured out exactly what it's doing.

Emulator BIOSes often allow things that don't work on some real hardware. For example, disk read calls on real hardware can have stricter limitations on how many sectors they can read or where the destination buffer may be located.
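One defensive way to cope with such limits is to split a large read into smaller calls. The sketch below assumes two commonly cited constraints, that a single read should not cross a 64 KiB boundary in the destination buffer and that the per-call sector count is capped; the exact limits vary by BIOS, and the function names and the cap are assumptions, not something stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Number of sectors that can go into a buffer at physical address
 * `dest` without crossing a 64 KiB boundary, capped at `max_per_call`. */
static uint32_t sectors_this_call(uint32_t dest, uint32_t remaining,
                                  uint32_t max_per_call)
{
    assert(dest % SECTOR_SIZE == 0);
    assert(max_per_call > 0);
    uint32_t to_boundary = (0x10000u - (dest & 0xFFFFu)) / SECTOR_SIZE;
    uint32_t n = remaining;
    if (n > to_boundary)
        n = to_boundary;
    if (n > max_per_call)
        n = max_per_call;
    return n;
}

/* Walk a whole transfer and count how many BIOS calls it would take. */
static uint32_t calls_needed(uint32_t dest, uint32_t total,
                             uint32_t max_per_call)
{
    uint32_t calls = 0;
    while (total > 0) {
        uint32_t n = sectors_this_call(dest, total, max_per_call);
        /* a real loader would issue the BIOS read for n sectors here */
        dest  += n * SECTOR_SIZE;
        total -= n;
        calls++;
    }
    return calls;
}
```

With a buffer at 0x1000, 200 sectors to read, and a hypothetical cap of 127 sectors per call, the first call is limited to 120 sectors by the boundary at 0x10000 and the second call reads the remaining 80.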

Emulators may not correctly emulate CPU features that aren't commonly used. For example, QEMU (without hardware acceleration) doesn't emulate segment limits.