OSDev.org

Posted: **Thu Jul 24, 2003 10:37 am**

I'm trying to get a list of things that must be done when booting an IBM PC x86-style machine and loading a kernel. This is "pure mode": no Grub or anything like that. Here's my list; everything's in order, from start to kernel main(). If any of these things are incorrect or there are things missing (even optional things), please reply with the corrections:

1. Start with 512-byte bootsector, which gets loaded at 0000:7C00h. Bootsectors start in 16-bit real mode. A pointer to the PnP installation check structure is located in ES:DI and the drive we booted from can be identified by examining the DL register. See:

http://www.phoenix.com/resources/specs-bbs101.pdf

XXX: If we booted from floppy, perhaps kill the motor now? See the Linux kernel sources.

2. The bootsector now enables the A20 gate so we can see more than 1MB of system memory.

3. Either slurp up the kernel into memory using int 13h or load a bigger and more capable boot programme and jump to it.

XXX: Relocation issues: what gets moved and where, and of course, why?

Or should step 4, going into 32-bit pmode, come before loading the kernel/boot programme??

4. Once in the boot programme or in the kernel just before main(), we switch into protected mode and optionally set up paging. Both are switched on by MOVing a new value into CR0: the pmode and paging bits can be set in the same write, or paging can be enabled later. Disable interrupts now (??) or should we have done this earlier?

5. We are now running in 32-bit protected mode. Here somewhere we install our interrupt vectors by programming the APIC. (??)

6. After installing interrupts, re-enable interrupts and stomp all over BIOS area if using kernel's own device drivers. (?? Read Linux kernel source, it apparently claims some of the BIOS areas for itself, or have I misunderstood something?)

7. Jump to kernel main().
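
On step 4, a minimal sketch in C of the two CR0 bits involved, assuming the result is written back with a `mov cr0, ...` in assembly that is not shown here. The function names are made up for illustration. PE is bit 0 and PG is bit 31, so they are separate switches: both can be set in one write, but protected mode can also be entered first and paging turned on later.

```c
#include <stdint.h>

/* Bit positions in CR0, as given in the Intel manuals. */
#define CR0_PE (1u << 0)   /* Protection Enable: real mode -> protected mode */
#define CR0_PG (1u << 31)  /* Paging: only valid while PE is also set */

/* Hypothetical helper: CR0 value for a plain switch to protected mode.
   A GDT must already be loaded before this value is written to CR0. */
uint32_t cr0_enter_pmode(uint32_t cr0)
{
    return cr0 | CR0_PE;
}

/* Hypothetical helper: CR0 value that turns paging on. Assumes CR3
   already points at a valid page directory. PE is forced on because
   the CPU rejects PG=1 with PE=0. */
uint32_t cr0_enable_paging(uint32_t cr0)
{
    return cr0 | CR0_PE | CR0_PG;
}
```

Only the bit arithmetic is shown; reading and writing CR0 itself needs a privileged instruction and cannot be done from plain C.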

I hope this is a good list to work from. I could sure use some input so we can create a reference document to work from, like a kernel developer's TODO list for i386+ processors. Perhaps also reproduce Ralf Brown's interrupt list in an appendix for easy reference while loading the kernel in 16-bit mode.

Posted: **Thu Jul 24, 2003 11:50 am**

XXX: If we booted from floppy, perhaps kill the motor now? See the Linux kernel sources.

My kernel is located at the sectors after the boot sector.
I need to read those first, and put them in the memory.
(I load them at 0x1000, which is OK as long as my kernel doesn't get bigger than 1 MB)
After that I kill the motor.

(It makes no sense to turn it off, and back on again for reading)
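
For reading the sectors after the boot sector with int 13h, AH=02h, the linear sector number has to be turned into cylinder/head/sector first. A minimal sketch in C of that arithmetic, assuming a 1.44 MB floppy (18 sectors per track, 2 heads). `lba_to_chs` and `struct chs` are names invented here, and the BIOS call itself is not shown.

```c
#include <stdint.h>

/* Geometry of a 1.44 MB 3.5" floppy. This is an assumption; other
   media have different values. */
#define SECTORS_PER_TRACK 18u
#define HEADS             2u

struct chs {
    uint16_t cylinder;  /* low 8 bits go in CH for int 13h, AH=02h */
    uint8_t  head;      /* goes in DH */
    uint8_t  sector;    /* goes in CL; sectors are numbered from 1 */
};

/* Convert a 0-based linear sector number into cylinder/head/sector.
   Sector 0 is the boot sector, so the kernel starts at sector 1. */
struct chs lba_to_chs(uint32_t lba)
{
    struct chs out;
    out.cylinder = (uint16_t)(lba / (SECTORS_PER_TRACK * HEADS));
    out.head     = (uint8_t)((lba / SECTORS_PER_TRACK) % HEADS);
    out.sector   = (uint8_t)(lba % SECTORS_PER_TRACK + 1u);
    return out;
}
```

With this geometry, linear sector 18 is the first sector on head 1 of cylinder 0, which is where a read that started at sector 1 has to switch heads.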

Or should step 4, going into 32-bit pmode, come before loading the kernel/boot programme??

If you use gcc, you should switch into 32-bit pmode,
because gcc outputs 32-bit binaries. Otherwise it will crash
(I've seen that happen a million times).

Posted: **Thu Jul 24, 2003 12:06 pm**

If I were you, I would definitely write a two-stage bootloader for loading the kernel above 1 MB. Loading below 1 MB might cause problems; remember that GRUB does not let you load below 1 MB.
a) [RealMode] Load 1st stage.-> 512 bytes only loads second stage
b) [RealMode] Loads 2nd stage.
c) [RealMode] Turns on A20 line.
d) [RealMode] 2nd stage loads kernel.
BOOOM!!! Kernel jumps to pmode
e) [PMODE] Kernel starts...
I think this will protect you from some future problems that I had...

Posted: **Thu Jul 24, 2003 2:00 pm**

If I were you, I would definitely write a two-stage bootloader for loading the kernel above 1 MB. Loading below 1 MB might cause problems; remember that GRUB does not let you load below 1 MB.

What (kind of) problems?
It's going to be a micro kernel, size shouldn't be a problem.

I'm not planning to use grub.

a) [RealMode] Load 1st stage.-> 512 bytes only loads second stage
b) [RealMode] Loads 2nd stage.

Is the 2nd stage bigger than 512 bytes?
It makes no sense to load 512 bytes when you have 512 already.

Posted: **Thu Jul 24, 2003 2:27 pm**

The 2nd stage is what you write when the bootloader needs more than 512 bytes, so yes, the 2nd stage would be more than 512 bytes. If you try to write a bootloader and end up adding functionality that makes it larger than 512 bytes, you make it a 2nd stage bootloader, and write a very small bootloader that just loads and executes the 2nd stage. If you can fit the whole process into 512 bytes, then obviously you just leave it as the whole bootloader and don't bother with a 2nd stage.

Posted: **Thu Jul 24, 2003 2:53 pm**

James Buchanan wrote:
1. Start with 512-byte bootsector, which gets loaded at 0000:7C00h. Bootsectors start in 16-bit real mode.

Watch out! This is a common mistake: you can get loaded at 0000:7C00 or at 07C0:0000. These are the very same bytes in memory, but the different CS value affects how you access your data. The only safe way is to pick one or the other and enforce your choice with a far jump at the start of your bootsector.
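
To make the point concrete, here is a small sketch in C (not boot code itself) of the real-mode address arithmetic, plus the five bytes a far jump assembles to. The function names are invented for illustration. It assumes the far jump is the very first instruction of the bootsector, so the instruction after it sits at offset 7C05h when segment 0000 is chosen.

```c
#include <stdint.h>

/* Real-mode address translation: linear = segment * 16 + offset. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Encode "jmp far segment:offset" (opcode EAh, then the 16-bit offset
   and the 16-bit segment, both little-endian). Executing it loads CS
   with `segment` and IP with `offset`. */
void encode_far_jmp(uint8_t out[5], uint16_t segment, uint16_t offset)
{
    out[0] = 0xEA;
    out[1] = (uint8_t)(offset & 0xFF);
    out[2] = (uint8_t)(offset >> 8);
    out[3] = (uint8_t)(segment & 0xFF);
    out[4] = (uint8_t)(segment >> 8);
}
```

Both `real_mode_linear(0x0000, 0x7C00)` and `real_mode_linear(0x07C0, 0x0000)` give 7C00h, which is why the bytes are the same while the offsets your code must use are not. In an assembler you would normally just write the far jump with a label rather than encode it by hand.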

Posted: **Thu Jul 24, 2003 3:23 pm**

Surely the 2nd stage is not 512 bytes long; in fact that is why I am suggesting two stages. The first one just loads the second stage, which can be any length and does the initialization needed for loading the kernel; then you can load the kernel from the second stage more easily. BTW I don't know anything about microkernels, but size is always a problem I think. ;D I meant problems like overwriting some reserved memory areas or overwriting your own stack: silly mistakes that might cost you time in the future.
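
A rough sketch in C of the kind of check I mean, assuming the usual low-memory layout of a PC. The table and the name `load_conflict` are made up for this example, and it is deliberately incomplete: your own stack and the extended BIOS data area just below A0000h (whose size varies, int 12h reports how much conventional memory is usable) would have to be added for a real loader.

```c
#include <stddef.h>
#include <stdint.h>

struct region {
    uint32_t start;     /* first byte of the region */
    uint32_t end;       /* one past the last byte */
    const char *what;
};

/* Areas below 1 MB that a loader should not overwrite while it runs. */
static const struct region reserved[] = {
    { 0x00000, 0x00400,  "real-mode interrupt vector table" },
    { 0x00400, 0x00500,  "BIOS data area" },
    { 0x07C00, 0x07E00,  "boot sector (still executing)" },
    { 0xA0000, 0x100000, "video memory and BIOS ROMs" },
};

/* Returns a description of the first reserved area that the range
   [start, start + length) overlaps, or NULL if there is no conflict. */
const char *load_conflict(uint32_t start, uint32_t length)
{
    uint32_t end = start + length;
    for (size_t i = 0; i < sizeof reserved / sizeof reserved[0]; i++) {
        if (start < reserved[i].end && reserved[i].start < end)
            return reserved[i].what;
    }
    return NULL;
}
```

For example, an image loaded at 1000h runs into the boot sector at 7C00h once it is longer than 6C00h bytes (54 sectors), long before it gets anywhere near 1 MB.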

Posted: **Sat Jul 26, 2003 6:03 am**

Thanks everyone for your input. Frank pointed out that I forgot about loading the kernel before killing the motor, duh, heh heh. Thanks Frank. Also Pype pointed out the potential pitfall of being entered at 07C0:0000 rather than 0000:7C00. Ouch. That's definitely one to watch for. Pype, can you post example code to take care of this?

Here is a revised list:

1. Bootloader (sans Grub or anything like that for the moment, will do Grub and Lilo versions later). 512 bytes, 16-bit real mode. Start with a far jump to the next instruction using segment 0000, so that CS:IP is in the 0000:7Cxx form whichever way the BIOS entered us. A far jump as opposed to a near jump is necessary to force the loading of CS. A near jump is a jump inside a segment; a far jump loads a new CS along with the offset (inter-segment jump). Is that correct?

2. Now you're at 0000:7C00, enable the A20 gate straight away. Time to load the kernel (or optionally a second stage bootloader.) Set stack pointer, arbitrary top of stack. Now, can we jump downward into lower memory and just overwrite this bootloader? We don't need it anymore.

3. Optional. Second stage bootloader. (TODO: where can this be loaded? Let's say it's a small real-time kernel but needs a second stage bootloader anyway, for whatever reason. Load at logical address 0000:1000 OK?) Erase bootloader above. Make sure stack pointer is in right position. Check those segment registers. All should have logical base 0x0000. Let's just use 16-bit offsets to slurp up the kernel and initialise before going 32-bit pmode. (?)

(Message was too long to post, continued...)

Posted: **Sat Jul 26, 2003 6:04 am**

(continued...)

4. Use BIOS calls to get the kernel. Load from floppy, hard disk, CD-ROM, whatever. How do we know the size of the kernel beforehand? If we used a linker script as Pype suggested to get the size of the kernel image, we need it before loading the kernel. How do we know how many sectors to read? Any suggestions? I suppose we just have to know beforehand a safe number of sectors to read.

5. Load kernel above the second stage bootloader. Say the second stage is 2048B (2K) and is located at 0000:1000. This means we read each sector off the boot disk/floppy/whatever and start placing them in order from 0000:1800. After the kernel is loaded (if we booted from floppy, kill motor now), we just jump into 0000:1800.

6. NOTE: Should we have re-programmed the interrupt controller before doing this? The BIOS in IBM PCs erroneously uses the first few interrupt vectors which are actually reserved by Intel for CPU exceptions. So we must reprogramme the APIC or 8259 (is this the same thing as '8259A'?) chip to relocate the BIOS's interrupts above the areas reserved for Intel CPUs, and install our own handlers for CPU exceptions. Any more info anyone?

6 or 7, depending: Now we are at 0000:1800, and let's assume that there is some startup code there we need to execute before jumping to the kernel main(), which, let's say, is at a known offset from the start of the kernel image at 512 bytes up, so it's at offset 1A00h.

7 or 8. Disable interrupts and reprogramme the interrupt controller chip to relocate the erroneous BIOS vectors above the CPU's reserved interrupt vector numbers, if we haven't done so already. Install our own dear handlers. (? How do we know where they are? Are the entry points at known offsets in the kernel code segment?)

8 or 9. After reprogramming or whatever, still before kernel main() in the kernel's assembly startup code, we now switch into 32-bit pmode, still in segmented mode for the time being. (NOTE!!! Insert stuff about the GDT, IDT and LDTs somewhere!!! Maybe around step 6??)

9 or 10. jmp _main

10 or 11. We are in the C code for kernel main(), and we can enable paging and set all segment selectors to the same values, so we can use a paged model of a flat address space shared by the kernel and all other processes. This assumes we are using paging in our small, real-time kernel, of course.
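
On the interrupt controller question in steps 6 to 8, here is my current understanding as a sketch in C, to be corrected if wrong. The BIOS leaves IRQ0-7 on vectors 08h-0Fh and IRQ8-15 on 70h-77h, and the first range collides with the 32 vectors (00h-1Fh) Intel reserves for CPU exceptions. The PC/AT design has two 8259A-compatible PICs; the APIC is a separate, later controller. The sketch builds the usual initialisation sequence as data so the order of writes is easy to check. `struct port_write` and `pic_remap_sequence` are names invented here, and the actual OUT instructions (and any short delays between them) are not shown.

```c
#include <stdint.h>

/* Standard I/O ports of the master and slave 8259A on a PC/AT. */
#define PIC1_CMD  0x20
#define PIC1_DATA 0x21
#define PIC2_CMD  0xA0
#define PIC2_DATA 0xA1

#define PIC_REMAP_WRITES 10

struct port_write {
    uint16_t port;
    uint8_t  value;
};

/* Fill seq with the byte writes that move IRQ0-7 to master_base and
   IRQ8-15 to slave_base. Both bases must be multiples of 8; 0x20 and
   0x28 sit directly above the reserved exception vectors. */
void pic_remap_sequence(struct port_write seq[PIC_REMAP_WRITES],
                        uint8_t master_base, uint8_t slave_base)
{
    const struct port_write writes[PIC_REMAP_WRITES] = {
        { PIC1_CMD,  0x11 },        /* ICW1: begin init, ICW4 will follow */
        { PIC2_CMD,  0x11 },
        { PIC1_DATA, master_base }, /* ICW2: vector base */
        { PIC2_DATA, slave_base },
        { PIC1_DATA, 0x04 },        /* ICW3: slave is attached to IRQ2 */
        { PIC2_DATA, 0x02 },        /* ICW3: slave's cascade identity */
        { PIC1_DATA, 0x01 },        /* ICW4: 8086 mode */
        { PIC2_DATA, 0x01 },
        { PIC1_DATA, 0xFF },        /* mask all IRQs until handlers exist */
        { PIC2_DATA, 0xFF },
    };
    for (int i = 0; i < PIC_REMAP_WRITES; i++)
        seq[i] = writes[i];
}
```

The kernel would run through the array with interrupts disabled, doing one OUT per entry, then unmask individual IRQs as their handlers are installed in the IDT.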

Now we can get on with creating our processes and stuff.
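
Coming back to step 4's question about how many sectors to read: one possibility, sketched here in C as a build-time tool rather than boot code, is to compute the count from the kernel image size and patch it into the bootsector image before writing the disk. The offset 508 (the two bytes just before the 55h AAh signature) and the function names are assumptions made up for this example; the bootloader would have to read the count from the same place.

```c
#include <stdint.h>

#define SECTOR_SIZE            512u
#define KERNEL_SECTORS_OFFSET  508u  /* hypothetical slot in the bootsector */

/* Number of whole sectors needed to hold image_bytes (rounds up). */
uint32_t sectors_needed(uint32_t image_bytes)
{
    return image_bytes / SECTOR_SIZE + (image_bytes % SECTOR_SIZE != 0);
}

/* Store the kernel's sector count as a 16-bit little-endian value in
   the bootsector image. Returns 0 on success, -1 if the count does
   not fit in 16 bits. */
int patch_sector_count(uint8_t bootsector[512], uint32_t kernel_bytes)
{
    uint32_t count = sectors_needed(kernel_bytes);
    if (count > 0xFFFFu)
        return -1;
    bootsector[KERNEL_SECTORS_OFFSET]     = (uint8_t)(count & 0xFF);
    bootsector[KERNEL_SECTORS_OFFSET + 1] = (uint8_t)(count >> 8);
    return 0;
}
```

A 10000-byte kernel, for instance, needs 20 sectors. The alternative of always reading a fixed "safe" number of sectors also works, as long as that number is kept above the real size.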

How's that, a better revision? Any more input? In particular, where should we re-programme the interrupt controller to relocate BIOS interrupt vectors (and how many are there, do we just move them up in a chunk to a non-reserved area that Intel says is OK?) and where should we set up the GDT, IDT and LDT (if the LDT is used)? Now what about the TSS? We just allocate an extra segment (really the same segment selector with a different offset), somewhere in memory, where we can store task and processor state? Where should it go, some arbitrary location after the end of the kernel image? Also, we should allocate another "segment" for the kernel stack, and where should we allocate interrupt handler stacks? If we want a re-entrant kernel, we should have separate stacks for the ISRs (? Can we ever use the same stack for ISRs, regardless of re-entrancy?)
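
On the GDT part, as far as I can tell from the Intel manuals the table has to be built and loaded with LGDT before the pmode bit in CR0 is set, and the IDT has to be in place before interrupts are re-enabled. Here is a minimal sketch in C of how one 8-byte segment descriptor is packed. `gdt_entry` is a name invented for this example, and loading the table is not shown.

```c
#include <stdint.h>

/* Pack one segment descriptor.
   base:   32-bit linear base address of the segment
   limit:  20-bit limit (in 4 KB units when the G flag is set)
   access: access byte (present bit, DPL, type)
   flags:  upper nibble of byte 6: G=8, D/B=4, L=2, AVL=1 */
uint64_t gdt_entry(uint32_t base, uint32_t limit, uint8_t access, uint8_t flags)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);              /* limit bits 0-15 */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;       /* base bits 0-23 */
    d |= (uint64_t)access << 40;                   /* access byte */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;   /* limit bits 16-19 */
    d |= (uint64_t)(flags & 0xFu) << 52;           /* G, D/B, L, AVL */
    d |= (uint64_t)((base >> 24) & 0xFFu) << 56;   /* base bits 24-31 */
    return d;
}
```

For a flat model the table would hold a null descriptor followed by `gdt_entry(0, 0xFFFFF, 0x9A, 0xC)` for ring 0 code and `gdt_entry(0, 0xFFFFF, 0x92, 0xC)` for ring 0 data, both covering the full 4 GB. A TSS, if used, gets its own descriptor in the same table.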

TODO: Discuss re-entrancy and fully pre-emptive kernels that can pre-empt even kernel tasks.

Thanks everyone for your input. GREATLY appreciated. My aim is to have a one-stop tutorial for bootstrapping and running a kernel on modern IBM and compatible PCs with Intel processors, with references to further reading (e.g. Intel manuals, this website and other OSdev websites, books, more tutorials, etc.)

The steps in booting an x86 IBM PC
