Critique of my bootload/kernel scheme

Posted: Fri Jun 10, 2011 9:16 pm
by kdx7214
I decided I wanted to post my current boot scheme here and see what you all thought as to its feasibility.

Current boot process is as follows:

1. PXE loads a file from TFTP server.
2. Do some PXE sanity checks
3. Do some CPU sanity checks (CPUID, etc.)
4. Enumerate E820 memory map
5. Enumerate PNP system devices (to get io/mem mapping for mobo resources) and store
6. Check for PCI BIOS (real mode remember) and search for NIC (which is obviously present)
7. Retrieve vendor/device IDs from PCI device (including multiple NICs)
8. Use PXE BIOS to load driver files from TFTP server (named: "drivers/8086100F.drv" for instance)
9. Switch to 32-bit pmode
10. Relocate NIC driver to its final destination
11. Switch to long mode
12. Initialize NIC
13. Initialize UDP and TFTP functionality
14. TFTP final kernel and any other needed device drivers from TFTP server
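
As a small illustration of the naming scheme in step 8, the sketch below builds a TFTP path from a PCI vendor/device ID pair. The function name `driver_path` is made up for this example, and it uses hosted C (`snprintf`) for brevity; a real-mode loader would have to format the hex digits by hand.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Build "drivers/VVVVDDDD.drv" from a PCI vendor ID and device ID.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if the buffer is too small.
 * Hypothetical helper: the name and signature are assumptions. */
int driver_path(char *buf, size_t len, uint16_t vendor, uint16_t device)
{
    assert(buf != NULL);
    int n = snprintf(buf, len, "drivers/%04X%04X.drv",
                     (unsigned)vendor, (unsigned)device);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

Calling `driver_path(buf, sizeof buf, 0x8086, 0x100F)` with a large enough buffer yields the "drivers/8086100F.drv" name used in the list above.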

Essentially what I've written right now is everything down through number 9 on this list. At that point I hit a snag.
I've not yet decided on a driver model to use for the kernel (or whether to use a HAL or not). Right now I'm considering
this code to be an "installation mode" program to retrieve needed drivers and set up the HD for system use, so I am not terribly
concerned about multitasking and the like just yet.

My current thought is to use ELF relocatable files for the drivers so that I can put them anywhere I need. I'm reading the
spec docs on ELF now and it looks like it's going to be quite a time investment to do that - although worthwhile in the long
term.
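
A first step in that direction might be simply recognising a relocatable ELF64 object before trying to parse its sections. The sketch below is one possible check, not a loader: the struct layout and constants follow the ELF64 specification, while the names `Elf64Header` and `is_x86_64_relocatable` are invented here, and it assumes a little-endian host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ELF64 file header, laid out as in the ELF64 specification (64 bytes). */
typedef struct {
    uint8_t  e_ident[16];
    uint16_t e_type;
    uint16_t e_machine;
    uint32_t e_version;
    uint64_t e_entry, e_phoff, e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum;
    uint16_t e_shentsize, e_shnum, e_shstrndx;
} Elf64Header;

enum { ELFCLASS64 = 2, ELFDATA2LSB = 1, ET_REL = 1, EM_X86_64 = 62 };

/* Returns 1 if the image starts with a little-endian x86-64
 * relocatable (ET_REL) ELF64 header, otherwise 0.
 * Assumes the host is little-endian, as x86 is. */
int is_x86_64_relocatable(const void *image, size_t size)
{
    Elf64Header h;
    assert(image != NULL);
    if (size < sizeof h)
        return 0;
    memcpy(&h, image, sizeof h);          /* avoid unaligned reads */
    return memcmp(h.e_ident, "\x7f" "ELF", 4) == 0
        && h.e_ident[4] == ELFCLASS64     /* EI_CLASS */
        && h.e_ident[5] == ELFDATA2LSB    /* EI_DATA  */
        && h.e_type == ET_REL
        && h.e_machine == EM_X86_64;
}
```

The time-consuming part is everything after this check: walking the section headers, resolving symbols and applying the relocation entries.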

I guess I also have one question to ask. I've been reading the Intel specs on paging along with many, many tutorials and to
my chagrin I am not having much success. I never dealt with paging in 32-bit mode for anything I've played with and I'm
finding the learning curve to be somewhat steeper than expected. Let's say you want to enable paging and use a flat memory
model in long mode. Do I have to allocate enough PTE/PDE entries to cover all of RAM at one time? I understand that I can clear
the present bit to mark a page as not available, but I mean the actual PDE/PTE/PML4 structures themselves. It seems to me that this is going
to take a fairly substantial amount of RAM just to map pages - especially if one has to duplicate this structure for each task in user mode.

Note that the above question is based on the assumption that CR3 gets swapped on each task change - thus invalidating the TLB and
flushing caches. I hope someone can tell me I'm mistaken on this - I would love to be able to have a page table to load in CR3
that took only, say, 16k total - but from everything I'm reading that's not too likely.
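
For a sense of scale, the arithmetic behind this question can be sketched as below. It assumes 4-level long mode paging (512 eight-byte entries per 4 KiB table) and a single contiguous region mapped from address 0; the function name `paging_overhead` is invented for the example. Tables are only needed for regions that are actually mapped, since an entry at any level can be left not present.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRIES_PER_TABLE 512u   /* 4 KiB table / 8-byte entries */
#define TABLE_BYTES       4096u

static uint64_t div_round_up(uint64_t a, uint64_t b)
{
    return (a + b - 1) / b;
}

/* Bytes of paging structures needed to map `bytes` of contiguous
 * address space starting at 0.
 * page_shift = 12 -> 4 KiB pages (PT + PD + PDPT + PML4)
 * page_shift = 21 -> 2 MiB pages (PD + PDPT + PML4, no PTs) */
uint64_t paging_overhead(uint64_t bytes, unsigned page_shift)
{
    assert(bytes > 0);
    assert(page_shift == 12 || page_shift == 21);

    uint64_t n = div_round_up(bytes, 1ull << page_shift); /* pages */
    int levels = (page_shift == 12) ? 4 : 3;
    uint64_t tables = 0;

    /* Each level needs one table per 512 entries of the level below. */
    for (int i = 0; i < levels; i++) {
        n = div_round_up(n, ENTRIES_PER_TABLE);
        tables += n;
    }
    return tables * TABLE_BYTES;
}
```

Under these assumptions, mapping 4 GiB with 4 KiB pages takes 2054 tables (about 8 MiB), while the same range with 2 MiB pages takes 6 tables (24 KiB). Per-task duplication can also be limited, because the PML4 entries of different address spaces may point at the same lower-level tables for the kernel's part of the address space.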

Thanks in advance,
Mike

Re: Critique of my bootload/kernel scheme

Posted: Fri Jun 10, 2011 11:39 pm
by rdos
In a more mature OS, there is no need to get PNP and PCI information from real-mode BIOS. It is pretty simple to write code for those things.

Re: Critique of my bootload/kernel scheme

Posted: Sat Jun 11, 2011 3:30 am
by Brendan
Hi,
kdx7214 wrote:Current boot process is as follows:

1. PXE loads a file from TFTP server.
2. Do some PXE sanity checks
3. Do some CPU sanity checks (CPUID, etc.)
4. Enumerate E820 memory map
5. Enumerate PNP system devices (to get io/mem mapping for mobo resources) and store
6. Check for PCI BIOS (real mode remember) and search for NIC (which is obviously present)
7. Retrieve vendor/device IDs from PCI device (including multiple NICs)
8. Use PXE BIOS to load driver files from TFTP server (named: "drivers/8086100F.drv" for instance)
9. Switch to 32-bit pmode
10. Relocate NIC driver to its final destination
11. Switch to long mode
12. Initialize NIC
13. Initialize UDP and TFTP functionality
14. TFTP final kernel and any other needed device drivers from TFTP server
The NIC device driver you initialise at step 12 is a waste of time. It will need to run without relying on anything provided by the kernel, which means it'd need to duplicate half the kernel to run and half the assumptions it will need to make will become invalid once the kernel starts. Why not avoid all the hassle and load the kernel at step 8?

For example:

1. PXE loads a boot code from TFTP server.
2. Do some PXE sanity checks
3. Do some CPU sanity checks (CPUID, etc.)
4. Enumerate E820 memory map
5. Enumerate PNP system devices (to get io/mem mapping for mobo resources) and store
6. Use PXE BIOS to load driver files and kernel from TFTP server (named: "drivers/8086100F.drv" for instance)
7. Switch to 32-bit pmode
8. Switch to long mode
9. Start kernel
10. Retrieve vendor/device IDs from all PCI devices (including multiple NICs)
11. Start drivers loaded in step 6

Now think about booting from other devices (e.g. hard disk, CD) in addition to PXE:

1. firmware loads a boot code from "wherever"
2. Do some sanity checks
3. Do some CPU sanity checks (CPUID, etc.)
4. Enumerate E820 memory map
5. Enumerate PNP system devices (to get io/mem mapping for mobo resources) and store
6. Use firmware to load driver files and kernel from "wherever"
7. Switch to 32-bit pmode
8. Switch to long mode
9. Start kernel
10. Retrieve vendor/device IDs from all PCI devices (including multiple NICs)
11. Start drivers loaded in step 6


Cheers,

Brendan

Re: Critique of my bootload/kernel scheme

Posted: Sat Jun 11, 2011 2:42 pm
by kdx7214
Hmmm. Well my first thought is that while that could work, the coding is going to be a right pain in the rear. I had planned to have a different boot loader for each type of device. I don't see how it would be possible to write a 512-byte boot sector that could handle all of PXE, floppy, IDE, SATA, CDROM, etc. not to mention the combination of file systems and MBR/GPT and so forth. How did you handle that issue with your OS Brendan?

I'd be keen for suggestions. I'm reading everything I can find. I've been toying around with some pieces and parts of this for about 10 years but this is the first time I've attempted to put things together beyond the toy-code stage. I'm definitely open to new ideas - but I want to attempt to avoid some of the bad design decisions I've seen in other OSs (e.g. Linux with compiled-in drivers).

Thanks,
Mike

Re: Critique of my bootload/kernel scheme

Posted: Sat Jun 11, 2011 7:48 pm
by Brendan
Hi,
kdx7214 wrote:Hmmm. Well my first thought is that while that could work, the coding is going to be a right pain in the rear. I had planned to have a different boot loader for each type of device. I don't see how it would be possible to write a 512-byte boot sector that could handle all of PXE, floppy, IDE, SATA, CDROM, etc. not to mention the combination of file systems and MBR/GPT and so forth.
It's not possible to write one boot loader for everything. In general, for PC BIOS I'd have at least 5 of them:
  1. PXE
  2. "no emulation" El Torito/CD
  3. Floppy (no partitions, older "int 0x13" functions)
  4. Normal Disk (MBR partitions, newer "int 0x13 extension" functions)
  5. Hybrid Disk (GPT partitions, newer "int 0x13 extension" functions)
kdx7214 wrote:How did you handle that issue with your OS Brendan?
For my project, I'm continuing to move towards a "boot abstraction layer" approach. The idea is that the boot loader is firmware specific (and for PC BIOS, boot device specific), while the boot abstraction layer isn't. The boot abstraction layer calls functions in the boot loader to get the memory map, load things from the boot device, etc.
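
One way to picture that split is a small table of function pointers that each boot loader fills in and the boot abstraction layer calls. This is only a sketch of the idea: every name here (`BootLoaderOps`, `MemRegion`, `bal_usable_bytes`, the `demo_*` stand-ins) is hypothetical and not taken from the project described; the region type value 1 meaning "usable" follows the E820 convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t base, length;
    uint32_t type;               /* 1 = usable RAM (E820 convention) */
} MemRegion;

/* Services a firmware/device-specific boot loader offers to the BAL. */
typedef struct {
    size_t (*get_memory_map)(MemRegion *out, size_t max);
    long   (*load_file)(const char *name, void *dest, size_t max);
} BootLoaderOps;

/* BAL side: firmware-independent code that only uses the ops table. */
uint64_t bal_usable_bytes(const BootLoaderOps *ops)
{
    MemRegion map[32];
    assert(ops != NULL && ops->get_memory_map != NULL);
    size_t n = ops->get_memory_map(map, 32);
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        if (map[i].type == 1)
            total += map[i].length;
    return total;
}

/* Stand-in "boot loader" with a fixed memory map, for demonstration. */
static const MemRegion demo_map[] = {
    { 0x00000000, 0x0009FC00, 1 },   /* low memory         */
    { 0x0009FC00, 0x00000400, 2 },   /* reserved (EBDA)    */
    { 0x00100000, 0x07F00000, 1 },   /* memory above 1 MiB */
};

static size_t demo_get_memory_map(MemRegion *out, size_t max)
{
    size_t n = sizeof demo_map / sizeof demo_map[0];
    if (n > max)
        n = max;
    for (size_t i = 0; i < n; i++)
        out[i] = demo_map[i];
    return n;
}

static long demo_load_file(const char *name, void *dest, size_t max)
{
    (void)name; (void)dest; (void)max;
    return -1;                       /* not implemented in this demo */
}

const BootLoaderOps demo_ops = { demo_get_memory_map, demo_load_file };
```

A PXE loader, a CD loader and a UEFI loader would each supply their own `BootLoaderOps`, while code like `bal_usable_bytes` stays the same for all of them.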

For boot loaders, my plan is to have:
  1. PC BIOS, PXE
  2. PC BIOS, "no emulation" El Torito/CD
  3. PC BIOS, floppy (no partitions, older "int 0x13" functions)
  4. PC BIOS, normal Disk (MBR partitions, newer "int 0x13 extension" functions)
  5. PC BIOS, hybrid Disk (GPT partitions, newer "int 0x13 extension" functions)
  6. PC BIOS, GRUB-legacy
  7. 32-bit UEFI, PXE
  8. 32-bit UEFI, "EFI system disk image" (covers floppy, disk and El Torito/CD)
  9. 64-bit UEFI, PXE
  10. 64-bit UEFI, "EFI system disk image" (covers floppy, disk and El Torito/CD)
  11. Direct from ROM (where the OS is installed in ROM, and special boot code for different chipset/s is used to boot it)
  12. "Fast reboot" (where the OS acts as a boot loader to boot another version of itself - like "kexec()" in Linux)
I've done all of these before (in previous versions of my OS), except for the UEFI boot loaders and the "fast reboot". It's UEFI that I'm (slowly) meant to be working on now.

The basic idea is:
  1. something (firmware?) loads a boot code from "wherever"
  2. boot code loads "boot abstraction layer" (BAL) from "wherever"
  3. boot code starts BAL
  4. BAL asks boot code for memory map and initialises its memory manager
  5. BAL asks boot code to load boot image
  6. BAL decompresses boot image (if necessary)
  7. BAL determines what sort of user interface it should use (video, serial port, etc)
  8. (for video) BAL asks boot code to set up a video mode and a framebuffer
  9. BAL gets a suitable "user interface" module (for video, serial port, etc) from boot image and starts it
  10. BAL gets "CPU detection" module from boot image, and uses it to detect CPU details
  11. BAL determines what type of kernel to use (32-bit, PAE, long mode), depending on supported CPU features, etc
  12. BAL gets "kernel setup" module from boot image for selected kernel type
  13. BAL passes control to "kernel setup" module
  14. "kernel setup" module gets "kernel modules" from boot image, and initialises each of them
  15. "kernel setup" module passes control to kernel
In general, I'll be able to boot the same BAL and the same boot image (containing the same kernels, drivers, etc) on everything (just with different boot code to suit the specific case). In some cases (El Torito/CD and hybrid MBR/GPT disks, UEFI) you can have multiple boot loaders and let the firmware choose the right boot loader to use, so it's possible to (e.g.) have a single universal CD that works on everything the OS supports (rather than many different CDs with different boot code, different kernels, etc).


Cheers,

Brendan

Re: Critique of my bootload/kernel scheme

Posted: Tue Jun 14, 2011 9:09 am
by Casm
Personally, I would let each device driver walk the PCI bus for itself, perhaps with the aid of a kernel service.
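
A kernel service of that kind could be as small as the two helpers sketched below, which cover PCI configuration mechanism #1: building the value written to the CONFIG_ADDRESS port (0xCF8) and splitting the first configuration dword into vendor and device IDs. The function and type names are made up for this example, and the actual port I/O is left out so the code stays runnable as ordinary C.

```c
#include <assert.h>
#include <stdint.h>

/* Value to write to CONFIG_ADDRESS (port 0xCF8) to select a register:
 * bit 31 = enable, 23..16 = bus, 15..11 = device, 10..8 = function,
 * 7..2 = dword-aligned register offset. */
uint32_t pci_config_address(uint8_t bus, uint8_t dev, uint8_t func,
                            uint8_t offset)
{
    assert(dev < 32 && func < 8);
    return 0x80000000u
         | ((uint32_t)bus  << 16)
         | ((uint32_t)dev  << 11)
         | ((uint32_t)func << 8)
         | (offset & 0xFCu);
}

typedef struct {
    uint16_t vendor;
    uint16_t device;
} PciId;

/* Decode configuration dword 0: vendor ID in the low 16 bits,
 * device ID in the high 16 bits. A vendor ID of 0xFFFF means no
 * device is present at that bus/device/function. */
PciId pci_decode_id(uint32_t dword0)
{
    PciId id = { (uint16_t)(dword0 & 0xFFFFu), (uint16_t)(dword0 >> 16) };
    return id;
}
```

A driver walking the bus would loop over bus, device and function numbers, write each address to port 0xCF8, read the dword back from port 0xCFC, and compare the decoded IDs with the ones it supports.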