Critique of my bootload/kernel scheme
Posted: Fri Jun 10, 2011 9:16 pm
I decided to post my current boot scheme here and see what you all think of its feasibility.
Current boot process is as follows:
1. PXE loads a file from TFTP server.
2. Do some PXE sanity checks
3. Do some CPU sanity checks (CPUID, etc.)
4. Enumerate E820 memory map
5. Enumerate PNP system devices (to get io/mem mapping for mobo resources) and store
6. Check for PCI BIOS (real mode, remember) and search for the NIC (which is obviously present)
7. Retrieve vendor/device IDs from PCI device (including multiple NICs)
8. Use PXE BIOS to load driver files from TFTP server (named: "drivers/8086100F.drv" for instance)
9. Switch to 32-bit pmode
10. Relocate NIC driver to its final destination
11. Switch to long mode
12. Initialize NIC
13. Initialize UDP and TFTP functionality
14. TFTP final kernel and any other needed device drivers from TFTP server
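The driver file naming in step 8 can be sketched roughly as below. This is only an illustration in hosted C: `driver_path` is a hypothetical helper name, and it assumes `snprintf` is available, which would not be the case in the real-mode loader itself (there the same hex formatting would have to be done by hand).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: builds "drivers/VVVVDDDD.drv" from the PCI vendor
 * and device IDs, using the naming convention from step 8.
 * Returns 0 on success, -1 if the buffer is too small. */
int driver_path(char *buf, size_t len, uint16_t vendor, uint16_t device)
{
    int n = snprintf(buf, len, "drivers/%04X%04X.drv",
                     (unsigned)vendor, (unsigned)device);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For vendor 0x8086 and device 0x100F this yields the "drivers/8086100F.drv" name used in the example above.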
Essentially what I've written right now is everything down through number 9 on this list. At that point I hit a snag.
I've not yet decided on a driver model to use for the kernel (or whether to use a HAL or not). Right now I'm considering
this code to be an "installation mode" program to retrieve needed drivers and set up the HD for system use, so I am not terribly
concerned about multitasking and the like just yet.
My current thought is to use ELF relocatable files for the drivers so that I can put them anywhere I need. I'm reading the
spec docs on ELF now and it looks like it's going to be quite a time investment to do that - although worthwhile in the long
term.
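As a first step toward that, a loader would at least need to confirm that a downloaded file really is a relocatable ELF64 object before trying to parse sections and relocations. A minimal sketch of that check, assuming little-endian x86-64 drivers (`is_elf64_rel` is a made-up name; the offsets and constants come from the ELF header layout):

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if buf looks like a little-endian ELF64 relocatable object
 * (ET_REL) built for x86-64, 0 otherwise. Only the start of the ELF
 * header is inspected; nothing else in the file is validated. */
int is_elf64_rel(const uint8_t *buf, size_t len)
{
    if (len < 20)
        return 0;                          /* too short to hold e_machine */
    if (buf[0] != 0x7F || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return 0;                          /* bad magic */
    if (buf[4] != 2)
        return 0;                          /* EI_CLASS must be ELFCLASS64 */
    if (buf[5] != 1)
        return 0;                          /* EI_DATA must be ELFDATA2LSB */

    uint16_t e_type    = (uint16_t)(buf[16] | (buf[17] << 8));
    uint16_t e_machine = (uint16_t)(buf[18] | (buf[19] << 8));

    /* ET_REL == 1, EM_X86_64 == 62 */
    return e_type == 1 && e_machine == 62;
}
```

The real work (walking the section headers and applying the relocation entries) comes after this and is where most of the time investment would go.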
I guess I also have one question to ask. I've been reading the Intel specs on paging along with many, many tutorials and, to
my chagrin, I am not having much success. I never dealt with paging in 32-bit mode for anything I've played with and I'm
finding the learning curve to be somewhat steeper than expected. Let's say you want to enable paging and use a flat memory
model in long mode. Do I have to allocate enough PTE/PDE entries to cover all of RAM at one time? I understand that I can clear
the present bit to mark a page as not available, but I mean the actual PML4/PD/PT structures themselves. It seems to me that this is going
to take a fairly substantial amount of RAM just to map pages - especially if one has to duplicate this structure for each task in user mode.
Note that the above question is based on the assumption that CR3 gets swapped on each task change - thus invalidating the TLB and
flushing caches. I hope someone can tell me I'm mistaken on this - I would love to be able to have a page table to load in CR3
that took only, say, 16k total - but from everything I'm reading that's not too likely.
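To put rough numbers on that worry, here is a back-of-the-envelope calculation. It assumes standard 4-level long mode paging (4 KiB tables of 512 eight-byte entries) and one contiguous mapping starting at address 0; `paging_overhead` and `div_up` are names made up for this sketch, not anything from my loader.

```c
#include <stdint.h>

#define TABLE_SIZE 4096ULL   /* every paging structure is one 4 KiB page */

/* Integer division rounding up. */
static uint64_t div_up(uint64_t a, uint64_t b)
{
    return (a + b - 1) / b;
}

/* Bytes of paging structures needed to map the range [0, bytes).
 * use_2m != 0 maps with 2 MiB pages (no page tables at the lowest level),
 * use_2m == 0 maps with 4 KiB pages. */
uint64_t paging_overhead(uint64_t bytes, int use_2m)
{
    uint64_t pts   = use_2m ? 0 : div_up(bytes, 1ULL << 21); /* 1 PT per 2 MiB     */
    uint64_t pds   = div_up(bytes, 1ULL << 30);              /* 1 PD per 1 GiB     */
    uint64_t pdpts = div_up(bytes, 1ULL << 39);              /* 1 PDPT per 512 GiB */
    return (1 + pdpts + pds + pts) * TABLE_SIZE;             /* plus one PML4      */
}
```

Under those assumptions, mapping 4 GiB with 4 KiB pages comes to 2054 tables (about 8 MiB), while the same range with 2 MiB pages needs only 6 tables (24 KiB), and 1 GiB with 2 MiB pages needs 3 tables (12 KiB). Lower-level tables are only needed for ranges that are actually mapped, since a not-present entry at a higher level covers everything beneath it.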
Thanks in advance,
Mike