Hi,
An analysis of "boot.asm"...
In NASM, directives have 2 forms: the "user-level form" (e.g. "org 0x7C00") and the "primitive form" (e.g. "[org 0x7C00]"). In general the user-level form is like a macro/wrapper around the primitive form, where that wrapper may do additional things to ensure sanity and correct behaviour. For some directives both forms are equivalent, and you should use the user-level form in case future versions of NASM change the behaviour of the primitive form somehow. For other directives they are not equivalent, and you should use the user-level form unless you've got a very specific reason to use the lower level primitive form.
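As a small illustration of the two forms (a sketch only; the values are the usual boot sector ones and are assumed, not taken from your file):

```nasm
; User-level forms (preferred).
bits 16
org 0x7C00

; The primitive forms of the same directives would be written as:
;   [bits 16]
;   [org 0x7C00]
```

As far as I know, SECTION is one directive where the two forms differ, because the user-level form also updates NASM's own record of the current section.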
Code: Select all
KERNEL_ADDRESS equ 0x100000
cli
lgdt [gdt_descriptor]
This "LGDT" instruction relies on whatever value the BIOS felt like leaving in the DS segment register. If the BIOS felt like leaving the value 0xFFFF in DS then the next piece of code might not like the GDT you've loaded.
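A minimal sketch of one way to avoid this, assuming the code is assembled with "org 0x7C00" so that a DS of zero matches the label addresses ("gdt_descriptor" is the label from your code):

```nasm
bits 16
    cli
    xor ax, ax              ; assumption: labels are relative to segment 0 (org 0x7C00)
    mov ds, ax              ; DS now has a known value instead of whatever the BIOS left
    lgdt [gdt_descriptor]   ; DS:gdt_descriptor is now the address you meant
```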
Code: Select all
;Switch to PM
mov eax, cr0
or eax, 0x1
mov cr0, eax
jmp 0x8:init_pm
This is a design flaw. At a minimum, a boot loader must get things like the memory map from the firmware, and you can't do that if you've switched to protected mode (or long mode) for no sane reason.
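To show the kind of thing that has to happen before the switch, here is a rough sketch of getting the memory map with "INT 0x15, EAX=0xE820" while still in real mode. The names "get_memory_map", "memory_map_buffer" and "memory_map_count" are hypothetical, the buffer address is an assumption, and DS and ES are assumed to be zero:

```nasm
bits 16
memory_map_buffer equ 0x5000            ; hypothetical free area below 0x7C00

get_memory_map:
    mov word [memory_map_count], 0
    mov di, memory_map_buffer           ; ES:DI = where the BIOS writes each entry
    xor ebx, ebx                        ; EBX = 0 starts the enumeration
.next_entry:
    mov eax, 0xE820
    mov edx, 0x534D4150                 ; 'SMAP' signature
    mov ecx, 24                         ; ask for 24-byte entries
    mov dword [es:di + 20], 1           ; pre-set extended attributes in case only 20 bytes are returned
    int 0x15
    jc .done                            ; carry = unsupported (first call) or end of list
    cmp eax, 0x534D4150                 ; BIOS must echo the signature back
    jne .done
    inc word [memory_map_count]
    add di, 24
    test ebx, ebx                       ; EBX = 0 means that was the last entry
    jnz .next_entry
.done:
    ret

memory_map_count: dw 0
```

If "memory_map_count" is still zero afterwards, the function isn't supported and you'd need to fall back to older BIOS functions or give up with an error message. None of this will fit in a 512-byte boot sector alongside everything else, which is another reason to load a second stage first.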
(Optional) I personally think that failing to check if the CPU supports a feature (like 32-bit protected mode) before relying on that feature is a sign of poor quality code. I would check if the CPU supports protected mode/long mode and display some sort of "CPU is too old" error message if it doesn't, so that if the user tries to boot it on an old computer it doesn't blow up in their face with no indication of why.
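A sketch of the long mode part of that check follows. It assumes the CPU is already known to be a 32-bit CPU that supports the CPUID instruction (detecting that is a separate step that isn't shown), and "check_long_mode" is a hypothetical name:

```nasm
bits 16
; Returns with carry clear if long mode is supported, carry set if not.
check_long_mode:
    mov eax, 0x80000000
    cpuid                       ; EAX = highest supported extended function
    cmp eax, 0x80000001
    jb .no_long_mode
    mov eax, 0x80000001
    cpuid
    test edx, 1 << 29           ; EDX bit 29 = long mode available
    jz .no_long_mode
    clc
    ret
.no_long_mode:
    stc
    ret
```

The caller would print the "CPU is too old" message and halt if carry is set.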
Code: Select all
[bits 32]
init_pm :
mov ax, 0x10
mov ds, ax
mov ss, ax
mov es, ax
mov fs, ax
mov gs, ax
call build_page_tables
At this point you're using whatever value the firmware felt like leaving in ESP. This value may be something like 0xFFFF7C00 (where only the lowest 16-bits were used in real mode, but now you're using the full 32 bits).
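For example (a sketch; the stack address is my assumption, chosen so that the stack grows down from just below the boot sector):

```nasm
bits 32
init_pm:
    mov ax, 0x10
    mov ss, ax
    mov esp, 0x7C00             ; assumed stack top; all 32 bits of ESP are now known
```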
Also I'd recommend loading an IDT with "limit = 0" before all CPU mode switches; so that if an NMI occurs at the wrong time the CPU correctly resets (which might seem bad, but there's no way to correctly handle NMI at this stage and no data to lose; and it's far better than random/undefined behaviour).
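A rough sketch of that, with "null_idt" as a hypothetical label. The base is padded to 8 bytes so the same structure can also be loaded again from 64-bit code:

```nasm
    ; Do this after the last BIOS call and before the first mode switch,
    ; because BIOS interrupts stop working once the IDT limit is zero.
    cli
    lidt [null_idt]

null_idt:
    dw 0                        ; limit = 0
    dq 0                        ; base (only the low bytes are used outside long mode)
```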
You can cut and paste the code from "build_page_tables" here to remove the call and return.
Code: Select all
build_page_tables:
;PML4 starts at 0x1000
;PML4 @ 0x1000
mov eax, 0x2000 ;PDP base address
or eax, 0b11 ;P and R/W bits
mov ebx, 0x1000 ;PML4 base address
mov [ebx], eax
;PDP @ 0x2000; maps 64 GiB of RAM
mov eax, 0x3000 ;PD base address
mov ebx, 0x2000 ;PDP physical address
mov ecx, 64 ;64 PDP
build_PDP:
or eax, 0b11
mov [ebx], eax
add ebx, 0x8
add eax, 0x1000 ;next PD page base address
loop build_PDP
;PD @ 0x3000 (ends at 0x4000, fits below 0x7c00)
; 1 entry maps a 2MB page, the 1st starts at 0x0
mov eax, 0x0 ;1st page physical base address
mov ebx, 0x3000 ;PD physical base address
mov ecx, 512
build_PD:
or eax, 0b10000011 ;P + R/W + PS (bit for 2MB page)
mov [ebx], eax
add ebx, 0x8
add eax, 0x200000 ;next 2MB physical page
loop build_PD
;(tables end at 0x4000 => fits before Bios boot sector at 0x7c00)
You've forgotten to set unused PML4 entries and PDPT entries to zero. The "or eax, 0b11" and the "or eax, 0b10000011" can be shifted out of their loops. I really do doubt that you need to map 64 GiB of "who knows what" at this point.
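Something like the following would cover those points. It's only a sketch: it keeps your table addresses, maps 1 GiB instead of 64 GiB (my assumption about what is enough for now), and assumes ES is a flat data segment as set up in "init_pm":

```nasm
bits 32
build_page_tables:
    ; Zero the PML4, PDPT and PD (0x1000 to 0x3FFF) so unused entries are "not present".
    mov edi, 0x1000
    xor eax, eax
    mov ecx, (3 * 0x1000) / 4
    cld
    rep stosd

    mov dword [0x1000], 0x2000 | 0b11   ; PML4[0] -> PDPT
    mov dword [0x2000], 0x3000 | 0b11   ; PDPT[0] -> PD

    ; PD: 512 entries of 2 MiB each = 1 GiB, identity mapped.
    mov eax, 0b10000011                 ; flags set once, outside the loop
    mov ebx, 0x3000
    mov ecx, 512
build_pd_entries:
    mov [ebx], eax                      ; high dword is already zero
    add ebx, 8
    add eax, 0x200000                   ; next 2 MiB page; low flag bits are unaffected
    loop build_pd_entries
    ret
```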
In general, it's a bad idea to use 2 MiB pages for the area from 0x00000000 to 0x001FFFFF. CPUs typically store "final caching method" in TLB entries, and part of this area is "write back" and part is typically "uncached" (and some may be "write protected"). A good CPU will (internally) split the 2 MiB page into many "4 KiB TLB entries" to work around this. Not all CPUs are good and it's like begging to suffer from CPU errata.
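A sketch of mapping the first 2 MiB with 4 KiB pages instead, assuming a page table at 0x4000 (a hypothetical address that still fits below 0x7C00) and the PD at 0x3000 as in your code:

```nasm
bits 32
    mov dword [0x3000], 0x4000 | 0b11   ; PD[0] -> page table instead of a 2 MiB page
    mov dword [0x3004], 0

    mov eax, 0b11                       ; P + R/W, physical address 0
    mov ebx, 0x4000                     ; assumed page table address
    mov ecx, 512                        ; 512 entries of 4 KiB = 2 MiB
build_pt_entries:
    mov [ebx], eax
    mov dword [ebx + 4], 0              ; high dword of the 64-bit entry
    add ebx, 8
    add eax, 0x1000                     ; next 4 KiB page
    loop build_pt_entries
```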
On the subject of CPU errata; there are some AMD CPUs that may corrupt memory when trying to update the accessed/dirty flags. To avoid this I'd ensure these flags are already set when you create paging structure entries so that the CPU doesn't need to set these flags when the pages are accessed/modified; which avoids the problems with AMD CPU errata and also improves performance slightly on all other CPUs. Your kernel can worry about detecting CPU errata and figure out what to do with accessed/dirty flags later on (e.g. before needing these flags to implement swap space support).
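In practice this just means OR-ing two more bits into the flags you already use. A sketch, where all of the constant names are hypothetical:

```nasm
PAGE_PRESENT    equ 1 << 0
PAGE_WRITE      equ 1 << 1
PAGE_ACCESSED   equ 1 << 5
PAGE_DIRTY      equ 1 << 6              ; only meaningful in entries that map a page
PAGE_SIZE       equ 1 << 7              ; PS bit (2 MiB page in a PD entry)

TABLE_FLAGS     equ PAGE_PRESENT | PAGE_WRITE | PAGE_ACCESSED
LARGE_PAGE_FLAGS equ PAGE_PRESENT | PAGE_WRITE | PAGE_ACCESSED | PAGE_DIRTY | PAGE_SIZE

bits 32
    mov dword [0x1000], 0x2000 | TABLE_FLAGS    ; PML4[0] -> PDPT, accessed pre-set
    mov eax, LARGE_PAGE_FLAGS                   ; starting value for the 2 MiB page loop
```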
Code: Select all
;Enable PAE
mov eax, cr4
or eax, 1 << 5
mov cr4, eax
;# Optional : Enable global-page mechanism by setting CR4.PGE bit to 1
mov eax, cr4
or eax, 1 << 7
mov cr4, eax
For a CPU that supports long mode (and therefore must support PGE) I wouldn't necessarily consider PGE optional. In that case this code could be 2 instructions, as there's no real reason to care what value was in CR4 beforehand (e.g. "mov eax, (1 << 5) | (1 << 7)" and "mov cr4, eax").
Code: Select all
[bits 64]
init_lm:
mov ax, 0x10
mov fs, ax ;other segments are ignored
mov gs, ax
For "other segments", only their base and limit are ignored, and their attributes are not ignored. Also the "visible value" may be important later (e.g. interrupt handlers using IST that push SS and then pop SS during IRET, where the "visible value" of SS is trash that got left over from protected mode). I'd recommend loading SS, DS and ES regardless of whether it's necessary or not.
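That is, something like this (assuming selector 0x10 is a writable data descriptor, as it appears to be in your GDT):

```nasm
bits 64
init_lm:
    mov ax, 0x10
    mov ds, ax
    mov es, ax
    mov ss, ax                  ; gives SS a sane "visible value" for later IRETs
    mov fs, ax
    mov gs, ax
```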
Code: Select all
;=============================================================================
; ATA read sectors (CHS mode)
; Max head index is 15, giving 16 possible heads
; Max cylinder index can be a very large number (up to 65535)
; Sector is usually always 1-63, sector 0 reserved, max 255 sectors/track
; If using 63 sectors/track, max disk size = 31.5GB
; If using 255 sectors/track, max disk size = 127.5GB
This is several massive design flaws all rolled into one.
For storage devices; the boot device may be one of about 200 different SCSI controllers, or (more likely) SATA using AHCI and not ATA, or USB/flash, or USB hard disk, or "El Torito hard disk emulation", or possibly something more exotic (e.g. iSCSI). Unless you're going to put several hundred device drivers in your 512 byte boot sector you must use the BIOS disk services.
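A rough sketch of reading sectors through the BIOS using the "INT 0x13 extensions" (LBA) functions follows. It has to run in real mode with DS set to zero. The labels "boot_drive", "no_extensions" and "disk_error" are hypothetical, and the sector count, load address and starting LBA are placeholder assumptions:

```nasm
bits 16
read_kernel:
    ; Check that the BIOS supports the extended (LBA) functions for this drive.
    mov ah, 0x41
    mov bx, 0x55AA
    mov dl, [boot_drive]        ; drive number saved from DL at boot
    int 0x13
    jc no_extensions
    cmp bx, 0xAA55
    jne no_extensions

    ; Extended read: DS:SI points to the disk address packet.
    mov si, disk_address_packet
    mov ah, 0x42
    mov dl, [boot_drive]
    int 0x13
    jc disk_error               ; AH holds the BIOS status code
    ret

align 4
disk_address_packet:
    db 0x10                     ; size of this packet
    db 0                        ; reserved
    dw 16                       ; number of sectors to read (assumed)
    dw 0x8000                   ; destination offset (assumed)
    dw 0x0000                   ; destination segment
    dq 1                        ; starting LBA (assumed)
```

The BIOS can only load below 1 MiB this way, so the data would need to be copied up to KERNEL_ADDRESS afterwards.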
For "ATA" there's about 5 different DMA modes and another 5 different PIO modes. You need to figure out what the controller supports and what the disk supports and use something that both support. You can't just assume they both support whatever (likely deprecated) PIO mode you're using.
Also for "ATA", you cannot assume the disk has 16 heads and 63 sectors per track. You need to get this information from the disk drive itself. However, for "slightly modern-ish" disk drives using CHS (and not LBA) is probably a very bad idea in the first place (unless the disk drive doesn't support LBA, but that's very unlikely).
In terms of error handling; the user (and/or programmer via bug reports from the user) needs to know what went wrong; and needs to be able to tell the difference between (e.g.) "address mark not found" and "CRC error" and whatever else. Your error handling is entirely non-existent and therefore completely inadequate.
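Even a bare minimum would help. As a sketch (assuming BIOS disk services are used, so that the status code is returned in AH; for example 0x02 is "address mark not found" and 0x10 is a CRC/ECC error), this prints the status as two hex digits and halts. The names "disk_error" and "print_hex_digit" are hypothetical:

```nasm
bits 16
disk_error:
    mov cl, ah                  ; save the BIOS status code
    mov al, cl
    shr al, 4                   ; high nibble first
    call print_hex_digit
    mov al, cl
    and al, 0x0F                ; then the low nibble
    call print_hex_digit
.halt:
    cli
    hlt
    jmp .halt

; Prints the value in AL (0 to 15) as one hex digit using BIOS teletype output.
print_hex_digit:
    push cx
    cmp al, 10
    jb .is_digit
    add al, 'A' - '0' - 10      ; adjust so that 10..15 become 'A'..'F'
.is_digit:
    add al, '0'
    mov ah, 0x0E
    xor bx, bx                  ; page 0
    int 0x10
    pop cx
    ret
```

A real error handler would print a short message in front of the code so the user knows it's a disk error.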
Finally, virtually all hard disks are partitioned and you're not taking partitioning into account. The MBR should give you a "pointer to partition table entry" that must be used to determine which sector is the first sector in your partition. The MBR should also tell you the BIOS "disk device number" to use, and this may not be the first hard drive.
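A sketch of picking those values up at the very start of the boot sector, assuming the common convention where the MBR leaves DL holding the drive number and DS:SI pointing at the partition table entry (the starting LBA is the dword at offset 8 of the entry). The labels "boot_drive" and "partition_lba" are hypothetical:

```nasm
bits 16
start:
    mov eax, [si + 8]           ; starting LBA, read through the DS:SI left by the MBR
    xor bx, bx
    mov ds, bx                  ; only now switch DS to a known value
    mov [boot_drive], dl
    mov [partition_lba], eax
    jmp short continue_boot

boot_drive:     db 0
partition_lba:  dd 0

continue_boot:
    ; the rest of the boot code adds partition_lba to every sector number it reads
```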
Note: I also have no idea why you've loaded the kernel into user-space.
Cheers,
Brendan